Commit Graph

1701 Commits

Author SHA1 Message Date
Mark Backman
c79a739c85 Merge pull request #3856 from zkleb-aai/assemblyai-u3-rt-pro
Assemblyai u3 rt pro
2026-03-02 20:28:28 -05:00
Mark Backman
aad1211a57 Merge pull request #3885 from pipecat-ai/mb/latency-breakdown
Add latency breakdown to UserBotLatencyObserver
2026-03-02 19:27:35 -05:00
Mark Backman
7dbb130666 Add chronological_events utility function to display UserBotLatencyObserver report 2026-03-02 19:23:42 -05:00
Aleix Conchillo Flaqué
141b0ee014 Merge pull request #3902 from pipecat-ai/aleix/deepgram-sagemaker-move
Move Deepgram SageMaker modules to sagemaker/ subpackage
2026-03-02 15:25:17 -08:00
Aleix Conchillo Flaqué
088eb9b01c examples: update to new sagemaker packages 2026-03-02 15:20:52 -08:00
zack
32773b42d6 Improve terminology: rename file and replace 'STT mode' with 'AssemblyAI turn detection'
- Rename 07o-interruptible-assemblyai-stt.py -> 07o-interruptible-assemblyai-turn-detection.py
- Replace 'STT mode' with 'AssemblyAI turn detection mode' throughout codebase
- Replace 'Mode 1'/'Mode 2' with descriptive 'Pipecat turn detection'/'AssemblyAI turn detection'
- Update changelog to use 'built-in turn detection' terminology
- Addresses PR feedback about confusing terminology
2026-03-02 18:08:46 -05:00
zack
b449515410 Address PR review feedback: remove debug logs, fix hasattr logic, add VADAnalyzer 2026-03-02 17:54:31 -05:00
Mark Backman
aae9136df9 Review feedback 2026-03-02 17:52:39 -05:00
filipi87
49c73bb0a3 Merge branch 'main' into filipi/lemonslice
# Conflicts:
#	README.md
#	uv.lock
2026-03-02 19:24:52 -03:00
filipi87
f07e55a4ed Wrap LemonSlice session creation params in LemonSliceNewSessionRequest 2026-03-02 19:15:18 -03:00
filipi87
7afd7068b5 Retrieve the ElevenLabs voice ID from an environment variable 2026-03-02 19:02:51 -03:00
Mark Backman
ff5b985009 Convert observer data models to Pydantic BaseModel with timestamps
Enables .model_dump() serialization for Pipecat Cloud collection.
All metrics now include start_time (Unix timestamp) for timeline
plotting alongside duration_secs.
2026-03-02 16:11:43 -05:00
Mark Backman
a738a4d82b Add function call latency tracking to LatencyBreakdown 2026-03-02 16:11:43 -05:00
Mark Backman
ddba1b84a9 Add first-bot-speech latency to UserBotLatencyObserver
Measure time from ClientConnectedFrame to first BotStartedSpeakingFrame,
emitting a one-time on_first_bot_speech_latency event with breakdown.
2026-03-02 16:11:43 -05:00
Mark Backman
18155b6a63 Add latency breakdown to UserBotLatencyObserver
Add per-service latency breakdown metrics alongside existing user-to-bot
latency measurement. When enable_metrics=True, the observer now emits an
on_latency_breakdown event with TTFB, text aggregation, and user turn
duration metrics collected between VADUserStoppedSpeakingFrame and
BotStartedSpeakingFrame.

- Add LatencyBreakdown dataclass with ttfb, text_aggregation,
  user_turn_secs fields
- Accumulate MetricsFrame data during user→bot cycles
- Reset accumulators on InterruptionFrame to discard stale metrics
- Measure user_turn_secs from actual user silence (VAD timestamp -
  stop_secs) to turn release (UserStoppedSpeakingFrame)
- Filter zero-value TTFB entries from startup metric resets
- Add frame deduplication using bounded deque + set pattern
- Update example 29 with latency breakdown display
2026-03-02 16:11:43 -05:00
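The "bounded deque + set" deduplication mentioned in the commit above can be sketched as follows. This is a minimal illustration, not the observer's actual code: the class name `RecentIds`, the method `check_and_add`, and the default size are hypothetical. The deque records insertion order so the oldest id can be evicted, and the set gives constant-time membership checks.

```python
from collections import deque


class RecentIds:
    """Remember up to `maxlen` recent ids and report duplicates (hypothetical helper)."""

    def __init__(self, maxlen: int = 100):
        self._order = deque()  # insertion order, oldest id on the left
        self._seen = set()     # same ids, for O(1) membership checks
        self._maxlen = maxlen

    def check_and_add(self, item_id: int) -> bool:
        """Return True if item_id was already seen, otherwise record it."""
        if item_id in self._seen:
            return True
        if len(self._order) >= self._maxlen:
            # Evict the oldest id so memory use stays bounded.
            oldest = self._order.popleft()
            self._seen.discard(oldest)
        self._order.append(item_id)
        self._seen.add(item_id)
        return False
```

Bounding the history trades perfect deduplication for fixed memory: an id older than the last `maxlen` entries is treated as new again.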
Mark Backman
b1e55fd6c2 Merge pull request #3881 from pipecat-ai/mb/startup-observer
Add StartupTimingObserver
2026-03-02 16:07:28 -05:00
Aleix Conchillo Flaqué
193f93c2ce Update Nvidia example to use llama-3.3-70b-instruct model 2026-03-02 10:16:27 -08:00
Mark Backman
68e8732e72 Add BotConnectedFrame and on_transport_timing_report event
Add BotConnectedFrame (SystemFrame) pushed by SFU transports (Daily,
LiveKit, HeyGen, Tavus) when the bot joins the room. Replace the
on_transport_readiness_measured event with on_transport_timing_report
which includes both bot_connected_secs and client_connected_secs.
2026-03-02 13:10:09 -05:00
Mark Backman
0836066898 Add ClientConnectedFrame and transport readiness timing
Introduce ClientConnectedFrame (SystemFrame) pushed by all transports
when a client connects. StartupTimingObserver uses this to measure
transport readiness — the time from StartFrame to first client
connection — via a new on_transport_readiness_measured event.
2026-03-02 13:10:09 -05:00
Mark Backman
c54232bdb4 Add StartupTimingObserver for measuring processor start() times
Tracks how long each processor start method takes during pipeline
startup by measuring StartFrame arrive/leave deltas. Emits a timing
report via the on_startup_timing_report event and auto-logs a summary.
Internal pipeline processors are excluded from reports by default.
2026-03-02 10:48:50 -05:00
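A rough sketch of the arrive/leave delta idea described above, under the assumption that an observer is told when a start frame reaches a processor and when it leaves. The names `StartTimingTracker`, `on_arrive`, `on_leave`, and `report` are hypothetical and do not reflect the real StartupTimingObserver interface.

```python
class StartTimingTracker:
    """Track how long each named processor holds a start frame (hypothetical helper)."""

    def __init__(self):
        self._arrived = {}    # processor name -> arrival timestamp in seconds
        self.durations = {}   # processor name -> seconds between arrive and leave

    def on_arrive(self, name: str, timestamp: float) -> None:
        self._arrived[name] = timestamp

    def on_leave(self, name: str, timestamp: float) -> None:
        start = self._arrived.pop(name, None)
        if start is not None:  # ignore a leave with no matching arrive
            self.durations[name] = timestamp - start

    def report(self):
        """Return (name, seconds) pairs, slowest processor first."""
        return sorted(self.durations.items(), key=lambda kv: kv[1], reverse=True)
```

In a real pipeline the timestamps would come from a monotonic clock such as `time.monotonic()`; they are passed in explicitly here so the logic stays easy to test.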
zack
42f91a9056 Apply ruff formatting fixes 2026-03-01 11:44:37 -05:00
zack
d1cbc81108 Fix 07o example to use new min_turn_silence parameter name in docs and comments 2026-03-01 11:36:46 -05:00
zack
07ae4b8d38 Update AssemblyAI examples to use u3-rt-pro and improve 55d example
- Update 13d-assemblyai-transcription.py to explicitly use u3-rt-pro model
- Update 55d-update-settings-assemblyai-stt.py to demonstrate keyterms updates instead of language updates
- Add helpful logging to show before/after keyterms boosting effect
- Use difficult names (Xiomara, Saoirse, Krzystof) to demonstrate boosting effectiveness
2026-03-01 11:27:31 -05:00
zack
21a409e447 Update prompt warning and rename min_end_of_turn_silence_when_confident to min_turn_silence
- Add "beta feature" note to custom prompt warning
- Rename min_end_of_turn_silence_when_confident parameter to min_turn_silence across all AssemblyAI code
- Update documentation, examples, and test files to use new parameter name
2026-03-01 11:17:39 -05:00
Mark Backman
950a8628dc Miscellaneous foundational example updates 2026-02-27 19:49:45 -05:00
zack
d7ce1eedd9 Add foundational examples for AssemblyAI u3-rt-pro
- 07o-interruptible-assemblyai.py: Basic example using Pipecat VAD mode
- 07o-interruptible-assemblyai-stt.py: Advanced example using STT-controlled
  turn detection with comprehensive documentation on u3-rt-pro features
  (turn detection tuning, prompt-based enhancement, speaker diarization)
2026-02-27 17:58:18 -05:00
filipi87
0839e3813f Refactoring the examples to use the new context summarization classes. 2026-02-27 18:42:39 -03:00
filipi87
69414e8a5a Added example 54b-context-summarization-manual-openai.py demonstrating on-demand summarization triggered via a function call tool. 2026-02-27 18:42:23 -03:00
Mark Backman
712305c5b1 Add example 54c showing custom context summarization 2026-02-27 12:07:34 -05:00
filipi87
1f45e80f9d Updated the 52-live-translation.py example to demonstrate the fix 2026-02-27 11:56:52 -03:00
kompfner
7fe458fe59 Merge pull request #3817 from pipecat-ai/pk/service-settings-fix-back-compat-for-nested-external-sdk-types
Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings`…
2026-02-26 11:08:27 -05:00
Mark Backman
907ff58d41 Align Ultravox Realtime service with OpenAI/Gemini patterns
- Add InterruptionFrame handling with stop_all_metrics()
- Add processing metrics (start/stop) at response boundaries
- Fix agent transcript handling for voice and text modalities:
  - Voice mode: push LLMTextFrame (append_to_context=False) and
    TTSTextFrame for deltas, skip duplicated final text
  - Text mode: push LLMTextFrame with proper response lifecycle,
    no TTSTextFrame (downstream TTS handles audio)
- Add output_medium parameter to AgentInputParams and OneShotInputParams
- Improve TTFB measurement using VAD speech end time
- Update example with user turn strategies and transcript events
- Add text-only output example (50a-ultravox-realtime-text.py)
2026-02-26 10:44:36 -05:00
Mark Backman
3ae173520e Code review feedback 2026-02-26 10:23:35 -05:00
Mark Backman
d69a337def Add text_aggregation_mode parameter to TTSService
Move the sentence vs token aggregation concern into text aggregators
so all text flows through them regardless of mode. This enables
pattern detection and tag handling to work in TOKEN mode.

- Add TextAggregationMode enum (SENTENCE, TOKEN) as the user-facing
  TTS setting, separate from the internal AggregationType
- Add TOKEN mode support to Simple, SkipTags, and PatternPair aggregators
- Add text_aggregation_mode parameter to TTSService and all TTS subclasses
- Deprecate aggregate_sentences in favor of text_aggregation_mode
- Merge TTSService._process_text_frame() into a single codepath
2026-02-26 08:55:41 -05:00
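To illustrate the difference between the two modes named above, here is a deliberately naive sketch. Only the enum name and its members come from the commit message; the enum values, the `NaiveAggregator` class, and its punctuation-based sentence splitting are assumptions made for illustration and are much simpler than a real text aggregator.

```python
from enum import Enum


class TextAggregationMode(Enum):
    SENTENCE = "sentence"  # values are assumed; only the member names come from the commit
    TOKEN = "token"


class NaiveAggregator:
    """Route all text through one aggregator regardless of mode (hypothetical sketch)."""

    def __init__(self, mode: TextAggregationMode):
        self._mode = mode
        self._buffer = ""

    def feed(self, text: str) -> list:
        """Return the chunks that are ready to be spoken after receiving `text`."""
        if self._mode is TextAggregationMode.TOKEN:
            # TOKEN mode: pass each piece through as soon as it arrives.
            return [text] if text else []
        # SENTENCE mode: buffer until a sentence-ending character shows up.
        self._buffer += text
        ready = []
        while True:
            idx = next((i for i, ch in enumerate(self._buffer) if ch in ".!?"), -1)
            if idx == -1:
                break
            ready.append(self._buffer[: idx + 1].strip())
            self._buffer = self._buffer[idx + 1 :]
        return ready
```

Because both modes go through the same `feed` call, any pattern or tag handling added to the aggregator would apply in TOKEN mode too, which is the motivation the commit gives.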
Paul Kompfner
8b6aa4b912 Unflatten LiveOptions back into a single live_options field on DeepgramSTTSettings and DeepgramSageMakerSTTSettings; add apply_update override with delta-merge semantics and from_mapping override for backward-compatible dict-style updates 2026-02-25 18:25:11 -05:00
Paul Kompfner
a4b6db6fb4 Flatten LiveOptions into individual fields on DeepgramSTTSettings and DeepgramSageMakerSTTSettings for backward-compatible dict-style updates via STTUpdateSettingsFrame; the big service settings refactor accidentally removed the ability to update individual LiveOptions fields with a sparse update 2026-02-25 17:39:31 -05:00
kompfner
a8cb2a26d1 Merge pull request #3841 from pipecat-ai/pk/groq-tweaks
A few Groq-related tweaks:
2026-02-25 15:54:33 -05:00
Paul Kompfner
ff0f3dce32 A few Groq-related tweaks:
- Wire up passing speed setting to Groq, even though only a value of 1.0 is supported today
- Update the 55y example to switch voices instead of changing speed
- Add a 55zn example to exercise runtime updates of Groq STT
2026-02-25 15:10:48 -05:00
Paul Kompfner
bca42f7d68 Fix Hathora 55 series examples, and fix Hathora missing settings field warning 2026-02-25 14:48:40 -05:00
Mark Backman
44993fe9e3 Remove PlayHT TTS services 2026-02-25 14:12:39 -05:00
Mark Backman
3e6c59c736 Merge pull request #3809 from pipecat-ai/mb/krisp-viva-result
Add Krisp API key support and debug logging
2026-02-25 09:05:12 -05:00
Mark Backman
0ca8c850fb Add TurnMetricsData and e2e processing time for KrispVivaTurn
Introduce a generic TurnMetricsData class for turn detection metrics,
replacing the service-specific SmartTurnMetricsData (now deprecated).
Add end-to-end processing time measurement to KrispVivaTurn, tracking
the interval from VAD speech-to-silence transition to model threshold
crossing. Consume metrics in the strategy _handle_input_audio path
so they are pushed immediately when fresh.
2026-02-25 09:01:21 -05:00
Mark Backman
73ee4da7d4 Add Krisp API key support for new SDK licensing requirement
The Krisp VIVA SDK v1.8.0 requires a license key in globalInit(). Add
api_key parameter to KrispVivaSDKManager, KrispVivaTurn, and
KrispVivaFilter with fallback to KRISP_API_KEY env var. Maintain
backwards compatibility with older SDK versions by catching TypeError
and falling back to the old 3-arg signature.
2026-02-25 09:01:00 -05:00
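The backward-compatibility approach described above (try the new signature, fall back on TypeError) might look roughly like this. The function names and the argument names `working_dir`, `log_callback`, and `log_level` are hypothetical; only the `KRISP_API_KEY` variable name and the "older 3-arg signature" detail come from the commit message.

```python
import os


def resolve_api_key(api_key=None):
    """Prefer an explicit key, otherwise fall back to the KRISP_API_KEY env var."""
    return api_key or os.environ.get("KRISP_API_KEY")


def init_with_optional_key(global_init, working_dir, log_callback, log_level, api_key=None):
    """Call global_init with the key; retry with the older 3-arg form on TypeError."""
    try:
        # Newer SDKs are assumed to accept the license key as an extra argument.
        return global_init(working_dir, log_callback, log_level, api_key)
    except TypeError:
        # Older SDKs reject the extra argument, so retry without it.
        return global_init(working_dir, log_callback, log_level)
```

One caveat of this pattern: a TypeError raised inside a newer `global_init` for an unrelated reason would also trigger the fallback call, so inspecting the SDK version is a stricter alternative where it is available.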
Paul Kompfner
bcc2b4def4 Clarify the distinction between "storage-mode" and "delta-mode" usage of *Settings objects
- Storage mode: for use in `self._settings`. All fields should be specified, i.e. should not be `NOT_GIVEN`.
- Delta mode: for use in `*UpdateSettingsFrame`.

In service of this, this commit:
- Adds a runtime check that all fields are specified in storage mode
- Updates all services to specify all fields in stored settings
- Updates all services to no longer check for `is_given` in stored settings (not necessary anymore)
- Updates relevant docstrings
- Renames `update` to `delta` in `*UpdateSettingsFrame`
- Updates community integrations guide
2026-02-24 14:01:28 -05:00
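The storage-mode versus delta-mode distinction above can be sketched with a sentinel and a dataclass. This is an illustrative model only: `ExampleSettings`, `validate_complete`, and `apply_delta` are hypothetical names, and the `NOT_GIVEN` sentinel here is a local stand-in rather than the project's own object.

```python
from dataclasses import dataclass, fields


class _NotGiven:
    def __repr__(self):
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()  # local stand-in sentinel meaning "field not specified"


@dataclass
class ExampleSettings:
    model: object = NOT_GIVEN
    voice: object = NOT_GIVEN


def validate_complete(settings) -> None:
    """Storage mode: every field must be specified."""
    missing = [f.name for f in fields(settings) if getattr(settings, f.name) is NOT_GIVEN]
    if missing:
        raise ValueError(f"stored settings missing fields: {missing}")


def apply_delta(stored, delta) -> dict:
    """Delta mode: copy only the given fields onto stored; return what changed."""
    changed = {}
    for f in fields(delta):
        value = getattr(delta, f.name)
        if value is NOT_GIVEN:
            continue  # a sparse update leaves unspecified fields untouched
        if getattr(stored, f.name) != value:
            setattr(stored, f.name, value)
            changed[f.name] = value
    return changed
```

With the completeness check done once at storage time, code that reads the stored settings no longer needs a per-field "is given" test, which matches the cleanup the commit describes.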
Mark Backman
65f563ad34 Add debug logging to KrispVivaTurn analyze_end_of_turn and update example
Move speech detection tracking outside the per-frame loop in append_audio
since is_speech applies to the whole buffer. Add debug log in
analyze_end_of_turn to show state and probability at decision time. Update
the Krisp VIVA example to use Cartesia TTS and turn analyzer strategy.
2026-02-23 21:35:35 -05:00
Paul Kompfner
ff174dd1c2 Fix STT/TTS Deepgram SageMaker 55-series examples (examples updating settings at runtime) 2026-02-23 16:02:00 -05:00
Paul Kompfner
029f3dbefb Update the 55o ElevenLabsTTSService example to also exercise switching voices, which requires a reconnect 2026-02-23 12:08:13 -05:00
kompfner
03cb0054f9 Merge branch 'main' into pk/service-settings-refactor 2026-02-23 11:46:03 -05:00
Aleix Conchillo Flaqué
abb20f34ba Update default Anthropic model to claude-sonnet-4-6
Update the default model in AnthropicLLMService and remove the
now-unnecessary explicit model from the function calling example.
2026-02-20 16:17:51 -08:00
Joshua Primas
0b4568843b Improved logging + error handling + pipecat bot name usage 2026-02-20 15:59:52 -08:00