pipecat

Author	SHA1	Message	Date
zack	42f91a9056	Apply ruff formatting fixes	2026-03-01 11:44:37 -05:00
zack	d1cbc81108	Fix 07o example to use new min_turn_silence parameter name in docs and comments	2026-03-01 11:36:46 -05:00
zack	07ae4b8d38	Update AssemblyAI examples to use u3-rt-pro and improve 55d example - Update 13d-assemblyai-transcription.py to explicitly use u3-rt-pro model - Update 55d-update-settings-assemblyai-stt.py to demonstrate keyterms updates instead of language updates - Add helpful logging to show before/after keyterms boosting effect - Use difficult names (Xiomara, Saoirse, Krzystof) to demonstrate boosting effectiveness	2026-03-01 11:27:31 -05:00
zack	21a409e447	Update prompt warning and rename min_end_of_turn_silence_when_confident to min_turn_silence - Add "beta feature" note to custom prompt warning - Rename min_end_of_turn_silence_when_confident parameter to min_turn_silence across all AssemblyAI code - Update documentation, examples, and test files to use new parameter name	2026-03-01 11:17:39 -05:00
zack	d7ce1eedd9	Add foundational examples for AssemblyAI u3-rt-pro - 07o-interruptible-assemblyai.py: Basic example using Pipecat VAD mode - 07o-interruptible-assemblyai-stt.py: Advanced example using STT-controlled turn detection with comprehensive documentation on u3-rt-pro features (turn detection tuning, prompt-based enhancement, speaker diarization)	2026-02-27 17:58:18 -05:00
kompfner	7fe458fe59	Merge pull request #3817 from pipecat-ai/pk/service-settings-fix-back-compat-for-nested-external-sdk-types Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings`…	2026-02-26 11:08:27 -05:00
Mark Backman	907ff58d41	Align Ultravox Realtime service with OpenAI/Gemini patterns - Add InterruptionFrame handling with stop_all_metrics() - Add processing metrics (start/stop) at response boundaries - Fix agent transcript handling for voice and text modalities: - Voice mode: push LLMTextFrame (append_to_context=False) and TTSTextFrame for deltas, skip duplicated final text - Text mode: push LLMTextFrame with proper response lifecycle, no TTSTextFrame (downstream TTS handles audio) - Add output_medium parameter to AgentInputParams and OneShotInputParams - Improve TTFB measurement using VAD speech end time - Update example with user turn strategies and transcript events - Add text-only output example (50a-ultravox-realtime-text.py)	2026-02-26 10:44:36 -05:00
Mark Backman	3ae173520e	Code review feedback	2026-02-26 10:23:35 -05:00
Mark Backman	d69a337def	Add text_aggregation_mode parameter to TTSService Move the sentence vs token aggregation concern into text aggregators so all text flows through them regardless of mode. This enables pattern detection and tag handling to work in TOKEN mode. - Add TextAggregationMode enum (SENTENCE, TOKEN) as the user-facing TTS setting, separate from the internal AggregationType - Add TOKEN mode support to Simple, SkipTags, and PatternPair aggregators - Add text_aggregation_mode parameter to TTSService and all TTS subclasses - Deprecate aggregate_sentences in favor of text_aggregation_mode - Merge TTSService._process_text_frame() into a single codepath	2026-02-26 08:55:41 -05:00
Paul Kompfner	8b6aa4b912	Unflatten `LiveOptions` back into a single `live_options` field on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings`; add `apply_update` override with delta-merge semantics and `from_mapping` override for backward-compatible dict-style updates	2026-02-25 18:25:11 -05:00
Paul Kompfner	a4b6db6fb4	Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings` for backward-compatible dict-style updates via `STTUpdateSettingsFrame`; during the big service settings refactor, we accidentally got rid of the ability to update individual `LiveOptions` fields with a sparse update	2026-02-25 17:39:31 -05:00
kompfner	a8cb2a26d1	Merge pull request #3841 from pipecat-ai/pk/groq-tweaks A few Groq-related tweaks:	2026-02-25 15:54:33 -05:00
Paul Kompfner	ff0f3dce32	A few Groq-related tweaks: - Wire up passing speed setting to Groq, even though only a value of 1.0 is supported today - Update the 55y example to switch voices instead of changing speed - Add a 55zn example to exercise runtime updates of Groq STT	2026-02-25 15:10:48 -05:00
Paul Kompfner	bca42f7d68	Fix Hathora 55 series examples, and fix Hathora missing settings field warning	2026-02-25 14:48:40 -05:00
Mark Backman	44993fe9e3	Remove PlayHT TTS services	2026-02-25 14:12:39 -05:00
Mark Backman	3e6c59c736	Merge pull request #3809 from pipecat-ai/mb/krisp-viva-result Add Krisp API key support and debug logging	2026-02-25 09:05:12 -05:00
Mark Backman	0ca8c850fb	Add TurnMetricsData and e2e processing time for KrispVivaTurn Introduce a generic TurnMetricsData class for turn detection metrics, replacing the service-specific SmartTurnMetricsData (now deprecated). Add end-to-end processing time measurement to KrispVivaTurn, tracking the interval from VAD speech-to-silence transition to model threshold crossing. Consume metrics in the strategy _handle_input_audio path so they are pushed immediately when fresh.	2026-02-25 09:01:21 -05:00
Mark Backman	73ee4da7d4	Add Krisp API key support for new SDK licensing requirement The Krisp VIVA SDK v1.8.0 requires a license key in globalInit(). Add api_key parameter to KrispVivaSDKManager, KrispVivaTurn, and KrispVivaFilter with fallback to KRISP_API_KEY env var. Maintain backwards compatibility with older SDK versions by catching TypeError and falling back to the old 3-arg signature.	2026-02-25 09:01:00 -05:00
Paul Kompfner	bcc2b4def4	Make clearer the distinction between "storage-mode" and "delta-mode" usage of `Settings` objects - Storage mode: for use in `self._settings`. All fields should be specified, i.e. should not be `NOT_GIVEN`. - Delta mode: for use in `UpdateSettingsFrame`. In service of this, this commit: - Adds a runtime check that all fields are specified in storage mode - Updates all services to specify all fields in stored settings - Updates all services to no longer check for `is_given` in stored settings (not necessary anymore) - Updates relevant docstrings - Renames `update` to `delta` in `*UpdateSettingsFrame` - Updates community integrations guide	2026-02-24 14:01:28 -05:00
Mark Backman	65f563ad34	Add debug logging to KrispVivaTurn analyze_end_of_turn and update example Move speech detection tracking outside the per-frame loop in append_audio since is_speech applies to the whole buffer. Add debug log in analyze_end_of_turn to show state and probability at decision time. Update the Krisp VIVA example to use Cartesia TTS and turn analyzer strategy.	2026-02-23 21:35:35 -05:00
Paul Kompfner	ff174dd1c2	Fix STT/TTS Deepgram Sagemaker 55-series examples (examples updating settings at runtime)	2026-02-23 16:02:00 -05:00
Paul Kompfner	029f3dbefb	Updating 55o ElevenLabsTTSService example to also exercise switching voices, which requires reconnect	2026-02-23 12:08:13 -05:00
kompfner	03cb0054f9	Merge branch 'main' into pk/service-settings-refactor	2026-02-23 11:46:03 -05:00
Aleix Conchillo Flaqué	abb20f34ba	Update default Anthropic model to claude-sonnet-4-6 Update the default model in AnthropicLLMService and remove the now-unnecessary explicit model from the function calling example.	2026-02-20 16:17:51 -08:00
Aleix Conchillo Flaqué	af4ef95dc6	Fix missing await on add_audio_frames_message in Google audio examples The method is async but was being called without await, silently discarding the coroutine.	2026-02-20 14:24:22 -08:00
Filipi da Silva Fuchter	c9615c8db6	Merge pull request #3779 from pipecat-ai/filipi/filter_observer Allowing to define the list of frame processors whose frames should be silently ignored by the RTVI observer.	2026-02-20 12:42:02 -05:00
Mark Backman	82ce3ea8de	Update 07c example to use DeepgramSageMakerTTSService	2026-02-20 08:10:41 -07:00
Mark Backman	273692421f	Add DeepgramSageMakerTTSService for Deepgram TTS on AWS SageMaker Adds a TTS service that connects to Deepgram models deployed on AWS SageMaker endpoints via HTTP/2 bidirectional streaming. Supports the Deepgram TTS protocol (Speak, Flush, Clear, Close) over the BiDi client, with interruption handling and per-turn TTFB metrics. Updates the example and env.example with separate STT/TTS endpoint names.	2026-02-20 08:08:00 -07:00
Paul Kompfner	fb27642190	Add `self._settings` to 6 remaining services - AWSNovaSonicLLMService: new `AWSNovaSonicLLMSettings` with `voice_id` and `endpointing_sensitivity`; remove `self._params` entirely, storing audio I/O config as plain instance variables - NeuphonicHttpTTSService: reuse `NeuphonicTTSSettings`; use inherited `language` field instead of bespoke `lang_code` - NvidiaTTSService: new `NvidiaTTSSettings` with `quality` - PiperTTSService / PiperHttpTTSService: new `PiperTTSSettings` / `PiperHttpTTSSettings` (no extra fields) - SpeechmaticsTTSService: new `SpeechmaticsTTSSettings` with `max_retries` Also remove redundant `lang_code` from `NeuphonicTTSSettings` (both WS and HTTP services now use the inherited `TTSSettings.language` field, with automatic enum conversion via the base class). HTTP services (Neuphonic HTTP, Piper HTTP, Speechmatics) don't override `_update_settings` since the base class applies changes to `self._settings` and subsequent requests read from it automatically.	2026-02-19 18:35:59 -05:00
Paul Kompfner	463ea3725b	Update Deepgram Flux with the new service settings pattern	2026-02-19 17:12:24 -05:00
Paul Kompfner	6c609031ee	Add more 55-series examples Also: - remove unnecessary pass-through `_update_settings` implementation in `FalSTTService` - warn that `AsyncAITTSService` doesn't currently support runtime settings updates - update how `GradiumTTSService._update_settings` checks for voice changes - remove a couple of unnecessary args (because they specified defaults) in other examples	2026-02-19 16:46:14 -05:00
filipi87	18630c9478	Adding changelog entry for RTVI observer ignored_sources feature.	2026-02-19 18:41:05 -03:00
filipi87	3a8d3cc841	Allowing to define the list of frame processors whose frames should be silently ignored by the RTVI observer.	2026-02-19 18:36:12 -03:00
Paul Kompfner	cc54ff4708	Add more 55-series examples	2026-02-19 14:55:21 -05:00
Paul Kompfner	a7edd8e441	Fix 55zp example	2026-02-18 17:15:22 -05:00
Paul Kompfner	2a07138abf	Fix Grok Realtime dynamic session properties updating, and update corresponding 55zo example	2026-02-18 17:12:36 -05:00
Paul Kompfner	ad942f6e4c	Update 55zn example (Ultravox dynamic settings updates) to exercise changing modality, which is a setting that supports dynamic updates	2026-02-18 16:33:05 -05:00
Paul Kompfner	97d34ef9e1	Update OpenAI Realtime to warn when you try to update settings that can't be updated dynamically. Update corresponding example to demonstrate updating output modality.	2026-02-18 16:16:06 -05:00
Paul Kompfner	c054780477	Fix 55zh example	2026-02-18 15:59:34 -05:00
Paul Kompfner	88a2dbdb82	Update 55zf example to update a setting that is supported by the default Camb TTS model	2026-02-18 15:48:50 -05:00
Paul Kompfner	d386a0efda	Update Sarvam TTS to apply all changes to settings, not just voice	2026-02-18 15:31:08 -05:00
Paul Kompfner	b718a23c17	Tweak 55zd example	2026-02-18 15:25:50 -05:00
Paul Kompfner	e38f7d9451	Fix 55zc example	2026-02-18 15:23:23 -05:00
Paul Kompfner	b00d454842	Fix Inworld TTS settings updating	2026-02-18 15:19:57 -05:00
Paul Kompfner	0fa51811ea	Fix 55z example	2026-02-18 15:11:04 -05:00
Paul Kompfner	323ee00b83	Fix 55w example	2026-02-18 14:51:48 -05:00
Paul Kompfner	0c73b77327	Update Lmnt TTS to support updating settings dynamically	2026-02-18 14:47:38 -05:00
Paul Kompfner	416e1cf877	Update Rime TTS services to store voice in the standard `settings.voice` field, as opposed to the nonstandard `speaker` field	2026-02-18 14:46:47 -05:00
Paul Kompfner	b4c5cb258b	Tweak 55r example to make the settings update more pronounced	2026-02-18 14:15:14 -05:00
Paul Kompfner	728a97ade3	Update Deepgram TTS to support updating settings dynamically	2026-02-18 14:11:51 -05:00

1171 Commits
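
Commit bcc2b4def4 above distinguishes "storage-mode" settings (every field specified, held in `self._settings`) from "delta-mode" settings (only changed fields specified, carried by an `UpdateSettingsFrame`). The sketch below illustrates that idea only. It is not Pipecat's code: the `SketchSettings` class, its fields, `validate_complete`, and the shape of `apply_update` are assumptions made for illustration; only the names `NOT_GIVEN` and `is_given` echo the commit message.

```python
from dataclasses import dataclass, fields
from typing import Any


class _NotGiven:
    """Sentinel type marking a field that was left unspecified."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


def is_given(value: Any) -> bool:
    """Return True when a field holds a real value rather than the sentinel."""
    return not isinstance(value, _NotGiven)


@dataclass
class SketchSettings:
    # Hypothetical fields, chosen only for this illustration.
    model: Any = NOT_GIVEN
    voice: Any = NOT_GIVEN
    language: Any = NOT_GIVEN

    def validate_complete(self) -> None:
        """Storage-mode check: raise if any field is still NOT_GIVEN."""
        missing = [f.name for f in fields(self) if not is_given(getattr(self, f.name))]
        if missing:
            raise ValueError(f"storage-mode settings missing fields: {missing}")

    def apply_update(self, delta: "SketchSettings") -> dict:
        """Merge the given fields of a delta-mode object and return what changed."""
        changed = {}
        for f in fields(self):
            new = getattr(delta, f.name)
            if is_given(new) and new != getattr(self, f.name):
                changed[f.name] = new
                setattr(self, f.name, new)
        return changed


# Storage mode: all fields specified, so the completeness check passes.
stored = SketchSettings(model="model-a", voice="voice-a", language="en")
stored.validate_complete()

# Delta mode: only the field being changed is specified.
changed = stored.apply_update(SketchSettings(voice="voice-b"))
```

Returning the changed fields is one possible design choice: it would let a service decide whether an update needs a reconnect (as the 55o voice-switch commit suggests some do) without checking each stored field for the sentinel.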