pipecat

Author	SHA1	Message	Date
zack	cd07937c5d	Fix missing imports: Add UserStartedSpeakingFrame and UserStoppedSpeakingFrame	2026-02-26 22:18:02 -05:00
zack	72934bd8ae	Add u3-rt-pro support and improvements to AssemblyAI STT service - Fix speaker diarization: Add field alias for speaker_label → speaker mapping in TurnMessage model - Add warning for non-optimal min_end_of_turn_silence_when_confident values (recommends 100ms for best latency) - Improve max_turn_silence override warning message clarity - Update custom prompt warning (remove 88% accuracy claim) - Add comprehensive logging for debugging: - Log final connection params after modifications - Log WebSocket URL and parsed parameters - Log speaker field in transcripts - Log text sent to LLM with speaker formatting - Support dynamic configuration updates via STTUpdateSettingsFrame: - keyterms_prompt (when AssemblyAI API supports it) - prompt - max_turn_silence - min_end_of_turn_silence_when_confident	2026-02-26 22:04:21 -05:00
Mark Backman	2a6a993869	Merge pull request #3850 from rupesh-svg/fix/genesys-remove-audio-chunk-logging Remove verbose audio chunk logging from GenesysAudioHookSerializer	2026-02-26 21:52:54 -05:00
Rupesh	bbaa79fef0	Add changelog for PR #3850	2026-02-26 14:00:34 -08:00
Rupesh	fff9db0d8f	Remove verbose audio chunk logging from GenesysAudioHookSerializer Fixes #3777	2026-02-26 13:51:05 -08:00
kompfner	7fe458fe59	Merge pull request #3817 from pipecat-ai/pk/service-settings-fix-back-compat-for-nested-external-sdk-types Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings`…	2026-02-26 11:08:27 -05:00
Paul Kompfner	faed775d90	Extract `_DeepgramSTTSettingsBase` with shared `_merge_live_options_delta` to deduplicate LiveOptions merge logic between `__init__` and `apply_update`, and between the Deepgram STT and SageMaker variants; make top-level model/language take precedence over conflicting live_options values in updates; remove unnecessary Language enum-to-string conversion (Language is a StrEnum)	2026-02-26 11:02:44 -05:00
Mark Backman	b63ca524f5	Merge pull request #3806 from pipecat-ai/mb/ultravox-updates Align Ultravox Realtime service with OpenAI/Gemini patterns	2026-02-26 10:49:21 -05:00
Mark Backman	907ff58d41	Align Ultravox Realtime service with OpenAI/Gemini patterns - Add InterruptionFrame handling with stop_all_metrics() - Add processing metrics (start/stop) at response boundaries - Fix agent transcript handling for voice and text modalities: - Voice mode: push LLMTextFrame (append_to_context=False) and TTSTextFrame for deltas, skip duplicated final text - Text mode: push LLMTextFrame with proper response lifecycle, no TTSTextFrame (downstream TTS handles audio) - Add output_medium parameter to AgentInputParams and OneShotInputParams - Improve TTFB measurement using VAD speech end time - Update example with user turn strategies and transcript events - Add text-only output example (50a-ultravox-realtime-text.py)	2026-02-26 10:44:36 -05:00
Mark Backman	97b93ebe57	Merge pull request #3696 from pipecat-ai/mb/streaming-tts-input Improve streaming TTS input support, add TextAggregationMetricsData	2026-02-26 10:26:53 -05:00
Mark Backman	3ae173520e	Code review feedback	2026-02-26 10:23:35 -05:00
Paul Kompfner	c184ac09b8	Inline `_build_live_options` into `_connect` in `DeepgramSTTService` and `DeepgramSageMakerSTTService` since it's trivial and only called from one place	2026-02-26 09:42:15 -05:00
Paul Kompfner	3c20eda8bf	Keep model/language in LiveOptions at construction time so apply_update's bidirectional sync is sufficient; simplify _build_live_options to only add sample_rate	2026-02-26 09:32:52 -05:00
Mark Backman	d69a337def	Add text_aggregation_mode parameter to TTSService Move the sentence vs token aggregation concern into text aggregators so all text flows through them regardless of mode. This enables pattern detection and tag handling to work in TOKEN mode. - Add TextAggregationMode enum (SENTENCE, TOKEN) as the user-facing TTS setting, separate from the internal AggregationType - Add TOKEN mode support to Simple, SkipTags, and PatternPair aggregators - Add text_aggregation_mode parameter to TTSService and all TTS subclasses - Deprecate aggregate_sentences in favor of text_aggregation_mode - Merge TTSService._process_text_frame() into a single codepath	2026-02-26 08:55:41 -05:00
Mark Backman	f7434cdde1	Add text aggregation time metric for TTS sentence aggregation Add TextAggregationMetricsData measuring the time from the first LLM token to the first complete sentence, representing the latency cost of sentence aggregation in the TTS pipeline.	2026-02-26 08:48:47 -05:00
Paul Kompfner	e21e8585f0	Add `deepgram` and `sagemaker` extras to CI test dependencies so Deepgram and Deepgram Sagemaker settings tests can run	2026-02-25 18:59:59 -05:00
Paul Kompfner	8b6aa4b912	Unflatten `LiveOptions` back into a single `live_options` field on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings`; add `apply_update` override with delta-merge semantics and `from_mapping` override for backward-compatible dict-style updates	2026-02-25 18:25:11 -05:00
Paul Kompfner	a4b6db6fb4	Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings` for backward-compatible dict-style updates via `STTUpdateSettingsFrame`; during the big service settings refactor, we accidentally got rid of the ability to update individual `LiveOptions` fields with a sparse update	2026-02-25 17:39:31 -05:00
Mark Backman	edc79d374a	Merge pull request #3836 from pipecat-ai/mb/small-webrtc-prebuilt-2.3.0 Update the pipecat-ai-small-webrtc-prebuilt to 2.3.0	2026-02-25 17:18:32 -05:00
Mark Backman	e521aef5df	Merge pull request #3842 from pipecat-ai/mb/claude-plugin-docs Add /update-docs skill to claude-plugin	2026-02-25 16:38:16 -05:00
kompfner	3cfff51205	Merge pull request #3827 from pipecat-ai/pk/gemini-tts-service-remove-model-ivar Remove unnecessary `_model` ivar from `GeminiTTSService`, using `_set…	2026-02-25 16:14:38 -05:00
Paul Kompfner	3d8e3a4043	Remove unnecessary `_model` ivar from ElevenLabs STT services, using `_settings.model` instead.	2026-02-25 16:07:33 -05:00
Paul Kompfner	7ee0400c4c	Remove unnecessary `_model` ivar from Hathora TTS and STT services, using `_settings.model` instead.	2026-02-25 16:07:26 -05:00
Paul Kompfner	781d191509	Remove unnecessary `_model` ivar from `GeminiTTSService`, using `_settings.model` instead	2026-02-25 15:59:38 -05:00
kompfner	a8cb2a26d1	Merge pull request #3841 from pipecat-ai/pk/groq-tweaks A few Groq-related tweaks:	2026-02-25 15:54:33 -05:00
kompfner	b1df1ba5d4	Merge pull request #3834 from pipecat-ai/pk/make-ai-service-exclusive-syncer-of-model-name-to-metrics Make it so that `AIService` is the exclusive "syncer" of model name t…	2026-02-25 15:53:59 -05:00
Mark Backman	eee2ef7e85	Add /update-docs skill to claude-plugin	2026-02-25 15:45:16 -05:00
Paul Kompfner	ff0f3dce32	A few Groq-related tweaks: - Wire up passing speed setting to Groq, even though only a value of 1.0 is supported today - Update the 55y example to switch voices instead of changing speed - Add a 55zn example to exercise runtime updates of Groq STT	2026-02-25 15:10:48 -05:00
Paul Kompfner	bca42f7d68	Fix Hathora 55 series examples, and fix Hathora missing settings field warning	2026-02-25 14:48:40 -05:00
Paul Kompfner	27940d83a2	Make it so that `AIService` is the exclusive "syncer" of model name to metrics. The only (rare) exception, where a service still needs to call `self._sync_model_name_to_metrics()` directly, is when the model name needs to be "pulled" from another field (or nested field) in settings up to `settings.model` on a settings update. This only occurs in Deepgram services, where we use the voice as the model name. This change has the side effect of bringing model name to metrics for a number of services that were accidentally omitting it before.	2026-02-25 14:48:24 -05:00
Mark Backman	937c691f2a	Merge pull request #3838 from pipecat-ai/mb/remove-playht Remove PlayHT TTS services	2026-02-25 14:34:15 -05:00
Mark Backman	6803d38d3f	Merge pull request #3833 from pipecat-ai/mb/add-performance-changelog-fragment Add Performance as a changelog fragment option	2026-02-25 14:33:52 -05:00
Mark Backman	44993fe9e3	Remove PlayHT TTS services	2026-02-25 14:12:39 -05:00
Mark Backman	0fe4c732b7	Merge pull request #3837 from alts/alts/append-trailing-space Add `append_trailing_space` to all Rime websocket services	2026-02-25 13:35:07 -05:00
Stephen Altamirano	ceead60ef2	Add `append_trailing_space` to all Rime websocket services This was added in `31daa889e8`, but only to `RimeTTSService`, not to `RimeNonJsonTTSService`. Bringing these to parity means that users switching between the two, with the same inputs, get more consistent vocalization behavior.	2026-02-25 10:02:38 -08:00
Mark Backman	e028194dbe	Update the pipecat-ai-small-webrtc-prebuilt to 2.3.0	2026-02-25 12:23:13 -05:00
Mark Backman	81f4672535	Add Performance as a changelog fragment option	2026-02-25 09:47:42 -05:00
Mark Backman	9273b158ea	Merge pull request #3825 from pipecat-ai/mb/llm-user-aggregator-interim-transcription Consume InterimTranscriptionFrame and TranslationFrame in LLMUserAggregator	2026-02-25 09:06:34 -05:00
Mark Backman	353a28842c	Merge pull request #3807 from pipecat-ai/mb/update-openai-realtime-1.5 Update OpenAI Realtime default model to gpt-realtime-1.5	2026-02-25 09:06:19 -05:00
Mark Backman	3e6c59c736	Merge pull request #3809 from pipecat-ai/mb/krisp-viva-result Add Krisp API key support and debug logging	2026-02-25 09:05:12 -05:00
Mark Backman	0ca8c850fb	Add TurnMetricsData and e2e processing time for KrispVivaTurn Introduce a generic TurnMetricsData class for turn detection metrics, replacing the service-specific SmartTurnMetricsData (now deprecated). Add end-to-end processing time measurement to KrispVivaTurn, tracking the interval from VAD speech-to-silence transition to model threshold crossing. Consume metrics in the strategy _handle_input_audio path so they are pushed immediately when fresh.	2026-02-25 09:01:21 -05:00
Mark Backman	73ee4da7d4	Add Krisp API key support for new SDK licensing requirement The Krisp VIVA SDK v1.8.0 requires a license key in globalInit(). Add api_key parameter to KrispVivaSDKManager, KrispVivaTurn, and KrispVivaFilter with fallback to KRISP_API_KEY env var. Maintain backwards compatibility with older SDK versions by catching TypeError and falling back to the old 3-arg signature.	2026-02-25 09:01:00 -05:00
Filipi da Silva Fuchter	2f60074da3	Merge pull request #3814 from pipecat-ai/filipi/fix_close_context Fixed an issue where the TTS providers did not close the context after the audio context finished processing all audio.	2026-02-25 08:21:04 -05:00
filipi87	751b1b8100	Adding the changelog entries for the tts fixes.	2026-02-25 10:18:25 -03:00
filipi87	d899f0af11	Refactored all AudioContextTTSService based providers to override the new callbacks instead of _handle_interruption(), making provider-specific cleanup cleaner and more explicit	2026-02-25 10:18:16 -03:00
filipi87	c09ae6ba6d	Added two new lifecycle callbacks to AudioContextTTSService: on_audio_context_interrupted() and on_audio_context_completed()	2026-02-25 10:17:54 -03:00
Mark Backman	a187a4b3b2	Merge pull request #3830 from pipecat-ai/aleix/restore-dev-skills	2026-02-25 06:33:16 -05:00
Aleix Conchillo Flaqué	68e19a730b	Restore dev skills and add marketplace for maintainer workflows Brings back the 6 development workflow skills (changelog, cleanup, code-review, docstring, pr-description, pr-submit) that were moved to pipecat-ai/skills, and adds a .claude-plugin/marketplace.json so other pipecat-ai repos can install them. Updates README contributing section with installation instructions.	2026-02-24 23:47:06 -08:00
Mark Backman	67cb7d575f	Merge pull request #3828 from pipecat-ai/mb/skip-empty-audio-filter-frames Skip empty audio frames after filter buffering	2026-02-24 23:27:22 -05:00
Mark Backman	a84930dc3e	Skip empty audio frames after filter buffering Audio filters like RNNoise, KrispViva, and AIC return empty bytes while buffering audio to accumulate their required frame size. These empty frames were flowing downstream, causing misleading "Empty audio frame received for STT service" warnings. Skip the frame in BaseInputTransport when audio is empty, preventing unnecessary processing in VAD and downstream processors. Fixes #3517	2026-02-24 23:21:52 -05:00
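
The last entry above (a84930dc3e) describes dropping the empty audio that a filter returns while it is still buffering. The following is a minimal sketch of that idea, not the actual `BaseInputTransport` code: `BufferingFilter` and `skip_empty_frames` are hypothetical names introduced only for illustration, and the frame size is an arbitrary assumption.

```python
from typing import Callable, Iterable, Iterator


class BufferingFilter:
    """Hypothetical stand-in for a filter that needs a fixed frame size."""

    def __init__(self, frame_size: int) -> None:
        self._frame_size = frame_size
        self._buffer = b""

    def __call__(self, audio: bytes) -> bytes:
        # Accumulate input and emit only whole frames; emit b"" while buffering.
        self._buffer += audio
        usable = len(self._buffer) - len(self._buffer) % self._frame_size
        out, self._buffer = self._buffer[:usable], self._buffer[usable:]
        return out


def skip_empty_frames(
    chunks: Iterable[bytes], audio_filter: Callable[[bytes], bytes]
) -> Iterator[bytes]:
    """Yield filtered audio, skipping chunks the filter has not released yet."""
    for chunk in chunks:
        filtered = audio_filter(chunk)
        if not filtered:
            # Nothing to process downstream (VAD, STT), so drop it here.
            continue
        yield filtered


frames = list(skip_empty_frames([b"ab", b"cd", b"ef", b"gh"], BufferingFilter(4)))
```

Dropping the empty result at the input boundary means downstream processors never see a zero-length frame, which is how the commit message explains the disappearance of the misleading STT warning.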


7904 Commits
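
Commit 73ee4da7d4 in the list above mentions passing a license key to the SDK's init call, falling back to the `KRISP_API_KEY` environment variable, and catching `TypeError` to support the older 3-argument signature. A rough sketch of that compatibility pattern follows. The function `init_sdk` and the two fake init functions are hypothetical; the real Krisp SDK argument names are not given in the log and are not assumed here.

```python
import os
from typing import Any, Callable, Optional


def init_sdk(global_init: Callable[..., Any], *args: Any,
             api_key: Optional[str] = None) -> Any:
    """Call global_init with a key if supported, else with the old signature."""
    # Prefer the explicit key, then the environment variable named in the commit.
    key = api_key or os.environ.get("KRISP_API_KEY")
    try:
        # Newer signature (assumed): the positional args plus a license key.
        return global_init(*args, key)
    except TypeError:
        # Older signature: the extra argument is rejected, so retry without it.
        return global_init(*args)


def new_init(working_dir: str, log_cb: Any, log_level: int, key: Optional[str]) -> tuple:
    # Hypothetical newer-style init that accepts a key.
    return ("new", key)


def old_init(working_dir: str, log_cb: Any, log_level: int) -> tuple:
    # Hypothetical older-style init with three arguments only.
    return ("old",)


new_result = init_sdk(new_init, "models", None, 0, api_key="secret")
old_result = init_sdk(old_init, "models", None, 0, api_key="secret")
```

One caveat with this approach: a `TypeError` raised inside the init function itself would also trigger the fallback, so a stricter variant might inspect the signature first.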