pipecat

Author	SHA1	Message	Date
zkleb-aai	5c2ca0ce64	Update changelog/3856.changed.md Co-authored-by: Mark Backman <m.backman@gmail.com>	2026-03-02 17:04:54 -05:00
zkleb-aai	6729f4366a	Update src/pipecat/services/assemblyai/stt.py Co-authored-by: Mark Backman <m.backman@gmail.com>	2026-03-02 17:04:42 -05:00
zkleb-aai	7648b62e6e	Update src/pipecat/services/assemblyai/stt.py Co-authored-by: Mark Backman <m.backman@gmail.com>	2026-03-02 17:04:17 -05:00
zack	cb7e612738	Remove test files and testing documentation from PR	2026-03-01 11:51:51 -05:00
zack	36b9c05730	Fix changelog entries to use proper markdown bullet format	2026-03-01 11:45:24 -05:00
zack	6968d83ccb	Add changelog entries for PR #3856	2026-03-01 11:44:51 -05:00
zack	42f91a9056	Apply ruff formatting fixes	2026-03-01 11:44:37 -05:00
zack	5de495cc98	Use logger.warning instead of warnings.warn for deprecation message - Makes deprecation warning visible in logs without needing Python warning flags - Users will see the warning during normal operation	2026-03-01 11:39:00 -05:00
zack	d1cbc81108	Fix 07o example to use new min_turn_silence parameter name in docs and comments	2026-03-01 11:36:46 -05:00
zack	66fca7e382	Add backward compatibility for min_end_of_turn_silence_when_confident parameter - Keep old parameter name for backward compatibility - Add deprecation warning when old parameter is used - Automatically migrate old parameter value to new min_turn_silence parameter - Exclude deprecated parameter from WebSocket URL to avoid sending it to API - New parameter takes precedence if both are set	2026-03-01 11:33:22 -05:00
zack	07ae4b8d38	Update AssemblyAI examples to use u3-rt-pro and improve 55d example - Update 13d-assemblyai-transcription.py to explicitly use u3-rt-pro model - Update 55d-update-settings-assemblyai-stt.py to demonstrate keyterms updates instead of language updates - Add helpful logging to show before/after keyterms boosting effect - Use difficult names (Xiomara, Saoirse, Krzystof) to demonstrate boosting effectiveness	2026-03-01 11:27:31 -05:00
zack	21a409e447	Update prompt warning and rename min_end_of_turn_silence_when_confident to min_turn_silence - Add "beta feature" note to custom prompt warning - Rename min_end_of_turn_silence_when_confident parameter to min_turn_silence across all AssemblyAI code - Update documentation, examples, and test files to use new parameter name	2026-03-01 11:17:39 -05:00
zack	d7ce1eedd9	Add foundational examples for AssemblyAI u3-rt-pro - 07o-interruptible-assemblyai.py: Basic example using Pipecat VAD mode - 07o-interruptible-assemblyai-stt.py: Advanced example using STT-controlled turn detection with comprehensive documentation on u3-rt-pro features (turn detection tuning, prompt-based enhancement, speaker diarization)	2026-02-27 17:58:18 -05:00
zack	ef00f27d53	Fix incorrect await on synchronous request_finalize() method. The request_finalize() method in STTService is synchronous (sets a flag), but was being called with await in the VAD turn endpoint handling code. This caused "object NoneType can't be used in 'await' expression" errors. Also includes automatic formatting improvements from ruff.	2026-02-27 17:58:05 -05:00
zack	45532a9478	Remove info logs and unused import per PR feedback - Remove unused Mapping import - Remove info logs at initialization (connection params) - Remove info logs in _handle_transcription (transcript details, text sent to LLM) - Remove info logs in _build_ws_url (WebSocket URL and params) - Keep debug logs (less verbose, appropriate for development)	2026-02-27 16:15:49 -05:00
zack	6ba9f780b0	Remove unnecessary SpeechStarted fallback in STT mode. u3-rt-pro guarantees SpeechStarted is always sent before transcripts, so the fallback UserStartedSpeakingFrame broadcast is never needed. This ensures clean pairing of UserStarted/StoppedSpeakingFrame: - Start: Always from _handle_speech_started - Stop: Always from _handle_transcription on final turn	2026-02-27 15:00:38 -05:00
zack	aa7e9a17d5	Fix finalization pattern: Use request/confirm in Pipecat mode, finalized flag in STT mode - Add request_finalize() before sending ForceEndpoint in Pipecat mode - Keep confirm_finalize() when receiving formatted finals in Pipecat mode - Remove confirm_finalize() from STT mode (use finalized=True instead) This follows Pipecat's two-step finalization pattern where request_finalize() is called when sending a finalize request to the STT service, and confirm_finalize() is called when receiving confirmation back.	2026-02-27 14:55:22 -05:00
zack	cd07937c5d	Fix missing imports: Add UserStartedSpeakingFrame and UserStoppedSpeakingFrame	2026-02-26 22:18:02 -05:00
zack	72934bd8ae	Add u3-rt-pro support and improvements to AssemblyAI STT service - Fix speaker diarization: Add field alias for speaker_label → speaker mapping in TurnMessage model - Add warning for non-optimal min_end_of_turn_silence_when_confident values (recommends 100ms for best latency) - Improve max_turn_silence override warning message clarity - Update custom prompt warning (remove 88% accuracy claim) - Add comprehensive logging for debugging: - Log final connection params after modifications - Log WebSocket URL and parsed parameters - Log speaker field in transcripts - Log text sent to LLM with speaker formatting - Support dynamic configuration updates via STTUpdateSettingsFrame: - keyterms_prompt (when AssemblyAI API supports it) - prompt - max_turn_silence - min_end_of_turn_silence_when_confident	2026-02-26 22:04:21 -05:00
Mark Backman	2a6a993869	Merge pull request #3850 from rupesh-svg/fix/genesys-remove-audio-chunk-logging Remove verbose audio chunk logging from GenesysAudioHookSerializer	2026-02-26 21:52:54 -05:00
Rupesh	bbaa79fef0	Add changelog for PR #3850	2026-02-26 14:00:34 -08:00
Rupesh	fff9db0d8f	Remove verbose audio chunk logging from GenesysAudioHookSerializer Fixes #3777	2026-02-26 13:51:05 -08:00
kompfner	7fe458fe59	Merge pull request #3817 from pipecat-ai/pk/service-settings-fix-back-compat-for-nested-external-sdk-types Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings`…	2026-02-26 11:08:27 -05:00
Paul Kompfner	faed775d90	Extract `_DeepgramSTTSettingsBase` with shared `_merge_live_options_delta` to deduplicate LiveOptions merge logic between `__init__` and `apply_update`, and between the Deepgram STT and SageMaker variants; make top-level model/language take precedence over conflicting live_options values in updates; remove unnecessary Language enum-to-string conversion (Language is a StrEnum)	2026-02-26 11:02:44 -05:00
Mark Backman	b63ca524f5	Merge pull request #3806 from pipecat-ai/mb/ultravox-updates Align Ultravox Realtime service with OpenAI/Gemini patterns	2026-02-26 10:49:21 -05:00
Mark Backman	907ff58d41	Align Ultravox Realtime service with OpenAI/Gemini patterns - Add InterruptionFrame handling with stop_all_metrics() - Add processing metrics (start/stop) at response boundaries - Fix agent transcript handling for voice and text modalities: - Voice mode: push LLMTextFrame (append_to_context=False) and TTSTextFrame for deltas, skip duplicated final text - Text mode: push LLMTextFrame with proper response lifecycle, no TTSTextFrame (downstream TTS handles audio) - Add output_medium parameter to AgentInputParams and OneShotInputParams - Improve TTFB measurement using VAD speech end time - Update example with user turn strategies and transcript events - Add text-only output example (50a-ultravox-realtime-text.py)	2026-02-26 10:44:36 -05:00
Mark Backman	97b93ebe57	Merge pull request #3696 from pipecat-ai/mb/streaming-tts-input Improve streaming TTS input support, add TextAggregationMetricsData	2026-02-26 10:26:53 -05:00
Mark Backman	3ae173520e	Code review feedback	2026-02-26 10:23:35 -05:00
Paul Kompfner	c184ac09b8	Inline `_build_live_options` into `_connect` in `DeepgramSTTService` and `DeepgramSageMakerSTTService` since it's trivial and only called from one place	2026-02-26 09:42:15 -05:00
Paul Kompfner	3c20eda8bf	Keep model/language in LiveOptions at construction time so apply_update's bidirectional sync is sufficient; simplify _build_live_options to only add sample_rate	2026-02-26 09:32:52 -05:00
Mark Backman	d69a337def	Add text_aggregation_mode parameter to TTSService. Move the sentence vs token aggregation concern into text aggregators so all text flows through them regardless of mode. This enables pattern detection and tag handling to work in TOKEN mode. - Add TextAggregationMode enum (SENTENCE, TOKEN) as the user-facing TTS setting, separate from the internal AggregationType - Add TOKEN mode support to Simple, SkipTags, and PatternPair aggregators - Add text_aggregation_mode parameter to TTSService and all TTS subclasses - Deprecate aggregate_sentences in favor of text_aggregation_mode - Merge TTSService._process_text_frame() into a single codepath	2026-02-26 08:55:41 -05:00
Mark Backman	f7434cdde1	Add text aggregation time metric for TTS sentence aggregation. Add TextAggregationMetricsData measuring the time from the first LLM token to the first complete sentence, representing the latency cost of sentence aggregation in the TTS pipeline.	2026-02-26 08:48:47 -05:00
Paul Kompfner	e21e8585f0	Add `deepgram` and `sagemaker` extras to CI test dependencies so Deepgram and Deepgram Sagemaker settings tests can run	2026-02-25 18:59:59 -05:00
Paul Kompfner	8b6aa4b912	Unflatten `LiveOptions` back into a single `live_options` field on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings`; add `apply_update` override with delta-merge semantics and `from_mapping` override for backward-compatible dict-style updates	2026-02-25 18:25:11 -05:00
Paul Kompfner	a4b6db6fb4	Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings` for backward-compatible dict-style updates via `STTUpdateSettingsFrame`; during the big service settings refactor, we accidentally got rid of the ability to update individual `LiveOptions` fields with a sparse update	2026-02-25 17:39:31 -05:00
Mark Backman	edc79d374a	Merge pull request #3836 from pipecat-ai/mb/small-webrtc-prebuilt-2.3.0 Update the pipecat-ai-small-webrtc-prebuilt to 2.3.0	2026-02-25 17:18:32 -05:00
Mark Backman	e521aef5df	Merge pull request #3842 from pipecat-ai/mb/claude-plugin-docs Add /update-docs skill to claude-plugin	2026-02-25 16:38:16 -05:00
kompfner	3cfff51205	Merge pull request #3827 from pipecat-ai/pk/gemini-tts-service-remove-model-ivar Remove unnecessary `_model` ivar from `GeminiTTSService`, using `_set…	2026-02-25 16:14:38 -05:00
Paul Kompfner	3d8e3a4043	Remove unnecessary `_model` ivar from ElevenLabs STT services, using `_settings.model` instead.	2026-02-25 16:07:33 -05:00
Paul Kompfner	7ee0400c4c	Remove unnecessary `_model` ivar from Hathora TTS and STT services, using `_settings.model` instead.	2026-02-25 16:07:26 -05:00
Paul Kompfner	781d191509	Remove unnecessary `_model` ivar from `GeminiTTSService`, using `_settings.model` instead	2026-02-25 15:59:38 -05:00
kompfner	a8cb2a26d1	Merge pull request #3841 from pipecat-ai/pk/groq-tweaks A few Groq-related tweaks:	2026-02-25 15:54:33 -05:00
kompfner	b1df1ba5d4	Merge pull request #3834 from pipecat-ai/pk/make-ai-service-exclusive-syncer-of-model-name-to-metrics Make it so that `AIService` is the exclusive "syncer" of model name t…	2026-02-25 15:53:59 -05:00
Mark Backman	eee2ef7e85	Add /update-docs skill to claude-plugin	2026-02-25 15:45:16 -05:00
Paul Kompfner	ff0f3dce32	A few Groq-related tweaks: - Wire up passing speed setting to Groq, even though only a value of 1.0 is supported today - Update the 55y example to switch voices instead of changing speed - Add a 55zn example to exercise runtime updates of Groq STT	2026-02-25 15:10:48 -05:00
Paul Kompfner	bca42f7d68	Fix Hathora 55 series examples, and fix Hathora missing settings field warning	2026-02-25 14:48:40 -05:00
Paul Kompfner	27940d83a2	Make it so that `AIService` is the exclusive "syncer" of model name to metrics. The only (rare) exception, where a service still needs to call `self._sync_model_name_to_metrics()` directly, is when the model name needs to be "pulled" from another field (or nested field) in settings up to settings.model on a settings update. This only occurs in Deepgram services, where we use the voice as the model name. This change has the side effect of bringing model name to metrics for a number of services that were accidentally omitting it before.	2026-02-25 14:48:24 -05:00
Mark Backman	937c691f2a	Merge pull request #3838 from pipecat-ai/mb/remove-playht Remove PlayHT TTS services	2026-02-25 14:34:15 -05:00
Mark Backman	6803d38d3f	Merge pull request #3833 from pipecat-ai/mb/add-performance-changelog-fragment Add Performance as a changelog fragment option	2026-02-25 14:33:52 -05:00
Mark Backman	44993fe9e3	Remove PlayHT TTS services	2026-02-25 14:12:39 -05:00
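
Commits 66fca7e382 and 5de495cc98 describe a parameter rename with backward compatibility: the old `min_end_of_turn_silence_when_confident` name is still accepted, a deprecation message is logged, the value is migrated to `min_turn_silence`, the new name wins if both are set, and the old name is kept out of the WebSocket URL. A minimal sketch of that pattern might look like the following. The helper names `migrate_params` and `build_query` are hypothetical, the standard-library `logging` module stands in for whatever logger the service uses, and none of this is the actual pipecat code.

```python
import logging
from urllib.parse import urlencode

# Assumption: a stdlib logger stands in for the service's real logger.
logger = logging.getLogger("assemblyai_sketch")

DEPRECATED_PARAM = "min_end_of_turn_silence_when_confident"
NEW_PARAM = "min_turn_silence"


def migrate_params(params: dict) -> dict:
    """Return a copy of params with the deprecated name migrated (hypothetical helper)."""
    migrated = dict(params)
    # Always drop the old key so it can never reach the API.
    old_value = migrated.pop(DEPRECATED_PARAM, None)
    if old_value is not None:
        # Logged rather than raised via warnings.warn, so the message is
        # visible without enabling Python warning flags.
        logger.warning("%s is deprecated; use %s instead", DEPRECATED_PARAM, NEW_PARAM)
        # The new parameter takes precedence if both are set.
        if migrated.get(NEW_PARAM) is None:
            migrated[NEW_PARAM] = old_value
    return migrated


def build_query(params: dict) -> str:
    """Build a URL query string from migrated params, skipping unset values."""
    clean = {k: v for k, v in migrate_params(params).items() if v is not None}
    return urlencode(clean)
```

Calling `build_query({"min_end_of_turn_silence_when_confident": 160})` in this sketch yields a query string that contains only the new parameter name, while passing both names keeps the value given for `min_turn_silence`.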

7921 Commits