pipecat

Author	SHA1	Message	Date
kompfner	7fe458fe59	Merge pull request #3817 from pipecat-ai/pk/service-settings-fix-back-compat-for-nested-external-sdk-types Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings`…	2026-02-26 11:08:27 -05:00
Paul Kompfner	faed775d90	Extract `_DeepgramSTTSettingsBase` with shared `_merge_live_options_delta` to deduplicate LiveOptions merge logic between `__init__` and `apply_update`, and between the Deepgram STT and SageMaker variants; make top-level model/language take precedence over conflicting live_options values in updates; remove unnecessary Language enum-to-string conversion (Language is a StrEnum)	2026-02-26 11:02:44 -05:00
Mark Backman	d69a337def	Add text_aggregation_mode parameter to TTSService Move the sentence vs token aggregation concern into text aggregators so all text flows through them regardless of mode. This enables pattern detection and tag handling to work in TOKEN mode. - Add TextAggregationMode enum (SENTENCE, TOKEN) as the user-facing TTS setting, separate from the internal AggregationType - Add TOKEN mode support to Simple, SkipTags, and PatternPair aggregators - Add text_aggregation_mode parameter to TTSService and all TTS subclasses - Deprecate aggregate_sentences in favor of text_aggregation_mode - Merge TTSService._process_text_frame() into a single codepath	2026-02-26 08:55:41 -05:00
Paul Kompfner	8b6aa4b912	Unflatten `LiveOptions` back into a single `live_options` field on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings`; add `apply_update` override with delta-merge semantics and `from_mapping` override for backward-compatible dict-style updates	2026-02-25 18:25:11 -05:00
Mark Backman	69d916ca51	Consume InterimTranscriptionFrame and TranslationFrame in LLMUserAggregator These frames were falling through to the else branch and being pushed downstream, unlike TranscriptionFrame which is explicitly consumed. This aligns with how the assistant aggregator already filters them.	2026-02-24 20:51:41 -05:00
kompfner	03cb0054f9	Merge branch 'main' into pk/service-settings-refactor	2026-02-23 11:46:03 -05:00
Aleix Conchillo Flaqué	827032fefb	Unblock push_interruption_task_frame_and_wait after timeout When the InterruptionFrame does not complete within the timeout the caller was stuck in an infinite loop logging warnings. Now the event is set after the first timeout so the processor can continue. Also adds a keyword timeout parameter so callers can customize the wait duration.	2026-02-20 14:56:42 -08:00
Aleix Conchillo Flaqué	474b27305f	Merge pull request #3748 from pipecat-ai/mb/user-idle-configurable Make UserIdleController always-on with dynamic timeout updates	2026-02-19 11:44:51 -08:00
Aleix Conchillo Flaqué	20509e8f96	Merge pull request #3744 from pipecat-ai/mb/user-idle-timeout-frame Redesign UserIdleController to use BotStoppedSpeakingFrame	2026-02-19 11:34:42 -08:00
Paul Kompfner	94a651cee2	Remove dead `ServiceSettings.to_dict` method	2026-02-17 15:15:18 -05:00
Luke Payyapilli	247f0bbcd3	Fix async generator cleanup to prevent uvloop crash on Python 3.12+	2026-02-17 13:10:31 -05:00
Paul Kompfner	3b1ba57452	Change `apply_update` / `_update_settings` return type from `set[str]` to `dict[str, Any]`. The dict maps each changed field name to its pre-update value, enabling services to do granular diffing of complex settings objects. Existing call-site patterns (`"field" in changed`, `if changed`, iteration) work unchanged; set-difference sites use `changed.keys() - {...}`.	2026-02-17 11:49:15 -05:00
Mark Backman	dba4de77bf	Merge pull request #3684 from ai-coustics/goedev/aic-model-caching AIC model caching	2026-02-16 10:43:14 -05:00
Mark Backman	507765625f	Make UserIdleController always-on with dynamic timeout updates Always create UserIdleController (timeout=0 means disabled), removing all Optional guards. Add UserIdleTimeoutUpdateFrame to allow changing the idle timeout at runtime.	2026-02-14 09:54:30 -05:00
Mark Backman	012ef41ff4	Redesign UserIdleController to use BotStoppedSpeakingFrame Replace the continuous heartbeat-based timer (UserSpeakingFrame/BotSpeakingFrame + asyncio.Event loop) with a simple one-shot timer that starts when BotStoppedSpeakingFrame is received and cancels on UserStartedSpeakingFrame or BotStartedSpeakingFrame. This eliminates false idle triggers caused by gaps between the user finishing speaking and the bot starting to speak (LLM/TTS latency). Guard the timer start with two conditions to prevent false triggers: - User turn in progress: during interruptions, BotStoppedSpeaking arrives while the user is still speaking mid-turn. - Function calls in progress: FunctionCallsStarted arrives before BotStoppedSpeaking because the bot speaks concurrently with the function call starting, so the timer must wait for the result and subsequent bot response.	2026-02-14 08:55:56 -05:00
Paul Kompfner	8a4ab611be	Broad service settings refactor, with the primary aim of making service settings discoverable and strongly-typed. Service settings can be updated at runtime with `UpdateSettingsFrame`s. Does not (yet) touch `InputParams`, to avoid scope creep and touching something currently part of the public API. But there is a lot of overlap between `Settings` object fields and `InputParams` fields. Other than discoverability/typing, these are some other improvements brought by this refactor: - There is now a single code path (see `_update_settings_from_typed`) where services can respond to settings changes (by, say, reconnecting if needed), improving maintainability and guaranteeing one and only one reconnection no matter which settings changed - `set_language`/`set_model`/`set_voice`—which we're assuming are usable as public methods, though not recommended over `UpdateSettingsFrame`—all use the same code path as settings updates. They're also now all consistent in that, if a service needs to respond to a change (by, say, reconnecting if needed), any of these methods will kick off that process. Note that this is technically a behavior change. - Several services now properly react to changed settings by reconnecting: - `AWSTranscribeSTTService` - `AzureSTTService` - `SonioxSTTService` - `GladiaSTTService` - `SpeechmaticsSTTService` - `AssemblyAISTTService` - `CartesiaSTTService` - `FishAudioTTSService` (would previously only reconnect when `model` changed) - `GoogleSTTService` - `SpeechmaticsSTTService` (which previously only handled some* settings updates through a nonstandard public `update_params` method) - `GradiumSTTService` - `NvidiaSegmentedSTTService` (which previously only handled changes to language) - Bookkeeping across various services has been reduced, mostly by deduping ivars; the `self._settings` ivar is treated as the source of truth NOTE: I pretty much guarantee that there are services missed in this PR in terms of bringing to consistency with how updates are handled (like whether changes in certain fields trigger reconnects when they need to). We can squash remaining inconsistencies as we stumble onto them, service by service. The goal here is to get things mostly in order, and establish the infrastructure and patterns we'll need going forward.	2026-02-13 15:12:26 -05:00
Luke Payyapilli	3adb2f50a6	Fix LLMUserAggregator broadcasting mute events before StartFrame	2026-02-13 11:59:56 -05:00
Mark Backman	71a752c971	Add tests for TracingContext and TurnTraceObserver Cover pipeline-scoped tracing context lifecycle, span hierarchy, conversation/turn context management, and concurrent pipeline isolation.	2026-02-11 23:27:35 -05:00
Gökmen Görgen	2036757b84	add unit tests for `AICModelManager` and `AICFilter` error handling, model loading, and processor behavior	2026-02-11 15:22:37 +01:00
Aleix Conchillo Flaqué	93f4402198	Update stream close test to match new _closing helper	2026-02-10 18:19:57 -08:00
filipi87	4a00e6829f	Automated tests for the context summarizer.	2026-02-10 18:58:44 -03:00
filipi87	9d89afa7d4	Automated tests for the context summarization feature.	2026-02-10 18:58:33 -03:00
Mark Backman	981253c703	Rename RequestMetadataFrame to ServiceSwitcherRequestMetadataFrame with service targeting Add a `service` field so the frame targets a specific service, allowing ServiceSwitcher.push_frame to consume it only when the targeted service matches the active service. STTService and test mocks now push the frame downstream after handling instead of silently consuming it.	2026-02-09 16:48:34 -05:00
Aleix Conchillo Flaqué	944ac92593	Fix test_langchain to use explicit stop strategy The default stop strategy changed to TurnAnalyzerUserTurnStopStrategy, which requires actual audio analysis. Use SpeechTimeoutUserTurnStopStrategy explicitly since this test is not testing turn detection.	2026-02-09 12:00:41 -08:00
Aleix Conchillo Flaqué	2a572aedba	Simplify ServiceSwitcher with closure-based filters - Make ServiceSwitcherStrategy inherit from BaseObject with properties for services and active_service, and move initial service selection into the base class - Add on_service_switched event to ServiceSwitcherStrategy - handle_frame now returns the switched-to service (or None), allowing ServiceSwitcher to swallow ManuallySwitchServiceFrame on switch and request metadata from the new active service - Override push_frame to suppress RequestMetadataFrame and ServiceMetadataFrame from inactive services - Remove ServiceSwitcherFilter and ServiceSwitcherFilterFrame in favor of plain FunctionFilter instances with closures that check the strategy's active service directly - FunctionFilter: add FilterType alias - FunctionFilter: when direction is None, frames in both directions are filtered instead of just one - Add docstrings to ServiceSwitcher and its components	2026-02-09 14:12:33 -05:00
Mark Backman	34b068d657	Improve user turn stop timing by triggering timeout from VAD stop Refactor TranscriptionUserTurnStopStrategy and TurnAnalyzerUserTurnStopStrategy to use VADUserStoppedSpeakingFrame as the ground truth for when speech ended, rather than triggering timeouts from transcription frames.	2026-02-09 14:12:33 -05:00
Gökmen Görgen	67d39a97f7	AIC model caching.	2026-02-09 11:51:28 +01:00
Aleix Conchillo Flaqué	4945cfbd8f	Buffer internal frames during ParallelPipeline lifecycle synchronization Processors inside parallel sub-pipelines can push frames during StartFrame/EndFrame/CancelFrame processing. Previously these frames could escape the ParallelPipeline before all branches finished processing the lifecycle frame. Now they are buffered and flushed after synchronization completes.	2026-02-06 15:15:46 -08:00
Mark Backman	d7b1624d3c	Merge pull request #3663 from lukepayyapilli/fix/stream-close-sambanova-google fix: close stream on cancellation for SambaNova and Google OpenAI services	2026-02-06 14:02:31 -05:00
Aleix Conchillo Flaqué	d5105a78e6	STTMuteFilter should call frame.complete() when InterruptionFrame is blocked	2026-02-06 10:11:00 -08:00
Aleix Conchillo Flaqué	a352b2d7a0	Add tests for InterruptionFrame completion event Add tests for the event-based interruption completion: complete() sets the event, complete() is safe without an event, the event fires at the pipeline sink, and a warning is logged when the frame is blocked. Also remove the unconditional await after the timeout so the function returns instead of hanging when complete() is never called.	2026-02-06 09:57:24 -08:00
Luke Payyapilli	29c53b99a4	fix: close stream on cancellation for SambaNova and Google OpenAI services	2026-02-06 10:02:40 -05:00
Mark Backman	fa85f7bbc7	Merge pull request #3640 from lukepayyapilli/fix/openai-stream-close fix: close stream on cancellation to prevent socket leaks	2026-02-05 18:00:06 -05:00
Derek Haynes	f6c919354f	Add test for user bot latency	2026-02-05 14:29:45 -05:00
Luke Payyapilli	55a3b10e70	fix(openai): close stream on cancellation to prevent socket leaks	2026-02-04 09:59:10 -05:00
Mark Backman	2db3d94d06	Merge pull request #3628 from pipecat-ai/mb/broadcast-speech-control-params-frame Fix: Broadcast SpeechControlParamsFrame from VADController	2026-02-03 18:44:15 -05:00
Mark Backman	2a26b9f7a3	Fix: Broadcast SpeechControlParamsFrame from VADController	2026-02-03 18:40:39 -05:00
Mark Backman	84ca0b6d58	Merge pull request #3629 from pipecat-ai/fix/telephony-websocket-stopasynciteration Fix StopAsyncIteration in parse_telephony_websocket	2026-02-03 12:10:07 -05:00
Luke Payyapilli	8d3e10f054	Make EndFrame and StopFrame uninterruptible to prevent pipeline freeze	2026-02-03 09:12:59 -05:00
James Hush	90bead06ab	Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-03 16:42:13 +08:00
James Hush	b427d534ae	Add tests for parse_telephony_websocket StopAsyncIteration handling Tests cover: - No messages received (raises ValueError) - One message received (logs warning, continues) - Two messages received (normal operation) - All telephony providers (Twilio, Telnyx, Plivo, Exotel) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-03 16:33:36 +08:00
James Hush	763002f2bc	Fix sentence splitting for CJK and other non-Latin languages in TTS pipeline NLTK's sent_tokenize() only supports ~15 European languages and defaults to English. For Japanese, Chinese, Korean, Hindi, Arabic, and other non-Latin languages, NLTK fails to recognize sentence boundaries like 。？！ causing text to accumulate until flush instead of being emitted sentence-by-sentence. Add a fallback in match_endofsentence() that scans for unambiguous non-Latin sentence-ending punctuation when NLTK fails to split the text. Latin punctuation (. ! ? ; …) is excluded from the fallback since NLTK handles those correctly and they can be ambiguous (abbreviations, decimals, etc.). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 14:27:49 +08:00
Mark Backman	e779233918	Fix IVRNavigator to push AggregatedTextFrame when switching to conversation mode	2026-01-30 21:07:49 -05:00
Mark Backman	63a23246d5	Add UserTurnCompletionLLMServiceMixin (#3518) * Added UserTurnCompletionLLMServiceMixin class * Added 22-filter-incomplete-turns.py foundational example * Removed old 22 natural conversation foundational examples * Added test_user_turn_completion_mixin.py	2026-01-30 14:57:15 -05:00
Aleix Conchillo Flaqué	305ab44132	tests: add unittest.main() call	2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué	b486f35c70	audio: add new VADProcessor	2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué	ddfedaf478	audio(vad): add new VADController	2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué	7eabaaa0ef	FrameProcessors: do not deepcopy fields when broadcasting frames	2026-01-29 11:47:57 -08:00
Aleix Conchillo Flaqué	f3b72e9263	Merge pull request #3585 from pipecat-ai/aleix/improve-piper-tts-support improve Piper TTS support	2026-01-29 08:36:13 -08:00
Mark Backman	b77a50de73	Merge pull request #3529 from lukepayyapilli/fix/llm-timeout-without-retry feat: handle exceptions for BaseOpenAILLMService	2026-01-29 09:12:54 -05:00
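
Commit 3b1ba57452 changes the return type of `apply_update` from `set[str]` to `dict[str, Any]` and states that existing call-site patterns keep working. The sketch below illustrates why that can hold. `DemoSettings` and this standalone `apply_update` function are hypothetical stand-ins, not pipecat's actual classes or signatures, and treating `None` as "field not given" is an assumption made for the example.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class DemoSettings:
    # Hypothetical settings object; None means "not given" in an update.
    model: Optional[str] = None
    language: Optional[str] = None
    voice: Optional[str] = None


def apply_update(current: DemoSettings, update: DemoSettings) -> dict[str, Any]:
    """Apply the non-None fields of `update` to `current`.

    Returns a dict mapping each changed field name to its pre-update value.
    """
    changed: dict[str, Any] = {}
    for f in fields(update):
        new_value = getattr(update, f.name)
        if new_value is None:
            continue  # field not present in this update
        old_value = getattr(current, f.name)
        if new_value != old_value:
            changed[f.name] = old_value
            setattr(current, f.name, new_value)
    return changed


settings = DemoSettings(model="model-a", language="en", voice="voice-1")
changed = apply_update(settings, DemoSettings(model="model-b", language="en"))

# Patterns written against the old set[str] return value still work on a dict:
model_changed = "model" in changed   # membership test checks keys
anything_changed = bool(changed)     # an empty dict is falsy, like an empty set
changed_names = sorted(changed)      # iterating a dict yields field names

# Set difference needs .keys(), because dict itself has no "-" operator:
other_changes = changed.keys() - {"model"}
```

The extra information is the old value: here `changed["model"]` is `"model-a"`, which a service could compare against the new value to decide how much work a change requires.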

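Commit 763002f2bc describes a fallback that scans for unambiguous non-Latin sentence-ending punctuation when NLTK does not split the text. A rough sketch of that idea follows, without the NLTK step. The function name `find_non_latin_sentence_end`, its return convention, and the exact punctuation set are assumptions for illustration and are not taken from pipecat's `match_endofsentence()`.

```python
# Sentence-ending marks that are unambiguous outside Latin scripts.
# "。？！" appear in the commit message; the Devanagari danda (U+0964) and the
# Arabic question mark (U+061F) are illustrative additions, not from pipecat.
NON_LATIN_SENTENCE_ENDINGS = frozenset("。？！" + "\u0964\u061f")


def find_non_latin_sentence_end(text: str) -> int:
    """Return the index just past the first non-Latin sentence ending.

    Returns 0 when no such mark is found, so the caller can keep buffering.
    """
    for i, ch in enumerate(text):
        if ch in NON_LATIN_SENTENCE_ENDINGS:
            return i + 1
    return 0


text = "こんにちは。元気ですか？"
end = find_non_latin_sentence_end(text)
first_sentence = text[:end]   # text up to and including the first "。"
remainder = text[end:]        # left in the buffer for the next call

# Latin punctuation is deliberately ignored: "." here is an abbreviation,
# which is the kind of ambiguity the commit leaves to NLTK.
latin_end = find_non_latin_sentence_end("Dr. Smith arrived")
```

Excluding `. ! ? ;` from the set mirrors the reasoning in the commit message: those marks can be ambiguous (abbreviations, decimals), so the fallback only covers marks that almost always end a sentence.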

312 Commits