pipecat

Author	SHA1	Message	Date
Aleix Conchillo Flaqué	952dddca8b	Replace llm_completion_user_turn_stop_strategies() with FilterIncompleteUserTurnStrategies Wrap the detector chain with `deferred(...)` and append the LLM completion gate via a `UserTurnStrategies` specialization rather than a free-standing helper, mirroring the existing `ExternalUserTurnStrategies` pattern. The class lives next to other strategy containers in `pipecat.turns.user_turn_strategies`, so users discover it where they're already configuring `user_turn_strategies`. The deprecated `filter_incomplete_user_turns` flag now rewires through `FilterIncompleteUserTurnStrategies` under the hood, keeping the migration path identical to before. `deferred(...)` stays public as the explicit escape hatch for non-default compositions.	2026-05-07 17:47:39 -07:00
Aleix Conchillo Flaqué	480eca42f5	Split user-turn-stop into inference-triggered and finalized events Fixes a real bug: with `filter_incomplete_user_turns` enabled, the smart-turn detector's tentative stop was firing `on_user_turn_stopped` before the LLM had a chance to veto it. Observers, transcript appenders and UI indicators received an early — and sometimes duplicated — signal. Decomposes the single stop concern into two events: - `on_user_turn_inference_triggered` fires when a stop strategy has enough signal to start LLM inference. The aggregator pushes the context here, kicking off the LLM call. - `on_user_turn_stopped` fires only when the user turn is semantically final. Built-in strategies fire both events at the same call site, preserving today's behavior for the common case. Adds `LLMTurnCompletionUserTurnStopStrategy`, which gates finalization on a `UserTurnCompletedFrame` (a fieldless system frame emitted by any component judging turn completeness — currently the `UserTurnCompletionLLMServiceMixin` on `✓`). Adds `deferred(strategy)` / `DeferredUserTurnStopStrategy`, a thin wrapper that forwards an inner strategy's events except `on_user_turn_stopped`. Use this to install a stop strategy as an inference trigger only, leaving finalization to a peer (e.g. the LLM completion strategy). Adds `llm_completion_user_turn_stop_strategies()` for the common case: UserTurnStrategies( stop=llm_completion_user_turn_stop_strategies(), ) Deprecates `LLMUserAggregatorParams.filter_incomplete_user_turns`. The aggregator emits a `DeprecationWarning`, wraps existing stop strategies with `deferred(...)`, and appends `LLMTurnCompletionUserTurnStopStrategy` automatically.	2026-05-07 17:46:09 -07:00
Paul Kompfner	2616076bec	Add deterministic dev-error demo example ``examples/function-calling/function-calling-missing-handler.py`` demonstrates the missing-handler path by deliberately advertising a tool to the LLM without registering its handler — what happens when a developer forgets to call ``register_function``. Exercises the new ``logger.error`` severity end-to-end without needing to coax the LLM into hallucinating.	2026-05-05 13:08:00 -04:00
Paul Kompfner	e06e0c0282	Mitigate tool-call-related hallucination When tools change mid-conversation, LLMs can produce a few different flavors of tool-call-related hallucination: calling tools that have been removed, avoiding tools that have been re-added, or hallucinating output (made-up answers or tool-call-shaped non-tool-calls) when tools are unavailable. This change introduces an opt-in ``add_tool_change_messages`` flag on the LLM aggregators (preferred entry point: ``LLMContextAggregatorPair( ..., add_tool_change_messages=True)``) that appends a developer-role message to the context whenever ``LLMSetToolsFrame`` changes the set of advertised standard tools. Helps the LLM stay coherent across tool changes by spelling out exactly what just became available or unavailable. Both aggregators participate; whichever handles the frame first wins, and the other (if any) sees an empty diff against the shared context and stays silent — order-independent regardless of whether the frame flows downstream or upstream. Also tightens the existing missing-handler path (introduced in #4301): - Reworded the terminal tool result to a neutral "The function ``X`` is not currently available." (overridable via ``LLMService.MISSING_FUNCTION_CALL_MESSAGE_TEMPLATE``). Previously read "Error: function 'X' is not registered." - Logs at the call site now distinguish developer error (tool advertised but no handler registered → ``logger.error``) from hallucination (tool not advertised → ``logger.warning``). Includes a manual validation harness (``examples/features/features-add-tool-change-messages.py``) that exercises the new ``add_tool_change_messages`` mitigation by flipping tool availability on a turn counter so its effect can be observed end-to-end with the flag on vs. off.	2026-05-05 13:02:43 -04:00
Paul Kompfner	1b5c4cfa2a	feat: broaden tool_resources to app_resources Broaden `tool_resources` to `app_resources` for easy access not just in tool handlers but in other places like custom `FrameProcessor`s. Involves 3 changes: - A rename: `tool_resources` -> `app_resources` - A new property on `PipelineTask`: `app_resources` - A new property on `FrameProcessor`: `pipeline_task` Usage in tool handler: async def get_weather(params: FunctionCallParams): resources = cast(MyAppResources, params.app_resources) ... Usage in custom `FrameProcessor`: class MyProcessor(FrameProcessor): async def process_frame(self, frame, direction): await super().process_frame(frame, direction) if self.pipeline_task is not None: resources = cast(MyAppResources, self.pipeline_task.app_resources) ... The previous `tool_resources` aliases (on `PipelineTask`, `FunctionCallParams`, and `FrameProcessorSetup`) keep working but are deprecated as of 1.2.0 and emit `DeprecationWarning`s.	2026-04-30 16:16:17 -04:00
Mark Backman	58a038ddb2	Add Soniox real-time TTS service Introduce SonioxTTSService, a WebSocket TTS provider that streams text and receives audio over a persistent connection, multiplexing up to 5 concurrent streams per socket via Soniox's `stream_id`. Also updates the README service table and the Soniox voice example to use the new TTS end-to-end.	2026-04-27 16:04:02 -04:00
Paul Kompfner	124863175a	Add example demonstrating usage of `tool_resources`	2026-04-27 11:20:53 -04:00
kompfner	86effc4d10	Merge pull request #4015 from prettyprettyprettygood/feat/nova-sonic-session-continuation feat(nova-sonic): add proactive session continuation for conversation…	2026-04-27 09:36:48 -04:00
Gökmen Görgen	3bbfc42854	remove adaptive audio enhancement example and support for runtime enhancement level updates in `AICFilter`.	2026-04-25 10:05:47 +02:00
Gökmen Görgen	3b2127f912	rename environment variables and references from `AICOUSTICS` to `AIC`.	2026-04-25 09:51:23 +02:00
Gökmen Görgen	ea12b10742	rename `mcp-aic-adaptive.py` to `mcp-aicoustics-adaptive.py`.	2026-04-25 09:51:23 +02:00
Gökmen Görgen	a2fbed86cf	add adaptive audio enhancement example and support for runtime enhancement level updates in `AICFilter`.	2026-04-25 09:51:23 +02:00
Gökmen Görgen	f75f361629	bump `aic-sdk` to 2.2.0 and update `AICFilter` with `model_id` and `enhancement_level` changes.	2026-04-25 09:51:23 +02:00
Osman Ipek	f1b16a672a	feat(nova-sonic): add proactive session continuation for conversations >8min Nova Sonic sessions have an AWS-imposed ~8-minute time limit. This adds transparent session continuation that rotates sessions in the background before the limit is reached, preserving conversation context with no user-perceptible interruption. Implementation follows the AWS reference architecture: - Monitor loop detects when session age exceeds threshold - On assistant AUDIO contentStart: start buffering user audio, create next session (sessionStart + promptStart + system instruction) - Track SPECULATIVE/FINAL text counts as completion signal - On completion signal: send conversation history + audioInputStart + buffered audio to next session, then promote immediately - Close old session in background (non-blocking) - Dead session detection: recreate next session if idle >30s Key design decisions: - Session continuation enabled by default (fundamental for long conversations) - Conversation history tracked in real-time via _sc_conversation_history (independent of pipeline context aggregator which updates asynchronously) - Completion signal check in _handle_content_end_event (after history update) to ensure latest text is included in handoff - Rolling audio buffer (default 3s) captures user audio during transition - transition_threshold_seconds capped at 420s (7min) for safety margin - Unified event methods (_send_text_event, _send_client_event, etc.) accept optional stream/prompt_name params, eliminating duplicate SC methods Also adds: - SessionContinuationParams config (enabled, threshold, buffer, timeout)	2026-04-24 14:55:55 -07:00
Mark Backman	d8f5c0be71	Add XAITTSService for xAI streaming WebSocket TTS Adds XAITTSService in the existing xai/tts.py module, alongside the existing XAIHttpTTSService. Connects to xAI's streaming endpoint at wss://api.x.ai/v1/tts, streams text.delta chunks up and base64 audio.delta chunks down on the same connection so audio starts flowing before the full utterance is synthesized. Extends InterruptibleTTSService since xAI's protocol is strictly sequential per connection and exposes neither a cancel verb nor a context ID — the only way to stop an in-flight utterance is to tear down the WebSocket, which is exactly what InterruptibleTTSService does on interruption when the bot is speaking. Voice, language, codec, and sample_rate are passed as query-string params at connect time; runtime setting changes reconnect the socket. Defaults to raw PCM so emitted TTSAudioRawFrame objects need no decoding downstream. Splits the existing example into voice-xai.py (WebSocket) and voice-xai-http.py (batch HTTP) so each variant has its own entry point. Promotes the xai extra to depend on pipecat-ai[websockets-base] since the new service imports the websockets library.	2026-04-21 15:48:26 -04:00
Mark Backman	58a17c7b1b	Include examples in type checking Remove `examples/` from the `pyrightconfig.json` ignore list and fix the resulting type errors across all example files. Common fixes: - Required API keys: `os.getenv("X")` -> `os.environ["X"]` so the return type is `str` rather than `str | None`, and misconfiguration fails fast. - Narrow `LLMContextMessage` union members with `isinstance(..., dict)` before dict-style access. - `assert isinstance(params.llm, ...)` before calling service-specific methods that aren't on the base `LLMService`. - Guard optional frame fields (e.g. `LLMSearchResponseFrame.search_result`) before use.	2026-04-21 15:43:31 -04:00
Mark Backman	b838bd906b	Add changelog for #4340	2026-04-21 13:45:34 -04:00
Mark Backman	c091232f2f	Add xAI streaming STT service New `XAISTTService` wraps xAI's real-time speech-to-text WebSocket (`wss://api.x.ai/v1/stt`). It extends `WebsocketSTTService`, authenticates with the `XAI_API_KEY` as a Bearer token on the WS handshake, and streams raw audio (PCM/mu-law/A-law) with configurable interim results, endpointing, language, multichannel, and diarization settings. - `src/pipecat/services/xai/stt.py`: new service, settings dataclass, and `language_to_xai_stt_language` helper. - `src/pipecat/services/stt_latency.py`: `XAI_TTFS_P99` default. - `pyproject.toml` / `uv.lock`: `xai` extra now pulls in `websockets-base`. - `README.md`: link to xAI STT in the services table. - `examples/voice/voice-xai.py`: swap DeepgramSTTService for XAISTTService so the xAI voice example is fully xAI. - `examples/transcription/transcription-xai.py`: new transcription-only example using the new service.	2026-04-21 13:45:34 -04:00
Paul Kompfner	81571beb1b	Use ExternalUserTurnStrategies, as expected, in a Deepgram Flux example	2026-04-21 10:51:59 -04:00
Mark Backman	42a6fc703c	Address review feedback - Fall back to Language.EN in _primary_detected_language when model is flux-general-en, preserving prior behavior on the default model. - Standardize example on DeepgramFluxSTTService.Settings and drop the now-redundant DeepgramFluxSTTSettings import. - Narrow the changed-behavior changelog to reflect that flux-general-en frames still carry Language.EN.	2026-04-17 15:38:14 -04:00
Mark Backman	6bb4e8295f	Add multilingual support for Deepgram Flux STT Enables the flux-general-multi model with one or more language_hints. Hints are sent as repeatable URL params at connect time and via a Configure control message when updated mid-stream (detect-then-lock). TranscriptionFrame.language now reflects the language Flux detected for each turn via the TurnInfo `languages` field.	2026-04-17 10:30:45 -04:00
Garegin Harutyunyan	4c19f5584c	VIVA SDK TT v3 support (#4252) * VIVA SDK TT v3 support * Format fix. * Renamed the API naming, removed '3' from the name. * Implementation of User turn start strategy using Krisp VIVA Interruption Prediction in scope of TT v3 support. * Typo fix in voice-krisp-viva example to use KrispVivaFilter class * style fix. * test run error fixes. * some test related changes. * Fixed tests * Style fixes.	2026-04-17 07:53:41 -04:00
Aleix Conchillo Flaqué	b3bb6fdaa5	Modernize Python typing across the codebase Automated via ruff UP006, UP007, UP035, UP045 rules (target: py311): - Replace `typing.List`, `Dict`, `Tuple`, `Set`, `FrozenSet`, `Type` with their built-in equivalents (`list`, `dict`, `tuple`, etc.) - Replace `typing.Optional[X]` with `X | None` - Replace `typing.Union[X, Y]` with `X | Y` - Move `Mapping`, `Sequence`, `Callable`, `Awaitable`, `MutableMapping`, `MutableSequence`, `Iterator`, `AsyncIterator`, `AsyncGenerator` imports from `typing` to `collections.abc` - Remove now-unused `typing` imports - Add `from __future__ import annotations` to 5 files that use forward-reference strings in `X | "Y"` annotations	2026-04-16 09:28:23 -07:00
Mark Backman	7291026695	Update Tavus transport example Show how to use on_connected event handler to obtain Daily room URL	2026-04-15 23:04:31 -04:00
Mark Backman	9ffcccdd84	Merge pull request #4253 from pipecat-ai/mb/mistral-stt Add Mistral Voxtral Realtime STT service	2026-04-15 09:00:27 -04:00
Filipi da Silva Fuchter	b1204cc430	Merge pull request #4241 from pipecat-ai/filipi/async_tools_cancellable Enable async tool cancellation feature.	2026-04-10 15:28:01 -03:00
filipi87	c542167065	Refactored on_function_calls_cancelled to use FunctionCallFromLLM.	2026-04-10 15:06:39 -03:00
filipi87	8cce25d2d2	Fixing openai examples.	2026-04-10 08:25:50 -03:00
filipi87	891f00cb5f	Using the on_function_calls_cancelled inside the examples.	2026-04-10 07:45:20 -03:00
filipi87	346c585290	Enabling the option to cancel the tools for all the async examples.	2026-04-10 07:31:51 -03:00
jp-lemon	c134110399	LemonSlice transport updates	2026-04-10 07:10:41 -03:00
filipi87	2dd1170229	Updating the Anthropic stream example to allow cancelling the location tracking.	2026-04-09 17:26:51 -03:00
Cale Shapera	ec574edd53	Add Inworld Realtime Service (#4140) * Add Inworld Realtime LLM service Adds a WebSocket-based realtime service for Inworld's cascade STT/LLM/TTS API with semantic VAD, function calling, and streaming transcription support. New files: - src/pipecat/services/inworld/realtime/ (service, events) - src/pipecat/adapters/services/inworld_realtime_adapter.py - examples/foundational/19zb-inworld-realtime.py Also includes: - websockets dependency for inworld extra in pyproject.toml - Adapter and settings tests matching OpenAI/Grok realtime patterns - Fix for double-response when server-side VAD is enabled * Prefer init-provided system instruction in Inworld Realtime Adopt _resolve_system_instruction() from BaseLLMAdapter, matching the pattern applied to OpenAI Realtime, Grok Realtime, Gemini Live, and Nova Sonic in the pk/realtime-services-init-v-context-system-instructions-cleanup branch. * Update changelog entry with PR number * Fix changelog format to use bullet point * Polish PR: default model, example cleanup, changelog update - Change default model from gpt-4.1-nano to gpt-4.1-mini - Add function calling demo to example - Remove demo-testing artifact from system instruction - Mention Router support in changelog * Address PR review feedback for Inworld Realtime - Move example to examples/realtime/realtime-inworld.py - Change initial context role from "user" to "developer" - Remove explicit sample rates from example; sync them in _ensure_audio_config so Inworld gets the transport's actual rates - Add audio race condition guard in _handle_evt_audio_delta (matches OpenAI realtime pattern) - Convert remaining "system"/"developer" messages to "user" in adapter - Add clarifying comment for local-VAD vs server-VAD metrics paths * Simplify example, add provider tracking, remove local VAD path - Remove function calling from example, switch model to xai/grok-4-1-fast-non-reasoning - Add pipecat-realtime session key prefix and provider_data metadata for Inworld traffic attribution - Remove local VAD code path (Inworld only supports server-side VAD) - Use typed InputAudioBufferAppendEvent for audio sends * Default TTS model to inworld-tts-1.5-max * Remove dead shimmed tools code, set STT/VAD defaults - Remove non-functional AdapterType.SHIM custom tools code from adapter - Default STT model to assemblyai/u3-rt-pro - Default VAD eagerness to low	2026-04-09 13:04:17 -04:00
filipi87	edc197d050	Creating a new example for async stream using Google.	2026-04-09 09:50:00 -03:00
filipi87	7ece8e3c4a	Creating a new example for async stream using Anthropic.	2026-04-09 09:41:07 -03:00
filipi87	a544f885a3	Added new examples: function-calling-openai-async-stream.py and function-calling-openai-responses-async-stream.py	2026-04-09 09:04:06 -03:00
Mark Backman	68a3070ad4	Add Mistral Voxtral Realtime STT service	2026-04-07 15:26:56 -04:00
Mark Backman	0acfb4dd49	Merge pull request #4251 from pipecat-ai/mb/mistral-tts Add Mistral Voxtral streaming TTS service	2026-04-07 12:50:48 -04:00
Mark Backman	aa7a014518	Add mistral voice example	2026-04-07 12:32:06 -04:00
Filipi da Silva Fuchter	6eccd16543	Merge pull request #4217 from pipecat-ai/filipi/async_tools Supporting async function calls.	2026-04-07 09:35:03 -03:00
filipi87	d8dc6bc7d0	New example for async function calls using Google.	2026-04-07 09:31:22 -03:00
filipi87	d12a8529e2	New example for async function calls using OpenAI responses.	2026-04-07 09:28:01 -03:00
filipi87	aa061f7e2c	Renaming the openai and anthropic examples to async instead of delayed.	2026-04-07 09:23:45 -03:00
Filipi da Silva Fuchter	e863293198	Improving docstring description. Co-authored-by: kompfner <paul@daily.co>	2026-04-07 08:14:39 -04:00
Filipi da Silva Fuchter	a451c42dc7	Merge pull request #4247 from pipecat-ai/filipi/background_sound_example Fixing the background sound example.	2026-04-07 09:06:14 -03:00
filipi87	ceaa27ee6e	Fixing the background sound example.	2026-04-06 18:25:30 -03:00
Mark Backman	916af84974	Remove DeprecatedModuleProxy and service re-export shims Remove the deprecation proxy infrastructure that allowed old-style flat imports (e.g. `from pipecat.services.openai import OpenAILLMService`). Users must now import from specific submodules (`from pipecat.services.openai.llm import OpenAILLMService`), which is already the established pattern across all internal code and 179+ examples. - Strip 32 proxy `__init__.py` files to empty - Strip 3 non-proxy files with bare star imports (minimax, sambanova, sarvam) - Strip google/gemini_live `__init__.py` re-exports - Remove DeprecatedModuleProxy class and helpers from services/__init__.py - Remove ruff per-file ignore for services/__init__.py - Fix 2 examples using old-style imports	2026-04-03 13:43:02 -04:00
Mark Backman	c2358b273b	Use Parameters instead of Attributes in docstrings to fix duplicate object warnings Napoleon's Attributes section creates class-level attribute docs that duplicate the __init__ parameter docs when napoleon_include_init_with_doc is enabled. Using Parameters avoids the duplication.	2026-04-03 10:36:36 -04:00
Mark Backman	8adb38f87c	Remove unused imports across codebase	2026-04-02 22:21:16 -04:00
vipyne	1d7404ef21	Update MCP examples	2026-04-02 18:15:56 -05:00
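
Commit 58a17c7b1b above replaces `os.getenv("X")` with `os.environ["X"]` for required API keys. The snippet below is a minimal standard-library sketch of that difference, not code from the repository: the helper names `optional_key` and `required_key` and the variable name `EXAMPLE_API_KEY` are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import os


def optional_key(name: str) -> str | None:
    # os.getenv returns None when the variable is unset, so a type checker
    # sees `str | None` and every caller has to handle the None case.
    return os.getenv(name)


def required_key(name: str) -> str:
    # os.environ[...] raises KeyError when the variable is unset, so the
    # return type is plain `str` and a missing key fails at startup.
    return os.environ[name]


if __name__ == "__main__":
    # EXAMPLE_API_KEY is a hypothetical variable name used for illustration.
    os.environ["EXAMPLE_API_KEY"] = "secret"
    print(required_key("EXAMPLE_API_KEY"))
    del os.environ["EXAMPLE_API_KEY"]
    print(optional_key("EXAMPLE_API_KEY"))
```

With the variable unset, `required_key` raises `KeyError` immediately, whereas `optional_key` quietly returns `None` and defers the failure to wherever the value is first used.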

1890 Commits