pipecat

Author	SHA1	Message	Date
Mark Backman	1c94feaaff	Inject <ui_state> via the LLM's on_before_process_frame hook Move <ui_state> snapshot injection out of respond_with_llm into a cross-cutting on_before_process_frame handler on the UIWorker's LLM, so it appends the current snapshot to the context the request is built from, just before each inference. Injection is gated to the user-turn-initiating inference so a tool-calling turn never stacks duplicate <ui_state> blocks; respond_with_llm no longer injects manually. Also drop the bridged parameter from UIWorker: there is no viable way to bridge a UIWorker between workers — a shared, teed context would be polluted by the injection, and per-worker turn detection off teed frames isn't supported. Other workers keep their PipelineWorker bridging.	2026-05-21 23:20:40 -04:00
Mark Backman	f1f5a986e8	Add UIWorker UIWorker is an LLMContextWorker that observes and drives a client GUI over the RTVI UI channel: it stores accessibility snapshots, auto-injects <ui_state> at the start of each respond job, dispatches client events to @on_ui_event handlers, sends UI commands back to the client, and surfaces fan-out work as cancellable task cards via user_job_group(). The optional ReplyToolMixin exposes a bundled reply tool. The prompt_guide parameter auto-appends the UI wire-format guide to the LLM's system instruction (default UI_STATE_PROMPT_GUIDE; override with a string or disable with None), so the LLM can parse the injected <ui_state> / <ui_event> messages without the app concatenating the guide by hand.	2026-05-21 23:20:40 -04:00
Mark Backman	02667a7255	Add native RTVI⇄bus UI bridge to PipelineWorker When RTVI is enabled, PipelineWorker now republishes inbound ui-event / ui-snapshot / ui-cancel-task messages onto the bus as a broadcast BusUIEventMessage, and translates outbound BusUICommandMessage / BusUITask* carriers into the matching RTVI frames. This lets a UIWorker on the bus observe and drive the client UI with no decorator or manual wiring; when no UIWorker is present the events are simply unconsumed. The BusUI* carriers live in the bus layer so both pipeline and workers can reference them without an import cycle.	2026-05-21 23:03:37 -04:00
Mark Backman	ee3d1128ec	Add LLMService.append_system_instruction() Composes durable text onto a user-provided system instruction (alongside the turn-completion and async-tool-cancellation addons) so it is prepended on every inference and survives context-message resets. The user's base prompt is now snapshotted once and the effective instruction is always rebuilt from it, replacing the prior lazy capture/restore logic with a single invariant.	2026-05-21 23:03:37 -04:00
Aleix Conchillo Flaqué	e8ec7c585f	Rename PipelineRunner.add_worker() to variadic add_workers(*workers) Lets callers register multiple workers in a single call instead of awaiting add_worker() repeatedly. Updates all examples, docs, tests, and proxy worker docstrings to use the new API.	2026-05-21 19:46:53 -07:00
Aleix Conchillo Flaqué	d07ba562eb	Separate bus messages from pipeline frames BusMessage was a mixin tacked onto DataFrame / SystemFrame so the bus could reuse the frame priority machinery. That made every bus message also a Frame, which is misleading — bus messages travel on the bus, not through pipelines. If a worker actually needs to ship a frame, it wraps it in BusFrameMessage. BusMessage is now a plain dataclass base carrying source/target. BusDataMessage and BusSystemMessage are empty subclasses that exist only as priority markers. The bus router and the priority queue check ``isinstance(item, BusSystemMessage)`` directly instead of ``isinstance(item, SystemFrame)``. The serializer test that round-tripped DataFrame.name (a non-init field) is rewritten against a local _MessageWithNonInit(BusDataMessage) subclass so the serializer's init=False path stays covered.	2026-05-21 19:07:13 -07:00
Aleix Conchillo Flaqué	b03247f360	Rename BaseTask → BaseWorker and reserve "task" for asyncio Replaces every "task" identifier that referred to the BaseTask abstraction with "worker". Asyncio task plumbing (asyncio.Task, BaseTaskManager, TaskManager, create_task, cancel_task, etc.) stays untouched. Highlights: - Classes: BaseTask → BaseWorker, PipelineTask → PipelineWorker, LLMTask → LLMWorker, LLMContextTask → LLMContextWorker, TaskBus → WorkerBus, TaskRegistry → WorkerRegistry, TaskActivationArgs → WorkerActivationArgs, TaskReadyData → WorkerReadyData, TaskRegistryEntry → WorkerRegistryEntry, TaskObserver → WorkerObserver, all BusTaskMessage → BusWorkerMessage, BusAddTaskMessage.task field → worker, BusWorkerRegistryMessage.tasks field → workers. - Methods/decorators: activate_task → activate_worker, deactivate_task → deactivate_worker, add_task → add_worker, watch_task → watch_worker, @task_ready → @worker_ready, setup_pipeline_task hook → setup_pipeline_worker. - Params/fields: FrameProcessorSetup.pipeline_task and FunctionCallParams.pipeline_task → pipeline_worker. Parameter names like task_name → worker_name; spawn/run accept worker:. - Files: pipeline/base_task.py → base_worker.py, pipeline/task.py → worker.py (plus a re-export shim at pipeline/task.py), task_observer.py → worker_observer.py, task_ready_decorator.py → worker_ready_decorator.py, pipecat.tasks → pipecat.workers, llm_task.py → llm_worker.py, llm_context_task.py → llm_context_worker.py, examples/multi-task → examples/multi-worker. Back-compat: - PipelineTask kept as a deprecated subclass of PipelineWorker that warns on construction. - pipecat.pipeline.task re-exports PipelineWorker/PipelineTask/etc. so existing user imports keep working. - FrameProcessor.pipeline_task kept as a deprecated property that forwards to pipeline_worker. Local variables in examples that hold a worker (task = PipelineTask(...)) are renamed to worker = PipelineWorker(...). Asyncio-task locals (runner_task, etc.) are preserved.	2026-05-21 19:07:13 -07:00
Aleix Conchillo Flaqué	d8947c68a9	Rename BaseTask.send_message to send_bus_message Mirrors on_bus_message and makes it explicit that the call goes out on the task bus, not on a transport (transports have their own send_message for client/peer messaging).	2026-05-21 10:13:21 -07:00
Aleix Conchillo Flaqué	373894fc65	Fold BaseTask.handoff_to into activate_task(deactivate_self=...) BaseTask.handoff_to was just deactivate_self + activate_task. Remove it and add a deactivate_self flag on activate_task instead, so there's one entry point for activating another task. LLMTask now overrides activate_task (mirroring its end() override) to keep the messages / result_callback hooks that finish an in-progress tool call before the target is activated. All multi-task examples and unit tests switch to the new call.	2026-05-21 10:13:21 -07:00
Aleix Conchillo Flaqué	9ecb00d097	Skip pgmq/redis lazy-import tests when their extras are not installed ``test_pgmq_bus_lazy_import`` and ``test_redis_bus_lazy_import`` import ``pipecat.bus.network.pgmq`` / ``redis`` directly, which raises when the optional ``pgmq`` / ``redis`` packages are missing. Gate each test with ``@unittest.skipUnless`` on a top-level probe of the underlying package so they're skipped (not errored) in environments without the extras. ``test_unknown_attribute_raises`` is unaffected.	2026-05-21 10:13:21 -07:00
Aleix Conchillo Flaqué	79ae9740cc	Skip pgmq/redis bus tests when their extras are not installed The PGMQ and Redis bus modules raise an ``Exception`` at import time when the optional ``pgmq`` / ``redis`` packages are missing, which broke ``pytest`` collection in environments without those extras (e.g. CI that uses ``--no-extra gstreamer --no-extra local``). Wrap the imports in ``try/except`` and ``raise unittest.SkipTest`` so the whole test module is skipped cleanly instead of failing collection.	2026-05-21 10:13:21 -07:00
Aleix Conchillo Flaqué	402cf8dade	Port multi-task unit tests from pipecat-subagents Brings over 215 tests across 15 files covering the new multi-task framework: BaseTask / PipelineTask bus lifecycle, job RPC and job groups, the bus message hierarchy and serializers, TaskBus + AsyncQueueBus + RedisBus + PgmqBus (with direct and isolated backends), TaskRegistry, the BusBridgeProcessor, the WebSocket proxy tasks, the LLMTask deferral logic, and the PipelineRunner spawn-and-attach flow.	2026-05-21 10:13:21 -07:00
Aleix Conchillo Flaqué	ef806163b2	Tighten the pipeline_task contract for processors and tools `FrameProcessorSetup.pipeline_task` is now mandatory and `FrameProcessor.pipeline_task` raises if accessed before setup instead of returning `None`. `FunctionCallParams` gains a required `pipeline_task` field and `LLMService._run_function_call` populates it (plus reads `app_resources` directly off the pipeline task). Tests that build a processor or `FunctionCallParams` outside a real pipeline stub it with a `SimpleNamespace`.	2026-05-21 10:12:51 -07:00
Aleix Conchillo Flaqué	b5c757ab85	Make PipelineTask inherit BaseTask and support bridged pipelines `PipelineTask` now extends `BaseTask` so every pipeline task is also a bus participant. Adds optional `bus`, `bridged`, and `exclude_frames` parameters: when `bridged` is set, the user's pipeline is wrapped with `_BusEdgeProcessor` source/sink edges so frames are mirrored onto the bus. Bridges pipeline lifecycle events to `start()`/`stop()`, overrides `_handle_task_end` / `_handle_task_cancel` to drive the pipeline shutdown, subscribes to the bus in setup, and exposes the `bridged` property to the registry. Moves `PipelineTaskParams` here and updates the matching test import.	2026-05-21 10:12:51 -07:00
Mark Backman	105d6f27da	Merge pull request #4514 from pipecat-ai/mb/websocket-stt-service-exception-handling Align websocket STT connection failures	2026-05-20 15:15:35 -04:00
Mark Backman	9586db5b50	Preserve websocket reconnect failure retries	2026-05-20 14:45:29 -04:00
Mark Backman	709a0ce839	Merge pull request #4527 from pipecat-ai/mb/fix-elevenlabs-keepalive-1008 Fix ElevenLabs keepalive racing context-init (1008 disconnects)	2026-05-20 11:21:17 -04:00
Mark Backman	4a96ab7073	Merge pull request #4524 from pipecat-ai/mb/fix-runner-imports Improve runner optional transport handling	2026-05-20 11:16:16 -04:00
filipi87	81bb81c1d0	test: add automated tests for word tracking, frame sequencing, and Cartesia TTS Adds tests for AggregatedFrameSequencer, WordCompletionTracker, and word_timestamp_utils (including CJK language scenarios). Updates existing Cartesia TTS and TTS frame ordering tests to cover the new behaviours.	2026-05-20 10:03:26 -03:00
Mark Backman	a5e6886b80	Fix ElevenLabs keepalive racing context-init (1008 disconnects) The keepalive could fire for a new turn's context before that context's voice_settings context-init was sent, making the keepalive the context's first message (no voice_settings) and causing ElevenLabs to reject the later init with a 1008 policy violation. The keepalive now only targets a context once its context-init has been sent (tracked in _context_init_sent).	2026-05-20 08:59:01 -04:00
Mark Backman	d11a4ba0cd	Use shared telephony route availability checks	2026-05-20 08:57:48 -04:00
Mark Backman	b825dd779e	Clarify runner startup banner	2026-05-19 17:31:07 -04:00
Mark Backman	1487da53a9	Improve runner optional transport handling	2026-05-19 17:03:16 -04:00
Mark Backman	97b00042df	Align websocket STT connection failures	2026-05-18 12:35:01 -04:00
Antoni Silvestre	18368d047e	Linting and changes to adapt to v1.0	2026-05-18 14:40:56 +02:00
asilvestre	e3abb4b6d7	apply suggestions in PR	2026-05-18 14:40:56 +02:00
asilvestre	c61672194d	Vonage Video Connector Transport	2026-05-18 14:40:49 +02:00
Mark Backman	73278d3309	Use majority language for Soniox transcripts	2026-05-14 15:18:43 -04:00
Mark Backman	49bda11ae8	Merge pull request #4482 from pipecat-ai/mb/soniox-stt-token-language Propagate Soniox token language	2026-05-13 16:28:56 -04:00
Mark Backman	078af6969a	Merge pull request #4473 from timofey-TK/inworld-tts-v2 Add support for Inworld TTS v2 fields	2026-05-13 15:32:16 -04:00
Mark Backman	82f0896d6a	Propagate Soniox token language	2026-05-13 15:23:22 -04:00
Mark Backman	08680732f6	Merge pull request #4475 from pipecat-ai/mb/cartesia-korean-fix Fix Cartesia CJK timestamp spacing	2026-05-13 13:20:42 -04:00
Mark Backman	064b68aa01	Fix Cartesia CJK timestamp spacing	2026-05-13 13:13:40 -04:00
Mark Backman	5fef239b68	Merge pull request #4450 from pipecat-ai/mb/gpt-realtime-whisper Default OpenAI Realtime transcription to gpt-realtime-whisper	2026-05-13 09:48:33 -04:00
Timofey	39e7f9e354	Fix Inworld TTS v2 request fields	2026-05-13 11:17:31 +03:00
Mark Backman	644030584f	Centralize OpenAI audio constants	2026-05-12 17:48:53 -04:00
Mark Backman	abd28e2ac1	Update OpenAI realtime transcription default	2026-05-12 15:20:57 -04:00
Paul Kompfner	a52bdef32b	Add reasoning support to OpenAIRealtimeLLMService for gpt-realtime-2	2026-05-12 13:55:19 -04:00
Paul Kompfner	007fa3a3a8	Handle gpt-realtime-2 multi-output-item audio responses A single Realtime API response can now contain more than one audio item (observed with gpt-realtime-2), and the first item's audio.done can arrive after deltas from the second have started arriving. Deltas still arrive strictly in playback order across items, so we keep forwarding them as received — matching OpenAI's reference implementation. Adjusted OpenAIRealtimeLLMService so a multi-item response is treated as one continuous TTS turn: - _handle_evt_audio_delta: on item switch, advance the tracked item in place (reset total_size) without emitting another TTSStartedFrame. Truncation now always targets the latest item. - _handle_evt_audio_done: debug-trace only; no longer pushes TTSStoppedFrame. - _handle_evt_response_done: pushes a single TTSStoppedFrame per turn, bookending the audio with the Started pushed on the first delta. Added tests covering single-item, overlapping multi-item, non-overlapping multi-item, and interrupt-during-multi-item (last-item-wins truncation).	2026-05-12 10:34:50 -04:00
Paul Kompfner	72d0fb418a	fix: restore cancel_on_interruption=False support in AWS Nova Sonic and OpenAI Realtime Before the new async-tool mechanism landed, AWSNovaSonicLLMService and OpenAIRealtimeLLMService honored cancel_on_interruption=False by simply not cancelling in-flight function calls on interruption; the eventual result then flowed through the same channel as any synchronous tool result. The new mechanism (which appends started/intermediate/final messages to the LLM context as the underlying task progresses) broke that path: the realtime services didn't know how to interpret those messages, and the eventual result was never delivered to the provider. Restore the flag's behavior by teaching both services to detect async-tool messages in the context and route them appropriately: - started → skipped silently. The provider already issued the tool call and natively awaits a result; nothing to send for the started marker. - final → delivered via the formal tool-result channel. Same path as a synchronous tool result, just delayed. Streamed intermediate results (FunctionCallResultProperties(is_final=False)) are not supported on these realtime services. An intermediate result is logged as an error and surfaced via push_error, then dropped. Use a non-realtime LLM service if a tool needs to stream intermediate results. (Docstrings on register_function, register_direct_function, and FunctionCallResultProperties.is_final updated to call this out.) A new shared module pipecat.processors.aggregators.async_tool_messages is the single source of truth for the on-the-wire payload shape: the aggregator uses its build_*_message functions when injecting messages, and the realtime services use parse_message when scanning the context. Adds two example files exercising a network-delayed weather tool with each service. The plain realtime-aws-nova-sonic.py example is also reverted to a synchronous tool call now that the async variant lives in its own file. Similar fixes for other realtime services are forthcoming.	2026-05-08 09:33:06 -04:00
Aleix Conchillo Flaqué	b78cecf7b2	Rename UserTurnCompletedFrame to UserTurnInferenceCompletedFrame The old name overlapped semantically with `UserStoppedSpeakingFrame`: both could be read as "the user's turn is done." They're at different layers — `UserStoppedSpeakingFrame` is the acoustic stop signal, while this frame is the post-judgment "inference about the turn is now complete (turn is semantically final)" signal emitted by the LLM mixin (on ✓), an end-of-turn classifier, or a custom producer. The new name pairs naturally with the existing `on_user_turn_inference_triggered` event vocabulary and removes the ambiguity with `UserStoppedSpeakingFrame`.	2026-05-07 17:47:41 -07:00
Aleix Conchillo Flaqué	952dddca8b	Replace llm_completion_user_turn_stop_strategies() with FilterIncompleteUserTurnStrategies Wrap the detector chain with `deferred(...)` and append the LLM completion gate via a `UserTurnStrategies` specialization rather than a free-standing helper, mirroring the existing `ExternalUserTurnStrategies` pattern. The class lives next to other strategy containers in `pipecat.turns.user_turn_strategies`, so users discover it where they're already configuring `user_turn_strategies`. The deprecated `filter_incomplete_user_turns` flag now rewires through `FilterIncompleteUserTurnStrategies` under the hood, keeping the migration path identical to before. `deferred(...)` stays public as the explicit escape hatch for non-default compositions.	2026-05-07 17:47:39 -07:00
Aleix Conchillo Flaqué	e3e90d38aa	Preserve full user transcript across multiple inferences in one turn When a stop-strategy chain splits inference-triggered from finalization (e.g. `LLMTurnCompletionUserTurnStopStrategy` gating a deferred detector), more than one inference can fire inside a single user turn, and each adds the new transcription segment to the context. Previously each inference overwrote `_pending_user_turn_aggregation`, so the eventual `on_user_turn_stopped` event surfaced only the segment from the last inference, dropping anything the user said before it. Concatenate each segment into `_full_user_turn_aggregation` instead of overwriting, and combine that running buffer with any post-final-inference segment when emitting the public event.	2026-05-07 17:46:15 -07:00
Aleix Conchillo Flaqué	d1c8162b0c	Route turn-completion markers through LLMMarkerFrame Add an `LLMMarkerFrame(DataFrame)` for sideband LLM markers that need to be persisted to context but should not flow through the standard text path (TTS, transcript). The frame carries an `append_to_context_immediately` flag so the assistant aggregator can either commit the marker as a stand-alone message (○ / ◐) or merge it with the upcoming aggregation as a prefix on the response (✓). `UserTurnCompletionLLMServiceMixin` now emits `LLMMarkerFrame` instead of pushing the marker as `LLMTextFrame(skip_tts=True)`, which fixes the case where an incomplete-turn marker (○ / ◐) was aggregated by the assistant aggregator but never committed to the context because the assistant turn lifecycle didn't run to completion (no spoken response, no `LLMFullResponseEndFrame`-driven `push_aggregation`). The frame is intentionally generic so other components — STT services with built-in turn signals, end-of-turn classifiers, custom annotations — can use the same mechanism to inject sideband signals into the assistant context.	2026-05-07 17:46:15 -07:00
Aleix Conchillo Flaqué	480eca42f5	Split user-turn-stop into inference-triggered and finalized events Fixes a real bug: with `filter_incomplete_user_turns` enabled, the smart-turn detector's tentative stop was firing `on_user_turn_stopped` before the LLM had a chance to veto it. Observers, transcript appenders and UI indicators received an early (and sometimes duplicated) signal. Decomposes the single stop concern into two events: - `on_user_turn_inference_triggered` fires when a stop strategy has enough signal to start LLM inference. The aggregator pushes the context here, kicking off the LLM call. - `on_user_turn_stopped` fires only when the user turn is semantically final. Built-in strategies fire both events at the same call site, preserving today's behavior for the common case. Adds `LLMTurnCompletionUserTurnStopStrategy`, which gates finalization on a `UserTurnCompletedFrame` (a fieldless system frame emitted by any component judging turn completeness; currently the `UserTurnCompletionLLMServiceMixin` on `✓`). Adds `deferred(strategy)` / `DeferredUserTurnStopStrategy`, a thin wrapper that forwards an inner strategy's events except `on_user_turn_stopped`. Use this to install a stop strategy as an inference trigger only, leaving finalization to a peer (e.g. the LLM completion strategy). Adds `llm_completion_user_turn_stop_strategies()` for the common case: `UserTurnStrategies(stop=llm_completion_user_turn_stop_strategies())`. Deprecates `LLMUserAggregatorParams.filter_incomplete_user_turns`. The aggregator emits a `DeprecationWarning`, wraps existing stop strategies with `deferred(...)`, and appends `LLMTurnCompletionUserTurnStopStrategy` automatically.	2026-05-07 17:46:09 -07:00
Mark Backman	1073510574	Merge pull request #4407 from pipecat-ai/mb/ui-agent-wire-format feat(rtvi): add UI Agent Protocol as first-class RTVI message types	2026-05-07 20:03:41 -04:00
Mark Backman	7a2cec2e45	Merge pull request #4426 from marcelodiaz558/feature/elevenlabs_stt_keyterms Add ElevenLabs STT keyterms support	2026-05-07 18:44:09 -04:00
Marcelo Díaz	edfcd6948b	Add ElevenLabs STT keyterms support	2026-05-07 21:00:26 +00:00
kompfner	991ee9e0e6	Merge pull request #4404 from pipecat-ai/pk/mitigate-calls-to-missing-tools Mitigate tool-call-related hallucination	2026-05-07 15:05:13 -04:00
filipi87	cf22dac171	Refactoring TTSService to preserve uninterruptible frames.	2026-05-06 16:26:45 -03:00

549 Commits