Move <ui_state> snapshot injection out of respond_with_llm into a
cross-cutting on_before_process_frame handler on the UIWorker's LLM, so it
appends the current snapshot to the context the request is built from, just
before each inference. Injection is gated to the user-turn-initiating
inference so a tool-calling turn never stacks duplicate <ui_state> blocks;
respond_with_llm no longer injects manually.
Also drop the bridged parameter from UIWorker: there is no viable way to
bridge a UIWorker between workers — a shared, teed context would be polluted
by the injection, and per-worker turn detection off teed frames isn't
supported. Other workers keep their PipelineWorker bridging.
UIWorker is an LLMContextWorker that observes and drives a client GUI over the
RTVI UI channel: it stores accessibility snapshots, auto-injects <ui_state> at
the start of each respond job, dispatches client events to @on_ui_event
handlers, sends UI commands back to the client, and surfaces fan-out work as
cancellable task cards via user_job_group(). The optional ReplyToolMixin exposes
a bundled reply tool.
The prompt_guide parameter auto-appends the UI wire-format guide to the LLM's
system instruction (default UI_STATE_PROMPT_GUIDE; override with a string or
disable with None), so the LLM can parse the injected <ui_state> / <ui_event>
messages without the app concatenating the guide by hand.
When RTVI is enabled, PipelineWorker now republishes inbound ui-event /
ui-snapshot / ui-cancel-task messages onto the bus as a broadcast
BusUIEventMessage, and translates outbound BusUICommandMessage / BusUITask*
carriers into the matching RTVI frames. This lets a UIWorker on the bus observe
and drive the client UI with no decorator or manual wiring; when no UIWorker is
present the events are simply unconsumed.
The BusUI* carriers live in the bus layer so both pipeline and workers can
reference them without an import cycle.
Composes durable text onto a user-provided system instruction (alongside the
turn-completion and async-tool-cancellation addons) so it is prepended on every
inference and survives context-message resets. The user's base prompt is now
snapshotted once and the effective instruction is always rebuilt from it,
replacing the prior lazy capture/restore logic with a single invariant.
Lets callers register multiple workers in a single call instead of
awaiting add_worker() repeatedly. Updates all examples, docs, tests,
and proxy worker docstrings to use the new API.
BusMessage was a mixin tacked onto DataFrame / SystemFrame so the bus
could reuse the frame priority machinery. That made every bus message
also a Frame, which is misleading — bus messages travel on the bus, not
through pipelines. If a worker actually needs to ship a frame, it wraps
it in BusFrameMessage.
BusMessage is now a plain dataclass base carrying source/target.
BusDataMessage and BusSystemMessage are empty subclasses that exist
only as priority markers. The bus router and the priority queue check
``isinstance(item, BusSystemMessage)`` directly instead of
``isinstance(item, SystemFrame)``.
The serializer test that round-tripped DataFrame.name (a non-init
field) is rewritten against a local _MessageWithNonInit(BusDataMessage)
subclass so the serializer's init=False path stays covered.
Mirrors on_bus_message and makes it explicit that the call goes out on
the task bus, not on a transport (transports have their own
send_message for client/peer messaging).
BaseTask.handoff_to was just deactivate_self + activate_task. Remove
it and add a deactivate_self flag on activate_task instead, so there's
one entry point for activating another task.
LLMTask now overrides activate_task (mirroring its end() override) to
keep the messages / result_callback hooks that finish an in-progress
tool call before the target is activated. All multi-task examples and
unit tests switch to the new call.
``test_pgmq_bus_lazy_import`` and ``test_redis_bus_lazy_import``
import ``pipecat.bus.network.pgmq`` / ``redis`` directly, which raises
when the optional ``pgmq`` / ``redis`` packages are missing. Gate each
test with ``@unittest.skipUnless`` on a top-level probe of the
underlying package so they're skipped (not errored) in environments
without the extras. ``test_unknown_attribute_raises`` is unaffected.
The PGMQ and Redis bus modules raise an ``Exception`` at import time
when the optional ``pgmq`` / ``redis`` packages are missing, which broke
``pytest`` collection in environments without those extras (e.g. CI
that uses ``--no-extra gstreamer --no-extra local``). Wrap the imports
in ``try/except`` and ``raise unittest.SkipTest`` so the whole test
module is skipped cleanly instead of failing collection.
Brings over 215 tests across 15 files covering the new
multi-task framework: BaseTask / PipelineTask bus lifecycle,
job RPC and job groups, the bus message hierarchy and serializers,
TaskBus + AsyncQueueBus + RedisBus + PgmqBus (with direct and
isolated backends), TaskRegistry, the BusBridgeProcessor, the
WebSocket proxy tasks, the LLMTask deferral logic, and the
PipelineRunner spawn-and-attach flow.
`FrameProcessorSetup.pipeline_task` is now mandatory and
`FrameProcessor.pipeline_task` raises if accessed before setup
instead of returning `None`. `FunctionCallParams` gains a
required `pipeline_task` field and `LLMService._run_function_call`
populates it (plus reads `app_resources` directly off the
pipeline task). Tests that build a processor or
`FunctionCallParams` outside a real pipeline stub it with a
`SimpleNamespace`.
`PipelineTask` now extends `BaseTask` so every pipeline task is
also a bus participant. Adds optional `bus`, `bridged`, and
`exclude_frames` parameters: when `bridged` is set, the user's
pipeline is wrapped with `_BusEdgeProcessor` source/sink edges so
frames are mirrored onto the bus. Bridges pipeline lifecycle
events to `start()`/`stop()`, overrides `_handle_task_end` /
`_handle_task_cancel` to drive the pipeline shutdown, subscribes
to the bus in setup, and exposes the `bridged` property to the
registry. Moves `PipelineTaskParams` here and updates the
matching test import.
Adds tests for AggregatedFrameSequencer, WordCompletionTracker, and
word_timestamp_utils (including CJK language scenarios). Updates existing
Cartesia TTS and TTS frame ordering tests to cover the new behaviours.
The keepalive could fire for a new turn's context before that context's
voice_settings context-init was sent, making the keepalive the context's
first message (no voice_settings) and causing ElevenLabs to reject the
later init with a 1008 policy violation. The keepalive now only targets a
context once its context-init has been sent (tracked in _context_init_sent).
A single Realtime API response can now contain more than one audio item
(observed with gpt-realtime-2), and the first item's audio.done can
arrive after deltas from the second have started arriving. Deltas still
arrive strictly in playback order across items, so we keep forwarding
them as received — matching OpenAI's reference implementation.
Adjusted OpenAIRealtimeLLMService so a multi-item response is treated as
one continuous TTS turn:
- _handle_evt_audio_delta: on item switch, advance the tracked item in
place (reset total_size) without emitting another TTSStartedFrame.
Truncation now always targets the latest item.
- _handle_evt_audio_done: debug-trace only; no longer pushes
TTSStoppedFrame.
- _handle_evt_response_done: pushes a single TTSStoppedFrame per turn,
bookending the audio with the Started pushed on the first delta.
Added tests covering single-item, overlapping multi-item, non-overlapping
multi-item, and interrupt-during-multi-item (last-item-wins truncation).
Before the new async-tool mechanism landed, AWSNovaSonicLLMService and
OpenAIRealtimeLLMService honored cancel_on_interruption=False by simply
not cancelling in-flight function calls on interruption — the eventual
result then flowed through the same channel as any synchronous tool
result. The new mechanism (which appends started/intermediate/final
messages to the LLM context as the underlying task progresses) broke
that path: the realtime services didn't know how to interpret those
messages, and the eventual result was never delivered to the provider.
Restore the flag's behavior by teaching both services to detect
async-tool messages in the context and route them appropriately:
- started → skipped silently. The provider already issued the tool call
and natively awaits a result; nothing to send for the started marker.
- final → delivered via the formal tool-result channel. Same path as a
synchronous tool result, just delayed.
Streamed intermediate results (FunctionCallResultProperties(is_final=
False)) are not supported on these realtime services. An intermediate
result is logged as an error and surfaced via push_error, then dropped.
Use a non-realtime LLM service if a tool needs to stream intermediate
results. (Docstrings on register_function, register_direct_function, and
FunctionCallResultProperties.is_final updated to call this out.)
A new shared module pipecat.processors.aggregators.async_tool_messages
is the single source of truth for the on-the-wire payload shape: the
aggregator uses its build_*_message functions when injecting messages,
and the realtime services use parse_message when scanning the context.
Adds two example files exercising a network-delayed weather tool with
each service. The plain realtime-aws-nova-sonic.py example is also
reverted to a synchronous tool call now that the async variant lives in
its own file.
Similar fixes for other realtime services are forthcoming.
The old name overlapped semantically with `UserStoppedSpeakingFrame`:
both could be read as "the user's turn is done." They're at different
layers — `UserStoppedSpeakingFrame` is the acoustic stop signal,
while this frame is the post-judgment "inference about the turn is
now complete (turn is semantically final)" signal emitted by the LLM
mixin (on ✓), an end-of-turn classifier, or a custom producer.
The new name pairs naturally with the existing
`on_user_turn_inference_triggered` event vocabulary and removes the
ambiguity with `UserStoppedSpeakingFrame`.
Wrap the detector chain with `deferred(...)` and append the LLM
completion gate via a `UserTurnStrategies` specialization rather than
a free-standing helper, mirroring the existing
`ExternalUserTurnStrategies` pattern. The class lives next to other
strategy containers in `pipecat.turns.user_turn_strategies`, so users
discover it where they're already configuring `user_turn_strategies`.
The deprecated `filter_incomplete_user_turns` flag now rewires
through `FilterIncompleteUserTurnStrategies` under the hood, keeping
the migration path identical to before. `deferred(...)` stays public
as the explicit escape hatch for non-default compositions.
When a stop-strategy chain splits inference-triggered from
finalization (e.g. `LLMTurnCompletionUserTurnStopStrategy` gating a
deferred detector), more than one inference can fire inside a single
user turn — each adds the new transcription segment to the context.
Previously each inference overwrote `_pending_user_turn_aggregation`,
so the eventual `on_user_turn_stopped` event surfaced only the
segment from the last inference, dropping anything the user said
before it.
Concatenate each segment into `_full_user_turn_aggregation` instead
of overwriting, and combine that running buffer with any post-final-
inference segment when emitting the public event.
Add an `LLMMarkerFrame(DataFrame)` for sideband LLM markers that need
to be persisted to context but should not flow through the standard
text path (TTS, transcript). The frame carries an
`append_to_context_immediately` flag so the assistant aggregator can
either commit the marker as a stand-alone message (○ / ◐) or merge it
with the upcoming aggregation as a prefix on the response (✓).
`UserTurnCompletionLLMServiceMixin` now emits `LLMMarkerFrame` instead
of pushing the marker as `LLMTextFrame(skip_tts=True)`, which fixes
the case where an incomplete-turn marker (○ / ◐) was aggregated by
the assistant aggregator but never committed to the context because
the assistant turn lifecycle didn't run to completion (no spoken
response, no `LLMFullResponseEndFrame`-driven `push_aggregation`).
The frame is intentionally generic so other components — STT services
with built-in turn signals, end-of-turn classifiers, custom
annotations — can use the same mechanism to inject sideband signals
into the assistant context.
Fixes a real bug: with `filter_incomplete_user_turns` enabled, the
smart-turn detector's tentative stop was firing `on_user_turn_stopped`
before the LLM had a chance to veto it. Observers, transcript
appenders and UI indicators received an early — and sometimes
duplicated — signal.
Decomposes the single stop concern into two events:
- `on_user_turn_inference_triggered` fires when a stop strategy has
enough signal to start LLM inference. The aggregator pushes the
context here, kicking off the LLM call.
- `on_user_turn_stopped` fires only when the user turn is semantically
final. Built-in strategies fire both events at the same call site,
preserving today's behavior for the common case.
Adds `LLMTurnCompletionUserTurnStopStrategy`, which gates
finalization on a `UserTurnCompletedFrame` (a fieldless system frame
emitted by any component judging turn completeness — currently the
`UserTurnCompletionLLMServiceMixin` on `✓`).
Adds `deferred(strategy)` / `DeferredUserTurnStopStrategy`, a thin
wrapper that forwards an inner strategy's events except
`on_user_turn_stopped`. Use this to install a stop strategy as an
inference trigger only, leaving finalization to a peer (e.g. the LLM
completion strategy).
Adds `llm_completion_user_turn_stop_strategies()` for the common
case:
UserTurnStrategies(
stop=llm_completion_user_turn_stop_strategies(),
)
Deprecates `LLMUserAggregatorParams.filter_incomplete_user_turns`.
The aggregator emits a `DeprecationWarning`, wraps existing stop
strategies with `deferred(...)`, and appends
`LLMTurnCompletionUserTurnStopStrategy` automatically.