Drop redundant changelog entry for OpenAI Realtime example

The OpenAI Realtime story didn't add any service-level code — just a new example. The original 4480.added.md entry already describes the feature as "a realtime service like Gemini Live," which generalizes to OpenAI Realtime.
Add realtime-openai-local-vad example
2026-05-18 12:06:48 -04:00 · 2026-05-18 11:50:16 -04:00 · 2026-05-18 10:51:14 -04:00 · 2026-05-18 10:29:19 -04:00 · 2026-05-18 10:18:22 -04:00 · 2026-05-18 09:55:42 -04:00
615 changed files with 6118 additions and 49961 deletions
--- a/.claude/skills/squash-commits/SKILL.md
+++ b/.claude/skills/squash-commits/SKILL.md
@@ -1,91 +0,0 @@
----
-name: squash-commits
-description: Reorganize messy branch commits into a small set of logical, meaningful commits without changing any content. Drops merge-from-main commits. Safe: creates a backup branch first.
----
-
-Reorganize the commits on the current branch into a small number of logical commits. Do NOT change any file content — only the commit structure changes.
-
-## Instructions
-
-### 1. Safety check
-
-```bash
-git status --short
-```
-
-If there are uncommitted changes, stop and tell the user to commit or stash them first.
-
-### 2. Inspect the branch
-
-```bash
-git log main..HEAD --oneline
-git diff main..HEAD --name-only
-```
-
-List every file changed vs `main` and every commit on the branch (excluding merge commits from main).
-
-### 3. Create a backup branch
-
-```bash
-git branch backup/<current-branch-name>
-```
-
-Tell the user the backup exists so they can recover if needed.
-
-### 4. Soft-reset to main and unstage everything
-
-```bash
-git reset --soft main
-git restore --staged .
-```
-
-All branch changes are now in the working tree, unstaged. No content has changed.
-
-### 5. Plan the logical groups
-
-Read the changed files and the original commit messages to understand what the work covers. Group related files into logical commits. Typical groups:
-
-- Core feature or fix (new source files + modified core files)
-- Secondary features or fixes (each as its own commit if distinct)
-- Refactoring or renames
-- Tests
-- Changelogs / docs
-
-Use the changelog files (if any) as a strong hint — each changelog entry often maps to one commit.
-
-Present the proposed grouping to the user and ask for confirmation before committing.
-
-### 6. Commit in logical groups
-
-For each group, stage only the relevant files and commit with a clear message following the project's conventions:
-
-```bash
-git add <file1> <file2> ...
-git commit -m "..."
-```
-
-Use conventional commit prefixes if the project uses them (`feat:`, `fix:`, `refactor:`, `test:`, `chore:`).
-
-### 7. Verify
-
-```bash
-git log main..HEAD --oneline
-git diff main..HEAD --name-only
-git status --short
-```
-
-Confirm:
-- Commit count is small and each message is meaningful
-- The set of changed files vs `main` is identical to before
-- Working tree is clean
-
-### 8. Remind about force-push
-
-The branch history has been rewritten. Tell the user they will need to `git push --force-with-lease` when they are ready to update the remote. Do NOT push automatically.
-
-## Rules
-
-- Never change file contents. If you find yourself editing a file, stop.
-- Never skip the backup branch step.
-- Never force-push without explicit user instruction.
-- If any step fails or the result looks wrong, tell the user and suggest restoring from the backup: `git reset --hard backup/<branch-name>`.
--- a/.github/workflows/coverage.yaml
+++ b/.github/workflows/coverage.yaml
@@ -41,9 +41,7 @@ jobs:
            --extra google \
            --extra langchain \
            --extra livekit \
-            --extra pgmq \
            --extra piper \
-            --extra redis \
            --extra runner \
            --extra sagemaker \
            --extra tracing \
--- a/.github/workflows/tests.yaml
+++ b/.github/workflows/tests.yaml
@@ -45,9 +45,7 @@ jobs:
            --extra google \
            --extra langchain \
            --extra livekit \
-            --extra pgmq \
            --extra piper \
-            --extra redis \
            --extra runner \
            --extra sagemaker \
            --extra tracing \
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,515 +7,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 <!-- towncrier release notes start -->

-## [1.2.1] - 2026-05-15
-
-### Changed
-
-- Changed the default WebSocket endpoints for `GradiumSTTService` and
-  `GradiumTTSService` to the region-neutral
-  `wss://api.gradium.ai/api/speech/asr` and
-  `wss://api.gradium.ai/api/speech/tts`. Gradium now automatically routes
-  traffic to the nearest endpoint. Override the url to pin to a specific
-  region.
-  (PR [#4500](https://github.com/pipecat-ai/pipecat/pull/4500))
-
-### Fixed
-
-- Fixed bot hangs when `filter_incomplete_user_turns` was enabled and the LLM
-  responded by calling a tool. The user turn never finalized, so the assistant
-  aggregator gated the tool-result context push and the LLM continuation never
-  ran. Tool calls now finalize the turn the moment they start, before the
-  function dispatches.
-  (PR [#4501](https://github.com/pipecat-ai/pipecat/pull/4501))
-
-## [1.2.0] - 2026-05-14
-
-### Added
-
-- Added a `session_id` field to `RunnerArguments` so bots can log or trace a
-  per-session identifier in local development the same way they can in Pipecat
-  Cloud. The development runner now mints a UUID at every construction site,
-  and paths that already returned a `sessionId` to the caller (Daily `/start`,
-  dial-in webhook) share that same UUID with the runner args instead of
-  generating two. The SmallWebRTC `/api/offer` endpoint also accepts an
-  optional `session_id` query parameter so the `/sessions/{session_id}/...`
-  proxy can thread it through.
-  (PR [#4385](https://github.com/pipecat-ai/pipecat/pull/4385))
-
-- Added a `max_buffer_delay_ms` constructor argument to `CartesiaTTSService`
-  for controlling Cartesia's server-side text buffering. When unset, Pipecat
-  picks a sensible default based on `text_aggregation_mode`: `0` in `SENTENCE`
-  mode (custom buffering — avoids stacking client-side aggregation on top of
-  Cartesia's default 3000ms server buffer) and unset in `TOKEN` mode
-  (Cartesia's managed buffering applies). Pass an explicit value (0–5000ms) to
-  override.
-  (PR [#4390](https://github.com/pipecat-ai/pipecat/pull/4390))
-
-- Added a `mip_opt_out` constructor argument to `DeepgramTTSService` and
-  `DeepgramHttpTTSService` so callers can opt out of the Deepgram Model
-  Improvement Program. When set, the value is forwarded to Deepgram as a query
-  parameter on the speak request. Defaults to `None`, which preserves the
-  existing behavior. See https://dpgr.am/deepgram-mip for pricing implications
-  before enabling.
-  (PR [#4400](https://github.com/pipecat-ai/pipecat/pull/4400))
-
-- Added an opt-in `add_tool_change_messages` flag to the LLM aggregators (set
-  via `LLMContextAggregatorPair(..., add_tool_change_messages=True)`) that
-  appends a developer-role message to the context whenever `LLMSetToolsFrame`
-  changes the set of advertised standard tools. Helps the LLM stay coherent
-  across mid-conversation tool changes, mitigating several flavors of
-  tool-call-related hallucination: calling tools that have been removed,
-  avoiding tools that have been re-added, and hallucinating output (made-up
-  answers or tool-call-shaped non-tool-calls) when tools are unavailable.
-  (PR [#4404](https://github.com/pipecat-ai/pipecat/pull/4404))
-
-- Added `deferred(strategy)` and `DeferredUserTurnStopStrategy` in
-  `pipecat.turns.user_stop`. Wraps a stop strategy so it fires only the
-  inference-triggered event and suppresses `on_user_turn_stopped`, leaving
-  finalization to another strategy in the chain such as
-  `LLMTurnCompletionUserTurnStopStrategy`.
-  (PR [#4405](https://github.com/pipecat-ai/pipecat/pull/4405))
-
-- Added `ExternalUserTurnCompletionStopStrategy` in `pipecat.turns.user_stop` —
-  a generic stop strategy that finalizes the user turn whenever a
-  `UserTurnInferenceCompletedFrame` arrives, regardless of which component
-  produced it. `LLMTurnCompletionUserTurnStopStrategy` now extends this base;
-  future producers (Flux, custom end-of-turn classifiers, etc.) can use the
-  base directly or subclass it to add producer-specific setup.
-  (PR [#4405](https://github.com/pipecat-ai/pipecat/pull/4405))
-
-- Added `on_user_turn_inference_triggered`, a new event on the user turn
-  controller, processor, aggregator and stop strategies that fires when a
-  strategy has enough signal to start LLM inference. By default it fires
-  together with `on_user_turn_stopped`; a gating strategy can fire only the
-  inference-triggered event and defer finalization to a peer.
-  (PR [#4405](https://github.com/pipecat-ai/pipecat/pull/4405))
-
-- Added `FilterIncompleteUserTurnStrategies` in
-  `pipecat.turns.user_turn_strategies` — a `UserTurnStrategies` specialization
-  that wraps the detector chain with `deferred(...)` and appends
-  `LLMTurnCompletionUserTurnStopStrategy` as the finalizer. Common case:
-  `user_turn_strategies=FilterIncompleteUserTurnStrategies()`. Pass
-  `config=UserTurnCompletionConfig(...)` to customize timeouts and prompts.
-  (PR [#4405](https://github.com/pipecat-ai/pipecat/pull/4405))
-
-- Added `LLMTurnCompletionUserTurnStopStrategy` in `pipecat.turns.user_stop`.
-  When installed, the strategy gates `on_user_turn_stopped` on a
-  `UserTurnInferenceCompletedFrame` (a new fieldless system frame emitted by
-  any component that can judge turn completeness — e.g. the
-  `UserTurnCompletionLLMServiceMixin` on `✓`). A `finalization_timeout`
-  provides a safety net if no completion frame ever arrives.
-  (PR [#4405](https://github.com/pipecat-ai/pipecat/pull/4405))
-
-- Added first-class RTVI support for the UI Agent Protocol:
-    - Adds `ui-event`, `ui-snapshot`, and `ui-cancel-task` client-to-server
-  messages, plus `ui-command` and `ui-task` server-to-client messages, with
-  paired `*Data` / `*Message` pydantic models.
-    - Adds built-in command payload models for `Toast`, `Navigate`, `ScrollTo`,
-  `Highlight`, `Focus`, `Click`, `SetInputValue`, and `SelectText`; matching
-  default handlers live in `@pipecat-ai/client-react`.
-    - Adds `RTVIProcessor.on_ui_message` for inbound `ui-event`, `ui-snapshot`,
-  and `ui-cancel-task` messages.
-    - Adds five UI pipeline frames, mirroring the `client-message`
-  frame-and-event pattern: downstream code pushes `RTVIUICommandFrame` /
-  `RTVIUITaskFrame` for the observer to wrap into outbound `UICommandMessage` /
-  `UITaskMessage` envelopes, while the processor pushes inbound
-  `RTVIUIEventFrame`, `RTVIUISnapshotFrame`, and `RTVIUICancelTaskFrame`
-  alongside `on_ui_message`.
-    - Bumps the RTVI `PROTOCOL_VERSION` from `1.2.0` to `1.3.0`.
-  (PR [#4407](https://github.com/pipecat-ai/pipecat/pull/4407))
-
-- AWS Transcribe STT, Polly TTS, Bedrock LLM, and the Bedrock AgentCore
-  processor now resolve credentials via the standard boto3 provider chain (EC2
-  instance profiles, EKS pod roles / IRSA, ECS task roles, SSO,
-  `~/.aws/credentials`) when explicit credentials and `AWS_*` environment
-  variables are absent. Services running with IAM roles no longer need to
-  export static credentials.
-  (PR [#4416](https://github.com/pipecat-ai/pipecat/pull/4416))
-
-- Added `keyterms` support to ElevenLabs STT services so Scribe V2 callers can
-  bias transcription for both file-based and realtime transcription.
-  (PR [#4426](https://github.com/pipecat-ai/pipecat/pull/4426))
-
-- Added `watchdog_min_timeout` parameter to `DeepgramFluxSTT` and
-  `DeepgramFluxSageMakerSTT` (default `0.5` seconds) to control the minimum
-  silence duration before the watchdog sends a silence packet to prevent
-  dangling turns. The actual threshold is `max(chunk_duration * 2,
-  watchdog_min_timeout)`, so it also adapts automatically to the audio chunk
-  size in use.
-  (PR [#4430](https://github.com/pipecat-ai/pipecat/pull/4430))
-
-- Added `cancel_on_interruption=False` support for `GeminiLiveLLMService` on
-  models that support Gemini's NON_BLOCKING tool mechanism (currently Gemini
-  2.x); the conversation now continues while the tool runs. On models that
-  don't yet support NON_BLOCKING (Gemini 3.x), the service surfaces a one-time
-  warning explaining the limitation. (Note: an intermittent 1008 error can
-  occasionally fire on Gemini 2.5 during long-running tool calls; we
-  auto-reconnect.)
-  (PR [#4448](https://github.com/pipecat-ai/pipecat/pull/4448))
-
-- Added `NvidiaSageMakerWebsocketSTTService` for streaming speech recognition
-  using NVIDIA Nemotron ASR via an AWS SageMaker bidirectional-stream endpoint.
-  Produces `InterimTranscriptionFrame` and `TranscriptionFrame` frames, is
-  VAD-aware, and automatically reconnects on error.
-  (PR [#4464](https://github.com/pipecat-ai/pipecat/pull/4464))
-
-- Added NVIDIA Magpie TTS services via AWS SageMaker:
-  `NvidiaSageMakerHTTPTTSService` (single HTTP invocation, streams raw PCM
-  back) and `NvidiaSageMakerWebsocketTTSService` (persistent HTTP/2 bidi-stream
-  with full interruption support via `InterruptibleTTSService`).
-  (PR [#4464](https://github.com/pipecat-ai/pipecat/pull/4464))
-
-- Added support for `reasoning` configuration on `OpenAIRealtimeLLMService`,
-  for use with reasoning-capable Realtime models such as `gpt-realtime-2`.
-  (PR [#4470](https://github.com/pipecat-ai/pipecat/pull/4470))
-
-- Inworld TTS updates:
-    - Added `delivery_mode` setting (`STABLE`/`BALANCED`/`CREATIVE`) to
-  `InworldTTSService` and `InworldHttpTTSService`, enabling the
-  stability-vs-creativity tradeoff in `inworld-tts-2`.
-    - Added language support to `InworldTTSService` and
-  `InworldHttpTTSService`. The `language` setting is now forwarded to the API,
-  and a new `language_to_inworld_language()` helper normalizes Pipecat
-  `Language` enums to Inworld's BCP-47 locale tags.
-  (PR [#4473](https://github.com/pipecat-ai/pipecat/pull/4473))
-
-### Changed
-
-- Updated the default `SonioxTTSService` model from `tts-rt-v1-preview` to the
-  generally available `tts-rt-v1`.
-  (PR [#4386](https://github.com/pipecat-ai/pipecat/pull/4386))
-
-- Default `cartesia_version` for `CartesiaTTSService` bumped from `2025-04-16`
-  to `2026-03-01`, matching `CartesiaHttpTTSService` and unlocking the
-  `use_normalized_timestamps` and `max_buffer_delay_ms` fields.
-  (PR [#4390](https://github.com/pipecat-ai/pipecat/pull/4390))
-
-- ⚠️ `CartesiaTTSService` now sends `use_normalized_timestamps: true` instead
-  of the deprecated `use_original_timestamps` field. Word timestamps now
-  reflect what was actually spoken (post text-normalization and
-  pronunciation-dictionary substitution), matching the convention Pipecat uses
-  for ElevenLabs. This is a behavior change for `sonic-3` users, who were
-  previously receiving timestamps tied to the input transcript.
-  (PR [#4390](https://github.com/pipecat-ai/pipecat/pull/4390))
-
-- Broadened `tool_resources` to `app_resources` for easy access not just in
-  tool handlers but in other places like custom `FrameProcessor`s. Three
-  changes: a rename (`tool_resources` → `app_resources`), a new `app_resources`
-  property on `PipelineTask`, and a new `pipeline_task` property on
-  `FrameProcessor`. Tool handlers now read `params.app_resources`; custom
-  processors read `self.pipeline_task.app_resources`. The previous
-  `tool_resources` aliases (on `PipelineTask`, `FunctionCallParams`, and
-  `FrameProcessorSetup`) keep working but are deprecated as of 1.2.0 and emit
-  `DeprecationWarning`s.
-  (PR [#4395](https://github.com/pipecat-ai/pipecat/pull/4395))
-
-- Lowered the per-message log in
-  `SmallWebRTCInputTransport._handle_app_message` from `debug` to `trace`. App
-  messages can be high-frequency and were noisy at debug level; set the loguru
-  level to `TRACE` to see them again.
-  (PR [#4397](https://github.com/pipecat-ai/pipecat/pull/4397))
-
-- Changed the default model for `GrokRealtimeLLMService` to
-  `grok-voice-think-fast-1.0`, xAI's recommended Voice Agent model. The
-  previous default of `grok-voice-fast-1.0` has been deprecated by xAI and is
-  being removed.
-  (PR [#4401](https://github.com/pipecat-ai/pipecat/pull/4401))
-
-- Changed the default Inworld TTS model from `inworld-tts-1.5-max` to
-  `inworld-tts-2` (Realtime TTS-2) across `InworldHttpTTSService`,
-  `InworldTTSService`, and the `InworldRealtimeLLMService` cascade. Existing
-  users can pin the prior model explicitly via the `model`/`tts_model`
-  argument; both `inworld-tts-1.5-max` and `inworld-tts-1.5-mini` remain valid
-  model IDs.
-  (PR [#4422](https://github.com/pipecat-ai/pipecat/pull/4422))
-
-- Changed the default model for `GrokLLMService` from `grok-3` to
-  `grok-4.20-non-reasoning`. xAI is retiring `grok-3` on May 15, 2026.
-  (PR [#4429](https://github.com/pipecat-ai/pipecat/pull/4429))
-
-- `DeepgramFluxSTT` watchdog silence threshold is now dynamic:
-  `max(chunk_duration * 2, watchdog_min_timeout)` instead of a fixed 500 ms.
-  This prevents false silence injections when large audio chunks are sent at
-  lower frequency.
-  (PR [#4430](https://github.com/pipecat-ai/pipecat/pull/4430))
-
-- `ElevenLabsTTSService` now sends `close_context` to the server as soon as the
-  turn is complete (on `on_turn_context_completed`) rather than waiting until
-  all audio has finished playing back. The `isFinal` message from ElevenLabs is
-  now used to signal `TTSStoppedFrame` and clean up the audio context,
-  improving turn transition timing.
-  (PR [#4433](https://github.com/pipecat-ai/pipecat/pull/4433))
-
-- Updated `InworldHttpTTSService` and `InworldTTSService` to use PCM audio
-  encoding by default, which returns audio bytes without headers.
-  (PR [#4446](https://github.com/pipecat-ai/pipecat/pull/4446))
-
-- Moved `create_task`, `cancel_task`, the `task_manager` property, and
-  `setup(task_manager)` up from `FrameProcessor` to `BaseObject`. Custom
-  `BaseObject` subclasses (turn strategies, controllers, etc.) now inherit
-  these methods directly instead of reimplementing the task manager wiring.
-  Owners propagate the task manager to their child `BaseObject`s via `await
-  child.setup(task_manager)`.
-  (PR [#4449](https://github.com/pipecat-ai/pipecat/pull/4449))
-
-- Changed the default OpenAI Realtime input audio transcription model from
-  `gpt-4o-transcribe` to `gpt-realtime-whisper` for both
-  `OpenAIRealtimeSTTService` and `OpenAIRealtimeLLMService`. The new model does
-  not accept the `prompt` parameter; if a prompt is supplied alongside
-  `gpt-realtime-whisper`, it is dropped automatically and a warning is logged.
-  To keep using prompt hints, explicitly pin `model="gpt-4o-transcribe"` (or
-  `"gpt-4o-mini-transcribe"`).
-  (PR [#4450](https://github.com/pipecat-ai/pipecat/pull/4450))
-
-- Updated the default model for `CartesiaTTSService` and
-  `CartesiaHttpTTSService` from `sonic-3` to `sonic-3.5`.
-  (PR [#4462](https://github.com/pipecat-ai/pipecat/pull/4462))
-
-- Changed the default model for `OpenAIRealtimeLLMService` from
-  `gpt-realtime-1.5` to `gpt-realtime-2`.
-  (PR [#4472](https://github.com/pipecat-ai/pipecat/pull/4472))
-
-### Deprecated
-
-- Deprecated `LLMUserAggregatorParams.filter_incomplete_user_turns`. Use
-  `user_turn_strategies=FilterIncompleteUserTurnStrategies()` (or add
-  `LLMTurnCompletionUserTurnStopStrategy` to a custom
-  `user_turn_strategies.stop`) instead. Setting the legacy flag still works for
-  one release: the aggregator emits a `DeprecationWarning` and rewires the
-  strategies as if you had passed `FilterIncompleteUserTurnStrategies`
-  directly.
-  (PR [#4405](https://github.com/pipecat-ai/pipecat/pull/4405))
-
-- Deprecated `ResampyResampler` in favor of `SOXRAudioResampler` (or the
-  `create_file_resampler()` / `create_stream_resampler()` factories).
-  Instantiating `ResampyResampler` now emits a `DeprecationWarning`. The class
-  will be removed in Pipecat 2.0 along with the default `resampy` and `numba`
-  dependencies.
-  (PR [#4428](https://github.com/pipecat-ai/pipecat/pull/4428))
-
-### Fixed
-
-- Fixed `CartesiaTTSService` surfacing `flush_done` messages from Cartesia as
-  `ErrorFrame`s. The latest API emits a `flush_done` per transcript when
-  server-side buffering is disabled; Pipecat now consumes them silently since
-  each turn already has its own `context_id`.
-  (PR [#4390](https://github.com/pipecat-ai/pipecat/pull/4390))
-
-- Fixed Cartesia tag helpers (`SPELL`, `EMOTION_TAG`, `PAUSE_TAG`,
-  `VOLUME_TAG`, `SPEED_TAG`) raising `TypeError` when called on an instance
-  (e.g. `tts.SPELL("hi")`). They're now `@staticmethod` and callable from both
-  the class and an instance.
-  (PR [#4390](https://github.com/pipecat-ai/pipecat/pull/4390))
-
-- Fixed `CartesiaHttpTTSService` pushing two `ErrorFrame`s on a non-200
-  response — one with the API's error text and a second, less informative
-  "Unknown error" frame from the outer exception handler. It now pushes a
-  single frame that includes the HTTP status code and returns cleanly.
-  (PR [#4390](https://github.com/pipecat-ai/pipecat/pull/4390))
-
-- Fixed an issue where `LocalSmartTurnAnalyzerV3` was imported unconditionally
-  for user turn stop strategies. It is now only imported when
-  `default_user_turn_stop_strategies()` is called. This improves startup time
-  and removes the `transformers` "PyTorch/TensorFlow/Flax not found" warning
-  when the default stop strategies are not used.
-  (PR [#4393](https://github.com/pipecat-ai/pipecat/pull/4393))
-
-- Fixed `GrokRealtimeLLMService` ignoring the configured model. The model was
-  stored in `Settings` but never sent to xAI, so every session silently fell
-  back to xAI's server-side default. The model is now passed via the `?model=`
-  query parameter on the WebSocket URL as xAI's Voice Agent API requires.
-  (PR [#4401](https://github.com/pipecat-ai/pipecat/pull/4401))
-
-- Fixed `on_user_turn_stopped` firing prematurely when
-  `filter_incomplete_user_turns` was enabled. The event now fires only after
-  the LLM confirms the user turn is complete (`✓`); previously the smart-turn
-  detector's tentative stop was bubbling up before the LLM had a chance to veto
-  it, causing observers, transcript appenders and UI indicators to receive an
-  early — and sometimes duplicated — signal.
-  (PR [#4405](https://github.com/pipecat-ai/pipecat/pull/4405))
-
-- Fixed `TTSSpeakFrame(append_to_context=True)` greetings sometimes splitting
-  across two assistant messages in the LLM context and not surfacing in
-  `on_assistant_turn_stopped`. The `LLMAssistantPushAggregationFrame` emitted
-  at the end of a TTS context now carries a PTS just past the last word so it
-  can't overtake clock-queued `TTSTextFrame`s in the transport's output, and
-  `LLMAssistantAggregator` now triggers
-  `on_assistant_turn_started`/`on_assistant_turn_stopped` when it receives the
-  frame outside an LLM response cycle (restoring v0.0.104 behavior for greeting
-  transcripts).
-  (PR [#4414](https://github.com/pipecat-ai/pipecat/pull/4414))
-
-- Fixed `ElevenLabsTTSService` and `ElevenLabsHttpTTSService` producing merged
-  words (e.g. `bookLook`) when using Flash models. Flash often splits sentences
-  mid-stream into alignment chunks that begin with a real inter-word space, but
-  the previous fix unconditionally stripped that space from every chunk.
-  Leading spaces are now stripped only on the first alignment chunk of an
-  utterance, so subsequent chunks correctly flush partial words across
-  boundaries.
-  (PR [#4415](https://github.com/pipecat-ai/pipecat/pull/4415))
-
-- Fixed AWS Polly TTS, Bedrock LLM, and the Bedrock AgentCore processor
-  erroring out when only one of `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY`
-  was set in the environment. The half-populated kwargs are no longer forwarded
-  to aioboto3; partial env-var configurations now fall through to the boto3
-  credential chain like fully-unset configurations do.
-  (PR [#4416](https://github.com/pipecat-ai/pipecat/pull/4416))
-
-- Fixed `ElevenLabsTTSService` and `ElevenLabsHttpTTSService` writing
-  romanized/normalized text to the LLM context. With non-Latin input (e.g.,
-  Chinese), the assistant transcript was getting populated with pinyin (`Ni Hao
-  !` instead of `你好！`), which then degraded subsequent LLM turns. The services
-  now consume `alignment` by default and only switch to `normalizedAlignment` /
-  `normalized_alignment` when `pronunciation_dictionary_locators` is configured
-  (where `alignment` has overlapping restarts that produce duplicated/garbled
-  words, per #4316). Both fields are read with preferred-with-fallback
-  semantics since each is nullable per the API schema.
-  (PR [#4424](https://github.com/pipecat-ai/pipecat/pull/4424))
-
-- Fixed a deadlock in `TTSService` that could permanently stall pipeline
-  processing when all three conditions occurred together:
-  `pause_frame_processing=True`, an interruption arrived before any TTS audio
-  was played, and an `UninterruptibleFrame` (e.g. `TTSUpdateSettingsFrame`,
-  `FunctionCallResultFrame`) was in the processing queue at that moment. The
-  process task would block on `__process_event.wait()` indefinitely because
-  `BotStoppedSpeakingFrame` never arrives (no audio was played) and the
-  interruption handler did not resume processing. Affects services using
-  `pause_frame_processing=True` such as ElevenLabs, Rime, AsyncAI, Gradium, and
-  ResembleAI.
-  (PR [#4431](https://github.com/pipecat-ai/pipecat/pull/4431))
-
-- Fixed interruptions being delayed when a slow non-uninterruptible frame was
-  processing and an uninterruptible frame was waiting in the queue. The bot
-  would stall until the slow frame finished instead of cancelling it
-  immediately on interruption.
-  (PR [#4434](https://github.com/pipecat-ai/pipecat/pull/4434))
-
-- Fixed `TTSService` dropping uninterruptible frames (e.g.
-  `FunctionCallResultFrame`) from its internal serialization queue when an
-  interruption occurs. Previously, the queue was recreated on every
-  interruption, silently discarding any queued frames. The queue is now reset
-  instead of recreated, preserving uninterruptible frames so they are always
-  delivered downstream.
-  (PR [#4435](https://github.com/pipecat-ai/pipecat/pull/4435))
-
-- Fixed a race condition in the Daily transport that caused `AttributeError:
-  'NoneType' object has no attribute 'send_app_message'` when tearing down a
-  pipeline. Both `DailyInputTransport` and `DailyOutputTransport` share the
-  same `DailyTransportClient` and both call `cleanup()`, which was releasing
-  the underlying `CallClient` on the first call — leaving the second caller
-  with a `None` client.
-  (PR [#4440](https://github.com/pipecat-ai/pipecat/pull/4440))
-
-- Restored `cancel_on_interruption=False` support for `AWSNovaSonicLLMService`
-  and `OpenAIRealtimeLLMService`. These services previously honored the flag by
-  simply not cancelling in-flight function calls on interruption; the
-  introduction of the new async-tool mechanism (which threads
-  started/intermediate/final messages through the LLM context) broke that path
-  because the realtime services didn't know how to interpret those messages.
-  Note that new-style streamed intermediate results
-  (`FunctionCallResultProperties(is_final=False)`) are not supported on these
-  realtime services. Similar fixes for other impacted realtime services are
-  forthcoming.
-  (PR [#4441](https://github.com/pipecat-ai/pipecat/pull/4441))
-
-- Fixed two misspelled Gemini TTS voice names in
-  `GeminiTTSService.AVAILABLE_VOICES`.
-  (PR [#4443](https://github.com/pipecat-ai/pipecat/pull/4443))
-
-- Extended the `cancel_on_interruption=False` regression fix to
-  `GrokRealtimeLLMService`, `AzureRealtimeLLMService`, and
-  `UltravoxRealtimeLLMService`. Grok and Azure use the same approach as in
-  #4441 (each service detects async-tool messages in the LLM context and routes
-  the final result to its formal tool-result channel; Azure inherits
-  transitively from `OpenAIRealtimeLLMService`). Ultravox needed a different
-  approach because its API freezes the conversation between
-  `client_tool_invocation` and the matching `client_tool_result` — for
-  async-registered functions it now ships a placeholder `client_tool_result`
-  immediately when the function is invoked (to unfreeze the conversation), then
-  injects the real result as user-side text once the tool finishes. Streamed
-  intermediate results (`FunctionCallResultProperties(is_final=False)`) are
-  still not supported on any of these realtime services. `GeminiLiveLLMService`
-  and `InworldRealtimeLLMService` are excluded for now: Gemini Live's
-  async-tool path needs deeper investigation, and Inworld tool calling needs to
-  be sorted out first.
-  (PR [#4447](https://github.com/pipecat-ai/pipecat/pull/4447))
-
-- Fixed `OpenAIRealtimeLLMService` handling of multi-output-item responses
-  (observed with `gpt-realtime-2`). A single response can now contain more than
-  one audio item, and the first item's `audio.done` may arrive after the second
-  item's deltas have started. Deltas still arrive strictly in playback order,
-  so we continue to forward them as received (matching OpenAI's reference
-  implementation). The fix removes spurious warnings, ensures truncation always
-  targets the latest audio item, and emits a single bracketing
-  `TTSStartedFrame`/`TTSStoppedFrame` pair per assistant turn (the Stopped is
-  now pushed on `response.done`).
-  (PR [#4465](https://github.com/pipecat-ai/pipecat/pull/4465))
-
-- Fixed missing `output` attribute on LLM OpenTelemetry spans when the LLM call
-  is interrupted mid-stream.
-  (PR [#4467](https://github.com/pipecat-ai/pipecat/pull/4467))
-
-- Fixed incorrect `metrics.ttfb` on STT OpenTelemetry spans, and parented them
-  to the current turn span.
-  (PR [#4467](https://github.com/pipecat-ai/pipecat/pull/4467))
-
-- Fixed incorrect `metrics.ttfb` on TTS OpenTelemetry spans for streaming
-  services.
-  (PR [#4467](https://github.com/pipecat-ai/pipecat/pull/4467))
-
-- Extended the `cancel_on_interruption=False` regression fix to
-  `InworldRealtimeLLMService`. Uses the same approach as in #4441 (the service
-  detects async-tool messages in the LLM context and routes the final result to
-  its formal tool-result channel). Note: as of this writing, Inworld Realtime
-  doesn't appear to handle the resulting delayed tool result reliably — the
-  routing is best-effort and the service surfaces a one-time warning when
-  async-tool messages are seen. Streamed intermediate results
-  (`FunctionCallResultProperties(is_final=False)`) are still not supported on
-  this realtime service. (Inworld was excluded from #4447 pending resolution of
-  an unrelated tool-calling issue, which turned out to be an account-level
-  matter.)
-  (PR [#4474](https://github.com/pipecat-ai/pipecat/pull/4474))
-
-- Fixed Cartesia TTS Korean word timestamps to use normal spacing rules,
-  preserving word boundaries and per-word timestamp alignment during downstream
-  aggregation.
-  (PR [#4475](https://github.com/pipecat-ai/pipecat/pull/4475))
-
-- Fixed Cartesia TTS Chinese and Japanese timestamp grouping to preserve
-  provider text spacing, avoiding artificial spaces when timestamp groups are
-  reassembled downstream.
-  (PR [#4475](https://github.com/pipecat-ai/pipecat/pull/4475))
-
-- Fixed `SonioxSTTService` final transcription frames missing detected language
-  metadata when Soniox returns token-level language annotations.
-  (PR [#4482](https://github.com/pipecat-ai/pipecat/pull/4482))
-
-- Fixed Soniox final transcription language detection to use the most common
-  recognized token language, avoiding mislabeling an utterance when the last
-  token is tagged with a different language.
-  (PR [#4495](https://github.com/pipecat-ai/pipecat/pull/4495))
-
- Fixed dropped audio in streaming TTS services whose wire protocol doesn't
-  echo `context_id` back on incoming audio (Sarvam, Smallest, Soniox, Inworld,
-  and others). Previously, audio that arrived between contexts or at the very
-  start of a turn was tagged with `context_id=None` and silently dropped with
-  an "unable to append audio to context: no context ID provided" debug log.
-  `TTSService.get_active_audio_context_id()` now falls back to the
-  synthesis-side `_turn_context_id` when the playback cursor isn't set yet.
-  (PR [#4497](https://github.com/pipecat-ai/pipecat/pull/4497))
-
-### Security
-
- Fixed a path traversal issue in the development runner's
-  `/files/{filename:path}` download endpoint. Previously, when the runner was
-  started with `--folder`, a request like `/files/..%2F..%2Fetc%2Fpasswd` could
-  escape the configured folder because `%2F`-encoded separators bypassed
-  Starlette's path normalisation. The endpoint now resolves the joined path and
-  rejects any filename that escapes the allowed base with a 403, and also
-  returns 404 (instead of an implicit `null` 200) when `--folder` is unset.
-  (PR [#4417](https://github.com/pipecat-ai/pipecat/pull/4417))
-
 ## [1.1.0] - 2026-04-27

 ### Added
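The removed Security entry above describes resolving the joined path and rejecting anything that escapes the configured folder. A minimal sketch of that kind of check follows, assuming a hypothetical helper named `resolve_download` that returns a status code and a path; it is not the runner's actual code, and the explicit `unquote` call only simulates a filename arriving with `%2F`-encoded separators.

```python
from pathlib import Path
from typing import Optional, Tuple
from urllib.parse import unquote


def resolve_download(base_dir: Optional[str], filename: str) -> Tuple[int, Optional[Path]]:
    """Hypothetical helper: map a requested filename to (status_code, path)."""
    # No folder configured: report "not found" rather than an implicit empty 200.
    if base_dir is None:
        return 404, None

    base = Path(base_dir).resolve()
    # Decode first so "..%2F" becomes "../", then resolve to collapse ".." segments.
    candidate = (base / unquote(filename)).resolve()

    # Reject anything whose resolved location is outside the allowed base.
    if not candidate.is_relative_to(base):
        return 403, None
    if not candidate.is_file():
        return 404, None
    return 200, candidate
```

Comparing resolved paths, rather than scanning the raw string for `..`, is what makes the check independent of how the separators were encoded.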
--- a/README.md
+++ b/README.md
@@ -92,10 +92,10 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
 | Category            | Services |
 | ------------------- | -------- |
 | Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/api-reference/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/api-reference/server/services/stt/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/api-reference/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/api-reference/server/services/stt/gladia), [Google](https://docs.pipecat.ai/api-reference/server/services/stt/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/stt/mistral), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/stt/nvidia), [OpenAI (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/api-reference/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/api-reference/server/services/stt/whisper), [xAI](https://docs.pipecat.ai/api-reference/server/services/stt/xai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                             |
-| LLMs                | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Inception](https://docs.pipecat.ai/api-reference/server/services/llm/inception), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together)                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                               |
+| LLMs                | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together)                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                             |
 | Text-to-Speech      | [Async](https://docs.pipecat.ai/api-reference/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/api-reference/server/services/tts/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/api-reference/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/api-reference/server/services/tts/fish), [Google](https://docs.pipecat.ai/api-reference/server/services/tts/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/api-reference/server/services/tts/groq), [Hume](https://docs.pipecat.ai/api-reference/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/api-reference/server/services/tts/inworld), [Kokoro](https://docs.pipecat.ai/api-reference/server/services/tts/kokoro), [LMNT](https://docs.pipecat.ai/api-reference/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/api-reference/server/services/tts/minimax), [Mistral](https://docs.pipecat.ai/api-reference/server/services/tts/mistral), [Neuphonic](https://docs.pipecat.ai/api-reference/server/services/tts/neuphonic), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/tts/nvidia), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/tts/openai), [Piper](https://docs.pipecat.ai/api-reference/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/api-reference/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/api-reference/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/api-reference/server/services/tts/smallest), [Soniox](https://docs.pipecat.ai/api-reference/server/services/tts/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/api-reference/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/api-reference/server/services/tts/xtts) |
 | Speech-to-Speech    | [AWS Nova Sonic](https://docs.pipecat.ai/api-reference/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/api-reference/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/api-reference/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/api-reference/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/api-reference/server/services/s2s/ultravox),                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                             |
-| Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/api-reference/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/api-reference/server/services/transport/small-webrtc), [Vonage (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/vonage), [WebSocket Server](https://docs.pipecat.ai/api-reference/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/api-reference/server/services/transport/whatsapp), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                                                                                        |
+| Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/api-reference/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/api-reference/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/api-reference/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/api-reference/server/services/transport/whatsapp), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                             |
 | Serializers         | [Exotel](https://docs.pipecat.ai/api-reference/server/services/serializers/exotel), [Genesys](https://docs.pipecat.ai/api-reference/server/services/serializers/genesys), [Plivo](https://docs.pipecat.ai/api-reference/server/services/serializers/plivo), [Twilio](https://docs.pipecat.ai/api-reference/server/services/serializers/twilio), [Telnyx](https://docs.pipecat.ai/api-reference/server/services/serializers/telnyx), [Vonage](https://docs.pipecat.ai/api-reference/server/services/serializers/vonage)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                             |
 | Video               | [HeyGen](https://docs.pipecat.ai/api-reference/server/services/video/heygen), [LemonSlice](https://docs.pipecat.ai/api-reference/server/services/transport/lemonslice), [Tavus](https://docs.pipecat.ai/api-reference/server/services/video/tavus), [Simli](https://docs.pipecat.ai/api-reference/server/services/video/simli)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
                                                                                                                                                                                                                                             |
 | Memory              | [mem0](https://docs.pipecat.ai/api-reference/server/services/memory/mem0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                             |
--- a/changelog/4052.added.md
+++ b/changelog/4052.added.md
@@ -1 +0,0 @@
- Added `VonageVideoConnectorTransport`, a new transport integration for real-time Vonage WebRTC sessions using the Vonage Video Connector library.
--- a/changelog/4306.fixed.md
+++ b/changelog/4306.fixed.md
@@ -1 +0,0 @@
- Fixed Azure TTS last word being missed by observers and RTVI UI. The completion signal was racing with word timestamp processing, causing the final word's `TTSTextFrame` to arrive after `TTSStoppedFrame`. Completion is now routed through the word boundary queue to ensure all words are processed before signaling stream end.
--- a/changelog/4380.fixed.2.md
+++ b/changelog/4380.fixed.2.md
@@ -1 +0,0 @@
- Fixed `BaseOutputTransport` reordering frames that share the same presentation timestamp. Frames with equal PTS values are now emitted in insertion order, preventing subtle audio/text sequencing bugs when multiple frames arrive at the same time.
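One common way to get the insertion-order guarantee described in this entry is to add a monotonically increasing sequence number as a tie-breaker in a priority queue. The sketch below illustrates that technique with a hypothetical `StablePtsQueue` class; the entry does not say how `BaseOutputTransport` implements the fix, so treat this as an assumption.

```python
import heapq
import itertools
from typing import Any, List, Tuple


class StablePtsQueue:
    """Hypothetical queue: orders by PTS, then by insertion order for equal PTS."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        self._seq = itertools.count()  # strictly increasing tie-breaker

    def put(self, pts: int, frame: Any) -> None:
        # The sequence number is compared before the frame, so equal-PTS
        # entries never fall back to comparing frame objects.
        heapq.heappush(self._heap, (pts, next(self._seq), frame))

    def get(self) -> Any:
        _pts, _seq, frame = heapq.heappop(self._heap)
        return frame

    def __len__(self) -> int:
        return len(self._heap)


queue = StablePtsQueue()
for pts, name in [(10, "audio-1"), (5, "text-0"), (10, "text-1"), (10, "audio-2")]:
    queue.put(pts, name)

ordered = [queue.get() for _ in range(len(queue))]
```

Without the sequence number, a heap makes no promise about the relative order of entries with the same key.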
--- a/changelog/4380.fixed.3.md
+++ b/changelog/4380.fixed.3.md
@@ -1 +0,0 @@
- Fixed Cartesia word timestamps leaking SSML tag text (e.g. `<spell>`, `<emotion>`, `<break>`) into word entries. Tags are now stripped before processing, so word-to-text attribution remains accurate when SSML markup is present in the TTS input.
--- a/changelog/4380.fixed.4.md
+++ b/changelog/4380.fixed.4.md
@@ -1 +0,0 @@
- Fixed `TTSTextFrame` entries losing their original text structure when word timestamps are enabled. Each `TTSTextFrame` now carries a `raw_text` field containing the corresponding span of the original LLM-produced text (including pattern delimiters such as `<card>4111 1111 1111 1111</card>`), so the assistant context receives properly-tagged content rather than the cleaned words returned by the TTS provider. Also handles words that straddle two sentence boundaries by splitting them and attributing each part to its correct source frame.
--- a/changelog/4380.fixed.md
+++ b/changelog/4380.fixed.md
@@ -1 +0,0 @@
- Fixed skipped TTS frames (e.g. code blocks filtered via `skip_aggregator_types`) being emitted to the assistant context immediately instead of waiting for preceding spoken frames to finish. They now hold their position in the frame sequence and are flushed only after all earlier spoken sentences are complete, keeping context ordering correct.
--- a/changelog/4385.added.md
+++ b/changelog/4385.added.md
@@ -0,0 +1 @@
+- Added a `session_id` field to `RunnerArguments` so bots can log or trace a per-session identifier in local development the same way they can in Pipecat Cloud. The development runner now mints a UUID at every construction site, and paths that already returned a `sessionId` to the caller (Daily `/start`, dial-in webhook) share that same UUID with the runner args instead of generating two. The SmallWebRTC `/api/offer` endpoint also accepts an optional `session_id` query parameter so the `/sessions/{session_id}/...` proxy can thread it through.
--- a/changelog/4386.changed.md
+++ b/changelog/4386.changed.md
@@ -0,0 +1 @@
+- Updated the default `SonioxTTSService` model from `tts-rt-v1-preview` to the generally available `tts-rt-v1`.
--- a/changelog/4390.added.md
+++ b/changelog/4390.added.md
@@ -0,0 +1 @@
+- Added a `max_buffer_delay_ms` constructor argument to `CartesiaTTSService` for controlling Cartesia's server-side text buffering. When unset, Pipecat picks a sensible default based on `text_aggregation_mode`: `0` in `SENTENCE` mode (custom buffering — avoids stacking client-side aggregation on top of Cartesia's default 3000ms server buffer) and unset in `TOKEN` mode (Cartesia's managed buffering applies). Pass an explicit value (0–5000ms) to override.
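The default selection described in this entry can be sketched as a small resolver. The enum and function below are hypothetical stand-ins, not Pipecat's actual types, and raising `ValueError` for out-of-range values is an assumption; the entry only states the accepted range.

```python
from enum import Enum
from typing import Optional


class AggregationMode(Enum):
    """Hypothetical stand-in for the text aggregation mode setting."""

    SENTENCE = "sentence"
    TOKEN = "token"


def resolve_max_buffer_delay_ms(
    mode: AggregationMode, explicit: Optional[int] = None
) -> Optional[int]:
    """Return the value to send, or None to leave the field unset."""
    if explicit is not None:
        # Assumption: values outside the documented 0-5000 ms range are rejected.
        if not 0 <= explicit <= 5000:
            raise ValueError("max_buffer_delay_ms must be between 0 and 5000")
        return explicit
    if mode is AggregationMode.SENTENCE:
        # Client-side sentence aggregation already buffers, so disable server buffering.
        return 0
    # TOKEN mode: leave unset so the provider's managed buffering applies.
    return None
```

Returning `None` rather than a number for `TOKEN` mode keeps "unset" distinct from an explicit `0`.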
--- a/changelog/4390.changed.2.md
+++ b/changelog/4390.changed.2.md
@@ -0,0 +1 @@
+- Default `cartesia_version` for `CartesiaTTSService` bumped from `2025-04-16` to `2026-03-01`, matching `CartesiaHttpTTSService` and unlocking the `use_normalized_timestamps` and `max_buffer_delay_ms` fields.
--- a/changelog/4390.changed.md
+++ b/changelog/4390.changed.md
@@ -0,0 +1 @@
+- ⚠️ `CartesiaTTSService` now sends `use_normalized_timestamps: true` instead of the deprecated `use_original_timestamps` field. Word timestamps now reflect what was actually spoken (post text-normalization and pronunciation-dictionary substitution), matching the convention Pipecat uses for ElevenLabs. This is a behavior change for `sonic-3` users, who were previously receiving timestamps tied to the input transcript.
--- a/changelog/4390.fixed.2.md
+++ b/changelog/4390.fixed.2.md
@@ -0,0 +1 @@
+- Fixed `CartesiaHttpTTSService` pushing two `ErrorFrame`s on a non-200 response — one with the API's error text and a second, less informative "Unknown error" frame from the outer exception handler. It now pushes a single frame that includes the HTTP status code and returns cleanly.
--- a/changelog/4390.fixed.3.md
+++ b/changelog/4390.fixed.3.md
@@ -0,0 +1 @@
+- Fixed Cartesia tag helpers (`SPELL`, `EMOTION_TAG`, `PAUSE_TAG`, `VOLUME_TAG`, `SPEED_TAG`) raising `TypeError` when called on an instance (e.g. `tts.SPELL("hi")`). They're now `@staticmethod` and callable from both the class and an instance.
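Reviewer note: the failure mode described in 4390.fixed.3 can be reproduced without Pipecat. The sketch below uses two hypothetical classes (`BrokenTags`, `FixedTags`) and an assumed `<spell>` tag format. It is not the actual `CartesiaTTSService` code; it only shows why a helper defined without `self` and without `@staticmethod` fails on an instance.

```python
class BrokenTags:
    # Hypothetical stand-in: a helper written without `self` and without
    # @staticmethod. It works on the class but not on an instance.
    def SPELL(text: str) -> str:
        return f"<spell>{text}</spell>"  # tag format is an assumption


class FixedTags:
    # Same helper declared as a static method, callable either way.
    @staticmethod
    def SPELL(text: str) -> str:
        return f"<spell>{text}</spell>"


def call_on_instance(cls, text: str) -> str:
    """Call SPELL on an instance and report a TypeError instead of raising."""
    try:
        return cls().SPELL(text)
    except TypeError:
        # The instance is passed implicitly as an extra positional argument.
        return "TypeError"
```

Calling `call_on_instance(BrokenTags, "hi")` reports the `TypeError`, while `FixedTags` behaves the same from the class and from an instance.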
--- a/changelog/4390.fixed.md
+++ b/changelog/4390.fixed.md
@@ -0,0 +1 @@
+- Fixed `CartesiaTTSService` surfacing `flush_done` messages from Cartesia as `ErrorFrame`s. The latest API emits a `flush_done` per transcript when server-side buffering is disabled; Pipecat now consumes them silently since each turn already has its own `context_id`.
--- a/changelog/4393.fixed.md
+++ b/changelog/4393.fixed.md
@@ -0,0 +1 @@
+- Fixed an issue where `LocalSmartTurnAnalyzerV3` was imported unconditionally for user turn stop strategies. It is now only imported when `default_user_turn_stop_strategies()` is called. This improves startup time and removes the `transformers` "PyTorch/TensorFlow/Flax not found" warning when the default stop strategies are not used.
--- a/changelog/4395.changed.md
+++ b/changelog/4395.changed.md
@@ -0,0 +1 @@
+- Broadened `tool_resources` to `app_resources` for easy access not just in tool handlers but in other places like custom `FrameProcessor`s. Three changes: a rename (`tool_resources` → `app_resources`), a new `app_resources` property on `PipelineTask`, and a new `pipeline_task` property on `FrameProcessor`. Tool handlers now read `params.app_resources`; custom processors read `self.pipeline_task.app_resources`. The previous `tool_resources` aliases (on `PipelineTask`, `FunctionCallParams`, and `FrameProcessorSetup`) keep working but are deprecated as of 1.2.0 and emit `DeprecationWarning`s.
--- a/changelog/4397.changed.md
+++ b/changelog/4397.changed.md
@@ -0,0 +1 @@
+- Lowered the per-message log in `SmallWebRTCInputTransport._handle_app_message` from `debug` to `trace`. App messages can be high-frequency and were noisy at debug level; set the loguru level to `TRACE` to see them again.
--- a/changelog/4400.added.md
+++ b/changelog/4400.added.md
@@ -0,0 +1 @@
+- Added a `mip_opt_out` constructor argument to `DeepgramTTSService` and `DeepgramHttpTTSService` so callers can opt out of the Deepgram Model Improvement Program. When set, the value is forwarded to Deepgram as a query parameter on the speak request. Defaults to `None`, which preserves the existing behavior. See https://dpgr.am/deepgram-mip for pricing implications before enabling.
--- a/changelog/4401.changed.md
+++ b/changelog/4401.changed.md
@@ -0,0 +1 @@
+- Changed the default model for `GrokRealtimeLLMService` to `grok-voice-think-fast-1.0`, xAI's recommended Voice Agent model. The previous default of `grok-voice-fast-1.0` has been deprecated by xAI and is being removed.
--- a/changelog/4401.fixed.md
+++ b/changelog/4401.fixed.md
@@ -0,0 +1 @@
+- Fixed `GrokRealtimeLLMService` ignoring the configured model. The model was stored in `Settings` but never sent to xAI, so every session silently fell back to xAI's server-side default. The model is now passed via the `?model=` query parameter on the WebSocket URL as xAI's Voice Agent API requires.
--- a/changelog/4404.added.md
+++ b/changelog/4404.added.md
@@ -0,0 +1 @@
+- Added an opt-in `add_tool_change_messages` flag to the LLM aggregators (set via `LLMContextAggregatorPair(..., add_tool_change_messages=True)`) that appends a developer-role message to the context whenever `LLMSetToolsFrame` changes the set of advertised standard tools. Helps the LLM stay coherent across mid-conversation tool changes, mitigating several flavors of tool-call-related hallucination: calling tools that have been removed, avoiding tools that have been re-added, and hallucinating output (made-up answers or tool-call-shaped non-tool-calls) when tools are unavailable.
--- a/changelog/4405.added.2.md
+++ b/changelog/4405.added.2.md
@@ -0,0 +1 @@
+- Added `LLMTurnCompletionUserTurnStopStrategy` in `pipecat.turns.user_stop`. When installed, the strategy gates `on_user_turn_stopped` on a `UserTurnInferenceCompletedFrame` (a new fieldless system frame emitted by any component that can judge turn completeness — e.g. the `UserTurnCompletionLLMServiceMixin` on `✓`). A `finalization_timeout` provides a safety net if no completion frame ever arrives.
--- a/changelog/4405.added.3.md
+++ b/changelog/4405.added.3.md
@@ -0,0 +1 @@
+- Added `deferred(strategy)` and `DeferredUserTurnStopStrategy` in `pipecat.turns.user_stop`. Wraps a stop strategy so it fires only the inference-triggered event and suppresses `on_user_turn_stopped`, leaving finalization to another strategy in the chain such as `LLMTurnCompletionUserTurnStopStrategy`.
--- a/changelog/4405.added.4.md
+++ b/changelog/4405.added.4.md
@@ -0,0 +1 @@
+- Added `FilterIncompleteUserTurnStrategies` in `pipecat.turns.user_turn_strategies` — a `UserTurnStrategies` specialization that wraps the detector chain with `deferred(...)` and appends `LLMTurnCompletionUserTurnStopStrategy` as the finalizer. Common case: `user_turn_strategies=FilterIncompleteUserTurnStrategies()`. Pass `config=UserTurnCompletionConfig(...)` to customize timeouts and prompts.
--- a/changelog/4405.added.5.md
+++ b/changelog/4405.added.5.md
@@ -0,0 +1 @@
+- Added `ExternalUserTurnCompletionStopStrategy` in `pipecat.turns.user_stop` — a generic stop strategy that finalizes the user turn whenever a `UserTurnInferenceCompletedFrame` arrives, regardless of which component produced it. `LLMTurnCompletionUserTurnStopStrategy` now extends this base; future producers (Flux, custom end-of-turn classifiers, etc.) can use the base directly or subclass it to add producer-specific setup.
--- a/changelog/4405.added.md
+++ b/changelog/4405.added.md
@@ -0,0 +1 @@
+- Added `on_user_turn_inference_triggered`, a new event on the user turn controller, processor, aggregator and stop strategies that fires when a strategy has enough signal to start LLM inference. By default it fires together with `on_user_turn_stopped`; a gating strategy can fire only the inference-triggered event and defer finalization to a peer.
--- a/changelog/4405.deprecated.md
+++ b/changelog/4405.deprecated.md
@@ -0,0 +1 @@
+- Deprecated `LLMUserAggregatorParams.filter_incomplete_user_turns`. Use `user_turn_strategies=FilterIncompleteUserTurnStrategies()` (or add `LLMTurnCompletionUserTurnStopStrategy` to a custom `user_turn_strategies.stop`) instead. Setting the legacy flag still works for one release: the aggregator emits a `DeprecationWarning` and rewires the strategies as if you had passed `FilterIncompleteUserTurnStrategies` directly.
--- a/changelog/4405.fixed.md
+++ b/changelog/4405.fixed.md
@@ -0,0 +1 @@
+- Fixed `on_user_turn_stopped` firing prematurely when `filter_incomplete_user_turns` was enabled. The event now fires only after the LLM confirms the user turn is complete (`✓`); previously the smart-turn detector's tentative stop was bubbling up before the LLM had a chance to veto it, causing observers, transcript appenders and UI indicators to receive an early — and sometimes duplicated — signal.
--- a/changelog/4407.added.md
+++ b/changelog/4407.added.md
@@ -0,0 +1,6 @@
+- Added first-class RTVI support for the UI Agent Protocol:
+  - Adds `ui-event`, `ui-snapshot`, and `ui-cancel-task` client-to-server messages, plus `ui-command` and `ui-task` server-to-client messages, with paired `*Data` / `*Message` pydantic models.
+  - Adds built-in command payload models for `Toast`, `Navigate`, `ScrollTo`, `Highlight`, `Focus`, `Click`, `SetInputValue`, and `SelectText`; matching default handlers live in `@pipecat-ai/client-react`.
+  - Adds `RTVIProcessor.on_ui_message` for inbound `ui-event`, `ui-snapshot`, and `ui-cancel-task` messages.
+  - Adds five UI pipeline frames, mirroring the `client-message` frame-and-event pattern: downstream code pushes `RTVIUICommandFrame` / `RTVIUITaskFrame` for the observer to wrap into outbound `UICommandMessage` / `UITaskMessage` envelopes, while the processor pushes inbound `RTVIUIEventFrame`, `RTVIUISnapshotFrame`, and `RTVIUICancelTaskFrame` alongside `on_ui_message`.
+  - Bumps the RTVI `PROTOCOL_VERSION` from `1.2.0` to `1.3.0`.
--- a/changelog/4414.fixed.md
+++ b/changelog/4414.fixed.md
@@ -0,0 +1 @@
+- Fixed `TTSSpeakFrame(append_to_context=True)` greetings sometimes splitting across two assistant messages in the LLM context and not surfacing in `on_assistant_turn_stopped`. The `LLMAssistantPushAggregationFrame` emitted at the end of a TTS context now carries a PTS just past the last word so it can't overtake clock-queued `TTSTextFrame`s in the transport's output, and `LLMAssistantAggregator` now triggers `on_assistant_turn_started`/`on_assistant_turn_stopped` when it receives the frame outside an LLM response cycle (restoring v0.0.104 behavior for greeting transcripts).
--- a/changelog/4415.fixed.md
+++ b/changelog/4415.fixed.md
@@ -0,0 +1 @@
+- Fixed `ElevenLabsTTSService` and `ElevenLabsHttpTTSService` producing merged words (e.g. `bookLook`) when using Flash models. Flash often splits sentences mid-stream into alignment chunks that begin with a real inter-word space, but the previous fix unconditionally stripped that space from every chunk. Leading spaces are now stripped only on the first alignment chunk of an utterance, so subsequent chunks correctly flush partial words across boundaries.
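Reviewer note: a minimal sketch of the chunk-handling rule described in 4415.fixed, assuming each alignment chunk arrives as a plain string of characters. The function name `words_from_alignment_chunks` is hypothetical and the real services work on alignment payloads with timestamps; this only illustrates why the leading space of later chunks has to be kept.

```python
from typing import Iterable, List


def words_from_alignment_chunks(chunks: Iterable[str]) -> List[str]:
    """Assemble words from character chunks (hypothetical helper).

    Leading spaces are stripped only on the first chunk of the utterance.
    On later chunks a leading space is a real word boundary and must
    flush the partial word carried over from the previous chunk.
    """
    words: List[str] = []
    partial = ""
    for index, chunk in enumerate(chunks):
        if index == 0:
            chunk = chunk.lstrip(" ")
        for char in chunk:
            if char == " ":
                if partial:
                    words.append(partial)
                    partial = ""
            else:
                partial += char
    if partial:
        words.append(partial)
    return words
```

With chunks `["Read the book", " Look here"]`, the space opening the second chunk flushes `book` before `Look` starts; stripping it on every chunk is what would merge the two into `bookLook`.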
--- a/changelog/4416.added.md
+++ b/changelog/4416.added.md
@@ -0,0 +1 @@
+- AWS Transcribe STT, Polly TTS, Bedrock LLM, and the Bedrock AgentCore processor now resolve credentials via the standard boto3 provider chain (EC2 instance profiles, EKS pod roles / IRSA, ECS task roles, SSO, `~/.aws/credentials`) when explicit credentials and `AWS_*` environment variables are absent. Services running with IAM roles no longer need to export static credentials.
--- a/changelog/4416.fixed.md
+++ b/changelog/4416.fixed.md
@@ -0,0 +1 @@
+- Fixed AWS Polly TTS, Bedrock LLM, and the Bedrock AgentCore processor erroring out when only one of `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` was set in the environment. The half-populated kwargs are no longer forwarded to aioboto3; partial env-var configurations now fall through to the boto3 credential chain like fully-unset configurations do.
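Reviewer note: the fall-through behavior in 4416.added and 4416.fixed can be sketched as below. The helper name `credential_kwargs` and its signature are assumptions for illustration, not Pipecat's actual code; only the `aws_access_key_id` / `aws_secret_access_key` keyword names come from boto3.

```python
import os
from typing import Dict, Mapping, Optional


def credential_kwargs(
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return client kwargs only when a complete key pair is available.

    Hypothetical helper: an empty dict means "pass nothing", which lets
    the boto3 provider chain (instance profile, IRSA, SSO, ...) resolve
    credentials on its own.
    """
    env = os.environ if env is None else env
    key = access_key or env.get("AWS_ACCESS_KEY_ID")
    secret = secret_key or env.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        return {"aws_access_key_id": key, "aws_secret_access_key": secret}
    # Half-populated configuration: forward nothing rather than one key.
    return {}
```

A half-populated environment such as `{"AWS_ACCESS_KEY_ID": "AKIA..."}` then yields `{}`, the same result as a fully unset one.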
--- a/changelog/4417.security.md
+++ b/changelog/4417.security.md
@@ -0,0 +1 @@
+- Fixed a path traversal issue in the development runner's `/files/{filename:path}` download endpoint. Previously, when the runner was started with `--folder`, a request like `/files/..%2F..%2Fetc%2Fpasswd` could escape the configured folder because `%2F`-encoded separators bypassed Starlette's path normalisation. The endpoint now resolves the joined path and rejects any filename that escapes the allowed base with a 403, and also returns 404 (instead of an implicit `null` 200) when `--folder` is unset.
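Reviewer note: the resolve-then-check approach described in 4417.security can be sketched with the standard library alone. The function name `resolve_download_path` and the choice of exceptions are assumptions; the real endpoint returns HTTP 403 and 404 responses rather than raising these errors.

```python
from pathlib import Path
from typing import Optional


def resolve_download_path(base_folder: Optional[str], filename: str) -> Path:
    """Resolve `filename` under `base_folder`, rejecting escapes.

    Hypothetical helper: FileNotFoundError stands in for the 404 returned
    when no folder is configured, PermissionError for the 403 returned
    when the resolved path leaves the allowed base.
    """
    if base_folder is None:
        raise FileNotFoundError("no download folder configured")
    base = Path(base_folder).resolve()
    # Resolve after joining so that ".." segments are collapsed first.
    candidate = (base / filename).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise PermissionError(f"{filename!r} escapes the download folder")
    return candidate
```

The containment check runs on the resolved path, so a decoded `../../etc/passwd` is rejected regardless of how the separators were encoded in the request.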
--- a/changelog/4422.changed.md
+++ b/changelog/4422.changed.md
@@ -0,0 +1 @@
+- Changed the default Inworld TTS model from `inworld-tts-1.5-max` to `inworld-tts-2` (Realtime TTS-2) across `InworldHttpTTSService`, `InworldTTSService`, and the `InworldRealtimeLLMService` cascade. Existing users can pin the prior model explicitly via the `model`/`tts_model` argument; both `inworld-tts-1.5-max` and `inworld-tts-1.5-mini` remain valid model IDs.
--- a/changelog/4423.added.md
+++ b/changelog/4423.added.md
@@ -1 +0,0 @@
- Added `InceptionLLMService` for Inception's Mercury 2 diffusion reasoning model, with support for `reasoning_effort` and `realtime` settings.
--- a/changelog/4424.fixed.md
+++ b/changelog/4424.fixed.md
@@ -0,0 +1 @@
+- Fixed `ElevenLabsTTSService` and `ElevenLabsHttpTTSService` writing romanized/normalized text to the LLM context. With non-Latin input (e.g., Chinese), the assistant transcript was getting populated with pinyin (`Ni Hao !` instead of `你好！`), which then degraded subsequent LLM turns. The services now consume `alignment` by default and only switch to `normalizedAlignment` / `normalized_alignment` when `pronunciation_dictionary_locators` is configured (where `alignment` has overlapping restarts that produce duplicated/garbled words, per #4316). Both fields are read with preferred-with-fallback semantics since each is nullable per the API schema.
--- a/changelog/4426.added.md
+++ b/changelog/4426.added.md
@@ -0,0 +1 @@
+- Added `keyterms` support to ElevenLabs STT services so Scribe V2 callers can bias transcription for both file-based and realtime transcription.
--- a/changelog/4428.deprecated.md
+++ b/changelog/4428.deprecated.md
@@ -0,0 +1 @@
+- Deprecated `ResampyResampler` in favor of `SOXRAudioResampler` (or the `create_file_resampler()` / `create_stream_resampler()` factories). Instantiating `ResampyResampler` now emits a `DeprecationWarning`. The class will be removed in Pipecat 2.0 along with the default `resampy` and `numba` dependencies.
--- a/changelog/4429.changed.md
+++ b/changelog/4429.changed.md
@@ -0,0 +1 @@
+- Changed the default model for `GrokLLMService` from `grok-3` to `grok-4.20-non-reasoning`. xAI is retiring `grok-3` on May 15, 2026.
--- a/changelog/4430.added.md
+++ b/changelog/4430.added.md
@@ -0,0 +1 @@
+- Added `watchdog_min_timeout` parameter to `DeepgramFluxSTT` and `DeepgramFluxSageMakerSTT` (default `0.5` seconds) to control the minimum silence duration before the watchdog sends a silence packet to prevent dangling turns. The actual threshold is `max(chunk_duration * 2, watchdog_min_timeout)`, so it also adapts automatically to the audio chunk size in use.
--- a/changelog/4430.changed.md
+++ b/changelog/4430.changed.md
@@ -0,0 +1 @@
+- `DeepgramFluxSTT` watchdog silence threshold is now dynamic: `max(chunk_duration * 2, watchdog_min_timeout)` instead of a fixed 500 ms. This prevents false silence injections when large audio chunks are sent at lower frequency.
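Reviewer note: the threshold formula quoted in 4430.added and 4430.changed, written out as a small function. The function name is hypothetical and durations are assumed to be in seconds; only the formula and the `0.5` default come from the changelog entries.

```python
def watchdog_silence_threshold(
    chunk_duration: float, watchdog_min_timeout: float = 0.5
) -> float:
    """Seconds of silence before the watchdog sends a silence packet.

    Hypothetical helper implementing
    max(chunk_duration * 2, watchdog_min_timeout).
    """
    return max(chunk_duration * 2, watchdog_min_timeout)
```

With 20 ms chunks the floor of 0.5 s applies; with 400 ms chunks the threshold grows to 0.8 s, which avoids injecting silence between two legitimately spaced large chunks.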
--- a/changelog/4431.fixed.md
+++ b/changelog/4431.fixed.md
@@ -0,0 +1 @@
+- Fixed a deadlock in `TTSService` that could permanently stall pipeline processing when all three conditions occurred together: `pause_frame_processing=True`, an interruption arrived before any TTS audio was played, and an `UninterruptibleFrame` (e.g. `TTSUpdateSettingsFrame`, `FunctionCallResultFrame`) was in the processing queue at that moment. The process task would block on `__process_event.wait()` indefinitely because `BotStoppedSpeakingFrame` never arrives (no audio was played) and the interruption handler did not resume processing. Affects services using `pause_frame_processing=True` such as ElevenLabs, Rime, AsyncAI, Gradium, and ResembleAI.
--- a/changelog/4433.changed.md
+++ b/changelog/4433.changed.md
@@ -0,0 +1 @@
+- `ElevenLabsTTSService` now sends `close_context` to the server as soon as the turn is complete (on `on_turn_context_completed`) rather than waiting until all audio has finished playing back. The `isFinal` message from ElevenLabs is now used to signal `TTSStoppedFrame` and clean up the audio context, improving turn transition timing.
--- a/changelog/4434.fixed.md
+++ b/changelog/4434.fixed.md
@@ -0,0 +1 @@
+- Fixed interruptions being delayed when a slow interruptible frame was being processed and an uninterruptible frame was waiting in the queue. The bot would stall until the slow frame finished instead of cancelling it immediately on interruption.
--- a/changelog/4435.fixed.md
+++ b/changelog/4435.fixed.md
@@ -0,0 +1 @@
+- Fixed `TTSService` dropping uninterruptible frames (e.g. `FunctionCallResultFrame`) from its internal serialization queue when an interruption occurs. Previously, the queue was recreated on every interruption, silently discarding any queued frames. The queue is now reset instead of recreated, preserving uninterruptible frames so they are always delivered downstream.
--- a/changelog/4440.fixed.md
+++ b/changelog/4440.fixed.md
@@ -0,0 +1 @@
+- Fixed a race condition in the Daily transport that caused `AttributeError: 'NoneType' object has no attribute 'send_app_message'` when tearing down a pipeline. Both `DailyInputTransport` and `DailyOutputTransport` share the same `DailyTransportClient` and both call `cleanup()`, which was releasing the underlying `CallClient` on the first call — leaving the second caller with a `None` client.
--- a/changelog/4441.fixed.md
+++ b/changelog/4441.fixed.md
@@ -0,0 +1 @@
+- Restored `cancel_on_interruption=False` support for `AWSNovaSonicLLMService` and `OpenAIRealtimeLLMService`. These services previously honored the flag by simply not cancelling in-flight function calls on interruption; the introduction of the new async-tool mechanism (which threads started/intermediate/final messages through the LLM context) broke that path because the realtime services didn't know how to interpret those messages. Note that new-style streamed intermediate results (`FunctionCallResultProperties(is_final=False)`) are not supported on these realtime services. Similar fixes for other impacted realtime services are forthcoming.
--- a/changelog/4442.added.2.md
+++ b/changelog/4442.added.2.md
@@ -1 +0,0 @@
- Added `GET /status` endpoint to the development runner that reports which transports the running instance accepts (all by default, or the single transport passed via `-t`).
--- a/changelog/4442.added.md
+++ b/changelog/4442.added.md
@@ -1 +0,0 @@
- Added plain WebSocket transport support to the development runner. Bots can now accept connections from non-telephony WebSocket clients (e.g., browser apps using protobuf framing) via the `/ws-client` endpoint alongside other transports.
--- a/changelog/4442.changed.md
+++ b/changelog/4442.changed.md
@@ -1 +0,0 @@
- ⚠️ The development runner now supports all transports (WebRTC, Daily, telephony, plain WebSocket) simultaneously from a single server. The `/start` endpoint accepts a `"transport"` field to select the transport per-request; omitting `-t` at startup enables all transports instead of defaulting to WebRTC. The Daily browser-redirect route moved from `GET /` to `GET /daily`.
--- a/changelog/4443.fixed.md
+++ b/changelog/4443.fixed.md
@@ -0,0 +1 @@
+- Fixed two misspelled Gemini TTS voice names in `GeminiTTSService.AVAILABLE_VOICES`.
--- a/changelog/4446.change.md
+++ b/changelog/4446.change.md
@@ -0,0 +1 @@
+- Updated `InworldHttpTTSService` and `InworldTTSService` to use PCM audio encoding by default, which returns audio bytes without headers.
--- a/changelog/4447.fixed.md
+++ b/changelog/4447.fixed.md
@@ -0,0 +1 @@
+- Extended the `cancel_on_interruption=False` regression fix to `GrokRealtimeLLMService`, `AzureRealtimeLLMService`, and `UltravoxRealtimeLLMService`. Grok and Azure use the same approach as in #4441 (each service detects async-tool messages in the LLM context and routes the final result to its formal tool-result channel; Azure inherits transitively from `OpenAIRealtimeLLMService`). Ultravox needed a different approach because its API freezes the conversation between `client_tool_invocation` and the matching `client_tool_result` — for async-registered functions it now ships a placeholder `client_tool_result` immediately when the function is invoked (to unfreeze the conversation), then injects the real result as user-side text once the tool finishes. Streamed intermediate results (`FunctionCallResultProperties(is_final=False)`) are still not supported on any of these realtime services. `GeminiLiveLLMService` and `InworldRealtimeLLMService` are excluded for now: Gemini Live's async-tool path needs deeper investigation, and Inworld appears to have a pre-existing problem with even simple tool calling on its Realtime API.
--- a/changelog/4448.added.md
+++ b/changelog/4448.added.md
@@ -0,0 +1 @@
+- Added `cancel_on_interruption=False` support for `GeminiLiveLLMService` on models that support Gemini's NON_BLOCKING tool mechanism (currently Gemini 2.x); the conversation now continues while the tool runs. On models that don't yet support NON_BLOCKING (Gemini 3.x), the service surfaces a one-time warning explaining the limitation. (Note: an intermittent 1008 error can occasionally fire on Gemini 2.5 during long-running tool calls; we auto-reconnect.)
--- a/changelog/4449.changed.md
+++ b/changelog/4449.changed.md
@@ -0,0 +1 @@
+- Moved `create_task`, `cancel_task`, the `task_manager` property, and `setup(task_manager)` up from `FrameProcessor` to `BaseObject`. Custom `BaseObject` subclasses (turn strategies, controllers, etc.) now inherit these methods directly instead of reimplementing the task manager wiring. Owners propagate the task manager to their child `BaseObject`s via `await child.setup(task_manager)`.
--- a/changelog/4450.changed.md
+++ b/changelog/4450.changed.md
@@ -0,0 +1 @@
+- Changed the default OpenAI Realtime input audio transcription model from `gpt-4o-transcribe` to `gpt-realtime-whisper` for both `OpenAIRealtimeSTTService` and `OpenAIRealtimeLLMService`. The new model does not accept the `prompt` parameter; if a prompt is supplied alongside `gpt-realtime-whisper`, it is dropped automatically and a warning is logged. To keep using prompt hints, explicitly pin `model="gpt-4o-transcribe"` (or `"gpt-4o-mini-transcribe"`).
--- a/changelog/4462.changed.md
+++ b/changelog/4462.changed.md
@@ -0,0 +1 @@
+- Updated the default model for `CartesiaTTSService` and `CartesiaHttpTTSService` from `sonic-3` to `sonic-3.5`.
--- a/changelog/4464.added.2.md
+++ b/changelog/4464.added.2.md
@@ -0,0 +1 @@
+- Added NVIDIA Magpie TTS services via AWS SageMaker: `NvidiaSageMakerHTTPTTSService` (single HTTP invocation, streams raw PCM back) and `NvidiaSageMakerWebsocketTTSService` (persistent HTTP/2 bidi-stream with full interruption support via `InterruptibleTTSService`).
--- a/changelog/4464.added.md
+++ b/changelog/4464.added.md
@@ -0,0 +1 @@
+- Added `NvidiaSageMakerWebsocketSTTService` for streaming speech recognition using NVIDIA Nemotron ASR via an AWS SageMaker bidirectional-stream endpoint. Produces `InterimTranscriptionFrame` and `TranscriptionFrame` frames, is VAD-aware, and automatically reconnects on error.
--- a/changelog/4465.fixed.md
+++ b/changelog/4465.fixed.md
@@ -0,0 +1 @@
+- Fixed `OpenAIRealtimeLLMService` handling of multi-output-item responses (observed with `gpt-realtime-2`). A single response can now contain more than one audio item, and the first item's `audio.done` may arrive after the second item's deltas have started. Deltas still arrive strictly in playback order, so we continue to forward them as received (matching OpenAI's reference implementation). The fix removes spurious warnings, ensures truncation always targets the latest audio item, and emits a single bracketing `TTSStartedFrame`/`TTSStoppedFrame` pair per assistant turn (the Stopped is now pushed on `response.done`).
--- a/changelog/4470.added.md
+++ b/changelog/4470.added.md
@@ -0,0 +1 @@
+- Added support for `reasoning` configuration on `OpenAIRealtimeLLMService`, for use with reasoning-capable Realtime models such as `gpt-realtime-2`.
--- a/changelog/4472.changed.md
+++ b/changelog/4472.changed.md
@@ -0,0 +1 @@
+- Changed the default model for `OpenAIRealtimeLLMService` from `gpt-realtime-1.5` to `gpt-realtime-2`.
--- a/changelog/4480.added.md
+++ b/changelog/4480.added.md
@@ -0,0 +1 @@
+- Added `wait_for_transcript_to_end_user_turn` on `LLMUserAggregatorParams` for pipelines where local turn detection drives a realtime service like Gemini Live. Set it to False to avoid unnecessary latency from transcript delay — the realtime service consumes user audio directly, so we don't need user transcripts in context before it can respond. The option makes it so that (1) turn strategies do not consider user transcripts, letting the user turn end sooner, and (2) user transcripts are then handled by the aggregator: a simple timer gives it time to gather those transcripts after the user turn ends, and once gathered, the aggregator emits a new `on_user_turn_message_finalized` event with the new user context message. The new event also fires in the default mode (coinciding with `on_user_turn_stopped`), so consumers that want the populated user transcript can subscribe to it uniformly. See `examples/realtime/realtime-gemini-live-local-vad.py` for the full pattern.
--- a/changelog/4493.added.md
+++ b/changelog/4493.added.md
@@ -1 +0,0 @@
- Added `pipecat.workers`, a worker-based agent framework folded in from the standalone `pipecat-subagents` package. Workers inherit from `BaseWorker`, share a `WorkerBus`, register in a `WorkerRegistry`, and exchange typed work via `@job` handlers. `LLMWorker` and `LLMContextWorker` provide ready-made LLM-driven workers. `PipelineRunner.spawn(worker)` registers fire-and-forget workers alongside the main pipeline worker.
--- a/changelog/4493.changed.2.md
+++ b/changelog/4493.changed.2.md
@@ -1 +0,0 @@
- ⚠️ `FrameProcessorSetup.pipeline_worker` and `FunctionCallParams.pipeline_worker` are now mandatory fields, and `FrameProcessor.pipeline_worker` raises if read before `setup()` instead of returning `None`. Real-world code (frame processors set up by `PipelineWorker`, tool handlers invoked by `LLMService`) is unaffected; only callers that construct these dataclasses by hand (typically tests) now have to supply a `pipeline_worker` reference.
--- a/changelog/4493.changed.md
+++ b/changelog/4493.changed.md
@@ -1 +0,0 @@
- `PipelineWorker` now inherits from `BaseWorker`, so every pipeline worker is also a bus participant. It accepts a new optional `bridged=()` parameter that auto-wraps the pipeline with bus edge processors, letting the worker exchange frames with other bridged workers over the shared `WorkerBus`. The bus is supplied by `PipelineRunner` via `worker.attach(registry=..., bus=...)` instead of through the constructor.
--- a/changelog/4507.fixed.md
+++ b/changelog/4507.fixed.md
@@ -1 +0,0 @@
- Fixed `ElevenLabsSTTService` crashing when `language` was passed as `None`. When `language` is not set, the service now lets ElevenLabs auto-detect the audio language.
--- a/changelog/4514.fixed.md
+++ b/changelog/4514.fixed.md
@@ -1 +0,0 @@
- Fixed websocket STT connection setup failures so services clear stale websocket state and emit non-fatal error frames, allowing `ServiceSwitcher` failover to keep agents running.
--- a/changelog/4521.added.md
+++ b/changelog/4521.added.md
@@ -1 +0,0 @@
- Added `max_endpoint_delay_ms` to `SonioxSTTService.Settings`, controlling the maximum delay (500-3000 ms) before endpoint detection finalizes a turn.
--- a/changelog/4521.changed.md
+++ b/changelog/4521.changed.md
@@ -1 +0,0 @@
- `SonioxSTTService` now applies settings updates (e.g. via `STTUpdateSettingsFrame`) using a graceful reconnect instead of a hard disconnect/reconnect, preserving the service's reconnect retry behavior.
--- a/changelog/4521.removed.md
+++ b/changelog/4521.removed.md
@@ -1 +0,0 @@
- Removed the unsupported Georgian (`Language.KA`) language mapping from `SonioxSTTService`.
--- a/changelog/4522.changed.md
+++ b/changelog/4522.changed.md
@@ -1 +0,0 @@
- Updated the default p99 TTFS latency values for Smallest AI, Mistral, and XAI STT so turn stop timing uses measured values instead of the conservative fallback.
--- a/changelog/4524.changed.md
+++ b/changelog/4524.changed.md
@@ -1 +0,0 @@
- Updated the development runner startup banner to show the prebuilt client URL once and list enabled or disabled transports with install hints.
--- a/changelog/4524.fixed.md
+++ b/changelog/4524.fixed.md
@@ -1 +0,0 @@
- Fixed the development runner so missing optional transport dependencies disable only their related routes instead of failing startup in all-transport mode.
--- a/changelog/4527.fixed.md
+++ b/changelog/4527.fixed.md
@@ -1 +0,0 @@
- Fixed a race in `ElevenLabsTTSService` where the periodic keepalive could be sent for a new turn's context before that context's `voice_settings` initialization message, causing ElevenLabs to close the WebSocket with a 1008 policy violation (`voice_settings field must be provided in the first message ...`). The keepalive now only targets a context once its context-init has been sent.
--- a/changelog/4531.changed.md
+++ b/changelog/4531.changed.md
@@ -1 +0,0 @@
- Bumped `pipecat-ai-prebuilt` to 1.0.1 in the `runner` extra, updating the prebuilt client UI served by the development runner.
--- a/changelog/xxxx.added.2.md
+++ b/changelog/xxxx.added.2.md
@@ -1 +0,0 @@
- Added `LLMService.append_system_instruction(...)`, which composes durable text onto a user-provided system instruction (alongside the turn-completion and async-tool-cancellation instructions) so it is prepended on every inference and survives context-message resets.
--- a/changelog/xxxx.added.md
+++ b/changelog/xxxx.added.md
@@ -1,3 +0,0 @@
- Added `pipecat.workers.ui.UIWorker`, an `LLMContextWorker` that observes and drives a client GUI over the RTVI UI channel: it stores live accessibility snapshots, auto-injects `<ui_state>` into the LLM context before every inference (via the LLM's `on_before_process_frame` hook), dispatches client events to `@on_ui_event` handlers, and sends UI commands (`scroll_to`, `highlight`, `select_text`, `click`, `set_input_value`) back to the client. The optional `ReplyToolMixin` exposes a bundled `reply` tool, and `user_job_group(...)` surfaces fan-out work to the client as cancellable task cards. A native RTVI⇄bus UI bridge is built into `PipelineWorker` (active whenever RTVI is enabled), so no decorator or manual wiring is needed: inbound UI messages are broadcast on the bus as `BusUIEventMessage`, and outbound `BusUICommandMessage` / `BusUITask*` carriers are translated into RTVI frames for the client.
-
- `UIWorker` auto-injects the UI wire-format guide (`UI_STATE_PROMPT_GUIDE`) into its LLM's system instruction by default, via a `prompt_guide` parameter — pass your own string to override the guide, or `None` to disable. Apps no longer need to concatenate `UI_STATE_PROMPT_GUIDE` into the LLM's `system_instruction` by hand.
--- a/env.example
+++ b/env.example
@@ -91,9 +91,6 @@ HEYGEN_LIVE_AVATAR_API_KEY=...
 HUME_API_KEY=...
 HUME_VOICE_ID=...

-# Inception
-INCEPTION_API_KEY=...
-
 # Inworld
 INWORLD_API_KEY=...

@@ -214,11 +211,6 @@ TWILIO_AUTH_TOKEN=...
 # Ultravox Realtime
 ULTRAVOX_API_KEY=...

-# Vonage
-VONAGE_APPLICATION_ID=...
-VONAGE_SESSION_ID=...
-VONAGE_TOKEN=...
-
 # WhatsApp
 WHATSAPP_TOKEN=...
 WHATSAPP_WEBHOOK_VERIFICATION_TOKEN=...
--- a/examples/audio/audio-bot-background-sound.py
+++ b/examples/audio/audio-bot-background-sound.py
@@ -16,7 +16,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame, MixerEnableFrame, MixerUpdateSettingsFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -105,7 +105,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -120,27 +120,27 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        logger.info(f"Listening for background sound for a bit...")
        await asyncio.sleep(5.0)
        logger.info(f"Reducing volume...")
-        await worker.queue_frame(MixerUpdateSettingsFrame({"volume": 0.5}))
+        await task.queue_frame(MixerUpdateSettingsFrame({"volume": 0.5}))
        await asyncio.sleep(5.0)
        logger.info(f"Disabling background sound for a bit...")
-        await worker.queue_frame(MixerEnableFrame(False))
+        await task.queue_frame(MixerEnableFrame(False))
        await asyncio.sleep(5.0)
        logger.info(f"Re-enabling background sound and starting bot...")
-        await worker.queue_frame(MixerEnableFrame(True))
+        await task.queue_frame(MixerEnableFrame(True))
        # Kick off the conversation.
        context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/audio/audio-recording.py
+++ b/examples/audio/audio-recording.py
@@ -54,7 +54,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -146,7 +146,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -161,12 +161,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        # Start recording audio
        await audiobuffer.start_recording()
        # Start conversation - empty prompt to let LLM follow system instructions
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    # Handler for merged audio
    @audiobuffer.event_handler("on_audio_data")
@@ -191,7 +191,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        await save_audio_file(bot_audio, bot_filename, sample_rate, 1)

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/audio/audio-sound-effects.py
+++ b/examples/audio/audio-sound-effects.py
@@ -20,7 +20,7 @@ from pipecat.frames.frames import (
 )
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineWorker
+from pipecat.pipeline.task import PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -144,7 +144,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
    )
@@ -153,17 +153,17 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
        # Kick off the conversation.
-        await worker.queue_frame(TTSSpeakFrame("Hi, I'm listening!"))
+        await task.queue_frame(TTSSpeakFrame("Hi, I'm listening!"))
        await transport.send_audio(sounds["ding1.wav"])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/context-summarization/context-summarization-dedicated-llm.py
+++ b/examples/context-summarization/context-summarization-dedicated-llm.py
@@ -26,7 +26,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_context_summarizer import SummaryAppliedEvent
 from pipecat.processors.aggregators.llm_response_universal import (
@@ -198,7 +198,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -214,16 +214,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info("Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/context-summarization/context-summarization-google.py
+++ b/examples/context-summarization/context-summarization-google.py
@@ -24,7 +24,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_context_summarizer import SummaryAppliedEvent
 from pipecat.processors.aggregators.llm_response_universal import (
@@ -159,7 +159,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -175,16 +175,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info("Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/context-summarization/context-summarization-manual-openai.py
+++ b/examples/context-summarization/context-summarization-manual-openai.py
@@ -26,7 +26,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame, LLMSummarizeContextFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -133,7 +133,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -149,16 +149,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info("Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/context-summarization/context-summarization-openai.py
+++ b/examples/context-summarization/context-summarization-openai.py
@@ -24,7 +24,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_context_summarizer import SummaryAppliedEvent
 from pipecat.processors.aggregators.llm_response_universal import (
@@ -159,7 +159,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -175,16 +175,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info("Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/features/features-add-tool-change-messages.py
+++ b/examples/features/features-add-tool-change-messages.py
@@ -56,7 +56,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame, LLMSetToolsFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import NOT_GIVEN, LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -163,7 +163,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(enable_metrics=True, enable_usage_metrics=True),
        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
@@ -185,13 +185,13 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                "=== Phase 1: weather tool REMOVED. Keep asking about the weather "
                "to exercise hallucination scenarios. ==="
            )
-            await worker.queue_frame(LLMSetToolsFrame(tools=NOT_GIVEN))
+            await task.queue_frame(LLMSetToolsFrame(tools=NOT_GIVEN))
        elif user_turn_count == READD_AT_TURN - 1:
            logger.info(
                "=== Phase 2: weather tool RE-ADDED. Ask for the weather again — "
                "does the LLM call it, or keep refusing? (THIS IS THE TEST.) ==="
            )
-            await worker.queue_frame(LLMSetToolsFrame(tools=weather_tools))
+            await task.queue_frame(LLMSetToolsFrame(tools=weather_tools))

    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
@@ -209,15 +209,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                ),
            }
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info("Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/features/features-app-resources.py
+++ b/examples/features/features-app-resources.py
@@ -4,27 +4,27 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-"""Example demonstrating ``PipelineWorker(app_resources=...)``.
+"""Example demonstrating ``PipelineTask(app_resources=...)``.

 ``app_resources`` is an application-defined bag of anything your
 application code may want to share across a session: database handles,
 HTTP clients, feature flags, per-user state, observability clients,
 in-memory caches — whatever fits your app. Pipecat passes it through
-untouched and exposes it as ``worker.app_resources``, so any code with a
-handle on the worker can read or mutate it.
+untouched and exposes it as ``task.app_resources``, so any code with a
+handle on the task can read or mutate it.

 Two of the convenience aliases exercised below:

 - Tool handlers read it from ``FunctionCallParams.app_resources``.
 - Custom ``FrameProcessor`` subclasses read it from
-  ``self.pipeline_worker.app_resources``.
+  ``self.pipeline_task.app_resources``.

 This example uses two small loggers as stand-ins for that "shared thing":
 ``ToolCallLogger`` (written from tool handlers) and
 ``TranscriptionLogger`` (written from a custom ``FrameProcessor`` that
 sits in the pipeline). A real app might just as easily pass a Postgres
 pool, a Redis client, a Stripe SDK instance, or any combination thereof.
-The mechanics shown here — construct once, hand to the worker, read it
+The mechanics shown here — construct once, hand to the task, read it
 from each site, inspect it after the session — are the same regardless
 of what you put in.

@@ -50,7 +50,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import Frame, LLMRunFrame, TranscriptionFrame, TTSSpeakFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -131,7 +131,7 @@ class AppResources:
    get autocomplete and refactor safety:

    - In tools: ``cast(AppResources, params.app_resources)``.
-    - In custom processors: ``cast(AppResources, self.pipeline_worker.app_resources)``.
+    - In custom processors: ``cast(AppResources, self.pipeline_task.app_resources)``.
    """

    tool_call_logger: ToolCallLogger
@@ -155,8 +155,8 @@ class TranscriptionLoggingProcessor(FrameProcessor):

    Demonstrates the second read site for ``app_resources``: any custom
    ``FrameProcessor`` can reach the same bag every tool handler sees by
-    going through ``self.pipeline_worker.app_resources``. ``pipeline_worker``
-    is ``None`` until the worker sets the processor up, so we guard against
+    going through ``self.pipeline_task.app_resources``. ``pipeline_task``
+    is ``None`` until the task sets the processor up, so we guard against
    that case.
    """

@@ -164,8 +164,8 @@ class TranscriptionLoggingProcessor(FrameProcessor):
        """Forward all frames; log final user transcriptions on the way through."""
        await super().process_frame(frame, direction)

-        if isinstance(frame, TranscriptionFrame) and self.pipeline_worker is not None:
-            resources = cast(AppResources, self.pipeline_worker.app_resources)
+        if isinstance(frame, TranscriptionFrame) and self.pipeline_task is not None:
+            resources = cast(AppResources, self.pipeline_task.app_resources)
            resources.transcription_logger.log_transcription(frame.text)

        await self.push_frame(frame, direction)
@@ -282,7 +282,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        transcription_logger=transcription_logger,
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -299,16 +299,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)

    # The session has ended; read whatever state the handlers built up.
    logger.info(f"Tool calls logged during session:\n{tool_call_logger.dump()}")
--- a/examples/features/features-before-and-after-events.py
+++ b/examples/features/features-before-and-after-events.py
@@ -14,7 +14,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import DataFrame, LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -97,7 +97,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -124,18 +124,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
        # Custom frames are pushed in order so they can be used for synchronization purposes.
-        await worker.queue_frames(
-            [CustomBeforeProcessFrame(), LLMRunFrame(), CustomAfterPushFrame()]
-        )
+        await task.queue_frames([CustomBeforeProcessFrame(), LLMRunFrame(), CustomAfterPushFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/features/features-concurrent-llm-evaluation.py
+++ b/examples/features/features-concurrent-llm-evaluation.py
@@ -15,7 +15,7 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.parallel_pipeline import ParallelPipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -130,7 +130,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -149,16 +149,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        groq_context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/features/features-concurrent-llm-rtvi-ignored-sources.py
+++ b/examples/features/features-concurrent-llm-rtvi-ignored-sources.py
@@ -21,7 +21,7 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.parallel_pipeline import ParallelPipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -141,7 +141,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -160,16 +160,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        evaluator_context.add_message(
            {"role": "developer", "content": "Ready to evaluate user messages."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info("Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/features/features-custom-frame-processor.py
+++ b/examples/features/features-custom-frame-processor.py
@@ -17,7 +17,7 @@ from pipecat.frames.frames import (
 )
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -128,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -144,16 +144,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/features/features-gpu-container-local-bot.py
+++ b/examples/features/features-gpu-container-local-bot.py
@@ -14,7 +14,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -95,7 +95,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -112,7 +112,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        context.add_message(
            {"role": "developer", "content": "Please introduce yourself to the user."}
        )
-        await worker.queue_frames([LLMRunFrame()])
+        await task.queue_frames([LLMRunFrame()])

    # Handle "latency-ping" messages. The client will send app messages that look like
    # this:
@@ -128,13 +128,13 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                logger.debug(f"Received latency ping app message: {message}")
                ts = message["latency-ping"]["ts"]
                # Send immediately
-                await worker.queue_frame(
+                await task.queue_frame(
                    DailyOutputTransportMessageUrgentFrame(
                        message={"latency-pong-msg-handler": {"ts": ts}}, participant_id=sender
                    )
                )
                # And push to the pipeline for the Daily transport.output to send
-                await worker.queue_frame(
+                await task.queue_frame(
                    DailyOutputTransportMessageFrame(
                        message={"latency-pong-pipeline-delivery": {"ts": ts}},
                        participant_id=sender,
@@ -146,11 +146,11 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/features/features-live-translation.py
+++ b/examples/features/features-live-translation.py
@@ -14,7 +14,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import TTSSpeakFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.worker import PipelineParams, PipelineWorker
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
@@ -99,7 +99,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

-    worker = PipelineWorker(
+    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
@@ -111,7 +111,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
-        await worker.queue_frames(
+        await task.queue_frames(
            [
                TTSSpeakFrame(
                    text="Hello, welcome to live translation. Everything you say will be automatically translated to Spanish. Let's begin!",
@@ -123,11 +123,11 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
-        await worker.cancel()
+        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-    await runner.run(worker)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
Author	SHA1	Message	Date
Paul Kompfner	3fee91ddec	Drop redundant changelog entry for OpenAI Realtime example The OpenAI Realtime story didn't add any service-level code — just a new example. The original 4480.added.md entry already describes the feature as "a realtime service like Gemini Live," which generalizes to OpenAI Realtime.	2026-05-18 12:06:48 -04:00
Paul Kompfner	638294c1cc	Add realtime-openai-local-vad example Mirrors the Gemini Live local-VAD example for OpenAI Realtime, showing that `wait_for_transcript_to_end_user_turn=False` composes cleanly with `turn_detection=False`. The OpenAI Realtime service already wires `UserStoppedSpeakingFrame` to `input_audio_buffer.commit` + `response.create` when `turn_detection=False`, so the example is the only new code needed.	2026-05-18 11:50:16 -04:00
Paul Kompfner	ea96b7aec7	Rename transcript-gather to post-turn transcript wait Switch the vocabulary for the timer-driven phase that runs when `wait_for_transcript_to_end_user_turn=False`. "Transcript gather" was too vague to be self-documenting; "post-turn transcript wait" names when it happens (after the user turn ends) and what it's for (waiting for late-arriving transcripts). Renames the internal property to `_wait_for_post_turn_transcripts` and the supporting state/method names to match (`_post_turn_transcript_wait_task`, `_complete_post_turn_transcript_wait`, etc.). Updates docstrings, comments, log messages, the example inline doc, and the test prose to use the new vocabulary consistently.	2026-05-18 10:51:14 -04:00
Paul Kompfner	666c619113	Size transcript-gather timer to STT-reported P99 TTFS The aggregator's transcript-gather timer (used when `wait_for_transcript_to_end_user_turn=False`) was hardcoded to `DEFAULT_TTFS_P99`. Capture `STTMetadataFrame.ttfs_p99_latency` as it flows through the user aggregator and prefer that value, just like the stop strategies already do. Falls back to `DEFAULT_TTFS_P99` when no STT service has reported a value.	2026-05-18 10:29:19 -04:00
Paul Kompfner	797d09a1d5	Align vocabulary around `wait_for_transcript_to_end_user_turn=False` Reframe comments, docstrings, identifiers, changelog, and example around a single explanation of the option: (1) turn strategies do not consider user transcripts, letting the user turn end sooner, and (2) the aggregator gathers user transcripts on its own after the turn ends via a simple timer, then emits `on_user_turn_message_finalized` with the new user context message. The mechanism is generic, so internal aggregator vocabulary stays generic ("transcript-gather", "after the user turn ends"); the public-facing param docstring is the one place that explains the "local turn detection drives a realtime service" use case. The stop strategies' `wait_for_transcript` flag is pointed at as something that's "usually flipped indirectly" by the aggregator param rather than something to pair with it. Renames internal state to match: `_expect_delayed_transcripts` → `_aggregator_gathers_transcripts`, `_pending_finalization_` → `_transcript_gather_`, `_finalize_delayed_user_message` → `_finalize_user_message`, etc.	2026-05-18 10:18:22 -04:00
Paul Kompfner	ee1538d18e	test: cover fallback path and align with vocabulary refactor Adds two tests for the strategy's transcripts-without-VAD fallback path — one in default mode (both events fire with the aggregated content) and one in delayed-transcript mode (only ``on_user_turn_message_finalized`` fires; no end-of-turn event is emitted since no turn ever started in the controller). Updates existing tests for the vocabulary refactor: assertions now expect ``content=None`` (not ``""``) for the end-of-turn event in delayed-transcript mode; comments and docstrings use the standardized terms (end of turn, user message finalization, pending-finalization timer, plural "transcripts").	2026-05-18 09:55:42 -04:00
Paul Kompfner	8330c3487d	Refactor delayed-transcript machinery; standardize vocabulary Splits ``_maybe_emit_user_turn_stopped`` into three focused methods — ``_flush_user_message_to_context`` (push aggregation, return content + timestamp), ``_finalize_user_turn`` (default-mode flow, emits both events), and ``_finalize_delayed_user_message`` (delayed-mode flow, emits only ``on_user_turn_message_finalized``). Fixes a side-issue where ``on_user_turn_stopped`` could fire from non-end-of-turn paths in delayed-transcript mode; that event now has a single origin (the end-of-turn handler). Standardizes vocabulary across docstrings and comments: - "Default mode" / "Delayed-transcript mode" (with ``_expect_delayed_transcripts == False/True``) - "End of turn" (not "audible stop" or "audible end of turn") - "User message finalization" (the moment user-text is flushed to context + ``on_user_turn_message_finalized`` fires) - "Pending finalization" (the in-between state in delayed mode) - Transcripts (plural — the aggregator combines multiple per turn) The timer that triggers user message finalization is no longer described as a "backstop" — it's the sole trigger for finalization in delayed-transcript mode, not a fallback. Renamed accordingly: ``_pending_finalization_task``, ``_pending_finalization_handler``, ``_run_pending_finalization``, ``_discard_pending_finalization``. Adds a separate message class for the two events: ``UserTurnStoppedMessage.content`` is now ``str | None`` (``None`` at end-of-turn in delayed-transcript mode), and a new ``UserMessageFinalizedMessage`` carries the always-populated ``content`` for the finalization event.	2026-05-18 09:55:11 -04:00
Paul Kompfner	4479a3a6af	docs: tighten wait_for_transcript_to_end_user_turn docstring + test docstring Reframes the strategy mutations as part of configuring the flag (not an "also" aside), and the ordering invariant in the test docstring as flush-timing (not arrival-timing).	2026-05-15 15:16:39 -04:00
Paul Kompfner	8631518388	test: cover wait_for_transcript_to_end_user_turn=False aggregator behavior Adds five tests for the delayed-transcript flow on `LLMUserAggregator`: - basic flow: `on_user_turn_stopped` fires fast with empty content; `on_user_turn_message_finalized` fires later with the populated transcript; user message lands in context. - backstop with no transcript: backstop timer still finalizes the turn; message_finalized fires with empty content; no user message added to context. - next-turn precondition violation: a new VAD start fires while the previous turn is still pending; the previous turn is force-flushed before the new turn begins. - context-order with assistant response: paired aggregators with a late user transcript arriving before the assistant content streams; verifies the user message lands in context before the assistant message (the conversational-order invariant the design relies on). - strategy mutation: explicit start/stop strategies are mutated by the bundle — `TranscriptionUserTurnStartStrategy` is dropped from start, `wait_for_transcript=False` is flipped on the stop strategy that had it explicitly set to True. Tests patch `DEFAULT_TTFS_P99` to keep the backstop fast.	2026-05-15 14:08:50 -04:00
Paul Kompfner	47e2f7a037	realtime + local turn detection: drop the user-transcript wait Add the configuration surface to drive a realtime service like Gemini Live from local turn detection without paying user-transcript latency. Cascaded pipelines wait for a transcript before ending the user's turn because the downstream LLM needs the user's words recorded in context — but that wait is pure latency in pipelines using local turn detection to drive a realtime service, which consumes user audio directly. Set `wait_for_transcript_to_end_user_turn=False` on `LLMUserAggregatorParams` to turn this on. With that single flag the aggregator: - drops `TranscriptionUserTurnStartStrategy` from the start strategies (so late-arriving realtime transcripts don't trigger new turns), - sets `wait_for_transcript=False` on any stop strategy that supports it (so the turn ends on the audible end of the turn, without waiting for a transcript), - fires `on_user_turn_stopped` on the audible end of the turn with empty `content` (since the transcript hasn't arrived), and - defers the context flush until the transcript arrives or a backstop timer fires. A new `on_user_turn_message_finalized` event fires when the user's message has been written to context. In the default mode it coincides with `on_user_turn_stopped`; in the delayed-transcript mode it fires later. Consumers that want the populated transcript should subscribe to `on_user_turn_message_finalized` — it's the event that always carries the user message, regardless of mode. Strategy mutations are logged: loudly when the user passed their own strategies (we're overwriting parts of their config), quietly otherwise. The strategy-level `wait_for_transcript` parameter on `TurnAnalyzerUserTurnStopStrategy` and `SpeechTimeoutUserTurnStopStrategy` remains exposed for advanced cases. The example `realtime-gemini-live-local-vad.py` demonstrates the full pattern.	2026-05-15 13:49:16 -04:00
Paul Kompfner	6d21507e95	user turn stop strategies: don't always wait for transcripts Until now, both TurnAnalyzerUserTurnStopStrategy and SpeechTimeoutUserTurnStopStrategy waited for at least one transcript before ending the user turn. That's the right behavior for cascaded pipelines, where the downstream LLM can't respond until the user's words are recorded in its context — but it's pure latency in pipelines using local turn detection to drive a realtime service like Gemini Live. Add a `require_transcript: bool | None = None` parameter to both strategies. When None (default), it infers from whether an STTMetadataFrame has been seen — a proxy for "does the downstream LLM need the transcript in context?". Explicit True/False overrides the heuristic. When a transcript isn't required, the strategies also skip the STT-waiting timeout in the VAD-stopped handler, so the user turn ends as soon as the analyzer (or speech timer) concludes the turn is complete.	2026-05-13 15:45:51 -04:00
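
The tri-state `require_transcript` parameter described in the 6d21507e95 entry can be sketched roughly as below. This is a minimal illustration, not the strategies' actual code: the function name `transcript_required` and the `stt_metadata_seen` flag are hypothetical stand-ins for whatever state the strategies keep once an `STTMetadataFrame` has passed through.

```python
from typing import Optional


def transcript_required(require_transcript: Optional[bool], stt_metadata_seen: bool) -> bool:
    """Decide whether a user turn should wait for a transcript before ending.

    Hypothetical helper: the name and arguments are illustrative only and
    are not taken from the Pipecat code base.
    """
    # An explicit True/False always overrides the heuristic.
    if require_transcript is not None:
        return require_transcript
    # None (the default): infer from whether STT metadata has been seen,
    # used as a proxy for "the downstream LLM needs the transcript in context".
    return stt_metadata_seen
```

Under this sketch, a pipeline that never sees STT metadata ends the turn without waiting, while passing an explicit `True` restores the wait regardless of what has been observed.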
				`@@ -1 +0,0 @@`
				- Added `VonageVideoConnectorTransport`, a new transport integration for real-time Vonage WebRTC sessions using the Vonage Video Connector library.
				`@@ -1 +0,0 @@`
				- Fixed Azure TTS last word being missed by observers and RTVI UI. The completion signal was racing with word timestamp processing, causing the final word's `TTSTextFrame` to arrive after `TTSStoppedFrame`. Completion is now routed through the word boundary queue to ensure all words are processed before signaling stream end.
				`@@ -1 +0,0 @@`
				- Fixed `BaseOutputTransport` reordering frames that share the same presentation timestamp. Frames with equal PTS values are now emitted in insertion order, preventing subtle audio/text sequencing bugs when multiple frames arrive at the same time.
				`@@ -1 +0,0 @@`
				- Fixed Cartesia word timestamps leaking SSML tag text (e.g. `<spell>`, `<emotion>`, `<break>`) into word entries. Tags are now stripped before processing, so word-to-text attribution remains accurate when SSML markup is present in the TTS input.
				`@@ -1 +0,0 @@`
				- Fixed `TTSTextFrame` entries losing their original text structure when word timestamps are enabled. Each `TTSTextFrame` now carries a `raw_text` field containing the corresponding span of the original LLM-produced text (including pattern delimiters such as `<card>4111 1111 1111 1111</card>`), so the assistant context receives properly-tagged content rather than the cleaned words returned by the TTS provider. Also handles words that straddle two sentence boundaries by splitting them and attributing each part to its correct source frame.
				`@@ -1 +0,0 @@`
				- Fixed skipped TTS frames (e.g. code blocks filtered via `skip_aggregator_types`) being emitted to the assistant context immediately instead of waiting for preceding spoken frames to finish. They now hold their position in the frame sequence and are flushed only after all earlier spoken sentences are complete, keeping context ordering correct.
				`@@ -0,0 +1 @@`
				- Added a `session_id` field to `RunnerArguments` so bots can log or trace a per-session identifier in local development the same way they can in Pipecat Cloud. The development runner now mints a UUID at every construction site, and paths that already returned a `sessionId` to the caller (Daily `/start`, dial-in webhook) share that same UUID with the runner args instead of generating two. The SmallWebRTC `/api/offer` endpoint also accepts an optional `session_id` query parameter so the `/sessions/{session_id}/...` proxy can thread it through.
				`@@ -0,0 +1 @@`
				- Updated the default `SonioxTTSService` model from `tts-rt-v1-preview` to the generally available `tts-rt-v1`.
				`@@ -0,0 +1 @@`
				- Added a `max_buffer_delay_ms` constructor argument to `CartesiaTTSService` for controlling Cartesia's server-side text buffering. When unset, Pipecat picks a sensible default based on `text_aggregation_mode`: `0` in `SENTENCE` mode (custom buffering — avoids stacking client-side aggregation on top of Cartesia's default 3000ms server buffer) and unset in `TOKEN` mode (Cartesia's managed buffering applies). Pass an explicit value (0–5000ms) to override.
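
The default selection for `max_buffer_delay_ms` described in the entry above might look something like the following. This is an illustrative sketch only: the `resolve_max_buffer_delay_ms` helper and the local `TextAggregationMode` enum are hypothetical stand-ins. Only the `SENTENCE` and `TOKEN` mode names, the `0` and unset defaults, and the 0 to 5000 ms range come from the changelog entry. Rejecting out-of-range values with `ValueError` is an assumption, since the entry does not say how invalid values are handled.

```python
from enum import Enum
from typing import Optional


class TextAggregationMode(Enum):
    """Hypothetical stand-in for the service's text aggregation setting."""

    SENTENCE = "sentence"
    TOKEN = "token"


def resolve_max_buffer_delay_ms(
    mode: TextAggregationMode, explicit: Optional[int] = None
) -> Optional[int]:
    """Pick the server-side buffer delay to send, or None to leave it unset."""
    if explicit is not None:
        # Assumption: values outside the documented 0-5000 ms range are rejected.
        if not 0 <= explicit <= 5000:
            raise ValueError("max_buffer_delay_ms must be between 0 and 5000")
        return explicit
    if mode is TextAggregationMode.SENTENCE:
        # Client-side sentence aggregation already buffers text, so avoid
        # stacking the server-side buffer on top of it.
        return 0
    # TOKEN mode: leave the value unset so the server's managed buffering applies.
    return None
```

Returning `None` rather than a number for `TOKEN` mode keeps "unset" distinguishable from an explicit `0`, which matters if the caller omits the field from the request entirely when no value is chosen.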