Merge pull request #2953 from pipecat-ai/aleix/pipecat-0.0.92

update CHANGELOG for 0.0.92. 🎃 "The Haunted Edition" 👻
2025-10-31 09:47:25 -07:00 · 2025-10-31 09:17:03 -07:00 · 2025-10-31 12:12:14 -04:00 · 2025-10-31 12:08:53 -04:00 · 2025-10-31 12:04:42 -04:00 · 2025-10-31 12:01:03 -04:00
132 changed files with 6426 additions and 3021 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,19 +5,413 @@ All notable changes to **Pipecat** will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [Unreleased]
+## [0.0.92] - 2025-10-31 🎃 "The Haunted Edition" 👻

 ### Added

+- Added a new `DeepgramHttpTTSService`, which delivers a meaningful reduction
+  in latency when compared to the `DeepgramTTSService`.
+
+- Added support for `speaking_rate` input parameter in `GoogleHttpTTSService`.
+
+- Added `enable_speaker_diarization` and `enable_language_identification` to
+  `SonioxSTTService`.
+
+- Added `SpeechmaticsTTSService`, which uses Speechmatics' TTS API. Updated
+  examples 07a\* to use the new TTS service.
+
+- Added support for including images or audio in LLM context messages using
+  `LLMContext.create_image_message()` or `LLMContext.create_image_url_message()`
+  (not all LLMs support URLs) and `LLMContext.create_audio_message()`. For
+  example, when creating `LLMMessagesAppendFrame`:
+
+  ```python
+  message = LLMContext.create_image_message(image=..., size=...)
+  await self.push_frame(LLMMessagesAppendFrame(messages=[message], run_llm=True))
+  ```
+
+- New event handlers for the `DeepgramFluxSTTService`: `on_start_of_turn`,
+  `on_turn_resumed`, `on_end_of_turn`, `on_eager_end_of_turn`, `on_update`.
+
+- Added `generation_config` parameter support to `CartesiaTTSService` and
+  `CartesiaHttpTTSService` for Cartesia Sonic-3 models. Includes a new
+  `GenerationConfig` class with `volume` (0.5-2.0), `speed` (0.6-1.5),
+  and `emotion` (60+ options) parameters for fine-grained speech generation
+  control.
+
+- Expanded support for universal `LLMContext` to `OpenAIRealtimeLLMService`.
+  As a reminder, the context-setup pattern when using `LLMContext` is:
+
+  ```python
+  context = LLMContext(messages, tools)
+  context_aggregator = LLMContextAggregatorPair(context)
+  ```
+
+  (Note that even though `OpenAIRealtimeLLMService` now supports the universal
+  `LLMContext`, it is not meant to be swapped out for another LLM service at
+  runtime with `LLMSwitcher`.)
+
+  Note: `TranscriptionFrame`s and `InterimTranscriptionFrame`s now go upstream
+  from `OpenAIRealtimeLLMService`, so if you're using `TranscriptProcessor`,
+  say, you'll want to adjust accordingly:
+
+  ```python
+  pipeline = Pipeline(
+    [
+      transport.input(),
+      context_aggregator.user(),
+
+      # BEFORE
+      llm,
+      transcript.user(),
+
+      # AFTER
+      transcript.user(),
+      llm,
+
+      transport.output(),
+      transcript.assistant(),
+      context_aggregator.assistant(),
+    ]
+  )
+  ```
+
+  Also worth noting: whether or not you use the new context-setup pattern with
+  `OpenAIRealtimeLLMService`, some types have changed under the hood:
+
+  ```python
+  ## BEFORE:
+
+  # Context aggregator type
+  context_aggregator: OpenAIContextAggregatorPair
+
+  # Context frame type
+  frame: OpenAILLMContextFrame
+
+  # Context type
+  context: OpenAIRealtimeLLMContext
+  # or
+  context: OpenAILLMContext
+
+  ## AFTER:
+
+  # Context aggregator type
+  context_aggregator: LLMContextAggregatorPair
+
+  # Context frame type
+  frame: LLMContextFrame
+
+  # Context type
+  context: LLMContext
+  ```
+
+  Also note that `RealtimeMessagesUpdateFrame` and
+  `RealtimeFunctionCallResultFrame` have been deprecated, since they're no
+  longer used by `OpenAIRealtimeLLMService`. OpenAI Realtime now works more
+  like other LLM services in Pipecat, relying on updates to its context, pushed
+  by context aggregators, to update its internal state. Listen for
+  `LLMContextFrame`s for context updates.
+
+  Finally, `LLMTextFrame`s are no longer pushed from `OpenAIRealtimeLLMService`
+  when it's configured with `output_modalities=['audio']`. If you need
+  to process its output, listen for `TTSTextFrame`s instead.
+
+- Expanded support for universal `LLMContext` to `GeminiLiveLLMService`.
+  As a reminder, the context-setup pattern when using `LLMContext` is:
+
+  ```python
+  context = LLMContext(messages, tools)
+  context_aggregator = LLMContextAggregatorPair(context)
+  ```
+
+  (Note that even though `GeminiLiveLLMService` now supports the universal
+  `LLMContext`, it is not meant to be swapped out for another LLM service at
+  runtime with `LLMSwitcher`.)
+
+  Worth noting: whether or not you use the new context-setup pattern with
+  `GeminiLiveLLMService`, some types have changed under the hood:
+
+  ```python
+  ## BEFORE:
+
+  # Context aggregator type
+  context_aggregator: GeminiLiveContextAggregatorPair
+
+  # Context frame type
+  frame: OpenAILLMContextFrame
+
+  # Context type
+  context: GeminiLiveLLMContext
+  # or
+  context: OpenAILLMContext
+
+  ## AFTER:
+
+  # Context aggregator type
+  context_aggregator: LLMContextAggregatorPair
+
+  # Context frame type
+  frame: LLMContextFrame
+
+  # Context type
+  context: LLMContext
+  ```
+
+  Also note that `LLMTextFrame`s are no longer pushed from `GeminiLiveLLMService`
+  when it's configured with `modalities=GeminiModalities.AUDIO`. If you need
+  to process its output, listen for `TTSTextFrame`s instead.
+
+### Changed
+
+- The development runner's `/start` endpoint now supports passing
+  `dailyRoomProperties` and `dailyMeetingTokenProperties` in the request body
+  when `createDailyRoom` is true. Properties are validated against the
+  `DailyRoomProperties` and `DailyMeetingTokenProperties` types respectively
+  and passed to Daily's room and token creation APIs.
+
+- `UserImageRawFrame` has new fields `append_to_context` and `text`. The
+  `append_to_context` field indicates if this image and text should be added to
+  the LLM context (by the LLM assistant aggregator). The `text` field, if set,
+  might also guide the LLM or the vision service on how to analyze the image.
+
+- `UserImageRequestFrame` has new fields `append_to_context` and `text`. Both
+  fields will be used to set the same fields on the captured
+  `UserImageRawFrame`.
+
+- `UserImageRequestFrame` no longer requires a function call name and ID.
+
+- Updated `MoondreamService` to process `UserImageRawFrame`.
+
+- `VisionService` expects `UserImageRawFrame` in order to analyze images.
+
+- `DailyTransport` triggers `on_error` event if transcription can't be started
+  or stopped.
+
+- `DailyTransport` updates: `start_dialout()` now returns two values:
+  `session_id` and `error`. `start_recording()` now returns two values:
+  `stream_id` and `error`.
+
+- Updated `daily-python` to 0.21.0.
+
+- `SimliVideoService` now accepts `api_key` and `face_id` parameters directly,
+  with optional `params` for `max_session_length` and `max_idle_time`
+  configuration, aligning with other Pipecat service patterns.
+
+- Updated the default model to `sonic-3` for `CartesiaTTSService` and
+  `CartesiaHttpTTSService`.
+
+- `FunctionFilter` now has a `filter_system_frames` arg, which controls whether
+  or not SystemFrames are filtered.
+
+- Upgraded `aws_sdk_bedrock_runtime` to v0.1.1 to resolve potential CPU issues
+  when running `AWSNovaSonicLLMService`.
+
+### Deprecated
+
+- The `expect_stripped_words` parameter of `LLMAssistantAggregatorParams` is
+  ignored when used with the newer `LLMAssistantAggregator`, which now handles
+  word spacing automatically.
+
+- `LLMService.request_image_frame()` is deprecated; push a
+  `UserImageRequestFrame` instead.
+
+- `UserResponseAggregator` is deprecated and will be removed in a future version.
+
+- The `send_transcription_frames` argument to `OpenAIRealtimeLLMService` is
+  deprecated. Transcription frames are now always sent. They go upstream, to be
+  handled by the user context aggregator. See "Added" section for details.
+
+- Types in `pipecat.services.openai.realtime.context` and
+  `pipecat.services.openai.realtime.frames` are deprecated, as they're no
+  longer used by `OpenAIRealtimeLLMService`. See "Added" section for details.
+
+- `SimliVideoService` `simli_config` parameter is deprecated. Use `api_key` and
+  `face_id` parameters instead.
+
+### Removed
+
+- Removed `enable_non_final_tokens` and `max_non_final_tokens_duration_ms` from
+  `SonioxSTTService`.
+
+- Removed the `aiohttp_session` arg from `SarvamTTSService` as it's no longer
+  used.
+
+### Fixed
+
+- Fixed a `PipelineTask` issue that was causing an idle timeout for frames that
+  were being generated but not reaching the end of the pipeline. Since the exact
+  point when frames are discarded is unknown, we now monitor pipeline frames
+  using an observer. If the observer detects frames are being generated, it will
+  prevent the pipeline from being considered idle.
+
+- Fixed an issue in `HumeTTSService` that was only using Octave 2, which does
+  not support the `description` field. Now, if a description is provided, it
+  switches to Octave 1.
+
+- Fixed an issue where `DailyTransport` would time out prematurely on join and
+  on leave.
+
+- Fixed an issue in the runner where starting a DailyTransport room via
+  `/start` didn't support using the `DAILY_SAMPLE_ROOM_URL` env var.
+
+- Fixed an issue in `ServiceSwitcher` where the `STTService`s would result in
+  all STT services producing `TranscriptionFrame`s.
+
+### Other
+
+- Updated all vision 12-series foundational examples to load images from a file.
+
+- Added 14-series video examples for different services. These new examples
+  request an image from the user camera through a function call.
+
+## [0.0.91] - 2025-10-21
+
+### Added
+
+- It is now possible to start a bot from the `/start` endpoint when using the
+  runner's Daily transport. This follows the Pipecat Cloud format with
+  `createDailyRoom` and `body` fields in the POST request body.
+
+- Added an ellipsis character (`…`) to the end-of-sentence detection in the
+  string utils.
+
+- Expanded support for universal `LLMContext` to `AWSNovaSonicLLMService`.
+  As a reminder, the context-setup pattern when using `LLMContext` is:
+
+  ```python
+  context = LLMContext(messages, tools)
+  context_aggregator = LLMContextAggregatorPair(context)
+  ```
+
+  (Note that even though `AWSNovaSonicLLMService` now supports the universal
+  `LLMContext`, it is not meant to be swapped out for another LLM service at
+  runtime with `LLMSwitcher`.)
+
+  Worth noting: whether or not you use the new context-setup pattern with
+  `AWSNovaSonicLLMService`, some types have changed under the hood:
+
+  ```python
+  ## BEFORE:
+
+  # Context aggregator type
+  context_aggregator: AWSNovaSonicContextAggregatorPair
+
+  # Context frame type
+  frame: OpenAILLMContextFrame
+
+  # Context type
+  context: AWSNovaSonicLLMContext
+  # or
+  context: OpenAILLMContext
+
+  ## AFTER:
+
+  # Context aggregator type
+  context_aggregator: LLMContextAggregatorPair
+
+  # Context frame type
+  frame: LLMContextFrame
+
+  # Context type
+  context: LLMContext
+  ```
+
+- Added support for `bulbul:v3` model in `SarvamTTSService` and
+  `SarvamHttpTTSService`.
+
+- Added `keyterms_prompt` parameter to `AssemblyAIConnectionParams`.
+
+- Added `speech_model` parameter to `AssemblyAIConnectionParams` to access the
+  multilingual model.
+
+- Added support for trickle ICE to the `SmallWebRTCTransport`.
+
+- Added support for updating `OpenAITTSService` settings (`instructions` and
+  `speed`) at runtime via `TTSUpdateSettingsFrame`.
+
+- Added `--whatsapp` flag to runner to better surface WhatsApp transport logs.
+
+- Added `on_connected` and `on_disconnected` events to TTS and STT
+  websocket-based services.
+
+- Added an `aggregate_sentences` arg in `ElevenLabsHttpTTSService`, where the
+  default value is True.
+
+- Added a `room_properties` arg to the Daily runner's `configure()` method,
+  allowing `DailyRoomProperties` to be provided.
+
 - The runner `--folder` argument now supports downloading files from
  subdirectories.

+### Changed
+
+- `RunnerArguments` now includes the `body` field, so there's no need to add it
+  to subclasses. Also, all `RunnerArguments` fields are now keyword-only.
+
+- `CartesiaSTTService` now inherits from `WebsocketSTTService`.
+
+- Package upgrades:
+
+  - `daily-python` upgraded to 0.20.0.
+  - `openai` upgraded to support up to 2.x.x.
+  - `openpipe` upgraded to support up to 5.x.x.
+
+- `SpeechmaticsSTTService` updated dependencies for `speechmatics-rt>=0.5.0`.
+
+### Deprecated
+
+- The `send_transcription_frames` argument to `AWSNovaSonicLLMService` is
+  deprecated. Transcription frames are now always sent. They go upstream, to be
+  handled by the user context aggregator. See "Added" section for details.
+
+- Types in `pipecat.services.aws.nova_sonic.context` are deprecated, as they're
+  no longer used by `AWSNovaSonicLLMService`. See "Added" section for
+  details.
+
 ### Fixed

+- Fixed an issue where the `RTVIProcessor` was sending duplicate
+  `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` messages.
+
+- Fixed an issue in `AWSBedrockLLMService` where both `temperature` and `top_p`
+  were always sent together, causing conflicts with models like Claude Sonnet 4.5
+  that don't allow both parameters simultaneously. The service now only includes
+  inference parameters that are explicitly set, and `InputParams` defaults have
+  been changed to `None` to rely on AWS Bedrock's built-in model defaults.
+
+- Fixed an issue in `RivaSegmentedSTTService` where a runtime error occurred due
+  to a mismatch in the `_handle_transcription` method's signature.
+
+- Fixed multiple pipeline task cancellation issues. `asyncio.CancelledError` is
+  now handled properly in `PipelineTask`, making it possible to cleanly cancel
+  an asyncio task that is executing a `PipelineRunner`. Also,
+  `PipelineTask.cancel()` no longer blocks waiting for the `CancelFrame` to
+  reach the end of the pipeline (going back to the behavior in < 0.0.83).
+
+- Fixed an issue in `ElevenLabsTTSService` and `ElevenLabsHttpTTSService` where
+  the Flash models would split words, resulting in a space being inserted
+  between words.
+
+- Fixed an issue where audio filters' `stop()` would not be called when using
+  `CancelFrame`.
+
+- Fixed an issue in `ElevenLabsHttpTTSService`, where
+  `apply_text_normalization` was incorrectly set as a query parameter. It's now
+  being added as a request parameter.
+
 - Fixed an issue where `RimeHttpTTSService` and `PiperTTSService` could generate
  incorrectly 16-bit aligned audio frames, potentially leading to internal
  errors or static audio.

+- Fixed an issue in `SpeechmaticsSTTService` where `AdditionalVocabEntry` items
+  needed to have `sounds_like` for the session to start.
+
+### Other
+
+- Added foundational example `47-sentry-metrics.py`, demonstrating how to use the
+  `SentryMetrics` processor.
+
+- Added foundational example `14x-function-calling-openpipe.py`.
+
 ## [0.0.90] - 2025-10-10

 ### Added
@@ -1009,6 +1403,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

+- Added `SonioxSTTService` using Soniox's STT websocket API.
+
 - Added `enable_emulated_vad_interruptions` to `LLMUserAggregatorParams`.
  When user speech is emulated (e.g. when a transcription is received but
  VAD doesn't detect speech), this parameter controls whether the emulated
--- a/README.md
+++ b/README.md
@@ -44,6 +44,10 @@ Looking to build structured conversations? Check out [Pipecat Flows](https://git

 Want to build beautiful and engaging experiences? Checkout the [Voice UI Kit](https://github.com/pipecat-ai/voice-ui-kit), a collection of components, hooks and templates for building voice AI applications quickly.

+### 🛠️ Create and deploy projects
+
+Create a new project in under a minute with the [Pipecat CLI](https://github.com/pipecat-ai/pipecat-cli). Then use the CLI to monitor and deploy your agent to production.
+
 ### 🔍 Debugging

 Looking for help debugging your pipeline and processors? Check out [Whisker](https://github.com/pipecat-ai/whisker), a real-time Pipecat debugger.
@@ -63,24 +67,24 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
    <a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/storytelling-chatbot/image.png" width="400" /></a>
    <br/>
    <a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/translation-chatbot/image.png" width="400" /></a>&nbsp;
-    <a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/moondream-chatbot/image.png" width="400" /></a>
+    <a href="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/12-describe-video.py"><img src="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/assets/moondream.png" width="400" /></a>
 </p>

 ## 🧩 Available services

-| Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
-| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper)                                                                                                                    |
-| LLMs                | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together)                                                                                          |
-| Text-to-Speech      | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
-| Speech-to-Speech    | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
-| Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
-| Serializers         | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
-| Video               | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
-| Memory              | [mem0](https://docs.pipecat.ai/server/services/memory/mem0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
-| Vision & Image      | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
-| Audio Processing    | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
-| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+| Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper)                                                                                                                                                                                                                                                        |
+| LLMs                | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova), [Together AI](https://docs.pipecat.ai/server/services/llm/together)                                                                                                                                                                                                                             |
+| Text-to-Speech      | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
+| Speech-to-Speech    | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+| Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+| Serializers         | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+| Video               | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+| Memory              | [mem0](https://docs.pipecat.ai/server/services/memory/mem0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
+| Vision & Image      | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
+| Audio Processing    | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |

 📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)

--- a/env.example
+++ b/env.example
@@ -4,6 +4,9 @@ AICOUSTICS_LICENSE_KEY=...
 # Anthropic
 ANTHROPIC_API_KEY=...

+# Assembly AI
+ASSEMBLYAI_API_KEY=...
+
 # Async
 ASYNCAI_API_KEY=...
 ASYNCAI_VOICE_ID=...
@@ -21,12 +24,19 @@ AZURE_CHATGPT_API_KEY=...
 AZURE_CHATGPT_ENDPOINT=https://...
 AZURE_CHATGPT_MODEL=...

+AZURE_REALTIME_API_KEY=...
+AZURE_REALTIME_BASE_URL=...
+
 AZURE_DALLE_API_KEY=...
 AZURE_DALLE_ENDPOINT=https://...
 AZURE_DALLE_MODEL=...

 # Cartesia
 CARTESIA_API_KEY=...
+CARTESIA_VOICE_ID=...
+
+# Cerebras
+CEREBRAS_API_KEY=...

 # Daily
 DAILY_API_KEY=...
@@ -35,57 +45,48 @@ DAILY_SAMPLE_ROOM_URL=https://...
 # Deepgram
 DEEPGRAM_API_KEY=...

+# DeepSeek
+DEEPSEEK_API_KEY=...
+
 # ElevenLabs
 ELEVENLABS_API_KEY=...
 ELEVENLABS_VOICE_ID=...

-# Neuphonic
-NEUPHONIC_API_KEY=...
-
 # Fal
 FAL_KEY=...

 # Fireworks
 FIREWORKS_API_KEY=...

+# Fish Audio
+FISH_API_KEY=...
+
 # Gladia
 GLADIA_API_KEY=...
 GLADIA_REGION=...

 # Google
 GOOGLE_API_KEY=...
-GOOGLE_CLOUD_PROJECT_ID=...
-GOOGLE_TEST_CREDENTIALS=...
 GOOGLE_VERTEX_TEST_CREDENTIALS=...
+GOOGLE_CLOUD_PROJECT_ID=...
+GOOGLE_CLOUD_LOCATION=...
+GOOGLE_TEST_CREDENTIALS=...
+
+# Grok
+GROK_API_KEY=...
+
+# Groq
+GROQ_API_KEY=...
+
+# Heygen
+HEYGEN_API_KEY=...

 # Hume
 HUME_API_KEY=...
+HUME_VOICE_ID=...

-# LMNT
-LMNT_API_KEY=...
-LMNT_VOICE_ID=...
-
-# Perplexity
-PERPLEXITY_API_KEY=...
-
-# PlayHT
-PLAYHT_USER_ID=...
-PLAYHT_API_KEY=...
-
-# OpenAI
-OPENAI_API_KEY=...
-
-# OpenPipe
-OPENPIPE_API_KEY=...
-
-# Tavus
-TAVUS_API_KEY=...
-TAVUS_REPLICA_ID=...
-TAVUS_PERSONA_ID=...
-
-# Simli
-SIMLI_API_KEY=...
-SIMLI_FACE_ID=...
+# Inworld
+INWORLD_API_KEY=...

 # Krisp
 KRISP_MODEL_PATH=...
@@ -93,77 +94,100 @@ KRISP_MODEL_PATH=...
 # Krisp Viva
 KRISP_VIVA_MODEL_PATH=...

-# DeepSeek
-DEEPSEEK_API_KEY=...
+# LiveKit
+LIVEKIT_API_KEY=...
+LIVEKIT_API_SECRET=...

-# Groq
-GROQ_API_KEY=...
-
-# Grok
-GROK_API_KEY=...
-
-# Inworld
-INWORLD_API_KEY=...
-
-# Together.ai
-TOGETHER_API_KEY=...
-
-# Cerebras
-CEREBRAS_API_KEY=...
-
-# Fish Audio
-FISH_API_KEY=...
-
-# Assembly AI
-ASSEMBLYAI_API_KEY=...
-
-# OpenRouter
-OPENROUTER_API_KEY=...
-
-# Piper
-PIPER_BASE_URL=...
-
-# Smart turn
-LOCAL_SMART_TURN_MODEL_PATH=...
-FAL_SMART_TURN_API_KEY=...
-
-# Twilio
-TWILIO_ACCOUNT_SID=...
-TWILIO_AUTH_TOKEN=...
+# LMNT
+LMNT_API_KEY=...
+LMNT_VOICE_ID=...

 # MiniMax
 MINIMAX_API_KEY=...
 MINIMAX_GROUP_ID=...

-# Sarvam AI
-SARVAM_API_KEY=...
-
-# Soniox
-SONIOX_API_KEY=
-
-# Speechmatics
-SPEECHMATICS_API_KEY=...
-
-# SambaNova
-SAMBANOVA_API_KEY=...
-
-# Sentry
-SENTRY_DSN=...
-
-# Heygen
-HEYGEN_API_KEY=...
-
 # Mistral
 MISTRAL_API_KEY=...

+# Neuphonic
+NEUPHONIC_API_KEY=...
+
 # NVIDIA
 NVIDIA_API_KEY=...

+# OpenAI
+OPENAI_API_KEY=...
+
+# OpenPipe
+OPENPIPE_API_KEY=...
+
+# OpenRouter
+OPENROUTER_API_KEY=...
+
+# Perplexity
+PERPLEXITY_API_KEY=...
+
+# Picovoice Koala
+KOALA_ACCESS_KEY=...
+
+# Piper
+PIPER_BASE_URL=...
+
+# PlayHT
+PLAYHT_USER_ID=...
+PLAYHT_API_KEY=...
+
+# Plivo
+PLIVO_AUTH_ID=...
+PLIVO_AUTH_TOKEN=...
+
 # Qwen
 QWEN_API_KEY=...

+# Rime
+RIME_API_KEY=...
+RIME_VOICE_ID=...
+
+# SambaNova
+SAMBANOVA_API_KEY=...
+
+# Sarvam AI
+SARVAM_API_KEY=...
+
+# Sentry
+SENTRY_DSN=...
+
+# Simli
+SIMLI_API_KEY=...
+SIMLI_FACE_ID=...
+
+# Smart turn
+LOCAL_SMART_TURN_MODEL_PATH=...
+FAL_SMART_TURN_API_KEY=...
+
+# Soniox
+SONIOX_API_KEY=...
+
+# Speechmatics
+SPEECHMATICS_API_KEY=...
+
+# Tavus
+TAVUS_API_KEY=...
+TAVUS_REPLICA_ID=...
+
+# Telnyx
+TELNYX_API_KEY=...
+TELNYX_ACCOUNT_SID=...
+
+# Together.ai
+TOGETHER_API_KEY=...
+
+# Twilio
+TWILIO_ACCOUNT_SID=...
+TWILIO_AUTH_TOKEN=...
+
 # WhatsApp
-WHATSAPP_TOKEN=
-WHATSAPP_WEBHOOK_VERIFICATION_TOKEN=
-WHATSAPP_PHONE_NUMBER_ID=
-WHATSAPP_APP_SECRET=
+WHATSAPP_TOKEN=...
+WHATSAPP_WEBHOOK_VERIFICATION_TOKEN=...
+WHATSAPP_PHONE_NUMBER_ID=...
+WHATSAPP_APP_SECRET=...
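
The reordered `env.example` above is a flat list of `KEY=...` placeholders grouped under `#` comments. As a rough sketch (not part of this change set), the snippet below shows one way to check a local environment against such a template. The helper names `parse_env_keys` and `missing_keys` are hypothetical, and treating the literal `...` placeholder as "unset" is an assumption rather than Pipecat behavior.

```python
import os


def parse_env_keys(text: str) -> list[str]:
    """Return variable names from env.example-style text.

    Blank lines, `#` comment lines, and lines without `=` are skipped.
    """
    keys = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.append(line.split("=", 1)[0].strip())
    return keys


def missing_keys(required, environ=None):
    """Return the names in `required` that are unset, empty, or still `...`.

    Treating `...` as unset is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    return [k for k in required if environ.get(k, "") in ("", "...")]


if __name__ == "__main__":
    sample = "# Telnyx\nTELNYX_API_KEY=...\nTELNYX_ACCOUNT_SID=...\n"
    for name in missing_keys(parse_env_keys(sample)):
        print(f"not configured: {name}")
```

In practice you would check only the keys for the services a given example actually uses, since the template lists every supported provider.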
--- a/examples/foundational/07-interruptible.py
+++ b/examples/foundational/07-interruptible.py
@@ -21,8 +21,8 @@ from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.stt import CartesiaSTTService
 from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -58,7 +58,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+    stt = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))

    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
--- a/examples/foundational/07a-interruptible-speechmatics-vad.py
+++ b/examples/foundational/07a-interruptible-speechmatics-vad.py
@@ -6,6 +6,7 @@

 import os

+import aiohttp
 from dotenv import load_dotenv
 from loguru import logger

@@ -20,10 +21,10 @@ from pipecat.processors.aggregators.llm_response import (
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
 from pipecat.services.openai.base_llm import BaseOpenAILLMService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
+from pipecat.services.speechmatics.tts import SpeechmaticsTTSService
 from pipecat.transcriptions.language import Language
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -51,121 +52,127 @@ transport_params = {


 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    """Speechmatics STT Service Example
+    """Speechmatics STT and TTS Service Example

-    This example demonstrates using Speechmatics Speech-to-Text service with speaker diarization and intelligent speaker management. Key features:
+    This example demonstrates using Speechmatics Speech-to-Text and Text-to-Speech services
+    with speaker diarization and intelligent speaker management. Key features:

-    1. Speaker Diarization
+    1. Speaker Diarization (STT)
       - Automatically identifies and distinguishes between different speakers
       - First speaker is identified as 'S1', others get subsequent IDs
       - Uses `enable_diarization` parameter to manage speaker detection

-    2. Smart Speaker Control
+    2. Smart Speaker Control (STT)
       - `focus_speakers` parameter lets you target specific speakers (e.g. ["S1"])
       - Other speakers will be wrapped in PASSIVE tags
       - Only processes speech from focused speakers
       - Words from all speakers are wrapped with XML tags for clear speaker identification
       - Other speakers' speech only sent when focused speaker is active

-    3. Voice Activity Detection
+    3. Voice Activity Detection (STT)
       - Built-in VAD using `enable_vad` parameter
       - Remove `vad_analyzer` from `transport` config to use module's VAD
       - Emits speaker started/stopped events

-    4. Configuration Options
+    4. Text-to-Speech (TTS)
+       - Low latency streaming audio synthesis
+       - Multiple voice options available including `sarah`, `theo`, and `megan`
+
+    5. Configuration Options
       - `operating_point` parameter defaults to `ENHANCED` for optimal accuracy
       - Configurable `end_of_utterance_silence_trigger` (default 0.5s)
       - Customizable speaker formatting
       - Additional diarization settings available

-    For detailed information about operating points and configuration:
-    https://docs.speechmatics.com/rt-api-ref
+    For detailed information:
+    - STT: https://docs.speechmatics.com/rt-api-ref
+    - TTS: https://docs.speechmatics.com/text-to-speech/quickstart
    """

    logger.info(f"Starting bot")
-
-    stt = SpeechmaticsSTTService(
-        api_key=os.getenv("SPEECHMATICS_API_KEY"),
-        params=SpeechmaticsSTTService.InputParams(
-            language=Language.EN,
-            enable_vad=True,
-            enable_diarization=True,
-            focus_speakers=["S1"],
-            end_of_utterance_silence_trigger=0.5,
-            speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
-            speaker_passive_format="<PASSIVE><{speaker_id}>{text}</{speaker_id}></PASSIVE>",
-        ),
-    )
-
-    tts = ElevenLabsTTSService(
-        api_key=os.getenv("ELEVENLABS_API_KEY"),
-        voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
-        model="eleven_turbo_v2_5",
-    )
-
-    llm = OpenAILLMService(
-        api_key=os.getenv("OPENAI_API_KEY"),
-        params=BaseOpenAILLMService.InputParams(temperature=0.75),
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": (
-                "You are a helpful British assistant called Alfred. "
-                "Your goal is to demonstrate your capabilities in a succinct way. "
-                "Your output will be converted to audio so don't include special characters in your answers. "
-                "Always include punctuation in your responses. "
-                "Give very short replies - do not give longer replies unless strictly necessary. "
-                "Respond to what the user said in a concise, funny, creative and helpful way. "
-                "Use `<Sn/>` tags to identify different speakers - do not use tags in your replies. "
-                "Do not respond to speakers within `<PASSIVE/>` tags unless explicitly asked to. "
+    async with aiohttp.ClientSession() as session:
+        stt = SpeechmaticsSTTService(
+            api_key=os.getenv("SPEECHMATICS_API_KEY"),
+            params=SpeechmaticsSTTService.InputParams(
+                language=Language.EN,
+                enable_vad=True,
+                enable_diarization=True,
+                focus_speakers=["S1"],
+                end_of_utterance_silence_trigger=0.5,
+                speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
+                speaker_passive_format="<PASSIVE><{speaker_id}>{text}</{speaker_id}></PASSIVE>",
            ),
-        },
-    ]
+        )

-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(
-        context,
-        user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
-    )
+        tts = SpeechmaticsTTSService(
+            api_key=os.getenv("SPEECHMATICS_API_KEY"),
+            voice_id="sarah",
+            aiohttp_session=session,
+        )

-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            params=BaseOpenAILLMService.InputParams(temperature=0.75),
+        )
+
+        messages = [
+            {
+                "role": "system",
+                "content": (
+                    "You are a helpful British assistant called Sarah. "
+                    "Your goal is to demonstrate your capabilities in a succinct way. "
+                    "Your output will be converted to audio so don't include special characters in your answers. "
+                    "Always include punctuation in your responses. "
+                    "Give very short replies - do not give longer replies unless strictly necessary. "
+                    "Respond to what the user said in a concise, funny, creative and helpful way. "
+                    "Use `<Sn/>` tags to identify different speakers - do not use tags in your replies. "
+                    "Do not respond to speakers within `<PASSIVE/>` tags unless explicitly asked to. "
+                ),
+            },
        ]
-    )

-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
+        context = LLMContext(messages)
+        context_aggregator = LLMContextAggregatorPair(
+            context,
+            user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
+        )

-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Say a short hello to the user."})
-        await task.queue_frames([LLMRunFrame()])
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                stt,
+                context_aggregator.user(),  # User responses
+                llm,  # LLM
+                tts,  # TTS
+                transport.output(),  # Transport bot output
+                context_aggregator.assistant(),  # Assistant spoken responses
+            ]
+        )

-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                enable_metrics=True,
+                enable_usage_metrics=True,
+            ),
+            idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+        )

-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            logger.info(f"Client connected")
+            # Kick off the conversation.
+            messages.append({"role": "system", "content": "Say a short hello to the user."})
+            await task.queue_frames([LLMRunFrame()])

-    await runner.run(task)
+        @transport.event_handler("on_client_disconnected")
+        async def on_client_disconnected(transport, client):
+            logger.info(f"Client disconnected")
+            await task.cancel()
+
+        runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+        await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/foundational/07a-interruptible-speechmatics.py
+++ b/examples/foundational/07a-interruptible-speechmatics.py
@@ -6,6 +6,7 @@

 import os

+import aiohttp
 from dotenv import load_dotenv
 from loguru import logger

@@ -24,10 +25,10 @@ from pipecat.processors.aggregators.llm_response import (
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
 from pipecat.services.openai.base_llm import BaseOpenAILLMService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
+from pipecat.services.speechmatics.tts import SpeechmaticsTTSService
 from pipecat.transcriptions.language import Language
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -61,100 +62,106 @@ transport_params = {


 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    """Run example using Speechmatics STT.
+    """Run example using Speechmatics STT and TTS.

-    This example will use diarization within our STT service and output the words spoken by
-    each individual speaker and wrap them with XML tags for the LLM to process. Note the
-    instructions in the system context for the LLM. This greatly improves the conversation
-    experience by allowing the LLM to understand who is speaking in a multi-party call.
+    This example demonstrates a complete Speechmatics integration with both Speech-to-Text
+    and Text-to-Speech services:

-    By default, this example will use our ENHANCED operating point, which is optimized for
-    high accuracy. You can change this by setting the `operating_point` parameter to a different
-    value.
+    STT Features:
+    - Diarization to identify and distinguish between different speakers
+    - Words spoken by each speaker are wrapped with XML tags for LLM processing
+    - System context instructions help the LLM understand multi-party conversations
+    - ENHANCED operating point by default for optimal accuracy

-    For more information on operating points, see the Speechmatics documentation:
-    https://docs.speechmatics.com/rt-api-ref
+    TTS Features:
+    - Low-latency streaming audio synthesis
+    - Multiple voice options available including `sarah`, `theo`, and `megan`
+
+    For more information:
+    - STT: https://docs.speechmatics.com/rt-api-ref
+    - TTS: https://docs.speechmatics.com/text-to-speech/quickstart
    """
    logger.info(f"Starting bot")

-    stt = SpeechmaticsSTTService(
-        api_key=os.getenv("SPEECHMATICS_API_KEY"),
-        params=SpeechmaticsSTTService.InputParams(
-            language=Language.EN,
-            enable_diarization=True,
-            end_of_utterance_silence_trigger=0.5,
-            speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
-        ),
-    )
-
-    tts = ElevenLabsTTSService(
-        api_key=os.getenv("ELEVENLABS_API_KEY"),
-        voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
-        model="eleven_turbo_v2_5",
-    )
-
-    llm = OpenAILLMService(
-        api_key=os.getenv("OPENAI_API_KEY"),
-        params=BaseOpenAILLMService.InputParams(temperature=0.75),
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": (
-                "You are a helpful British assistant called Alfred. "
-                "Your goal is to demonstrate your capabilities in a succinct way. "
-                "Your output will be converted to audio so don't include special characters in your answers. "
-                "Always include punctuation in your responses. "
-                "Give very short replies - do not give longer replies unless strictly necessary. "
-                "Respond to what the user said in a concise, funny, creative and helpful way. "
-                "Use `<Sn/>` tags to identify different speakers - do not use tags in your replies."
+    async with aiohttp.ClientSession() as session:
+        stt = SpeechmaticsSTTService(
+            api_key=os.getenv("SPEECHMATICS_API_KEY"),
+            params=SpeechmaticsSTTService.InputParams(
+                language=Language.EN,
+                enable_diarization=True,
+                end_of_utterance_silence_trigger=0.5,
+                speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
            ),
-        },
-    ]
+        )

-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(
-        context,
-        user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
-    )
+        tts = SpeechmaticsTTSService(
+            api_key=os.getenv("SPEECHMATICS_API_KEY"),
+            voice_id="sarah",
+            aiohttp_session=session,
+        )

-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            params=BaseOpenAILLMService.InputParams(temperature=0.75),
+        )
+
+        messages = [
+            {
+                "role": "system",
+                "content": (
+                    "You are a helpful British assistant called Sarah. "
+                    "Your goal is to demonstrate your capabilities in a succinct way. "
+                    "Your output will be converted to audio so don't include special characters in your answers. "
+                    "Always include punctuation in your responses. "
+                    "Give very short replies - do not give longer replies unless strictly necessary. "
+                    "Respond to what the user said in a concise, funny, creative and helpful way. "
+                    "Use `<Sn/>` tags to identify different speakers - do not use tags in your replies."
+                ),
+            },
        ]
-    )

-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
+        context = LLMContext(messages)
+        context_aggregator = LLMContextAggregatorPair(
+            context,
+            user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
+        )

-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Say a short hello to the user."})
-        await task.queue_frames([LLMRunFrame()])
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                stt,  # STT
+                context_aggregator.user(),  # User responses
+                llm,  # LLM
+                tts,  # TTS
+                transport.output(),  # Transport bot output
+                context_aggregator.assistant(),  # Assistant spoken responses
+            ]
+        )

-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                enable_metrics=True,
+                enable_usage_metrics=True,
+            ),
+            idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+        )

-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            logger.info(f"Client connected")
+            # Kick off the conversation.
+            messages.append({"role": "system", "content": "Say a short hello to the user."})
+            await task.queue_frames([LLMRunFrame()])

-    await runner.run(task)
+        @transport.event_handler("on_client_disconnected")
+        async def on_client_disconnected(transport, client):
+            logger.info(f"Client disconnected")
+            await task.cancel()
+
+        runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+        await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/foundational/07c-interruptible-deepgram-flux.py
+++ b/examples/foundational/07c-interruptible-deepgram-flux.py
@@ -101,6 +101,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        logger.info(f"Client disconnected")
        await task.cancel()

+    @stt.event_handler("on_update")
+    async def on_deepgram_flux_update(stt, transcript):
+        logger.debug(f"On deepgram flux update: {transcript}")
+
    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

    await runner.run(task)
--- a/examples/foundational/07c-interruptible-deepgram-http.py
+++ b/examples/foundational/07c-interruptible-deepgram-http.py
@@ -0,0 +1,132 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+
+import aiohttp
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.deepgram.tts import DeepgramHttpTTSService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info(f"Starting bot")
+
+    async with aiohttp.ClientSession() as session:
+        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+        tts = DeepgramHttpTTSService(
+            api_key=os.getenv("DEEPGRAM_API_KEY"),
+            voice="aura-2-andromeda-en",
+            aiohttp_session=session,
+        )
+
+        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        context = LLMContext(messages)
+        context_aggregator = LLMContextAggregatorPair(context)
+
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                stt,  # STT
+                context_aggregator.user(),  # User responses
+                llm,  # LLM
+                tts,  # TTS
+                transport.output(),  # Transport bot output
+                context_aggregator.assistant(),  # Assistant spoken responses
+            ]
+        )
+
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                enable_metrics=True,
+                enable_usage_metrics=True,
+            ),
+            idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+        )
+
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            logger.info(f"Client connected")
+            # Kick off the conversation.
+            messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([LLMRunFrame()])
+
+        @transport.event_handler("on_client_disconnected")
+        async def on_client_disconnected(transport, client):
+            logger.info(f"Client disconnected")
+            await task.cancel()
+
+        runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+        await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/07m-interruptible-aws.py
+++ b/examples/foundational/07m-interruptible-aws.py
@@ -67,8 +67,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    llm = AWSBedrockLLMService(
        aws_region="us-west-2",
-        model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
-        params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
+        model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
+        params=AWSBedrockLLMService.InputParams(temperature=0.8),
    )

    messages = [
--- a/examples/foundational/08-bots-arguing.py
+++ b/examples/foundational/08-bots-arguing.py
@@ -1,147 +0,0 @@
-import asyncio
-import logging
-import os
-from typing import Tuple
-
-import aiohttp
-from dotenv import load_dotenv
-
-from pipecat.frames.frames import AudioFrame, EndFrame, ImageFrame, LLMContextFrame, TextFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.processors.aggregators import SentenceAggregator
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.daily import configure
-from pipecat.services.azure import AzureLLMService, AzureTTSService
-from pipecat.services.elevenlabs import ElevenLabsTTSService
-from pipecat.services.fal import FalImageGenService
-from pipecat.transports.daily.transport import DailyTransport
-
-load_dotenv(override=True)
-
-logging.basicConfig(format=f"%(levelno)s %(asctime)s %(message)s")
-logger = logging.getLogger("pipecat")
-logger.setLevel(logging.DEBUG)
-
-
-async def main():
-    async with aiohttp.ClientSession() as session:
-        (room_url, _) = await configure(session)
-
-        transport = DailyTransport(
-            room_url,
-            None,
-            "Respond bot",
-            duration_minutes=10,
-            mic_enabled=True,
-            mic_sample_rate=16000,
-            camera_enabled=True,
-            camera_width=1024,
-            camera_height=1024,
-        )
-
-        llm = AzureLLMService(
-            api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
-            endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
-            model=os.getenv("AZURE_CHATGPT_MODEL"),
-        )
-        tts1 = AzureTTSService(
-            api_key=os.getenv("AZURE_SPEECH_API_KEY"),
-            region=os.getenv("AZURE_SPEECH_REGION"),
-        )
-        tts2 = ElevenLabsTTSService(
-            api_key=os.getenv("ELEVENLABS_API_KEY"),
-            voice_id="jBpfuIE2acCO8z3wKNLl",
-        )
-        dalle = FalImageGenService(
-            params=FalImageGenService.InputParams(image_size="1024x1024"),
-            aiohttp_session=session,
-            key=os.getenv("FAL_KEY"),
-        )
-
-        bot1_messages = [
-            {
-                "role": "system",
-                "content": "You are a stern librarian. You strongly believe that a hot dog is a sandwich. Start by stating this fact in a few sentences, then be prepared to debate this with the user. You shouldn't ever compromise on the fundamental truth that a hot dog is a sandwich. Your responses should only be a few sentences long.",
-            },
-        ]
-        bot2_messages = [
-            {
-                "role": "system",
-                "content": "You are a silly cat, and you strongly believe that a hot dog is not a sandwich. Debate this with the user, only responding with a few sentences. Don't ever accept that a hot dog is a sandwich.",
-            },
-        ]
-
-        async def get_text_and_audio(messages) -> Tuple[str, bytearray]:
-            """This function streams text from the LLM and uses the TTS service to convert
-            that text to speech as it's received.
-            """
-            source_queue = asyncio.Queue()
-            sink_queue = asyncio.Queue()
-            sentence_aggregator = SentenceAggregator()
-            pipeline = Pipeline([llm, sentence_aggregator, tts1], source_queue, sink_queue)
-
-            await source_queue.put(LLMContextFrame(LLMContext(messages)))
-            await source_queue.put(EndFrame())
-            await pipeline.run_pipeline()
-
-            message = ""
-            all_audio = bytearray()
-            while sink_queue.qsize():
-                frame = sink_queue.get_nowait()
-                if isinstance(frame, TextFrame):
-                    message += frame.text
-                elif isinstance(frame, AudioFrame):
-                    all_audio.extend(frame.audio)
-
-            return (message, all_audio)
-
-        async def get_bot1_statement():
-            message, audio = await get_text_and_audio(bot1_messages)
-
-            bot1_messages.append({"role": "assistant", "content": message})
-            bot2_messages.append({"role": "user", "content": message})
-
-            return audio
-
-        async def get_bot2_statement():
-            message, audio = await get_text_and_audio(bot2_messages)
-
-            bot2_messages.append({"role": "assistant", "content": message})
-            bot1_messages.append({"role": "user", "content": message})
-
-            return audio
-
-        async def argue():
-            for i in range(100):
-                print(f"In iteration {i}")
-
-                bot1_description = "A woman conservatively dressed as a librarian in a library surrounded by books, cartoon, serious, highly detailed"
-
-                (audio1, image_data1) = await asyncio.gather(
-                    get_bot1_statement(), dalle.run_image_gen(bot1_description)
-                )
-                await transport.send_queue.put(
-                    [
-                        ImageFrame(image_data1[1], image_data1[2]),
-                        AudioFrame(audio1),
-                    ]
-                )
-
-                bot2_description = "A cat dressed in a hot dog costume, cartoon, bright colors, funny, highly detailed"
-
-                (audio2, image_data2) = await asyncio.gather(
-                    get_bot2_statement(), dalle.run_image_gen(bot2_description)
-                )
-                await transport.send_queue.put(
-                    [
-                        ImageFrame(image_data2[1], image_data2[2]),
-                        AudioFrame(audio2),
-                    ]
-                )
-
-        await asyncio.gather(transport.run(), argue())
-
-
-if __name__ == "__main__":
-    asyncio.run(main())
--- a/examples/foundational/08-custom-frame-processor.py
+++ b/examples/foundational/08-custom-frame-processor.py
@@ -4,8 +4,9 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+import io
 import os
-from typing import Optional
+import re

 from dotenv import load_dotenv
 from loguru import logger
@@ -16,24 +17,17 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import (
    Frame,
-    LLMContextFrame,
-    TextFrame,
-    TTSSpeakFrame,
-    UserImageRawFrame,
-    UserImageRequestFrame,
+    LLMRunFrame,
+    MetricsFrame,
 )
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.user_response import UserResponseAggregator
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import (
-    create_transport,
-    get_transport_client_id,
-    maybe_capture_participant_camera,
-)
+from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.openai.llm import OpenAILLMService
@@ -43,46 +37,41 @@ from pipecat.transports.daily.transport import DailyParams
 load_dotenv(override=True)


-class UserImageRequester(FrameProcessor):
-    """Converts incoming text into requests for user images."""
+def format_metrics(metrics, indent=0):
+    lines = []
+    tab = "\t" * indent

-    def __init__(self, participant_id: Optional[str] = None):
-        super().__init__()
-        self._participant_id = participant_id
+    for metric in metrics:
+        lines.append(tab + type(metric).__name__)
+        for field, value in vars(metric).items():
+            if hasattr(value, "__dict__") and not isinstance(
+                value, (str, int, float, bool, type(None))
+            ):
+                lines.append(f"{tab}\t{field}={type(value).__name__}")
+                for k, v in vars(value).items():
+                    lines.append(f"{tab}\t\t{k}={repr(v)}")
+            else:
+                lines.append(f"{tab}\t{field}={repr(value)}")

-    def set_participant_id(self, participant_id: str):
-        self._participant_id = participant_id
+    return "\n".join(lines)
+
+
+class MetricsFrameLogger(FrameProcessor):
+    """MetricsFrameLogger formats and logs all MetricsFrames."""
+
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

-        if self._participant_id and isinstance(frame, TextFrame):
-            await self.push_frame(
-                UserImageRequestFrame(self._participant_id, context=frame.text),
-                FrameDirection.UPSTREAM,
-            )
-        else:
+        if isinstance(frame, MetricsFrame):
+            logger.info(f"{frame.name}\n    {format_metrics(frame.data)}")
            await self.push_frame(frame, direction)

-
-class UserImageProcessor(FrameProcessor):
-    """Converts incoming user images into context frames."""
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        if isinstance(frame, UserImageRawFrame):
-            if frame.request and frame.request.context:
-                context = LLMContext()
-                context.add_image_frame_message(
-                    image=frame.image,
-                    text=frame.request.context,
-                    size=frame.size,
-                    format=frame.format,
-                )
-                frame = LLMContextFrame(context)
-                await self.push_frame(frame)
+        # ALWAYS push all frames
        else:
+            # SUPER IMPORTANT: always push every frame!
            await self.push_frame(frame, direction)


@@ -93,14 +82,13 @@ transport_params = {
    "daily": lambda: DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
-        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
    "webrtc": lambda: TransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
-        video_in_enabled=True,
+        video_out_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
@@ -110,33 +98,37 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    user_response = UserResponseAggregator()
-
-    # Initialize the image requester without setting the participant ID yet
-    image_requester = UserImageRequester()
-
-    image_processor = UserImageProcessor()
-
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

-    # OpenAI GPT-4o for vision analysis
-    openai = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

+    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    metrics_frame_processor = MetricsFrameLogger()
+
    pipeline = Pipeline(
        [
            transport.input(),
            stt,
-            user_response,
-            image_requester,
-            image_processor,
-            openai,
+            context_aggregator.user(),
+            llm,
            tts,
            transport.output(),
+            context_aggregator.assistant(),
+            metrics_frame_processor,  # pretty print metrics frames
        ]
    )

@@ -152,15 +144,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected: {client}")
-
-        await maybe_capture_participant_camera(transport, client)
-
-        # Set the participant ID in the image requester
-        client_id = get_transport_client_id(transport, client)
-        image_requester.set_participant_id(client_id)
-
-        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
+        # Kick off the conversation.
+        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12-describe-image-openai.py
+++ b/examples/foundational/12-describe-image-openai.py
@@ -0,0 +1,141 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+from PIL import Image
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
+        },
+    ]
+
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+
+        if not runner_args.body:
+            script_dir = os.path.dirname(__file__)
+            runner_args.body = {
+                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
+                "question": "Describe this image",
+            }
+
+        image_path = runner_args.body["image_path"]
+        question = runner_args.body["question"]
+
+        # Kick off the conversation.
+        image = Image.open(image_path).convert("RGB")  # match format="RGB" below
+        message = LLMContext.create_image_message(
+            image=image.tobytes(),
+            format="RGB",
+            size=image.size,
+            text=question,
+        )
+        messages.append(message)
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/12-describe-video.py
+++ b/examples/foundational/12-describe-video.py
@@ -1,180 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import os
-from typing import Optional
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import (
-    Frame,
-    LLMContextFrame,
-    TextFrame,
-    TTSSpeakFrame,
-    UserImageRawFrame,
-    UserImageRequestFrame,
-)
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.user_response import UserResponseAggregator
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import (
-    create_transport,
-    get_transport_client_id,
-    maybe_capture_participant_camera,
-)
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.moondream.vision import MoondreamService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-
-load_dotenv(override=True)
-
-
-class UserImageRequester(FrameProcessor):
-    """Converts incoming text into requests for user images."""
-
-    def __init__(self, participant_id: Optional[str] = None):
-        super().__init__()
-        self._participant_id = participant_id
-
-    def set_participant_id(self, participant_id: str):
-        self._participant_id = participant_id
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        if self._participant_id and isinstance(frame, TextFrame):
-            await self.push_frame(
-                UserImageRequestFrame(self._participant_id, context=frame.text),
-                FrameDirection.UPSTREAM,
-            )
-        else:
-            await self.push_frame(frame, direction)
-
-
-class UserImageProcessor(FrameProcessor):
-    """Converts incoming user images into context frames."""
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        if isinstance(frame, UserImageRawFrame):
-            if frame.request and frame.request.context:
-                context = LLMContext()
-                context.add_image_frame_message(
-                    image=frame.image,
-                    text=frame.request.context,
-                    size=frame.size,
-                    format=frame.format,
-                )
-                frame = LLMContextFrame(context)
-                await self.push_frame(frame)
-        else:
-            await self.push_frame(frame, direction)
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        video_in_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        video_in_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    user_response = UserResponseAggregator()
-
-    # Initialize the image requester without setting the participant ID yet
-    image_requester = UserImageRequester()
-
-    image_processor = UserImageProcessor()
-
-    # If you run into weird description, try with use_cpu=True
-    moondream = MoondreamService()
-
-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-    )
-
-    pipeline = Pipeline(
-        [
-            transport.input(),
-            stt,
-            user_response,
-            image_requester,
-            image_processor,
-            moondream,
-            tts,
-            transport.output(),
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected: {client}")
-
-        await maybe_capture_participant_camera(transport, client)
-
-        # Set the participant ID in the image requester
-        client_id = get_transport_client_id(transport, client)
-        image_requester.set_participant_id(client_id)
-
-        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/12a-describe-image-anthropic.py
+++ b/examples/foundational/12a-describe-image-anthropic.py
@@ -4,36 +4,25 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+
 import os
-from typing import Optional

 from dotenv import load_dotenv
 from loguru import logger
+from PIL import Image

 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import (
-    Frame,
-    LLMContextFrame,
-    TextFrame,
-    TTSSpeakFrame,
-    UserImageRawFrame,
-    UserImageRequestFrame,
-)
+from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.user_response import UserResponseAggregator
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import (
-    create_transport,
-    get_transport_client_id,
-    maybe_capture_participant_camera,
-)
+from pipecat.runner.utils import create_transport
 from pipecat.services.anthropic.llm import AnthropicLLMService
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
@@ -43,49 +32,6 @@ from pipecat.transports.daily.transport import DailyParams
 load_dotenv(override=True)


-class UserImageRequester(FrameProcessor):
-    """Converts incoming text into requests for user images."""
-
-    def __init__(self, participant_id: Optional[str] = None):
-        super().__init__()
-        self._participant_id = participant_id
-
-    def set_participant_id(self, participant_id: str):
-        self._participant_id = participant_id
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        if self._participant_id and isinstance(frame, TextFrame):
-            await self.push_frame(
-                UserImageRequestFrame(self._participant_id, context=frame.text),
-                FrameDirection.UPSTREAM,
-            )
-        else:
-            await self.push_frame(frame, direction)
-
-
-class UserImageProcessor(FrameProcessor):
-    """Converts incoming user images into context frames."""
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        if isinstance(frame, UserImageRawFrame):
-            if frame.request and frame.request.context:
-                context = LLMContext()
-                context.add_image_frame_message(
-                    image=frame.image,
-                    text=frame.request.context,
-                    size=frame.size,
-                    format=frame.format,
-                )
-                frame = LLMContextFrame(context)
-                await self.push_frame(frame)
-        else:
-            await self.push_frame(frame, direction)
-
-
 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
 # selected.
@@ -93,14 +39,12 @@ transport_params = {
    "daily": lambda: DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
-        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
    "webrtc": lambda: TransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
-        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
@@ -110,33 +54,34 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    user_response = UserResponseAggregator()
-
-    # Initialize the image requester without setting the participant ID yet
-    image_requester = UserImageRequester()
-
-    image_processor = UserImageProcessor()
-
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

-    # Anthropic for vision analysis
-    anthropic = AnthropicLLMService(api_key=os.getenv("ANTHROPIC_API_KEY"))
-
    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

+    llm = AnthropicLLMService(api_key=os.getenv("ANTHROPIC_API_KEY"))
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
+        },
+    ]
+
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
+
    pipeline = Pipeline(
        [
-            transport.input(),
-            stt,
-            user_response,
-            image_requester,
-            image_processor,
-            anthropic,
-            tts,
-            transport.output(),
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
        ]
    )

@@ -151,16 +96,28 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
-        logger.info(f"Client connected: {client}")
+        logger.info("Client connected")

-        await maybe_capture_participant_camera(transport, client)
+        if not runner_args.body:
+            script_dir = os.path.dirname(__file__)
+            runner_args.body = {
+                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
+                "question": "Describe this image",
+            }

-        # Set the participant ID in the image requester
-        client_id = get_transport_client_id(transport, client)
-        image_requester.set_participant_id(client_id)
+        image_path = runner_args.body["image_path"]
+        question = runner_args.body["question"]

-        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
+        # Kick off the conversation.
+        image = Image.open(image_path).convert("RGB")  # match format="RGB" below
+        message = LLMContext.create_image_message(
+            image=image.tobytes(),
+            format="RGB",
+            size=image.size,
+            text=question,
+        )
+        messages.append(message)
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12b-describe-image-aws.py
+++ b/examples/foundational/12b-describe-image-aws.py
@@ -0,0 +1,148 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+from PIL import Image
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.aws.llm import AWSBedrockLLMService
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    llm = AWSBedrockLLMService(
+        aws_region="us-west-2",
+        model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+        # Note: usually, prefer providing latency="optimized" param.
+        # Here we can't because AWS Bedrock doesn't support it for Claude 3.7,
+        # which we need for image input.
+        params=AWSBedrockLLMService.InputParams(temperature=0.8),
+    )
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
+        },
+    ]
+
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+
+        if not runner_args.body:
+            script_dir = os.path.dirname(__file__)
+            runner_args.body = {
+                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
+                "question": "Describe this image",
+            }
+
+        image_path = runner_args.body["image_path"]
+        question = runner_args.body["question"]
+
+        # Kick off the conversation.
+        image = Image.open(image_path).convert("RGB")  # match format="RGB" below
+        message = LLMContext.create_image_message(
+            image=image.tobytes(),
+            format="RGB",
+            size=image.size,
+            text=question,
+        )
+        messages.append(message)
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/12c-describe-image-gemini-flash.py
+++ b/examples/foundational/12c-describe-image-gemini-flash.py
@@ -0,0 +1,141 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+from PIL import Image
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.google.llm import GoogleLLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    llm = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
+        },
+    ]
+
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+
+        if not runner_args.body:
+            script_dir = os.path.dirname(__file__)
+            runner_args.body = {
+                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
+                "question": "Describe this image",
+            }
+
+        image_path = runner_args.body["image_path"]
+        question = runner_args.body["question"]
+
+        # Kick off the conversation.
+        image = Image.open(image_path).convert("RGB")  # match format="RGB" below
+        message = LLMContext.create_image_message(
+            image=image.tobytes(),
+            format="RGB",
+            size=image.size,
+            text=question,
+        )
+        messages.append(message)
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/12d-describe-image-moondream.py
+++ b/examples/foundational/12d-describe-image-moondream.py
@@ -0,0 +1,122 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+from PIL import Image
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import UserImageRawFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.moondream.vision import MoondreamService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    vision = MoondreamService()
+
+    pipeline = Pipeline(
+        [
+            vision,  # Vision
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+
+        if not runner_args.body:
+            script_dir = os.path.dirname(__file__)
+            runner_args.body = {
+                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
+                "question": "Describe this image",
+            }
+
+        image_path = runner_args.body["image_path"]
+        question = runner_args.body["question"]
+
+        # Describe the image.
+        image = Image.open(image_path).convert("RGB")  # match format="RGB" below
+        await task.queue_frames(
+            [
+                UserImageRawFrame(
+                    image=image.tobytes(),
+                    format="RGB",
+                    size=image.size,
+                    text=question,
+                )
+            ]
+        )
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/13f-cartesia-transcription.py
+++ b/examples/foundational/13f-cartesia-transcription.py
@@ -48,10 +48,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = CartesiaSTTService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        base_url=os.getenv("CARTESIA_BASE_URL"),
-    )
+    stt = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))

    tl = TranscriptionLogger()

--- a/examples/foundational/14d-function-calling-anthropic-video.py
+++ b/examples/foundational/14d-function-calling-anthropic-video.py
@@ -4,8 +4,6 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-
-import asyncio
 import os

 from dotenv import load_dotenv
@@ -17,12 +15,13 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
+from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
@@ -39,34 +38,30 @@ from pipecat.transports.daily.transport import DailyParams
 load_dotenv(override=True)


-# Global variable to store the client ID
-client_id = ""
+async def fetch_user_image(params: FunctionCallParams):
+    """Fetch the user image and push it to the LLM.

-
-async def get_weather(params: FunctionCallParams):
-    location = params.arguments["location"]
-    await params.result_callback(f"The weather in {location} is currently 72 degrees and sunny.")
-
-
-async def get_image(params: FunctionCallParams):
+    When called, this function pushes a UserImageRequestFrame upstream to the
+    transport. As a result, the transport will request the user image and push a
+    UserImageRawFrame downstream, which will be added to the context by the LLM
+    assistant aggregator.
+    """
+    user_id = params.arguments["user_id"]
    question = params.arguments["question"]
-    logger.debug(f"Requesting image with user_id={client_id}, question={question}")
+    logger.debug(f"Requesting image with user_id={user_id}, question={question}")

-    # Request the image frame
-    await params.llm.request_image_frame(
-        user_id=client_id,
-        function_name=params.function_name,
-        tool_call_id=params.tool_call_id,
-        text_content=question,
+    # Request a user image frame and indicate that it should be added to the
+    # context.
+    await params.llm.push_frame(
+        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=True),
+        FrameDirection.UPSTREAM,
    )

-    # Wait a short time for the frame to be processed
-    await asyncio.sleep(0.5)
+    await params.result_callback(None)

-    # Return a result to complete the function call
-    await params.result_callback(
-        f"I've captured an image from your camera and I'm analyzing what you asked about: {question}"
-    )
+    # Instead of None, it's possible to also provide a tool call answer to
+    # tell the LLM that we are grabbing the image to analyze.
+    # await params.result_callback({"result": "Image is being captured."})


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -100,70 +95,32 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

-    llm = AnthropicLLMService(
-        api_key=os.getenv("ANTHROPIC_API_KEY"),
-        model="claude-3-7-sonnet-latest",
-        params=AnthropicLLMService.InputParams(enable_prompt_caching=True),
-    )
-    llm.register_function("get_weather", get_weather)
-    llm.register_function("get_image", get_image)
+    # Anthropic for vision analysis
+    llm = AnthropicLLMService(api_key=os.getenv("ANTHROPIC_API_KEY"))
+    llm.register_function("fetch_user_image", fetch_user_image)

-    weather_function = FunctionSchema(
-        name="get_weather",
-        description="Get the current weather",
+    fetch_image_function = FunctionSchema(
+        name="fetch_user_image",
+        description="Called when the user requests a description of their camera feed",
        properties={
-            "location": {
+            "user_id": {
                "type": "string",
-                "description": "The city and state, e.g. San Francisco, CA",
+                "description": "The ID of the user to grab the image from",
            },
-        },
-        required=["location"],
-    )
-    get_image_function = FunctionSchema(
-        name="get_image",
-        description="Get an image from the video stream.",
-        properties={
            "question": {
                "type": "string",
-                "description": "The question that the user is asking about the image.",
-            }
+                "description": "The question that the user is asking about the image",
+            },
        },
-        required=["question"],
+        required=["user_id", "question"],
    )
-    tools = ToolsSchema(standard_tools=[weather_function, get_image_function])
-
-    system_prompt = """\
-You are a helpful assistant who converses with a user and answers questions. Respond concisely to general questions.
-
-Your response will be turned into speech so use only simple words and punctuation.
-
-You have access to two tools: get_weather and get_image.
-
-You can respond to questions about the weather using the get_weather tool.
-
-You can answer questions about the user's video stream using the get_image tool. Some examples of phrases that \
-indicate you should use the get_image tool are:
- What do you see?
- What's in the video?
- Can you describe the video?
- Tell me about what you see.
- Tell me something interesting about what you see.
- What's happening in the video?
-
-If you need to use a tool, simply use the tool. Do not tell the user the tool you are using. Be brief and concise.
-    """
+    tools = ToolsSchema(standard_tools=[fetch_image_function])

    messages = [
        {
            "role": "system",
-            "content": [
-                {
-                    "type": "text",
-                    "text": system_prompt,
-                }
-            ],
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
        },
-        {"role": "user", "content": "Start the conversation by introducing yourself."},
    ]

    context = LLMContext(messages, tools)
@@ -173,11 +130,11 @@ If you need to use a tool, simply use the tool. Do not tell the user the tool yo
        [
            transport.input(),  # Transport user input
            stt,  # STT
-            context_aggregator.user(),  # User speech to text
+            context_aggregator.user(),  # User responses
            llm,  # LLM
            tts,  # TTS
            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses and tool context
+            context_aggregator.assistant(),  # Assistant spoken responses
        ]
    )

@@ -196,10 +153,16 @@ If you need to use a tool, simply use the tool. Do not tell the user the tool yo

        await maybe_capture_participant_camera(transport, client)

-        global client_id
+        # Get the client ID so the LLM can pass it as user_id in function calls
        client_id = get_transport_client_id(transport, client)

        # Kick off the conversation.
+        messages.append(
+            {
+                "role": "system",
+                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
+            }
+        )
        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
--- a/examples/foundational/14d-function-calling-aws-video.py
+++ b/examples/foundational/14d-function-calling-aws-video.py
@@ -5,29 +5,23 @@
 #

 import os
-from typing import Optional

 from dotenv import load_dotenv
 from loguru import logger

+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import (
-    Frame,
-    LLMContextFrame,
-    TextFrame,
-    TTSSpeakFrame,
-    UserImageRawFrame,
-    UserImageRequestFrame,
-)
+from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.user_response import UserResponseAggregator
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
@@ -37,54 +31,37 @@ from pipecat.runner.utils import (
 from pipecat.services.aws.llm import AWSBedrockLLMService
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams

 load_dotenv(override=True)


-class UserImageRequester(FrameProcessor):
-    """Converts incoming text into requests for user images."""
+async def fetch_user_image(params: FunctionCallParams):
+    """Fetch the user image and push it to the LLM.

-    def __init__(self, participant_id: Optional[str] = None):
-        super().__init__()
-        self._participant_id = participant_id
+    When called, this function pushes a UserImageRequestFrame upstream to the
+    transport. As a result, the transport will request the user image and push a
+    UserImageRawFrame downstream, which will be added to the context by the LLM
+    assistant aggregator.
+    """
+    user_id = params.arguments["user_id"]
+    question = params.arguments["question"]
+    logger.debug(f"Requesting image with user_id={user_id}, question={question}")

-    def set_participant_id(self, participant_id: str):
-        self._participant_id = participant_id
+    # Request a user image frame and indicate that it should be added to the
+    # context.
+    await params.llm.push_frame(
+        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=True),
+        FrameDirection.UPSTREAM,
+    )

-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
+    await params.result_callback(None)

-        if self._participant_id and isinstance(frame, TextFrame):
-            await self.push_frame(
-                UserImageRequestFrame(self._participant_id, context=frame.text),
-                FrameDirection.UPSTREAM,
-            )
-        else:
-            await self.push_frame(frame, direction)
-
-
-class UserImageProcessor(FrameProcessor):
-    """Converts incoming user images into context frames."""
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        if isinstance(frame, UserImageRawFrame):
-            if frame.request and frame.request.context:
-                # Note: AWS Bedrock does not yet support the universal LLMContext
-                context = LLMContext()
-                context.add_image_frame_message(
-                    image=frame.image,
-                    text=frame.request.context,
-                    size=frame.size,
-                    format=frame.format,
-                )
-                frame = LLMContextFrame(context)
-                await self.push_frame(frame)
-        else:
-            await self.push_frame(frame, direction)
+    # Instead of None, it's possible to also provide a tool call answer to
+    # tell the LLM that we are grabbing the image to analyze.
+    # await params.result_callback({"result": "Image is being captured."})


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -111,17 +88,15 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    user_response = UserResponseAggregator()
-
-    # Initialize the image requester without setting the participant ID yet
-    image_requester = UserImageRequester()
-
-    image_processor = UserImageProcessor()
-
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
    # AWS for vision analysis
-    aws = AWSBedrockLLMService(
+    llm = AWSBedrockLLMService(
        aws_region="us-west-2",
        model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
        # Note: usually, prefer providing latency="optimized" param.
@@ -129,22 +104,44 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        # which we need for image input.
        params=AWSBedrockLLMService.InputParams(temperature=0.8),
    )
+    llm.register_function("fetch_user_image", fetch_user_image)

-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    fetch_image_function = FunctionSchema(
+        name="fetch_user_image",
+        description="Called when the user requests a description of their camera feed",
+        properties={
+            "user_id": {
+                "type": "string",
+                "description": "The ID of the user to grab the image from",
+            },
+            "question": {
+                "type": "string",
+                "description": "The question that the user is asking about the image",
+            },
+        },
+        required=["user_id", "question"],
    )
+    tools = ToolsSchema(standard_tools=[fetch_image_function])
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+        },
+    ]
+
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
-            transport.input(),
-            stt,
-            user_response,
-            image_requester,
-            image_processor,
-            aws,
-            tts,
-            transport.output(),
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
        ]
    )

@@ -165,10 +162,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        # Set the participant ID in the image requester
        client_id = get_transport_client_id(transport, client)
-        image_requester.set_participant_id(client_id)

-        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
+        # Kick off the conversation.
+        messages.append(
+            {
+                "role": "system",
+                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
+            }
+        )
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/14d-function-calling-gemini-flash-video.py
+++ b/examples/foundational/14d-function-calling-gemini-flash-video.py
@@ -5,29 +5,23 @@
 #

 import os
-from typing import Optional

 from dotenv import load_dotenv
 from loguru import logger

+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import (
-    Frame,
-    LLMContextFrame,
-    TextFrame,
-    TTSSpeakFrame,
-    UserImageRawFrame,
-    UserImageRequestFrame,
-)
+from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.user_response import UserResponseAggregator
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
@@ -37,53 +31,37 @@ from pipecat.runner.utils import (
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.google.llm import GoogleLLMService
+from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams

 load_dotenv(override=True)


-class UserImageRequester(FrameProcessor):
-    """Converts incoming text into requests for user images."""
+async def fetch_user_image(params: FunctionCallParams):
+    """Fetch the user image and push it to the LLM.

-    def __init__(self, participant_id: Optional[str] = None):
-        super().__init__()
-        self._participant_id = participant_id
+    When called, this function pushes a UserImageRequestFrame upstream to the
+    transport. As a result, the transport will request the user image and push a
+    UserImageRawFrame downstream, which will be added to the context by the LLM
+    assistant aggregator.
+    """
+    user_id = params.arguments["user_id"]
+    question = params.arguments["question"]
+    logger.debug(f"Requesting image with user_id={user_id}, question={question}")

-    def set_participant_id(self, participant_id: str):
-        self._participant_id = participant_id
+    # Request a user image frame and indicate that it should be added to the
+    # context.
+    await params.llm.push_frame(
+        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=True),
+        FrameDirection.UPSTREAM,
+    )

-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
+    await params.result_callback(None)

-        if self._participant_id and isinstance(frame, TextFrame):
-            await self.push_frame(
-                UserImageRequestFrame(self._participant_id, context=frame.text),
-                FrameDirection.UPSTREAM,
-            )
-        else:
-            await self.push_frame(frame, direction)
-
-
-class UserImageProcessor(FrameProcessor):
-    """Converts incoming user images into context frames."""
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        if isinstance(frame, UserImageRawFrame):
-            if frame.request and frame.request.context:
-                context = LLMContext()
-                context.add_image_frame_message(
-                    image=frame.image,
-                    text=frame.request.context,
-                    size=frame.size,
-                    format=frame.format,
-                )
-                frame = LLMContextFrame(context)
-                await self.push_frame(frame)
-        else:
-            await self.push_frame(frame, direction)
+    # Instead of None, it's possible to also provide a tool call answer to
+    # tell the LLM that we are grabbing the image to analyze.
+    # await params.result_callback({"result": "Image is being captured."})


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -110,33 +88,53 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    user_response = UserResponseAggregator()
-
-    # Initialize the image requester without setting the participant ID yet
-    image_requester = UserImageRequester()
-
-    image_processor = UserImageProcessor()
-
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

-    # Google Gemini model for vision analysis
-    google = GoogleLLMService(model="gemini-2.0-flash-001", api_key=os.getenv("GOOGLE_API_KEY"))
-
    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

+    # Google Gemini model for vision analysis
+    llm = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
+    llm.register_function("fetch_user_image", fetch_user_image)
+
+    fetch_image_function = FunctionSchema(
+        name="fetch_user_image",
+        description="Called when the user requests a description of their camera feed",
+        properties={
+            "user_id": {
+                "type": "string",
+                "description": "The ID of the user to grab the image from",
+            },
+            "question": {
+                "type": "string",
+                "description": "The question that the user is asking about the image",
+            },
+        },
+        required=["user_id", "question"],
+    )
+    tools = ToolsSchema(standard_tools=[fetch_image_function])
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+        },
+    ]
+
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)
+
    pipeline = Pipeline(
        [
-            transport.input(),
-            stt,
-            user_response,
-            image_requester,
-            image_processor,
-            google,
-            tts,
-            transport.output(),
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
        ]
    )

@@ -157,10 +155,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        # Set the participant ID in the image requester
        client_id = get_transport_client_id(transport, client)
-        image_requester.set_participant_id(client_id)

-        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
+        # Kick off the conversation.
+        messages.append(
+            {
+                "role": "system",
+                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
+            }
+        )
+        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
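The Anthropic, AWS and Gemini diffs above all converge on the same `fetch_user_image` tool handler: read `user_id` and `question` from the tool arguments, push an image request upstream, then complete the tool call with `None`. The control flow can be sketched without Pipecat installed. Everything in this sketch (`ImageRequest`, `FakeLLM`, `FakeParams`, the `"upstream"` string) is a hypothetical stand-in rather than Pipecat's API; only the shape of the handler mirrors the diff.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, List, Optional, Tuple


@dataclass
class ImageRequest:
    """Hypothetical stand-in for UserImageRequestFrame."""

    user_id: str
    text: str
    append_to_context: bool = True


@dataclass
class FakeLLM:
    """Hypothetical stand-in for an LLM service; it only records pushed frames."""

    pushed: List[Tuple[Any, str]] = field(default_factory=list)

    async def push_frame(self, frame: Any, direction: str) -> None:
        self.pushed.append((frame, direction))


@dataclass
class FakeParams:
    """Hypothetical stand-in for FunctionCallParams."""

    arguments: dict
    llm: FakeLLM
    result_callback: Callable[[Optional[Any]], Awaitable[None]]


async def fetch_user_image(params: FakeParams) -> None:
    # Same shape as the handler in the diff: read the arguments, push a
    # request upstream, then complete the tool call without a payload.
    user_id = params.arguments["user_id"]
    question = params.arguments["question"]
    await params.llm.push_frame(
        ImageRequest(user_id=user_id, text=question, append_to_context=True),
        "upstream",
    )
    await params.result_callback(None)


async def demo() -> Tuple[FakeLLM, list]:
    results: list = []

    async def on_result(value: Optional[Any]) -> None:
        results.append(value)

    llm = FakeLLM()
    params = FakeParams(
        arguments={"user_id": "user-1", "question": "What do you see?"},
        llm=llm,
        result_callback=on_result,
    )
    await fetch_user_image(params)
    return llm, results


llm, results = asyncio.run(demo())
```

In the real examples the request travels up to the transport, which answers with a `UserImageRawFrame`; the stand-in above only records the push so that the ordering (request first, result callback second) is easy to inspect.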
--- a/examples/foundational/14d-function-calling-moondream-video.py
+++ b/examples/foundational/14d-function-calling-moondream-video.py
@@ -0,0 +1,190 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
+from pipecat.pipeline.parallel_pipeline import ParallelPipeline
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import (
+    create_transport,
+    get_transport_client_id,
+    maybe_capture_participant_camera,
+)
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.services.moondream.vision import MoondreamService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+async def fetch_user_image(params: FunctionCallParams):
+    """Fetch the user image.
+
+    When called, this function pushes a UserImageRequestFrame upstream to the
+    transport. As a result, the transport will request the user image and push a
+    UserImageRawFrame downstream.
+    """
+    user_id = params.arguments["user_id"]
+    question = params.arguments["question"]
+    logger.debug(f"Requesting image with user_id={user_id}, question={question}")
+
+    # Request a user image frame. In this case, we don't want the requested
+    # image to be added to the context because we will process it with
+    # Moondream.
+    await params.llm.push_frame(
+        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=False),
+        FrameDirection.UPSTREAM,
+    )
+
+    await params.result_callback(None)
+
+    # Instead of None, it's possible to also provide a tool call answer to
+    # tell the LLM that we are grabbing the image to analyze.
+    # await params.result_callback({"result": "Image is being captured."})
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info(f"Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+    llm.register_function("fetch_user_image", fetch_user_image)
+
+    fetch_image_function = FunctionSchema(
+        name="fetch_user_image",
+        description="Called when the user requests a description of their camera feed",
+        properties={
+            "user_id": {
+                "type": "string",
+                "description": "The ID of the user to grab the image from",
+            },
+            "question": {
+                "type": "string",
+                "description": "The question that the user is asking about the image",
+            },
+        },
+        required=["user_id", "question"],
+    )
+    tools = ToolsSchema(standard_tools=[fetch_image_function])
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+        },
+    ]
+
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    # If you get odd descriptions, try use_cpu=True
+    moondream = MoondreamService()
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            ParallelPipeline(
+                [llm],  # LLM
+                [moondream],
+            ),
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected: {client}")
+
+        await maybe_capture_participant_camera(transport, client)
+
+        # Get the client ID so the LLM can pass it as user_id in function calls
+        client_id = get_transport_client_id(transport, client)
+
+        # Kick off the conversation.
+        messages.append(
+            {
+                "role": "system",
+                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
+            }
+        )
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/14d-function-calling-openai-video.py
+++ b/examples/foundational/14d-function-calling-openai-video.py
@@ -0,0 +1,186 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import (
+    create_transport,
+    get_transport_client_id,
+    maybe_capture_participant_camera,
+)
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+async def fetch_user_image(params: FunctionCallParams):
+    """Fetch the user image and push it to the LLM.
+
+    When called, this function pushes a UserImageRequestFrame upstream to the
+    transport. As a result, the transport will request the user image and push a
+    UserImageRawFrame downstream which will be added to the context by the LLM
+    assistant aggregator.
+    """
+    user_id = params.arguments["user_id"]
+    question = params.arguments["question"]
+    logger.debug(f"Requesting image with user_id={user_id}, question={question}")
+
+    # Request a user image frame and indicate that it should be added to the
+    # context.
+    await params.llm.push_frame(
+        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=True),
+        FrameDirection.UPSTREAM,
+    )
+
+    await params.result_callback(None)
+
+    # Instead of None, it's possible to also provide a tool call answer to
+    # tell the LLM that we are grabbing the image to analyze.
+    # await params.result_callback({"result": "Image is being captured."})
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+    llm.register_function("fetch_user_image", fetch_user_image)
+
+    fetch_image_function = FunctionSchema(
+        name="fetch_user_image",
+        description="Called when the user requests a description of their camera feed",
+        properties={
+            "user_id": {
+                "type": "string",
+                "description": "The ID of the user to grab the image from",
+            },
+            "question": {
+                "type": "string",
+                "description": "The question that the user is asking about the image",
+            },
+        },
+        required=["user_id", "question"],
+    )
+    tools = ToolsSchema(standard_tools=[fetch_image_function])
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+        },
+    ]
+
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+
+        await maybe_capture_participant_camera(transport, client)
+
+        client_id = get_transport_client_id(transport, client)
+
+        # Kick off the conversation.
+        messages.append(
+            {
+                "role": "system",
+                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
+            }
+        )
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
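Note on 14d: the `fetch_user_image` docstring describes a round trip in which a request frame travels upstream to the transport and an image frame comes back downstream to be added to the context. The sketch below illustrates that flow with plain Python stand-ins. Every class here (`ImageRequest`, `ImageRaw`, `ToyTransport`, `ToyAggregator`) is hypothetical and is not part of Pipecat; the placeholder bytes stand in for a real camera capture.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ImageRequest:  # hypothetical stand-in for UserImageRequestFrame
    user_id: str
    text: str
    append_to_context: bool = True


@dataclass
class ImageRaw:  # hypothetical stand-in for UserImageRawFrame
    user_id: str
    text: str
    image: bytes
    append_to_context: bool


@dataclass
class ToyAggregator:
    """Collects downstream image frames into a message list."""

    messages: list = field(default_factory=list)

    async def process(self, frame: ImageRaw) -> None:
        # Only frames flagged for the context are stored.
        if frame.append_to_context:
            self.messages.append(
                {"role": "user", "text": frame.text, "image_bytes": len(frame.image)}
            )


@dataclass
class ToyTransport:
    """Answers an upstream request with a downstream image frame."""

    downstream: ToyAggregator

    async def process_upstream(self, request: ImageRequest) -> None:
        image = b"\x00" * 16  # placeholder pixels, not a real camera capture
        await self.downstream.process(
            ImageRaw(request.user_id, request.text, image, request.append_to_context)
        )


async def demo() -> list:
    aggregator = ToyAggregator()
    transport = ToyTransport(downstream=aggregator)
    # The function-call handler only issues the request; the image arrives
    # through the downstream path rather than as the function result.
    await transport.process_upstream(ImageRequest("user-1", "What do you see?"))
    return aggregator.messages


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

This separation is presumably why the handler in the example can call `result_callback(None)`: in this reading, the image reaches the context through the frame path, not through the tool result.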
--- a/examples/foundational/14r-function-calling-aws.py
+++ b/examples/foundational/14r-function-calling-aws.py
@@ -79,8 +79,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    llm = AWSBedrockLLMService(
        aws_region="us-west-2",
-        model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
-        params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
+        model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
+        params=AWSBedrockLLMService.InputParams(temperature=0.8),
    )

    # You can also register a function_name of None to get all functions
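Note on the catch-all registration mentioned in the comment above (the 14x example spells it out: functions are "sent to the same callback with an additional function_name parameter"). A minimal sketch of that dispatch idea follows. `ToyFunctionRegistry` is hypothetical and is not Pipecat's implementation; in particular, preferring an exact-name match over the `None` entry is an assumption made for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

Handler = Callable[[str, dict], Awaitable[dict]]


class ToyFunctionRegistry:
    """Hypothetical registry; not Pipecat's implementation."""

    def __init__(self) -> None:
        self._handlers: Dict[Optional[str], Handler] = {}

    def register_function(self, name: Optional[str], handler: Handler) -> None:
        # A name of None marks the handler as the catch-all.
        self._handlers[name] = handler

    async def call(self, name: str, arguments: dict) -> dict:
        # Assumption: an exact match wins, otherwise use the catch-all entry.
        handler = self._handlers.get(name) or self._handlers.get(None)
        if handler is None:
            raise KeyError(f"No handler registered for {name!r}")
        return await handler(name, arguments)


async def fetch_weather(function_name: str, arguments: dict) -> dict:
    return {"conditions": "nice", "temperature": "75"}


async def catch_all(function_name: str, arguments: dict) -> dict:
    # The extra function_name argument tells the handler what was requested.
    return {"handled_by": "catch_all", "function_name": function_name}


async def demo() -> list:
    registry = ToyFunctionRegistry()
    registry.register_function("get_current_weather", fetch_weather)
    registry.register_function(None, catch_all)
    return [
        await registry.call("get_current_weather", {"location": "San Francisco, CA"}),
        await registry.call("get_restaurant_recommendation", {"location": "San Francisco, CA"}),
    ]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```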
--- a/examples/foundational/14x-function-calling-openpipe.py
+++ b/examples/foundational/14x-function-calling-openpipe.py
@@ -4,9 +4,8 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-
-import asyncio
 import os
+import time

 from dotenv import load_dotenv
 from loguru import logger
@@ -17,56 +16,31 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
+from pipecat.frames.frames import LLMRunFrame, TTSSpeakFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import (
-    create_transport,
-    get_transport_client_id,
-    maybe_capture_participant_camera,
-)
+from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.services.openpipe.llm import OpenPipeLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams

 load_dotenv(override=True)


-# Global variable to store the client ID
-client_id = ""
+async def fetch_weather_from_api(params: FunctionCallParams):
+    await params.result_callback({"conditions": "nice", "temperature": "75"})


-async def get_weather(params: FunctionCallParams):
-    location = params.arguments["location"]
-    await params.result_callback(f"The weather in {location} is currently 72 degrees and sunny.")
-
-
-async def get_image(params: FunctionCallParams):
-    question = params.arguments["question"]
-    logger.debug(f"Requesting image with user_id={client_id}, question={question}")
-
-    # Request the image frame
-    await params.llm.request_image_frame(
-        user_id=client_id,
-        function_name=params.function_name,
-        tool_call_id=params.tool_call_id,
-        text_content=question,
-    )
-
-    # Wait a short time for the frame to be processed
-    await asyncio.sleep(0.5)
-
-    # Return a result to complete the function call
-    await params.result_callback(
-        f"I've captured an image from your camera and I'm analyzing what you asked about: {question}"
-    )
+async def fetch_restaurant_recommendation(params: FunctionCallParams):
+    await params.result_callback({"name": "The Golden Dragon"})


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -76,14 +50,18 @@ transport_params = {
    "daily": lambda: DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
-        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
    "webrtc": lambda: TransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
-        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
@@ -100,12 +78,24 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-    llm.register_function("get_weather", get_weather)
-    llm.register_function("get_image", get_image)
+    timestamp = int(time.time())
+    llm = OpenPipeLLMService(
+        api_key=os.getenv("OPENAI_API_KEY"),
+        openpipe_api_key=os.getenv("OPENPIPE_API_KEY"),
+        tags={"conversation_id": f"pipecat-{timestamp}"},
+    )
+
+    # You can also register a function_name of None to get all functions
+    # sent to the same callback with an additional function_name parameter.
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
+
+    @llm.event_handler("on_function_calls_started")
+    async def on_function_calls_started(service, function_calls):
+        await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

    weather_function = FunctionSchema(
-        name="get_weather",
+        name="get_current_weather",
        description="Get the current weather",
        properties={
            "location": {
@@ -118,41 +108,26 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                "description": "The temperature unit to use. Infer this from the user's location.",
            },
        },
+        required=["location", "format"],
+    )
+    restaurant_function = FunctionSchema(
+        name="get_restaurant_recommendation",
+        description="Get a restaurant recommendation",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+        },
        required=["location"],
    )
-    get_image_function = FunctionSchema(
-        name="get_image",
-        description="Get an image from the video stream.",
-        properties={
-            "question": {
-                "type": "string",
-                "description": "The question that the user is asking about the image.",
-            }
-        },
-        required=["question"],
-    )
-    tools = ToolsSchema(standard_tools=[weather_function, get_image_function])
+    tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])

-    system_prompt = """\
-You are a helpful assistant who converses with a user and answers questions. Respond concisely to general questions.
-
-Your response will be turned into speech so use only simple words and punctuation.
-
-You have access to two tools: get_weather and get_image.
-
-You can respond to questions about the weather using the get_weather tool.
-
-You can answer questions about the user's video stream using the get_image tool. Some examples of phrases that \
-indicate you should use the get_image tool are:
- What do you see?
- What's in the video?
- Can you describe the video?
- Tell me about what you see.
- Tell me something interesting about what you see.
- What's happening in the video?
-"""
    messages = [
-        {"role": "system", "content": system_prompt},
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
    ]

    context = LLMContext(messages, tools)
@@ -182,12 +157,6 @@ indicate you should use the get_image tool are:
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
-
-        await maybe_capture_participant_camera(transport, client)
-
-        global client_id
-        client_id = get_transport_client_id(transport, client)
-
        # Kick off the conversation.
        await task.queue_frames([LLMRunFrame()])

--- a/examples/foundational/18-openai-realtime-usage.py
+++ b/examples/foundational/18-openai-realtime-usage.py
@@ -1,156 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-"""Example: Print OpenAI Realtime API Token Usage Statistics
-
-This example demonstrates how to access and print token usage statistics
-from the OpenAI Realtime API, including detailed breakdowns of input/output
-tokens, cached tokens, and audio/text token usage.
-"""
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-# We store functions so objects don't get instantiated until the desired
-# transport gets selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    """Main function demonstrating usage statistics tracking."""
-    logger.info(f"Starting bot")
-
-    # Initialize the OpenAI Realtime service
-    llm = OpenAIRealtimeLLMService(
-        api_key=os.getenv("OPENAI_API_KEY") or "",
-        model="gpt-4o-realtime-preview-2024-12-17",
-    )
-
-    # To access usage statistics, we wrap the internal response handler
-    # This is the cleanest way to intercept usage data from the realtime API
-    original_handler = llm._handle_evt_response_done
-
-    async def custom_response_done_handler(evt):
-        """Custom handler that prints usage stats before calling original handler."""
-        # Print usage statistics if available
-        if evt.response.usage:
-            usage = evt.response.usage
-
-            logger.info("\n" + "=" * 50)
-            logger.info("📊 TOKEN USAGE STATISTICS")
-            logger.info("=" * 50)
-            logger.info(f"Total tokens: {usage.total_tokens}")
-            logger.info(f"Input tokens: {usage.input_tokens}")
-            logger.info(f"Output tokens: {usage.output_tokens}")
-
-            # Input token details
-            if usage.input_token_details:
-                logger.info(f"\n📥 Input token breakdown:")
-                logger.info(f"  • Cached tokens: {usage.input_token_details.cached_tokens}")
-                logger.info(f"  • Text tokens: {usage.input_token_details.text_tokens}")
-                logger.info(f"  • Audio tokens: {usage.input_token_details.audio_tokens}")
-
-                # Cached token details if available
-                if usage.input_token_details.cached_tokens_details:
-                    logger.info(
-                        f"  • Cached text tokens: {usage.input_token_details.cached_tokens_details.text_tokens}"
-                    )
-                    logger.info(
-                        f"  • Cached audio tokens: {usage.input_token_details.cached_tokens_details.audio_tokens}"
-                    )
-
-            # Output token details
-            if usage.output_token_details:
-                logger.info(f"\n📤 Output token breakdown:")
-                logger.info(f"  • Text tokens: {usage.output_token_details.text_tokens}")
-                logger.info(f"  • Audio tokens: {usage.output_token_details.audio_tokens}")
-
-            logger.info("=" * 50 + "\n")
-
-        # Call the original handler to maintain normal functionality
-        await original_handler(evt)
-
-    # Replace the handler with our custom one
-    llm._handle_evt_response_done = custom_response_done_handler
-
-    # Create pipeline
-    pipeline = Pipeline(
-        [
-            transport.input(),
-            llm,
-            transport.output(),
-        ]
-    )
-
-    # Create task
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            allow_interruptions=True,
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info("Client connected")
-        logger.info("🎤 Speak into your microphone to interact with the assistant")
-        logger.info("📊 Usage statistics will be printed after each response")
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info("Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/19-openai-realtime.py
+++ b/examples/foundational/19-openai-realtime.py
@@ -5,6 +5,7 @@
 #


+import asyncio
 import os
 from datetime import datetime

@@ -14,12 +15,14 @@ from loguru import logger
 from pipecat.adapters.schemas.function_schema import FunctionSchema
 from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
+from pipecat.frames.frames import LLMRunFrame, LLMSetToolsFrame, TranscriptionMessage
 from pipecat.observers.loggers.transcription_log_observer import TranscriptionLogObserver
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
@@ -52,6 +55,18 @@ async def fetch_weather_from_api(params: FunctionCallParams):
    )


+async def get_news(params: FunctionCallParams):
+    await params.result_callback(
+        {
+            "news": [
+                "Massive UFO currently hovering above New York City",
+                "Stock markets reach all-time highs",
+                "Living dinosaur species discovered in the Amazon rainforest",
+            ],
+        }
+    )
+
+
 async def fetch_restaurant_recommendation(params: FunctionCallParams):
    await params.result_callback({"name": "The Golden Dragon"})

@@ -73,6 +88,13 @@ weather_function = FunctionSchema(
    required=["location", "format"],
 )

+get_news_function = FunctionSchema(
+    name="get_news",
+    description="Get the current news.",
+    properties={},
+    required=[],
+)
+
 restaurant_function = FunctionSchema(
    name="get_restaurant_recommendation",
    description="Get a restaurant recommendation",
@@ -140,10 +162,6 @@ even if you're asked about them.
 You are participating in a voice conversation. Keep your responses concise, short, and to the point
 unless specifically asked to elaborate on a topic.

-You have access to the following tools:
- get_current_weather: Get the current weather for a given location.
- get_restaurant_recommendation: Get a restaurant recommendation for a given location.
-
 Remember, your responses should be short. Just one or two sentences, usually. Respond in English.""",
    )

@@ -157,25 +175,26 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
    # llm.register_function(None, fetch_weather_from_api)
    llm.register_function("get_current_weather", fetch_weather_from_api)
    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
+    llm.register_function("get_news", get_news)

    transcript = TranscriptProcessor()

    # Create a standard OpenAI LLM context object using the normal messages format. The
    # OpenAIRealtimeLLMService will convert this internally to messages that the
    # openai WebSocket API can understand.
-    context = OpenAILLMContext(
+    context = LLMContext(
        [{"role": "user", "content": "Say hello!"}],
        tools,
    )

-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
            context_aggregator.user(),
+            transcript.user(),  # LLM pushes TranscriptionFrames upstream
            llm,  # LLM
-            transcript.user(),  # Placed after the LLM, as LLM pushes TranscriptionFrames downstream
            transport.output(),  # Transport bot output
            transcript.assistant(),  # After the transcript output, to time with the audio output
            context_aggregator.assistant(),
@@ -198,6 +217,13 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
        # Kick off the conversation.
        await task.queue_frames([LLMRunFrame()])

+        # Add a new tool at runtime after a delay.
+        await asyncio.sleep(15)
+        new_tools = ToolsSchema(
+            standard_tools=[weather_function, restaurant_function, get_news_function]
+        )
+        await task.queue_frames([LLMSetToolsFrame(tools=new_tools)])
+
    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
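Note on the runtime tool update in example 19: the new code waits, then queues an `LLMSetToolsFrame` carrying a larger `ToolsSchema`. The sketch below shows the general idea of swapping the tool list by sending a message through the same queue as other work, so the change takes effect in order. `SetTools`, `Run`, and `ToyLLM` are hypothetical stand-ins, and the short sleep replaces the 15 second delay used in the example.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SetTools:  # hypothetical stand-in for LLMSetToolsFrame
    tools: list


@dataclass
class Run:  # hypothetical stand-in for LLMRunFrame
    pass


class ToyLLM:
    """Processes frames in order and records the tools visible at each run."""

    def __init__(self, tools: list) -> None:
        self.tools = list(tools)
        self.seen = []

    async def consume(self, queue: asyncio.Queue) -> None:
        while True:
            frame = await queue.get()
            if frame is None:  # sentinel: stop consuming
                break
            if isinstance(frame, SetTools):
                self.tools = list(frame.tools)
            elif isinstance(frame, Run):
                self.seen.append(tuple(self.tools))


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    llm = ToyLLM(["get_current_weather", "get_restaurant_recommendation"])
    consumer = asyncio.create_task(llm.consume(queue))

    await queue.put(Run())
    await asyncio.sleep(0.01)  # stands in for the delay before adding a tool
    await queue.put(SetTools(llm.tools + ["get_news"]))
    await queue.put(Run())
    await queue.put(None)
    await consumer
    return llm.seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The first run sees two tools and the second sees three, which mirrors the intent of the change: `get_news` is registered up front but only advertised to the model once the updated tool set is queued.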
--- a/examples/foundational/19a-azure-realtime.py
+++ b/examples/foundational/19a-azure-realtime.py
@@ -18,7 +18,9 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.azure.realtime.llm import AzureRealtimeLLMService
@@ -155,10 +157,10 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
    llm.register_function("get_current_weather", fetch_weather_from_api)
    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)

-    # Create a standard OpenAI LLM context object using the normal messages format. The
+    # Create a standard LLM context object using the normal messages format. The
    # OpenAIRealtimeBetaLLMService will convert this internally to messages that the
    # openai WebSocket API can understand.
-    context = OpenAILLMContext(
+    context = LLMContext(
        [{"role": "user", "content": "Say hello!"}],
        # [{"role": "user", "content": [{"type": "text", "text": "Say hello!"}]}],
        #     [
@@ -173,7 +175,7 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
        tools,
    )

-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/19b-openai-realtime-text.py
+++ b/examples/foundational/19b-openai-realtime-text.py
@@ -18,7 +18,8 @@ from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
@@ -169,20 +170,20 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
    # Create a standard OpenAI LLM context object using the normal messages format. The
    # OpenAIRealtimeLLMService will convert this internally to messages that the
    # openai WebSocket API can understand.
-    context = OpenAILLMContext(
+    context = LLMContext(
        [{"role": "user", "content": "Say hello!"}],
        tools,
    )

-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
            context_aggregator.user(),
+            transcript.user(),  # LLM pushes TranscriptionFrames upstream
            llm,  # LLM
            tts,  # TTS
-            transcript.user(),  # Placed after the LLM, as LLM pushes TranscriptionFrames downstream
            transport.output(),  # Transport bot output
            transcript.assistant(),  # After the transcript output, to time with the audio output
            context_aggregator.assistant(),
--- a/examples/foundational/20b-persistent-context-openai-realtime.py
+++ b/examples/foundational/20b-persistent-context-openai-realtime.py
@@ -13,14 +13,15 @@ from datetime import datetime
 from dotenv import load_dotenv
 from loguru import logger

+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import (
-    OpenAILLMContext,
-)
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.deepgram.stt import DeepgramSTTService
@@ -69,11 +70,11 @@ async def save_conversation(params: FunctionCallParams):
    timestamp = datetime.now().strftime("%Y-%m-%d_%H:%M:%S")
    filename = f"{BASE_FILENAME}{timestamp}.json"
    logger.debug(
-        f"writing conversation to {filename}\n{json.dumps(params.context.messages, indent=4)}"
+        f"writing conversation to {filename}\n{json.dumps(params.context.get_messages(), indent=4)}"
    )
    try:
        with open(filename, "w") as file:
-            messages = params.context.get_messages_for_persistent_storage()
+            messages = params.context.get_messages()
            # remove the last message, which is the instruction we just gave to save the conversation
            messages.pop()
            json.dump(messages, file, indent=2)
@@ -90,6 +91,10 @@ async def load_conversation(params: FunctionCallParams):
            with open(filename, "r") as file:
                params.context.set_messages(json.load(file))
                await params.llm.reset_conversation()
+                # NOTE: we manually create a response here rather than relying
+                # on the function callback to trigger one since we've reset the
+                # conversation so the remote service doesn't know about the
+                # in-progress tool call.
                await params.llm._create_response()
        except Exception as e:
            await params.result_callback({"success": False, "error": str(e)})
@@ -97,14 +102,12 @@ async def load_conversation(params: FunctionCallParams):
    asyncio.create_task(_reset())


-tools = [
-    {
-        "type": "function",
-        "name": "get_current_weather",
-        "description": "Get the current weather",
-        "parameters": {
-            "type": "object",
-            "properties": {
+tools = ToolsSchema(
+    standard_tools=[
+        FunctionSchema(
+            name="get_current_weather",
+            description="Get the current weather",
+            properties={
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
@@ -115,45 +118,33 @@ tools = [
                    "description": "The temperature unit to use. Infer this from the users location.",
                },
            },
-            "required": ["location", "format"],
-        },
-    },
-    {
-        "type": "function",
-        "name": "save_conversation",
-        "description": "Save the current conversatione. Use this function to persist the current conversation to external storage.",
-        "parameters": {
-            "type": "object",
-            "properties": {},
-            "required": [],
-        },
-    },
-    {
-        "type": "function",
-        "name": "get_saved_conversation_filenames",
-        "description": "Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
-        "parameters": {
-            "type": "object",
-            "properties": {},
-            "required": [],
-        },
-    },
-    {
-        "type": "function",
-        "name": "load_conversation",
-        "description": "Load a conversation history. Use this function to load a conversation history into the current session.",
-        "parameters": {
-            "type": "object",
-            "properties": {
+            required=["location", "format"],
+        ),
+        FunctionSchema(
+            name="save_conversation",
+            description="Save the current conversation. Use this function to persist the current conversation to external storage.",
+            properties={},
+            required=[],
+        ),
+        FunctionSchema(
+            name="get_saved_conversation_filenames",
+            description="Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
+            properties={},
+            required=[],
+        ),
+        FunctionSchema(
+            name="load_conversation",
+            description="Load a conversation history. Use this function to load a conversation history into the current session.",
+            properties={
                "filename": {
                    "type": "string",
                    "description": "The filename of the conversation history to load.",
                }
            },
-            "required": ["filename"],
-        },
-    },
-]
+            required=["filename"],
+        ),
+    ]
+)


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -224,8 +215,8 @@ Remember, your responses should be short. Just one or two sentences, usually."""
    llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
    llm.register_function("load_conversation", load_conversation)

-    context = OpenAILLMContext([], tools)
-    context_aggregator = llm.create_context_aggregator(context)
+    context = LLMContext([{"role": "user", "content": "Say hello!"}], tools)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
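The 20b change above rewrites each dict-style tool definition as `FunctionSchema(name=..., description=..., properties=..., required=...)`. As a rough illustration of that mapping, the sketch below flattens a legacy tool dict into those four keyword arguments. The helper `legacy_tool_to_schema_kwargs` is hypothetical and not part of Pipecat; it uses only the standard library and assumes the legacy dicts follow the shape shown in the removed lines.

```python
from typing import Any, Dict, List


def legacy_tool_to_schema_kwargs(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a legacy dict-style tool definition into keyword arguments.

    The returned keys (name, description, properties, required) mirror the
    keyword arguments passed to FunctionSchema in the diff above. This helper
    is hypothetical and not part of Pipecat.
    """
    parameters = tool.get("parameters", {})
    return {
        "name": tool["name"],
        "description": tool.get("description", ""),
        # Copy the nested containers so the legacy definition is not shared.
        "properties": dict(parameters.get("properties", {})),
        "required": list(parameters.get("required", [])),
    }


legacy_tools: List[Dict[str, Any]] = [
    {
        "type": "function",
        "name": "load_conversation",
        "description": "Load a conversation history.",
        "parameters": {
            "type": "object",
            "properties": {
                "filename": {
                    "type": "string",
                    "description": "The filename of the conversation history to load.",
                }
            },
            "required": ["filename"],
        },
    },
    {
        "type": "function",
        "name": "save_conversation",
        "description": "Save the current conversation.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
]

schema_kwargs = [legacy_tool_to_schema_kwargs(tool) for tool in legacy_tools]
```

With Pipecat installed, each resulting dict could presumably be expanded as `FunctionSchema(**kwargs)` and the results collected into `ToolsSchema(standard_tools=[...])`, which is what the diff does by hand.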
--- a/examples/foundational/20c-persistent-context-anthropic.py
+++ b/examples/foundational/20c-persistent-context-anthropic.py
@@ -72,7 +72,6 @@ async def save_conversation(params: FunctionCallParams):
    )
    try:
        with open(filename, "w") as file:
-            # todo: extract 'system' into the first message in the list
            messages = params.context.get_messages()
            # remove the last message, which is the instruction we just gave to save the conversation
            messages.pop()
--- a/examples/foundational/20d-persistent-context-gemini.py
+++ b/examples/foundational/20d-persistent-context-gemini.py
@@ -90,7 +90,6 @@ async def save_conversation(params: FunctionCallParams):
    )
    try:
        with open(filename, "w") as file:
-            # todo: extract 'system' into the first message in the list
            messages = params.context.get_messages()
            # remove the last message (the instruction to save the context)
            messages.pop()
--- a/examples/foundational/20e-persistent-context-aws-nova-sonic.py
+++ b/examples/foundational/20e-persistent-context-aws-nova-sonic.py
@@ -20,6 +20,8 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
@@ -75,7 +77,7 @@ async def save_conversation(params: FunctionCallParams):
    filename = f"{BASE_FILENAME}{timestamp}.json"
    try:
        with open(filename, "w") as file:
-            messages = params.context.get_messages_for_persistent_storage()
+            messages = params.context.get_messages()
            # remove the last few messages. in reverse order, they are:
            # - the in progress save tool call
            # - the invocation of the save tool call
@@ -223,13 +225,13 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
    llm.register_function("load_conversation", load_conversation)

-    context = OpenAILLMContext(
+    context = LLMContext(
        messages=[
            {"role": "system", "content": f"{system_instruction}"},
        ],
        tools=tools,
    )
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
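The persistent-context examples above now call `params.context.get_messages()` and then drop the trailing message(s) that belong to the save request itself before writing JSON. A minimal, standard-library-only sketch of that step follows. `save_messages` and its `drop_last` parameter are hypothetical names; the sketch slices a copy instead of calling `pop()`, on the assumption that the caller may not want the live message list mutated, and it uses hyphens in the timestamp (unlike the example's `%H:%M:%S`) so the filename stays portable.

```python
import json
import os
import tempfile
from datetime import datetime
from typing import Any, Dict, List


def save_messages(messages: List[Dict[str, Any]], directory: str, drop_last: int = 1) -> str:
    """Write messages to a timestamped JSON file, omitting the last `drop_last` entries.

    Hypothetical helper: it slices a copy rather than calling pop(), so the
    caller's list is left untouched.
    """
    keep = max(len(messages) - drop_last, 0)
    to_store = messages[:keep]
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    filename = os.path.join(directory, f"conversation_{timestamp}.json")
    with open(filename, "w") as file:
        json.dump(to_store, file, indent=2)
    return filename


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "user", "content": "Please save this conversation."},
]

# Write to a temporary directory and read the file back to check the result.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = save_messages(history, tmp_dir)
    with open(path) as file:
        restored = json.load(file)
```

For a speech-to-speech service such as the one in 20e, where several trailing messages relate to the in-progress tool call, `drop_last` would be set to however many messages that example removes.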
--- a/examples/foundational/26a-gemini-live-transcription.py
+++ b/examples/foundational/26a-gemini-live-transcription.py
@@ -16,7 +16,9 @@ from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
@@ -72,7 +74,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        # inference_on_context_initialization=False,
    )

-    context = OpenAILLMContext(
+    context = LLMContext(
        [
            {
                "role": "user",
@@ -90,7 +92,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            #     },
        ],
    )
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    transcript = TranscriptProcessor()

--- a/examples/foundational/26b-gemini-live-function-calling.py
+++ b/examples/foundational/26b-gemini-live-function-calling.py
@@ -19,7 +19,9 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
@@ -139,10 +141,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    llm.register_function("get_current_weather", fetch_weather_from_api)
    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)

-    context = OpenAILLMContext(
+    context = LLMContext(
        [{"role": "user", "content": "Say hello."}],
    )
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26c-gemini-live-video.py
+++ b/examples/foundational/26c-gemini-live-video.py
@@ -17,7 +17,9 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
@@ -65,7 +67,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        # inference_on_context_initialization=False,
    )

-    context = OpenAILLMContext(
+    context = LLMContext(
        [
            {
                "role": "user",
@@ -73,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            },
        ],
    )
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26d-gemini-live-text.py
+++ b/examples/foundational/26d-gemini-live-text.py
@@ -16,7 +16,8 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
@@ -109,8 +110,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    # Set up conversation context and management
    # The context_aggregator will automatically collect conversation context
-    context = OpenAILLMContext(messages)
-    context_aggregator = llm.create_context_aggregator(context)
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26e-gemini-live-google-search.py
+++ b/examples/foundational/26e-gemini-live-google-search.py
@@ -16,7 +16,9 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
@@ -90,7 +92,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        tools=tools,
    )

-    context = OpenAILLMContext(
+    context = LLMContext(
        [
            {
                "role": "user",
@@ -98,7 +100,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            }
        ],
    )
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26f-gemini-live-files-api.py
+++ b/examples/foundational/26f-gemini-live-files-api.py
@@ -16,7 +16,9 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
@@ -129,7 +131,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        mime_type = "text/plain"

        # Create context with file reference
-        context = OpenAILLMContext(
+        context = LLMContext(
            [
                {
                    "role": "user",
@@ -152,7 +154,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    except Exception as e:
        logger.error(f"Error uploading file: {e}")
        # Continue with a basic context if file upload fails
-        context = OpenAILLMContext(
+        context = LLMContext(
            [
                {
                    "role": "user",
@@ -162,7 +164,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        )

    # Create context aggregator
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    # Build the pipeline
    pipeline = Pipeline(
--- a/examples/foundational/26g-gemini-live-groundingMetadata.py
+++ b/examples/foundational/26g-gemini-live-groundingMetadata.py
@@ -10,7 +10,9 @@ from pipecat.frames.frames import Frame, LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
@@ -124,8 +126,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    ]

    # Set up conversation context and management
-    context = OpenAILLMContext(messages)
-    context_aggregator = llm.create_context_aggregator(context)
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26h-gemini-live-vertex-function-calling.py
+++ b/examples/foundational/26h-gemini-live-vertex-function-calling.py
@@ -9,21 +9,21 @@ import os
 from datetime import datetime

 from dotenv import load_dotenv
-from google.genai.types import HttpOptions
 from loguru import logger

 from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.services.google.gemini_live.llm_vertex import GeminiLiveVertexLLMService
 from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
@@ -139,10 +139,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    llm.register_function("get_current_weather", fetch_weather_from_api)
    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)

-    context = OpenAILLMContext(
-        [{"role": "user", "content": "Say hello."}],
-    )
-    context_aggregator = llm.create_context_aggregator(context)
+    context = LLMContext([{"role": "user", "content": "Say hello."}])
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26i-gemini-live-graceful-end.py
+++ b/examples/foundational/26i-gemini-live-graceful-end.py
@@ -18,7 +18,9 @@ from pipecat.frames.frames import EndTaskFrame, LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
@@ -62,7 +64,7 @@ You have three tools available to you:

 After you've responded to the user three times, do two things, in order:
 1. Politely let them know that that's all the time you have today and say goodbye.
-2. Call the end_conversation tool to gracefully end the conversation.
+2. *WITHOUT WAITING FOR THE USER TO RESPOND*, call the end_conversation tool to gracefully end the conversation.
 """


@@ -152,10 +154,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
    llm.register_function("end_conversation", end_conversation)

-    context = OpenAILLMContext(
+    context = LLMContext(
        [{"role": "user", "content": "Say hello."}],
    )
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/27-simli-layer.py
+++ b/examples/foundational/27-simli-layer.py
@@ -9,7 +9,6 @@ import os

 from dotenv import load_dotenv
 from loguru import logger
-from simli import SimliConfig

 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
@@ -66,11 +65,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="a167e0f3-df7e-4d52-a9c3-f949145efdab",
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",
    )

    simli_ai = SimliVideoService(
-        SimliConfig(os.getenv("SIMLI_API_KEY"), os.getenv("SIMLI_FACE_ID")),
+        api_key=os.getenv("SIMLI_API_KEY"),
+        face_id="cace3ef7-a4c4-425d-a8cf-a5358eb0c427",
    )

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o-mini")
--- a/examples/foundational/40-aws-nova-sonic.py
+++ b/examples/foundational/40-aws-nova-sonic.py
@@ -18,7 +18,8 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.aws.nova_sonic.llm import AWSNovaSonicLLMService
@@ -119,9 +120,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    llm.register_function("get_current_weather", fetch_weather_from_api)

    # Set up context and context management.
-    # AWSNovaSonicService will adapt OpenAI LLM context objects with standard message format to
-    # what's expected by Nova Sonic.
-    context = OpenAILLMContext(
+    context = LLMContext(
        messages=[
            {"role": "system", "content": f"{system_instruction}"},
            {
@@ -131,7 +130,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ],
        tools=tools,
    )
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = LLMContextAggregatorPair(context)

    # Build the pipeline
    pipeline = Pipeline(
--- a/examples/foundational/46-video-processing.py
+++ b/examples/foundational/46-video-processing.py
@@ -15,7 +15,9 @@ from pipecat.frames.frames import Frame, InputImageRawFrame, LLMRunFrame, Output
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.processors.frameworks.rtvi import RTVIObserver, RTVIProcessor
 from pipecat.runner.types import RunnerArguments
@@ -108,8 +110,8 @@ async def run_bot(pipecat_transport):
        }
    ]

-    context = OpenAILLMContext(messages)
-    context_aggregator = llm.create_context_aggregator(context)
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)

    # RTVI events for Pipecat client UI
    rtvi = RTVIProcessor()
--- a/examples/foundational/47-sentry-metrics.py
+++ b/examples/foundational/47-sentry-metrics.py
@@ -0,0 +1,142 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import os
+
+import sentry_sdk
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.metrics.sentry import SentryMetrics
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    # Initialize Sentry
+    sentry_sdk.init(
+        dsn=os.getenv("SENTRY_DSN"),
+        traces_sample_rate=1.0,
+    )
+
+    stt = DeepgramSTTService(
+        api_key=os.getenv("DEEPGRAM_API_KEY"),
+        metrics=SentryMetrics(),
+    )
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+        metrics=SentryMetrics(),
+    )
+
+    llm = OpenAILLMService(
+        api_key=os.getenv("OPENAI_API_KEY"),
+        metrics=SentryMetrics(),
+    )
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        # Kick off the conversation.
+        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
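The `transport_params` comment in the new example ("We store functions so objects ... don't get instantiated") describes a lazy-factory pattern: each dictionary value is a zero-argument callable, and only the one for the selected transport is ever called. The sketch below shows the same idea with standard-library code only. `FakeAnalyzer`, `lazy_params`, and `select_params` are hypothetical stand-ins, not Pipecat APIs.

```python
from typing import Any, Callable, Dict, List

created: List[str] = []  # records which analyzers were actually constructed


class FakeAnalyzer:
    """Stand-in for an expensive object such as a VAD analyzer (hypothetical)."""

    def __init__(self, transport_name: str):
        created.append(transport_name)
        self.transport_name = transport_name


# Each value is a zero-argument factory, so nothing is built at import time.
lazy_params: Dict[str, Callable[[], Dict[str, Any]]] = {
    "daily": lambda: {"audio_in_enabled": True, "vad_analyzer": FakeAnalyzer("daily")},
    "twilio": lambda: {"audio_in_enabled": True, "vad_analyzer": FakeAnalyzer("twilio")},
    "webrtc": lambda: {"audio_in_enabled": True, "vad_analyzer": FakeAnalyzer("webrtc")},
}


def select_params(name: str) -> Dict[str, Any]:
    """Call only the factory for the selected transport."""
    return lazy_params[name]()


params = select_params("webrtc")
```

Had the dictionary held constructed objects instead of lambdas, all three analyzers would be created on import even though only one transport is used per run.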
--- a/examples/foundational/48-service-switcher.py
+++ b/examples/foundational/48-service-switcher.py
@@ -0,0 +1,153 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame, ManuallySwitchServiceFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.service_switcher import ServiceSwitcher, ServiceSwitcherStrategyManual
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.stt import CartesiaSTTService
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.deepgram.tts import DeepgramTTSService
+from pipecat.services.google.llm import GoogleLLMService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    stt_cartesia = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))
+    stt_deepgram = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+    stt_switcher = ServiceSwitcher(
+        services=[stt_cartesia, stt_deepgram], strategy_type=ServiceSwitcherStrategyManual
+    )
+
+    tts_cartesia = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",
+    )
+    tts_deepgram = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+    tts_switcher = ServiceSwitcher(
+        services=[tts_cartesia, tts_deepgram], strategy_type=ServiceSwitcherStrategyManual
+    )
+
+    llm_openai = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+    llm_google = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
+    llm_switcher = ServiceSwitcher(
+        services=[llm_openai, llm_google], strategy_type=ServiceSwitcherStrategyManual
+    )
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt_switcher,
+            context_aggregator.user(),  # User responses
+            llm_switcher,  # LLM
+            tts_switcher,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        # Kick off the conversation.
+        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+        await task.queue_frames([LLMRunFrame()])
+        await asyncio.sleep(15)
+        logger.info(f"Switching to {stt_deepgram}")
+        await task.queue_frames([ManuallySwitchServiceFrame(service=stt_deepgram)])
+        await asyncio.sleep(15)
+        logger.info(f"Switching to {llm_google}")
+        await task.queue_frames([ManuallySwitchServiceFrame(service=llm_google)])
+        await asyncio.sleep(15)
+        logger.info(f"Switching to {tts_deepgram}")
+        await task.queue_frames([ManuallySwitchServiceFrame(service=tts_deepgram)])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
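
Review note: the example above wraps each pair of services in a `ServiceSwitcher` with `ServiceSwitcherStrategyManual` and changes the active one by queuing a `ManuallySwitchServiceFrame`. The toy below sketches only that routing idea in plain Python. `ToyService`, `ToySwitcher`, and `switch_to` are hypothetical names invented for illustration, and the rule that the first listed service starts out active is an assumption; none of this claims to describe how Pipecat implements the switcher.

```python
from dataclasses import dataclass, field


@dataclass
class ToyService:
    """Hypothetical stand-in for an STT, LLM, or TTS service."""

    name: str

    def process(self, item: str) -> str:
        return f"{self.name}:{item}"


@dataclass
class ToySwitcher:
    """Routes every item to exactly one active service (manual strategy)."""

    services: list[ToyService]
    active: ToyService = field(init=False)

    def __post_init__(self) -> None:
        if not self.services:
            raise ValueError("at least one service is required")
        # Assumption: the first service in the list starts out active.
        self.active = self.services[0]

    def switch_to(self, service: ToyService) -> None:
        # Only services handed to this switcher may become active.
        if not any(s is service for s in self.services):
            raise ValueError(f"{service.name} is not managed by this switcher")
        self.active = service

    def process(self, item: str) -> str:
        return self.active.process(item)


cartesia = ToyService("cartesia")
deepgram = ToyService("deepgram")
switcher = ToySwitcher([cartesia, deepgram])

before = switcher.process("hello")
switcher.switch_to(deepgram)
after = switcher.process("hello")
```

Keeping one switcher per service type, as the example does for STT, LLM, and TTS, means a switch request only needs to name the target service; a switcher that does not manage that service can reject or ignore it.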
--- a/examples/foundational/assets/cat.jpg
+++ b/examples/foundational/assets/cat.jpg
--- a/examples/foundational/assets/moondream.png
+++ b/examples/foundational/assets/moondream.png
--- a/examples/quickstart/README.md
+++ b/examples/quickstart/README.md
@@ -73,13 +73,13 @@ Transform your local bot into a production-ready service. Pipecat Cloud handles

 1. [Sign up for Pipecat Cloud](https://pipecat.daily.co/sign-up).

-2. Install the Pipecat Cloud CLI:
+2. Install the Pipecat CLI:

   ```bash
-   uv add pipecatcloud
+   uv tool install pipecat-ai-cli
   ```

-> 💡 Tip: You can run the `pipecatcloud` CLI using the `pcc` alias.
+> 💡 Tip: You can run the `pipecat` CLI using the `pc` alias.

 3. Set up Docker for building your bot image:

@@ -113,12 +113,22 @@ secret_set = "quickstart-secrets"

 > 💡 Tip: [Set up `image_credentials`](https://docs.pipecat.ai/deployment/pipecat-cloud/fundamentals/secrets#image-pull-secrets) in your TOML file for authenticated image pulls

+### Log in to Pipecat Cloud
+
+To start using the CLI, authenticate to Pipecat Cloud:
+
+```bash
+pipecat cloud auth login
+```
+
+You'll be presented with a link that you can click to authenticate your client.
+
 ### Configure secrets

 Upload your API keys to Pipecat Cloud's secure storage:

 ```bash
-uv run pcc secrets set quickstart-secrets --file .env
+pipecat cloud secrets set quickstart-secrets --file .env
 ```

 This creates a secret set called `quickstart-secrets` (matching your TOML file) and uploads all your API keys from `.env`.
@@ -128,13 +138,13 @@ This creates a secret set called `quickstart-secrets` (matching your TOML file)
 Build your Docker image and push to Docker Hub:

 ```bash
-uv run pcc docker build-push
+pipecat cloud docker build-push
 ```

 Deploy to Pipecat Cloud:

 ```bash
-uv run pcc deploy
+pipecat cloud deploy
 ```

 ### Connect to your agent
--- a/examples/quickstart/pcc-deploy.toml
+++ b/examples/quickstart/pcc-deploy.toml
@@ -1,6 +1,11 @@
 agent_name = "quickstart"
 image = "your_username/quickstart:0.1"
 secret_set = "quickstart-secrets"
+agent_profile = "agent-1x"
+
+# RECOMMENDED: Set an image pull secret:
+# https://docs.pipecat.ai/deployment/pipecat-cloud/fundamentals/secrets#image-pull-secrets
+# image_credentials = "your_image_pull_secret"

 [scaling]
 	min_agents = 1
--- a/examples/quickstart/pyproject.toml
+++ b/examples/quickstart/pyproject.toml
@@ -4,13 +4,14 @@ version = "0.1.0"
 description = "Quickstart example for building voice AI bots with Pipecat"
 requires-python = ">=3.10"
 dependencies = [
-    "pipecat-ai[webrtc,daily,silero,deepgram,openai,cartesia,local-smart-turn-v3,runner]>=0.0.86",
-    "pipecatcloud>=0.2.4"
+    "pipecat-ai[webrtc,daily,silero,deepgram,openai,cartesia,local-smart-turn-v3,runner]",
+    "pipecat-ai-cli"
 ]

 [dependency-groups]
 dev = [
-    "ruff~=0.12.1",
+    "pyright>=1.1.404,<2",
+    "ruff>=0.12.11,<1",
 ]

 [tool.ruff]
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -34,7 +34,7 @@ dependencies = [
    "pyloudnorm~=0.1.1",
    "resampy~=0.4.3",
    "soxr~=0.5.0",
-    "openai>=1.74.0,<=1.99.1",
+    "openai>=1.74.0,<3",
    # Pinning numba to resolve package dependencies
    "numba==0.61.2",
    "wait_for2>=0.4.1; python_version<'3.12'",
@@ -50,12 +50,12 @@ anthropic = [ "anthropic~=0.49.0" ]
 assemblyai = [ "pipecat-ai[websockets-base]" ]
 asyncai = [ "pipecat-ai[websockets-base]" ]
 aws = [ "aioboto3~=15.0.0", "pipecat-ai[websockets-base]" ]
-aws-nova-sonic = [ "aws_sdk_bedrock_runtime~=0.1.0; python_version>='3.12'" ]
+aws-nova-sonic = [ "aws_sdk_bedrock_runtime~=0.1.1; python_version>='3.12'" ]
 azure = [ "azure-cognitiveservices-speech~=1.42.0"]
 cartesia = [ "cartesia~=2.0.3", "pipecat-ai[websockets-base]" ]
 cerebras = []
 deepseek = []
-daily = [ "daily-python~=0.19.9" ]
+daily = [ "daily-python~=0.21.0" ]
 deepgram = [ "deepgram-sdk~=4.7.0" ]
 elevenlabs = [ "pipecat-ai[websockets-base]" ]
 fal = [ "fal-client~=0.5.9" ]
@@ -84,7 +84,7 @@ nim = []
 neuphonic = [ "pipecat-ai[websockets-base]" ]
 noisereduce = [ "noisereduce~=3.0.3" ]
 openai = [ "pipecat-ai[websockets-base]" ]
-openpipe = [ "openpipe~=4.50.0" ]
+openpipe = [ "openpipe>=4.50.0,<6" ]
 openrouter = []
 perplexity = []
 playht = [ "pipecat-ai[websockets-base]" ]
@@ -102,7 +102,7 @@ silero = [ "onnxruntime>=1.20.1,<2" ]
 simli = [ "simli-ai~=0.1.10"]
 soniox = [ "pipecat-ai[websockets-base]" ]
 soundfile = [ "soundfile~=0.13.0" ]
-speechmatics = [ "speechmatics-rt>=0.4.0" ]
+speechmatics = [ "speechmatics-rt>=0.5.0" ]
 strands = [ "strands-agents>=1.9.1,<2" ]
 tavus=[]
 together = []
--- a/scripts/evals/eval.py
+++ b/scripts/evals/eval.py
@@ -10,9 +10,10 @@ import os
 import re
 import time
 import wave
+from dataclasses import dataclass
 from datetime import datetime
 from pathlib import Path
-from typing import List, Optional, Tuple
+from typing import Any, List, Optional, Tuple

 import aiofiles
 from deepgram import LiveOptions
@@ -53,6 +54,14 @@ EVAL_TIMEOUT_SECS = 120
 EvalPrompt = str | Tuple[str, ImageFile]


+@dataclass
+class EvalConfig:
+    prompt: EvalPrompt
+    eval: str
+    eval_speaks_first: bool = False
+    runner_args_body: Optional[Any] = None
+
+
 class EvalRunner:
    def __init__(
        self,
@@ -93,9 +102,7 @@ class EvalRunner:
    async def run_eval(
        self,
        example_file: str,
-        prompt: EvalPrompt,
-        eval: str,
-        user_speaks_first: bool = False,
+        eval_config: EvalConfig,
    ):
        if not re.match(self._pattern, example_file):
            return
@@ -112,10 +119,8 @@ class EvalRunner:

        try:
            tasks = [
-                asyncio.create_task(run_example_pipeline(script_path)),
-                asyncio.create_task(
-                    run_eval_pipeline(self, example_file, prompt, eval, user_speaks_first)
-                ),
+                asyncio.create_task(run_example_pipeline(script_path, eval_config)),
+                asyncio.create_task(run_eval_pipeline(self, example_file, eval_config)),
            ]
            _, pending = await asyncio.wait(tasks, timeout=EVAL_TIMEOUT_SECS)
            if pending:
@@ -177,7 +182,7 @@ class EvalRunner:
        return os.path.join(self._recordings_dir, f"{base_name}.wav")


-async def run_example_pipeline(script_path: Path):
+async def run_example_pipeline(script_path: Path, eval_config: EvalConfig):
    room_url = os.getenv("DAILY_SAMPLE_ROOM_URL")

    module = load_module_from_path(script_path)
@@ -196,6 +201,7 @@ async def run_example_pipeline(script_path: Path):

    runner_args = RunnerArguments()
    runner_args.pipeline_idle_timeout_secs = PIPELINE_IDLE_TIMEOUT_SECS
+    runner_args.body = eval_config.runner_args_body

    await module.run_bot(transport, runner_args)

@@ -203,9 +209,7 @@ async def run_example_pipeline(script_path: Path):
 async def run_eval_pipeline(
    eval_runner: EvalRunner,
    example_file: str,
-    prompt: EvalPrompt,
-    eval: str,
-    user_speaks_first: bool = False,
+    eval_config: EvalConfig,
 ):
    logger.info(f"Starting eval bot")

@@ -262,17 +266,16 @@ async def run_eval_pipeline(
    # Load example prompt depending on image.
    example_prompt = ""
    example_image: Optional[ImageFile] = None
-    if isinstance(prompt, str):
-        example_prompt = prompt
-    elif isinstance(prompt, tuple):
-        example_prompt, example_image = prompt
+    if isinstance(eval_config.prompt, str):
+        example_prompt = eval_config.prompt
+    elif isinstance(eval_config.prompt, tuple):
+        example_prompt, example_image = eval_config.prompt

-    eval_prompt = f"The answer is correct if it matches: {eval}."
    common_system_prompt = (
        "The user might say things other than the answer and that's allowed. "
-        f"You should only call the eval function with your assessment when the user actually answers the question. {eval_prompt}"
+        f"You should only call the eval function when the user: {eval_config.eval}"
    )
-    if user_speaks_first:
+    if eval_config.eval_speaks_first:
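-        system_prompt = f"You are an LLM eval, be extremly brief. You will start the conversation by saying: '{example_prompt}'. {common_system_prompt}"
+        system_prompt = f"You are an LLM eval, be extremely brief. You will start the conversation by saying: '{example_prompt}'. {common_system_prompt}"
     else:
-        system_prompt = f"You are an LLM eval, be extremly brief. Your goal is to first ask one question: {example_prompt}. {common_system_prompt}"
+        system_prompt = f"You are an LLM eval, be extremely brief. Your goal is to first ask one question: {example_prompt}. {common_system_prompt}"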
@@ -330,9 +333,9 @@ async def run_eval_pipeline(

        # Default behavior is for the bot to speak first
        # If the eval bot speaks first, we append the prompt to the messages
-        if user_speaks_first:
+        if eval_config.eval_speaks_first:
            messages.append(
-                {"role": "user", "content": f"Start by saying this exactly: '{prompt}'"}
+                {"role": "user", "content": f"Start by saying this exactly: '{eval_config.prompt}'"}
            )
            await task.queue_frames([LLMRunFrame()])

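
Review note: the `eval.py` changes above fold `prompt`, `eval`, and the speaking order into a single `EvalConfig`. The sketch below restates that dataclass and the system-prompt assembly as a standalone snippet so the two branches can be exercised without a pipeline. `build_system_prompt` is a hypothetical helper name, not a function in the PR, and `Any` stands in for the `ImageFile` type so the snippet has no dependencies.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple, Union

# Stand-in for `str | Tuple[str, ImageFile]` from eval.py; Any replaces ImageFile here.
EvalPrompt = Union[str, Tuple[str, Any]]


@dataclass
class EvalConfig:
    prompt: EvalPrompt
    eval: str
    eval_speaks_first: bool = False
    runner_args_body: Optional[Any] = None


def build_system_prompt(config: EvalConfig) -> str:
    """Hypothetical helper mirroring the prompt assembly in run_eval_pipeline."""
    # A tuple prompt carries (text, image); only the text goes into the prompt.
    text = config.prompt if isinstance(config.prompt, str) else config.prompt[0]
    common = (
        "The user might say things other than the answer and that's allowed. "
        f"You should only call the eval function when the user: {config.eval}"
    )
    if config.eval_speaks_first:
        return (
            "You are an LLM eval, be extremely brief. "
            f"You will start the conversation by saying: '{text}'. {common}"
        )
    return (
        "You are an LLM eval, be extremely brief. "
        f"Your goal is to first ask one question: {text}. {common}"
    )


voicemail = EvalConfig(
    prompt="Please leave a message.",
    eval="The user leaves a voicemail message.",
    eval_speaks_first=True,
)
simple_math = EvalConfig(
    prompt="A simple math addition.",
    eval="The user answers the math addition correctly.",
)

voicemail_prompt = build_system_prompt(voicemail)
math_prompt = build_system_prompt(simple_math)
```

One thing to watch in the real code: the "Start by saying this exactly" message formats `eval_config.prompt` directly, so a tuple prompt combined with `eval_speaks_first=True` would embed the tuple's repr. None of the configs visible in this diff combine the two.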
--- a/scripts/evals/run-release-evals.py
+++ b/scripts/evals/run-release-evals.py
@@ -11,7 +11,7 @@ from datetime import datetime, timezone
 from pathlib import Path

 from dotenv import load_dotenv
-from eval import EvalRunner
+from eval import EvalConfig, EvalRunner
 from loguru import logger
 from PIL import Image
 from utils import check_env_variables
@@ -24,188 +24,184 @@ ASSETS_DIR = SCRIPT_DIR / "assets"

 FOUNDATIONAL_DIR = SCRIPT_DIR.parent.parent / "examples" / "foundational"

-# Speaking order constants
-USER_SPEAKS_FIRST = True
-BOT_SPEAKS_FIRST = False
-
-# Math
-PROMPT_SIMPLE_MATH = "A simple math addition."
-EVAL_SIMPLE_MATH = "Correct math addition."
-
-# Weather
-PROMPT_WEATHER = "What's the weather in San Francisco?"
-EVAL_WEATHER = (
-    "Something specific about the current weather in San Francisco, including the degrees."
+EVAL_SIMPLE_MATH = EvalConfig(
+    prompt="A simple math addition.",
+    eval="The user answers the math addition correctly.",
 )

-# Online search
-PROMPT_ONLINE_SEARCH = "What's the date right now in London?"
-EVAL_ONLINE_SEARCH = f"Today is {datetime.now(timezone.utc).strftime('%B %d, %Y')}."
+EVAL_WEATHER = EvalConfig(
+    prompt="What's the weather in San Francisco?",
+    eval="The user says something specific about the current weather in San Francisco, including the degrees.",
+)

-# Switch language
-PROMPT_SWITCH_LANGUAGE = "Say something in Spanish."
-EVAL_SWITCH_LANGUAGE = "The user is now talking in Spanish."
+EVAL_ONLINE_SEARCH = EvalConfig(
+    prompt="What's the date right now in London?",
+    eval=f"The user says today is {datetime.now(timezone.utc).strftime('%B %d, %Y')} in London.",
+)

-# Vision
-PROMPT_VISION = ("What do you see?", Image.open(ASSETS_DIR / "cat.jpg"))
-EVAL_VISION = "A cat description."
+EVAL_SWITCH_LANGUAGE = EvalConfig(
+    prompt="Say something in Spanish.",
+    eval="The user talks in Spanish.",
+)
+
+EVAL_VISION_CAMERA = EvalConfig(
+    prompt=("Briefly describe what you see.", Image.open(ASSETS_DIR / "cat.jpg")),
+    eval="The user provides a cat description.",
+)
+
+
+def EVAL_VISION_IMAGE(*, eval_speaks_first: bool = False):
+    return EvalConfig(
+        prompt="Briefly describe this image.",
+        eval="The user provides a cat description.",
+        eval_speaks_first=eval_speaks_first,
+        runner_args_body={
+            "image_path": ASSETS_DIR / "cat.jpg",
+            "question": "Briefly describe this image.",
+        },
+    )
+
+
+EVAL_VOICEMAIL = EvalConfig(
+    prompt="Please leave a message.",
+    eval="The user leaves a voicemail message.",
+    eval_speaks_first=True,
+)
+
+EVAL_CONVERSATION = EvalConfig(
+    prompt="Hello, this is Mark.",
+    eval="The user replies with a greeting.",
+    eval_speaks_first=True,
+)

-# Voicemail
-PROMPT_VOICEMAIL = "Please leave a message after the beep."
-EVAL_VOICEMAIL = "Assess the conversation and determine if it is a voicemail."
-PROMPT_CONVERSATION = "Hello, this is Mark."
-EVAL_CONVERSATION = "A start of a conversation, not a voicemail."

 TESTS_07 = [
    # 07 series
-    ("07-interruptible.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07-interruptible-cartesia-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07a-interruptible-speechmatics.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07aa-interruptible-soniox.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07ab-interruptible-inworld-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07ac-interruptible-asyncai.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07ac-interruptible-asyncai-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07b-interruptible-langchain.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07c-interruptible-deepgram.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07c-interruptible-deepgram-flux.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07d-interruptible-elevenlabs.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    (
-        "07d-interruptible-elevenlabs-http.py",
-        PROMPT_SIMPLE_MATH,
-        EVAL_SIMPLE_MATH,
-        BOT_SPEAKS_FIRST,
-    ),
-    ("07f-interruptible-azure.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07g-interruptible-openai.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07h-interruptible-openpipe.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07j-interruptible-gladia.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07k-interruptible-lmnt.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07l-interruptible-groq.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07m-interruptible-aws.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07m-interruptible-aws-strands.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("07n-interruptible-gemini.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07n-interruptible-google.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07o-interruptible-assemblyai.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07q-interruptible-rime.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07q-interruptible-rime-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07r-interruptible-riva-nim.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    (
-        "07s-interruptible-google-audio-in.py",
-        PROMPT_SIMPLE_MATH,
-        EVAL_SIMPLE_MATH,
-        BOT_SPEAKS_FIRST,
-    ),
-    ("07t-interruptible-fish.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07v-interruptible-neuphonic.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07v-interruptible-neuphonic-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07w-interruptible-fal.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07y-interruptible-minimax.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07z-interruptible-sarvam.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    ("07ae-interruptible-hume.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07-interruptible.py", EVAL_SIMPLE_MATH),
+    ("07-interruptible-cartesia-http.py", EVAL_SIMPLE_MATH),
+    ("07a-interruptible-speechmatics.py", EVAL_SIMPLE_MATH),
+    ("07aa-interruptible-soniox.py", EVAL_SIMPLE_MATH),
+    ("07ab-interruptible-inworld-http.py", EVAL_SIMPLE_MATH),
+    ("07ac-interruptible-asyncai.py", EVAL_SIMPLE_MATH),
+    ("07ac-interruptible-asyncai-http.py", EVAL_SIMPLE_MATH),
+    ("07b-interruptible-langchain.py", EVAL_SIMPLE_MATH),
+    ("07c-interruptible-deepgram.py", EVAL_SIMPLE_MATH),
+    ("07c-interruptible-deepgram-flux.py", EVAL_SIMPLE_MATH),
+    ("07c-interruptible-deepgram-http.py", EVAL_SIMPLE_MATH),
+    ("07d-interruptible-elevenlabs.py", EVAL_SIMPLE_MATH),
+    ("07d-interruptible-elevenlabs-http.py", EVAL_SIMPLE_MATH),
+    ("07f-interruptible-azure.py", EVAL_SIMPLE_MATH),
+    ("07g-interruptible-openai.py", EVAL_SIMPLE_MATH),
+    ("07h-interruptible-openpipe.py", EVAL_SIMPLE_MATH),
+    ("07j-interruptible-gladia.py", EVAL_SIMPLE_MATH),
+    ("07k-interruptible-lmnt.py", EVAL_SIMPLE_MATH),
+    ("07l-interruptible-groq.py", EVAL_SIMPLE_MATH),
+    ("07m-interruptible-aws.py", EVAL_SIMPLE_MATH),
+    ("07m-interruptible-aws-strands.py", EVAL_WEATHER),
+    ("07n-interruptible-gemini.py", EVAL_SIMPLE_MATH),
+    ("07n-interruptible-google.py", EVAL_SIMPLE_MATH),
+    ("07o-interruptible-assemblyai.py", EVAL_SIMPLE_MATH),
+    ("07q-interruptible-rime.py", EVAL_SIMPLE_MATH),
+    ("07q-interruptible-rime-http.py", EVAL_SIMPLE_MATH),
+    ("07r-interruptible-riva-nim.py", EVAL_SIMPLE_MATH),
+    ("07s-interruptible-google-audio-in.py", EVAL_SIMPLE_MATH),
+    ("07t-interruptible-fish.py", EVAL_SIMPLE_MATH),
+    ("07v-interruptible-neuphonic.py", EVAL_SIMPLE_MATH),
+    ("07v-interruptible-neuphonic-http.py", EVAL_SIMPLE_MATH),
+    ("07w-interruptible-fal.py", EVAL_SIMPLE_MATH),
+    ("07y-interruptible-minimax.py", EVAL_SIMPLE_MATH),
+    ("07z-interruptible-sarvam.py", EVAL_SIMPLE_MATH),
+    ("07ae-interruptible-hume.py", EVAL_SIMPLE_MATH),
    # Needs a local XTTS docker instance running.
-    # ("07i-interruptible-xtts.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    # ("07i-interruptible-xtts.py", EVAL_SIMPLE_MATH),
    # Needs a Krisp license.
-    # ("07p-interruptible-krisp.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    # ("07p-interruptible-krisp.py", EVAL_SIMPLE_MATH),
    # Needs GPU resources.
-    # ("07u-interruptible-ultravox.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    # ("07u-interruptible-ultravox.py", EVAL_SIMPLE_MATH),
 ]

 TESTS_12 = [
-    ("12-describe-video.py", PROMPT_VISION, EVAL_VISION, BOT_SPEAKS_FIRST),
-    ("12a-describe-video-gemini-flash.py", PROMPT_VISION, EVAL_VISION, BOT_SPEAKS_FIRST),
-    ("12b-describe-video-gpt-4o.py", PROMPT_VISION, EVAL_VISION, BOT_SPEAKS_FIRST),
-    ("12c-describe-video-anthropic.py", PROMPT_VISION, EVAL_VISION, BOT_SPEAKS_FIRST),
+    ("12-describe-image-openai.py", EVAL_VISION_IMAGE(eval_speaks_first=True)),
+    ("12a-describe-image-anthropic.py", EVAL_VISION_IMAGE(eval_speaks_first=True)),
+    ("12b-describe-image-aws.py", EVAL_VISION_IMAGE(eval_speaks_first=True)),
+    ("12c-describe-image-gemini-flash.py", EVAL_VISION_IMAGE(eval_speaks_first=True)),
+    ("12d-describe-image-moondream.py", EVAL_VISION_IMAGE()),
 ]

 TESTS_14 = [
-    ("14-function-calling.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14a-function-calling-anthropic.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14b-function-calling-anthropic-video.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14d-function-calling-video.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14e-function-calling-google.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14f-function-calling-groq.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14g-function-calling-grok.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14h-function-calling-azure.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14i-function-calling-fireworks.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14j-function-calling-nim.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14k-function-calling-cerebras.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14m-function-calling-openrouter.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14n-function-calling-perplexity.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14p-function-calling-gemini-vertex-ai.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14q-function-calling-qwen.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14r-function-calling-aws.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14v-function-calling-openai.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("14w-function-calling-mistral.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14-function-calling.py", EVAL_WEATHER),
+    ("14a-function-calling-anthropic.py", EVAL_WEATHER),
+    ("14e-function-calling-google.py", EVAL_WEATHER),
+    ("14f-function-calling-groq.py", EVAL_WEATHER),
+    ("14g-function-calling-grok.py", EVAL_WEATHER),
+    ("14h-function-calling-azure.py", EVAL_WEATHER),
+    ("14i-function-calling-fireworks.py", EVAL_WEATHER),
+    ("14j-function-calling-nim.py", EVAL_WEATHER),
+    ("14k-function-calling-cerebras.py", EVAL_WEATHER),
+    ("14m-function-calling-openrouter.py", EVAL_WEATHER),
+    ("14n-function-calling-perplexity.py", EVAL_WEATHER),
+    ("14p-function-calling-gemini-vertex-ai.py", EVAL_WEATHER),
+    ("14q-function-calling-qwen.py", EVAL_WEATHER),
+    ("14r-function-calling-aws.py", EVAL_WEATHER),
+    ("14v-function-calling-openai.py", EVAL_WEATHER),
+    ("14w-function-calling-mistral.py", EVAL_WEATHER),
+    ("14x-function-calling-openpipe.py", EVAL_WEATHER),
+    # Video
+    ("14d-function-calling-anthropic-video.py", EVAL_VISION_CAMERA),
+    ("14d-function-calling-aws-video.py", EVAL_VISION_CAMERA),
+    ("14d-function-calling-gemini-flash-video.py", EVAL_VISION_CAMERA),
+    ("14d-function-calling-moondream-video.py", EVAL_VISION_CAMERA),
+    ("14d-function-calling-openai-video.py", EVAL_VISION_CAMERA),
    # Currently not working.
-    # ("14c-function-calling-together.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    # ("14l-function-calling-deepseek.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    # ("14o-function-calling-gemini-openai-format.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    # ("14c-function-calling-together.py", EVAL_WEATHER),
+    # ("14l-function-calling-deepseek.py", EVAL_WEATHER),
+    # ("14o-function-calling-gemini-openai-format.py", EVAL_WEATHER),
 ]

 TESTS_15 = [
-    ("15a-switch-languages.py", PROMPT_SWITCH_LANGUAGE, EVAL_SWITCH_LANGUAGE, BOT_SPEAKS_FIRST),
+    ("15a-switch-languages.py", EVAL_SWITCH_LANGUAGE),
 ]

 TESTS_19 = [
-    ("19-openai-realtime.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("19-openai-realtime-beta.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("19-openai-realtime.py", EVAL_WEATHER),
+    ("19-openai-realtime-beta.py", EVAL_WEATHER),
    # OpenAI Realtime not released on Azure yet
-    # ("19a-azure-realtime.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("19a-azure-realtime-beta.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("19b-openai-realtime-text.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
-    ("19b-openai-realtime-beta-text.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    # ("19a-azure-realtime.py", EVAL_WEATHER),
+    ("19a-azure-realtime-beta.py", EVAL_WEATHER),
+    ("19b-openai-realtime-text.py", EVAL_WEATHER),
+    ("19b-openai-realtime-beta-text.py", EVAL_WEATHER),
 ]

 TESTS_21 = [
-    ("21a-tavus-video-service.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("21a-tavus-video-service.py", EVAL_SIMPLE_MATH),
 ]

 TESTS_26 = [
-    ("26-gemini-multimodal-live.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    (
-        "26a-gemini-live-transcription.py",
-        PROMPT_SIMPLE_MATH,
-        EVAL_SIMPLE_MATH,
-        BOT_SPEAKS_FIRST,
-    ),
-    (
-        "26b-gemini-live-function-calling.py",
-        PROMPT_WEATHER,
-        EVAL_WEATHER,
-        BOT_SPEAKS_FIRST,
-    ),
-    ("26c-gemini-live-video.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    (
-        "26e-gemini-multimodal-google-search.py",
-        PROMPT_ONLINE_SEARCH,
-        EVAL_ONLINE_SEARCH,
-        BOT_SPEAKS_FIRST,
-    ),
+    ("26-gemini-live.py", EVAL_SIMPLE_MATH),
+    ("26a-gemini-live-transcription.py", EVAL_SIMPLE_MATH),
+    ("26b-gemini-live-function-calling.py", EVAL_WEATHER),
+    ("26c-gemini-live-video.py", EVAL_SIMPLE_MATH),
+    ("26e-gemini-live-google-search.py", EVAL_ONLINE_SEARCH),
+    ("26h-gemini-live-vertex-function-calling.py", EVAL_WEATHER),
    # Currently not working.
-    # ("26d-gemini-live-text.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
-    (
-        "26h-gemini-live-vertex-function-calling.py",
-        PROMPT_WEATHER,
-        EVAL_WEATHER,
-        BOT_SPEAKS_FIRST,
-    ),
+    # ("26d-gemini-live-text.py", EVAL_SIMPLE_MATH),
 ]

 TESTS_27 = [
-    ("27-simli-layer.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("27-simli-layer.py", EVAL_SIMPLE_MATH),
 ]

 TESTS_40 = [
-    ("40-aws-nova-sonic.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("40-aws-nova-sonic.py", EVAL_SIMPLE_MATH),
 ]

 TESTS_43 = [
-    ("43a-heygen-video-service.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("43a-heygen-video-service.py", EVAL_SIMPLE_MATH),
 ]

 TESTS_44 = [
-    ("44-voicemail-detection.py", PROMPT_VOICEMAIL, EVAL_VOICEMAIL, USER_SPEAKS_FIRST),
-    ("44-voicemail-detection.py", PROMPT_CONVERSATION, EVAL_CONVERSATION, USER_SPEAKS_FIRST),
+    ("44-voicemail-detection.py", EVAL_VOICEMAIL),
+    ("44-voicemail-detection.py", EVAL_CONVERSATION),
 ]

 TESTS = [
@@ -243,9 +239,9 @@ async def main(args: argparse.Namespace):

-    # Parse test config: (test, prompt, eval, user_speaks_first)
+    # Parse test config: (test, eval_config)
    for test_config in TESTS:
-        test, prompt, eval, user_speaks_first = test_config
+        test, eval_config = test_config

-        await runner.run_eval(test, prompt, eval, user_speaks_first)
+        await runner.run_eval(test, eval_config)

    runner.print_results()

--- a/src/pipecat/adapters/schemas/tools_schema.py
+++ b/src/pipecat/adapters/schemas/tools_schema.py
@@ -22,9 +22,12 @@ class AdapterType(Enum):

    Parameters:
        GEMINI: Google Gemini adapter - currently the only service supporting custom tools.
+        SHIM: Backward compatibility shim for creating ToolsSchemas from lists of tools in
+              any format, used by LLMContext.from_openai_context.
    """

    GEMINI = "gemini"  # that is the only service where we are able to add custom tools for now
+    SHIM = "shim"  # for use as backward compatibility shim for creating ToolsSchemas from list of tools in any format


 class ToolsSchema:
--- a/src/pipecat/adapters/services/anthropic_adapter.py
+++ b/src/pipecat/adapters/services/anthropic_adapter.py
@@ -110,7 +110,7 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
        system = NOT_GIVEN
        messages = []

-        # first, map messages using self._from_universal_context_message(m)
+        # First, map messages using self._from_universal_context_message(m)
        try:
            messages = [self._from_universal_context_message(m) for m in universal_context_messages]
        except Exception as e:
@@ -245,13 +245,25 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
                    item["text"] = "(empty)"
                # handle image_url -> image conversion
                if item["type"] == "image_url":
-                    item["type"] = "image"
-                    item["source"] = {
-                        "type": "base64",
-                        "media_type": "image/jpeg",
-                        "data": item["image_url"]["url"].split(",")[1],
-                    }
-                    del item["image_url"]
+                    if item["image_url"]["url"].startswith("data:"):
+                        item["type"] = "image"
+                        item["source"] = {
+                            "type": "base64",
+                            "media_type": "image/jpeg",
+                            "data": item["image_url"]["url"].split(",")[1],
+                        }
+                        del item["image_url"]
+                    elif item["image_url"]["url"].startswith("http"):
+                        item["type"] = "image"
+                        item["source"] = {
+                            "type": "url",
+                            "url": item["image_url"]["url"],
+                        }
+                        del item["image_url"]
+                    else:
+                        url = item["image_url"]["url"]
+                        logger.warning(f"Unsupported 'image_url': {url}")
+
            # In the case where there's a single image in the list (like what
            # would result from a UserImageRawFrame), ensure that the image
            # comes before text, as recommended by Anthropic docs
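Review note: the three branches added in the hunk above can be read in isolation as a small pure function. This is a minimal sketch, not the adapter's code: `convert_image_url_item` is a hypothetical helper name, and it returns a new dict rather than mutating the item in place. It keeps the hunk's hard-coded `image/jpeg` media type.

```python
from typing import Any, Dict, Optional


def convert_image_url_item(item: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper mirroring the three branches in the hunk above."""
    url = item["image_url"]["url"]
    if url.startswith("data:"):
        # Data URL: keep only the base64 payload after the first comma.
        return {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/jpeg",  # hard-coded, as in the hunk
                "data": url.split(",")[1],
            },
        }
    if url.startswith("http"):
        # Remote image: pass the URL through as a "url" source.
        return {"type": "image", "source": {"type": "url", "url": url}}
    # Unsupported scheme; the hunk logs a warning and leaves the item as-is.
    return None
```

Returning `None` for the unsupported case is a choice made for this sketch only; the adapter itself leaves the original `image_url` item in the message.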
--- a/src/pipecat/adapters/services/aws_nova_sonic_adapter.py
+++ b/src/pipecat/adapters/services/aws_nova_sonic_adapter.py
@@ -6,13 +6,47 @@

 """AWS Nova Sonic LLM adapter for Pipecat."""

+import copy
 import json
-from typing import Any, Dict, List, TypedDict
+from dataclasses import dataclass
+from enum import Enum
+from typing import Any, Dict, List, Optional, TypedDict
+
+from loguru import logger

 from pipecat.adapters.base_llm_adapter import BaseLLMAdapter
 from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import ToolsSchema
-from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
+from pipecat.processors.aggregators.llm_context import LLMContext, LLMContextMessage
+
+
+class Role(Enum):
+    """Roles supported in AWS Nova Sonic conversations.
+
+    Parameters:
+        SYSTEM: System-level messages (not used in conversation history).
+        USER: Messages sent by the user.
+        ASSISTANT: Messages sent by the assistant.
+        TOOL: Messages sent by tools (not used in conversation history).
+    """
+
+    SYSTEM = "SYSTEM"
+    USER = "USER"
+    ASSISTANT = "ASSISTANT"
+    TOOL = "TOOL"
+
+
+@dataclass
+class AWSNovaSonicConversationHistoryMessage:
+    """A single message in AWS Nova Sonic conversation history.
+
+    Parameters:
+        role: The role of the message sender (USER or ASSISTANT only).
+        text: The text content of the message.
+    """
+
+    role: Role  # only USER and ASSISTANT
+    text: str


 class AWSNovaSonicLLMInvocationParams(TypedDict):
@@ -21,7 +55,9 @@ class AWSNovaSonicLLMInvocationParams(TypedDict):
    This is a placeholder until support for universal LLMContext machinery is added for AWS Nova Sonic.
    """

-    pass
+    system_instruction: Optional[str]
+    messages: List[AWSNovaSonicConversationHistoryMessage]
+    tools: List[Dict[str, Any]]


 class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
@@ -34,7 +70,7 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
    @property
    def id_for_llm_specific_messages(self) -> str:
        """Get the identifier used in LLMSpecificMessage instances for AWS Nova Sonic."""
-        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")
+        return "aws-nova-sonic"

    def get_llm_invocation_params(self, context: LLMContext) -> AWSNovaSonicLLMInvocationParams:
        """Get AWS Nova Sonic-specific LLM invocation parameters from a universal LLM context.
@@ -47,7 +83,13 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
        Returns:
            Dictionary of parameters for invoking AWS Nova Sonic's LLM API.
        """
-        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")
+        messages = self._from_universal_context_messages(self.get_messages(context))
+        return {
+            "system_instruction": messages.system_instruction,
+            "messages": messages.messages,
+            # NOTE: LLMContext's tools are guaranteed to be a ToolsSchema (or NOT_GIVEN)
+            "tools": self.from_standard_tools(context.tools) or [],
+        }

    def get_messages_for_logging(self, context) -> List[Dict[str, Any]]:
        """Get messages from a universal LLM context in a format ready for logging about AWS Nova Sonic.
@@ -62,7 +104,75 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
        Returns:
            List of messages in a format ready for logging about AWS Nova Sonic.
        """
-        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")
+        return self._from_universal_context_messages(self.get_messages(context)).messages
+
+    @dataclass
+    class ConvertedMessages:
+        """Container for AWS Nova Sonic-formatted messages converted from universal context."""
+
+        messages: List[AWSNovaSonicConversationHistoryMessage]
+        system_instruction: Optional[str] = None
+
+    def _from_universal_context_messages(
+        self, universal_context_messages: List[LLMContextMessage]
+    ) -> ConvertedMessages:
+        system_instruction = None
+        messages = []
+
+        # Bail if there are no messages
+        if not universal_context_messages:
+            return self.ConvertedMessages(messages=[])
+
+        universal_context_messages = copy.deepcopy(universal_context_messages)
+
+        # If we have a "system" message as our first message, let's pull that out into "instruction"
+        if universal_context_messages[0].get("role") == "system":
+            system = universal_context_messages.pop(0)
+            content = system.get("content")
+            if isinstance(content, str):
+                system_instruction = content
+            elif isinstance(content, list):
+                system_instruction = content[0].get("text")
+            if system_instruction:
+                self._system_instruction = system_instruction
+
+        # Process remaining messages to fill out conversation history.
+        # Nova Sonic supports "user" and "assistant" messages in history.
+        for universal_context_message in universal_context_messages:
+            message = self._from_universal_context_message(universal_context_message)
+            if message:
+                messages.append(message)
+
+        return self.ConvertedMessages(messages=messages, system_instruction=system_instruction)
+
+    def _from_universal_context_message(
+        self, message
+    ) -> Optional[AWSNovaSonicConversationHistoryMessage]:
+        """Convert standard message format to Nova Sonic format.
+
+        Args:
+            message: Standard message dictionary to convert.
+
+        Returns:
+            Nova Sonic conversation history message, or None if not convertible.
+        """
+        role = message.get("role")
+        if role in ("user", "assistant"):
+            content = message.get("content")
+            if isinstance(message.get("content"), list):
+                content = ""
+                for c in message.get("content"):
+                    if c.get("type") == "text":
+                        content += " " + c.get("text")
+                    else:
+                        logger.error(
+                            f"Unhandled content type in context message: {c.get('type')} - {message}"
+                        )
+            # There won't be content if this is an assistant tool call entry.
+            # We're ignoring those since they can't be loaded into AWS Nova Sonic conversation
+            # history
+            if content:
+                return AWSNovaSonicConversationHistoryMessage(role=Role[role.upper()], text=content)
+        # NOTE: we're ignoring messages with role "tool" since they can't be loaded into AWS Nova
+        # Sonic conversation history

    @staticmethod
    def _to_aws_nova_sonic_function_format(function: FunctionSchema) -> Dict[str, Any]:
@@ -100,4 +210,18 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
            List of dictionaries in AWS Nova Sonic function format.
        """
        functions_schema = tools_schema.standard_tools
-        return [self._to_aws_nova_sonic_function_format(func) for func in functions_schema]
+        standard_tools = [
+            self._to_aws_nova_sonic_function_format(func) for func in functions_schema
+        ]
+
+        # For backward compatibility, AWS Nova Sonic can still be used with
+        # tools in dict format, even though it always uses `LLMContext` under
+        # the hood (via `LLMContext.from_openai_context()`).
+        # To support this behavior, we use "shimmed" custom tools here.
+        # (We maintain this backward compatibility because users aren't
+        # *knowingly* opting into the new `LLMContext`.)
+        shimmed_tools = []
+        if tools_schema.custom_tools:
+            shimmed_tools = tools_schema.custom_tools.get(AdapterType.SHIM, [])
+
+        return standard_tools + shimmed_tools
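Review note: the history conversion added in this file boils down to "keep user and assistant text, drop everything else". The sketch below illustrates that rule with plain tuples instead of the adapter's dataclass; `to_history_entry` and `to_history` are hypothetical names. Unlike the diff, which concatenates with `" " + text` (leaving a leading space), this sketch joins text parts with single spaces.

```python
from typing import Any, Dict, List, Optional, Tuple


def to_history_entry(message: Dict[str, Any]) -> Optional[Tuple[str, str]]:
    """Hypothetical stand-in: returns (ROLE, text), or None if not convertible."""
    role = message.get("role")
    if role not in ("user", "assistant"):
        return None  # "system" and "tool" entries are not loaded into history
    content = message.get("content")
    if isinstance(content, list):
        # Keep only text parts; other part types are dropped in this sketch.
        content = " ".join(c["text"] for c in content if c.get("type") == "text")
    if not content:
        return None  # e.g. an assistant tool-call entry with no text
    return (role.upper(), content)


def to_history(messages: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Convert a list of universal-style messages, skipping unconvertible ones."""
    entries = (to_history_entry(m) for m in messages)
    return [e for e in entries if e is not None]
```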
--- a/src/pipecat/adapters/services/bedrock_adapter.py
+++ b/src/pipecat/adapters/services/bedrock_adapter.py
@@ -107,7 +107,7 @@ class AWSBedrockLLMAdapter(BaseLLMAdapter[AWSBedrockLLMInvocationParams]):
        system = None
        messages = []

-        # first, map messages using self._from_universal_context_message(m)
+        # First, map messages using self._from_universal_context_message(m)
        try:
            messages = [self._from_universal_context_message(m) for m in universal_context_messages]
        except Exception as e:
@@ -256,15 +256,22 @@ class AWSBedrockLLMAdapter(BaseLLMAdapter[AWSBedrockLLMInvocationParams]):
                    new_content.append({"text": text_content})
                # handle image_url -> image conversion
                if item["type"] == "image_url":
-                    new_item = {
-                        "image": {
-                            "format": "jpeg",
-                            "source": {
-                                "bytes": base64.b64decode(item["image_url"]["url"].split(",")[1])
-                            },
+                    if item["image_url"]["url"].startswith("data:"):
+                        new_item = {
+                            "image": {
+                                "format": "jpeg",
+                                "source": {
+                                    "bytes": base64.b64decode(
+                                        item["image_url"]["url"].split(",")[1]
+                                    )
+                                },
+                            }
                        }
-                    }
-                    new_content.append(new_item)
+                        new_content.append(new_item)
+                    else:
+                        url = item["image_url"]["url"]
+                        logger.warning(f"Unsupported 'image_url': {url}")
+
            # In the case where there's a single image in the list (like what
            # would result from a UserImageRawFrame), ensure that the image
            # comes before text
--- a/src/pipecat/adapters/services/gemini_adapter.py
+++ b/src/pipecat/adapters/services/gemini_adapter.py
@@ -8,8 +8,8 @@

 import base64
 import json
-from dataclasses import dataclass
-from typing import Any, Dict, List, Optional, TypedDict
+from dataclasses import dataclass, field
+from typing import Any, Dict, List, Optional, Tuple, TypedDict

 from loguru import logger
 from openai import NotGiven
@@ -24,13 +24,7 @@ from pipecat.processors.aggregators.llm_context import (
 )

 try:
-    from google.genai.types import (
-        Blob,
-        Content,
-        FunctionCall,
-        FunctionResponse,
-        Part,
-    )
+    from google.genai.types import Blob, Content, FileData, FunctionCall, FunctionResponse, Part
 except ModuleNotFoundError as e:
    logger.error(f"Exception: {e}")
    logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
@@ -133,6 +127,28 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
        messages: List[Content]
        system_instruction: Optional[str] = None

+    @dataclass
+    class MessageConversionResult:
+        """Result of converting a single universal context message to Google format.
+
+        Either content (a Google Content object) or a system instruction string
+        is guaranteed to be set.
+
+        Also returns a tool call ID to name mapping for any tool calls
+        discovered in the message.
+        """
+
+        content: Optional[Content] = None
+        system_instruction: Optional[str] = None
+        tool_call_id_to_name_mapping: Dict[str, str] = field(default_factory=dict)
+
+    @dataclass
+    class MessageConversionParams:
+        """Parameters for converting a single universal context message to Google format."""
+
+        already_have_system_instruction: bool
+        tool_call_id_to_name_mapping: Dict[str, str]
+
    def _from_universal_context_messages(
        self, universal_context_messages: List[LLMContextMessage]
    ) -> ConvertedMessages:
@@ -156,24 +172,26 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
        """
        system_instruction = None
        messages = []
+        tool_call_id_to_name_mapping = {}

        # Process each message, preserving Google-formatted messages and converting others
        for message in universal_context_messages:
-            if isinstance(message, LLMSpecificMessage):
-                # Assume that LLMSpecificMessage wraps a message in Google format
-                messages.append(message.message)
-                continue
-
-            # Convert standard format to Google format
-            converted = self._from_standard_message(
-                message, already_have_system_instruction=bool(system_instruction)
+            result = self._from_universal_context_message(
+                message,
+                params=self.MessageConversionParams(
+                    already_have_system_instruction=bool(system_instruction),
+                    tool_call_id_to_name_mapping=tool_call_id_to_name_mapping,
+                ),
            )
-            if isinstance(converted, Content):
-                # Regular (non-system) message
-                messages.append(converted)
-            else:
-                # System instruction
-                system_instruction = converted
+            # Each result is either a Content or a system instruction
+            if result.content:
+                messages.append(result.content)
+            elif result.system_instruction:
+                system_instruction = result.system_instruction
+
+            # Merge tool call ID to name mapping
+            if result.tool_call_id_to_name_mapping:
+                tool_call_id_to_name_mapping.update(result.tool_call_id_to_name_mapping)

        # Check if we only have function-related messages (no regular text)
        has_regular_messages = any(
@@ -193,9 +211,16 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):

        return self.ConvertedMessages(messages=messages, system_instruction=system_instruction)

+    def _from_universal_context_message(
+        self, message: LLMContextMessage, *, params: MessageConversionParams
+    ) -> MessageConversionResult:
+        if isinstance(message, LLMSpecificMessage):
+            return self.MessageConversionResult(content=message.message)
+        return self._from_standard_message(message, params=params)
+
    def _from_standard_message(
-        self, message: LLMStandardMessage, already_have_system_instruction: bool
-    ) -> Content | str:
+        self, message: LLMStandardMessage, *, params: MessageConversionParams
+    ) -> MessageConversionResult:
        """Convert standard universal context message to Google Content object.

        Handles conversion of text, images, and function calls to Google's
@@ -205,10 +230,11 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
        Args:
            message: Message in standard universal context format.
            already_have_system_instruction: Whether we already have a system instruction
+            params: Parameters for conversion.

        Returns:
-            Content object with role and parts, or a plain string for system
-            messages.
+            MessageConversionResult containing either a Content object or a
+            system instruction string.

        Examples:
            Standard text message::
@@ -242,38 +268,49 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
            Converts to Google Content with::

                Content(
-                    role="model",
+                    role="user",
                    parts=[Part(function_call=FunctionCall(name="search", args={"query": "test"}))]
                )
        """
        role = message["role"]
        content = message.get("content", [])
+
        if role == "system":
-            if already_have_system_instruction:
+            if params.already_have_system_instruction:
                role = "user"  # Convert system message to user role if we already have a system instruction
            else:
-                # System instructions are returned as plain text
+                system_instruction: Optional[str] = None
                if isinstance(content, str):
-                    return content
+                    system_instruction = content
                elif isinstance(content, list):
                    # If content is a list, we assume it's a list of text parts, per the standard
-                    return " ".join(part["text"] for part in content if part.get("type") == "text")
+                    system_instruction = " ".join(
+                        part["text"] for part in content if part.get("type") == "text"
+                    )
+                if system_instruction:
+                    return self.MessageConversionResult(system_instruction=system_instruction)
        elif role == "assistant":
            role = "model"

        parts = []
+        tool_call_id_to_name_mapping = {}
+
        if message.get("tool_calls"):
            for tc in message["tool_calls"]:
+                id = tc["id"]
+                name = tc["function"]["name"]
+                tool_call_id_to_name_mapping[id] = name
                parts.append(
                    Part(
                        function_call=FunctionCall(
-                            name=tc["function"]["name"],
+                            id=id,
+                            name=name,
                            args=json.loads(tc["function"]["arguments"]),
                        )
                    )
                )
        elif role == "tool":
-            role = "model"
+            role = "user"
            try:
                response = json.loads(message["content"])
                if isinstance(response, dict):
@@ -284,10 +321,18 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
                # Response might not be JSON-deserializable.
                # This occurs with a UserImageFrame, for example, where we get a plain "COMPLETED" string.
                response_dict = {"value": message["content"]}
+
+            # Get function name from mapping using tool_call_id, or fallback
+            tool_call_id = message.get("tool_call_id")
+            function_name = "tool_call_result"  # Default fallback
+            if tool_call_id and tool_call_id in params.tool_call_id_to_name_mapping:
+                function_name = params.tool_call_id_to_name_mapping[tool_call_id]
+
            parts.append(
                Part(
                    function_response=FunctionResponse(
-                        name="tool_call_result",  # seems to work to hard-code the same name every time
+                        id=tool_call_id,
+                        name=function_name,
                        response=response_dict,
                    )
                )
@@ -298,7 +343,7 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
            for c in content:
                if c["type"] == "text":
                    parts.append(Part(text=c["text"]))
-                elif c["type"] == "image_url":
+                elif c["type"] == "image_url" and c["image_url"]["url"].startswith("data:"):
                    parts.append(
                        Part(
                            inline_data=Blob(
@@ -307,9 +352,25 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
                            )
                        )
                    )
+                elif c["type"] == "image_url":
+                    url = c["image_url"]["url"]
+                    logger.warning(f"Unsupported 'image_url': {url}")
                elif c["type"] == "input_audio":
                    input_audio = c["input_audio"]
                    audio_bytes = base64.b64decode(input_audio["data"])
                    parts.append(Part(inline_data=Blob(mime_type="audio/wav", data=audio_bytes)))
+                elif c["type"] == "file_data":
+                    file_data = c["file_data"]
+                    parts.append(
+                        Part(
+                            file_data=FileData(
+                                mime_type=file_data.get("mime_type"),
+                                file_uri=file_data.get("file_uri"),
+                            )
+                        )
+                    )

-        return Content(role=role, parts=parts)
+        return self.MessageConversionResult(
+            content=Content(role=role, parts=parts),
+            tool_call_id_to_name_mapping=tool_call_id_to_name_mapping,
+        )
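Review note: the key behavioral change in this file is that a tool result is now named after the tool call it answers, rather than always `"tool_call_result"`. A minimal sketch of that lookup follows, using plain dicts and no Google types; `resolve_tool_result_names` is a hypothetical helper, not part of the adapter.

```python
from typing import Any, Dict, List


def resolve_tool_result_names(messages: List[Dict[str, Any]]) -> List[str]:
    """Hypothetical helper: name each tool result after its earlier tool call."""
    id_to_name: Dict[str, str] = {}
    names: List[str] = []
    for message in messages:
        # Record every tool call ID seen so far, as the diff's mapping does.
        for tc in message.get("tool_calls") or []:
            id_to_name[tc["id"]] = tc["function"]["name"]
        if message.get("role") == "tool":
            # Fall back to the diff's default name when the ID is unknown.
            names.append(id_to_name.get(message.get("tool_call_id"), "tool_call_result"))
    return names
```

The mapping only works when messages are processed in order, which is why the diff threads it through each conversion call.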
--- a/src/pipecat/adapters/services/open_ai_realtime_adapter.py
+++ b/src/pipecat/adapters/services/open_ai_realtime_adapter.py
@@ -6,12 +6,18 @@

 """OpenAI Realtime LLM adapter for Pipecat."""

-from typing import Any, Dict, List, TypedDict
+import copy
+import json
+from dataclasses import dataclass
+from typing import Any, Dict, List, Optional, TypedDict
+
+from loguru import logger

 from pipecat.adapters.base_llm_adapter import BaseLLMAdapter
 from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import ToolsSchema
-from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
+from pipecat.processors.aggregators.llm_context import LLMContext, LLMContextMessage
+from pipecat.services.openai.realtime import events


 class OpenAIRealtimeLLMInvocationParams(TypedDict):
@@ -20,7 +26,9 @@ class OpenAIRealtimeLLMInvocationParams(TypedDict):
    This is a placeholder until support for universal LLMContext machinery is added for OpenAI Realtime.
    """

-    pass
+    system_instruction: Optional[str]
+    messages: List[events.ConversationItem]
+    tools: List[Dict[str, Any]]


 class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
@@ -33,7 +41,7 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
    @property
    def id_for_llm_specific_messages(self) -> str:
        """Get the identifier used in LLMSpecificMessage instances for OpenAI Realtime."""
-        raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")
+        return "openai-realtime"

    def get_llm_invocation_params(self, context: LLMContext) -> OpenAIRealtimeLLMInvocationParams:
        """Get OpenAI Realtime-specific LLM invocation parameters from a universal LLM context.
@@ -46,7 +54,13 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
        Returns:
            Dictionary of parameters for invoking OpenAI Realtime's API.
        """
-        raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")
+        messages = self._from_universal_context_messages(self.get_messages(context))
+        return {
+            "system_instruction": messages.system_instruction,
+            "messages": messages.messages,
+            # NOTE: LLMContext's tools are guaranteed to be a ToolsSchema (or NOT_GIVEN)
+            "tools": self.from_standard_tools(context.tools) or [],
+        }

    def get_messages_for_logging(self, context) -> List[Dict[str, Any]]:
        """Get messages from a universal LLM context in a format ready for logging about OpenAI Realtime.
@@ -61,7 +75,124 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
        Returns:
            List of messages in a format ready for logging about OpenAI Realtime.
        """
-        raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")
+        # NOTE: this is the same as in OpenAIAdapter, as that's what it was
+        # prior to a refactor. Worth noting that for OpenAI Realtime
+        # specifically, not everything handled here is necessarily supported
+        # (or supported yet).
+        msgs = []
+        for message in self.get_messages(context):
+            msg = copy.deepcopy(message)
+            if "content" in msg:
+                if isinstance(msg["content"], list):
+                    for item in msg["content"]:
+                        if item["type"] == "image_url":
+                            if item["image_url"]["url"].startswith("data:image/"):
+                                item["image_url"]["url"] = "data:image/..."
+                        if item["type"] == "input_audio":
+                            item["input_audio"]["data"] = "..."
+            if "mime_type" in msg and msg["mime_type"].startswith("image/"):
+                msg["data"] = "..."
+            msgs.append(msg)
+        return msgs
+
+    @dataclass
+    class ConvertedMessages:
+        """Container for OpenAI-formatted messages converted from universal context."""
+
+        messages: List[events.ConversationItem]
+        system_instruction: Optional[str] = None
+
+    def _from_universal_context_messages(
+        self, universal_context_messages: List[LLMContextMessage]
+    ) -> ConvertedMessages:
+        # We can't load a long conversation history into the OpenAI Realtime API yet. (The API/model
+        # forgets that it can do audio if you do a series of `conversation.item.create` calls.) So
+        # our general strategy until this is fixed is just to put everything into a first "user"
+        # message as a single input.
+
+        if not universal_context_messages:
+            return self.ConvertedMessages(messages=[])
+
+        messages = copy.deepcopy(universal_context_messages)
+        system_instruction = None
+
+        # If we have a "system" message as our first message, let's pull that out into session
+        # "instructions"
+        if messages[0].get("role") == "system":
+            system = messages.pop(0)
+            content = system.get("content")
+            if isinstance(content, str):
+                system_instruction = content
+            elif isinstance(content, list):
+                system_instruction = content[0].get("text")
+            if not messages:
+                return self.ConvertedMessages(messages=[], system_instruction=system_instruction)
+
+        # If we have just a single "user" item, we can just send it normally
+        if len(messages) == 1 and messages[0].get("role") == "user":
+            return self.ConvertedMessages(
+                messages=[self._from_universal_context_message(messages[0])],
+                system_instruction=system_instruction,
+            )
+
+        # Otherwise, let's pack everything into a single "user" message with a bit of
+        # explanation for the LLM
+        intro_text = """
+        This is a previously saved conversation. Please treat this conversation history as a
+        starting point for the current conversation."""
+
+        trailing_text = """
+        This is the end of the previously saved conversation. Please continue the conversation
+        from here. If the last message is a user instruction or question, act on that instruction
+        or answer the question. If the last message is an assistant response, simply say that you
+        are ready to continue the conversation."""
+
+        return self.ConvertedMessages(
+            messages=[
+                {
+                    "role": "user",
+                    "type": "message",
+                    "content": [
+                        {
+                            "type": "input_text",
+                            "text": "\n\n".join(
+                                [intro_text, json.dumps(messages, indent=2), trailing_text]
+                            ),
+                        }
+                    ],
+                }
+            ],
+            system_instruction=system_instruction,
+        )
+
+    def _from_universal_context_message(
+        self, message: LLMContextMessage
+    ) -> Optional[events.ConversationItem]:
+        if message.get("role") == "user":
+            content = message.get("content")
+            if isinstance(message.get("content"), list):
+                content = ""
+                for c in message.get("content"):
+                    if c.get("type") == "text":
+                        content += " " + c.get("text")
+                    else:
+                        logger.error(
+                            f"Unhandled content type in context message: {c.get('type')} - {message}"
+                        )
+            return events.ConversationItem(
+                role="user",
+                type="message",
+                content=[events.ItemContent(type="input_text", text=content)],
+            )
+        if message.get("role") == "assistant" and message.get("tool_calls"):
+            tc = message.get("tool_calls")[0]
+            return events.ConversationItem(
+                type="function_call",
+                call_id=tc["id"],
+                name=tc["function"]["name"],
+                arguments=tc["function"]["arguments"],
+            )
+        logger.error(f"Unhandled message type in _from_universal_context_message: {message}")

    @staticmethod
    def _to_openai_realtime_function_format(function: FunctionSchema) -> Dict[str, Any]:
@@ -94,4 +225,18 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
            List of function definitions in OpenAI Realtime format.
        """
        functions_schema = tools_schema.standard_tools
-        return [self._to_openai_realtime_function_format(func) for func in functions_schema]
+        standard_tools = [
+            self._to_openai_realtime_function_format(func) for func in functions_schema
+        ]
+
+        # For backward compatibility, OpenAI Realtime can still be used with
+        # tools in dict format, even though it always uses `LLMContext` under
+        # the hood (via `LLMContext.from_openai_context()`).
+        # To support this behavior, we use "shimmed" custom tools here.
+        # (We maintain this backward compatibility because users aren't
+        # *knowingly* opting into the new `LLMContext`.)
+        shimmed_tools = []
+        if tools_schema.custom_tools:
+            shimmed_tools = tools_schema.custom_tools.get(AdapterType.SHIM, [])
+
+        return standard_tools + shimmed_tools
--- a/src/pipecat/frames/frames.py
+++ b/src/pipecat/frames/frames.py
@@ -1201,26 +1201,23 @@ class TransportMessageUrgentFrame(OutputTransportMessageUrgentFrame):
 class UserImageRequestFrame(SystemFrame):
    """Frame requesting an image from a specific user.

-    A frame to request an image from the given user. The frame might be
-    generated by a function call in which case the corresponding fields will be
-    properly set.
+    A frame to request an image from the given user. The request may include
+    text that can later be used to describe the requested image.

    Parameters:
        user_id: Identifier of the user to request image from.
-        context: Optional context for the image request.
-        function_name: Name of function that generated this request (if any).
-        tool_call_id: Tool call ID if generated by function call.
+        text: Optional text associated with the image request.
+        append_to_context: Whether the requested image should be appended to the LLM context.
        video_source: Specific video source to capture from.
    """

    user_id: str
-    context: Optional[Any] = None
-    function_name: Optional[str] = None
-    tool_call_id: Optional[str] = None
+    text: Optional[str] = None
+    append_to_context: Optional[bool] = None
    video_source: Optional[str] = None

    def __str__(self):
-        return f"{self.name}(user: {self.user_id}, video_source: {self.video_source}, function: {self.function_name}, request: {self.tool_call_id})"
+        return f"{self.name}(user: {self.user_id}, text: {self.text}, append_to_context: {self.append_to_context}, video_source: {self.video_source})"


@dataclass
@@ -1294,15 +1291,17 @@ class UserImageRawFrame(InputImageRawFrame):

    Parameters:
        user_id: Identifier of the user who provided this image.
-        request: The original image request frame if this is a response.
+        text: Optional text associated with this image.
+        append_to_context: Whether the requested image should be appended to the LLM context.
    """

    user_id: str = ""
-    request: Optional[UserImageRequestFrame] = None
+    text: Optional[str] = None
+    append_to_context: Optional[bool] = None

    def __str__(self):
        pts = format_pts(self.pts)
-        return f"{self.name}(pts: {pts}, user: {self.user_id}, source: {self.transport_source}, size: {self.size}, format: {self.format}, request: {self.request})"
+        return f"{self.name}(pts: {pts}, user: {self.user_id}, source: {self.transport_source}, size: {self.size}, format: {self.format}, text: {self.text}, append_to_context: {self.append_to_context})"


@dataclass
--- a/src/pipecat/pipeline/llm_switcher.py
+++ b/src/pipecat/pipeline/llm_switcher.py
@@ -14,20 +14,41 @@ from pipecat.services.llm_service import LLMService


 class LLMSwitcher(ServiceSwitcher[StrategyType]):
-    """A pipeline that switches between different LLMs at runtime."""
+    """A pipeline that switches between different LLMs at runtime.
+
+    Example::
+
+        llm_switcher = LLMSwitcher(
+            llms=[openai_llm, anthropic_llm],
+            strategy_type=ServiceSwitcherStrategyManual
+        )
+    """

    def __init__(self, llms: List[LLMService], strategy_type: Type[StrategyType]):
-        """Initialize the service switcher with a list of LLMs and a switching strategy."""
+        """Initialize the service switcher with a list of LLMs and a switching strategy.
+
+        Args:
+            llms: List of LLM services to switch between.
+            strategy_type: The strategy class to use for switching between LLMs.
+        """
        super().__init__(llms, strategy_type)

    @property
    def llms(self) -> List[LLMService]:
-        """Get the list of LLMs managed by this switcher."""
+        """Get the list of LLMs managed by this switcher.
+
+        Returns:
+            List of LLM services managed by this switcher.
+        """
        return self.services

    @property
    def active_llm(self) -> Optional[LLMService]:
-        """Get the currently active LLM, if any."""
+        """Get the currently active LLM.
+
+        Returns:
+            The currently active LLM service, or None if no LLM is active.
+        """
        return self.strategy.active_service

    async def run_inference(self, context: LLMContext) -> Optional[str]:
--- a/src/pipecat/pipeline/runner.py
+++ b/src/pipecat/pipeline/runner.py
@@ -70,11 +70,15 @@ class PipelineRunner(BaseObject):
        """
        logger.debug(f"Runner {self} started running {task}")
        self._tasks[task.name] = task
-        params = PipelineTaskParams(loop=self._loop)
+
+        # PipelineTask handles asyncio.CancelledError to shut down the pipeline
+        # properly and re-raises it in case there's more cleanup to do.
        try:
+            params = PipelineTaskParams(loop=self._loop)
            await task.run(params)
        except asyncio.CancelledError:
-            await self._cancel()
+            pass
+
        del self._tasks[task.name]

        # Cleanup base object.
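Reviewer's note (not part of the patch): the runner change above relies on the task itself handling `asyncio.CancelledError`, shutting down, and re-raising, so the runner only needs to swallow the exception. A minimal stdlib-only sketch of that division of labour follows; `run_task` and `runner` are hypothetical stand-ins, not pipecat APIs.

```python
import asyncio


async def run_task(log: list[str]) -> None:
    """Stand-in for a long-running task that shuts itself down on cancellation."""
    finished = asyncio.Event()
    try:
        # Wait until the (simulated) pipeline finishes.
        await finished.wait()
    except asyncio.CancelledError:
        # The task owns its shutdown, then re-raises for outer cleanup.
        log.append("task: shutting down")
        raise
    finally:
        # Runs on both normal completion and cancellation.
        log.append("task: cleanup")


async def runner(log: list[str]) -> None:
    """Stand-in for a runner that no longer cancels the task itself."""
    task = asyncio.create_task(run_task(log))
    await asyncio.sleep(0)  # let the task start and block on the event
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        # Nothing to do here: the task already shut itself down.
        pass
    log.append("runner: done")


if __name__ == "__main__":
    events: list[str] = []
    asyncio.run(runner(events))
    print(events)
```

The design point is that shutdown logic lives in one place (the task), which avoids the runner and the task both trying to cancel the pipeline.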
--- a/src/pipecat/pipeline/service_switcher.py
+++ b/src/pipecat/pipeline/service_switcher.py
@@ -21,10 +21,22 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


 class ServiceSwitcherStrategy:
-    """Base class for service switching strategies."""
+    """Base class for service switching strategies.
+
+    Note:
+        Strategy classes are instantiated internally by ServiceSwitcher.
+        Developers should pass the strategy class (not an instance) to ServiceSwitcher.
+    """

    def __init__(self, services: List[FrameProcessor]):
-        """Initialize the service switcher strategy with a list of services."""
+        """Initialize the service switcher strategy with a list of services.
+
+        Note:
+            This is called internally by ServiceSwitcher. Do not instantiate directly.
+
+        Args:
+            services: List of frame processors to switch between.
+        """
        self.services = services
        self.active_service: Optional[FrameProcessor] = None

@@ -46,10 +58,24 @@ class ServiceSwitcherStrategyManual(ServiceSwitcherStrategy):

    This strategy allows the user to manually select which service is active.
    The initial active service is the first one in the list.
+
+    Example::
+
+        stt_switcher = ServiceSwitcher(
+            services=[stt_1, stt_2],
+            strategy_type=ServiceSwitcherStrategyManual
+        )
    """

    def __init__(self, services: List[FrameProcessor]):
-        """Initialize the manual service switcher strategy with a list of services."""
+        """Initialize the manual service switcher strategy with a list of services.
+
+        Note:
+            This is called internally by ServiceSwitcher. Do not instantiate directly.
+
+        Args:
+            services: List of frame processors to switch between.
+        """
        super().__init__(services)
        self.active_service = services[0] if services else None

@@ -85,7 +111,12 @@ class ServiceSwitcher(ParallelPipeline, Generic[StrategyType]):
    """A pipeline that switches between different services at runtime."""

    def __init__(self, services: List[FrameProcessor], strategy_type: Type[StrategyType]):
-        """Initialize the service switcher with a list of services and a switching strategy."""
+        """Initialize the service switcher with a list of services and a switching strategy.
+
+        Args:
+            services: List of frame processors to switch between.
+            strategy_type: The strategy class to use for switching between services.
+        """
        strategy = strategy_type(services)
        super().__init__(*self._make_pipeline_definitions(services, strategy))
        self.services = services
@@ -100,14 +131,20 @@ class ServiceSwitcher(ParallelPipeline, Generic[StrategyType]):
            active_service: FrameProcessor,
            direction: FrameDirection,
        ):
-            """Initialize the service switcher filter with a strategy and direction."""
+            """Initialize the service switcher filter with a strategy and direction.
+
+            Args:
+                wrapped_service: The service that this filter wraps.
+                active_service: The currently active service.
+                direction: The direction of frame flow to filter.
+            """
+            self._wrapped_service = wrapped_service
+            self._active_service = active_service

            async def filter(_: Frame) -> bool:
                return self._wrapped_service == self._active_service

-            super().__init__(filter, direction)
-            self._wrapped_service = wrapped_service
-            self._active_service = active_service
+            super().__init__(filter, direction, filter_system_frames=True)

        async def process_frame(self, frame, direction):
            """Process a frame through the filter, handling special internal filter-updating frames."""
--- a/src/pipecat/pipeline/task.py
+++ b/src/pipecat/pipeline/task.py
@@ -12,7 +12,6 @@ including heartbeats, idle detection, and observer integration.
 """

 import asyncio
-import time
 from typing import Any, AsyncIterable, Dict, Iterable, List, Optional, Tuple, Type

 from loguru import logger
@@ -39,7 +38,7 @@ from pipecat.frames.frames import (
    UserSpeakingFrame,
 )
 from pipecat.metrics.metrics import ProcessingMetricsData, TTFBMetricsData
-from pipecat.observers.base_observer import BaseObserver
+from pipecat.observers.base_observer import BaseObserver, FramePushed
 from pipecat.observers.turn_tracking_observer import TurnTrackingObserver
 from pipecat.pipeline.base_task import BasePipelineTask, PipelineTaskParams
 from pipecat.pipeline.pipeline import Pipeline, PipelineSink, PipelineSource
@@ -57,6 +56,43 @@ IDLE_TIMEOUT_SECS = 300
 CANCEL_TIMEOUT_SECS = 20.0


+class IdleFrameObserver(BaseObserver):
+    """Idle timeout observer.
+
+    This observer watches for specific frames being pushed in the pipeline.
+    When one of those frames is pushed, the given asyncio event is set. If the
+    event is not set within the idle timeout, the pipeline is probably idle.
+
+    """
+
+    def __init__(self, *, idle_event: asyncio.Event, idle_timeout_frames: Tuple[Type[Frame], ...]):
+        """Initialize the observer.
+
+        Args:
+            idle_event: The event to set when one of the idle timeout frames is pushed.
+            idle_timeout_frames: A tuple of frame types that set the event when pushed.
+        """
+        super().__init__()
+        self._idle_event = idle_event
+        self._idle_timeout_frames = idle_timeout_frames
+        self._processed_frames = set()
+
+    async def on_push_frame(self, data: FramePushed):
+        """Callback executed when a frame is pushed in the pipeline.
+
+        Args:
+            data: The frame push event data.
+        """
+        # Skip already processed frames
+        if data.frame.id in self._processed_frames:
+            return
+
+        self._processed_frames.add(data.frame.id)
+
+        if isinstance(data.frame, StartFrame) or isinstance(data.frame, self._idle_timeout_frames):
+            self._idle_event.set()
+
+
 class PipelineParams(BaseModel):
    """Configuration parameters for pipeline execution.

@@ -215,7 +251,6 @@ class PipelineTask(BasePipelineTask):
        self._conversation_id = conversation_id
        self._enable_tracing = enable_tracing and is_tracing_available()
        self._enable_turn_tracking = enable_turn_tracking
-        self._idle_timeout_frames = idle_timeout_frames
        self._idle_timeout_secs = idle_timeout_secs
        if self._params.observers:
            import warnings
@@ -250,16 +285,24 @@ class PipelineTask(BasePipelineTask):
        # This queue is the queue used to push frames to the pipeline.
        self._push_queue = asyncio.Queue()
        self._process_push_task: Optional[asyncio.Task] = None
+
        # This is the heartbeat queue. When a heartbeat frame is received in the
        # down queue we add it to the heartbeat queue for processing.
        self._heartbeat_queue = asyncio.Queue()
        self._heartbeat_push_task: Optional[asyncio.Task] = None
        self._heartbeat_monitor_task: Optional[asyncio.Task] = None
-        # This is the idle queue. When frames are received downstream they are
-        # put in the queue. If no frame is received the pipeline is considered
-        # idle.
-        self._idle_queue = asyncio.Queue()
+
+        # This is the idle event. When selected frames are pushed from any
+        # processor, we consider the pipeline not idle. We use an observer
+        # that listens to every part of the pipeline.
+        self._idle_event = asyncio.Event()
        self._idle_monitor_task: Optional[asyncio.Task] = None
+        if self._idle_timeout_secs:
+            idle_frame_observer = IdleFrameObserver(
+                idle_event=self._idle_event,
+                idle_timeout_frames=idle_timeout_frames,
+            )
+            observers.append(idle_frame_observer)

        # This event is used to indicate the StartFrame has been received at the
        # end of the pipeline.
@@ -269,6 +312,9 @@ class PipelineTask(BasePipelineTask):
        # StopFrame) has been received at the end of the pipeline.
        self._pipeline_end_event = asyncio.Event()

+        # This event is set when the pipeline truly finishes.
+        self._pipeline_finished_event = asyncio.Event()
+
        # This is the final pipeline. It is composed of a source processor,
        # followed by the user pipeline, and ending with a sink processor. The
        # source allows us to receive and react to upstream frames, and the sink
@@ -401,11 +447,7 @@ class PipelineTask(BasePipelineTask):
        await self.queue_frame(EndFrame())

    async def cancel(self):
-        """Immediately stop the running pipeline.
-
-        Cancels all running tasks and stops frame processing without
-        waiting for completion.
-        """
+        """Request the running pipeline to cancel."""
        if not self._finished:
            await self._cancel()

@@ -417,51 +459,38 @@ class PipelineTask(BasePipelineTask):
        """
        if self.has_finished():
            return
-        cleanup_pipeline = True
+
+        # Setup processors.
+        await self._setup(params)
+
+        # Create all main tasks and wait for the main push task. This is the
+        # task that pushes frames to the very beginning of our pipeline (i.e. to
+        # our controlled source processor).
+        await self._create_tasks()
+
        try:
-            # Setup processors.
-            await self._setup(params)
-
-            # Create all main tasks and wait of the main push task. This is the
-            # task that pushes frames to the very beginning of our pipeline (our
-            # controlled source processor).
-            push_task = await self._create_tasks()
-            await push_task
-
-            # We have already cleaned up the pipeline inside the task.
-            cleanup_pipeline = False
-
-            # Pipeline has finished nicely.
-            self._finished = True
+            # Wait for pipeline to finish.
+            await self._wait_for_pipeline_finished()
        except asyncio.CancelledError:
-            # Raise exception back to the pipeline runner so it can cancel this
-            # task properly.
+            logger.debug(f"Pipeline task {self} got cancelled from outside...")
+            # We have been cancelled from outside, let's just cancel everything.
+            await self._cancel()
+            # Wait again for pipeline to finish. This time we have really
+            # cancelled, so it should really finish.
+            await self._wait_for_pipeline_finished()
+            # Re-raise in case there's more cleanup to do.
            raise
        finally:
            # We can reach this point for different reasons:
            #
-            # 1. The task has finished properly (e.g. `EndFrame`).
-            # 2. By calling `PipelineTask.cancel()`.
-            # 3. By asyncio task cancellation.
-            #
-            # Case (1) will execute the code below without issues because
-            # `self._finished` is true.
-            #
-            # Case (2) will execute the code below without issues because
-            # `self._cancelled` is true.
-            #
-            # Case (3) will raise the exception above (because we are cancelling
-            # the asyncio task). This will be then captured by the
-            # `PipelineRunner` which will call `PipelineTask.cancel()` and
-            # therefore becoming case (2).
-            if self._finished or self._cancelled:
-                logger.debug(f"Pipeline task {self} is finishing cleanup...")
-                await self._cancel_tasks()
-                await self._cleanup(cleanup_pipeline)
-                if self._check_dangling_tasks:
-                    self._print_dangling_tasks()
-                self._finished = True
-                logger.debug(f"Pipeline task {self} has finished")
+            # 1. The pipeline task has finished (try case).
+            # 2. By an asyncio task cancellation (except case).
+            logger.debug(f"Pipeline task {self} is finishing...")
+            await self._cancel_tasks()
+            if self._check_dangling_tasks:
+                self._print_dangling_tasks()
+            self._finished = True
+            logger.debug(f"Pipeline task {self} has finished")

    async def queue_frame(self, frame: Frame):
        """Queue a single frame to be pushed down the pipeline.
@@ -489,19 +518,7 @@ class PipelineTask(BasePipelineTask):
        if not self._cancelled:
            logger.debug(f"Cancelling pipeline task {self}")
            self._cancelled = True
-            cancel_frame = CancelFrame()
-            # Make sure everything is cleaned up downstream. This is sent
-            # out-of-band from the main streaming task which is what we want since
-            # we want to cancel right away.
-            await self._pipeline.queue_frame(cancel_frame)
-            # Wait for CancelFrame to make it through the pipeline.
-            await self._wait_for_pipeline_end(cancel_frame)
-            # Only cancel the push task, we don't want to be able to process any
-            # other frame after cancel. Everything else will be cancelled in
-            # run().
-            if self._process_push_task:
-                await self._task_manager.cancel_task(self._process_push_task)
-                self._process_push_task = None
+            await self.queue_frame(CancelFrame())

    async def _create_tasks(self):
        """Create and start all pipeline processing tasks."""
@@ -556,7 +573,7 @@ class PipelineTask(BasePipelineTask):

    async def _maybe_cancel_idle_task(self):
        """Cancel idle monitoring task if it is running."""
-        if self._idle_timeout_secs and self._idle_monitor_task:
+        if self._idle_monitor_task:
            await self._task_manager.cancel_task(self._idle_monitor_task)
            self._idle_monitor_task = None

@@ -603,6 +620,17 @@ class PipelineTask(BasePipelineTask):

        self._pipeline_end_event.clear()

+        # We are really done.
+        self._pipeline_finished_event.set()
+
+    async def _wait_for_pipeline_finished(self):
+        await self._pipeline_finished_event.wait()
+        self._pipeline_finished_event.clear()
+        # Make sure we wait for the main task to complete.
+        if self._process_push_task:
+            await self._process_push_task
+            self._process_push_task = None
+
    async def _setup(self, params: PipelineTaskParams):
        """Set up the pipeline task and all processors."""
        mgr_params = TaskManagerParams(loop=params.loop)
@@ -721,10 +749,6 @@ class PipelineTask(BasePipelineTask):
        processors have handled the EndFrame and therefore we can exit the task
        cleanly.
        """
-        # Queue received frame to the idle queue so we can monitor idle
-        # pipelines.
-        await self._idle_queue.put(frame)
-
        if isinstance(frame, self._reached_downstream_types):
            await self._call_event_handler("on_frame_reached_downstream", frame)

@@ -787,33 +811,10 @@ class PipelineTask(BasePipelineTask):
        Note: Heartbeats are excluded from idle detection.
        """
        running = True
-        last_frame_time = 0
-
        while running:
            try:
-                frame = await asyncio.wait_for(
-                    self._idle_queue.get(), timeout=self._idle_timeout_secs
-                )
-
-                if isinstance(frame, StartFrame) or isinstance(frame, self._idle_timeout_frames):
-                    # If we find a StartFrame or one of the frames that prevents a
-                    # time out we update the time.
-                    last_frame_time = time.time()
-                else:
-                    # If we find any other frame we check if the pipeline is
-                    # idle by checking the last time we received one of the
-                    # valid frames.
-                    diff_time = time.time() - last_frame_time
-                    if diff_time >= self._idle_timeout_secs:
-                        running = await self._idle_timeout_detected()
-                        # Reset `last_frame_time` so we don't trigger another
-                        # immediate idle timeout if we are not cancelling. For
-                        # example, we might want to force the bot to say goodbye
-                        # and then clean nicely with an `EndFrame`.
-                        last_frame_time = time.time()
-
-                self._idle_queue.task_done()
-
+                await asyncio.wait_for(self._idle_event.wait(), timeout=self._idle_timeout_secs)
+                self._idle_event.clear()
            except asyncio.TimeoutError:
                running = await self._idle_timeout_detected()

@@ -825,7 +826,7 @@ class PipelineTask(BasePipelineTask):
        """
        # If we are cancelling, just exit the task.
        if self._cancelled:
-            return True
+            return False

        logger.warning("Idle timeout detected.")
        await self._call_event_handler("on_idle_timeout")
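Reviewer's note (not part of the patch): the new idle monitor replaces the queue-plus-timestamps logic with a single `asyncio.Event` that is waited on with a timeout. The sketch below isolates that pattern using only the standard library; `monitor_idle`, `on_idle`, and `demo` are hypothetical names chosen for illustration and do not exist in pipecat.

```python
import asyncio
from typing import Awaitable, Callable


async def monitor_idle(
    activity: asyncio.Event,
    timeout_secs: float,
    on_idle: Callable[[], Awaitable[bool]],
) -> None:
    """Call `on_idle` whenever `activity` is not set within `timeout_secs`.

    `on_idle` returns True to keep monitoring and False to stop.
    """
    running = True
    while running:
        try:
            # Any activity sets the event and restarts the idle window.
            await asyncio.wait_for(activity.wait(), timeout=timeout_secs)
            activity.clear()
        except asyncio.TimeoutError:
            running = await on_idle()


async def demo() -> int:
    activity = asyncio.Event()
    timeouts = 0

    async def on_idle() -> bool:
        nonlocal timeouts
        timeouts += 1
        return False  # stop after the first idle timeout

    monitor = asyncio.create_task(monitor_idle(activity, 0.05, on_idle))
    activity.set()  # simulated activity, consumed by the monitor's first wait
    await asyncio.sleep(0.01)
    await monitor  # no further activity, so the monitor times out once
    return timeouts


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the event carries no payload, whoever sets it (an observer, in the patch) decides which frames count as activity, and the monitor loop stays trivial.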
--- a/src/pipecat/pipeline/task_observer.py
+++ b/src/pipecat/pipeline/task_observer.py
@@ -129,7 +129,7 @@ class TaskObserver(BaseObserver):
        for proxy in self._proxies:
            await proxy.cleanup()

-    async def on_process_frame(self, data: FramePushed):
+    async def on_process_frame(self, data: FrameProcessed):
        """Queue frame data for all managed observers.

        Args:
@@ -189,7 +189,7 @@ class TaskObserver(BaseObserver):
            if isinstance(data, FramePushed):
                if on_push_frame_deprecated:
                    await observer.on_push_frame(
-                        data.src, data.dst, data.frame, data.direction, data.timestamp
+                        data.source, data.destination, data.frame, data.direction, data.timestamp
                    )
                else:
                    await observer.on_push_frame(data)
--- a/src/pipecat/processors/aggregators/llm_context.py
+++ b/src/pipecat/processors/aggregators/llm_context.py
@@ -16,8 +16,9 @@ service-specific adapter.

 import base64
 import io
+import wave
 from dataclasses import dataclass
-from typing import Any, List, Optional, TypeAlias, Union
+from typing import TYPE_CHECKING, Any, List, Optional, TypeAlias, Union

 from loguru import logger
 from openai._types import NOT_GIVEN as OPEN_AI_NOT_GIVEN
@@ -28,9 +29,12 @@ from openai.types.chat import (
 )
 from PIL import Image

-from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
 from pipecat.frames.frames import AudioRawFrame

+if TYPE_CHECKING:
+    from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+
 # "Re-export" types from OpenAI that we're using as universal context types.
 # NOTE: if universal message types need to someday diverge from OpenAI's, we
 # should consider managing our own definitions. But we should do so carefully,
@@ -65,6 +69,34 @@ class LLMContext:
    and content formatting.
    """

+    @staticmethod
+    def from_openai_context(openai_context: "OpenAILLMContext") -> "LLMContext":
+        """Create a universal LLM context from an OpenAI-specific context.
+
+        NOTE: this should only be used internally, for facilitating migration
+        from OpenAILLMContext to LLMContext. New user code should use
+        LLMContext directly.
+
+        Args:
+            openai_context: The OpenAI LLM context to convert.
+
+        Returns:
+            New LLMContext instance with converted messages and settings.
+        """
+        # Convert tools to ToolsSchema if needed.
+        # If the tools are already a ToolsSchema, this is a no-op.
+        # Otherwise, we wrap them in a shim ToolsSchema.
+        converted_tools = openai_context.tools
+        if isinstance(converted_tools, list):
+            converted_tools = ToolsSchema(
+                standard_tools=[], custom_tools={AdapterType.SHIM: converted_tools}
+            )
+        return LLMContext(
+            messages=openai_context.get_messages(),
+            tools=converted_tools,
+            tool_choice=openai_context.tool_choice,
+        )
+
    def __init__(
        self,
        messages: Optional[List[LLMContextMessage]] = None,
@@ -82,6 +114,129 @@ class LLMContext:
        self._tools: ToolsSchema | NotGiven = LLMContext._normalize_and_validate_tools(tools)
        self._tool_choice: LLMContextToolChoice | NotGiven = tool_choice

+    @staticmethod
+    def create_image_url_message(
+        *,
+        role: str = "user",
+        url: str,
+        text: Optional[str] = None,
+    ) -> LLMContextMessage:
+        """Create a context message containing an image URL.
+
+        Args:
+            role: The role of this message (defaults to "user").
+            url: The URL of the image.
+            text: Optional text to include with the image.
+        """
+        content = []
+        if text:
+            content.append({"type": "text", "text": text})
+
+        content.append({"type": "image_url", "image_url": {"url": url}})
+
+        return {"role": role, "content": content}
+
+    @staticmethod
+    def create_image_message(
+        *,
+        role: str = "user",
+        format: str,
+        size: tuple[int, int],
+        image: bytes,
+        text: Optional[str] = None,
+    ) -> LLMContextMessage:
+        """Create a context message containing an image.
+
+        Args:
+            role: The role of this message (defaults to "user").
+            format: Image format (e.g., 'RGB', 'RGBA').
+            size: Image dimensions as (width, height) tuple.
+            image: Raw image bytes.
+            text: Optional text to include with the image.
+        """
+        buffer = io.BytesIO()
+        Image.frombytes(format, size, image).save(buffer, format="JPEG")
+        encoded_image = base64.b64encode(buffer.getvalue()).decode("utf-8")
+        url = f"data:image/jpeg;base64,{encoded_image}"
+
+        return LLMContext.create_image_url_message(role=role, url=url, text=text)
+
+    @staticmethod
+    def create_audio_message(
+        *, role: str = "user", audio_frames: list[AudioRawFrame], text: str = "Audio follows"
+    ) -> LLMContextMessage:
+        """Create a context message containing audio.
+
+        Args:
+            role: The role of this message (defaults to "user").
+            audio_frames: List of audio frame objects to include.
+            text: Optional text to include with the audio.
+        """
+        sample_rate = audio_frames[0].sample_rate
+        num_channels = audio_frames[0].num_channels
+
+        content = []
+        content.append({"type": "text", "text": text})
+        data = b"".join(frame.audio for frame in audio_frames)
+
+        with io.BytesIO() as buffer:
+            with wave.open(buffer, "wb") as wf:
+                wf.setsampwidth(2)
+                wf.setnchannels(num_channels)
+                wf.setframerate(sample_rate)
+                wf.writeframes(data)
+
+            encoded_audio = base64.b64encode(buffer.getvalue()).decode("utf-8")
+
+        content.append(
+            {
+                "type": "input_audio",
+                "input_audio": {"data": encoded_audio, "format": "wav"},
+            }
+        )
+
+        return {"role": role, "content": content}
+
+    @property
+    def messages(self) -> List[LLMContextMessage]:
+        """Get the current messages list.
+
+        NOTE: This is equivalent to calling `get_messages()` with no filter. If
+        you want to filter out LLM-specific messages that don't pertain to your
+        LLM, use `get_messages()` directly.
+
+        Returns:
+            List of conversation messages.
+        """
+        return self.get_messages()
+
+    def get_messages_for_persistent_storage(self) -> List[LLMContextMessage]:
+        """Get messages suitable for persistent storage.
+
+        NOTE: the only reason this method exists is because we're "silently"
+        switching from OpenAILLMContext to LLMContext under the hood in some
+        services and don't want to trip up users who may have been relying on
+        this method, which is part of the public API of OpenAILLMContext but
+        doesn't need to be for LLMContext.
+
+        .. deprecated::
+            Use `get_messages()` instead.
+
+        Returns:
+            List of conversation messages.
+        """
+        import warnings
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "get_messages_for_persistent_storage() is deprecated, use get_messages() instead.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
+        return self.get_messages()
+
    def get_messages(self, llm_specific_filter: Optional[str] = None) -> List[LLMContextMessage]:
        """Get the current messages list.

@@ -89,7 +244,8 @@ class LLMContext:
            llm_specific_filter: Optional filter to return LLM-specific
                messages for the given LLM, in addition to the standard
                messages. If messages end up being filtered, an error will be
-                logged.
+                logged; this is intended to catch accidental use of
+                incompatible LLM-specific messages.

        Returns:
            List of conversation messages.
@@ -166,7 +322,7 @@ class LLMContext:
        self._tool_choice = tool_choice

    def add_image_frame_message(
-        self, *, format: str, size: tuple[int, int], image: bytes, text: str = None
+        self, *, format: str, size: tuple[int, int], image: bytes, text: Optional[str] = None
    ):
        """Add a message containing an image frame.

@@ -176,17 +332,8 @@ class LLMContext:
            image: Raw image bytes.
            text: Optional text to include with the image.
        """
-        buffer = io.BytesIO()
-        Image.frombytes(format, size, image).save(buffer, format="JPEG")
-        encoded_image = base64.b64encode(buffer.getvalue()).decode("utf-8")
-
-        content = []
-        if text:
-            content.append({"type": "text", "text": text})
-        content.append(
-            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"}},
-        )
-        self.add_message({"role": "user", "content": content})
+        message = LLMContext.create_image_message(format=format, size=size, image=image, text=text)
+        self.add_message(message)

    def add_audio_frames_message(
        self, *, audio_frames: list[AudioRawFrame], text: str = "Audio follows"
@@ -197,66 +344,8 @@ class LLMContext:
            audio_frames: List of audio frame objects to include.
            text: Optional text to include with the audio.
        """
-        if not audio_frames:
-            return
-
-        sample_rate = audio_frames[0].sample_rate
-        num_channels = audio_frames[0].num_channels
-
-        content = []
-        content.append({"type": "text", "text": text})
-        data = b"".join(frame.audio for frame in audio_frames)
-        data = bytes(
-            self._create_wav_header(
-                sample_rate,
-                num_channels,
-                16,
-                len(data),
-            )
-            + data
-        )
-        encoded_audio = base64.b64encode(data).decode("utf-8")
-        content.append(
-            {
-                "type": "input_audio",
-                "input_audio": {"data": encoded_audio, "format": "wav"},
-            }
-        )
-        self.add_message({"role": "user", "content": content})
-
-    def _create_wav_header(self, sample_rate, num_channels, bits_per_sample, data_size):
-        """Create a WAV file header for audio data.
-
-        Args:
-            sample_rate: Audio sample rate in Hz.
-            num_channels: Number of audio channels.
-            bits_per_sample: Bits per audio sample.
-            data_size: Size of audio data in bytes.
-
-        Returns:
-            WAV header as a bytearray.
-        """
-        # RIFF chunk descriptor
-        header = bytearray()
-        header.extend(b"RIFF")  # ChunkID
-        header.extend((data_size + 36).to_bytes(4, "little"))  # ChunkSize: total size - 8
-        header.extend(b"WAVE")  # Format
-        # "fmt " sub-chunk
-        header.extend(b"fmt ")  # Subchunk1ID
-        header.extend((16).to_bytes(4, "little"))  # Subchunk1Size (16 for PCM)
-        header.extend((1).to_bytes(2, "little"))  # AudioFormat (1 for PCM)
-        header.extend(num_channels.to_bytes(2, "little"))  # NumChannels
-        header.extend(sample_rate.to_bytes(4, "little"))  # SampleRate
-        # Calculate byte rate and block align
-        byte_rate = sample_rate * num_channels * (bits_per_sample // 8)
-        block_align = num_channels * (bits_per_sample // 8)
-        header.extend(byte_rate.to_bytes(4, "little"))  # ByteRate
-        header.extend(block_align.to_bytes(2, "little"))  # BlockAlign
-        header.extend(bits_per_sample.to_bytes(2, "little"))  # BitsPerSample
-        # "data" sub-chunk
-        header.extend(b"data")  # Subchunk2ID
-        header.extend(data_size.to_bytes(4, "little"))  # Subchunk2Size
-        return header
+        message = LLMContext.create_audio_message(audio_frames=audio_frames, text=text)
+        self.add_message(message)

    @staticmethod
    def _normalize_and_validate_tools(tools: ToolsSchema | NotGiven) -> ToolsSchema | NotGiven:
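
For reference, the inline audio handling removed in the hunk above wrapped raw 16-bit PCM in a WAV container and base64-encoded it into an `input_audio` content part. A minimal standalone sketch of that message shape follows, using the standard-library `wave` module in place of the hand-written header. The name `build_audio_message` is hypothetical, and this diff does not show whether `LLMContext.create_audio_message()` is implemented this way.

```python
import base64
import io
import wave


def build_audio_message(
    pcm: bytes, sample_rate: int, num_channels: int, text: str = "Audio follows"
) -> dict:
    """Hypothetical helper: wrap 16-bit PCM in WAV and build a user message."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(num_channels)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample, as in the removed code
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)  # the header sizes are patched when the writer closes

    encoded_audio = base64.b64encode(buffer.getvalue()).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "input_audio", "input_audio": {"data": encoded_audio, "format": "wav"}},
        ],
    }


# 160 silent mono samples at 16 kHz (320 bytes of PCM).
message = build_audio_message(b"\x00\x00" * 160, sample_rate=16000, num_channels=1)
```

Letting `wave` write the header avoids maintaining the byte-level RIFF layout by hand; the resulting message has the same two content parts as the removed inline code.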
--- a/src/pipecat/processors/aggregators/llm_response.py
+++ b/src/pipecat/processors/aggregators/llm_response.py
@@ -89,7 +89,9 @@ class LLMAssistantAggregatorParams:

    Parameters:
        expect_stripped_words: Whether to expect and handle stripped words
-            in text frames by adding spaces between tokens.
+            in text frames by adding spaces between tokens. This parameter is
+            ignored when used with the newer LLMAssistantAggregator, which
+            handles word spacing automatically.
    """

    expect_stripped_words: bool = True
--- a/src/pipecat/processors/aggregators/llm_response_universal.py
+++ b/src/pipecat/processors/aggregators/llm_response_universal.py
@@ -13,6 +13,7 @@ LLM processing, and text-to-speech components in conversational AI pipelines.

 import asyncio
 import json
+import warnings
 from abc import abstractmethod
 from typing import Any, Dict, List, Literal, Optional, Set

@@ -65,6 +66,7 @@ from pipecat.processors.aggregators.llm_response import (
    LLMUserAggregatorParams,
 )
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.utils.string import concatenate_aggregated_text
 from pipecat.utils.time import time_now_iso8601


@@ -88,7 +90,7 @@ class LLMContextAggregator(FrameProcessor):
        self._context = context
        self._role = role

-        self._aggregation: str = ""
+        self._aggregation: List[str] = []

    @property
    def messages(self) -> List[LLMContextMessage]:
@@ -168,13 +170,21 @@ class LLMContextAggregator(FrameProcessor):

    async def reset(self):
        """Reset the aggregation state."""
-        self._aggregation = ""
+        self._aggregation = []

    @abstractmethod
    async def push_aggregation(self):
        """Push the current aggregation downstream."""
        pass

+    def aggregation_string(self) -> str:
+        """Get the current aggregation as a string.
+
+        Returns:
+            The concatenated aggregation string.
+        """
+        return concatenate_aggregated_text(self._aggregation)
+

 class LLMUserAggregator(LLMContextAggregator):
    """User LLM aggregator that processes speech-to-text transcriptions.
@@ -212,8 +222,6 @@ class LLMUserAggregator(LLMContextAggregator):
        self._turn_params: Optional[SmartTurnParams] = None

        if "aggregation_timeout" in kwargs:
-            import warnings
-
            with warnings.catch_warnings():
                warnings.simplefilter("always")
                warnings.warn(
@@ -290,6 +298,12 @@ class LLMUserAggregator(LLMContextAggregator):
            await self._handle_llm_messages_update(frame)
        elif isinstance(frame, LLMSetToolsFrame):
            self.set_tools(frame.tools)
+            # Push the LLMSetToolsFrame as well, since speech-to-speech LLM
+            # services (like OpenAI Realtime) may need to know about tool
+            # changes; unlike text-based LLM services they won't just "pick up
+            # the change" on the next LLM run, as the LLM is continuously
+            # running.
+            await self.push_frame(frame, direction)
        elif isinstance(frame, LLMSetToolChoiceFrame):
            self.set_tool_choice(frame.tool_choice)
        elif isinstance(frame, SpeechControlParamsFrame):
@@ -301,7 +315,7 @@ class LLMUserAggregator(LLMContextAggregator):

    async def _process_aggregation(self):
        """Process the current aggregation and push it downstream."""
-        aggregation = self._aggregation
+        aggregation = self.aggregation_string()
        await self.reset()
        self._context.add_message({"role": self.role, "content": aggregation})
        frame = LLMContextFrame(self._context)
@@ -349,7 +363,7 @@ class LLMUserAggregator(LLMContextAggregator):
        """

        async def should_interrupt(strategy: BaseInterruptionStrategy):
-            await strategy.append_text(self._aggregation)
+            await strategy.append_text(self.aggregation_string())
            return await strategy.should_interrupt()

        return any([await should_interrupt(s) for s in self._interruption_strategies])
@@ -419,7 +433,7 @@ class LLMUserAggregator(LLMContextAggregator):
        if not text.strip():
            return

-        self._aggregation += f" {text}" if self._aggregation else text
+        self._aggregation.append(text)
        # We just got a final result, so let's reset interim results.
        self._seen_interim_results = False
        # Reset aggregation timer.
@@ -544,23 +558,31 @@ class LLMAssistantAggregator(LLMContextAggregator):
        Args:
            context: The OpenAI LLM context for conversation storage.
            params: Configuration parameters for aggregation behavior.
-            **kwargs: Additional arguments. Supports deprecated 'expect_stripped_words'.
+            **kwargs: Additional arguments.
        """
        super().__init__(context=context, role="assistant", **kwargs)
        self._params = params or LLMAssistantAggregatorParams()

        if "expect_stripped_words" in kwargs:
-            import warnings
-
            with warnings.catch_warnings():
                warnings.simplefilter("always")
                warnings.warn(
-                    "Parameter 'expect_stripped_words' is deprecated, use 'params' instead.",
+                    "Parameter 'expect_stripped_words' is deprecated. "
+                    "LLMAssistantAggregator now handles word spacing automatically.",
                    DeprecationWarning,
                )

            self._params.expect_stripped_words = kwargs["expect_stripped_words"]

+        if params and not params.expect_stripped_words:
+            with warnings.catch_warnings():
+                warnings.simplefilter("always")
+                warnings.warn(
+                    "params.expect_stripped_words is deprecated. "
+                    "LLMAssistantAggregator now handles word spacing automatically.",
+                    DeprecationWarning,
+                )
+
        self._started = 0
        self._function_calls_in_progress: Dict[str, Optional[FunctionCallInProgressFrame]] = {}
        self._context_updated_tasks: Set[asyncio.Task] = set()
@@ -610,7 +632,7 @@ class LLMAssistantAggregator(LLMContextAggregator):
            await self._handle_function_call_result(frame)
        elif isinstance(frame, FunctionCallCancelFrame):
            await self._handle_function_call_cancel(frame)
-        elif isinstance(frame, UserImageRawFrame) and frame.request and frame.request.tool_call_id:
+        elif isinstance(frame, UserImageRawFrame):
            await self._handle_user_image_frame(frame)
        elif isinstance(frame, BotStoppedSpeakingFrame):
            await self.push_aggregation()
@@ -623,7 +645,7 @@ class LLMAssistantAggregator(LLMContextAggregator):
        if not self._aggregation:
            return

-        aggregation = self._aggregation.strip()
+        aggregation = self.aggregation_string()
        await self.reset()

        if aggregation:
@@ -761,27 +783,16 @@ class LLMAssistantAggregator(LLMContextAggregator):
                message["content"] = result

    async def _handle_user_image_frame(self, frame: UserImageRawFrame):
-        logger.debug(
-            f"{self} UserImageRawFrame: [{frame.request.function_name}:{frame.request.tool_call_id}]"
-        )
-
-        if frame.request.tool_call_id not in self._function_calls_in_progress:
-            logger.warning(
-                f"UserImageRawFrame tool_call_id [{frame.request.tool_call_id}] is not running"
-            )
+        if not frame.append_to_context:
            return

-        del self._function_calls_in_progress[frame.request.tool_call_id]
+        logger.debug(f"{self} Appending UserImageRawFrame to LLM context (size: {frame.size})")

-        # Update context with the image frame
-        self._update_function_call_result(
-            frame.request.function_name, frame.request.tool_call_id, "COMPLETED"
-        )
        self._context.add_image_frame_message(
            format=frame.format,
            size=frame.size,
            image=frame.image,
-            text=frame.request.context,
+            text=frame.text,
        )

        await self.push_aggregation()
@@ -798,10 +809,11 @@ class LLMAssistantAggregator(LLMContextAggregator):
        if not self._started:
            return

-        if self._params.expect_stripped_words:
-            self._aggregation += f" {frame.text}" if self._aggregation else frame.text
-        else:
-            self._aggregation += frame.text
+        # Make sure we really have text (spaces count, too!)
+        if len(frame.text) == 0:
+            return
+
+        self._aggregation.append(frame.text)

    def _context_updated_task_finished(self, task: asyncio.Task):
        self._context_updated_tasks.discard(task)
--- a/src/pipecat/processors/aggregators/user_response.py
+++ b/src/pipecat/processors/aggregators/user_response.py
@@ -27,11 +27,24 @@ class UserResponseAggregator(LLMUserAggregator):
    def __init__(self, **kwargs):
        """Initialize the user response aggregator.

+        .. deprecated:: 0.0.92
+            `UserResponseAggregator` is deprecated and will be removed in a future version.
+
        Args:
            **kwargs: Additional arguments passed to parent LLMUserAggregator.
        """
        super().__init__(context=LLMContext(), **kwargs)

+        import warnings
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "`UserResponseAggregator` is deprecated and will be removed in a future version.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
    async def push_aggregation(self):
        """Push the aggregated user response as a TextFrame.

--- a/src/pipecat/processors/filters/function_filter.py
+++ b/src/pipecat/processors/filters/function_filter.py
@@ -12,7 +12,7 @@ allowing for flexible frame filtering logic in processing pipelines.

 from typing import Awaitable, Callable

-from pipecat.frames.frames import EndFrame, Frame, SystemFrame
+from pipecat.frames.frames import CancelFrame, EndFrame, Frame, StartFrame, SystemFrame
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


@@ -28,6 +28,7 @@ class FunctionFilter(FrameProcessor):
        self,
        filter: Callable[[Frame], Awaitable[bool]],
        direction: FrameDirection = FrameDirection.DOWNSTREAM,
+        filter_system_frames: bool = False,
    ):
        """Initialize the function filter.

@@ -36,22 +37,32 @@ class FunctionFilter(FrameProcessor):
                frame should pass through, False otherwise.
            direction: The direction to apply filtering. Only frames moving in
                this direction will be filtered. Defaults to DOWNSTREAM.
+            filter_system_frames: Whether to filter system frames. Defaults to False.
        """
        super().__init__()
        self._filter = filter
        self._direction = direction
+        self._filter_system_frames = filter_system_frames

    #
    # Frame processor
    #

-    # Ignore system frames, end frames and frames that are not following the
-    # direction of this gate
    def _should_passthrough_frame(self, frame, direction):
        """Check if a frame should pass through without filtering."""
-        # Ignore system frames, end frames and frames that are not following the
-        # direction of this gate
-        return isinstance(frame, (SystemFrame, EndFrame)) or direction != self._direction
+        # Always passthrough frames in the wrong direction
+        if direction != self._direction:
+            return True
+
+        # Always passthrough lifecycle frames
+        if isinstance(frame, (StartFrame, EndFrame, CancelFrame)):
+            return True
+
+        # If not filtering system frames, passthrough all other system frames
+        if not self._filter_system_frames and isinstance(frame, SystemFrame):
+            return True
+
+        return False

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        """Process a frame through the filter.
--- a/src/pipecat/processors/frameworks/rtvi.py
+++ b/src/pipecat/processors/frameworks/rtvi.py
@@ -1018,6 +1018,7 @@ class RTVIObserver(BaseObserver):

        if (
            isinstance(frame, (UserStartedSpeakingFrame, UserStoppedSpeakingFrame))
+            and (direction == FrameDirection.DOWNSTREAM)
            and self._params.user_speaking_enabled
        ):
            await self._handle_interruptions(frame)
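
The `FunctionFilter` change earlier in this diff (in `function_filter.py`) replaces a single boolean expression with three ordered passthrough rules. The sketch below restates those rules as a standalone function so they can be exercised in isolation. The frame classes and `Direction` enum here are hypothetical stand-ins, not Pipecat's real types, and the real class hierarchy may differ.

```python
from enum import Enum


class Direction(Enum):
    DOWNSTREAM = 1
    UPSTREAM = 2


# Hypothetical stand-ins for Pipecat's frame classes (assumed hierarchy).
class Frame: ...
class SystemFrame(Frame): ...
class StartFrame(SystemFrame): ...
class CancelFrame(SystemFrame): ...
class EndFrame(Frame): ...
class TextFrame(Frame): ...


def should_passthrough(frame, direction, filter_direction, filter_system_frames=False):
    """Return True if the frame bypasses the user-supplied filter."""
    # Rule 1: frames travelling in the other direction are never filtered.
    if direction != filter_direction:
        return True
    # Rule 2: lifecycle frames always pass, even when system frames are filtered.
    if isinstance(frame, (StartFrame, EndFrame, CancelFrame)):
        return True
    # Rule 3: other system frames pass unless filtering them was requested.
    if not filter_system_frames and isinstance(frame, SystemFrame):
        return True
    return False
```

With `filter_system_frames=True`, a plain system frame reaches the filter callback, while `StartFrame`, `EndFrame`, and `CancelFrame` still bypass it.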
--- a/src/pipecat/processors/transcript_processor.py
+++ b/src/pipecat/processors/transcript_processor.py
@@ -26,6 +26,7 @@ from pipecat.frames.frames import (
    TTSTextFrame,
 )
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.utils.string import concatenate_aggregated_text
 from pipecat.utils.time import time_now_iso8601


@@ -140,29 +141,7 @@ class AssistantTranscriptProcessor(BaseTranscriptProcessor):
                Result: "Hello there how are you"
        """
        if self._current_text_parts and self._aggregation_start_time:
-            # Check specifically for space characters, previously isspace() was used
-            # but that includes all whitespace characters (e.g. \n), not just spaces.
-            has_leading_spaces = any(
-                part and part[0] == " " for part in self._current_text_parts[1:]
-            )
-            has_trailing_spaces = any(
-                part and part[-1] == " " for part in self._current_text_parts[:-1]
-            )
-
-            # If there are embedded spaces in the fragments, use direct concatenation
-            contains_spacing_between_fragments = has_leading_spaces or has_trailing_spaces
-
-            # Apply corresponding joining method
-            if contains_spacing_between_fragments:
-                # Fragments already have spacing - just concatenate
-                content = "".join(self._current_text_parts)
-            else:
-                # Word-by-word fragments - join with spaces
-                content = " ".join(self._current_text_parts)
-
-            # Clean up any excessive whitespace
-            content = content.strip()
-
+            content = concatenate_aggregated_text(self._current_text_parts)
            if content:
                logger.trace(f"Emitting aggregated assistant message: {content}")
                message = TranscriptionMessage(
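
The joining heuristic removed in the hunk above can be written as a small standalone function, which is a reasonable way to picture what a shared helper such as `concatenate_aggregated_text` needs to do. This is only a sketch based on the removed lines: the name `join_text_parts` is hypothetical, and the actual implementation in `pipecat.utils.string` is not shown in this diff and may differ.

```python
from typing import List


def join_text_parts(parts: List[str]) -> str:
    """Join text fragments, adding spaces only when the fragments lack them."""
    # Check for literal space characters only (not all whitespace, e.g. "\n").
    has_leading_spaces = any(part and part[0] == " " for part in parts[1:])
    has_trailing_spaces = any(part and part[-1] == " " for part in parts[:-1])

    if has_leading_spaces or has_trailing_spaces:
        # Fragments already carry their own spacing: concatenate directly.
        content = "".join(parts)
    else:
        # Word-by-word fragments: join with single spaces.
        content = " ".join(parts)

    return content.strip()
```

Storing fragments in a list and joining once at the end, as the aggregators in this diff now do with `self._aggregation`, lets the spacing decision consider all fragments together instead of being made one append at a time.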
--- a/src/pipecat/runner/daily.py
+++ b/src/pipecat/runner/daily.py
@@ -44,6 +44,8 @@ from loguru import logger
 from pydantic import BaseModel

 from pipecat.transports.daily.utils import (
+    DailyMeetingTokenParams,
+    DailyMeetingTokenProperties,
    DailyRESTHelper,
    DailyRoomParams,
    DailyRoomProperties,
@@ -76,12 +78,15 @@ class DailyRoomConfig(BaseModel):
 async def configure(
    aiohttp_session: aiohttp.ClientSession,
    *,
+    api_key: Optional[str] = None,
    room_exp_duration: Optional[float] = 2.0,
    token_exp_duration: Optional[float] = 2.0,
    sip_caller_phone: Optional[str] = None,
    sip_enable_video: Optional[bool] = False,
    sip_num_endpoints: Optional[int] = 1,
    sip_codecs: Optional[Dict[str, List[str]]] = None,
+    room_properties: Optional[DailyRoomProperties] = None,
+    token_properties: Optional[DailyMeetingTokenProperties] = None,
 ) -> DailyRoomConfig:
    """Configure Daily room URL and token with optional SIP capabilities.

@@ -91,6 +96,7 @@ async def configure(

    Args:
        aiohttp_session: HTTP session for making API requests.
+        api_key: Daily API key. If None, the DAILY_API_KEY environment variable is used.
        room_exp_duration: Room expiration time in hours.
        token_exp_duration: Token expiration time in hours.
        sip_caller_phone: Phone number or identifier for SIP display name.
@@ -99,6 +105,13 @@ async def configure(
        sip_num_endpoints: Number of allowed SIP endpoints.
        sip_codecs: Codecs to support for audio and video. If None, uses Daily defaults.
            Example: {"audio": ["OPUS"], "video": ["H264"]}
+        room_properties: Optional DailyRoomProperties to use instead of building from
+            individual parameters. When provided, this overrides room_exp_duration and
+            SIP-related parameters. If not provided, properties are built from the
+            individual parameters as before.
+        token_properties: Optional DailyMeetingTokenProperties to customize the meeting
+            token. When provided, these properties are passed to the token creation API.
+            Note that room_name, exp, and is_owner will be set automatically.

    Returns:
        DailyRoomConfig: Object with room_url, token, and optional sip_endpoint.
@@ -115,18 +128,48 @@ async def configure(
        # SIP-enabled room
        sip_config = await configure(session, sip_caller_phone="+15551234567")
        print(f"SIP endpoint: {sip_config.sip_endpoint}")
+
+        # Custom room properties with recording enabled
+        custom_props = DailyRoomProperties(
+            enable_recording="cloud",
+            max_participants=2,
+        )
+        config = await configure(session, room_properties=custom_props)
    """
    # Check for required API key
-    api_key = os.getenv("DAILY_API_KEY")
+    api_key = api_key or os.getenv("DAILY_API_KEY")
    if not api_key:
        raise Exception(
            "DAILY_API_KEY environment variable is required. "
            "Get your API key from https://dashboard.daily.co/developers"
        )

+    # Warn if both room_properties and individual room parameters are provided
+    if room_properties is not None:
+        individual_params_provided = any(
+            [
+                room_exp_duration != 2.0,
+                # token_exp_duration is not checked: it still sets the token expiry below
+                sip_caller_phone is not None,
+                sip_enable_video is not False,
+                sip_num_endpoints != 1,
+                sip_codecs is not None,
+            ]
+        )
+        if individual_params_provided:
+            logger.warning(
+                "Both room_properties and individual room parameters (room_exp_duration, "
+                "sip_*) were provided. The room_properties will be used and those individual "
+                "parameters will be ignored when creating the room."
+            )
+
    # Determine if SIP mode is enabled
    sip_enabled = sip_caller_phone is not None

+    # If room_properties is provided, check if it has SIP configuration
+    if room_properties and room_properties.sip:
+        sip_enabled = True
+
    daily_rest_helper = DailyRESTHelper(
        daily_api_key=api_key,
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
@@ -142,7 +185,10 @@ async def configure(

        # Create token and return standard format
        expiry_time: float = token_exp_duration * 60 * 60
-        token = await daily_rest_helper.get_token(room_url, expiry_time)
+        token_params = None
+        if token_properties:
+            token_params = DailyMeetingTokenParams(properties=token_properties)
+        token = await daily_rest_helper.get_token(room_url, expiry_time, params=token_params)
        return DailyRoomConfig(room_url=room_url, token=token)

    # Create a new room
@@ -150,27 +196,29 @@ async def configure(
    room_name = f"{room_prefix}-{uuid.uuid4().hex[:8]}"
    logger.info(f"Creating new Daily room: {room_name}")

-    # Calculate expiration time
-    expiration_time = time.time() + (room_exp_duration * 60 * 60)
+    # Use provided room_properties or build from parameters
+    if room_properties is None:
+        # Calculate expiration time
+        expiration_time = time.time() + (room_exp_duration * 60 * 60)

-    # Create room properties
-    room_properties = DailyRoomProperties(
-        exp=expiration_time,
-        eject_at_room_exp=True,
-    )
-
-    # Add SIP configuration if enabled
-    if sip_enabled:
-        sip_params = DailyRoomSipParams(
-            display_name=sip_caller_phone,
-            video=sip_enable_video,
-            sip_mode="dial-in",
-            num_endpoints=sip_num_endpoints,
-            codecs=sip_codecs,
+        # Create room properties
+        room_properties = DailyRoomProperties(
+            exp=expiration_time,
+            eject_at_room_exp=True,
        )
-        room_properties.sip = sip_params
-        room_properties.enable_dialout = True  # Enable outbound calls if needed
-        room_properties.start_video_off = not sip_enable_video  # Voice-only by default
+
+        # Add SIP configuration if enabled
+        if sip_enabled:
+            sip_params = DailyRoomSipParams(
+                display_name=sip_caller_phone,
+                video=sip_enable_video,
+                sip_mode="dial-in",
+                num_endpoints=sip_num_endpoints,
+                codecs=sip_codecs,
+            )
+            room_properties.sip = sip_params
+            room_properties.enable_dialout = True  # Enable outbound calls if needed
+            room_properties.start_video_off = not sip_enable_video  # Voice-only by default

    # Create room parameters
    room_params = DailyRoomParams(name=room_name, properties=room_properties)
@@ -182,7 +230,12 @@ async def configure(

        # Create meeting token
        token_expiry_seconds = token_exp_duration * 60 * 60
-        token = await daily_rest_helper.get_token(room_url, token_expiry_seconds)
+        token_params = None
+        if token_properties:
+            token_params = DailyMeetingTokenParams(properties=token_properties)
+        token = await daily_rest_helper.get_token(
+            room_url, token_expiry_seconds, params=token_params
+        )

        if sip_enabled:
            # Return SIP configuration object
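
The precedence rules introduced in `configure()` above (an explicit `api_key` wins over the environment variable, and an explicit `room_properties` wins over the individual room parameters) can be sketched without any Daily dependencies. The function names are hypothetical and plain dicts stand in for `DailyRoomProperties`; this illustrates the ordering only, not the real API.

```python
import time
from typing import Any, Dict, Mapping, Optional


def resolve_api_key(api_key: Optional[str], env: Mapping[str, str]) -> str:
    """Prefer the explicit argument, then fall back to DAILY_API_KEY."""
    key = api_key or env.get("DAILY_API_KEY")
    if not key:
        raise ValueError("A Daily API key is required")
    return key


def resolve_room_properties(
    room_properties: Optional[Dict[str, Any]],
    room_exp_duration: float = 2.0,
    now: Optional[float] = None,
) -> Dict[str, Any]:
    """Use caller-supplied properties as-is, otherwise build the defaults."""
    if room_properties is not None:
        return room_properties
    now = time.time() if now is None else now
    # room_exp_duration is expressed in hours, as in configure().
    return {"exp": now + room_exp_duration * 60 * 60, "eject_at_room_exp": True}
```

One consequence of this ordering is that a caller who passes `room_properties` becomes responsible for setting an expiration themselves, since the default `exp` is only applied when the properties are built from the individual parameters.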
--- a/src/pipecat/runner/run.py
+++ b/src/pipecat/runner/run.py
@@ -70,16 +70,19 @@ import asyncio
 import mimetypes
 import os
 import sys
+import uuid
 from contextlib import asynccontextmanager
+from http import HTTPMethod
 from pathlib import Path
-from typing import Optional
+from typing import Any, Dict, List, Optional, TypedDict

 import aiohttp
-from fastapi.responses import FileResponse
+from fastapi.responses import FileResponse, Response
 from loguru import logger

 from pipecat.runner.types import (
    DailyRunnerArguments,
+    RunnerArguments,
    SmallWebRTCRunnerArguments,
    WebSocketRunnerArguments,
 )
@@ -166,6 +169,7 @@ def _create_server_app(
    host: str = "localhost",
    proxy: str,
    esp32_mode: bool = False,
+    whatsapp_enabled: bool = False,
    folder: Optional[str] = None,
 ):
    """Create FastAPI app with transport-specific routes."""
@@ -182,7 +186,8 @@ def _create_server_app(
    # Set up transport-specific routes
    if transport_type == "webrtc":
        _setup_webrtc_routes(app, esp32_mode=esp32_mode, host=host, folder=folder)
-        _setup_whatsapp_routes(app)
+        if whatsapp_enabled:
+            _setup_whatsapp_routes(app)
    elif transport_type == "daily":
        _setup_daily_routes(app)
    elif transport_type in TELEPHONY_TRANSPORTS:
@@ -200,8 +205,10 @@ def _setup_webrtc_routes(
    try:
        from pipecat_ai_small_webrtc_prebuilt.frontend import SmallWebRTCPrebuiltUI

-        from pipecat.transports.smallwebrtc.connection import SmallWebRTCConnection
+        from pipecat.transports.smallwebrtc.connection import IceServer, SmallWebRTCConnection
        from pipecat.transports.smallwebrtc.request_handler import (
+            IceCandidate,
+            SmallWebRTCPatchRequest,
            SmallWebRTCRequest,
            SmallWebRTCRequestHandler,
        )
@@ -209,6 +216,16 @@ def _setup_webrtc_routes(
        logger.error(f"WebRTC transport dependencies not installed: {e}")
        return

+    class IceConfig(TypedDict):
+        iceServers: List[IceServer]
+
+    class StartBotResult(TypedDict, total=False):
+        sessionId: str
+        iceConfig: Optional[IceConfig]
+
+    # In-memory store of active sessions: session_id -> session info
+    active_sessions: Dict[str, Dict[str, Any]] = {}
+
    # Mount the frontend
    app.mount("/client", SmallWebRTCPrebuiltUI)

@@ -254,6 +271,74 @@ def _setup_webrtc_routes(
        )
        return answer

+    @app.patch("/api/offer")
+    async def ice_candidate(request: SmallWebRTCPatchRequest):
+        """Handle new WebRTC ICE candidate requests."""
+        logger.debug(f"Received patch request: {request}")
+        await small_webrtc_handler.handle_patch_request(request)
+        return {"status": "success"}
+
+    @app.post("/start")
+    async def rtvi_start(request: Request):
+        """Mimic Pipecat Cloud's /start endpoint."""
+        # Parse the request body
+        try:
+            request_data = await request.json()
+            logger.debug(f"Received request: {request_data}")
+        except Exception as e:
+            logger.error(f"Failed to parse request body: {e}")
+            request_data = {}
+
+        # Store session info in memory immediately, replicating the behavior expected on Pipecat Cloud
+        session_id = str(uuid.uuid4())
+        active_sessions[session_id] = request_data
+
+        result: StartBotResult = {"sessionId": session_id}
+        if request_data.get("enableDefaultIceServers"):
+            result["iceConfig"] = IceConfig(
+                iceServers=[IceServer(urls="stun:stun.l.google.com:19302")]
+            )
+
+        return result
+
+    @app.api_route(
+        "/sessions/{session_id}/{path:path}",
+        methods=["GET", "POST", "PUT", "PATCH", "DELETE"],
+    )
+    async def proxy_request(
+        session_id: str, path: str, request: Request, background_tasks: BackgroundTasks
+    ):
+        """Mimic Pipecat Cloud's proxy."""
+        active_session = active_sessions.get(session_id)
+        if active_session is None:
+            return Response(content="Invalid or not-yet-ready session_id", status_code=404)
+
+        if path.endswith("api/offer"):
+            # Parse the request body and convert to SmallWebRTCRequest
+            try:
+                request_data = await request.json()
+                if request.method == HTTPMethod.POST.value:
+                    webrtc_request = SmallWebRTCRequest(
+                        sdp=request_data["sdp"],
+                        type=request_data["type"],
+                        pc_id=request_data.get("pc_id"),
+                        restart_pc=request_data.get("restart_pc"),
+                        request_data=request_data,
+                    )
+                    return await offer(webrtc_request, background_tasks)
+                elif request.method == HTTPMethod.PATCH.value:
+                    patch_request = SmallWebRTCPatchRequest(
+                        pc_id=request_data["pc_id"],
+                        candidates=[IceCandidate(**c) for c in request_data.get("candidates", [])],
+                    )
+                    return await ice_candidate(patch_request)
+            except Exception as e:
+                logger.error(f"Failed to parse WebRTC request: {e}")
+                return Response(content="Invalid WebRTC request", status_code=400)
+
+        logger.info(f"Received request for path: {path}")
+        return Response(status_code=200)
+
    @asynccontextmanager
    async def smallwebrtc_lifespan(app: FastAPI):
        """Manage FastAPI application lifecycle and cleanup connections."""
@@ -289,6 +374,29 @@ def _add_lifespan_to_app(app: FastAPI, new_lifespan):

 def _setup_whatsapp_routes(app: FastAPI):
    """Set up WebRTC-specific routes."""
+    WHATSAPP_APP_SECRET = os.getenv("WHATSAPP_APP_SECRET")
+    WHATSAPP_PHONE_NUMBER_ID = os.getenv("WHATSAPP_PHONE_NUMBER_ID")
+    WHATSAPP_TOKEN = os.getenv("WHATSAPP_TOKEN")
+    WHATSAPP_WEBHOOK_VERIFICATION_TOKEN = os.getenv("WHATSAPP_WEBHOOK_VERIFICATION_TOKEN")
+
+    if not all(
+        [
+            WHATSAPP_APP_SECRET,
+            WHATSAPP_PHONE_NUMBER_ID,
+            WHATSAPP_TOKEN,
+            WHATSAPP_WEBHOOK_VERIFICATION_TOKEN,
+        ]
+    ):
+        logger.error(
+            """Missing required environment variables for WhatsApp transport:
+    WHATSAPP_APP_SECRET
+    WHATSAPP_PHONE_NUMBER_ID
+    WHATSAPP_TOKEN
+    WHATSAPP_WEBHOOK_VERIFICATION_TOKEN
+            """
+        )
+        return
+
    try:
        from pipecat_ai_small_webrtc_prebuilt.frontend import SmallWebRTCPrebuiltUI

@@ -300,24 +408,7 @@ def _setup_whatsapp_routes(app: FastAPI):
        from pipecat.transports.whatsapp.api import WhatsAppWebhookRequest
        from pipecat.transports.whatsapp.client import WhatsAppClient
    except ImportError as e:
-        logger.error(f"WebRTC transport dependencies not installed: {e}")
-        return
-
-    WHATSAPP_TOKEN = os.getenv("WHATSAPP_TOKEN")
-    WHATSAPP_PHONE_NUMBER_ID = os.getenv("WHATSAPP_PHONE_NUMBER_ID")
-    WHATSAPP_WEBHOOK_VERIFICATION_TOKEN = os.getenv("WHATSAPP_WEBHOOK_VERIFICATION_TOKEN")
-    WHATSAPP_APP_SECRET = os.getenv("WHATSAPP_APP_SECRET")
-
-    if not all(
-        [
-            WHATSAPP_TOKEN,
-            WHATSAPP_PHONE_NUMBER_ID,
-            WHATSAPP_WEBHOOK_VERIFICATION_TOKEN,
-        ]
-    ):
-        logger.debug(
-            "Missing required environment variables for WhatsApp transport. Keeping it disabled."
-        )
+        logger.error(f"WhatsApp transport dependencies not installed: {e}")
        return

    # Global WhatsApp client instance
@@ -439,9 +530,9 @@ def _setup_daily_routes(app: FastAPI):
    """Set up Daily-specific routes."""

    @app.get("/")
-    async def start_agent():
+    async def create_room_and_start_agent():
        """Launch a Daily bot and redirect to room."""
-        print("Starting bot with Daily transport")
+        print("Starting bot with Daily transport and redirecting to Daily room")

        import aiohttp

@@ -456,14 +547,15 @@ def _setup_daily_routes(app: FastAPI):
            asyncio.create_task(bot_module.bot(runner_args))
            return RedirectResponse(room_url)

-    async def _handle_rtvi_request(request: Request):
-        """Common handler for both /start and /connect endpoints.
+    @app.post("/start")
+    async def start_agent(request: Request):
+        """Handle POST requests to the /start endpoint.

        Expects POST body like::
-
            {
                "createDailyRoom": true,
                "dailyRoomProperties": { "start_video_off": true },
+                "dailyMeetingTokenProperties": { "is_owner": true, "user_name": "Bot" },
                "body": { "custom_data": "value" }
            }
        """
@@ -477,47 +569,68 @@ def _setup_daily_routes(app: FastAPI):
            logger.error(f"Failed to parse request body: {e}")
            request_data = {}

-        # Extract the body data that should be passed to the bot
-        # This mimics Pipecat Cloud's behavior
-        bot_body = request_data.get("body", {})
+        create_daily_room = request_data.get("createDailyRoom", False)
+        body = request_data.get("body", {})
+        daily_room_properties_dict = request_data.get("dailyRoomProperties", None)
+        daily_token_properties_dict = request_data.get("dailyMeetingTokenProperties", None)

-        # Log the extracted body data for debugging
-        if bot_body:
-            logger.info(f"Extracted body data for bot: {bot_body}")
+        bot_module = _get_bot_module()
+
+        existing_room_url = os.getenv("DAILY_SAMPLE_ROOM_URL")
+
+        result = None
+
+        # Configure room if:
+        # 1. Explicitly requested via createDailyRoom in payload
+        # 2. Using pre-configured room from DAILY_SAMPLE_ROOM_URL env var
+        if create_daily_room or existing_room_url:
+            import aiohttp
+
+            from pipecat.runner.daily import configure
+            from pipecat.transports.daily.utils import (
+                DailyMeetingTokenProperties,
+                DailyRoomProperties,
+            )
+
+            async with aiohttp.ClientSession() as session:
+                # Parse dailyRoomProperties if provided
+                room_properties = None
+                if daily_room_properties_dict:
+                    try:
+                        room_properties = DailyRoomProperties(**daily_room_properties_dict)
+                        logger.debug(f"Using custom room properties: {room_properties}")
+                    except Exception as e:
+                        logger.error(f"Failed to parse dailyRoomProperties: {e}")
+                        # Continue without custom properties
+
+                # Parse dailyMeetingTokenProperties if provided
+                token_properties = None
+                if daily_token_properties_dict:
+                    try:
+                        token_properties = DailyMeetingTokenProperties(
+                            **daily_token_properties_dict
+                        )
+                        logger.debug(f"Using custom token properties: {token_properties}")
+                    except Exception as e:
+                        logger.error(f"Failed to parse dailyMeetingTokenProperties: {e}")
+                        # Continue without custom properties
+
+                room_url, token = await configure(
+                    session, room_properties=room_properties, token_properties=token_properties
+                )
+                runner_args = DailyRunnerArguments(room_url=room_url, token=token, body=body)
+                result = {
+                    "dailyRoom": room_url,
+                    "dailyToken": token,
+                    "sessionId": str(uuid.uuid4()),
+                }
        else:
-            logger.debug("No body data provided in request")
+            runner_args = RunnerArguments(body=body)

-        import aiohttp
+        # Start the bot in the background
+        asyncio.create_task(bot_module.bot(runner_args))

-        from pipecat.runner.daily import configure
-
-        async with aiohttp.ClientSession() as session:
-            room_url, token = await configure(session)
-
-            # Start the bot in the background with extracted body data
-            bot_module = _get_bot_module()
-            runner_args = DailyRunnerArguments(room_url=room_url, token=token, body=bot_body)
-            asyncio.create_task(bot_module.bot(runner_args))
-            # Match PCC /start endpoint response format:
-            return {"dailyRoom": room_url, "dailyToken": token}
-
-    @app.post("/start")
-    async def rtvi_start(request: Request):
-        """Launch a Daily bot and return connection info for RTVI clients."""
-        return await _handle_rtvi_request(request)
-
-    @app.post("/connect")
-    async def rtvi_connect(request: Request):
-        """Launch a Daily bot and return connection info for RTVI clients.
-
-        .. deprecated:: 0.0.78
-            Use /start instead. This endpoint will be removed in a future version.
-        """
-        logger.warning(
-            "DEPRECATED: /connect endpoint is deprecated. Please use /start instead. "
-            "This endpoint will be removed in a future version."
-        )
-        return await _handle_rtvi_request(request)
+        return result


 def _setup_telephony_routes(app: FastAPI, *, transport_type: str, proxy: str):
@@ -576,8 +689,6 @@ def _setup_telephony_routes(app: FastAPI, *, transport_type: str, proxy: str):
 async def _run_daily_direct():
    """Run Daily bot with direct connection (no FastAPI server)."""
    try:
-        import aiohttp
-
        from pipecat.runner.daily import configure
    except ImportError as e:
        logger.error("Daily transport dependencies not installed.")
@@ -689,6 +800,12 @@ def main():
    parser.add_argument(
        "--verbose", "-v", action="count", default=0, help="Increase logging verbosity"
    )
+    parser.add_argument(
+        "--whatsapp",
+        action="store_true",
+        default=False,
+        help="Ensure required WhatsApp environment variables are present",
+    )

    args = parser.parse_args()

@@ -708,10 +825,6 @@ def main():
        logger.error("For ESP32, you need to specify `--host IP` so we can do SDP munging.")
        return

-    if args.transport in TELEPHONY_TRANSPORTS and not args.proxy:
-        logger.error(f"For telephony transports, you need to specify `--proxy PROXY`.")
-        return
-
    # Log level
    logger.remove()
    logger.add(sys.stderr, level="TRACE" if args.verbose else "DEBUG")
@@ -731,10 +844,11 @@ def main():
        print()
        if args.esp32:
            print(f"🚀 Bot ready! (ESP32 mode)")
-            print(f"   → Open http://{args.host}:{args.port}/client in your browser")
+        elif args.whatsapp:
+            print(f"🚀 Bot ready! (WhatsApp)")
        else:
            print(f"🚀 Bot ready!")
-            print(f"   → Open http://{args.host}:{args.port}/client in your browser")
+        print(f"   → Open http://{args.host}:{args.port}/client in your browser")
        print()
    elif args.transport == "daily":
        print()
@@ -752,6 +866,7 @@ def main():
        host=args.host,
        proxy=args.proxy,
        esp32_mode=args.esp32,
+        whatsapp_enabled=args.whatsapp,
        folder=args.folder,
    )

--- a/src/pipecat/runner/types.py
+++ b/src/pipecat/runner/types.py
@@ -20,9 +20,11 @@ from fastapi import WebSocket
 class RunnerArguments:
    """Base class for runner session arguments."""

-    handle_sigint: bool = field(init=False)
-    handle_sigterm: bool = field(init=False)
-    pipeline_idle_timeout_secs: int = field(init=False)
+    # Use kw_only so subclasses don't need to worry about ordering.
+    handle_sigint: bool = field(init=False, kw_only=True)
+    handle_sigterm: bool = field(init=False, kw_only=True)
+    pipeline_idle_timeout_secs: int = field(init=False, kw_only=True)
+    body: Optional[Any] = field(default_factory=dict, kw_only=True)

    def __post_init__(self):
        self.handle_sigint = False
@@ -42,7 +44,6 @@ class DailyRunnerArguments(RunnerArguments):

    room_url: str
    token: Optional[str] = None
-    body: Optional[Any] = field(default_factory=dict)


@dataclass
@@ -55,7 +56,6 @@ class WebSocketRunnerArguments(RunnerArguments):
    """

    websocket: WebSocket
-    body: Optional[Any] = field(default_factory=dict)


@dataclass
--- a/src/pipecat/services/assemblyai/models.py
+++ b/src/pipecat/services/assemblyai/models.py
@@ -108,6 +108,8 @@ class AssemblyAIConnectionParams(BaseModel):
        end_of_turn_confidence_threshold: Confidence threshold for end-of-turn detection.
        min_end_of_turn_silence_when_confident: Minimum silence duration when confident about end-of-turn.
        max_turn_silence: Maximum silence duration before forcing end-of-turn.
+        keyterms_prompt: List of key terms to guide transcription. Will be JSON serialized before sending.
+        speech_model: Select between English and multilingual models. Defaults to "universal-streaming-english".
    """

    sample_rate: int = 16000
@@ -117,3 +119,7 @@ class AssemblyAIConnectionParams(BaseModel):
    end_of_turn_confidence_threshold: Optional[float] = None
    min_end_of_turn_silence_when_confident: Optional[int] = None
    max_turn_silence: Optional[int] = None
+    keyterms_prompt: Optional[List[str]] = None
+    speech_model: Literal["universal-streaming-english", "universal-streaming-multilingual"] = (
+        "universal-streaming-english"
+    )
--- a/src/pipecat/services/assemblyai/stt.py
+++ b/src/pipecat/services/assemblyai/stt.py
@@ -174,11 +174,16 @@ class AssemblyAISTTService(STTService):

    def _build_ws_url(self) -> str:
        """Build WebSocket URL with query parameters using urllib.parse.urlencode."""
-        params = {
-            k: str(v).lower() if isinstance(v, bool) else v
-            for k, v in self._connection_params.model_dump().items()
-            if v is not None
-        }
+        params = {}
+        for k, v in self._connection_params.model_dump().items():
+            if v is not None:
+                if k == "keyterms_prompt":
+                    params[k] = json.dumps(v)
+                elif isinstance(v, bool):
+                    params[k] = str(v).lower()
+                else:
+                    params[k] = v
+
        if params:
            query_string = urlencode(params)
            return f"{self._api_endpoint_base_url}?{query_string}"
@@ -197,6 +202,8 @@ class AssemblyAISTTService(STTService):
            )
            self._connected = True
            self._receive_task = self.create_task(self._receive_task_handler())
+
+            await self._call_event_handler("on_connected")
        except Exception as e:
            logger.error(f"Failed to connect to AssemblyAI: {e}")
            self._connected = False
@@ -238,6 +245,7 @@ class AssemblyAISTTService(STTService):
            self._websocket = None
            self._connected = False
            self._receive_task = None
+            await self._call_event_handler("on_disconnected")

    async def _receive_task_handler(self):
        """Handle incoming WebSocket messages."""
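Note on the `_build_ws_url` change above: a rough standalone sketch of the same serialization rules, useful for checking what ends up in the query string. The helper name `build_query` and the `some_flag` key are hypothetical; only `keyterms_prompt`, `sample_rate`, `speech_model`, and `max_turn_silence` come from the diff.

```python
import json
from typing import Any, Dict
from urllib.parse import parse_qs, urlencode


def build_query(params: Dict[str, Any]) -> str:
    """Serialize connection params following the rules in the new loop."""
    out: Dict[str, Any] = {}
    for key, value in params.items():
        if value is None:
            continue  # unset options are left out of the URL
        if key == "keyterms_prompt":
            out[key] = json.dumps(value)  # list -> JSON array string
        elif isinstance(value, bool):
            out[key] = str(value).lower()  # True -> "true"
        else:
            out[key] = value
    return urlencode(out)


query = build_query(
    {
        "sample_rate": 16000,
        "max_turn_silence": None,
        "some_flag": True,  # hypothetical boolean option
        "keyterms_prompt": ["Pipecat", "Daily"],
        "speech_model": "universal-streaming-english",
    }
)
# Decode again to confirm the key terms survive URL encoding as JSON.
decoded = parse_qs(query)
print(json.loads(decoded["keyterms_prompt"][0]))
```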
--- a/src/pipecat/services/asyncai/tts.py
+++ b/src/pipecat/services/asyncai/tts.py
@@ -235,6 +235,8 @@ class AsyncAITTSService(InterruptibleTTSService):
            }

            await self._get_websocket().send(json.dumps(init_msg))
+
+            await self._call_event_handler("on_connected")
        except Exception as e:
            logger.error(f"{self} initialization error: {e}")
            self._websocket = None
@@ -252,6 +254,7 @@ class AsyncAITTSService(InterruptibleTTSService):
        finally:
            self._websocket = None
            self._started = False
+            await self._call_event_handler("on_disconnected")

    def _get_websocket(self):
        if self._websocket:
--- a/src/pipecat/services/aws/llm.py
+++ b/src/pipecat/services/aws/llm.py
@@ -720,11 +720,11 @@ class AWSBedrockLLMService(LLMService):
            additional_model_request_fields: Additional model-specific parameters.
        """

-        max_tokens: Optional[int] = Field(default_factory=lambda: 4096, ge=1)
-        temperature: Optional[float] = Field(default_factory=lambda: 0.7, ge=0.0, le=1.0)
-        top_p: Optional[float] = Field(default_factory=lambda: 0.999, ge=0.0, le=1.0)
+        max_tokens: Optional[int] = Field(default=None, ge=1)
+        temperature: Optional[float] = Field(default=None, ge=0.0, le=1.0)
+        top_p: Optional[float] = Field(default=None, ge=0.0, le=1.0)
        stop_sequences: Optional[List[str]] = Field(default_factory=lambda: [])
-        latency: Optional[str] = Field(default_factory=lambda: "standard")
+        latency: Optional[str] = Field(default=None)
        additional_model_request_fields: Optional[Dict[str, Any]] = Field(default_factory=dict)

    def __init__(
@@ -801,6 +801,24 @@ class AWSBedrockLLMService(LLMService):
        """
        return True

+    def _build_inference_config(self) -> Dict[str, Any]:
+        """Build inference config with only the parameters that are set.
+
+        This prevents conflicts with models (e.g., Claude Sonnet 4.5) that don't
+        allow certain parameter combinations like temperature and top_p together.
+
+        Returns:
+            Dictionary containing only the inference parameters that are not None.
+        """
+        inference_config = {}
+        if self._settings["max_tokens"] is not None:
+            inference_config["maxTokens"] = self._settings["max_tokens"]
+        if self._settings["temperature"] is not None:
+            inference_config["temperature"] = self._settings["temperature"]
+        if self._settings["top_p"] is not None:
+            inference_config["topP"] = self._settings["top_p"]
+        return inference_config
+
    async def run_inference(self, context: LLMContext | OpenAILLMContext) -> Optional[str]:
        """Run a one-shot, out-of-band (i.e. out-of-pipeline) inference with the given LLM context.

@@ -826,16 +844,16 @@ class AWSBedrockLLMService(LLMService):
        model_id = self.model_name

        # Prepare request parameters
+        inference_config = self._build_inference_config()
+
        request_params = {
            "modelId": model_id,
            "messages": messages,
-            "inferenceConfig": {
-                "maxTokens": 8192,
-                "temperature": 0.7,
-                "topP": 0.9,
-            },
        }

+        if inference_config:
+            request_params["inferenceConfig"] = inference_config
+
        if system:
            request_params["system"] = system

@@ -974,21 +992,20 @@ class AWSBedrockLLMService(LLMService):
            tools = params_from_context["tools"]
            tool_choice = params_from_context["tool_choice"]

-            # Set up inference config
-            inference_config = {
-                "maxTokens": self._settings["max_tokens"],
-                "temperature": self._settings["temperature"],
-                "topP": self._settings["top_p"],
-            }
+            # Set up inference config - only include parameters that are set
+            inference_config = self._build_inference_config()

            # Prepare request parameters
            request_params = {
                "modelId": self.model_name,
                "messages": messages,
-                "inferenceConfig": inference_config,
                "additionalModelRequestFields": self._settings["additional_model_request_fields"],
            }

+            # Only add inference config if it has parameters
+            if inference_config:
+                request_params["inferenceConfig"] = inference_config
+
            # Add system message
            if system:
                request_params["system"] = system
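Note on `_build_inference_config` above: a minimal standalone sketch of the "only send what is set" pattern, assuming a plain settings dict. The function names `build_inference_config` and `build_request`, and the `example-model-id` value, are hypothetical and are not part of `AWSBedrockLLMService`.

```python
from typing import Any, Dict, Optional

# Maps the setting names used in the diff to the request key names.
_KEY_MAP = {"max_tokens": "maxTokens", "temperature": "temperature", "top_p": "topP"}


def build_inference_config(settings: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Return only the inference parameters that are not None."""
    return {
        api_key: settings[name]
        for name, api_key in _KEY_MAP.items()
        if settings.get(name) is not None
    }


def build_request(model_id: str, settings: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Attach inferenceConfig only when at least one parameter is set."""
    request: Dict[str, Any] = {"modelId": model_id, "messages": []}
    inference_config = build_inference_config(settings)
    if inference_config:
        request["inferenceConfig"] = inference_config
    return request


only_temperature = build_request(
    "example-model-id", {"max_tokens": None, "temperature": 0.2, "top_p": None}
)
nothing_set = build_request(
    "example-model-id", {"max_tokens": None, "temperature": None, "top_p": None}
)
print(only_temperature)
print(nothing_set)
```

With every default now `None`, a caller who sets only `temperature` no longer sends `topP` alongside it, which is the parameter combination the docstring says some models reject.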
--- a/src/pipecat/services/aws/nova_sonic/context.py
+++ b/src/pipecat/services/aws/nova_sonic/context.py
@@ -8,8 +8,77 @@

 This module provides specialized context aggregators and message handling for AWS Nova Sonic,
 including conversation history management and role-specific message processing.
+
+.. deprecated:: 0.0.91
+    AWS Nova Sonic no longer uses types from this module under the hood.
+    It now uses `LLMContext` and `LLMContextAggregatorPair`.
+    Using the new patterns should allow you to not need types from this module.
+
+    BEFORE:
+    ```
+    # Setup
+    context = OpenAILLMContext(messages, tools)
+    context_aggregator = llm.create_context_aggregator(context)
+
+    # Context frame type
+    frame: OpenAILLMContextFrame
+
+    # Context type
+    context: AWSNovaSonicLLMContext
+    # or
+    context: OpenAILLMContext
+    ```
+
+    AFTER:
+    ```
+    # Setup
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    # Context frame type
+    frame: LLMContextFrame
+
+    # Context type
+    context: LLMContext
+    ```
 """

+import warnings
+
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Types in pipecat.services.aws.nova_sonic.context (or "
+        "pipecat.services.aws_nova_sonic.context) are deprecated. \n"
+        "AWS Nova Sonic no longer uses types from this module under the hood. \n"
+        "It now uses `LLMContext` and `LLMContextAggregatorPair`. \n"
+        "Using the new patterns should allow you to not need types from this module.\n\n"
+        "BEFORE:\n"
+        "```\n"
+        "# Setup\n"
+        "context = OpenAILLMContext(messages, tools)\n"
+        "context_aggregator = llm.create_context_aggregator(context)\n\n"
+        "# Context frame type\n"
+        "frame: OpenAILLMContextFrame\n\n"
+        "# Context type\n"
+        "context: AWSNovaSonicLLMContext\n"
+        "# or\n"
+        "context: OpenAILLMContext\n\n"
+        "```\n\n"
+        "AFTER:\n"
+        "```\n"
+        "# Setup\n"
+        "context = LLMContext(messages, tools)\n"
+        "context_aggregator = LLMContextAggregatorPair(context)\n\n"
+        "# Context frame type\n"
+        "frame: LLMContextFrame\n\n"
+        "# Context type\n"
+        "context: LLMContext\n\n"
+        "```",
+        DeprecationWarning,
+        stacklevel=2,
+    )
+
 import copy
 from dataclasses import dataclass, field
 from enum import Enum
--- a/src/pipecat/services/aws/nova_sonic/llm.py
+++ b/src/pipecat/services/aws/nova_sonic/llm.py
@@ -25,7 +25,7 @@ from loguru import logger
 from pydantic import BaseModel, Field

 from pipecat.adapters.schemas.tools_schema import ToolsSchema
-from pipecat.adapters.services.aws_nova_sonic_adapter import AWSNovaSonicLLMAdapter
+from pipecat.adapters.services.aws_nova_sonic_adapter import AWSNovaSonicLLMAdapter, Role
 from pipecat.frames.frames import (
    BotStoppedSpeakingFrame,
    CancelFrame,
@@ -33,35 +33,30 @@ from pipecat.frames.frames import (
    Frame,
    FunctionCallFromLLM,
    InputAudioRawFrame,
-    InterimTranscriptionFrame,
+    InterruptionFrame,
    LLMContextFrame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
-    LLMTextFrame,
    StartFrame,
    TranscriptionFrame,
    TTSAudioRawFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
    TTSTextFrame,
+    UserStartedSpeakingFrame,
+    UserStoppedSpeakingFrame,
 )
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response import (
    LLMAssistantAggregatorParams,
    LLMUserAggregatorParams,
 )
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.aggregators.openai_llm_context import (
    OpenAILLMContext,
    OpenAILLMContextFrame,
 )
 from pipecat.processors.frame_processor import FrameDirection
-from pipecat.services.aws.nova_sonic.context import (
-    AWSNovaSonicAssistantContextAggregator,
-    AWSNovaSonicContextAggregatorPair,
-    AWSNovaSonicLLMContext,
-    AWSNovaSonicUserContextAggregator,
-    Role,
-)
-from pipecat.services.aws.nova_sonic.frames import AWSNovaSonicFunctionCallResultFrame
 from pipecat.services.llm_service import LLMService
 from pipecat.utils.time import time_now_iso8601

@@ -217,6 +212,11 @@ class AWSNovaSonicLLMService(LLMService):
            system_instruction: System-level instruction for the model.
            tools: Available tools/functions for the model to use.
            send_transcription_frames: Whether to emit transcription frames.
+
+                .. deprecated:: 0.0.91
+                    This parameter is deprecated and will be removed in a future version.
+                    Transcription frames are always sent.
+
            **kwargs: Additional arguments passed to the parent LLMService.
        """
        super().__init__(**kwargs)
@@ -230,8 +230,20 @@ class AWSNovaSonicLLMService(LLMService):
        self._params = params or Params()
        self._system_instruction = system_instruction
        self._tools = tools
-        self._send_transcription_frames = send_transcription_frames
-        self._context: Optional[AWSNovaSonicLLMContext] = None
+
+        if not send_transcription_frames:
+            import warnings
+
+            with warnings.catch_warnings():
+                warnings.simplefilter("always")
+                warnings.warn(
+                    "`send_transcription_frames` is deprecated and will be removed in a future version. "
+                    "Transcription frames are always sent.",
+                    DeprecationWarning,
+                    stacklevel=2,
+                )
+
+        self._context: Optional[LLMContext] = None
        self._stream: Optional[
            DuplexEventStream[
                InvokeModelWithBidirectionalStreamInput,
@@ -244,12 +256,17 @@ class AWSNovaSonicLLMService(LLMService):
        self._input_audio_content_name: Optional[str] = None
        self._content_being_received: Optional[CurrentContent] = None
        self._assistant_is_responding = False
+        self._may_need_repush_assistant_text = False
        self._ready_to_send_context = False
        self._handling_bot_stopped_speaking = False
        self._triggering_assistant_response = False
+        self._waiting_for_trigger_transcription = False
        self._disconnecting = False
        self._connected_time: Optional[float] = None
        self._wants_connection = False
+        self._user_text_buffer = ""
+        self._assistant_text_buffer = ""
+        self._completed_tool_calls = set()

        file_path = files("pipecat.services.aws.nova_sonic").joinpath("ready.wav")
        with wave.open(file_path.open("rb"), "rb") as wav_file:
@@ -302,12 +319,12 @@ class AWSNovaSonicLLMService(LLMService):
        logger.debug("Resetting conversation")
        await self._handle_bot_stopped_speaking(delay_to_catch_trailing_assistant_text=False)

-        # Carry over previous context through disconnect
+        # Grab context to carry through disconnect/reconnect
        context = self._context
-        await self._disconnect()
-        self._context = context

+        await self._disconnect()
        await self._start_connecting()
+        await self._handle_context(context)

    #
    # frame processing
@@ -322,28 +339,35 @@ class AWSNovaSonicLLMService(LLMService):
        """
        await super().process_frame(frame, direction)

-        if isinstance(frame, OpenAILLMContextFrame):
-            await self._handle_context(frame.context)
-        elif isinstance(frame, LLMContextFrame):
-            raise NotImplementedError(
-                "Universal LLMContext is not yet supported for AWS Nova Sonic."
+        if isinstance(frame, (LLMContextFrame, OpenAILLMContextFrame)):
+            context = (
+                frame.context
+                if isinstance(frame, LLMContextFrame)
+                else LLMContext.from_openai_context(frame.context)
            )
+            await self._handle_context(context)
        elif isinstance(frame, InputAudioRawFrame):
            await self._handle_input_audio_frame(frame)
        elif isinstance(frame, BotStoppedSpeakingFrame):
            await self._handle_bot_stopped_speaking(delay_to_catch_trailing_assistant_text=True)
-        elif isinstance(frame, AWSNovaSonicFunctionCallResultFrame):
-            await self._handle_function_call_result(frame)
+        elif isinstance(frame, InterruptionFrame):
+            await self._handle_interruption_frame()

        await self.push_frame(frame, direction)

-    async def _handle_context(self, context: OpenAILLMContext):
+    async def _handle_context(self, context: LLMContext):
+        if self._disconnecting:
+            return
+
        if not self._context:
-            # We got our initial context - try to finish connecting
-            self._context = AWSNovaSonicLLMContext.upgrade_to_nova_sonic(
-                context, self._system_instruction
-            )
+            # We got our initial context
+            # Try to finish connecting
+            self._context = context
            await self._finish_connecting_if_context_available()
+        else:
+            # We got an updated context
+            # Send results for any newly-completed function calls
+            await self._process_completed_function_calls(send_new_results=True)

    async def _handle_input_audio_frame(self, frame: InputAudioRawFrame):
        # Wait until we're done sending the assistant response trigger audio before sending audio
@@ -393,9 +417,9 @@ class AWSNovaSonicLLMService(LLMService):
        else:
            await finalize_assistant_response()

-    async def _handle_function_call_result(self, frame: AWSNovaSonicFunctionCallResultFrame):
-        result = frame.result_frame
-        await self._send_tool_result(tool_call_id=result.tool_call_id, result=result.result)
+    async def _handle_interruption_frame(self):
+        if self._assistant_is_responding:
+            self._may_need_repush_assistant_text = True

    #
    # LLM communication: lifecycle
@@ -431,6 +455,17 @@ class AWSNovaSonicLLMService(LLMService):
            logger.error(f"{self} initialization error: {e}")
            await self._disconnect()

+    async def _process_completed_function_calls(self, send_new_results: bool):
+        # Check for set of completed function calls in the context
+        for message in self._context.get_messages():
+            if message.get("role") and message.get("content") != "IN_PROGRESS":
+                tool_call_id = message.get("tool_call_id")
+                if tool_call_id and tool_call_id not in self._completed_tool_calls:
+                    # Found a newly-completed function call - send the result to the service
+                    if send_new_results:
+                        await self._send_tool_result(tool_call_id, message.get("content"))
+                    self._completed_tool_calls.add(tool_call_id)
+
    async def _finish_connecting_if_context_available(self):
        # We can only finish connecting once we've gotten our initial context and we're ready to
        # send it
@@ -439,30 +474,38 @@ class AWSNovaSonicLLMService(LLMService):

        logger.info("Finishing connecting (setting up session)...")

+        # Initialize our bookkeeping of already-completed tool calls in the
+        # context
+        await self._process_completed_function_calls(send_new_results=False)
+
        # Read context
-        history = self._context.get_messages_for_initializing_history()
+        adapter: AWSNovaSonicLLMAdapter = self.get_llm_adapter()
+        llm_connection_params = adapter.get_llm_invocation_params(self._context)

        # Send prompt start event, specifying tools.
        # Tools from context take priority over self._tools.
        tools = (
-            self._context.tools
-            if self._context.tools
-            else self.get_llm_adapter().from_standard_tools(self._tools)
+            llm_connection_params["tools"]
+            if llm_connection_params["tools"]
+            else adapter.from_standard_tools(self._tools)
        )
        logger.debug(f"Using tools: {tools}")
        await self._send_prompt_start_event(tools)

        # Send system instruction.
        # Instruction from context takes priority over self._system_instruction.
-        # (NOTE: this prioritizing occurred automatically behind the scenes: the context was
-        # initialized with self._system_instruction and then updated itself from its messages when
-        # get_messages_for_initializing_history() was called).
-        logger.debug(f"Using system instruction: {history.system_instruction}")
-        if history.system_instruction:
-            await self._send_text_event(text=history.system_instruction, role=Role.SYSTEM)
+        system_instruction = (
+            llm_connection_params["system_instruction"]
+            if llm_connection_params["system_instruction"]
+            else self._system_instruction
+        )
+        logger.debug(f"Using system instruction: {system_instruction}")
+        if system_instruction:
+            await self._send_text_event(text=system_instruction, role=Role.SYSTEM)

        # Send conversation history
-        for message in history.messages:
+        for message in llm_connection_params["messages"]:
+            # logger.debug(f"Seeding conversation history with message: {message}")
            await self._send_text_event(text=message.text, role=message.role)

        # Start audio input
@@ -492,9 +535,12 @@ class AWSNovaSonicLLMService(LLMService):
                await self._send_session_end_events()
                self._client = None

+            # Clean up context
+            self._context = None
+
            # Clean up stream
            if self._stream:
-                await self._stream.input_stream.close()
+                await self._stream.close()
                self._stream = None

            # NOTE: see explanation of HACK, below
@@ -510,15 +556,23 @@ class AWSNovaSonicLLMService(LLMService):
                self._receive_task = None

            # Reset remaining connection-specific state
+            # Should be all private state except:
+            # - _wants_connection
+            # - _assistant_response_trigger_audio
            self._prompt_name = None
            self._input_audio_content_name = None
            self._content_being_received = None
            self._assistant_is_responding = False
+            self._may_need_repush_assistant_text = False
            self._ready_to_send_context = False
            self._handling_bot_stopped_speaking = False
            self._triggering_assistant_response = False
+            self._waiting_for_trigger_transcription = False
            self._disconnecting = False
            self._connected_time = None
+            self._user_text_buffer = ""
+            self._assistant_text_buffer = ""
+            self._completed_tool_calls = set()

            logger.info("Finished disconnecting")
        except Exception as e:
@@ -826,6 +880,10 @@ class AWSNovaSonicLLMService(LLMService):
                            # Handle the LLM completion ending
                            await self._handle_completion_end_event(event_json)
        except Exception as e:
+            if self._disconnecting:
+                # Errors are expected while disconnecting, so ignore them
+                # rather than logging or resetting the conversation.
+                return
            logger.error(f"{self} error processing responses: {e}")
            if self._wants_connection:
                await self.reset_conversation()
@@ -956,7 +1014,7 @@ class AWSNovaSonicLLMService(LLMService):
    async def _report_assistant_response_started(self):
        logger.debug("Assistant response started")

-        # Report that the assistant has started their response.
+        # Report the start of the assistant response.
        await self.push_frame(LLMFullResponseStartFrame())

        # Report that equivalent of TTS (this is a speech-to-speech model) started
@@ -968,23 +1026,16 @@ class AWSNovaSonicLLMService(LLMService):

        logger.debug(f"Assistant response text added: {text}")

-        # Report some text added to the ongoing assistant response
-        await self.push_frame(LLMTextFrame(text))
-
-        # Report some text added to the *equivalent* of TTS (this is a speech-to-speech model)
+        # Report the text of the assistant response.
        await self.push_frame(TTSTextFrame(text))

-        # TODO: this is a (hopefully temporary) HACK. Here we directly manipulate the context rather
-        # than relying on the frames pushed to the assistant context aggregator. The pattern of
-        # receiving full-sentence text after the assistant has spoken does not easily fit with the
-        # Pipecat expectation of chunks of text streaming in while the assistant is speaking.
-        # Interruption handling was especially challenging. Rather than spend days trying to fit a
-        # square peg in a round hole, I decided on this hack for the time being. We can most cleanly
-        # abandon this hack if/when AWS Nova Sonic implements streaming smaller text chunks
-        # interspersed with audio. Note that when we move away from this hack, we need to make sure
-        # that on an interruption we avoid sending LLMFullResponseEndFrame, which gets the
-        # LLMAssistantContextAggregator into a bad state.
-        self._context.buffer_assistant_text(text)
+        # HACK: here we're also buffering the assistant text ourselves as a
+        # backup rather than relying solely on the assistant context aggregator
+        # to do it, because the text arrives from Nova Sonic only after all the
+        # assistant audio frames have been pushed, meaning that if an
+        # interruption frame were to arrive we would lose all of it (the text
+        # frames sitting in the queue would be wiped).
+        self._assistant_text_buffer += text

    async def _report_assistant_response_ended(self):
        if not self._context:  # should never happen
@@ -992,14 +1043,34 @@ class AWSNovaSonicLLMService(LLMService):

        logger.debug("Assistant response ended")

-        # Report that the assistant has finished their response.
+        # If an interruption frame arrived while the assistant was responding
+        # we may have lost all of the assistant text (see HACK, above), so
+        # re-push it downstream to the aggregator now.
+        if self._may_need_repush_assistant_text:
+            # Just in case, check that assistant text hasn't already made it
+            # into the context (sometimes it does, despite the interruption).
+            messages = self._context.get_messages()
+            last_message = messages[-1] if messages else None
+            if (
+                not last_message
+                or last_message.get("role") != "assistant"
+                or last_message.get("content") != self._assistant_text_buffer
+            ):
+                # We also need to re-push the LLMFullResponseStartFrame since the
+                # TTSTextFrame would be ignored otherwise (the interruption frame
+                # would have cleared the assistant aggregator state).
+                await self.push_frame(LLMFullResponseStartFrame())
+                await self.push_frame(TTSTextFrame(self._assistant_text_buffer))
+            self._may_need_repush_assistant_text = False
+
+        # Report the end of the assistant response.
        await self.push_frame(LLMFullResponseEndFrame())

        # Report that equivalent of TTS (this is a speech-to-speech model) stopped.
        await self.push_frame(TTSStoppedFrame())

-        # For an explanation of this hack, see _report_assistant_response_text_added.
-        self._context.flush_aggregated_assistant_text()
+        # Clear out the buffered assistant text
+        self._assistant_text_buffer = ""

    #
    # user transcription reporting
@@ -1016,33 +1087,67 @@ class AWSNovaSonicLLMService(LLMService):

        logger.debug(f"User transcription text added: {text}")

-        # Manually add new user transcription text to context.
-        # We can't rely on the user context aggregator to do this since it's upstream from the LLM.
-        self._context.buffer_user_text(text)
-
-        # Report that some new user transcription text is available.
-        if self._send_transcription_frames:
-            await self.push_frame(
-                InterimTranscriptionFrame(text=text, user_id="", timestamp=time_now_iso8601())
-            )
+        # HACK: here we're buffering the user text ourselves rather than
+        # relying on the upstream user context aggregator to do it, because the
+        # text arrives in fairly large chunks spaced fairly far apart in time.
+        # That means the user text would be split between different messages in
+        # context. Even if we sent placeholder InterimTranscriptionFrames in
+        # between each TranscriptionFrame to tell the aggregator to hold off on
+        # finalizing the user message, the aggregator would likely get the last
+        # chunk too late.
+        self._user_text_buffer += f" {text}" if self._user_text_buffer else text

    async def _report_user_transcription_ended(self):
        if not self._context:  # should never happen
            return

-        # Manually add user transcription to context (if any has been buffered).
-        # We can't rely on the user context aggregator to do this since it's upstream from the LLM.
-        transcription = self._context.flush_aggregated_user_text()
-
-        if not transcription:
-            return
-
        logger.debug(f"User transcription ended")

-        if self._send_transcription_frames:
-            await self.push_frame(
-                TranscriptionFrame(text=transcription, user_id="", timestamp=time_now_iso8601())
+        # Report to the upstream user context aggregator that some new user
+        # transcription text is available.
+
+        # HACK: Check if this transcription was triggered by our own
+        # assistant response trigger. If so, we need to wrap it with
+        # UserStarted/StoppedSpeakingFrames; otherwise the user aggregator
+        # would fire an EmulateUserStartedSpeakingFrame, which would
+        # trigger an interruption, which would prevent us from writing the
+        # assistant response to context.
+        #
+        # Sending an EmulateUserStartedSpeakingFrame ourselves doesn't
+        # work: it just causes the interruption we're trying to avoid.
+        #
+        # Setting enable_emulated_vad_interruptions also doesn't work: at
+        # the time the user aggregator receives the TranscriptionFrame, it
+        # doesn't yet know the assistant has started responding, so it
+        # doesn't know that emulating the user starting to speak would
+        # cause an interruption.
+        should_wrap_in_user_started_stopped_speaking_frames = (
+            self._waiting_for_trigger_transcription
+            and self._user_text_buffer.strip().lower() == "ready"
+        )
+
+        # Start wrapping the upstream transcription in UserStarted/StoppedSpeakingFrames if needed
+        if should_wrap_in_user_started_stopped_speaking_frames:
+            logger.debug(
+                "Wrapping assistant response trigger transcription with upstream UserStarted/StoppedSpeakingFrames"
            )
+            await self.push_frame(UserStartedSpeakingFrame(), direction=FrameDirection.UPSTREAM)
+
+        # Send the transcription upstream for the user context aggregator
+        frame = TranscriptionFrame(
+            text=self._user_text_buffer, user_id="", timestamp=time_now_iso8601()
+        )
+        await self.push_frame(frame, direction=FrameDirection.UPSTREAM)
+
+        # Finish wrapping the upstream transcription in UserStarted/StoppedSpeakingFrames if needed
+        if should_wrap_in_user_started_stopped_speaking_frames:
+            await self.push_frame(UserStoppedSpeakingFrame(), direction=FrameDirection.UPSTREAM)
+
+        # Clear out the buffered user text
+        self._user_text_buffer = ""
+
+        # We're no longer waiting for a trigger transcription
+        self._waiting_for_trigger_transcription = False

    #
    # context
@@ -1054,23 +1159,26 @@ class AWSNovaSonicLLMService(LLMService):
        *,
        user_params: LLMUserAggregatorParams = LLMUserAggregatorParams(),
        assistant_params: LLMAssistantAggregatorParams = LLMAssistantAggregatorParams(),
-    ) -> AWSNovaSonicContextAggregatorPair:
+    ) -> LLMContextAggregatorPair:
        """Create context aggregator pair for managing conversation context.

+        NOTE: this method exists only for backward compatibility. New code
+        should instead do:
+            context = LLMContext(...)
+            context_aggregator = LLMContextAggregatorPair(context)
+
        Args:
-            context: The OpenAI LLM context to upgrade.
+            context: The OpenAI LLM context.
            user_params: Parameters for the user context aggregator.
            assistant_params: Parameters for the assistant context aggregator.

        Returns:
            A pair of user and assistant context aggregators.
        """
-        context.set_llm_adapter(self.get_llm_adapter())
-
-        user = AWSNovaSonicUserContextAggregator(context=context, params=user_params)
-        assistant = AWSNovaSonicAssistantContextAggregator(context=context, params=assistant_params)
-
-        return AWSNovaSonicContextAggregatorPair(user, assistant)
+        context = LLMContext.from_openai_context(context)
+        return LLMContextAggregatorPair(
+            context, user_params=user_params, assistant_params=assistant_params
+        )

    #
    # assistant response trigger (HACK)
@@ -1108,6 +1216,8 @@ class AWSNovaSonicLLMService(LLMService):
        try:
            logger.debug("Sending assistant response trigger...")

+            self._waiting_for_trigger_transcription = True
+
            chunk_duration = 0.02  # what we might get from InputAudioRawFrame
            chunk_size = int(
                chunk_duration
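
The buffering logic added to `AWSNovaSonicLLMService` above can be read in isolation as two small decisions: how user transcription chunks are joined, and when the buffered assistant text has to be re-pushed after an interruption. The following is a minimal sketch of just those two checks, assuming context messages are plain dicts with `role` and `content` keys. The function names `needs_assistant_text_repush` and `append_user_text` are hypothetical and do not exist in Pipecat.

```python
def needs_assistant_text_repush(messages: list, buffered_text: str) -> bool:
    """Return True if the buffered assistant text is missing from the context.

    Mirrors the condition in _report_assistant_response_ended: a re-push is
    needed unless the last context message is an assistant message whose
    content equals the buffered text.
    """
    last_message = messages[-1] if messages else None
    return (
        not last_message
        or last_message.get("role") != "assistant"
        or last_message.get("content") != buffered_text
    )


def append_user_text(buffer: str, text: str) -> str:
    """Join transcription chunks with a single space, as the user buffer does."""
    return f"{buffer} {text}" if buffer else text
```

In the service itself, a `True` result leads to re-pushing `LLMFullResponseStartFrame` followed by a `TTSTextFrame` carrying the whole buffer. The sketch covers only the decision, not the frame handling.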
--- a/src/pipecat/services/aws/stt.py
+++ b/src/pipecat/services/aws/stt.py
@@ -286,6 +286,7 @@ class AWSTranscribeSTTService(STTService):

                logger.info(f"{self} Successfully connected to AWS Transcribe")

+                await self._call_event_handler("on_connected")
            except Exception as e:
                logger.error(f"{self} Failed to connect to AWS Transcribe: {e}")
                await self._disconnect()
@@ -310,6 +311,7 @@ class AWSTranscribeSTTService(STTService):
            logger.warning(f"{self} Error closing WebSocket connection: {e}")
        finally:
            self._ws_client = None
+            await self._call_event_handler("on_disconnected")

    def language_to_service_language(self, language: Language) -> str | None:
        """Convert internal language enum to AWS Transcribe language code.
--- a/src/pipecat/services/aws_nova_sonic/context.py
+++ b/src/pipecat/services/aws_nova_sonic/context.py
@@ -8,18 +8,14 @@

 This module provides specialized context aggregators and message handling for AWS Nova Sonic,
 including conversation history management and role-specific message processing.
+
+.. deprecated:: 0.0.91
+    AWS Nova Sonic no longer uses types from this module under the hood.
+    It now uses `LLMContext` and `LLMContextAggregatorPair`.
+    With the new patterns you should no longer need types from this module.
+
+    See deprecation warning in pipecat.services.aws.nova_sonic.context for more
+    details.
 """

-import warnings
-
 from pipecat.services.aws.nova_sonic.context import *
-
-with warnings.catch_warnings():
-    warnings.simplefilter("always")
-    warnings.warn(
-        "Types in pipecat.services.aws_nova_sonic.context are deprecated. "
-        "Please use the equivalent types from "
-        "pipecat.services.aws.nova_sonic.context instead.",
-        DeprecationWarning,
-        stacklevel=2,
-    )
--- a/src/pipecat/services/azure/realtime/llm.py
+++ b/src/pipecat/services/azure/realtime/llm.py
@@ -38,7 +38,7 @@ class AzureRealtimeLLMService(OpenAIRealtimeLLMService):
        Args:
            api_key: The API key for the Azure OpenAI service.
            base_url: The full Azure WebSocket endpoint URL including api-version and deployment.
-                Example: "wss://my-project.openai.azure.com/openai/realtime?api-version=2024-10-01-preview&deployment=my-realtime-deployment"
+                Example: "wss://my-project.openai.azure.com/openai/realtime?api-version=2025-04-01-preview&deployment=my-realtime-deployment"
            **kwargs: Additional arguments passed to parent OpenAIRealtimeLLMService.
        """
        super().__init__(base_url=base_url, api_key=api_key, **kwargs)
@@ -52,7 +52,7 @@ class AzureRealtimeLLMService(OpenAIRealtimeLLMService):
                # handle disconnections in the send/recv code paths.
                return

-            logger.info(f"Connecting to {self.base_url}, api key: {self.api_key}")
+            logger.info(f"Connecting to {self.base_url}")
            self._websocket = await websocket_connect(
                uri=self.base_url,
                additional_headers={
--- a/src/pipecat/services/cartesia/stt.py
+++ b/src/pipecat/services/cartesia/stt.py
@@ -28,13 +28,12 @@ from pipecat.frames.frames import (
    UserStoppedSpeakingFrame,
 )
 from pipecat.processors.frame_processor import FrameDirection
-from pipecat.services.stt_service import STTService
+from pipecat.services.stt_service import WebsocketSTTService
 from pipecat.transcriptions.language import Language
 from pipecat.utils.time import time_now_iso8601
 from pipecat.utils.tracing.service_decorators import traced_stt

 try:
-    import websockets
    from websockets.asyncio.client import connect as websocket_connect
    from websockets.protocol import State
 except ModuleNotFoundError as e:
@@ -124,7 +123,7 @@ class CartesiaLiveOptions:
        return cls(**json.loads(json_str))


-class CartesiaSTTService(STTService):
+class CartesiaSTTService(WebsocketSTTService):
    """Speech-to-text service using Cartesia Live API.

    Provides real-time speech transcription through WebSocket connection
@@ -176,8 +175,7 @@ class CartesiaSTTService(STTService):
        self.set_model_name(merged_options.model)
        self._api_key = api_key
        self._base_url = base_url or "api.cartesia.ai"
-        self._connection = None
-        self._receiver_task = None
+        self._receive_task = None

    def can_generate_metrics(self) -> bool:
        """Check if the service can generate processing metrics.
@@ -214,6 +212,27 @@ class CartesiaSTTService(STTService):
        await super().cancel(frame)
        await self._disconnect()

+    async def start_metrics(self):
+        """Start performance metrics collection for transcription processing."""
+        await self.start_ttfb_metrics()
+        await self.start_processing_metrics()
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        """Process incoming frames and handle speech events.
+
+        Args:
+            frame: The frame to process.
+            direction: Direction of frame flow in the pipeline.
+        """
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserStartedSpeakingFrame):
+            await self.start_metrics()
+        elif isinstance(frame, UserStoppedSpeakingFrame):
+            # Send finalize command to flush the transcription session
+            if self._websocket and self._websocket.state is State.OPEN:
+                await self._websocket.send("finalize")
+
    async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
        """Process audio data for speech-to-text transcription.

@@ -224,45 +243,71 @@ class CartesiaSTTService(STTService):
            None - transcription results are handled via WebSocket responses.
        """
        # If the connection is closed, due to timeout, we need to reconnect when the user starts speaking again
-        if not self._connection or self._connection.state is State.CLOSED:
+        if not self._websocket or self._websocket.state is State.CLOSED:
            await self._connect()

-        await self._connection.send(audio)
+        await self._websocket.send(audio)
        yield None

    async def _connect(self):
-        params = self._settings.to_dict()
-        ws_url = f"wss://{self._base_url}/stt/websocket?{urllib.parse.urlencode(params)}"
-        logger.debug(f"Connecting to Cartesia: {ws_url}")
-        headers = {"Cartesia-Version": "2025-04-16", "X-API-Key": self._api_key}
+        await self._connect_websocket()

+        if self._websocket and not self._receive_task:
+            self._receive_task = asyncio.create_task(self._receive_task_handler(self._report_error))
+
+    async def _disconnect(self):
+        if self._receive_task:
+            await self.cancel_task(self._receive_task)
+            self._receive_task = None
+
+        await self._disconnect_websocket()
+
+    async def _connect_websocket(self):
        try:
-            self._connection = await websocket_connect(ws_url, additional_headers=headers)
-            # Setup the receiver task to handle the incoming messages from the Cartesia server
-            if self._receiver_task is None or self._receiver_task.done():
-                self._receiver_task = asyncio.create_task(self._receive_messages())
-            logger.debug(f"Connected to Cartesia")
+            if self._websocket and self._websocket.state is State.OPEN:
+                return
+            logger.debug("Connecting to Cartesia STT")
+
+            params = self._settings.to_dict()
+            ws_url = f"wss://{self._base_url}/stt/websocket?{urllib.parse.urlencode(params)}"
+            headers = {"Cartesia-Version": "2025-04-16", "X-API-Key": self._api_key}
+
+            self._websocket = await websocket_connect(ws_url, additional_headers=headers)
+            await self._call_event_handler("on_connected")
        except Exception as e:
            logger.error(f"{self}: unable to connect to Cartesia: {e}")

-    async def _receive_messages(self):
+    async def _disconnect_websocket(self):
        try:
-            while True:
-                if not self._connection or self._connection.state is State.CLOSED:
-                    break
-
-                message = await self._connection.recv()
-                try:
-                    data = json.loads(message)
-                    await self._process_response(data)
-                except json.JSONDecodeError:
-                    logger.warning(f"Received non-JSON message: {message}")
-        except asyncio.CancelledError:
-            pass
-        except websockets.exceptions.ConnectionClosed as e:
-            logger.debug(f"WebSocket connection closed: {e}")
+            if self._websocket and self._websocket.state is State.OPEN:
+                logger.debug("Disconnecting from Cartesia STT")
+                await self._websocket.close()
        except Exception as e:
-            logger.error(f"Error in message receiver: {e}")
+            logger.error(f"{self} error closing websocket: {e}")
+        finally:
+            self._websocket = None
+            await self._call_event_handler("on_disconnected")
+
+    def _get_websocket(self):
+        if self._websocket:
+            return self._websocket
+        raise Exception("Websocket not connected")
+
+    async def _process_messages(self):
+        async for message in self._get_websocket():
+            try:
+                data = json.loads(message)
+                await self._process_response(data)
+            except json.JSONDecodeError:
+                logger.warning(f"Received non-JSON message: {message}")
+
+    async def _receive_messages(self):
+        while True:
+            await self._process_messages()
+            # Cartesia times out after 5 minutes of inactivity (no keepalive
+            # mechanism is available), so we try to reconnect.
+            logger.debug(f"{self} Cartesia connection was disconnected (timeout?), reconnecting")
+            await self._connect_websocket()

    async def _process_response(self, data):
        if "type" in data:
@@ -316,41 +361,3 @@ class CartesiaSTTService(STTService):
                        language,
                    )
                )
-
-    async def _disconnect(self):
-        if self._receiver_task:
-            self._receiver_task.cancel()
-            try:
-                await self._receiver_task
-            except asyncio.CancelledError:
-                pass
-            except Exception as e:
-                logger.exception(f"Unexpected exception while cancelling task: {e}")
-            self._receiver_task = None
-
-        if self._connection and self._connection.state is State.OPEN:
-            logger.debug("Disconnecting from Cartesia")
-
-            await self._connection.close()
-            self._connection = None
-
-    async def start_metrics(self):
-        """Start performance metrics collection for transcription processing."""
-        await self.start_ttfb_metrics()
-        await self.start_processing_metrics()
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        """Process incoming frames and handle speech events.
-
-        Args:
-            frame: The frame to process.
-            direction: Direction of frame flow in the pipeline.
-        """
-        await super().process_frame(frame, direction)
-
-        if isinstance(frame, UserStartedSpeakingFrame):
-            await self.start_metrics()
-        elif isinstance(frame, UserStoppedSpeakingFrame):
-            # Send finalize command to flush the transcription session
-            if self._connection and self._connection.state is State.OPEN:
-                await self._connection.send("finalize")
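
The new `_receive_messages` / `_process_messages` pair in `CartesiaSTTService` drains the websocket until the server closes it, then reconnects and continues. The following is a rough, self-contained sketch of that pattern, not the service's actual code. `FakeConnection`, `receive_with_reconnect`, and `demo` are hypothetical names, and the loop is bounded by `max_connections` so the example terminates, whereas the service loops indefinitely.

```python
import asyncio
import json


class FakeConnection:
    """Hypothetical stand-in for a websocket: yields messages, then closes."""

    def __init__(self, messages):
        self._messages = list(messages)

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._messages:
            raise StopAsyncIteration  # models the server closing the socket
        return self._messages.pop(0)


async def receive_with_reconnect(connect, handle, max_connections):
    """Drain each connection, then reconnect, up to max_connections times."""
    for _ in range(max_connections):
        connection = await connect()
        async for message in connection:
            try:
                handle(json.loads(message))
            except json.JSONDecodeError:
                pass  # the service logs a warning for non-JSON messages
        # Falling out of the inner loop means the connection ended
        # (for example, an idle timeout), so the outer loop reconnects.


async def demo():
    batches = [['{"type": "transcript", "text": "hi"}', "not json"], ['{"type": "done"}']]
    received = []

    async def connect():
        return FakeConnection(batches.pop(0))

    await receive_with_reconnect(connect, received.append, max_connections=2)
    return received
```

Running `asyncio.run(demo())` collects the parsed messages from both connections and skips the non-JSON one.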
--- a/src/pipecat/services/cartesia/tts.py
+++ b/src/pipecat/services/cartesia/tts.py
@@ -48,6 +48,26 @@ except ModuleNotFoundError as e:
    raise Exception(f"Missing module: {e}")


+class GenerationConfig(BaseModel):
+    """Configuration for Cartesia Sonic-3 generation parameters.
+
+    Sonic-3 interprets these parameters as guidance to ensure natural speech.
+    Test against your content for best results.
+
+    Parameters:
+        volume: Volume multiplier for generated speech. Valid range: [0.5, 2.0]. Default is 1.0.
+        speed: Speed multiplier for generated speech. Valid range: [0.6, 1.5]. Default is 1.0.
+        emotion: Single emotion string to guide the emotional tone. Examples include neutral,
+            angry, excited, content, sad, scared. Over 60 emotions are supported. For best
+            results, use with recommended voices: Leo, Jace, Kyle, Gavin, Maya, Tessa, Dana,
+            and Marian.
+    """
+
+    volume: Optional[float] = None
+    speed: Optional[float] = None
+    emotion: Optional[str] = None
+
+
 def language_to_cartesia_language(language: Language) -> Optional[str]:
    """Convert a Language enum to Cartesia language code.

@@ -101,16 +121,20 @@ class CartesiaTTSService(AudioContextWordTTSService):

        Parameters:
            language: Language to use for synthesis.
-            speed: Voice speed control.
-            emotion: List of emotion controls.
+            speed: Voice speed control for non-Sonic-3 models (literal values).
+            emotion: List of emotion controls for non-Sonic-3 models.

                .. deprecated:: 0.0.68
                        The `emotion` parameter is deprecated and will be removed in a future version.
+
+            generation_config: Generation configuration for Sonic-3 models. Includes volume,
+                speed (numeric), and emotion (string) parameters.
        """

        language: Optional[Language] = Language.EN
        speed: Optional[Literal["slow", "normal", "fast"]] = None
        emotion: Optional[List[str]] = []
+        generation_config: Optional[GenerationConfig] = None

    def __init__(
        self,
@@ -119,7 +143,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
        voice_id: str,
        cartesia_version: str = "2025-04-16",
        url: str = "wss://api.cartesia.ai/tts/websocket",
-        model: str = "sonic-2",
+        model: str = "sonic-3",
        sample_rate: Optional[int] = None,
        encoding: str = "pcm_s16le",
        container: str = "raw",
@@ -135,7 +159,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
            voice_id: ID of the voice to use for synthesis.
            cartesia_version: API version string for Cartesia service.
            url: WebSocket URL for Cartesia TTS API.
-            model: TTS model to use (e.g., "sonic-2").
+            model: TTS model to use (e.g., "sonic-3").
            sample_rate: Audio sample rate. If None, uses default.
            encoding: Audio encoding format.
            container: Audio container format.
@@ -179,6 +203,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
            else "en",
            "speed": params.speed,
            "emotion": params.emotion,
+            "generation_config": params.generation_config,
        }
        self.set_model_name(model)
        self.set_voice(voice_id)
@@ -297,6 +322,11 @@ class CartesiaTTSService(AudioContextWordTTSService):
        if self._settings["speed"]:
            msg["speed"] = self._settings["speed"]

+        if self._settings["generation_config"]:
+            msg["generation_config"] = self._settings["generation_config"].model_dump(
+                exclude_none=True
+            )
+
        return json.dumps(msg)

    async def start(self, frame: StartFrame):
@@ -344,10 +374,11 @@ class CartesiaTTSService(AudioContextWordTTSService):
        try:
            if self._websocket and self._websocket.state is State.OPEN:
                return
-            logger.debug("Connecting to Cartesia")
+            logger.debug("Connecting to Cartesia TTS")
            self._websocket = await websocket_connect(
                f"{self._url}?api_key={self._api_key}&cartesia_version={self._cartesia_version}"
            )
+            await self._call_event_handler("on_connected")
        except Exception as e:
            logger.error(f"{self} initialization error: {e}")
            self._websocket = None
@@ -365,6 +396,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
        finally:
            self._context_id = None
            self._websocket = None
+            await self._call_event_handler("on_disconnected")

    def _get_websocket(self):
        if self._websocket:
@@ -480,23 +512,27 @@ class CartesiaHttpTTSService(TTSService):

        Parameters:
            language: Language to use for synthesis.
-            speed: Voice speed control.
-            emotion: List of emotion controls.
+            speed: Voice speed control for non-Sonic-3 models (literal values).
+            emotion: List of emotion controls for non-Sonic-3 models.

                .. deprecated:: 0.0.68
                        The `emotion` parameter is deprecated and will be removed in a future version.
+
+            generation_config: Generation configuration for Sonic-3 models. Includes volume,
+                speed (numeric), and emotion (string) parameters.
        """

        language: Optional[Language] = Language.EN
        speed: Optional[Literal["slow", "normal", "fast"]] = None
        emotion: Optional[List[str]] = Field(default_factory=list)
+        generation_config: Optional[GenerationConfig] = None

    def __init__(
        self,
        *,
        api_key: str,
        voice_id: str,
-        model: str = "sonic-2",
+        model: str = "sonic-3",
        base_url: str = "https://api.cartesia.ai",
        cartesia_version: str = "2024-11-13",
        sample_rate: Optional[int] = None,
@@ -510,7 +546,7 @@ class CartesiaHttpTTSService(TTSService):
        Args:
            api_key: Cartesia API key for authentication.
            voice_id: ID of the voice to use for synthesis.
-            model: TTS model to use (e.g., "sonic-2").
+            model: TTS model to use (e.g., "sonic-3").
            base_url: Base URL for Cartesia HTTP API.
            cartesia_version: API version string for Cartesia service.
            sample_rate: Audio sample rate. If None, uses default.
@@ -537,6 +573,7 @@ class CartesiaHttpTTSService(TTSService):
            else "en",
            "speed": params.speed,
            "emotion": params.emotion,
+            "generation_config": params.generation_config,
        }
        self.set_voice(voice_id)
        self.set_model_name(model)
@@ -630,6 +667,11 @@ class CartesiaHttpTTSService(TTSService):
            if self._settings["speed"]:
                payload["speed"] = self._settings["speed"]

+            if self._settings["generation_config"]:
+                payload["generation_config"] = self._settings["generation_config"].model_dump(
+                    exclude_none=True
+                )
+
            yield TTSStartedFrame()

            session = await self._client._get_session()
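
Both Cartesia TTS services serialize `generation_config` with `model_dump(exclude_none=True)`, so only the fields the caller set reach the request. The sketch below approximates that behavior with a standard-library dataclass standing in for the pydantic model. `GenerationConfigSketch`, `check_ranges`, and `to_payload` are hypothetical names. The range check is an illustrative assumption based on the documented ranges, since the diff shows no validators on `GenerationConfig`.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class GenerationConfigSketch:
    """Stand-in for GenerationConfig (the real class is a pydantic BaseModel)."""

    volume: Optional[float] = None  # documented range: [0.5, 2.0]
    speed: Optional[float] = None  # documented range: [0.6, 1.5]
    emotion: Optional[str] = None


def check_ranges(config: GenerationConfigSketch) -> None:
    """Hypothetical helper: raise ValueError outside the documented ranges."""
    if config.volume is not None and not 0.5 <= config.volume <= 2.0:
        raise ValueError(f"volume out of range: {config.volume}")
    if config.speed is not None and not 0.6 <= config.speed <= 1.5:
        raise ValueError(f"speed out of range: {config.speed}")


def to_payload(config: GenerationConfigSketch) -> dict:
    """Drop unset fields, approximating model_dump(exclude_none=True)."""
    return {key: value for key, value in asdict(config).items() if value is not None}
```

For example, `to_payload(GenerationConfigSketch(speed=1.2, emotion="excited"))` leaves `volume` out of the resulting dict.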
--- a/src/pipecat/services/deepgram/flux/stt.py
+++ b/src/pipecat/services/deepgram/flux/stt.py
@@ -156,6 +156,12 @@ class DeepgramFluxSTTService(WebsocketSTTService):
        self._language = Language.EN
        self._websocket_url = None
        self._receive_task = None
+        # Flux event handlers
+        self._register_event_handler("on_start_of_turn")
+        self._register_event_handler("on_turn_resumed")
+        self._register_event_handler("on_end_of_turn")
+        self._register_event_handler("on_eager_end_of_turn")
+        self._register_event_handler("on_update")

    async def _connect(self):
        """Connect to WebSocket and start background tasks.
@@ -205,6 +211,7 @@ class DeepgramFluxSTTService(WebsocketSTTService):
                additional_headers={"Authorization": f"Token {self._api_key}"},
            )
            logger.debug("Connected to Deepgram Flux Websocket")
+            await self._call_event_handler("on_connected")
        except Exception as e:
            logger.error(f"{self} initialization error: {e}")
            self._websocket = None
@@ -225,6 +232,9 @@ class DeepgramFluxSTTService(WebsocketSTTService):
                await self._websocket.close()
        except Exception as e:
            logger.error(f"{self} error closing websocket: {e}")
+        finally:
+            self._websocket = None
+            await self._call_event_handler("on_disconnected")

    async def _send_close_stream(self) -> None:
        """Sends a CloseStream control message to the Deepgram Flux WebSocket API.
@@ -519,6 +529,7 @@ class DeepgramFluxSTTService(WebsocketSTTService):
        await self.push_frame(UserStartedSpeakingFrame(), FrameDirection.DOWNSTREAM)
        await self.push_frame(UserStartedSpeakingFrame(), FrameDirection.UPSTREAM)
        await self.start_metrics()
+        await self._call_event_handler("on_start_of_turn", transcript)
        if transcript:
            logger.trace(f"Start of turn transcript: {transcript}")

@@ -533,6 +544,7 @@ class DeepgramFluxSTTService(WebsocketSTTService):
            event: The event type string for logging purposes.
        """
        logger.trace(f"Received event TurnResumed: {event}")
+        await self._call_event_handler("on_turn_resumed")

    async def _handle_end_of_turn(self, transcript: str, data: Dict[str, Any]):
        """Handle EndOfTurn events from Deepgram Flux.
@@ -567,6 +579,7 @@ class DeepgramFluxSTTService(WebsocketSTTService):
        await self.stop_processing_metrics()
        await self.push_frame(UserStoppedSpeakingFrame(), FrameDirection.DOWNSTREAM)
        await self.push_frame(UserStoppedSpeakingFrame(), FrameDirection.UPSTREAM)
+        await self._call_event_handler("on_end_of_turn", transcript)

    async def _handle_eager_end_of_turn(self, transcript: str, data: Dict[str, Any]):
        """Handle EagerEndOfTurn events from Deepgram Flux.
@@ -611,6 +624,7 @@ class DeepgramFluxSTTService(WebsocketSTTService):
                result=data,
            )
        )
+        await self._call_event_handler("on_eager_end_of_turn", transcript)

    async def _handle_update(self, transcript: str):
        """Handle Update events from Deepgram Flux.
@@ -634,3 +648,4 @@ class DeepgramFluxSTTService(WebsocketSTTService):
            # both the "user started speaking" event and the first transcript simultaneously,
            # making this timing measurement meaningless in this context.
            # await self.stop_ttfb_metrics()
+            await self._call_event_handler("on_update", transcript)
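The Flux changes above register five events and fire them with different arguments: `on_turn_resumed` is called with no extra argument, while the other four receive the transcript. The sketch below illustrates that calling convention with a hypothetical stand-in class (`StubFluxService` is not part of Pipecat). It assumes, as with other Pipecat event handlers, that the service instance is passed as the first argument to each handler.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class StubFluxService:
    """Hypothetical stand-in that mimics the register/call pattern in the diff."""

    EVENTS = (
        "on_start_of_turn",
        "on_turn_resumed",
        "on_end_of_turn",
        "on_eager_end_of_turn",
        "on_update",
    )

    def __init__(self) -> None:
        # One handler list per registered event name.
        self._handlers: Dict[str, List[Callable[..., Awaitable[None]]]] = {
            name: [] for name in self.EVENTS
        }

    def event_handler(self, name: str):
        """Decorator that attaches an async handler to a registered event."""
        if name not in self._handlers:
            raise KeyError(f"Event {name!r} is not registered")

        def decorator(fn):
            self._handlers[name].append(fn)
            return fn

        return decorator

    async def _call_event_handler(self, name: str, *args) -> None:
        # Assumption: handlers receive the service first, then event arguments.
        for fn in self._handlers[name]:
            await fn(self, *args)


async def demo() -> List[str]:
    stt = StubFluxService()
    seen: List[str] = []

    @stt.event_handler("on_end_of_turn")
    async def on_end_of_turn(service, transcript):
        seen.append(f"end:{transcript}")

    @stt.event_handler("on_turn_resumed")
    async def on_turn_resumed(service):
        seen.append("resumed")

    # Simulate the two calls the service makes in the hunks above.
    await stt._call_event_handler("on_turn_resumed")
    await stt._call_event_handler("on_end_of_turn", "hello there")
    return seen


events = asyncio.run(demo())
```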
--- a/src/pipecat/services/deepgram/tts.py
+++ b/src/pipecat/services/deepgram/tts.py
@@ -12,6 +12,7 @@ for generating speech from text using various voice models.

 from typing import AsyncGenerator, Optional

+import aiohttp
 from loguru import logger

 from pipecat.frames.frames import (
@@ -117,3 +118,114 @@ class DeepgramTTSService(TTSService):
        except Exception as e:
            logger.exception(f"{self} exception: {e}")
            yield ErrorFrame(f"Error getting audio: {str(e)}")
+
+
+class DeepgramHttpTTSService(TTSService):
+    """Deepgram HTTP text-to-speech service.
+
+    Provides text-to-speech synthesis using Deepgram's HTTP TTS API.
+    Supports various voice models and audio encoding formats with
+    configurable sample rates and quality settings.
+    """
+
+    def __init__(
+        self,
+        *,
+        api_key: str,
+        voice: str = "aura-2-helena-en",
+        aiohttp_session: aiohttp.ClientSession,
+        base_url: str = "https://api.deepgram.com",
+        sample_rate: Optional[int] = None,
+        encoding: str = "linear16",
+        **kwargs,
+    ):
+        """Initialize the Deepgram HTTP TTS service.
+
+        Args:
+            api_key: Deepgram API key for authentication.
+            voice: Voice model to use for synthesis. Defaults to "aura-2-helena-en".
+            aiohttp_session: Shared aiohttp session for HTTP requests with connection pooling.
+            base_url: Custom base URL for Deepgram API. Defaults to "https://api.deepgram.com".
+            sample_rate: Audio sample rate in Hz. If None, uses service default.
+            encoding: Audio encoding format. Defaults to "linear16".
+            **kwargs: Additional arguments passed to parent TTSService class.
+        """
+        super().__init__(sample_rate=sample_rate, **kwargs)
+
+        self._api_key = api_key
+        self._session = aiohttp_session
+        self._base_url = base_url
+        self._settings = {
+            "encoding": encoding,
+        }
+        self.set_voice(voice)
+
+    def can_generate_metrics(self) -> bool:
+        """Check if the service can generate metrics.
+
+        Returns:
+            True, as the Deepgram HTTP TTS service supports metrics generation.
+        """
+        return True
+
+    @traced_tts
+    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
+        """Generate speech from text using Deepgram's TTS API.
+
+        Args:
+            text: The text to synthesize into speech.
+
+        Yields:
+            Frame: Audio frames containing the synthesized speech, plus start/stop frames.
+        """
+        logger.debug(f"{self}: Generating TTS [{text}]")
+
+        # Build the endpoint URL; query parameters are passed separately below
+        url = f"{self._base_url}/v1/speak"
+
+        headers = {"Authorization": f"Token {self._api_key}", "Content-Type": "application/json"}
+
+        params = {
+            "model": self._voice_id,
+            "encoding": self._settings["encoding"],
+            "sample_rate": self.sample_rate,
+            "container": "none",
+        }
+
+        payload = {
+            "text": text,
+        }
+
+        try:
+            await self.start_ttfb_metrics()
+
+            async with self._session.post(
+                url, headers=headers, json=payload, params=params
+            ) as response:
+                if response.status != 200:
+                    error_text = await response.text()
+                    raise Exception(f"HTTP {response.status}: {error_text}")
+
+                await self.start_tts_usage_metrics(text)
+                yield TTSStartedFrame()
+
+                CHUNK_SIZE = self.chunk_size
+
+                first_chunk = True
+                async for chunk in response.content.iter_chunked(CHUNK_SIZE):
+                    if first_chunk:
+                        await self.stop_ttfb_metrics()
+                        first_chunk = False
+
+                    if chunk:
+                        yield TTSAudioRawFrame(
+                            audio=chunk,
+                            sample_rate=self.sample_rate,
+                            num_channels=1,
+                        )
+
+            yield TTSStoppedFrame()
+
+        except Exception as e:
+            logger.exception(f"{self} exception: {e}")
+            yield ErrorFrame(f"Error getting audio: {str(e)}")
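The request that `DeepgramHttpTTSService.run_tts()` assembles above can be restated as a small pure function, which makes the split between query parameters and JSON body easier to see. This is only a sketch: `build_speak_request` is a hypothetical helper, not part of Pipecat, and the `sample_rate` default of 24000 is an assumed value for illustration.

```python
from typing import Any, Dict, Tuple


def build_speak_request(
    api_key: str,
    text: str,
    voice: str = "aura-2-helena-en",
    base_url: str = "https://api.deepgram.com",
    encoding: str = "linear16",
    sample_rate: int = 24000,  # Assumed value, for illustration only.
) -> Tuple[str, Dict[str, str], Dict[str, Any], Dict[str, str]]:
    """Return (url, headers, query params, JSON payload) as built in the diff."""
    url = f"{base_url}/v1/speak"
    headers = {
        "Authorization": f"Token {api_key}",
        "Content-Type": "application/json",
    }
    # The voice model and audio format travel in the query string...
    params = {
        "model": voice,
        "encoding": encoding,
        "sample_rate": sample_rate,
        "container": "none",
    }
    # ...while the text to synthesize is the only field in the JSON body.
    payload = {"text": text}
    return url, headers, params, payload


url, headers, params, payload = build_speak_request("YOUR_API_KEY", "Hello there")
```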
--- a/src/pipecat/services/elevenlabs/tts.py
+++ b/src/pipecat/services/elevenlabs/tts.py
@@ -168,16 +168,24 @@ def build_elevenlabs_voice_settings(


 def calculate_word_times(
-    alignment_info: Mapping[str, Any], cumulative_time: float
-) -> List[Tuple[str, float]]:
+    alignment_info: Mapping[str, Any],
+    cumulative_time: float,
+    partial_word: str = "",
+    partial_word_start_time: float = 0.0,
+) -> tuple[List[Tuple[str, float]], str, float]:
    """Calculate word timestamps from character alignment information.

    Args:
        alignment_info: Character alignment data from ElevenLabs API.
        cumulative_time: Base time offset for this chunk.
+        partial_word: Partial word carried over from previous chunk.
+        partial_word_start_time: Start time of the partial word.

    Returns:
-        List of (word, timestamp) tuples.
+        Tuple of (word_times, new_partial_word, new_partial_word_start_time):
+        - word_times: List of (word, timestamp) tuples for complete words
+        - new_partial_word: Incomplete word at end of chunk (empty if chunk ends with space)
+        - new_partial_word_start_time: Start time of the incomplete word
    """
    chars = alignment_info["chars"]
    char_start_times_ms = alignment_info["charStartTimesMs"]
@@ -186,41 +194,37 @@ def calculate_word_times(
        logger.error(
            f"calculate_word_times: length mismatch - chars={len(chars)}, times={len(char_start_times_ms)}"
        )
-        return []
+        return ([], partial_word, partial_word_start_time)

    # Build words and track their start positions
    words = []
-    word_start_indices = []
-    current_word = ""
-    word_start_index = None
+    word_start_times = []
+    current_word = partial_word  # Start with any partial word from previous chunk
+    word_start_time = partial_word_start_time if partial_word else None

    for i, char in enumerate(chars):
        if char == " ":
            # End of current word
            if current_word:  # Only add non-empty words
                words.append(current_word)
-                word_start_indices.append(word_start_index)
+                word_start_times.append(word_start_time)
                current_word = ""
-                word_start_index = None
+                word_start_time = None
        else:
            # Building a word
-            if word_start_index is None:  # First character of new word
-                word_start_index = i
+            if word_start_time is None:  # First character of new word
+                # Convert from milliseconds to seconds and add cumulative offset
+                word_start_time = cumulative_time + (char_start_times_ms[i] / 1000.0)
            current_word += char

-    # Handle the last word if there's no trailing space
-    if current_word and word_start_index is not None:
-        words.append(current_word)
-        word_start_indices.append(word_start_index)
+    # Build result for complete words
+    word_times = list(zip(words, word_start_times))

-    # Calculate timestamps for each word
-    word_times = []
-    for word, start_idx in zip(words, word_start_indices):
-        # Convert from milliseconds to seconds and add cumulative offset
-        start_time_seconds = cumulative_time + (char_start_times_ms[start_idx] / 1000.0)
-        word_times.append((word, start_time_seconds))
+    # Return any incomplete word at the end of this chunk
+    new_partial_word = current_word if current_word else ""
+    new_partial_word_start_time = word_start_time if word_start_time is not None else 0.0

-    return word_times
+    return (word_times, new_partial_word, new_partial_word_start_time)


 class ElevenLabsTTSService(AudioContextWordTTSService):
@@ -332,6 +336,9 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
        # there's an interruption or TTSStoppedFrame.
        self._started = False
        self._cumulative_time = 0
+        # Track partial words that span across alignment chunks
+        self._partial_word = ""
+        self._partial_word_start_time = 0.0

        # Context management for v1 multi API
        self._context_id = None
@@ -521,6 +528,7 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
                url, max_size=16 * 1024 * 1024, additional_headers={"xi-api-key": self._api_key}
            )

+            await self._call_event_handler("on_connected")
        except Exception as e:
            logger.error(f"{self} initialization error: {e}")
            self._websocket = None
@@ -543,6 +551,7 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
            self._started = False
            self._context_id = None
            self._websocket = None
+            await self._call_event_handler("on_disconnected")

    def _get_websocket(self):
        if self._websocket:
@@ -570,6 +579,8 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
                logger.error(f"Error closing context on interruption: {e}")
            self._context_id = None
            self._started = False
+            self._partial_word = ""
+            self._partial_word_start_time = 0.0

    async def _receive_messages(self):
        """Handle incoming WebSocket messages from ElevenLabs."""
@@ -609,7 +620,14 @@ class ElevenLabsTTSService(AudioContextWordTTSService):

            if msg.get("alignment"):
                alignment = msg["alignment"]
-                word_times = calculate_word_times(alignment, self._cumulative_time)
+                word_times, self._partial_word, self._partial_word_start_time = (
+                    calculate_word_times(
+                        alignment,
+                        self._cumulative_time,
+                        self._partial_word,
+                        self._partial_word_start_time,
+                    )
+                )

                if word_times:
                    await self.add_word_timestamps(word_times)
@@ -683,6 +701,8 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
                    yield TTSStartedFrame()
                    self._started = True
                    self._cumulative_time = 0
+                    self._partial_word = ""
+                    self._partial_word_start_time = 0.0
                    # If a context ID does not exist, create a new one and
                    # register it. If an ID exists, that means the Pipeline is
                    # configured for allow_interruptions=False, so continue
@@ -756,6 +776,7 @@ class ElevenLabsHttpTTSService(WordTTSService):
        base_url: str = "https://api.elevenlabs.io",
        sample_rate: Optional[int] = None,
        params: Optional[InputParams] = None,
+        aggregate_sentences: Optional[bool] = True,
        **kwargs,
    ):
        """Initialize the ElevenLabs HTTP TTS service.
@@ -768,10 +789,11 @@ class ElevenLabsHttpTTSService(WordTTSService):
            base_url: Base URL for ElevenLabs HTTP API.
            sample_rate: Audio sample rate. If None, uses default.
            params: Additional input parameters for voice customization.
+            aggregate_sentences: Whether to aggregate sentences within the TTSService.
            **kwargs: Additional arguments passed to the parent service.
        """
        super().__init__(
-            aggregate_sentences=True,
+            aggregate_sentences=aggregate_sentences,
            push_text_frames=False,
            push_stop_frames=True,
            sample_rate=sample_rate,
@@ -809,6 +831,10 @@ class ElevenLabsHttpTTSService(WordTTSService):
        # Store previous text for context within a turn
        self._previous_text = ""

+        # Track partial words that span across alignment chunks
+        self._partial_word = ""
+        self._partial_word_start_time = 0.0
+
    def language_to_service_language(self, language: Language) -> Optional[str]:
        """Convert pipecat Language to ElevenLabs language code.

@@ -836,6 +862,8 @@ class ElevenLabsHttpTTSService(WordTTSService):
        self._cumulative_time = 0
        self._started = False
        self._previous_text = ""
+        self._partial_word = ""
+        self._partial_word_start_time = 0.0
        logger.debug(f"{self}: Reset internal state")

    async def start(self, frame: StartFrame):
@@ -870,11 +898,13 @@ class ElevenLabsHttpTTSService(WordTTSService):
    def calculate_word_times(self, alignment_info: Mapping[str, Any]) -> List[Tuple[str, float]]:
        """Calculate word timing from character alignment data.

+        This method handles partial words that may span across multiple alignment chunks.
+
        Args:
            alignment_info: Character timing data from ElevenLabs.

        Returns:
-            List of (word, timestamp) pairs.
+            List of (word, timestamp) pairs for complete words in this chunk.

        Example input data::

@@ -900,30 +930,28 @@ class ElevenLabsHttpTTSService(WordTTSService):
        # Build the words and find their start times
        words = []
        word_start_times = []
-        current_word = ""
-        first_char_idx = -1
+        # Start with any partial word from previous chunk
+        current_word = self._partial_word
+        word_start_time = self._partial_word_start_time if self._partial_word else None

        for i, char in enumerate(chars):
            if char == " ":
                if current_word:  # Only add non-empty words
                    words.append(current_word)
-                    # Use time of the first character of the word, offset by cumulative time
-                    word_start_times.append(
-                        self._cumulative_time + char_start_times[first_char_idx]
-                    )
+                    word_start_times.append(word_start_time)
                    current_word = ""
-                    first_char_idx = -1
+                    word_start_time = None
            else:
-                if not current_word:  # This is the first character of a new word
-                    first_char_idx = i
+                if word_start_time is None:  # First character of a new word
+                    # Use time of the first character of the word, offset by cumulative time
+                    word_start_time = self._cumulative_time + char_start_times[i]
                current_word += char

-        # Don't forget the last word if there's no trailing space
-        if current_word and first_char_idx >= 0:
-            words.append(current_word)
-            word_start_times.append(self._cumulative_time + char_start_times[first_char_idx])
+        # Store any incomplete word at the end of this chunk
+        self._partial_word = current_word if current_word else ""
+        self._partial_word_start_time = word_start_time if word_start_time is not None else 0.0

-        # Create word-time pairs
+        # Create word-time pairs for complete words only
        word_times = list(zip(words, word_start_times))

        return word_times
@@ -959,6 +987,9 @@ class ElevenLabsHttpTTSService(WordTTSService):
        if self._voice_settings:
            payload["voice_settings"] = self._voice_settings

+        if self._settings["apply_text_normalization"] is not None:
+            payload["apply_text_normalization"] = self._settings["apply_text_normalization"]
+
        language = self._settings["language"]
        if self._model_name in ELEVENLABS_MULTILINGUAL_MODELS and language:
            payload["language_code"] = language
@@ -979,8 +1010,6 @@ class ElevenLabsHttpTTSService(WordTTSService):
        }
        if self._settings["optimize_streaming_latency"] is not None:
            params["optimize_streaming_latency"] = self._settings["optimize_streaming_latency"]
-        if self._settings["apply_text_normalization"] is not None:
-            params["apply_text_normalization"] = self._settings["apply_text_normalization"]

        try:
            await self.start_ttfb_metrics()
@@ -1041,6 +1070,14 @@ class ElevenLabsHttpTTSService(WordTTSService):
                        logger.error(f"Error processing response: {e}", exc_info=True)
                        continue

+                # After processing all chunks, emit any remaining partial word
+                # since this is the end of the utterance
+                if self._partial_word:
+                    final_word_time = [(self._partial_word, self._partial_word_start_time)]
+                    await self.add_word_timestamps(final_word_time)
+                    self._partial_word = ""
+                    self._partial_word_start_time = 0.0
+
                # After processing all chunks, add the total utterance duration
                # to the cumulative time to ensure next utterance starts after this one
                if utterance_duration > 0:
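The partial-word handling in the ElevenLabs changes above can be illustrated with a standalone restatement of the same loop. The function name `word_times_with_carry`, the two alignment chunks, and the cumulative time offsets are hypothetical values chosen for the example; only the carry-over logic follows the diff.

```python
from typing import Any, List, Mapping, Optional, Tuple


def word_times_with_carry(
    alignment: Mapping[str, Any],
    cumulative_time: float,
    partial_word: str = "",
    partial_word_start_time: float = 0.0,
) -> Tuple[List[Tuple[str, float]], str, float]:
    """Return complete words plus any unfinished word at the end of the chunk."""
    chars = alignment["chars"]
    starts_ms = alignment["charStartTimesMs"]

    word_times: List[Tuple[str, float]] = []
    # Resume the word that the previous chunk left unfinished, if any.
    current_word = partial_word
    start: Optional[float] = partial_word_start_time if partial_word else None

    for char, start_ms in zip(chars, starts_ms):
        if char == " ":
            if current_word:  # A space closes the current word.
                word_times.append((current_word, start))
                current_word, start = "", None
        else:
            if start is None:  # First character of a new word.
                start = cumulative_time + start_ms / 1000.0
            current_word += char

    # Whatever is left has no trailing space yet, so hand it to the next call.
    return word_times, current_word, start if start is not None else 0.0


# Two hypothetical chunks that split "world" across the boundary.
chunk_1 = {
    "chars": list("hello wor"),
    "charStartTimesMs": [0, 50, 100, 150, 200, 250, 300, 350, 400],
}
chunk_2 = {
    "chars": list("ld again"),
    "charStartTimesMs": [0, 50, 100, 250, 300, 350, 400, 450],
}

words_1, partial, partial_start = word_times_with_carry(chunk_1, 0.0)
words_2, partial, partial_start = word_times_with_carry(
    chunk_2, 0.5, partial, partial_start
)
```

After the first call, `hello` is complete and `wor` is carried forward. The second call finishes it as `world`, keeping the start time recorded in the first chunk. The trailing `again` stays pending until the caller flushes it at the end of the utterance, which is what the final hunk above does for the HTTP service.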
--- a/src/pipecat/services/fish/tts.py
+++ b/src/pipecat/services/fish/tts.py
@@ -225,6 +225,8 @@ class FishAudioTTSService(InterruptibleTTSService):
            start_message = {"event": "start", "request": {"text": "", **self._settings}}
            await self._websocket.send(ormsgpack.packb(start_message))
            logger.debug("Sent start message to Fish Audio")
+
+            await self._call_event_handler("on_connected")
        except Exception as e:
            logger.error(f"Fish Audio initialization error: {e}")
            self._websocket = None
@@ -245,6 +247,7 @@ class FishAudioTTSService(InterruptibleTTSService):
            self._request_id = None
            self._started = False
            self._websocket = None
+            await self._call_event_handler("on_disconnected")

    async def flush_audio(self):
        """Flush any buffered audio by sending a flush event to Fish Audio."""
--- a/src/pipecat/services/google/gemini_live/llm.py
+++ b/src/pipecat/services/google/gemini_live/llm.py
@@ -17,6 +17,7 @@ import json
 import random
 import time
 import uuid
+import warnings
 from dataclasses import dataclass
 from enum import Enum
 from typing import Any, Dict, List, Optional, Union
@@ -56,10 +57,12 @@ from pipecat.frames.frames import (
    UserStoppedSpeakingFrame,
 )
 from pipecat.metrics.metrics import LLMTokenUsage
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response import (
    LLMAssistantAggregatorParams,
    LLMUserAggregatorParams,
 )
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.aggregators.openai_llm_context import (
    OpenAILLMContext,
    OpenAILLMContextFrame,
@@ -219,6 +222,10 @@ class GeminiLiveContext(OpenAILLMContext):

    Provides Gemini-specific context management including system instruction
    extraction and message format conversion for the Live API.
+
+    .. deprecated:: 0.0.92
+        Gemini Live no longer uses `GeminiLiveContext` under the hood.
+        It now uses `LLMContext`.
    """

    @staticmethod
@@ -231,6 +238,22 @@ class GeminiLiveContext(OpenAILLMContext):
        Returns:
            The upgraded Gemini context instance.
        """
+        # This warning is here rather than `__init__` since `upgrade()` was the
+        # "main" way that GeminiLiveContext instances were created.
+        # Almost no users should be seeing this message anyway, as
+        # GeminiLiveContext instances were typically created under the hood:
+        # the user would pass an OpenAILLMContext instance, which would be
+        # upgraded without them necessarily knowing.
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "GeminiLiveContext is deprecated. "
+                "Gemini Live no longer uses GeminiLiveContext under the hood. "
+                "It now uses LLMContext.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
        if isinstance(obj, OpenAILLMContext) and not isinstance(obj, GeminiLiveContext):
            logger.debug(f"Upgrading to Gemini Live Context: {obj}")
            obj.__class__ = GeminiLiveContext
@@ -328,8 +351,28 @@ class GeminiLiveUserContextAggregator(OpenAIUserContextAggregator):

    Extends OpenAI user aggregator to handle Gemini-specific message passing
    while maintaining compatibility with the standard aggregation pipeline.
+
+    .. deprecated:: 0.0.92
+        Gemini Live no longer expects a `GeminiLiveUserContextAggregator`.
+        It now expects an `LLMUserAggregator`.
    """

+    def __init__(self, *args, **kwargs):
+        """Initialize Gemini Live user context aggregator."""
+        # Almost no users should be seeing this message, as
+        # `GeminiLiveUserContextAggregator` instances were typically created
+        # under the hood, as part of `llm.create_context_aggregator()`.
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "GeminiLiveUserContextAggregator is deprecated. "
+                "Gemini Live no longer expects a GeminiLiveUserContextAggregator. "
+                "It now expects an LLMUserAggregator.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+        super().__init__(*args, **kwargs)
+
    async def process_frame(self, frame, direction):
        """Process incoming frames for user context aggregation.

@@ -349,8 +392,28 @@ class GeminiLiveAssistantContextAggregator(OpenAIAssistantContextAggregator):
    Handles assistant response aggregation while filtering out LLMTextFrames
    to prevent duplicate context entries, as Gemini Live pushes both
    LLMTextFrames and TTSTextFrames.
+
+    .. deprecated:: 0.0.92
+        Gemini Live no longer uses `GeminiLiveAssistantContextAggregator` under the hood.
+        It now uses `LLMAssistantAggregator`.
    """

+    def __init__(self, *args, **kwargs):
+        """Initialize Gemini Live assistant context aggregator."""
+        # Almost no users should be seeing this message, as
+        # `GeminiLiveAssistantContextAggregator` instances were typically
+        # created under the hood, as part of `llm.create_context_aggregator()`.
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "GeminiLiveAssistantContextAggregator is deprecated. "
+                "Gemini Live no longer uses GeminiLiveAssistantContextAggregator under the hood. "
+                "It now uses LLMAssistantAggregator.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+        super().__init__(*args, **kwargs)
+
    async def process_frame(self, frame: Frame, direction: FrameDirection):
        """Process incoming frames for assistant context aggregation.

@@ -380,6 +443,10 @@ class GeminiLiveAssistantContextAggregator(OpenAIAssistantContextAggregator):
 class GeminiLiveContextAggregatorPair:
    """Pair of user and assistant context aggregators for Gemini Live.

+    .. deprecated:: 0.0.92
+        `GeminiLiveContextAggregatorPair` is deprecated.
+        Use `LLMContextAggregatorPair` instead.
+
    Parameters:
        _user: The user context aggregator instance.
        _assistant: The assistant context aggregator instance.
@@ -388,6 +455,19 @@ class GeminiLiveContextAggregatorPair:
    _user: GeminiLiveUserContextAggregator
    _assistant: GeminiLiveAssistantContextAggregator

+    def __post_init__(self):
+        # Almost no users should be seeing this message, as
+        # `GeminiLiveContextAggregatorPair` instances were typically created
+        # under the hood, with `llm.create_context_aggregator()`.
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "GeminiLiveContextAggregatorPair is deprecated. "
+                "Use LLMContextAggregatorPair instead.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
    def user(self) -> GeminiLiveUserContextAggregator:
        """Get the user context aggregator.

@@ -609,7 +689,7 @@ class GeminiLiveLLMService(LLMService):
        self._run_llm_when_session_ready = False

        self._user_is_speaking = False
-        self._bot_is_speaking = False
+        self._bot_is_responding = False
        self._user_audio_buffer = bytearray()
        self._user_transcription_buffer = ""
        self._last_transcription_sent = ""
@@ -665,6 +745,9 @@ class GeminiLiveLLMService(LLMService):
        # Initialize the API client. Subclasses can override this if needed.
        self.create_client()

+        # Bookkeeping for tool calls
+        self._completed_tool_calls = set()
+
    def create_client(self):
        """Create the Gemini API client instance. Subclasses can override this."""
        self._client = Client(api_key=self._api_key, http_options=self._http_options)
@@ -787,9 +870,13 @@ class GeminiLiveLLMService(LLMService):
    #

    async def _handle_interruption(self):
-        await self._set_bot_is_speaking(False)
-        await self.push_frame(TTSStoppedFrame())
-        await self.push_frame(LLMFullResponseEndFrame())
+        if self._bot_is_responding:
+            await self._set_bot_is_responding(False)
+            if self._settings.get("modalities") == GeminiModalities.AUDIO:
+                await self.push_frame(TTSStoppedFrame())
+            # Do not send LLMFullResponseEndFrame here - an interruption
+            # already tells the assistant context aggregator that the response
+            # is over.

    async def _handle_user_started_speaking(self, frame):
        self._user_is_speaking = True
@@ -807,7 +894,6 @@ class GeminiLiveLLMService(LLMService):

    #
    # frame processing
-    #
    # StartFrame, StopFrame, CancelFrame implemented in base class
    #

@@ -820,7 +906,7 @@ class GeminiLiveLLMService(LLMService):
        """
        # Defer EndFrame handling until after the bot turn is finished
        if isinstance(frame, EndFrame):
-            if self._bot_is_speaking:
+            if self._bot_is_responding:
                logger.debug("Deferring handling EndFrame until bot turn is finished")
                self._end_frame_pending_bot_turn_finished = frame
                return
@@ -829,22 +915,13 @@ class GeminiLiveLLMService(LLMService):

        if isinstance(frame, TranscriptionFrame):
            await self.push_frame(frame, direction)
-        elif isinstance(frame, OpenAILLMContextFrame):
-            context: GeminiLiveContext = GeminiLiveContext.upgrade(frame.context)
-            # For now, we'll only trigger inference here when either:
-            #   1. We have not seen a context frame before
-            #   2. The last message is a tool call result
-            if not self._context:
-                self._context = context
-                if frame.context.tools:
-                    self._tools = frame.context.tools
-                await self._create_initial_response()
-            elif context.messages and context.messages[-1].get("role") == "tool":
-                # Support just one tool call per context frame for now
-                tool_result_message = context.messages[-1]
-                await self._tool_result(tool_result_message)
-        elif isinstance(frame, LLMContextFrame):
-            raise NotImplementedError("Universal LLMContext is not yet supported for Gemini Live.")
+        elif isinstance(frame, (LLMContextFrame, OpenAILLMContextFrame)):
+            context = (
+                frame.context
+                if isinstance(frame, LLMContextFrame)
+                else LLMContext.from_openai_context(frame.context)
+            )
+            await self._handle_context(context)
        elif isinstance(frame, InputTextRawFrame):
            await self._send_user_text(frame.text)
            await self.push_frame(frame, direction)
@@ -883,13 +960,48 @@ class GeminiLiveLLMService(LLMService):
        else:
            await self.push_frame(frame, direction)

-    async def _set_bot_is_speaking(self, speaking: bool):
-        if self._bot_is_speaking == speaking:
+    async def _handle_context(self, context: LLMContext):
+        if not self._context:
+            # We got our initial context
+            self._context = context
+            if context.tools:
+                self._tools = context.tools
+            # Initialize our bookkeeping of already-completed tool calls in
+            # the context
+            await self._process_completed_function_calls(send_new_results=False)
+            await self._create_initial_response()
+        else:
+            # We got an updated context.
+            # This may contain a new user message or tool call result.
+            self._context = context
+            # Send results for newly-completed function calls, if any.
+            await self._process_completed_function_calls(send_new_results=True)
+
+    async def _process_completed_function_calls(self, send_new_results: bool):
+        # Check for set of completed function calls in the context
+        adapter: GeminiLLMAdapter = self.get_llm_adapter()
+        messages = adapter.get_llm_invocation_params(self._context).get("messages", [])
+        for message in messages:
+            if message.parts:
+                for part in message.parts:
+                    if part.function_response:
+                        tool_call_id = part.function_response.id
+                        tool_name = part.function_response.name
+                        if tool_call_id and tool_call_id not in self._completed_tool_calls:
+                            # Found a newly-completed function call - send the result to the service
+                            if send_new_results:
+                                await self._tool_result(
+                                    tool_call_id, tool_name, part.function_response.response
+                                )
+                            self._completed_tool_calls.add(tool_call_id)
+
+    async def _set_bot_is_responding(self, responding: bool):
+        if self._bot_is_responding == responding:
            return

-        self._bot_is_speaking = speaking
+        self._bot_is_responding = responding

-        if not self._bot_is_speaking and self._end_frame_pending_bot_turn_finished:
+        if not self._bot_is_responding and self._end_frame_pending_bot_turn_finished:
            await self.queue_frame(self._end_frame_pending_bot_turn_finished)
            self._end_frame_pending_bot_turn_finished = None

@@ -1116,6 +1228,7 @@ class GeminiLiveLLMService(LLMService):
            if self._session:
                await self._session.close()
                self._session = None
+            self._completed_tool_calls = set()
            self._disconnecting = False
        except Exception as e:
            logger.error(f"{self} error disconnecting: {e}")
@@ -1195,7 +1308,8 @@ class GeminiLiveLLMService(LLMService):
            self._run_llm_when_session_ready = True
            return

-        messages = self._context.get_messages_for_initializing_history()
+        adapter: GeminiLLMAdapter = self.get_llm_adapter()
+        messages = adapter.get_llm_invocation_params(self._context).get("messages", [])
        if not messages:
            return

@@ -1223,8 +1337,9 @@ class GeminiLiveLLMService(LLMService):

        # Create a throwaway context just for the purpose of getting messages
        # in the right format
-        context = GeminiLiveContext.upgrade(OpenAILLMContext(messages=messages_list))
-        messages = context.get_messages_for_initializing_history()
+        context = LLMContext(messages=messages_list)
+        adapter: GeminiLLMAdapter = self.get_llm_adapter()
+        messages = adapter.get_llm_invocation_params(context).get("messages", [])

        if not messages:
            return
@@ -1239,17 +1354,16 @@ class GeminiLiveLLMService(LLMService):
            await self._handle_send_error(e)

    @traced_gemini_live(operation="llm_tool_result")
-    async def _tool_result(self, tool_result_message):
+    async def _tool_result(
+        self, tool_call_id: str, tool_name: str, tool_result_message: Dict[str, Any]
+    ):
        """Send tool result back to the API."""
        if self._disconnecting or not self._session:
            return

        # For now we're shoving the name into the tool_call_id field, so this
        # will work until we revisit that.
-        id = tool_result_message.get("tool_call_id")
-        name = tool_result_message.get("tool_call_name")
-        result = json.loads(tool_result_message.get("content") or "")
-        response = FunctionResponse(name=name, id=id, response=result)
+        response = FunctionResponse(name=tool_name, id=tool_call_id, response=tool_result_message)

        try:
            await self._session.send_tool_response(function_responses=response)
@@ -1277,7 +1391,10 @@ class GeminiLiveLLMService(LLMService):
        # part.text is added when `modalities` is set to TEXT; otherwise, it's None
        text = part.text
        if text:
-            if not self._bot_text_buffer:
+            if not self._bot_is_responding:
+                # Update bot responding state and send service start frame
+                # (TEXT modality case)
+                await self._set_bot_is_responding(True)
                await self.push_frame(LLMFullResponseStartFrame())

            self._bot_text_buffer += text
@@ -1288,6 +1405,8 @@ class GeminiLiveLLMService(LLMService):
        if msg.server_content and msg.server_content.grounding_metadata:
            self._accumulated_grounding_metadata = msg.server_content.grounding_metadata

+        # If we have no audio, stop here.
+        # All logic below this point pertains to the AUDIO modality.
        inline_data = part.inline_data
        if not inline_data:
            return
@@ -1313,8 +1432,10 @@ class GeminiLiveLLMService(LLMService):
        if not audio:
            return

-        if not self._bot_is_speaking:
-            await self._set_bot_is_speaking(True)
+        # Update bot responding state and send service start frames
+        # (AUDIO modality case)
+        if not self._bot_is_responding:
+            await self._set_bot_is_responding(True)
            await self.push_frame(TTSStartedFrame())
            await self.push_frame(LLMFullResponseStartFrame())

@@ -1354,7 +1475,6 @@ class GeminiLiveLLMService(LLMService):
    @traced_gemini_live(operation="llm_response")
    async def _handle_msg_turn_complete(self, message: LiveServerMessage):
        """Handle the turn complete message."""
-        await self._set_bot_is_speaking(False)
        text = self._bot_text_buffer

        # Trace the complete LLM response (this will be handled by the decorator)
@@ -1373,13 +1493,15 @@ class GeminiLiveLLMService(LLMService):
        self._search_result_buffer = ""
        self._accumulated_grounding_metadata = None

-        # Only push the TTSStoppedFrame if the bot is outputting audio
-        # when text is found, modalities is set to TEXT and no audio
-        # is produced.
-        if not text:
-            await self.push_frame(TTSStoppedFrame())
-
-        await self.push_frame(LLMFullResponseEndFrame())
+        if self._bot_is_responding:
+            await self._set_bot_is_responding(False)
+            if not text:
+                # AUDIO modality case
+                await self.push_frame(TTSStoppedFrame())
+                await self.push_frame(LLMFullResponseEndFrame())
+            else:
+                # TEXT modality case
+                await self.push_frame(LLMFullResponseEndFrame())

    @traced_stt
    async def _handle_user_transcription(
@@ -1442,8 +1564,8 @@ class GeminiLiveLLMService(LLMService):
            return

        # This is the output transcription text when modalities is set to AUDIO.
-        # In this case, we push LLMTextFrame and TTSTextFrame to be handled by the
-        # downstream assistant context aggregator.
+        # In this case, we push TTSTextFrame to be handled by the downstream
+        # assistant context aggregator.
        text = message.server_content.output_transcription.text

        if not text:
@@ -1458,7 +1580,17 @@ class GeminiLiveLLMService(LLMService):
        # Collect text for tracing
        self._llm_output_buffer += text

-        await self.push_frame(LLMTextFrame(text=text))
+        # NOTE: Shoot. When using Vertex AI, output transcription messages
+        # arrive *before* the model_turn messages with audio, so we need to
+        # handle sending TTSStartedFrame and LLMFullResponseStartFrame here as
+        # well. These messages also contain much *more* text (it looks further
+        # ahead). That means that on an interruption our recorded context will
+        # contain some text that was actually never spoken.
+        if not self._bot_is_responding:
+            await self._set_bot_is_responding(True)
+            await self.push_frame(TTSStartedFrame())
+            await self.push_frame(LLMFullResponseStartFrame())
+
        await self.push_frame(TTSTextFrame(text=text))

    async def _handle_msg_grounding_metadata(self, message: LiveServerMessage):
@@ -1557,26 +1689,26 @@ class GeminiLiveLLMService(LLMService):
        *,
        user_params: LLMUserAggregatorParams = LLMUserAggregatorParams(),
        assistant_params: LLMAssistantAggregatorParams = LLMAssistantAggregatorParams(),
-    ) -> GeminiLiveContextAggregatorPair:
+    ) -> LLMContextAggregatorPair:
        """Create an instance of GeminiLiveContextAggregatorPair from an OpenAILLMContext.

        Constructor keyword arguments for both the user and assistant aggregators can be provided.

+        NOTE: this method exists only for backward compatibility. New code
+        should instead do:
+            context = LLMContext(...)
+            context_aggregator = LLMContextAggregatorPair(context)
+
        Args:
            context: The LLM context to use.
            user_params: User aggregator parameters. Defaults to LLMUserAggregatorParams().
            assistant_params: Assistant aggregator parameters. Defaults to LLMAssistantAggregatorParams().

        Returns:
-            GeminiLiveContextAggregatorPair: A pair of context
-            aggregators, one for the user and one for the assistant,
-            encapsulated in an GeminiLiveContextAggregatorPair.
+            A pair of user and assistant context aggregators.
        """
-        context.set_llm_adapter(self.get_llm_adapter())
-
-        GeminiLiveContext.upgrade(context)
-        user = GeminiLiveUserContextAggregator(context, params=user_params)
-
+        context = LLMContext.from_openai_context(context)
        assistant_params.expect_stripped_words = False
-        assistant = GeminiLiveAssistantContextAggregator(context, params=assistant_params)
-        return GeminiLiveContextAggregatorPair(_user=user, _assistant=assistant)
+        return LLMContextAggregatorPair(
+            context, user_params=user_params, assistant_params=assistant_params
+        )
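
The completed-tool-call bookkeeping introduced in `_handle_context` and `_process_completed_function_calls` can be illustrated in isolation. The following is a minimal sketch, not the service's actual code: `FunctionResponsePart`, `ToolResultTracker`, and `_send` are hypothetical stand-ins, and the real implementation reads parts from the adapter's invocation params and sends results over the live session. The first pass (initial context) only records existing IDs; later passes send results for IDs not seen before.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set, Tuple


@dataclass
class FunctionResponsePart:
    """Hypothetical stand-in for a function-response part in a context message."""

    id: Optional[str]
    name: str
    response: Dict[str, Any]


@dataclass
class ToolResultTracker:
    """Tracks which tool call IDs have already been handled."""

    completed: Set[str] = field(default_factory=set)
    sent: List[Tuple[str, str, Dict[str, Any]]] = field(default_factory=list)

    async def _send(self, tool_call_id: str, tool_name: str, response: Dict[str, Any]) -> None:
        # Stand-in for sending the tool response to the remote session.
        self.sent.append((tool_call_id, tool_name, response))

    async def process(self, parts: List[FunctionResponsePart], send_new_results: bool) -> None:
        for part in parts:
            # Skip parts without an ID and IDs we have already recorded.
            if part.id and part.id not in self.completed:
                if send_new_results:
                    await self._send(part.id, part.name, part.response)
                self.completed.add(part.id)


async def demo() -> ToolResultTracker:
    tracker = ToolResultTracker()
    old = FunctionResponsePart(id="call-1", name="get_weather", response={"temp": 20})
    new = FunctionResponsePart(id="call-2", name="get_time", response={"time": "noon"})
    # Initial context: record what is already completed, send nothing.
    await tracker.process([old], send_new_results=False)
    # Updated context: only the newly completed call is sent.
    await tracker.process([old, new], send_new_results=True)
    return tracker


tracker = asyncio.run(demo())
print(tracker.sent)
```

Because the set is keyed by tool call ID, re-sending the whole context on every update does not cause a result to be delivered twice. The diff also resets the set on disconnect.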
--- a/src/pipecat/services/google/llm.py
+++ b/src/pipecat/services/google/llm.py
@@ -1034,6 +1034,23 @@ class GoogleLLMService(LLMService):
        if context:
            await self._process_context(context)

+    async def stop(self, frame):
+        """Override stop to gracefully close the client."""
+        await super().stop(frame)
+        await self._close_client()
+
+    async def cancel(self, frame):
+        """Override cancel to gracefully close the client."""
+        await super().cancel(frame)
+        await self._close_client()
+
+    async def _close_client(self):
+        try:
+            await self._client.aio.aclose()
+        except Exception:
+            # Do nothing - we're shutting down anyway
+            pass
+
    def create_context_aggregator(
        self,
        context: OpenAILLMContext,
--- a/src/pipecat/services/google/stt.py
+++ b/src/pipecat/services/google/stt.py
@@ -730,6 +730,8 @@ class GoogleSTTService(STTService):
        self._request_queue = asyncio.Queue()
        self._streaming_task = self.create_task(self._stream_audio())

+        await self._call_event_handler("on_connected")
+
    async def _disconnect(self):
        """Clean up streaming recognition resources."""
        if self._streaming_task:
@@ -737,6 +739,8 @@ class GoogleSTTService(STTService):
            await self.cancel_task(self._streaming_task)
            self._streaming_task = None

+        await self._call_event_handler("on_disconnected")
+
    async def _request_generator(self):
        """Generates requests for the streaming recognize method."""
        recognizer_path = f"projects/{self._project_id}/locations/{self._location}/recognizers/_"
--- a/src/pipecat/services/google/tts.py
+++ b/src/pipecat/services/google/tts.py
@@ -22,7 +22,7 @@ from pipecat.utils.tracing.service_decorators import traced_tts
 # Suppress gRPC fork warnings
 os.environ["GRPC_ENABLE_FORK_SUPPORT"] = "false"

-from typing import AsyncGenerator, List, Literal, Optional
+from typing import Any, AsyncGenerator, List, Literal, Mapping, Optional

 from loguru import logger
 from pydantic import BaseModel
@@ -248,7 +248,8 @@ class GoogleHttpTTSService(TTSService):

        Parameters:
            pitch: Voice pitch adjustment (e.g., "+2st", "-50%").
-            rate: Speaking rate adjustment (e.g., "slow", "fast", "125%").
+            rate: Speaking rate adjustment (e.g., "slow", "fast", "125%"). Used for SSML prosody tags (non-Chirp voices).
+            speaking_rate: Speaking rate for AudioConfig (Chirp/Journey voices). Range [0.25, 2.0].
            volume: Volume adjustment (e.g., "loud", "soft", "+6dB").
            emphasis: Emphasis level for the text.
            language: Language for synthesis. Defaults to English.
@@ -258,6 +259,7 @@ class GoogleHttpTTSService(TTSService):

        pitch: Optional[str] = None
        rate: Optional[str] = None
+        speaking_rate: Optional[float] = None
        volume: Optional[str] = None
        emphasis: Optional[Literal["strong", "moderate", "reduced", "none"]] = None
        language: Optional[Language] = Language.EN
@@ -291,6 +293,7 @@ class GoogleHttpTTSService(TTSService):
        self._settings = {
            "pitch": params.pitch,
            "rate": params.rate,
+            "speaking_rate": params.speaking_rate,
            "volume": params.volume,
            "emphasis": params.emphasis,
            "language": self.language_to_service_language(params.language)
@@ -360,6 +363,22 @@ class GoogleHttpTTSService(TTSService):
        """
        return language_to_google_tts_language(language)

+    async def _update_settings(self, settings: Mapping[str, Any]):
+        """Override to handle speaking_rate updates for Chirp/Journey voices.
+
+        Args:
+            settings: Dictionary of settings to update. Can include 'speaking_rate' (float)
+        """
+        if "speaking_rate" in settings:
+            rate_value = float(settings["speaking_rate"])
+            if 0.25 <= rate_value <= 2.0:
+                self._settings["speaking_rate"] = rate_value
+            else:
+                logger.warning(
+                    f"Invalid speaking_rate value: {rate_value}. Must be between 0.25 and 2.0"
+                )
+        await super()._update_settings(settings)
+
    def _construct_ssml(self, text: str) -> str:
        ssml = "<speak>"

@@ -436,10 +455,17 @@ class GoogleHttpTTSService(TTSService):
            voice = texttospeech_v1.VoiceSelectionParams(
                language_code=self._settings["language"], name=self._voice_id
            )
-            audio_config = texttospeech_v1.AudioConfig(
-                audio_encoding=texttospeech_v1.AudioEncoding.LINEAR16,
-                sample_rate_hertz=self.sample_rate,
-            )
+            # Build audio config with conditional speaking_rate
+            audio_config_params = {
+                "audio_encoding": texttospeech_v1.AudioEncoding.LINEAR16,
+                "sample_rate_hertz": self.sample_rate,
+            }
+
+            # For Chirp and Journey voices, include speaking_rate in AudioConfig
+            if (is_chirp_voice or is_journey_voice) and self._settings["speaking_rate"] is not None:
+                audio_config_params["speaking_rate"] = self._settings["speaking_rate"]
+
+            audio_config = texttospeech_v1.AudioConfig(**audio_config_params)

            request = texttospeech_v1.SynthesizeSpeechRequest(
                input=synthesis_input, voice=voice, audio_config=audio_config
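
The `speaking_rate` handling above amounts to two steps: validate the value against the [0.25, 2.0] range, then include it in the audio config only for voices that accept it. A standalone sketch of that logic follows, assuming plain dictionaries in place of `texttospeech_v1.AudioConfig`. The helper names `apply_speaking_rate` and `build_audio_config_params` are hypothetical and do not exist in the library.

```python
import logging
from typing import Any, Dict, Mapping

logger = logging.getLogger(__name__)

# Range taken from the docstrings in the diff.
MIN_RATE, MAX_RATE = 0.25, 2.0


def apply_speaking_rate(current: Dict[str, Any], updates: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of `current` with a validated speaking_rate applied."""
    result = dict(current)
    if "speaking_rate" in updates:
        rate = float(updates["speaking_rate"])
        if MIN_RATE <= rate <= MAX_RATE:
            result["speaking_rate"] = rate
        else:
            # Out-of-range values are ignored and the old value is kept.
            logger.warning(
                "Invalid speaking_rate value: %s. Must be between %s and %s",
                rate, MIN_RATE, MAX_RATE,
            )
    return result


def build_audio_config_params(
    settings: Mapping[str, Any], sample_rate: int, supports_speaking_rate: bool
) -> Dict[str, Any]:
    """Build keyword arguments for an audio config (plain dict stand-in)."""
    params: Dict[str, Any] = {
        "audio_encoding": "LINEAR16",
        "sample_rate_hertz": sample_rate,
    }
    # Only voices that accept the parameter get speaking_rate in the config.
    if supports_speaking_rate and settings.get("speaking_rate") is not None:
        params["speaking_rate"] = settings["speaking_rate"]
    return params


settings = apply_speaking_rate({"speaking_rate": None}, {"speaking_rate": "1.5"})
rejected = apply_speaking_rate({"speaking_rate": 1.0}, {"speaking_rate": 3.0})
with_rate = build_audio_config_params(settings, 24000, supports_speaking_rate=True)
without_rate = build_audio_config_params(settings, 24000, supports_speaking_rate=False)
```

Leaving the key out entirely when it is unset or unsupported, rather than passing `None`, lets the service fall back to its own default rate.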
@@ -500,7 +526,7 @@ class GoogleTTSService(TTSService):

        Parameters:
            language: Language for synthesis. Defaults to English.
-            speaking_rate: The speaking rate, in the range [0.25, 4.0].
+            speaking_rate: The speaking rate, in the range [0.25, 2.0].
        """

        language: Optional[Language] = Language.EN
@@ -591,6 +617,22 @@ class GoogleTTSService(TTSService):
        """
        return language_to_google_tts_language(language)

+    async def _update_settings(self, settings: Mapping[str, Any]):
+        """Override to handle speaking_rate updates for streaming API.
+
+        Args:
+            settings: Dictionary of settings to update. Can include 'speaking_rate' (float)
+        """
+        if "speaking_rate" in settings:
+            rate_value = float(settings["speaking_rate"])
+            if 0.25 <= rate_value <= 2.0:
+                self._settings["speaking_rate"] = rate_value
+            else:
+                logger.warning(
+                    f"Invalid speaking_rate value: {rate_value}. Must be between 0.25 and 2.0"
+                )
+        await super()._update_settings(settings)
+
    @traced_tts
    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
        """Generate streaming speech from text using Google's streaming API.
--- a/src/pipecat/services/hume/tts.py
+++ b/src/pipecat/services/hume/tts.py
@@ -184,11 +184,15 @@ class HumeTTSService(TTSService):
            # Hume emits mono PCM at 48 kHz; downstream can resample if needed.
            # We buffer audio bytes before sending to prevent glitches.
            self._audio_bytes = b""
+
+            # Use version "2" by default if no description is provided
+            # Version "1" is needed when description is used
+            version = "1" if self._params.description is not None else "2"
            async for chunk in self._client.tts.synthesize_json_streaming(
                utterances=[utterance],
                format=pcm_fmt,
                instant_mode=True,
-                version="2",
+                version=version,
            ):
                audio_b64 = getattr(chunk, "audio", None)
                if not audio_b64:
--- a/src/pipecat/services/llm_service.py
+++ b/src/pipecat/services/llm_service.py
@@ -492,11 +492,19 @@ class LLMService(AIService):
        tool_call_id: Optional[str] = None,
        text_content: Optional[str] = None,
        video_source: Optional[str] = None,
+        timeout: Optional[float] = 10.0,
    ):
        """Request an image from a user.

        Pushes a UserImageRequestFrame upstream to request an image from the
-        specified user.
+        specified user. The user image can then be processed by the LLM.
+
+        Use this function from a function call if you want the LLM to process
+        the image. If you expect the image to be processed by a vision service,
+        you might want to push a UserImageRequestFrame upstream directly.
+
+        .. deprecated:: 0.0.92
+            This method is deprecated, push a `UserImageRequestFrame` instead.

        Args:
            user_id: The ID of the user to request an image from.
@@ -504,15 +512,19 @@ class LLMService(AIService):
            tool_call_id: Optional tool call ID associated with the request.
            text_content: Optional text content/context for the image request.
            video_source: Optional video source identifier.
+            timeout: Optional timeout for the requested image to be added to the LLM context.
+
        """
+        import warnings
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "Method `request_image_frame()` is deprecated, push a `UserImageRequestFrame` instead.",
+                DeprecationWarning,
+            )
        await self.push_frame(
-            UserImageRequestFrame(
-                user_id=user_id,
-                function_name=function_name,
-                tool_call_id=tool_call_id,
-                context=text_content,
-                video_source=video_source,
-            ),
+            UserImageRequestFrame(user_id=user_id, text=text_content),
            FrameDirection.UPSTREAM,
        )
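
The deprecation pattern used in `request_image_frame()` can be tried on its own. This is a minimal sketch with a hypothetical function `request_image_legacy` that returns a dictionary instead of pushing a frame. The `stacklevel=2` argument is an addition not present in the diff; it attributes the warning to the caller's line.

```python
import warnings
from typing import Any, Dict, Optional


def request_image_legacy(user_id: str, text: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical deprecated wrapper that forwards to the new request shape."""
    with warnings.catch_warnings():
        # Force the warning to be shown even if DeprecationWarning is filtered.
        warnings.simplefilter("always")
        warnings.warn(
            "`request_image_legacy()` is deprecated, build the request directly instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    # Only the fields the new request shape keeps are forwarded.
    return {"user_id": user_id, "text": text}


# Capture the warning to show that it is emitted exactly once per call.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    frame = request_image_legacy("user-1", text="What is in this image?")

caught_categories = [w.category for w in caught]
print(caught_categories)
```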

--- a/src/pipecat/services/lmnt/tts.py
+++ b/src/pipecat/services/lmnt/tts.py
@@ -222,6 +222,7 @@ class LmntTTSService(InterruptibleTTSService):
            # Send initialization message
            await self._websocket.send(json.dumps(init_msg))

+            await self._call_event_handler("on_connected")
        except Exception as e:
            logger.error(f"{self} initialization error: {e}")
            self._websocket = None
@@ -243,6 +244,7 @@ class LmntTTSService(InterruptibleTTSService):
        finally:
            self._started = False
            self._websocket = None
+            await self._call_event_handler("on_disconnected")

    def _get_websocket(self):
        """Get the WebSocket connection if available."""
--- a/src/pipecat/services/moondream/vision.py
+++ b/src/pipecat/services/moondream/vision.py
@@ -11,15 +11,17 @@ for image analysis and description generation.
 """

 import asyncio
-import base64
-from io import BytesIO
 from typing import AsyncGenerator, Optional

 from loguru import logger
 from PIL import Image

-from pipecat.frames.frames import ErrorFrame, Frame, TextFrame
-from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.frames.frames import (
+    ErrorFrame,
+    Frame,
+    TextFrame,
+    UserImageRawFrame,
+)
 from pipecat.services.vision_service import VisionService

 try:
@@ -92,16 +94,16 @@ class MoondreamService(VisionService):
            trust_remote_code=True,
            revision=revision,
            device_map={"": device},
-            torch_dtype=dtype,
+            dtype=dtype,
        ).eval()

        logger.debug("Loaded Moondream model")

-    async def run_vision(self, context: LLMContext) -> AsyncGenerator[Frame, None]:
+    async def run_vision(self, frame: UserImageRawFrame) -> AsyncGenerator[Frame, None]:
        """Analyze an image and generate a description.

        Args:
-            context: The context to process, containing image data.
+            frame: The image frame to process.

        Yields:
            Frame: TextFrame containing the generated image description, or ErrorFrame
@@ -112,45 +114,14 @@ class MoondreamService(VisionService):
            yield ErrorFrame("Moondream model not available")
            return

-        image_bytes = None
-        text = None
-        try:
-            messages = context.get_messages()
-            last_message = messages[-1]
-            last_message_content = last_message.get("content")
+        logger.debug(f"Analyzing image (bytes length: {len(frame.image)})")

-            for item in last_message_content:
-                if isinstance(item, dict):
-                    if (
-                        "image_url" in item
-                        and isinstance(item["image_url"], dict)
-                        and item["image_url"].get("url")
-                    ):
-                        image_bytes = base64.b64decode(item["image_url"]["url"].split(",")[1])
-                    elif "text" in item and isinstance(item["text"], str):
-                        text = item["text"]
-
-        except Exception as e:
-            logger.error(f"Exception during image extraction: {e}")
-            yield ErrorFrame("Failed to extract image from context")
-            return
-
-        if not image_bytes:
-            logger.error("No image found in context")
-            yield ErrorFrame("No image found in context")
-            return
-
-        logger.debug(
-            f"Analyzing image (bytes length: {len(image_bytes) if image_bytes else 'None'})"
-        )
-
-        def get_image_description(bytes: bytes, text: Optional[str]) -> str:
-            image_buffer = BytesIO(bytes)
-            image = Image.open(image_buffer)
+        def get_image_description(image_bytes: bytes, text: Optional[str]) -> str:
+            image = Image.frombytes(frame.format, frame.size, image_bytes)
            image_embeds = self._model.encode_image(image)
            description = self._model.query(image_embeds, text)["answer"]
            return description

-        description = await asyncio.to_thread(get_image_description, image_bytes, text)
+        description = await asyncio.to_thread(get_image_description, frame.image, frame.text)

        yield TextFrame(text=description)