feat: Add sentence aggregation and Whisker debugger to transcript processor

- Enhance TranscriptHandler to aggregate transcript fragments into complete sentences using match_endofsentence()
- Add Whisker debugger integration for real-time pipeline visualization
- Implement sentence buffering for both user and assistant messages
- Add finalize_partial_sentences() method to handle incomplete sentences on disconnect
- Improves transcript readability by reducing fragmented output

Changes:
- Import match_endofsentence utility for sentence boundary detection
- Add pipecat_whisker.WhiskerObserver for debugging capabilities
- Modify on_transcript_update() to accumulate and aggregate messages
- Create _save_sentence() helper method for complete sentence handling
- Update client disconnect handler to preserve partial sentences
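
The sentence buffering described in this message can be sketched roughly as below. This is a minimal illustration, not the code from this commit: `SentenceBuffer`, `ends_sentence()`, `add()` and `finalize()` are hypothetical names, and `ends_sentence()` is a simplified regex stand-in for pipecat's `match_endofsentence()`, whose actual boundary rules are more thorough.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical stand-in for match_endofsentence(): a text counts as a
# complete sentence if it ends with ., !, ? or an ellipsis character.
_END_OF_SENTENCE = re.compile(r"[.!?…]\s*$")


def ends_sentence(text: str) -> bool:
    return bool(_END_OF_SENTENCE.search(text))


class SentenceBuffer:
    """Accumulates transcript fragments per role and saves whole sentences."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}       # role -> partial sentence
        self.saved: List[Tuple[str, str]] = []   # (role, sentence) pairs

    def add(self, role: str, fragment: str) -> None:
        # Join the new fragment onto whatever is already buffered for this role.
        text = (self._pending.get(role, "") + " " + fragment.strip()).strip()
        if ends_sentence(text):
            self.saved.append((role, text))
            self._pending[role] = ""
        else:
            self._pending[role] = text

    def finalize(self) -> None:
        # On disconnect, keep any partial sentences instead of dropping them.
        for role, text in self._pending.items():
            if text:
                self.saved.append((role, text))
        self._pending.clear()
```

Keeping one buffer per role means interleaved user and assistant fragments do not get merged into a single sentence, and calling `finalize()` from the disconnect handler preserves whatever was still incomplete.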
2025-09-25 14:01:19 +08:00
209 changed files with 9009 additions and 16190 deletions
--- a/.github/workflows/publish.yaml
+++ b/.github/workflows/publish.yaml
@@ -5,25 +5,25 @@ on:
    inputs:
      gitref:
        type: string
-        description: 'what git tag to build (e.g. v0.0.74)'
+        description: "what git tag to build (e.g. v0.0.74)"
        required: true

 jobs:
  build:
-    name: 'Build and upload wheels'
+    name: "Build and upload wheels"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.gitref }}
-
+      
      - name: Install uv
        uses: astral-sh/setup-uv@v3
        with:
-          version: 'latest'
+          version: "latest"
      - name: Set up Python
-        run: uv python install 3.12
+        run: uv python install 3.10
      - name: Install development dependencies
        run: uv sync --group dev
      - name: Build project
@@ -35,9 +35,9 @@ jobs:
          path: ./dist

  publish-to-pypi:
-    name: 'Publish to PyPI'
+    name: "Publish to PyPI"
    runs-on: ubuntu-latest
-    needs: [build]
+    needs: [ build ]
    environment:
      name: pypi
      url: https://pypi.org/p/pipecat-ai
@@ -56,12 +56,12 @@ jobs:
          print-hash: true

  publish-to-test-pypi:
-    name: 'Publish to Test PyPI'
+    name: "Publish to Test PyPI"
    runs-on: ubuntu-latest
-    needs: [build]
+    needs: [ build ]
    environment:
      name: testpypi
-      url: https://test.pypi.org/p/pipecat-ai
+      url: https://pypi.org/p/pipecat-ai
    permissions:
      id-token: write
    steps:
@@ -70,7 +70,7 @@ jobs:
        with:
          name: wheels
          path: ./dist
-      - name: Publish to Test PyPI
+      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          verbose: true
--- a/.github/workflows/publish_test.yaml
+++ b/.github/workflows/publish_test.yaml
@@ -4,7 +4,7 @@ on: workflow_dispatch

 jobs:
  build:
-    name: 'Build and upload wheels'
+    name: "Build and upload wheels"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
@@ -15,9 +15,9 @@ jobs:
      - name: Install uv
        uses: astral-sh/setup-uv@v3
        with:
-          version: 'latest'
+          version: "latest"
      - name: Set up Python
-        run: uv python install 3.12
+        run: uv python install 3.10
      - name: Install development dependencies
        run: uv sync --group dev
      - name: Build project
@@ -29,12 +29,12 @@ jobs:
          path: ./dist

  publish-to-test-pypi:
-    name: 'Publish to Test PyPI'
+    name: "Publish to Test PyPI"
    runs-on: ubuntu-latest
    needs: [build]
    environment:
      name: testpypi
-      url: https://test.pypi.org/p/pipecat-ai
+      url: https://pypi.org/p/pipecat-ai
    permissions:
      id-token: write
    steps:
@@ -43,7 +43,7 @@ jobs:
        with:
          name: wheels
          path: ./dist
-      - name: Publish to Test PyPI
+      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          verbose: true
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,574 +5,6 @@ All notable changes to **Pipecat** will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [0.0.92] - 2025-10-31 🎃 "The Haunted Edition" 👻
-
-### Added
-
- Added a new `DeepgramHttpTTSService`, which delivers a meaningful reduction
-  in latency when compared to the `DeepgramTTSService`.
-
- Add support for `speaking_rate` input parameter in `GoogleHttpTTSService`.
-
- Added `enable_speaker_diarization` and `enable_language_identification` to
-  `SonioxSTTService`.
-
- Added `SpeechmaticsTTSService`, which uses Speechmatic's TTS API. Updated
-  examples 07a\* to use the new TTS service.
-
- Added support for including images or audio to LLM context messages using
-  `LLMContext.create_image_message()` or `LLMContext.create_image_url_message()`
-  (not all LLMs support URLs) and `LLMContext.create_audio_message()`. For
-  example, when creating `LLMMessagesAppendFrame`:
-
-  ```python
-  message = LLMContext.create_image_message(image=..., size= ...)
-  await self.push_frame(LLMMessagesAppendFrame(messages=[message], run_llm=True))
-  ```
-
- New event handlers for the `DeepgramFluxSTTService`: `on_start_of_turn`,
-  `on_turn_resumed`, `on_end_of_turn`, `on_eager_end_of_turn`, `on_update`.
-
- Added `generation_config` parameter support to `CartesiaTTSService` and
-  `CartesiaHttpTTSService` for Cartesia Sonic-3 models. Includes a new
-  `GenerationConfig` class with `volume` (0.5-2.0), `speed` (0.6-1.5),
-  and `emotion` (60+ options) parameters for fine-grained speech generation
-  control.
-
- Expanded support for univeral `LLMContext` to `OpenAIRealtimeLLMService`.
-  As a reminder, the context-setup pattern when using `LLMContext` is:
-
-  ```python
-  context = LLMContext(messages, tools)
-  context_aggregator = LLMContextAggregatorPair(context)
-  ```
-
-  (Note that even though `OpenAIRealtimeLLMService` now supports the universal
-  `LLMContext`, it is not meant to be swapped out for another LLM service at
-  runtime with `LLMSwitcher`.)
-
-  Note: `TranscriptionFrame`s and `InterimTranscriptionFrame`s now go upstream
-  from `OpenAIRealtimeLLMService`, so if you're using `TranscriptProcessor`,
-  say, you'll want to adjust accordingly:
-
-  ```python
-  pipeline = Pipeline(
-    [
-      transport.input(),
-      context_aggregator.user(),
-
-      # BEFORE
-      llm,
-      transcript.user(),
-
-      # AFTER
-      transcript.user(),
-      llm,
-
-      transport.output(),
-      transcript.assistant(),
-      context_aggregator.assistant(),
-    ]
-  )
-  ```
-
-  Also worth noting: whether or not you use the new context-setup pattern with
-  `OpenAIRealtimeLLMService`, some types have changed under the hood:
-
-  ```python
-  ## BEFORE:
-
-  # Context aggregator type
-  context_aggregator: OpenAIContextAggregatorPair
-
-  # Context frame type
-  frame: OpenAILLMContextFrame
-
-  # Context type
-  context: OpenAIRealtimeLLMContext
-  # or
-  context: OpenAILLMContext
-
-  ## AFTER:
-
-  # Context aggregator type
-  context_aggregator: LLMContextAggregatorPair
-
-  # Context frame type
-  frame: LLMContextFrame
-
-  # Context type
-  context: LLMContext
-  ```
-
-  Also note that `RealtimeMessagesUpdateFrame` and
-  `RealtimeFunctionCallResultFrame` have been deprecated, since they're no
-  longer used by `OpenAIRealtimeLLMService`. OpenAI Realtime now works more
-  like other LLM services in Pipecat, relying on updates to its context, pushed
-  by context aggregators, to update its internal state. Listen for
-  `LLMContextFrame`s for context updates.
-
-  Finally, `LLMTextFrame`s are no longer pushed from `OpenAIRealtimeLLMService`
-  when it's configured with `output_modalities=['audio']`. If you need
-  to process its output, listen for `TTSTextFrame`s instead.
-
- Expanded support for universal `LLMContext` to `GeminiLiveLLMService`.
-  As a reminder, the context-setup pattern when using `LLMContext` is:
-
-  ```python
-  context = LLMContext(messages, tools)
-  context_aggregator = LLMContextAggregatorPair(context)
-  ```
-
-  (Note that even though `GeminiLiveLLMService` now supports the universal
-  `LLMContext`, it is not meant to be swapped out for another LLM service at
-  runtime with `LLMSwitcher`.)
-
-  Worth noting: whether or not you use the new context-setup pattern with
-  `GeminiLiveLLMService`, some types have changed under the hood:
-
-  ```python
-  ## BEFORE:
-
-  # Context aggregator type
-  context_aggregator: GeminiLiveContextAggregatorPair
-
-  # Context frame type
-  frame: OpenAILLMContextFrame
-
-  # Context type
-  context: GeminiLiveLLMContext
-  # or
-  context: OpenAILLMContext
-
-  ## AFTER:
-
-  # Context aggregator type
-  context_aggregator: LLMContextAggregatorPair
-
-  # Context frame type
-  frame: LLMContextFrame
-
-  # Context type
-  context: LLMContext
-  ```
-
-  Also note that `LLMTextFrame`s are no longer pushed from `GeminiLiveLLMService`
-  when it's configured with `modalities=GeminiModalities.AUDIO`. If you need
-  to process its output, listen for `TTSTextFrame`s instead.
-
-### Changed
-
- The development runner's `/start` endpoint now supports passing
-  `dailyRoomProperties` and `dailyMeetingTokenProperties` in the request body
-  when `createDailyRoom` is true. Properties are validated against the
-  `DailyRoomProperties` and `DailyMeetingTokenProperties` types respectively
-  and passed to Daily's room and token creation APIs.
-
- `UserImageRawFrame` new fields `append_to_context` and `text`. The
-  `append_to_context` field indicates if this image and text should be added to
-  the LLM context (by the LLM assistant aggregator). The `text` field, if set,
-  might also guide the LLM or the vision service on how to analyze the image.
-
- `UserImageRequestFrame` new fiels `append_to_context` and `text`. Both fields
-  will be used to set the same fields on the captured `UserImageRawFrame`.
-
- `UserImageRequestFrame` don't require function call name and ID anymore.
-
- Updated `MoondreamService` to process `UserImageRawFrame`.
-
- `VisionService` expects `UserImageRawFrame` in order to analyze images.
-
- `DailyTransport` triggers `on_error` event if transcription can't be started
-  or stopped.
-
- `DailyTransport` updates: `start_dialout()` now returns two values:
-  `session_id` and `error`. `start_recording()` now returns two values:
-  `stream_id` and `error`.
-
- Updated `daily-python` to 0.21.0.
-
- `SimliVideoService` now accepts `api_key` and `face_id` parameters directly,
-  with optional `params` for `max_session_length` and `max_idle_time`
-  configuration, aligning with other Pipecat service patterns.
-
- Updated the default model to `sonic-3` for `CartesiaTTSService` and
-  `CartesiaHttpTTSService`.
-
- `FunctionFilter` now has a `filter_system_frames` arg, which controls whether
-  or not SystemFrames are filtered.
-
- Upgraded `aws_sdk_bedrock_runtime` to v0.1.1 to resolve potential CPU issues
-  when running `AWSNovaSonicLLMService`.
-
-### Deprecated
-
- The `expect_stripped_words` parameter of `LLMAssistantAggregatorParams` is
-  ignored when used with the newer `LLMAssistantAggregator`, which now handles
-  word spacing automatically.
-
- `LLMService.request_image_frame()` is deprecated, push a
-  `UserImageRequestFrame` instead.
-
- `UserResponseAggregator` is deprecated and will be removed in a future version.
-
- The `send_transcription_frames` argument to `OpenAIRealtimeLLMService` is
-  deprecated. Transcription frames are now always sent. They go upstream, to be
-  handled by the user context aggregator. See "Added" section for details.
-
- Types in `pipecat.services.openai.realtime.context` and
-  `pipecat.services.openai.realtime.frames` are deprecated, as they're no
-  longer used by `OpenAIRealtimeLLMService`. See "Added" section for details.
-
- `SimliVideoService` `simli_config` parameter is deprecated. Use `api_key` and
-  `face_id` parameters instead.
-
-### Removed
-
- Removed `enable_non_final_tokens` and `max_non_final_tokens_duration_ms` from
-  `SonioxSTTService`.
-
- Removed the `aiohttp_session` arg from `SarvamTTSService` as it's no longer
-  used.
-
-### Fixed
-
- Fixed a `PipelineTask` issue that was causing an idle timeout for frames that
-  were being generated but not reaching the end of the pipeline. Since the exact
-  point when frames are discarded is unknown, we now monitor pipeline frames
-  using an observer. If the observer detects frames are being generated, it will
-  prevent the pipeline from being considered idle.
-
- Fixed an issue in `HumeTTSService` that was only using Octave 2, which does
-  not support the `description` field. Now, if a description is provided, it
-  switches to Octave 1.
-
- Fixed an issue where `DailyTransport` would timeout prematurely on join and on
-  leave.
-
- Fixed an issue in the runner where starting a DailyTransport room via
-  `/start` didn't support using the `DAILY_SAMPLE_ROOM_URL` env var.
-
- Fixed an issue in `ServiceSwitcher` where the `STTService`s would result in
-  all STT services producing `TranscriptionFrame`s.
-
-### Other
-
- Updated all vision 12-series foundational examples to load images from a file.
-
- Added 14-series video examples for different services. These new examples
-  request an image from the user camera through a function call.
-
-## [0.0.91] - 2025-10-21
-
-### Added
-
- It is now possible to start a bot from the `/start` endpoint when using the
-  runner Daily's transport. This follows the Pipecat Cloud format with
-  `createDailyRoom` and `body` fields in the POST request body.
-
- Added an ellipsis character (`…`) to the end of sentence detection in the
-  string utils.
-
- Expanded support for universal `LLMContext` to `AWSNovaSonicLLMService`.
-  As a reminder, the context-setup pattern when using `LLMContext` is:
-
-  ```python
-  context = LLMContext(messages, tools)
-  context_aggregator = LLMContextAggregatorPair(context)
-  ```
-
-  (Note that even though `AWSNovaSonicLLMService` now supports the universal
-  `LLMContext`, it is not meant to be swapped out for another LLM service at
-  runtime with `LLMSwitcher`.)
-
-  Worth noting: whether or not you use the new context-setup pattern with
-  `AWSNovaSonicLLMService`, some types have changed under the hood:
-
-  ```python
-  ## BEFORE:
-
-  # Context aggregator type
-  context_aggregator: AWSNovaSonicContextAggregatorPair
-
-  # Context frame type
-  frame: OpenAILLMContextFrame
-
-  # Context type
-  context: AWSNovaSonicLLMContext
-  # or
-  context: OpenAILLMContext
-
-  ## AFTER:
-
-  # Context aggregator type
-  context_aggregator: LLMContextAggregatorPair
-
-  # Context frame type
-  frame: LLMContextFrame
-
-  # Context type
-  context: LLMContext
-  ```
-
- Added support for `bulbul:v3` model in `SarvamTTSService` and
-  `SarvamHttpTTSService`.
-
- Added `keyterms_prompt` parameter to `AssemblyAIConnectionParams`.
-
- Added `speech_model` parameter to `AssemblyAIConnectionParams` to access the
-  multilingual model.
-
- Added support for trickle ICE to the `SmallWebRTCTransport`.
-
- Added support for updating `OpenAITTSService` settings (`instructions` and
-  `speed`) at runtime via `TTSUpdateSettingsFrame`.
-
- Added `--whatsapp` flag to runner to better surface WhatsApp transport logs.
-
- Added `on_connected` and `on_disconnected` events to TTS and STT
-  websocket-based services.
-
- Added an `aggregate_sentences` arg in `ElevenLabsHttpTTSService`, where the
-  default value is True.
-
- Added a `room_properties` arg to the Daily runner's `configure()` method,
-  allowing `DailyRoomProperties` to be provided.
-
- The runner `--folder` argument now supports downloading files from
-  subdirectories.
-
-### Changed
-
- `RunnerArguments` now include the `body` field, so there's no need to add it
-  to subclasses. Also, all `RunnerArguments` fields are now keyword-only.
-
- `CartesiaSTTService` now inherits from `WebsocketSTTService`.
-
- Package upgrades:
-
-  - `daily-python` upgraded to 0.20.0.
-  - `openai` upgraded to support up to 2.x.x.
-  - `openpipe` upgraded to support up to 5.x.x.
-
- `SpeechmaticsSTTService` updated dependencies for `speechmatics-rt>=0.5.0`.
-
-### Deprecated
-
- The `send_transcription_frames` argument to `AWSNovaSonicLLMService` is
-  deprecated. Transcription frames are now always sent. They go upstream, to be
-  handled by the user context aggregator. See "Added" section for details.
-
- Types in `pipecat.services.aws.nova_sonic.context` are deprecated, as they're
-  no longer used by `AWSNovaSonicLLMService`. See "Added" section for
-  details.
-
-### Fixed
-
- Fixed an issue where the `RTVIProcessor` was sending duplicate
-  `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` messages.
-
- Fixed an issue in `AWSBedrockLLMService` where both `temperature` and `top_p`
-  were always sent together, causing conflicts with models like Claude Sonnet 4.5
-  that don't allow both parameters simultaneously. The service now only includes
-  inference parameters that are explicitly set, and `InputParams` defaults have
-  been changed to `None` to rely on AWS Bedrock's built-in model defaults.
-
- Fixed an issue in `RivaSegmentedSTTService` where a runtime error occurred due
-  to a mismatch in the `_handle_transcription` method's signature.
-
- Fixed multiple pipeline task cancellation issues. `asyncio.CancelledError` is
-  now handled properly in `PipelineTask` making it possible to cancel an asyncio
-  task that it's executing a `PipelineRunner` cleanly. Also,
-  `PipelineTask.cancel()` does not block anymore waiting for the `CancelFrame`
-  to reach the end of the pipeline (going back to the behavior in < 0.0.83).
-
- Fixed an issue in `ElevenLabsTTSService` and `ElevenLabsHttpTTSService` where
-  the Flash models would split words, resulting in a space being inserted
-  between words.
-
- Fixed an issue where audio filters' `stop()` would not be called when using
-  `CancelFrame`.
-
- Fixed an issue in `ElevenLabsHttpTTSService`, where
-  `apply_text_normalization` was incorrectly set as a query parameter. It's now
-  being added as a request parameter.
-
- Fixed an issue where `RimeHttpTTSService` and `PiperTTSService` could generate
-  incorrectly 16-bit aligned audio frames, potentially leading to internal
-  errors or static audio.
-
- Fixed an issue in `SpeechmaticsSTTService` where `AdditionalVocabEntry` items
-  needed to have `sounds_like` for the session to start.
-
-### Other
-
- Added foundational example `47-sentry-metrics.py`, demonstrating how to use the
-  `SentryMetrics` processor.
-
- Added foundational example `14x-function-calling-openpipe.py`.
-
-## [0.0.90] - 2025-10-10
-
-### Added
-
- Added audio filter `KrispVivaFilter` using the Krisp VIVA SDK.
-
- Added `--folder` argument to the runner, allowing files saved in that folder
-  to be downloaded from `http://HOST:PORT/file/FILE`.
-
- Added `GeminiLiveVertexLLMService`, for accessing Gemini Live via Google
-  Vertex AI.
-
- Added some new configuration options to `GeminiLiveLLMService`:
-
-  - `thinking`
-  - `enable_affective_dialog`
-  - `proactivity`
-
-  Note that these new configuration options require using a newer model than
-  the default, like "gemini-2.5-flash-native-audio-preview-09-2025". The last
-  two require specifying `http_options=HttpOptions(api_version="v1alpha")`.
-
- Added `on_pipeline_error` event to `PipelineTask`. This event will get fired
-  when an `ErrorFrame` is pushed (use `FrameProcessor.push_error()`).
-
-  ```python
-  @task.event_handler("on_pipeline_error")
-  async def on_pipeline_error(task: PipelineTask, frame: ErrorFrame):
-      ...
-  ```
-
- Added a `service_tier` `InputParam` to the `BaseOpenAILLMService`. This
-  parameter can influence the latency of the response. For example `"priority"`
-  will result in faster completions, but in exchange for a higher price.
-
-### Changed
-
- Updated `GeminiLiveLLMService` to use the `google-genai` library rather than
-  use WebSockets directly.
-
-### Deprecated
-
- `LivekitFrameSerializer` is now deprecated. Use `LiveKitTransport` instead.
-
- `pipecat.service.openai_realtime` is now deprecated, use
-  `pipecat.services.openai.realtime` instead or
-  `pipecat.services.azure.realtime` for Azure Realtime.
-
- `pipecat.service.aws_nova_sonic` is now deprecated, use
-  `pipecat.services.aws.nova_sonic` instead.
-
- `GeminiMultimodalLiveLLMService` is now deprecated, use
-  `GeminiLiveLLMService`.
-
-### Fixed
-
- Fixed a `GoogleVertexLLMService` issue that would generate an error if no
-  token information was returned.
-
- `GeminiLiveLLMService` will now end gracefully (i.e. after the bot has
-  finished) upon receiving an `EndFrame`.
-
- `GeminiLiveLLMService` will try to seamlessly reconnect when it loses its
-  connection.
-
-## [0.0.89] - 2025-10-07
-
-### Fixed
-
- Reverted a change introduced in 0.0.88 that was causing pipelines to be frozen
-  when using interruption strategies and processors that block interruption
-  frames (e.g. `STTMuteFilter`).
-
-## [0.0.88] - 2025-10-07
-
-### Added
-
- Added support for Nano Banana models to `GoogleLLMService`. For example, you
-  can now use the `gemini-2.5-flash-image` model to generate images.
-
- Added `HumeTTSService` for text-to-speech synthesis using Hume AI's expressive
-  voice models. Provides high-quality, emotionally expressive speech synthesis
-  with support for various voice models. Includes example in
-  `examples/foundational/07ad-interruptible-hume.py`. Use with:
-  `uv pip install pipecat-ai[hume]`.
-
-### Changed
-
- Updated default `GoogleLLMService` model to `gemini-2.5-flash`.
-
-### Deprecated
-
- PlayHT is shutting down their API on December 31st, 2025. As a result,
-  `PlayHTTTSService` and `PlayHTHttpTTSService` are deprecated and will be
-  removed in a future version.
-
-### Fixed
-
- Fixed an issue with `AWSNovaSonicLLMService` where the client wouldn't
-  connect due to a breaking change in the AWS dependency chain.
-
- `PermissionError` is now caught if NLTK's `punkt_tab` can't be downloaded.
-
- Fixed an issue that would cause wrong user/assistant context ordering when
-  using interruption strategies.
-
- Fixed RTVI incoming message handling, broken in 0.0.87.
-
-## [0.0.87] - 2025-10-02
-
-### Added
-
- Added `WebsocketSTTService` base class for websocket-based STT services.
-  Combines STT functionality with websocket connectivity, providing automatic
-  error handling and reconnection capabilities with exponential backoff.
-
- Added `DeepgramFluxSTTService` for real-time speech recognition using
-  Deepgram's Flux WebSocket API. Flux understands conversational flow and
-  automatically handles turn-taking.
-
- Added RTVI messages for user/bot audio levels and system logs.
-
- Include OpenAI-based LLM services cached tokens to `MetricsFrame`.
-
-### Changed
-
- Updated the default model for `AnthropicLLMService` to
-  `claude-sonnet-4-5-20250929`.
-
-### Deprecated
-
- `DailyTransportMessageFrame` and `DailyTransportMessageUrgentFrame` are
-  deprecated, use `DailyOutputTransportMessageFrame` and
-  `DailyOutputTransportMessageUrgentFrame` respectively instead.
-
- `LiveKitTransportMessageFrame` and `LiveKitTransportMessageUrgentFrame` are
-  deprecated, use `LiveKitOutputTransportMessageFrame` and
-  `LiveKitOutputTransportMessageUrgentFrame` respectively instead.
-
- `TransportMessageFrame` and `TransportMessageUrgentFrame` are deprecated, use
-  `OutputTransportMessageFrame` and `OutputTransportMessageUrgentFrame`
-  respectively instead.
-
- `InputTransportMessageUrgentFrame` is deprecated, use
-  `InputTransportMessageFrame` instead.
-
- `DailyUpdateRemoteParticipantsFrame` is deprecated and will be removed in a
-  future version. Instead, create your own custom frame and handle it in the
-  `@transport.output().event_handler("on_after_push_frame")` event handler or a
-  custom processor.
-
-## Fixed
-
- Fixed an issue in `AWSBedrockLLMService` where timeout exceptions weren't
-  being detected.
-
- Fixed a `PipelineTask` issue that could prevent the application to exit if
-  `task.cancel()` was called when the task was already finished.
-
- Fixed an issue where local SmartTurn was not being ran in a separate thread.
-
 ## [0.0.86] - 2025-09-24

 ### Added
@@ -1403,8 +835,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

- Added `SonioxSTTService` using Soniox's STT websocket API.
-
 - Added `enable_emulated_vad_interruptions` to `LLMUserAggregatorParams`.
  When user speech is emulated (e.g. when a transcription is received but
  VAD doesn't detect speech), this parameter controls whether the emulated
@@ -1896,7 +1326,7 @@ quality and critical bugs impacting `ParallelPipelines` functionality.**
 - Added `session_token` parameter to `AWSNovaSonicLLMService`.

 - Added Gemini Multimodal Live File API for uploading, fetching, listing, and
-  deleting files. See `26f-gemini-live-files-api.py` for example usage.
+  deleting files. See `26f-gemini-multimodal-live-files-api.py` for example usage.

 ### Changed

@@ -3902,7 +3332,7 @@ stt = DeepgramSTTService(..., live_options=LiveOptions(model="nova-2-general"))
 - Added the new modalities option and helper function to set Gemini output
  modalities.

- Added `examples/foundational/26d-gemini-live-text.py` which is
+- Added `examples/foundational/26d-gemini-multimodal-live-text.py` which is
  using Gemini as TEXT modality and using another TTS provider for TTS process.

 ### Changed
@@ -4089,9 +3519,9 @@ stt = DeepgramSTTService(..., live_options=LiveOptions(model="nova-2-general"))
 - Added new foundational examples for `GeminiMultimodalLiveLLMService`:

  - `26-gemini-multimodal-live.py`
-  - `26a-gemini-live-transcription.py`
-  - `26b-gemini-live-video.py`
-  - `26c-gemini-live-video.py`
+  - `26a-gemini-multimodal-live-transcription.py`
+  - `26b-gemini-multimodal-live-video.py`
+  - `26c-gemini-multimodal-live-video.py`

 - Added `SimliVideoService`. This is an integration for Simli AI avatars.
  (see https://www.simli.com)
--- a/COMMUNITY_INTEGRATIONS.md
+++ b/COMMUNITY_INTEGRATIONS.md
@@ -1,336 +0,0 @@
-# Community Integrations Guide
-
-Pipecat welcomes community-maintained integrations! As our ecosystem grows, we've established a process for any developer to create and maintain their own service integrations while ensuring discoverability for the Pipecat community.
-
-## Overview
-
-**What we support:** Community-maintained integrations that live in separate repositories and are maintained by their authors.
-
-**What we don't do:** The Pipecat team does not code review, test, or maintain community integrations. We provide guidance and list approved integrations for discoverability.
-
-**Why this approach:** This allows the community to move quickly while keeping the Pipecat core team focused on maintaining the framework itself.
-
-## Submitting your Integration
-
-To be listed as an official community integration, follow these steps:
-
-### Step 1: Build Your Integration
-
-Create your integration following the patterns and examples shown in the "Integration Patterns and Examples" section below.
-
-### Step 2: Set Up Your Repository
-
-Your repository must contain these components:
-
- **Source code** - Complete implementation following Pipecat patterns
- **Foundational example** - Single file example showing basic usage (see [Pipecat examples](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational))
- **README.md** - Must include:
-
-  - Introduction and explanation of your integration
-  - Installation instructions
-  - Usage instructions with Pipecat Pipeline
-  - How to run your example
-  - Pipecat version compatibility (e.g., "Tested with Pipecat v0.0.86")
-  - Company attribution: If you work for the company providing the service, please mention this in your README. This helps build confidence that the integration will be actively maintained.
-
- **LICENSE** - Permissive license (BSD-2 like Pipecat, or equivalent open source terms)
- **Code documentation** - Source code with docstrings (we recommend following [Pipecat's docstring conventions](https://github.com/pipecat-ai/pipecat/blob/main/CONTRIBUTING.md#docstring-conventions))
- **Changelog** - Maintain a changelog for version updates
-
-### Step 3: Join Discord
-
-Join our Discord: https://discord.gg/pipecat
-
-### Step 4: Submit for Listing
-
-Submit a pull request to add your integration to our [Community Integrations documentation page](https://docs.pipecat.ai/server/services/community-integrations).
-
-**To submit:**
-
-1. Fork the [Pipecat docs repository](https://github.com/pipecat-ai/docs)
-2. Edit the file `server/services/community-integrations.mdx`
-3. Add your integration to the appropriate service category table with:
-   - Service name
-   - Link to your repository
-   - Maintainer GitHub username(s)
-4. Include a link to your demo video (approx 30-60 seconds) in your PR description showing:
-   - Core functionality of your integration
-   - Handling of an interruption (if applicable to service type)
-5. Submit your pull request
-
-Once your PR is submitted, post in the `#community-integrations` Discord channel to let us know.
-
-## Integration Patterns and Examples
-
-### STT (Speech-to-Text) Services
-
-#### Websocket-based Services
-
-**Base class:** `STTService`
-
-**Examples:**
-
- [DeepgramSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/deepgram/stt.py)
- [SpeechmaticsSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/speechmatics/stt.py)
-
-#### File-based Services
-
-**Base class:** `SegmentedSTTService`
-
-**Examples:**
-
- [RivaSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/riva/stt.py)
- [FalSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/fal/stt.py)
-
-#### Key requirements:
-
- STT services should push `InterimTranscriptionFrames` and `TranscriptionFrames`
- If confidence values are available, filter for values >50% confidence
-
-### LLM (Large Language Model) Services
-
-#### OpenAI-Compatible Services
-
-**Base class:** `OpenAILLMService`
-
-**Examples:**
-
- [AzureLLMService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/azure/llm.py)
- [GrokLLMService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/grok/llm.py) - Shows overriding the base class where needed
-
-#### Non-OpenAI Compatible Services
-
-**Requires:** Full implementation
-
-**Examples:**
-
- [AnthropicLLMService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/anthropic/llm.py)
- [GoogleLLMService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/google/llm.py)
-
-#### Key requirements:
-
- **Frame sequence:** Output must follow this frame sequence pattern:
-
-  - `LLMFullResponseStartFrame` - Signals the start of an LLM response
-  - `LLMTextFrame` - Contains LLM content, typically streamed as tokens
-  - `LLMFullResponseEndFrame` - Signals the end of an LLM response
-
- **Context aggregation:** Implement context aggregation to collect user and assistant content:
-  - Aggregators come in pairs with a `user()` instance and `assistant()` instance
-  - Context must adhere to the `LLMContext` universal format
-  - Aggregators should handle adding messages, function calls, and images to the context
-
-### TTS (Text-to-Speech) Services
-
-#### AudioContextWordTTSService
-
-**Use for:** Websocket-based services supporting word/timestamp alignment
-
-**Example:**
-
- [CartesiaTTSService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/cartesia/tts.py)
-
-#### InterruptibleTTSService
-
-**Use for:** Websocket-based services without word/timestamp alignment, requiring disconnection on interruption
-
-**Example:**
-
- [SarvamTTSService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/sarvam/tts.py)
-
-#### WordTTSService
-
-**Use for:** HTTP-based services supporting word/timestamp alignment
-
-**Example:**
-
- [ElevenLabsHttpTTSService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/elevenlabs/tts.py)
-
-#### TTSService
-
-**Use for:** HTTP-based services without word/timestamp alignment
-
-**Example:**
-
- [GoogleHttpTTSService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/google/tts.py)
-
-#### Key requirements:
-
- For websocket services, use asyncio WebSocket implementation (required for v13+ support)
- Handle idle service timeouts with keepalives
- TTSServices push both audio (`TTSAudioRawFrame`) and text (`TTSTextFrame`) frames
-
-### Telephony Serializers
-
-Pipecat supports telephony provider integration using websocket connections to exchange MediaStreams. These services use a FrameSerializer to serialize and deserialize inputs from the FastAPIWebsocketTransport.
-
-**Examples:**
-
- [Twilio](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/serializers/twilio.py)
- [Telnyx](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/serializers/telnyx.py)
-
-#### Key requirements:
-
- Include hang-up functionality using the provider's native API, ideally using `aiohttp`
- Support DTMF (dual-tone multi-frequency) events if the provider supports them:
-  - Deserialize DTMF events from the provider's protocol to `InputDTMFFrame`
-  - Use `KeypadEntry` enum for valid keypad entries (0-9, \*, #, A-D)
-  - Handle invalid DTMF digits gracefully by returning `None`
-
-### Image Generation Services
-
-**Base class:** `ImageGenService`
-
-**Examples:**
-
- [FalImageGenService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/fal/image.py)
- [GoogleImageGenService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/google/image.py)
-
-#### Key requirements:
-
- Must implement `run_image_gen` method returning an `AsyncGenerator`
-
-### Vision Services
-
-Vision services process images and provide analysis such as descriptions, object detection, or visual question answering.
-
-**Base class:** `VisionService`
-
-**Example:**
-
- [MoondreamVisionService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/moondream/vision.py)
-
-#### Key requirements:
-
- Must implement `run_vision` method that takes an `LLMContext` and returns an `AsyncGenerator[Frame, None]`
- The method processes the latest image in the context and yields frames with analysis results
- Typically yields `TextFrame` objects containing descriptions or answers
-
-## Implementation Guidelines
-
-### Naming Conventions
-
- **STT:** `VendorSTTService`
- **LLM:** `VendorLLMService`
- **TTS:**
-  - Websocket: `VendorTTSService`
-  - HTTP: `VendorHttpTTSService`
- **Image:** `VendorImageGenService`
- **Vision:** `VendorVisionService`
- **Telephony:** `VendorFrameSerializer`
-
-### Metrics Support
-
-Enable metrics in your service:
-
-```python
-def can_generate_metrics(self) -> bool:
-    """Check if this service can generate processing metrics.
-
-    Returns:
-        True, as this service supports metrics.
-    """
-    return True
-```
-
-### Dynamic Settings Updates
-
-STT, LLM, and TTS services support `ServiceUpdateSettingsFrame` for dynamic configuration changes. The base `STTService` has an `_update_settings()` method that handles settings updates, and the private `_settings` dict stores the settings and makes them available to subclasses.
-
-```python
-async def set_language(self, language: Language):
-    """Set the recognition language and reconnect.
-
-    Args:
-        language: The language to use for speech recognition.
-    """
-    logger.info(f"Switching STT language to: [{language}]")
-    self._settings["language"] = language
-    await self._disconnect()
-    await self._connect()
-```
-
-Note that, in this example, Deepgram requires the websocket connection to be closed and reopened so that the service reinitializes with the new value. Consider whether your service requires reconnection.
-
-### Sample Rate Handling
-
-Sample rates are set via PipelineParams and passed to each frame processor at initialization. The pattern is to _not_ set the sample rate value in the constructor of a given service. Instead, use the `start()` method to initialize sample rates from the frame:
-
-```python
-async def start(self, frame: StartFrame):
-    """Start the service."""
-    await super().start(frame)
-    self._settings["output_format"]["sample_rate"] = self.sample_rate
-    await self._connect()
-```
-
-Note that `self.sample_rate` is a `@property` set in the TTSService base class, which provides access to the private sample rate value obtained from the StartFrame.
-
-### Tracing Decorators
-
-Use Pipecat's tracing decorators:
-
- **STT:** `@traced_stt` - decorate a function that handles `transcript`, `is_final`, `language` as args
- **LLM:** `@traced_llm` - decorate the `_process_context()` method
- **TTS:** `@traced_tts` - decorate the `run_tts()` method
-
-## Best Practices
-
-### Packaging and Distribution
-
- Use [uv](https://docs.astral.sh/uv/) for packaging (encouraged)
- Consider releasing to PyPI for easier installation
- Follow semantic versioning principles
- Maintain a changelog
-
-### HTTP Communication
-
-For REST-based communication, use aiohttp. Pipecat includes this as a required dependency, so using it prevents adding an additional dependency to your integration.
-
-### Error Handling
-
- Wrap API calls in appropriate try/except blocks
- Handle rate limits and network failures gracefully
- Provide meaningful error messages
- When errors occur, raise exceptions AND push `ErrorFrame`s to notify the pipeline:
-
-```python
-from pipecat.frames.frames import ErrorFrame
-
-try:
-    # Your API call
-    result = await self._make_api_call()
-except Exception as e:
-    # Push error frame to pipeline
-    await self.push_error(ErrorFrame(error=f"{self} error: {e}"))
-    # Raise or handle as appropriate
-    raise
-```
-
-### Testing
-
- Your foundational example serves as a valuable integration-level test
- Unit tests are nice to have. As the Pipecat team provides better guidance, we will encourage unit testing more.
-
-## Disclaimer
-
-Community integrations are community-maintained and not officially supported by the Pipecat team. Users should evaluate these integrations independently. The Pipecat team reserves the right to remove listings that become unmaintained or problematic.
-
-## Staying Up to Date
-
-Pipecat evolves rapidly to support the latest AI technologies and patterns. While we strive to minimize breaking changes, they do occur as the framework matures.
-
-**We strongly recommend:**
-
- Join our Discord at https://discord.gg/pipecat and monitor the `#announcements` channel for release notifications
- Follow our changelog: https://github.com/pipecat-ai/pipecat/blob/main/CHANGELOG.md
- Test your integration against new Pipecat releases promptly
- Update your README with the last tested Pipecat version
-
-This helps ensure your integration remains compatible and your users have clear expectations about version support.
-
-## Questions?
-
-Join our Discord community at https://discord.gg/pipecat and post in the `#community-integrations` channel for guidance and support.
-
-For additional questions, you can also reach out to us at pipecat-ai@daily.co.
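The DTMF requirements in the removed guide (deserialize provider digits to a keypad enum covering 0-9, \*, #, A-D, and return `None` for invalid digits) can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions: `StandInKeypadEntry` and `parse_dtmf_digit` are hypothetical names defined here, not Pipecat's own `KeypadEntry` or any serializer API. A real serializer would presumably wrap a valid result in an `InputDTMFFrame`.

```python
from enum import Enum
from typing import Optional


class StandInKeypadEntry(Enum):
    """Hypothetical stand-in for a keypad enum (0-9, *, #, A-D)."""

    ZERO = "0"
    ONE = "1"
    TWO = "2"
    THREE = "3"
    FOUR = "4"
    FIVE = "5"
    SIX = "6"
    SEVEN = "7"
    EIGHT = "8"
    NINE = "9"
    STAR = "*"
    POUND = "#"
    A = "A"
    B = "B"
    C = "C"
    D = "D"


def parse_dtmf_digit(digit: str) -> Optional[StandInKeypadEntry]:
    """Map a provider-supplied digit to a keypad entry.

    Returns None instead of raising when the digit is not a valid key,
    so the caller can skip the event without interrupting the stream.
    """
    try:
        return StandInKeypadEntry(digit)
    except ValueError:
        return None
```

Returning `None` rather than raising keeps a single malformed provider message from tearing down the call; the caller decides whether to log or ignore it.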
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,9 +1,5 @@
 ## Contributing to Pipecat

-**Want to add a new service integration?**
-We encourage community-maintained integrations! Please see our [Community Integration Guide](COMMUNITY_INTEGRATIONS.md) for the process and requirements.
-
-**Want to contribute to Pipecat core?**
 We welcome contributions of all kinds! Your help is appreciated. Follow these steps to get involved:

 1. **Fork this repository**: Start by forking the Pipecat Documentation repository to your GitHub account.
--- a/README.md
+++ b/README.md
@@ -3,7 +3,6 @@
 </div></h1>

 [![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![codecov](https://codecov.io/gh/pipecat-ai/pipecat/graph/badge.svg?token=LNVUIVO4Y9)](https://codecov.io/gh/pipecat-ai/pipecat) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/pipecat-ai/pipecat)
-[![](https://getmanta.ai/api/badges?text=Manta%20Graph&link=manta)](https://getmanta.ai/pipecat)

 # 🎙️ Pipecat: Real-Time Voice & Multimodal AI Agents

@@ -20,6 +19,10 @@
 - **Business Agents** – customer intake, support bots, guided flows
 - **Complex Dialog Systems** – design logic with structured conversations

+🧭 Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
+
+🔍 Looking for help debugging your pipeline and processors? Check out [Whisker](https://github.com/pipecat-ai/whisker), a real-time Pipecat debugger.
+
 ## 🧠 Why Pipecat?

 - **Voice-first**: Integrates speech recognition, text-to-speech, and conversation handling
@@ -27,38 +30,40 @@
 - **Composable Pipelines**: Build complex behavior from modular components
 - **Real-Time**: Ultra-low latency interaction with different transports (e.g. WebSockets or WebRTC)

-## 🌐 Pipecat Ecosystem
+## 📱 Client SDKs

-### 📱 Client SDKs
+You can connect to Pipecat from any platform using our official SDKs:

-Building client applications? You can connect to Pipecat from any platform using our official SDKs:
-
-<a href="https://docs.pipecat.ai/client/js/introduction">JavaScript</a> | <a href="https://docs.pipecat.ai/client/react/introduction">React</a> | <a href="https://docs.pipecat.ai/client/react-native/introduction">React Native</a> |
-<a href="https://docs.pipecat.ai/client/ios/introduction">Swift</a> | <a href="https://docs.pipecat.ai/client/android/introduction">Kotlin</a> | <a href="https://docs.pipecat.ai/client/c++/introduction">C++</a> | <a href="https://github.com/pipecat-ai/pipecat-esp32">ESP32</a>
-
-### 🧭 Structured conversations
-
-Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
-
-### 🪄 Beautiful UIs
-
-Want to build beautiful and engaging experiences? Check out the [Voice UI Kit](https://github.com/pipecat-ai/voice-ui-kit), a collection of components, hooks and templates for building voice AI applications quickly.
-
-### 🛠️ Create and deploy projects
-
-Create a new project in under a minute with the [Pipecat CLI](https://github.com/pipecat-ai/pipecat-cli). Then use the CLI to monitor and deploy your agent to production.
-
-### 🔍 Debugging
-
-Looking for help debugging your pipeline and processors? Check out [Whisker](https://github.com/pipecat-ai/whisker), a real-time Pipecat debugger.
-
-### 🖥️ Terminal
-
-Love terminal applications? Check out [Tail](https://github.com/pipecat-ai/tail), a terminal dashboard for Pipecat.
-
-### 📺️ Pipecat TV Channel
-
-Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.youtube.com/playlist?list=PLzU2zoMTQIHjqC3v4q2XVSR3hGSzwKFwH) channel.
+<table>
+  <tr>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/javascript/javascript-original.svg" width="40" height="40" alt="JavaScript"/>
+      <a href="https://docs.pipecat.ai/client/js/introduction">JavaScript</a>
+    </td>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React"/>
+      <a href="https://docs.pipecat.ai/client/react/introduction">React</a>
+    </td>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React Native"/>
+      <a href="https://docs.pipecat.ai/client/react-native/introduction">React Native</a>
+    </td>
+  </tr>
+  <tr>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/swift/swift-original.svg" width="40" height="40" alt="Swift"/>
+      <a href="https://docs.pipecat.ai/client/ios/introduction">Swift</a>
+    </td>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/kotlin/kotlin-original.svg" width="40" height="40" alt="Kotlin"/>
+      <a href="https://docs.pipecat.ai/client/android/introduction">Kotlin</a>
+    </td>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/cplusplus/cplusplus-original.svg" width="40" height="40" alt="C++"/>
+      <a href="https://docs.pipecat.ai/client/c++/introduction">C++</a>
+    </td>
+  </tr>
+</table>

 ## 🎬 See it in action

@@ -67,24 +72,24 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
    <a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/storytelling-chatbot/image.png" width="400" /></a>
    <br/>
    <a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/translation-chatbot/image.png" width="400" /></a>&nbsp;
-    <a href="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/12-describe-video.py"><img src="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/assets/moondream.png" width="400" /></a>
+    <a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/moondream-chatbot/image.png" width="400" /></a>
 </p>

 ## 🧩 Available services

-| Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
-| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper)                                                                                                                                                                                                                                                        |
-| LLMs                | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together)                                                                                                                                                                                                                              |
-| Text-to-Speech      | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
-| Speech-to-Speech    | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
-| Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
-| Serializers         | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
-| Video               | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
-| Memory              | [mem0](https://docs.pipecat.ai/server/services/memory/mem0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
-| Vision & Image      | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
-| Audio Processing    | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
-| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
+| Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
+| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper)                                                                                                                    |
+| LLMs                | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova), [Together AI](https://docs.pipecat.ai/server/services/llm/together)                                                                                         |
+| Text-to-Speech      | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
+| Speech-to-Speech    | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+| Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+| Serializers         | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
+| Video               | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+| Memory              | [mem0](https://docs.pipecat.ai/server/services/memory/mem0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+| Vision & Image      | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+| Audio Processing    | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |

 📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)

@@ -179,6 +184,54 @@ Run a specific test suite:
 uv run pytest tests/test_name.py
 ```

+### Setting up your editor
+
+This project uses strict [PEP 8](https://peps.python.org/pep-0008/) formatting via [Ruff](https://github.com/astral-sh/ruff).
+
+#### Emacs
+
+You can use [use-package](https://github.com/jwiegley/use-package) to install the [emacs-lazy-ruff](https://github.com/christophermadsen/emacs-lazy-ruff) package and configure `ruff` arguments:
+
+```elisp
+(use-package lazy-ruff
+  :ensure t
+  :hook ((python-mode . lazy-ruff-mode))
+  :config
+  (setq lazy-ruff-format-command "ruff format")
+  (setq lazy-ruff-check-command "ruff check --select I"))
+```
+
+`ruff` was installed in the `venv` environment described before, so you should be able to use [pyvenv-auto](https://github.com/ryotaro612/pyvenv-auto) to automatically load that environment inside Emacs.
+
+```elisp
+(use-package pyvenv-auto
+  :ensure t
+  :defer t
+  :hook ((python-mode . pyvenv-auto-run)))
+```
+
+#### Visual Studio Code
+
+Install the
+[Ruff](https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff) extension. Then edit the user settings (_Ctrl-Shift-P_ `Open User Settings (JSON)`) and set it as the default Python formatter, and enable formatting on save:
+
+```json
+"[python]": {
+    "editor.defaultFormatter": "charliermarsh.ruff",
+    "editor.formatOnSave": true
+}
+```
+
+#### PyCharm
+
+`ruff` was installed in the `venv` environment described before. To enable autoformatting on save, go to `File` -> `Settings` -> `Tools` -> `File Watchers` and add a new watcher with the following settings:
+
+1. **Name**: `Ruff formatter`
+2. **File type**: `Python`
+3. **Working directory**: `$ContentRoot$`
+4. **Arguments**: `format $FilePath$`
+5. **Program**: `$PyInterpreterDirectory$/ruff`
+
 ## 🤝 Contributing

 We welcome contributions from the community! Whether you're fixing bugs, improving documentation, or adding new features, here's how you can help:
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -1,5 +0,0 @@
-# Security Policy
-
-## Reporting a Vulnerability
-
-Please email `disclosures@daily.co`.
--- a/docs/api/conf.py
+++ b/docs/api/conf.py
@@ -50,7 +50,6 @@ autodoc_mock_imports = [
    # Krisp - has build issues on some platforms
    "pipecat_ai_krisp",
    "krisp",
-    "krisp_audio",
    # System-specific GUI libraries
    "_tkinter",
    "tkinter",
--- a/env.example
+++ b/env.example
@@ -4,9 +4,6 @@ AICOUSTICS_LICENSE_KEY=...
 # Anthropic
 ANTHROPIC_API_KEY=...

-# Assembly AI
-ASSEMBLYAI_API_KEY=...
-
 # Async
 ASYNCAI_API_KEY=...
 ASYNCAI_VOICE_ID=...
@@ -24,19 +21,12 @@ AZURE_CHATGPT_API_KEY=...
 AZURE_CHATGPT_ENDPOINT=https://...
 AZURE_CHATGPT_MODEL=...

-AZURE_REALTIME_API_KEY=...
-AZURE_REALTIME_BASE_URL=...
-
 AZURE_DALLE_API_KEY=...
 AZURE_DALLE_ENDPOINT=https://...
 AZURE_DALLE_MODEL=...

 # Cartesia
 CARTESIA_API_KEY=...
-CARTESIA_VOICE_ID=...
-
-# Cerebras
-CEREBRAS_API_KEY=...

 # Daily
 DAILY_API_KEY=...
@@ -45,75 +35,39 @@ DAILY_SAMPLE_ROOM_URL=https://...
 # Deepgram
 DEEPGRAM_API_KEY=...

-# DeepSeek
-DEEPSEEK_API_KEY=...
-
 # ElevenLabs
 ELEVENLABS_API_KEY=...
 ELEVENLABS_VOICE_ID=...

+# Neuphonic
+NEUPHONIC_API_KEY=...
+
 # Fal
 FAL_KEY=...

 # Fireworks
 FIREWORKS_API_KEY=...

-# Fish Audio
-FISH_API_KEY=...
-
 # Gladia
 GLADIA_API_KEY=...
 GLADIA_REGION=...

 # Google
 GOOGLE_API_KEY=...
-GOOGLE_VERTEX_TEST_CREDENTIALS=...
 GOOGLE_CLOUD_PROJECT_ID=...
-GOOGLE_CLOUD_LOCATION=...
 GOOGLE_TEST_CREDENTIALS=...
-
-# Grok
-GROK_API_KEY=...
-
-# Groq
-GROQ_API_KEY=...
-
-# Heygen
-HEYGEN_API_KEY=...
-
-# Hume
-HUME_API_KEY=...
-HUME_VOICE_ID=...
-
-# Inworld
-INWORLD_API_KEY=...
-
-# Krisp
-KRISP_MODEL_PATH=...
-
-# Krisp Viva
-KRISP_VIVA_MODEL_PATH=...
-
-# LiveKit
-LIVEKIT_API_KEY=...
-LIVEKIT_API_SECRET=...
+GOOGLE_VERTEX_TEST_CREDENTIALS=...

 # LMNT
 LMNT_API_KEY=...
 LMNT_VOICE_ID=...

-# MiniMax
-MINIMAX_API_KEY=...
-MINIMAX_GROUP_ID=...
+# Perplexity
+PERPLEXITY_API_KEY=...

-# Mistral
-MISTRAL_API_KEY=...
-
-# Neuphonic
-NEUPHONIC_API_KEY=...
-
-# NVIDIA
-NVIDIA_API_KEY=...
+# PlayHT
+PLAYHT_USER_ID=...
+PLAYHT_API_KEY=...

 # OpenAI
 OPENAI_API_KEY=...
@@ -121,73 +75,83 @@ OPENAI_API_KEY=...
 # OpenPipe
 OPENPIPE_API_KEY=...

-# OpenRouter
-OPENROUTER_API_KEY=...
-
-# Perplexity
-PERPLEXITY_API_KEY=...
-
-# Picovoice Koala
-KOALA_ACCESS_KEY=...
-
-# Piper
-PIPER_BASE_URL=...
-
-# PlayHT
-PLAYHT_USER_ID=...
-PLAYHT_API_KEY=...
-
-# Plivo
-PLIVO_AUTH_ID=...
-PLIVO_AUTH_TOKEN=...
-
-# Qwen
-QWEN_API_KEY=...
-
-# Rime
-RIME_API_KEY=...
-RIME_VOICE_ID=...
-
-# SambaNova
-SAMBANOVA_API_KEY=...
-
-# Sarvam AI
-SARVAM_API_KEY=...
-
-# Sentry
-SENTRY_DSN=...
+# Tavus
+TAVUS_API_KEY=...
+TAVUS_REPLICA_ID=...
+TAVUS_PERSONA_ID=...

 # Simli
 SIMLI_API_KEY=...
 SIMLI_FACE_ID=...

-# Smart turn
-LOCAL_SMART_TURN_MODEL_PATH=...
-FAL_SMART_TURN_API_KEY=...
+# Krisp
+KRISP_MODEL_PATH=...

-# Soniox
-SONIOX_API_KEY=...
+# DeepSeek
+DEEPSEEK_API_KEY=...

-# Speechmatics
-SPEECHMATICS_API_KEY=...
+# Groq
+GROQ_API_KEY=...

-# Tavus
-TAVUS_API_KEY=...
-TAVUS_REPLICA_ID=...
+# Grok
+GROK_API_KEY=...

-# Telnyx
-TELNYX_API_KEY=...
-TELNYX_ACCOUNT_SID=...
+# Inworld
+INWORLD_API_KEY=...

 # Together.ai
 TOGETHER_API_KEY=...

+# Cerebras
+CEREBRAS_API_KEY=...
+
+# Fish Audio
+FISH_API_KEY=...
+
+# Assembly AI
+ASSEMBLYAI_API_KEY=...
+
+# OpenRouter
+OPENROUTER_API_KEY=...
+
+# Piper
+PIPER_BASE_URL=...
+
+# Smart turn
+LOCAL_SMART_TURN_MODEL_PATH=...
+FAL_SMART_TURN_API_KEY=...
+
 # Twilio
 TWILIO_ACCOUNT_SID=...
 TWILIO_AUTH_TOKEN=...

-# WhatsApp
-WHATSAPP_TOKEN=...
-WHATSAPP_WEBHOOK_VERIFICATION_TOKEN=...
-WHATSAPP_PHONE_NUMBER_ID=...
-WHATSAPP_APP_SECRET=...
+# MiniMax
+MINIMAX_API_KEY=...
+MINIMAX_GROUP_ID=...
+
+# Sarvam AI
+SARVAM_API_KEY=...
+
+# Soniox
+SONIOX_API_KEY=...
+
+# Speechmatics
+SPEECHMATICS_API_KEY=...
+
+# SambaNova
+SAMBANOVA_API_KEY=...
+
+# Sentry
+SENTRY_DSN=...
+
+# Heygen
+HEYGEN_API_KEY=...
+
+# Mistral
+MISTRAL_API_KEY=...
+
+# NVIDIA
+NVIDIA_API_KEY=...
+
+# Qwen
+QWEN_API_KEY=...
--- a/examples/foundational/04a-transports-daily.py
+++ b/examples/foundational/04a-transports-daily.py
@@ -25,7 +25,7 @@ from pipecat.processors.aggregators.llm_response_universal import LLMContextAggr
 from pipecat.runner.daily import configure
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.daily.transport import DailyParams, DailyTransport
+from pipecat.transports.daily.transport import DailyLogLevel, DailyParams, DailyTransport

 load_dotenv(override=True)

@@ -49,6 +49,7 @@ async def main():
                turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
            ),
        )
+        transport.set_log_level(DailyLogLevel.Info)

        tts = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
--- a/examples/foundational/07-interruptible.py
+++ b/examples/foundational/07-interruptible.py
@@ -21,8 +21,8 @@ from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.stt import CartesiaSTTService
 from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -58,7 +58,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
--- a/examples/foundational/07a-interruptible-speechmatics-vad.py
+++ b/examples/foundational/07a-interruptible-speechmatics-vad.py
@@ -6,7 +6,6 @@

 import os

-import aiohttp
 from dotenv import load_dotenv
 from loguru import logger

@@ -21,10 +20,10 @@ from pipecat.processors.aggregators.llm_response import (
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
+from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
 from pipecat.services.openai.base_llm import BaseOpenAILLMService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
-from pipecat.services.speechmatics.tts import SpeechmaticsTTSService
 from pipecat.transcriptions.language import Language
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -52,127 +51,121 @@ transport_params = {


 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    """Speechmatics STT and TTS Service Example
+    """Speechmatics STT Service Example

-    This example demonstrates using Speechmatics Speech-to-Text and Text-to-Speech services
-    with speaker diarization and intelligent speaker management. Key features:
+    This example demonstrates using the Speechmatics Speech-to-Text service with speaker diarization and intelligent speaker management. Key features:

-    1. Speaker Diarization (STT)
+    1. Speaker Diarization
       - Automatically identifies and distinguishes between different speakers
       - First speaker is identified as 'S1', others get subsequent IDs
       - Uses `enable_diarization` parameter to manage speaker detection

-    2. Smart Speaker Control (STT)
+    2. Smart Speaker Control
       - `focus_speakers` parameter lets you target specific speakers (e.g. ["S1"])
       - Other speakers will be wrapped in PASSIVE tags
       - Only processes speech from focused speakers
       - Words from all speakers are wrapped with XML tags for clear speaker identification
       - Other speakers' speech only sent when focused speaker is active

-    3. Voice Activity Detection (STT)
+    3. Voice Activity Detection
       - Built-in VAD using `enable_vad` parameter
       - Remove `vad_analyzer` from `transport` config to use module's VAD
       - Emits speaker started/stopped events

-    4. Text-to-Speech (TTS)
-       - Low latency streaming audio synthesis
-       - Multiple voice options available including `sarah`, `theo`, and `megan`
-
-    5. Configuration Options
+    4. Configuration Options
       - `operating_point` parameter defaults to `ENHANCED` for optimal accuracy
       - Configurable `end_of_utterance_silence_trigger` (default 0.5s)
       - Customizable speaker formatting
       - Additional diarization settings available

-    For detailed information:
-    - STT: https://docs.speechmatics.com/rt-api-ref
-    - TTS: https://docs.speechmatics.com/text-to-speech/quickstart
+    For detailed information about operating points and configuration:
+    https://docs.speechmatics.com/rt-api-ref
    """

    logger.info(f"Starting bot")
-    async with aiohttp.ClientSession() as session:
-        stt = SpeechmaticsSTTService(
-            api_key=os.getenv("SPEECHMATICS_API_KEY"),
-            params=SpeechmaticsSTTService.InputParams(
-                language=Language.EN,
-                enable_vad=True,
-                enable_diarization=True,
-                focus_speakers=["S1"],
-                end_of_utterance_silence_trigger=0.5,
-                speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
-                speaker_passive_format="<PASSIVE><{speaker_id}>{text}</{speaker_id}></PASSIVE>",
+
+    stt = SpeechmaticsSTTService(
+        api_key=os.getenv("SPEECHMATICS_API_KEY"),
+        params=SpeechmaticsSTTService.InputParams(
+            language=Language.EN,
+            enable_vad=True,
+            enable_diarization=True,
+            focus_speakers=["S1"],
+            end_of_utterance_silence_trigger=0.5,
+            speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
+            speaker_passive_format="<PASSIVE><{speaker_id}>{text}</{speaker_id}></PASSIVE>",
+        ),
+    )
+
+    tts = ElevenLabsTTSService(
+        api_key=os.getenv("ELEVENLABS_API_KEY"),
+        voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
+        model="eleven_turbo_v2_5",
+    )
+
+    llm = OpenAILLMService(
+        api_key=os.getenv("OPENAI_API_KEY"),
+        params=BaseOpenAILLMService.InputParams(temperature=0.75),
+    )
+
+    messages = [
+        {
+            "role": "system",
+            "content": (
+                "You are a helpful British assistant called Alfred. "
+                "Your goal is to demonstrate your capabilities in a succinct way. "
+                "Your output will be converted to audio so don't include special characters in your answers. "
+                "Always include punctuation in your responses. "
+                "Give very short replies - do not give longer replies unless strictly necessary. "
+                "Respond to what the user said in a concise, funny, creative and helpful way. "
+                "Use `<Sn/>` tags to identify different speakers - do not use tags in your replies. "
+                "Do not respond to speakers within `<PASSIVE/>` tags unless explicitly asked to. "
            ),
-        )
+        },
+    ]

-        tts = SpeechmaticsTTSService(
-            api_key=os.getenv("SPEECHMATICS_API_KEY"),
-            voice_id="sarah",
-            aiohttp_session=session,
-        )
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(
+        context,
+        user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
+    )

-        llm = OpenAILLMService(
-            api_key=os.getenv("OPENAI_API_KEY"),
-            params=BaseOpenAILLMService.InputParams(temperature=0.75),
-        )
-
-        messages = [
-            {
-                "role": "system",
-                "content": (
-                    "You are a helpful British assistant called Sarah. "
-                    "Your goal is to demonstrate your capabilities in a succinct way. "
-                    "Your output will be converted to audio so don't include special characters in your answers. "
-                    "Always include punctuation in your responses. "
-                    "Give very short replies - do not give longer replies unless strictly necessary. "
-                    "Respond to what the user said in a concise, funny, creative and helpful way. "
-                    "Use `<Sn/>` tags to identify different speakers - do not use tags in your replies. "
-                    "Do not respond to speakers within `<PASSIVE/>` tags unless explicitly asked to. "
-                ),
-            },
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
        ]
+    )

-        context = LLMContext(messages)
-        context_aggregator = LLMContextAggregatorPair(
-            context,
-            user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
-        )
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )

-        pipeline = Pipeline(
-            [
-                transport.input(),  # Transport user input
-                stt,
-                context_aggregator.user(),  # User responses
-                llm,  # LLM
-                tts,  # TTS
-                transport.output(),  # Transport bot output
-                context_aggregator.assistant(),  # Assistant spoken responses
-            ]
-        )
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected")
+        # Kick off the conversation.
+        messages.append({"role": "system", "content": "Say a short hello to the user."})
+        await task.queue_frames([LLMRunFrame()])

-        task = PipelineTask(
-            pipeline,
-            params=PipelineParams(
-                enable_metrics=True,
-                enable_usage_metrics=True,
-            ),
-            idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-        )
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+        await task.cancel()

-        @transport.event_handler("on_client_connected")
-        async def on_client_connected(transport, client):
-            logger.info(f"Client connected")
-            # Kick off the conversation.
-            messages.append({"role": "system", "content": "Say a short hello to the user."})
-            await task.queue_frames([LLMRunFrame()])
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-        @transport.event_handler("on_client_disconnected")
-        async def on_client_disconnected(transport, client):
-            logger.info(f"Client disconnected")
-            await task.cancel()
-
-        runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-        await runner.run(task)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/foundational/07a-interruptible-speechmatics.py
+++ b/examples/foundational/07a-interruptible-speechmatics.py
@@ -6,7 +6,6 @@

 import os

-import aiohttp
 from dotenv import load_dotenv
 from loguru import logger

@@ -25,10 +24,10 @@ from pipecat.processors.aggregators.llm_response import (
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
+from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
 from pipecat.services.openai.base_llm import BaseOpenAILLMService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
-from pipecat.services.speechmatics.tts import SpeechmaticsTTSService
 from pipecat.transcriptions.language import Language
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -62,106 +61,100 @@ transport_params = {


 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    """Run example using Speechmatics STT and TTS.
+    """Run example using Speechmatics STT.

-    This example demonstrates a complete Speechmatics integration with both Speech-to-Text
-    and Text-to-Speech services:
+    This example enables diarization in the STT service: the words spoken by each speaker
+    are wrapped in XML tags for the LLM to process. Note the matching instructions in the
+    system context for the LLM. Tagging speakers improves the conversation experience by
+    letting the LLM understand who is speaking in a multi-party call.

-    STT Features:
-    - Diarization to identify and distinguish between different speakers
-    - Words spoken by each speaker are wrapped with XML tags for LLM processing
-    - System context instructions help the LLM understand multi-party conversations
-    - ENHANCED operating point by default for optimal accuracy
+    By default, this example will use our ENHANCED operating point, which is optimized for
+    high accuracy. You can change this by setting the `operating_point` parameter to a different
+    value.

-    TTS Features:
-    - Low latency streaming audio synthesis
-    - Multiple voice options available including `sarah`, `theo`, and `megan`
-
-    For more information:
-    - STT: https://docs.speechmatics.com/rt-api-ref
-    - TTS: https://docs.speechmatics.com/text-to-speech/quickstart
+    For more information on operating points, see the Speechmatics documentation:
+    https://docs.speechmatics.com/rt-api-ref
    """
    logger.info(f"Starting bot")

-    async with aiohttp.ClientSession() as session:
-        stt = SpeechmaticsSTTService(
-            api_key=os.getenv("SPEECHMATICS_API_KEY"),
-            params=SpeechmaticsSTTService.InputParams(
-                language=Language.EN,
-                enable_diarization=True,
-                end_of_utterance_silence_trigger=0.5,
-                speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
+    stt = SpeechmaticsSTTService(
+        api_key=os.getenv("SPEECHMATICS_API_KEY"),
+        params=SpeechmaticsSTTService.InputParams(
+            language=Language.EN,
+            enable_diarization=True,
+            end_of_utterance_silence_trigger=0.5,
+            speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
+        ),
+    )
+
+    tts = ElevenLabsTTSService(
+        api_key=os.getenv("ELEVENLABS_API_KEY"),
+        voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
+        model="eleven_turbo_v2_5",
+    )
+
+    llm = OpenAILLMService(
+        api_key=os.getenv("OPENAI_API_KEY"),
+        params=BaseOpenAILLMService.InputParams(temperature=0.75),
+    )
+
+    messages = [
+        {
+            "role": "system",
+            "content": (
+                "You are a helpful British assistant called Alfred. "
+                "Your goal is to demonstrate your capabilities in a succinct way. "
+                "Your output will be converted to audio so don't include special characters in your answers. "
+                "Always include punctuation in your responses. "
+                "Give very short replies - do not give longer replies unless strictly necessary. "
+                "Respond to what the user said in a concise, funny, creative and helpful way. "
+                "Use `<Sn/>` tags to identify different speakers - do not use tags in your replies."
            ),
-        )
+        },
+    ]

-        tts = SpeechmaticsTTSService(
-            api_key=os.getenv("SPEECHMATICS_API_KEY"),
-            voice_id="sarah",
-            aiohttp_session=session,
-        )
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(
+        context,
+        user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
+    )

-        llm = OpenAILLMService(
-            api_key=os.getenv("OPENAI_API_KEY"),
-            params=BaseOpenAILLMService.InputParams(temperature=0.75),
-        )
-
-        messages = [
-            {
-                "role": "system",
-                "content": (
-                    "You are a helpful British assistant called Sarah. "
-                    "Your goal is to demonstrate your capabilities in a succinct way. "
-                    "Your output will be converted to audio so don't include special characters in your answers. "
-                    "Always include punctuation in your responses. "
-                    "Give very short replies - do not give longer replies unless strictly necessary. "
-                    "Respond to what the user said in a concise, funny, creative and helpful way. "
-                    "Use `<Sn/>` tags to identify different speakers - do not use tags in your replies."
-                ),
-            },
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
        ]
+    )

-        context = LLMContext(messages)
-        context_aggregator = LLMContextAggregatorPair(
-            context,
-            user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
-        )
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )

-        pipeline = Pipeline(
-            [
-                transport.input(),  # Transport user input
-                stt,  # STT
-                context_aggregator.user(),  # User responses
-                llm,  # LLM
-                tts,  # TTS
-                transport.output(),  # Transport bot output
-                context_aggregator.assistant(),  # Assistant spoken responses
-            ]
-        )
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected")
+        # Kick off the conversation.
+        messages.append({"role": "system", "content": "Say a short hello to the user."})
+        await task.queue_frames([LLMRunFrame()])

-        task = PipelineTask(
-            pipeline,
-            params=PipelineParams(
-                enable_metrics=True,
-                enable_usage_metrics=True,
-            ),
-            idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-        )
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+        await task.cancel()

-        @transport.event_handler("on_client_connected")
-        async def on_client_connected(transport, client):
-            logger.info(f"Client connected")
-            # Kick off the conversation.
-            messages.append({"role": "system", "content": "Say a short hello to the user."})
-            await task.queue_frames([LLMRunFrame()])
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)

-        @transport.event_handler("on_client_disconnected")
-        async def on_client_disconnected(transport, client):
-            logger.info(f"Client disconnected")
-            await task.cancel()
-
-        runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-        await runner.run(task)
+    await runner.run(task)


 async def bot(runner_args: RunnerArguments):
--- a/examples/foundational/07ae-interruptible-hume.py
+++ b/examples/foundational/07ae-interruptible-hume.py
@@ -1,138 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIObserver, RTVIProcessor
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.hume.tts import HUME_SAMPLE_RATE, HumeTTSService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = HumeTTSService(
-        api_key=os.getenv("HUME_API_KEY"),
-        # Replace with your Hume voice ID
-        voice_id="f898a92e-685f-43fa-985b-a46920f0650b",
-    )
-
-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    rtvi = RTVIProcessor(config=RTVIConfig(config=[]))
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            rtvi,
-            stt,
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-            audio_out_sample_rate=HUME_SAMPLE_RATE,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-        observers=[RTVIObserver(rtvi)],
-    )
-
-    @rtvi.event_handler("on_client_ready")
-    async def on_client_ready(rtvi):
-        await rtvi.set_bot_ready()
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/07c-interruptible-deepgram-flux.py
+++ b/examples/foundational/07c-interruptible-deepgram-flux.py
@@ -1,122 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_response_universal import (
-    LLMContext,
-    LLMContextAggregatorPair,
-)
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.deepgram.flux.stt import DeepgramFluxSTTService
-from pipecat.services.deepgram.tts import DeepgramTTSService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = DeepgramFluxSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")
-
-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    @stt.event_handler("on_update")
-    async def on_deepgram_flux_update(stt, transcript):
-        logger.debug(f"On deeggram flux update: {transcript}")
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/07c-interruptible-deepgram-http.py
+++ b/examples/foundational/07c-interruptible-deepgram-http.py
@@ -1,132 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-import aiohttp
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.deepgram.tts import DeepgramHttpTTSService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    async with aiohttp.ClientSession() as session:
-        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-        tts = DeepgramHttpTTSService(
-            api_key=os.getenv("DEEPGRAM_API_KEY"),
-            voice="aura-2-andromeda-en",
-            aiohttp_session=session,
-        )
-
-        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
-        messages = [
-            {
-                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-            },
-        ]
-
-        context = LLMContext(messages)
-        context_aggregator = LLMContextAggregatorPair(context)
-
-        pipeline = Pipeline(
-            [
-                transport.input(),  # Transport user input
-                stt,  # STT
-                context_aggregator.user(),  # User responses
-                llm,  # LLM
-                tts,  # TTS
-                transport.output(),  # Transport bot output
-                context_aggregator.assistant(),  # Assistant spoken responses
-            ]
-        )
-
-        task = PipelineTask(
-            pipeline,
-            params=PipelineParams(
-                enable_metrics=True,
-                enable_usage_metrics=True,
-            ),
-            idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-        )
-
-        @transport.event_handler("on_client_connected")
-        async def on_client_connected(transport, client):
-            logger.info(f"Client connected")
-            # Kick off the conversation.
-            messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-            await task.queue_frames([LLMRunFrame()])
-
-        @transport.event_handler("on_client_disconnected")
-        async def on_client_disconnected(transport, client):
-            logger.info(f"Client disconnected")
-            await task.cancel()
-
-        runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-        await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/07d-interruptible-elevenlabs-http.py
+++ b/examples/foundational/07d-interruptible-elevenlabs-http.py
@@ -23,6 +23,7 @@ from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
+from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.elevenlabs.stt import ElevenLabsSTTService
 from pipecat.services.elevenlabs.tts import ElevenLabsHttpTTSService
 from pipecat.services.openai.llm import OpenAILLMService
--- a/examples/foundational/07m-interruptible-aws.py
+++ b/examples/foundational/07m-interruptible-aws.py
@@ -67,8 +67,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    llm = AWSBedrockLLMService(
        aws_region="us-west-2",
-        model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
-        params=AWSBedrockLLMService.InputParams(temperature=0.8),
+        model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
+        params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
    )

    messages = [
--- a/examples/foundational/07n-interruptible-gemini-image.py
+++ b/examples/foundational/07n-interruptible-gemini-image.py
@@ -1,151 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-"""
-A conversational AI bot using Gemini for both LLM, STT and TTS.
-
-This example demonstrates how to use Gemini's image generation capabilities.
-
-Features showcased:
- Gemini LLM for conversation and image generation
- Google TTS and STT
-
-Run with:
-    python examples/foundational/07n-interruptible-gemini-image.py
-
-Make sure to set your environment variables:
-    export GOOGLE_API_KEY=your_api_key_here
-"""
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.google.llm import GoogleLLMService
-from pipecat.services.google.stt import GoogleSTTService
-from pipecat.services.google.tts import GoogleTTSService
-from pipecat.transcriptions.language import Language
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        video_out_enabled=True,
-        video_out_width=1024,
-        video_out_height=1024,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        video_out_enabled=True,
-        video_out_width=1024,
-        video_out_height=1024,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = GoogleSTTService(
-        params=GoogleSTTService.InputParams(languages=Language.EN_US),
-        credentials=os.getenv("GOOGLE_TEST_CREDENTIALS"),
-    )
-
-    tts = GoogleTTSService(
-        voice_id="en-US-Chirp3-HD-Charon",
-        params=GoogleTTSService.InputParams(language=Language.EN_US),
-        credentials=os.getenv("GOOGLE_TEST_CREDENTIALS"),
-    )
-
-    llm = GoogleLLMService(
-        api_key=os.getenv("GOOGLE_API_KEY"),
-        model="gemini-2.5-flash-image",
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # Gemini TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation with a styled introduction
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/07p-interruptible-krisp-viva.py
+++ b/examples/foundational/07p-interruptible-krisp-viva.py
@@ -1,129 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.filters.krisp_viva_filter import KrispVivaFilter
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.deepgram.tts import DeepgramTTSService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-        audio_in_filter=KrispVivaFilter(),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-        audio_in_filter=KrispVivaFilter(),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-        audio_in_filter=KrispVivaFilter(),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-helios-en")
-
-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/08-bots-arguing.py
+++ b/examples/foundational/08-bots-arguing.py
@@ -0,0 +1,147 @@
+import asyncio
+import logging
+import os
+from typing import Tuple
+
+import aiohttp
+from dotenv import load_dotenv
+
+from pipecat.frames.frames import AudioFrame, EndFrame, ImageFrame, LLMContextFrame, TextFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.processors.aggregators import SentenceAggregator
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.runner.daily import configure
+from pipecat.services.azure import AzureLLMService, AzureTTSService
+from pipecat.services.elevenlabs import ElevenLabsTTSService
+from pipecat.services.fal import FalImageGenService
+from pipecat.transports.daily.transport import DailyTransport
+
+load_dotenv(override=True)
+
+logging.basicConfig(format="%(levelno)s %(asctime)s %(message)s")
+logger = logging.getLogger("pipecat")
+logger.setLevel(logging.DEBUG)
+
+
+async def main():
+    async with aiohttp.ClientSession() as session:
+        (room_url, _) = await configure(session)
+
+        transport = DailyTransport(
+            room_url,
+            None,
+            "Respond bot",
+            duration_minutes=10,
+            mic_enabled=True,
+            mic_sample_rate=16000,
+            camera_enabled=True,
+            camera_width=1024,
+            camera_height=1024,
+        )
+
+        llm = AzureLLMService(
+            api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
+            endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
+            model=os.getenv("AZURE_CHATGPT_MODEL"),
+        )
+        tts1 = AzureTTSService(
+            api_key=os.getenv("AZURE_SPEECH_API_KEY"),
+            region=os.getenv("AZURE_SPEECH_REGION"),
+        )
+        tts2 = ElevenLabsTTSService(
+            api_key=os.getenv("ELEVENLABS_API_KEY"),
+            voice_id="jBpfuIE2acCO8z3wKNLl",
+        )
+        dalle = FalImageGenService(
+            params=FalImageGenService.InputParams(image_size="1024x1024"),
+            aiohttp_session=session,
+            key=os.getenv("FAL_KEY"),
+        )
+
+        bot1_messages = [
+            {
+                "role": "system",
+                "content": "You are a stern librarian. You strongly believe that a hot dog is a sandwich. Start by stating this fact in a few sentences, then be prepared to debate this with the user. You shouldn't ever compromise on the fundamental truth that a hot dog is a sandwich. Your responses should only be a few sentences long.",
+            },
+        ]
+        bot2_messages = [
+            {
+                "role": "system",
+                "content": "You are a silly cat, and you strongly believe that a hot dog is not a sandwich. Debate this with the user, only responding with a few sentences. Don't ever accept that a hot dog is a sandwich.",
+            },
+        ]
+
+        async def get_text_and_audio(messages) -> Tuple[str, bytearray]:
+            """Stream text from the LLM and convert it to speech with the TTS
+            service as it is received.
+            """
+            source_queue = asyncio.Queue()
+            sink_queue = asyncio.Queue()
+            sentence_aggregator = SentenceAggregator()
+            pipeline = Pipeline([llm, sentence_aggregator, tts1], source_queue, sink_queue)
+
+            await source_queue.put(LLMContextFrame(LLMContext(messages)))
+            await source_queue.put(EndFrame())
+            await pipeline.run_pipeline()
+
+            message = ""
+            all_audio = bytearray()
+            while sink_queue.qsize():
+                frame = sink_queue.get_nowait()
+                if isinstance(frame, TextFrame):
+                    message += frame.text
+                elif isinstance(frame, AudioFrame):
+                    all_audio.extend(frame.audio)
+
+            return (message, all_audio)
+
+        async def get_bot1_statement():
+            message, audio = await get_text_and_audio(bot1_messages)
+
+            bot1_messages.append({"role": "assistant", "content": message})
+            bot2_messages.append({"role": "user", "content": message})
+
+            return audio
+
+        async def get_bot2_statement():
+            message, audio = await get_text_and_audio(bot2_messages)
+
+            bot2_messages.append({"role": "assistant", "content": message})
+            bot1_messages.append({"role": "user", "content": message})
+
+            return audio
+
+        async def argue():
+            for i in range(100):
+                print(f"In iteration {i}")
+
+                bot1_description = "A woman conservatively dressed as a librarian in a library surrounded by books, cartoon, serious, highly detailed"
+
+                (audio1, image_data1) = await asyncio.gather(
+                    get_bot1_statement(), dalle.run_image_gen(bot1_description)
+                )
+                await transport.send_queue.put(
+                    [
+                        ImageFrame(image_data1[1], image_data1[2]),
+                        AudioFrame(audio1),
+                    ]
+                )
+
+                bot2_description = "A cat dressed in a hot dog costume, cartoon, bright colors, funny, highly detailed"
+
+                (audio2, image_data2) = await asyncio.gather(
+                    get_bot2_statement(), dalle.run_image_gen(bot2_description)
+                )
+                await transport.send_queue.put(
+                    [
+                        ImageFrame(image_data2[1], image_data2[2]),
+                        AudioFrame(audio2),
+                    ]
+                )
+
+        await asyncio.gather(transport.run(), argue())
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
--- a/examples/foundational/12-describe-image-openai.py
+++ b/examples/foundational/12-describe-image-openai.py
@@ -1,141 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-from PIL import Image
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-
-load_dotenv(override=True)
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-    )
-
-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-
-        if not runner_args.body:
-            script_dir = os.path.dirname(__file__)
-            runner_args.body = {
-                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
-                "question": "Describe this image",
-            }
-
-        image_path = runner_args.body["image_path"]
-        question = runner_args.body["question"]
-
-        # Kick off the conversation.
-        image = Image.open(image_path)
-        message = LLMContext.create_image_message(
-            image=image.tobytes(),
-            format="RGB",
-            size=image.size,
-            text=question,
-        )
-        messages.append(message)
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/12-describe-video.py
+++ b/examples/foundational/12-describe-video.py
@@ -0,0 +1,180 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import os
+from typing import Optional
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.user_response import UserResponseAggregator
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import (
+    create_transport,
+    get_transport_client_id,
+    maybe_capture_participant_camera,
+)
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.moondream.vision import MoondreamService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""
+
+    def __init__(self, participant_id: Optional[str] = None):
+        super().__init__()
+        self._participant_id = participant_id
+
+    def set_participant_id(self, participant_id: str):
+        self._participant_id = participant_id
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if self._participant_id and isinstance(frame, TextFrame):
+            await self.push_frame(
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
+            )
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info(f"Starting bot")
+
+    user_response = UserResponseAggregator()
+
+    # Initialize the image requester without setting the participant ID yet
+    image_requester = UserImageRequester()
+
+    image_processor = UserImageProcessor()
+
+    # If you get odd descriptions, try use_cpu=True
+    moondream = MoondreamService()
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            stt,
+            user_response,
+            image_requester,
+            image_processor,
+            moondream,
+            tts,
+            transport.output(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected: {client}")
+
+        await maybe_capture_participant_camera(transport, client)
+
+        # Set the participant ID in the image requester
+        client_id = get_transport_client_id(transport, client)
+        image_requester.set_participant_id(client_id)
+
+        # Welcome message
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/14d-function-calling-gemini-flash-video.py
+++ b/examples/foundational/14d-function-calling-gemini-flash-video.py
@@ -5,23 +5,29 @@
 #

 import os
+from typing import Optional

 from dotenv import load_dotenv
 from loguru import logger

-from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection
+from pipecat.processors.aggregators.user_response import UserResponseAggregator
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
@@ -31,37 +37,53 @@ from pipecat.runner.utils import (
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.google.llm import GoogleLLMService
-from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams

 load_dotenv(override=True)


-async def fetch_user_image(params: FunctionCallParams):
-    """Fetch the user image and push it to the LLM.
+class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""

-    When called, this function pushes a UserImageRequestFrame upstream to the
-    transport. As a result, the transport will request the user image and push a
-    UserImageRawFrame downstream which will be added to the context by the LLM
-    assistant aggregator.
-    """
-    user_id = params.arguments["user_id"]
-    question = params.arguments["question"]
-    logger.debug(f"Requesting image with user_id={user_id}, question={question}")
+    def __init__(self, participant_id: Optional[str] = None):
+        super().__init__()
+        self._participant_id = participant_id

-    # Request a user image frame and indicate that it should be added to the
-    # context.
-    await params.llm.push_frame(
-        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=True),
-        FrameDirection.UPSTREAM,
-    )
+    def set_participant_id(self, participant_id: str):
+        self._participant_id = participant_id

-    await params.result_callback(None)
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)

-    # Instead of None, it's possible to also provide a tool call answer to
-    # tell the LLM that we are grabbing the image to analyze.
-    # await params.result_callback({"result": "Image is being captured."})
+        if self._participant_id and isinstance(frame, TextFrame):
+            await self.push_frame(
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
+            )
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -88,53 +110,33 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

+    user_response = UserResponseAggregator()
+
+    # Initialize the image requester without setting the participant ID yet
+    image_requester = UserImageRequester()
+
+    image_processor = UserImageProcessor()
+
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

+    # Google Gemini model for vision analysis
+    google = GoogleLLMService(model="gemini-2.0-flash-001", api_key=os.getenv("GOOGLE_API_KEY"))
+
    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

-    # Google Gemini model for vision analysis
-    llm = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
-    llm.register_function("fetch_user_image", fetch_user_image)
-
-    fetch_image_function = FunctionSchema(
-        name="fetch_user_image",
-        description="Called when the user requests a description of their camera feed",
-        properties={
-            "user_id": {
-                "type": "string",
-                "description": "The ID of the user to grab the image from",
-            },
-            "question": {
-                "type": "string",
-                "description": "The question that the user is asking about the image",
-            },
-        },
-        required=["user_id", "question"],
-    )
-    tools = ToolsSchema(standard_tools=[fetch_image_function])
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
-        },
-    ]
-
-    context = LLMContext(messages, tools)
-    context_aggregator = LLMContextAggregatorPair(context)
-
    pipeline = Pipeline(
        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
+            transport.input(),
+            stt,
+            user_response,
+            image_requester,
+            image_processor,
+            google,
+            tts,
+            transport.output(),
        ]
    )

@@ -155,15 +157,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        # Set the participant ID in the image requester
        client_id = get_transport_client_id(transport, client)
+        image_requester.set_participant_id(client_id)

-        # Kick off the conversation.
-        messages.append(
-            {
-                "role": "system",
-                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
-            }
-        )
-        await task.queue_frames([LLMRunFrame()])
+        # Welcome message
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12b-describe-image-aws.py
+++ b/examples/foundational/12b-describe-image-aws.py
@@ -1,148 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-from PIL import Image
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.aws.llm import AWSBedrockLLMService
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-
-load_dotenv(override=True)
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-    )
-
-    llm = AWSBedrockLLMService(
-        aws_region="us-west-2",
-        model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
-        # Note: usually, prefer providing latency="optimized" param.
-        # Here we can't because AWS Bedrock doesn't support it for Claude 3.7,
-        # which we need for image input.
-        params=AWSBedrockLLMService.InputParams(temperature=0.8),
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-
-        if not runner_args.body:
-            script_dir = os.path.dirname(__file__)
-            runner_args.body = {
-                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
-                "question": "Describe this image",
-            }
-
-        image_path = runner_args.body["image_path"]
-        question = runner_args.body["question"]
-
-        # Kick off the conversation.
-        image = Image.open(image_path)
-        message = LLMContext.create_image_message(
-            image=image.tobytes(),
-            format="RGB",
-            size=image.size,
-            text=question,
-        )
-        messages.append(message)
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/12b-describe-video-gpt-4o.py
+++ b/examples/foundational/12b-describe-video-gpt-4o.py
@@ -4,9 +4,8 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-import io
 import os
-import re
+from typing import Optional

 from dotenv import load_dotenv
 from loguru import logger
@@ -17,17 +16,24 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import (
    Frame,
-    LLMRunFrame,
-    MetricsFrame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
 )
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.user_response import UserResponseAggregator
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
+from pipecat.runner.utils import (
+    create_transport,
+    get_transport_client_id,
+    maybe_capture_participant_camera,
+)
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.openai.llm import OpenAILLMService
@@ -37,41 +43,46 @@ from pipecat.transports.daily.transport import DailyParams
 load_dotenv(override=True)


-def format_metrics(metrics, indent=0):
-    lines = []
-    tab = "\t" * indent
+class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""

-    for metric in metrics:
-        lines.append(tab + type(metric).__name__)
-        for field, value in vars(metric).items():
-            if hasattr(value, "__dict__") and not isinstance(
-                value, (str, int, float, bool, type(None))
-            ):
-                lines.append(f"{tab}\t{field}={type(value).__name__}")
-                for k, v in vars(value).items():
-                    lines.append(f"{tab}\t\t{k}={repr(v)}")
-            else:
-                lines.append(f"{tab}\t{field}={repr(value)}")
+    def __init__(self, participant_id: Optional[str] = None):
+        super().__init__()
+        self._participant_id = participant_id

-    return "\n".join(lines)
-
-
-class MetricsFrameLogger(FrameProcessor):
-    """MetricsFrameLogger formats and logs all MetericsFrames"""
-
-    def __init__(self, **kwargs):
-        super().__init__(**kwargs)
+    def set_participant_id(self, participant_id: str):
+        self._participant_id = participant_id

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

-        if isinstance(frame, MetricsFrame):
-            logger.info(f"{frame.name}\n    {format_metrics(frame.data)}")
+        if self._participant_id and isinstance(frame, TextFrame):
+            await self.push_frame(
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
+            )
+        else:
            await self.push_frame(frame, direction)

-        # ALWAYS push all frames
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
        else:
-            # SUPER IMPORTANT: always push every frame!
            await self.push_frame(frame, direction)


@@ -82,13 +93,14 @@ transport_params = {
    "daily": lambda: DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
+        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
    "webrtc": lambda: TransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
-        video_out_enabled=True,
+        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
@@ -98,37 +110,33 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

+    user_response = UserResponseAggregator()
+
+    # Initialize the image requester without setting the participant ID yet
+    image_requester = UserImageRequester()
+
+    image_processor = UserImageProcessor()
+
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

+    # OpenAI GPT-4o for vision analysis
+    openai = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    metrics_frame_processor = MetricsFrameLogger()
-
    pipeline = Pipeline(
        [
            transport.input(),
            stt,
-            context_aggregator.user(),
-            llm,
+            user_response,
+            image_requester,
+            image_processor,
+            openai,
            tts,
            transport.output(),
-            context_aggregator.assistant(),
-            metrics_frame_processor,  # pretty print metrics frames
        ]
    )

@@ -144,9 +152,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected: {client}")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
+
+        await maybe_capture_participant_camera(transport, client)
+
+        # Set the participant ID in the image requester
+        client_id = get_transport_client_id(transport, client)
+        image_requester.set_participant_id(client_id)
+
+        # Welcome message
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12c-describe-image-gemini-flash.py
+++ b/examples/foundational/12c-describe-image-gemini-flash.py
@@ -1,141 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-from PIL import Image
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.google.llm import GoogleLLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-
-load_dotenv(override=True)
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-    )
-
-    llm = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-
-        if not runner_args.body:
-            script_dir = os.path.dirname(__file__)
-            runner_args.body = {
-                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
-                "question": "Describe this image",
-            }
-
-        image_path = runner_args.body["image_path"]
-        question = runner_args.body["question"]
-
-        # Kick off the conversation.
-        image = Image.open(image_path)
-        message = LLMContext.create_image_message(
-            image=image.tobytes(),
-            format="RGB",
-            size=image.size,
-            text=question,
-        )
-        messages.append(message)
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/12c-describe-video-anthropic.py
+++ b/examples/foundational/12c-describe-video-anthropic.py
@@ -4,25 +4,36 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-
 import os
+from typing import Optional

 from dotenv import load_dotenv
 from loguru import logger
-from PIL import Image

 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.user_response import UserResponseAggregator
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
+from pipecat.runner.utils import (
+    create_transport,
+    get_transport_client_id,
+    maybe_capture_participant_camera,
+)
 from pipecat.services.anthropic.llm import AnthropicLLMService
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
@@ -32,6 +43,49 @@ from pipecat.transports.daily.transport import DailyParams
 load_dotenv(override=True)


+class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""
+
+    def __init__(self, participant_id: Optional[str] = None):
+        super().__init__()
+        self._participant_id = participant_id
+
+    def set_participant_id(self, participant_id: str):
+        self._participant_id = participant_id
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if self._participant_id and isinstance(frame, TextFrame):
+            await self.push_frame(
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
+            )
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)
+
+
 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
 # selected.
@@ -39,12 +93,14 @@ transport_params = {
    "daily": lambda: DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
+        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
    "webrtc": lambda: TransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
+        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
@@ -54,34 +110,33 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

+    user_response = UserResponseAggregator()
+
+    # Initialize the image requester without setting the participant ID yet
+    image_requester = UserImageRequester()
+
+    image_processor = UserImageProcessor()
+
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

+    # Anthropic for vision analysis
+    anthropic = AnthropicLLMService(api_key=os.getenv("ANTHROPIC_API_KEY"))
+
    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

-    llm = AnthropicLLMService(api_key=os.getenv("ANTHROPIC_API_KEY"))
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
    pipeline = Pipeline(
        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
+            transport.input(),
+            stt,
+            user_response,
+            image_requester,
+            image_processor,
+            anthropic,
+            tts,
+            transport.output(),
        ]
    )

@@ -96,28 +151,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
+        logger.info(f"Client connected: {client}")

-        if not runner_args.body:
-            script_dir = os.path.dirname(__file__)
-            runner_args.body = {
-                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
-                "question": "Describe this image",
-            }
+        await maybe_capture_participant_camera(transport, client)

-        image_path = runner_args.body["image_path"]
-        question = runner_args.body["question"]
+        # Set the participant ID in the image requester
+        client_id = get_transport_client_id(transport, client)
+        image_requester.set_participant_id(client_id)

-        # Kick off the conversation.
-        image = Image.open(image_path)
-        message = LLMContext.create_image_message(
-            image=image.tobytes(),
-            format="RGB",
-            size=image.size,
-            text=question,
-        )
-        messages.append(message)
-        await task.queue_frames([LLMRunFrame()])
+        # Welcome message
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12d-describe-image-moondream.py
+++ b/examples/foundational/12d-describe-image-moondream.py
@@ -1,122 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-from PIL import Image
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import UserImageRawFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.moondream.vision import MoondreamService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-
-load_dotenv(override=True)
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-    )
-
-    vision = MoondreamService()
-
-    pipeline = Pipeline(
-        [
-            vision,  # Vision
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-
-        if not runner_args.body:
-            script_dir = os.path.dirname(__file__)
-            runner_args.body = {
-                "image_path": os.path.join(script_dir, "assets", "cat.jpg"),
-                "question": "Describe this image",
-            }
-
-        image_path = runner_args.body["image_path"]
-        question = runner_args.body["question"]
-
-        # Describe the image.
-        image = Image.open(image_path)
-        await task.queue_frames(
-            [
-                UserImageRawFrame(
-                    image=image.tobytes(),
-                    format="RGB",
-                    size=image.size,
-                    text=question,
-                )
-            ]
-        )
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/14d-function-calling-aws-video.py
+++ b/examples/foundational/14d-function-calling-aws-video.py
@@ -5,23 +5,29 @@
 #

 import os
+from typing import Optional

 from dotenv import load_dotenv
 from loguru import logger

-from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection
+from pipecat.processors.aggregators.user_response import UserResponseAggregator
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
@@ -31,37 +37,54 @@ from pipecat.runner.utils import (
 from pipecat.services.aws.llm import AWSBedrockLLMService
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams

 load_dotenv(override=True)


-async def fetch_user_image(params: FunctionCallParams):
-    """Fetch the user image and push it to the LLM.
+class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""

-    When called, this function pushes a UserImageRequestFrame upstream to the
-    transport. As a result, the transport will request the user image and push a
-    UserImageRawFrame downstream which will be added to the context by the LLM
-    assistant aggregator.
-    """
-    user_id = params.arguments["user_id"]
-    question = params.arguments["question"]
-    logger.debug(f"Requesting image with user_id={user_id}, question={question}")
+    def __init__(self, participant_id: Optional[str] = None):
+        super().__init__()
+        self._participant_id = participant_id

-    # Request a user image frame and indicate that it should be added to the
-    # context.
-    await params.llm.push_frame(
-        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=True),
-        FrameDirection.UPSTREAM,
-    )
+    def set_participant_id(self, participant_id: str):
+        self._participant_id = participant_id

-    await params.result_callback(None)
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)

-    # Instead of None, it's possible to also provide a tool call answer to
-    # tell the LLM that we are grabbing the image to analyze.
-    # await params.result_callback({"result": "Image is being captured."})
+        if self._participant_id and isinstance(frame, TextFrame):
+            await self.push_frame(
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
+            )
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                # Note: AWS Bedrock does not yet support the universal LLMContext
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -88,15 +111,17 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

+    user_response = UserResponseAggregator()
+
+    # Initialize the image requester without setting the participant ID yet
+    image_requester = UserImageRequester()
+
+    image_processor = UserImageProcessor()
+
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-    )
-
    # AWS for vision analysis
-    llm = AWSBedrockLLMService(
+    aws = AWSBedrockLLMService(
        aws_region="us-west-2",
        model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
        # Note: usually, prefer providing latency="optimized" param.
@@ -104,44 +129,22 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        # which we need for image input.
        params=AWSBedrockLLMService.InputParams(temperature=0.8),
    )
-    llm.register_function("fetch_user_image", fetch_user_image)

-    fetch_image_function = FunctionSchema(
-        name="fetch_user_image",
-        description="Called when the user requests a description of their camera feed",
-        properties={
-            "user_id": {
-                "type": "string",
-                "description": "The ID of the user to grab the image from",
-            },
-            "question": {
-                "type": "string",
-                "description": "The question that the user is asking about the image",
-            },
-        },
-        required=["user_id", "question"],
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )
-    tools = ToolsSchema(standard_tools=[fetch_image_function])
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
-        },
-    ]
-
-    context = LLMContext(messages, tools)
-    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
+            transport.input(),
+            stt,
+            user_response,
+            image_requester,
+            image_processor,
+            aws,
+            tts,
+            transport.output(),
        ]
    )

@@ -162,15 +165,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        # Set the participant ID in the image requester
        client_id = get_transport_client_id(transport, client)
+        image_requester.set_participant_id(client_id)

-        # Kick off the conversation.
-        messages.append(
-            {
-                "role": "system",
-                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
-            }
-        )
-        await task.queue_frames([LLMRunFrame()])
+        # Welcome message
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/13f-cartesia-transcription.py
+++ b/examples/foundational/13f-cartesia-transcription.py
@@ -48,7 +48,10 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))
+    stt = CartesiaSTTService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        base_url=os.getenv("CARTESIA_BASE_URL"),
+    )

    tl = TranscriptionLogger()

--- a/examples/foundational/14b-function-calling-anthropic-video.py
+++ b/examples/foundational/14b-function-calling-anthropic-video.py
@@ -4,6 +4,8 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+
+import asyncio
 import os

 from dotenv import load_dotenv
@@ -15,13 +17,12 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
+from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
@@ -38,30 +39,34 @@ from pipecat.transports.daily.transport import DailyParams
 load_dotenv(override=True)


-async def fetch_user_image(params: FunctionCallParams):
-    """Fetch the user image and push it to the LLM.
+# Module-level client ID: set in on_client_connected, read by get_image
+client_id = ""

-    When called, this function pushes a UserImageRequestFrame upstream to the
-    transport. As a result, the transport will request the user image and push a
-    UserImageRawFrame downstream which will be added to the context by the LLM
-    assistant aggregator.
-    """
-    user_id = params.arguments["user_id"]
+
+async def get_weather(params: FunctionCallParams):
+    location = params.arguments["location"]
+    await params.result_callback(f"The weather in {location} is currently 72 degrees and sunny.")
+
+
+async def get_image(params: FunctionCallParams):
    question = params.arguments["question"]
-    logger.debug(f"Requesting image with user_id={user_id}, question={question}")
+    logger.debug(f"Requesting image with user_id={client_id}, question={question}")

-    # Request a user image frame and indicate that it should be added to the
-    # context.
-    await params.llm.push_frame(
-        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=True),
-        FrameDirection.UPSTREAM,
+    # Request the image frame
+    await params.llm.request_image_frame(
+        user_id=client_id,
+        function_name=params.function_name,
+        tool_call_id=params.tool_call_id,
+        text_content=question,
    )

-    await params.result_callback(None)
+    # Pause for a fixed 0.5 s (best effort) so the image frame can be processed
+    await asyncio.sleep(0.5)

-    # Instead of None, it's possible to also provide a tool call answer to
-    # tell the LLM that we are grabbing the image to analyze.
-    # await params.result_callback({"result": "Image is being captured."})
+    # Return a result to complete the function call
+    await params.result_callback(
+        f"I've captured an image from your camera and I'm analyzing what you asked about: {question}"
+    )


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -95,32 +100,70 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

-    # Anthropic for vision analysis
-    llm = AnthropicLLMService(api_key=os.getenv("ANTHROPIC_API_KEY"))
-    llm.register_function("fetch_user_image", fetch_user_image)
+    llm = AnthropicLLMService(
+        api_key=os.getenv("ANTHROPIC_API_KEY"),
+        model="claude-3-7-sonnet-latest",
+        params=AnthropicLLMService.InputParams(enable_prompt_caching=True),
+    )
+    llm.register_function("get_weather", get_weather)
+    llm.register_function("get_image", get_image)

-    fetch_image_function = FunctionSchema(
-        name="fetch_user_image",
-        description="Called when the user requests a description of their camera feed",
+    weather_function = FunctionSchema(
+        name="get_weather",
+        description="Get the current weather",
        properties={
-            "user_id": {
+            "location": {
                "type": "string",
-                "description": "The ID of the user to grab the image from",
-            },
-            "question": {
-                "type": "string",
-                "description": "The question that the user is asking about the image",
+                "description": "The city and state, e.g. San Francisco, CA",
            },
        },
-        required=["user_id", "question"],
+        required=["location"],
    )
-    tools = ToolsSchema(standard_tools=[fetch_image_function])
+    get_image_function = FunctionSchema(
+        name="get_image",
+        description="Get an image from the video stream.",
+        properties={
+            "question": {
+                "type": "string",
+                "description": "The question that the user is asking about the image.",
+            }
+        },
+        required=["question"],
+    )
+    tools = ToolsSchema(standard_tools=[weather_function, get_image_function])
+
+    system_prompt = """\
+You are a helpful assistant who converses with a user and answers questions. Respond concisely to general questions.
+
+Your response will be turned into speech so use only simple words and punctuation.
+
+You have access to two tools: get_weather and get_image.
+
+You can respond to questions about the weather using the get_weather tool.
+
+You can answer questions about the user's video stream using the get_image tool. Some examples of phrases that \
+indicate you should use the get_image tool are:
+- What do you see?
+- What's in the video?
+- Can you describe the video?
+- Tell me about what you see.
+- Tell me something interesting about what you see.
+- What's happening in the video?
+
+If you need to use a tool, simply use the tool. Do not tell the user the tool you are using. Be brief and concise.
+    """

    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+            "content": [
+                {
+                    "type": "text",
+                    "text": system_prompt,
+                }
+            ],
        },
+        {"role": "user", "content": "Start the conversation by introducing yourself."},
    ]

    context = LLMContext(messages, tools)
@@ -130,11 +173,11 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        [
            transport.input(),  # Transport user input
            stt,  # STT
-            context_aggregator.user(),  # User responses
+            context_aggregator.user(),  # User speech to text
            llm,  # LLM
            tts,  # TTS
            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
+            context_aggregator.assistant(),  # Assistant spoken responses and tool context
        ]
    )

@@ -153,16 +196,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        await maybe_capture_participant_camera(transport, client)

-        # Set the participant ID in the image requester
+        global client_id
        client_id = get_transport_client_id(transport, client)

        # Kick off the conversation.
-        messages.append(
-            {
-                "role": "system",
-                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
-            }
-        )
        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_client_disconnected")
--- a/examples/foundational/14d-function-calling-moondream-video.py
+++ b/examples/foundational/14d-function-calling-moondream-video.py
@@ -1,190 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import ToolsSchema
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
-from pipecat.pipeline.parallel_pipeline import ParallelPipeline
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import (
-    create_transport,
-    get_transport_client_id,
-    maybe_capture_participant_camera,
-)
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.moondream.vision import MoondreamService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-
-load_dotenv(override=True)
-
-
-async def fetch_user_image(params: FunctionCallParams):
-    """Fetch the user image.
-
-    When called, this function pushes a UserImageRequestFrame upstream to the
-    transport. As a result, the transport will request the user image and push a
-    UserImageRawFrame downstream.
-    """
-    user_id = params.arguments["user_id"]
-    question = params.arguments["question"]
-    logger.debug(f"Requesting image with user_id={user_id}, question={question}")
-
-    # Request a user image frame. In this case, we don't want the requested
-    # image to be added to the context because we will process it with
-    # Moondream.
-    await params.llm.push_frame(
-        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=False),
-        FrameDirection.UPSTREAM,
-    )
-
-    await params.result_callback(None)
-
-    # Instead of None, it's possible to also provide a tool call answer to
-    # tell the LLM that we are grabbing the image to analyze.
-    # await params.result_callback({"result": "Image is being captured."})
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        video_in_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        video_in_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-    )
-
-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-    llm.register_function("fetch_user_image", fetch_user_image)
-
-    fetch_image_function = FunctionSchema(
-        name="fetch_user_image",
-        description="Called when the user requests a description of their camera feed",
-        properties={
-            "user_id": {
-                "type": "string",
-                "description": "The ID of the user to grab the image from",
-            },
-            "question": {
-                "type": "string",
-                "description": "The question that the user is asking about the image",
-            },
-        },
-        required=["user_id", "question"],
-    )
-    tools = ToolsSchema(standard_tools=[fetch_image_function])
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
-        },
-    ]
-
-    context = LLMContext(messages, tools)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    # If you run into weird description, try with use_cpu=True
-    moondream = MoondreamService()
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            ParallelPipeline(
-                [llm],  # LLM
-                [moondream],
-            ),
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected: {client}")
-
-        await maybe_capture_participant_camera(transport, client)
-
-        # Set the participant ID in the image requester
-        client_id = get_transport_client_id(transport, client)
-
-        # Kick off the conversation.
-        messages.append(
-            {
-                "role": "system",
-                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
-            }
-        )
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/14d-function-calling-openai-video.py
+++ b/examples/foundational/14d-function-calling-openai-video.py
@@ -1,186 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import ToolsSchema
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import (
-    create_transport,
-    get_transport_client_id,
-    maybe_capture_participant_camera,
-)
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-
-load_dotenv(override=True)
-
-
-async def fetch_user_image(params: FunctionCallParams):
-    """Fetch the user image and push it to the LLM.
-
-    When called, this function pushes a UserImageRequestFrame upstream to the
-    transport. As a result, the transport will request the user image and push a
-    UserImageRawFrame downstream which will be added to the context by the LLM
-    assistant aggregator.
-    """
-    user_id = params.arguments["user_id"]
-    question = params.arguments["question"]
-    logger.debug(f"Requesting image with user_id={user_id}, question={question}")
-
-    # Request a user image frame and indicate that it should be added to the
-    # context.
-    await params.llm.push_frame(
-        UserImageRequestFrame(user_id=user_id, text=question, append_to_context=True),
-        FrameDirection.UPSTREAM,
-    )
-
-    await params.result_callback(None)
-
-    # Instead of None, it's possible to also provide a tool call answer to
-    # tell the LLM that we are grabbing the image to analyze.
-    # await params.result_callback({"result": "Image is being captured."})
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        video_in_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        video_in_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-    )
-
-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-    llm.register_function("fetch_user_image", fetch_user_image)
-
-    fetch_image_function = FunctionSchema(
-        name="fetch_user_image",
-        description="Called when the user requests a description of their camera feed",
-        properties={
-            "user_id": {
-                "type": "string",
-                "description": "The ID of the user to grab the image from",
-            },
-            "question": {
-                "type": "string",
-                "description": "The question that the user is asking about the image",
-            },
-        },
-        required=["user_id", "question"],
-    )
-    tools = ToolsSchema(standard_tools=[fetch_image_function])
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
-        },
-    ]
-
-    context = LLMContext(messages, tools)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-
-        await maybe_capture_participant_camera(transport, client)
-
-        client_id = get_transport_client_id(transport, client)
-
-        # Kick off the conversation.
-        messages.append(
-            {
-                "role": "system",
-                "content": f"Please introduce yourself to the user. Use '{client_id}' as the user ID during function calls.",
-            }
-        )
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/14x-function-calling-openpipe.py
+++ b/examples/foundational/14x-function-calling-openpipe.py
@@ -4,8 +4,9 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+
+import asyncio
 import os
-import time

 from dotenv import load_dotenv
 from loguru import logger
@@ -16,31 +17,56 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame, TTSSpeakFrame
+from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
+from pipecat.runner.utils import (
+    create_transport,
+    get_transport_client_id,
+    maybe_capture_participant_camera,
+)
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openpipe.llm import OpenPipeLLMService
+from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams

 load_dotenv(override=True)


-async def fetch_weather_from_api(params: FunctionCallParams):
-    await params.result_callback({"conditions": "nice", "temperature": "75"})
+# Global variable to store the client ID
+client_id = ""


-async def fetch_restaurant_recommendation(params: FunctionCallParams):
-    await params.result_callback({"name": "The Golden Dragon"})
+async def get_weather(params: FunctionCallParams):
+    location = params.arguments["location"]
+    await params.result_callback(f"The weather in {location} is currently 72 degrees and sunny.")
+
+
+async def get_image(params: FunctionCallParams):
+    question = params.arguments["question"]
+    logger.debug(f"Requesting image with user_id={client_id}, question={question}")
+
+    # Request the image frame
+    await params.llm.request_image_frame(
+        user_id=client_id,
+        function_name=params.function_name,
+        tool_call_id=params.tool_call_id,
+        text_content=question,
+    )
+
+    # Pause for a fixed 0.5 s (best effort) so the image frame can be processed
+    await asyncio.sleep(0.5)
+
+    # Return a result to complete the function call
+    await params.result_callback(
+        f"I've captured an image from your camera and I'm analyzing what you asked about: {question}"
+    )


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -50,18 +76,14 @@ transport_params = {
    "daily": lambda: DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
+        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
    "webrtc": lambda: TransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
+        video_in_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
@@ -78,24 +100,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

-    timestamp = int(time.time())
-    llm = OpenPipeLLMService(
-        api_key=os.getenv("OPENAI_API_KEY"),
-        openpipe_api_key=os.getenv("OPENPIPE_API_KEY"),
-        tags={"conversation_id": f"pipecat-{timestamp}"},
-    )
-
-    # You can also register a function_name of None to get all functions
-    # sent to the same callback with an additional function_name parameter.
-    llm.register_function("get_current_weather", fetch_weather_from_api)
-    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
-
-    @llm.event_handler("on_function_calls_started")
-    async def on_function_calls_started(service, function_calls):
-        await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
+    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+    llm.register_function("get_weather", get_weather)
+    llm.register_function("get_image", get_image)

    weather_function = FunctionSchema(
-        name="get_current_weather",
+        name="get_weather",
        description="Get the current weather",
        properties={
            "location": {
@@ -108,26 +118,41 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                "description": "The temperature unit to use. Infer this from the user's location.",
            },
        },
-        required=["location", "format"],
-    )
-    restaurant_function = FunctionSchema(
-        name="get_restaurant_recommendation",
-        description="Get a restaurant recommendation",
-        properties={
-            "location": {
-                "type": "string",
-                "description": "The city and state, e.g. San Francisco, CA",
-            },
-        },
        required=["location"],
    )
-    tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+    get_image_function = FunctionSchema(
+        name="get_image",
+        description="Get an image from the video stream.",
+        properties={
+            "question": {
+                "type": "string",
+                "description": "The question that the user is asking about the image.",
+            }
        },
+        required=["question"],
+    )
+    tools = ToolsSchema(standard_tools=[weather_function, get_image_function])
+
+    system_prompt = """\
+You are a helpful assistant who converses with a user and answers questions. Respond concisely to general questions.
+
+Your response will be turned into speech so use only simple words and punctuation.
+
+You have access to two tools: get_weather and get_image.
+
+You can respond to questions about the weather using the get_weather tool.
+
+You can answer questions about the user's video stream using the get_image tool. Some examples of phrases that \
+indicate you should use the get_image tool are:
+- What do you see?
+- What's in the video?
+- Can you describe the video?
+- Tell me about what you see.
+- Tell me something interesting about what you see.
+- What's happening in the video?
+"""
+    messages = [
+        {"role": "system", "content": system_prompt},
    ]

    context = LLMContext(messages, tools)
@@ -157,6 +182,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
+
+        await maybe_capture_participant_camera(transport, client)
+
+        global client_id
+        client_id = get_transport_client_id(transport, client)
+
        # Kick off the conversation.
        await task.queue_frames([LLMRunFrame()])

--- a/examples/foundational/14p-function-calling-gemini-vertex-ai.py
+++ b/examples/foundational/14p-function-calling-gemini-vertex-ai.py
@@ -76,8 +76,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    llm = GoogleVertexLLMService(
        credentials=os.getenv("GOOGLE_VERTEX_TEST_CREDENTIALS"),
-        project_id=os.getenv("GOOGLE_CLOUD_PROJECT_ID"),
-        location=os.getenv("GOOGLE_CLOUD_LOCATION"),
+        params=GoogleVertexLLMService.InputParams(
+            project_id=os.getenv("GOOGLE_CLOUD_PROJECT_ID"),
+        ),
    )
    # You can aslo register a function_name of None to get all functions
    # sent to the same callback with an additional function_name parameter.
--- a/examples/foundational/14r-function-calling-aws.py
+++ b/examples/foundational/14r-function-calling-aws.py
@@ -79,8 +79,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    llm = AWSBedrockLLMService(
        aws_region="us-west-2",
-        model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
-        params=AWSBedrockLLMService.InputParams(temperature=0.8),
+        model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
+        params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
    )

    # You can also register a function_name of None to get all functions
--- a/examples/foundational/16-gpu-container-local-bot.py
+++ b/examples/foundational/16-gpu-container-local-bot.py
@@ -26,11 +26,7 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.deepgram.tts import DeepgramTTSService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import (
-    DailyOutputTransportMessageFrame,
-    DailyOutputTransportMessageUrgentFrame,
-    DailyParams,
-)
+from pipecat.transports.daily.transport import DailyParams, DailyTransportMessageFrame
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams

 load_dotenv(override=True)
@@ -132,14 +128,14 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                logger.debug(f"Received latency ping app message: {message}")
                ts = message["latency-ping"]["ts"]
                # Send immediately
-                await task.queue_frame(
-                    DailyOutputTransportMessageUrgentFrame(
+                await transport.output().send_message(
+                    DailyTransportMessageFrame(
                        message={"latency-pong-msg-handler": {"ts": ts}}, participant_id=sender
                    )
                )
                # And push to the pipeline for the Daily transport.output to send
                await task.queue_frame(
-                    DailyOutputTransportMessageFrame(
+                    DailyTransportMessageFrame(
                        message={"latency-pong-pipeline-delivery": {"ts": ts}},
                        participant_id=sender,
                    )
--- a/examples/foundational/19-openai-realtime.py
+++ b/examples/foundational/19-openai-realtime.py
@@ -5,7 +5,6 @@
 #


-import asyncio
 import os
 from datetime import datetime

@@ -15,27 +14,24 @@ from loguru import logger
 from pipecat.adapters.schemas.function_schema import FunctionSchema
 from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import LLMRunFrame, LLMSetToolsFrame, TranscriptionMessage
+from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
 from pipecat.observers.loggers.transcription_log_observer import TranscriptionLogObserver
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai.realtime.events import (
-    AudioConfiguration,
-    AudioInput,
+from pipecat.services.openai_realtime import (
    InputAudioNoiseReduction,
    InputAudioTranscription,
+    OpenAIRealtimeLLMService,
    SemanticTurnDetection,
    SessionProperties,
 )
-from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
+from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -55,18 +51,6 @@ async def fetch_weather_from_api(params: FunctionCallParams):
    )


-async def get_news(params: FunctionCallParams):
-    await params.result_callback(
-        {
-            "news": [
-                "Massive UFO currently hovering above New York City",
-                "Stock markets reach all-time highs",
-                "Living dinosaur species discovered in the Amazon rainforest",
-            ],
-        }
-    )
-
-
 async def fetch_restaurant_recommendation(params: FunctionCallParams):
    await params.result_callback({"name": "The Golden Dragon"})

@@ -88,13 +72,6 @@ weather_function = FunctionSchema(
    required=["location", "format"],
 )

-get_news_function = FunctionSchema(
-    name="get_news",
-    description="Get the current news.",
-    properties={},
-    required=[],
-)
-
 restaurant_function = FunctionSchema(
    name="get_restaurant_recommendation",
    description="Get a restaurant recommendation",
@@ -162,6 +139,10 @@ even if you're asked about them.
 You are participating in a voice conversation. Keep your responses concise, short, and to the point
 unless specifically asked to elaborate on a topic.

+You have access to the following tools:
+- get_current_weather: Get the current weather for a given location.
+- get_restaurant_recommendation: Get a restaurant recommendation for a given location.
+
 Remember, your responses should be short. Just one or two sentences, usually. Respond in English.""",
    )

@@ -175,26 +156,25 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
    # llm.register_function(None, fetch_weather_from_api)
    llm.register_function("get_current_weather", fetch_weather_from_api)
    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
-    llm.register_function("get_news", get_news)

    transcript = TranscriptProcessor()

    # Create a standard OpenAI LLM context object using the normal messages format. The
    # OpenAIRealtimeLLMService will convert this internally to messages that the
    # openai WebSocket API can understand.
-    context = LLMContext(
+    context = OpenAILLMContext(
        [{"role": "user", "content": "Say hello!"}],
        tools,
    )

-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
            context_aggregator.user(),
-            transcript.user(),  # LLM pushes TranscriptionFrames upstream
            llm,  # LLM
+            transcript.user(),  # After the LLM, which pushes TranscriptionFrames downstream
            transport.output(),  # Transport bot output
            transcript.assistant(),  # After the transcript output, to time with the audio output
            context_aggregator.assistant(),
@@ -217,13 +197,6 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
        # Kick off the conversation.
        await task.queue_frames([LLMRunFrame()])

-        # Add a new tool at runtime after a delay.
-        await asyncio.sleep(15)
-        new_tools = ToolsSchema(
-            standard_tools=[weather_function, restaurant_function, get_news_function]
-        )
-        await task.queue_frames([LLMSetToolsFrame(tools=new_tools)])
-
    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
--- a/examples/foundational/19a-azure-realtime.py
+++ b/examples/foundational/19a-azure-realtime.py
@@ -18,19 +18,16 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.azure.realtime.llm import AzureRealtimeLLMService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai.realtime.events import (
-    AudioConfiguration,
-    AudioInput,
+from pipecat.services.openai_realtime import (
+    AzureRealtimeLLMService,
    InputAudioTranscription,
    SessionProperties,
 )
+from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -157,10 +154,10 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
    llm.register_function("get_current_weather", fetch_weather_from_api)
    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)

-    # Create a standard LLM context object using the normal messages format. The
+    # Create a standard OpenAI LLM context object using the normal messages format. The
    # OpenAIRealtimeBetaLLMService will convert this internally to messages that the
    # openai WebSocket API can understand.
-    context = LLMContext(
+    context = OpenAILLMContext(
        [{"role": "user", "content": "Say hello!"}],
        # [{"role": "user", "content": [{"type": "text", "text": "Say hello!"}]}],
        #     [
@@ -175,7 +172,7 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
        tools,
    )

-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/19b-openai-realtime-text.py
+++ b/examples/foundational/19b-openai-realtime-text.py
@@ -18,22 +18,20 @@ from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.cartesia import CartesiaTTSService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai.realtime.events import (
-    AudioConfiguration,
-    AudioInput,
+from pipecat.services.openai_realtime import (
    InputAudioNoiseReduction,
    InputAudioTranscription,
+    OpenAIRealtimeLLMService,
    SemanticTurnDetection,
    SessionProperties,
 )
-from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
+from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -170,20 +168,20 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
    # Create a standard OpenAI LLM context object using the normal messages format. The
    # OpenAIRealtimeLLMService will convert this internally to messages that the
    # openai WebSocket API can understand.
-    context = LLMContext(
+    context = OpenAILLMContext(
        [{"role": "user", "content": "Say hello!"}],
        tools,
    )

-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
            context_aggregator.user(),
-            transcript.user(),  # LLM pushes TranscriptionFrames upstream
            llm,  # LLM
            tts,  # TTS
+            transcript.user(),  # After the LLM, which pushes TranscriptionFrames downstream
            transport.output(),  # Transport bot output
            transcript.assistant(),  # After the transcript output, to time with the audio output
            context_aggregator.assistant(),
--- a/examples/foundational/20b-persistent-context-openai-realtime.py
+++ b/examples/foundational/20b-persistent-context-openai-realtime.py
@@ -13,27 +13,25 @@ from datetime import datetime
 from dotenv import load_dotenv
 from loguru import logger

-from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import (
+    OpenAILLMContext,
+)
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai.realtime.events import (
-    AudioConfiguration,
-    AudioInput,
+from pipecat.services.openai_realtime import (
    InputAudioTranscription,
+    OpenAIRealtimeLLMService,
    SessionProperties,
    TurnDetection,
 )
-from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
+from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -70,11 +68,11 @@ async def save_conversation(params: FunctionCallParams):
    timestamp = datetime.now().strftime("%Y-%m-%d_%H:%M:%S")
    filename = f"{BASE_FILENAME}{timestamp}.json"
    logger.debug(
-        f"writing conversation to {filename}\n{json.dumps(params.context.get_messages(), indent=4)}"
+        f"writing conversation to {filename}\n{json.dumps(params.context.messages, indent=4)}"
    )
    try:
        with open(filename, "w") as file:
-            messages = params.context.get_messages()
+            messages = params.context.get_messages_for_persistent_storage()
            # remove the last message, which is the instruction we just gave to save the conversation
            messages.pop()
            json.dump(messages, file, indent=2)
@@ -91,10 +89,6 @@ async def load_conversation(params: FunctionCallParams):
            with open(filename, "r") as file:
                params.context.set_messages(json.load(file))
                await params.llm.reset_conversation()
-                # NOTE: we manually create a response here rather than relying
-                # on the function callback to trigger one since we've reset the
-                # conversation so the remote service doesn't know about the
-                # in-progress tool call.
                await params.llm._create_response()
        except Exception as e:
            await params.result_callback({"success": False, "error": str(e)})
@@ -102,12 +96,14 @@ async def load_conversation(params: FunctionCallParams):
    asyncio.create_task(_reset())


-tools = ToolsSchema(
-    standard_tools=[
-        FunctionSchema(
-            name="get_current_weather",
-            description="Get the current weather",
-            properties={
+tools = [
+    {
+        "type": "function",
+        "name": "get_current_weather",
+        "description": "Get the current weather",
+        "parameters": {
+            "type": "object",
+            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
@@ -118,33 +114,45 @@ tools = ToolsSchema(
                    "description": "The temperature unit to use. Infer this from the users location.",
                },
            },
-            required=["location", "format"],
-        ),
-        FunctionSchema(
-            name="save_conversation",
-            description="Save the current conversatione. Use this function to persist the current conversation to external storage.",
-            properties={},
-            required=[],
-        ),
-        FunctionSchema(
-            name="get_saved_conversation_filenames",
-            description="Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
-            properties={},
-            required=[],
-        ),
-        FunctionSchema(
-            name="load_conversation",
-            description="Load a conversation history. Use this function to load a conversation history into the current session.",
-            properties={
+            "required": ["location", "format"],
+        },
+    },
+    {
+        "type": "function",
+        "name": "save_conversation",
+        "description": "Save the current conversation. Use this function to persist the current conversation to external storage.",
+        "parameters": {
+            "type": "object",
+            "properties": {},
+            "required": [],
+        },
+    },
+    {
+        "type": "function",
+        "name": "get_saved_conversation_filenames",
+        "description": "Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
+        "parameters": {
+            "type": "object",
+            "properties": {},
+            "required": [],
+        },
+    },
+    {
+        "type": "function",
+        "name": "load_conversation",
+        "description": "Load a conversation history. Use this function to load a conversation history into the current session.",
+        "parameters": {
+            "type": "object",
+            "properties": {
                "filename": {
                    "type": "string",
                    "description": "The filename of the conversation history to load.",
                }
            },
-            required=["filename"],
-        ),
-    ]
-)
+            "required": ["filename"],
+        },
+    },
+]


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -215,8 +223,8 @@ Remember, your responses should be short. Just one or two sentences, usually."""
    llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
    llm.register_function("load_conversation", load_conversation)

-    context = LLMContext([{"role": "user", "content": "Say hello!"}], tools)
-    context_aggregator = LLMContextAggregatorPair(context)
+    context = OpenAILLMContext([], tools)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/20c-persistent-context-anthropic.py
+++ b/examples/foundational/20c-persistent-context-anthropic.py
@@ -72,6 +72,7 @@ async def save_conversation(params: FunctionCallParams):
    )
    try:
        with open(filename, "w") as file:
+            # TODO: extract 'system' into the first message in the list
            messages = params.context.get_messages()
            # remove the last message, which is the instruction we just gave to save the conversation
            messages.pop()
--- a/examples/foundational/20d-persistent-context-gemini.py
+++ b/examples/foundational/20d-persistent-context-gemini.py
@@ -90,6 +90,7 @@ async def save_conversation(params: FunctionCallParams):
    )
    try:
        with open(filename, "w") as file:
+            # TODO: extract 'system' into the first message in the list
            messages = params.context.get_messages()
            # remove the last message (the instruction to save the context)
            messages.pop()
--- a/examples/foundational/20e-persistent-context-aws-nova-sonic.py
+++ b/examples/foundational/20e-persistent-context-aws-nova-sonic.py
@@ -20,12 +20,10 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.aws.nova_sonic.llm import AWSNovaSonicLLMService
+from pipecat.services.aws_nova_sonic.aws import AWSNovaSonicLLMService
 from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -77,7 +75,7 @@ async def save_conversation(params: FunctionCallParams):
    filename = f"{BASE_FILENAME}{timestamp}.json"
    try:
        with open(filename, "w") as file:
-            messages = params.context.get_messages()
+            messages = params.context.get_messages_for_persistent_storage()
            # remove the last few messages. in reverse order, they are:
            # - the in progress save tool call
            # - the invocation of the save tool call
@@ -225,13 +223,13 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
    llm.register_function("load_conversation", load_conversation)

-    context = LLMContext(
+    context = OpenAILLMContext(
        messages=[
            {"role": "system", "content": f"{system_instruction}"},
        ],
        tools=tools,
    )
-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26-gemini-multimodal-live.py
+++ b/examples/foundational/26-gemini-multimodal-live.py
@@ -17,7 +17,7 @@ from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -65,7 +65,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    Respond to what the user said in a creative and helpful way.
    """

-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=system_instruction,
        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
--- a/examples/foundational/26a-gemini-multimodal-live-transcription.py
+++ b/examples/foundational/26a-gemini-multimodal-live-transcription.py
@@ -16,13 +16,11 @@ from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -67,14 +65,14 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Aoede",  # Puck, Charon, Kore, Fenrir, Aoede
        # system_instruction="Talk like a pirate."
        # inference_on_context_initialization=False,
    )

-    context = LLMContext(
+    context = OpenAILLMContext(
        [
            {
                "role": "user",
@@ -92,7 +90,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            #     },
        ],
    )
-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    transcript = TranscriptProcessor()

--- a/examples/foundational/26b-gemini-multimodal-live-function-calling.py
+++ b/examples/foundational/26b-gemini-multimodal-live-function-calling.py
@@ -19,12 +19,10 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
 from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -124,15 +122,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        required=["location"],
    )
    search_tool = {"google_search": {}}
-    # KNOWN ISSUE: If using GeminiVertexLiveLLMService, it appears
-    # you cannot use the "google_search" tool alongside other tools.
-    # See https://github.com/googleapis/python-genai/issues/941.
    tools = ToolsSchema(
        standard_tools=[weather_function, restaurant_function],
        custom_tools={AdapterType.GEMINI: [search_tool]},
    )

-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=system_instruction,
        tools=tools,
@@ -141,10 +136,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    llm.register_function("get_current_weather", fetch_weather_from_api)
    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)

-    context = LLMContext(
+    context = OpenAILLMContext(
        [{"role": "user", "content": "Say hello."}],
    )
-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26c-gemini-multimodal-live-video.py
+++ b/examples/foundational/26c-gemini-multimodal-live-video.py
@@ -17,16 +17,14 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
    maybe_capture_participant_camera,
    maybe_capture_participant_screen,
 )
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams

@@ -60,14 +58,14 @@ transport_params = {


 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Aoede",  # Puck, Charon, Kore, Fenrir, Aoede
        # system_instruction="Talk like a pirate."
        # inference_on_context_initialization=False,
    )

-    context = LLMContext(
+    context = OpenAILLMContext(
        [
            {
                "role": "user",
@@ -75,7 +73,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            },
        ],
    )
-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26d-gemini-multimodal-live-text.py
+++ b/examples/foundational/26d-gemini-multimodal-live-text.py
@@ -16,14 +16,13 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.google.gemini_live.llm import (
-    GeminiLiveLLMService,
-    GeminiModalities,
+from pipecat.services.gemini_multimodal_live.gemini import (
+    GeminiMultimodalLiveLLMService,
+    GeminiMultimodalModalities,
    InputParams,
 )
 from pipecat.transports.base_transport import BaseTransport, TransportParams
@@ -81,15 +80,11 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    # KNOWN ISSUE: If using GeminiLiveVertexLLMService, you cannot specify a
-    # modality other than AUDIO (at least not if using the service's default
-    # model, which is a native audio model:
-    # https://cloud.google.com/vertex-ai/generative-ai/docs/live-api/tools#native-audio).
-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=SYSTEM_INSTRUCTION,
        tools=[{"google_search": {}}, {"code_execution": {}}],
-        params=InputParams(modalities=GeminiModalities.TEXT),
+        params=InputParams(modalities=GeminiMultimodalModalities.TEXT),
    )

    # Optionally, you can set the response modalities via a function
@@ -110,8 +105,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    # Set up conversation context and management
    # The context_aggregator will automatically collect conversation context
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
+    context = OpenAILLMContext(messages)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26e-gemini-multimodal-google-search.py
+++ b/examples/foundational/26e-gemini-multimodal-google-search.py
@@ -16,12 +16,10 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -85,14 +83,14 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

    # Initialize the Gemini Multimodal Live model
-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
        system_instruction=system_instruction,
        tools=tools,
    )

-    context = LLMContext(
+    context = OpenAILLMContext(
        [
            {
                "role": "user",
@@ -100,7 +98,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            }
        ],
    )
-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26f-gemini-multimodal-live-files-api.py
+++ b/examples/foundational/26f-gemini-multimodal-live-files-api.py
@@ -16,12 +16,12 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.gemini_multimodal_live.gemini import (
+    GeminiMultimodalLiveLLMService,
+)
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -110,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    """

    # Initialize Gemini service with File API support
-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=system_instruction,
        voice_id="Charon",  # Aoede, Charon, Fenrir, Kore, Puck
@@ -131,7 +131,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        mime_type = "text/plain"

        # Create context with file reference
-        context = LLMContext(
+        context = OpenAILLMContext(
            [
                {
                    "role": "user",
@@ -154,7 +154,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    except Exception as e:
        logger.error(f"Error uploading file: {e}")
        # Continue with a basic context if file upload fails
-        context = LLMContext(
+        context = OpenAILLMContext(
            [
                {
                    "role": "user",
@@ -164,7 +164,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        )

    # Create context aggregator
-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    # Build the pipeline
    pipeline = Pipeline(
--- a/examples/foundational/26g-gemini-multimodal-live-groundingMetadata.py
+++ b/examples/foundational/26g-gemini-multimodal-live-groundingMetadata.py
@@ -9,15 +9,13 @@ from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import Frame, LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
+from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
 from pipecat.services.google.frames import LLMSearchResponseFrame
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -107,7 +105,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        custom_tools={AdapterType.GEMINI: [{"google_search": {}}, {"code_execution": {}}]},
    )

-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=SYSTEM_INSTRUCTION,
        voice_id="Charon",  # Aoede, Charon, Fenrir, Kore, Puck
@@ -126,8 +124,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    ]

    # Set up conversation context and management
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
+    context = OpenAILLMContext(messages)
+    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
--- a/examples/foundational/26h-gemini-live-vertex-function-calling.py
+++ b/examples/foundational/26h-gemini-live-vertex-function-calling.py
@@ -1,189 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-from datetime import datetime
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import ToolsSchema
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm_vertex import GeminiLiveVertexLLMService
-from pipecat.services.llm_service import FunctionCallParams
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-
-async def fetch_weather_from_api(params: FunctionCallParams):
-    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
-    await params.result_callback(
-        {
-            "conditions": "nice",
-            "temperature": temperature,
-            "format": params.arguments["format"],
-            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
-        }
-    )
-
-
-async def fetch_restaurant_recommendation(params: FunctionCallParams):
-    await params.result_callback({"name": "The Golden Dragon"})
-
-
-system_instruction = """
-You are a helpful assistant who can answer questions and use tools.
-
-You have three tools available to you:
-1. get_current_weather: Use this tool to get the current weather in a specific location.
-2. get_restaurant_recommendation: Use this tool to get a restaurant recommendation in a specific location.
-"""
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        # set stop_secs to something roughly similar to the internal setting
-        # of the Multimodal Live api, just to align events. This doesn't really
-        # matter because we can only use the Multimodal Live API's phrase
-        # endpointing, for now.
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        # set stop_secs to something roughly similar to the internal setting
-        # of the Multimodal Live api, just to align events. This doesn't really
-        # matter because we can only use the Multimodal Live API's phrase
-        # endpointing, for now.
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        # set stop_secs to something roughly similar to the internal setting
-        # of the Multimodal Live api, just to align events. This doesn't really
-        # matter because we can only use the Multimodal Live API's phrase
-        # endpointing, for now.
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    weather_function = FunctionSchema(
-        name="get_current_weather",
-        description="Get the current weather",
-        properties={
-            "location": {
-                "type": "string",
-                "description": "The city and state, e.g. San Francisco, CA",
-            },
-            "format": {
-                "type": "string",
-                "enum": ["celsius", "fahrenheit"],
-                "description": "The temperature unit to use. Infer this from the user's location.",
-            },
-        },
-        required=["location", "format"],
-    )
-    restaurant_function = FunctionSchema(
-        name="get_restaurant_recommendation",
-        description="Get a restaurant recommendation",
-        properties={
-            "location": {
-                "type": "string",
-                "description": "The city and state, e.g. San Francisco, CA",
-            },
-        },
-        required=["location"],
-    )
-    # KNOWN ISSUE: If using GeminiVertexLiveLLMService, it appears
-    # you cannot use the "google_search" tool alongside other tools.
-    # See https://github.com/googleapis/python-genai/issues/941.
-    tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
-
-    llm = GeminiLiveVertexLLMService(
-        credentials=os.getenv("GOOGLE_VERTEX_TEST_CREDENTIALS"),
-        project_id=os.getenv("GOOGLE_CLOUD_PROJECT_ID"),
-        location=os.getenv("GOOGLE_CLOUD_LOCATION"),
-        system_instruction=system_instruction,
-        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
-        tools=tools,
-    )
-
-    llm.register_function("get_current_weather", fetch_weather_from_api)
-    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
-
-    context = LLMContext([{"role": "user", "content": "Say hello."}])
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),
-            context_aggregator.user(),
-            llm,
-            transport.output(),
-            context_aggregator.assistant(),
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/26i-gemini-live-graceful-end.py
+++ b/examples/foundational/26i-gemini-live-graceful-end.py
@@ -1,206 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import os
-from datetime import datetime
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import EndTaskFrame, LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
-from pipecat.services.llm_service import FunctionCallParams
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-
-async def fetch_weather_from_api(params: FunctionCallParams):
-    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
-    await params.result_callback(
-        {
-            "conditions": "nice",
-            "temperature": temperature,
-            "format": params.arguments["format"],
-            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
-        }
-    )
-
-
-async def fetch_restaurant_recommendation(params: FunctionCallParams):
-    await params.result_callback({"name": "The Golden Dragon"})
-
-
-async def end_conversation(params: FunctionCallParams):
-    await params.result_callback({"success": True})
-    await params.llm.push_frame(EndTaskFrame(), FrameDirection.UPSTREAM)
-
-
-system_instruction = """
-You are a helpful assistant who can answer questions and use tools.
-
-You have three tools available to you:
-1. get_current_weather: Use this tool to get the current weather in a specific location.
-2. get_restaurant_recommendation: Use this tool to get a restaurant recommendation in a specific location.
-3. end_conversation: Use this tool to gracefully end the conversation.
-
-After you've responded to the user three times, do two things, in order:
-1. Politely let them know that that's all the time you have today and say goodbye.
-2. *WITHOUT WAITING FOR THE USER TO RESPOND*, call the end_conversation tool to gracefully end the conversation.
-"""
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        # set stop_secs to something roughly similar to the internal setting
-        # of the Multimodal Live api, just to align events. This doesn't really
-        # matter because we can only use the Multimodal Live API's phrase
-        # endpointing, for now.
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        # set stop_secs to something roughly similar to the internal setting
-        # of the Multimodal Live api, just to align events. This doesn't really
-        # matter because we can only use the Multimodal Live API's phrase
-        # endpointing, for now.
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        # set stop_secs to something roughly similar to the internal setting
-        # of the Multimodal Live api, just to align events. This doesn't really
-        # matter because we can only use the Multimodal Live API's phrase
-        # endpointing, for now.
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    weather_function = FunctionSchema(
-        name="get_current_weather",
-        description="Get the current weather",
-        properties={
-            "location": {
-                "type": "string",
-                "description": "The city and state, e.g. San Francisco, CA",
-            },
-            "format": {
-                "type": "string",
-                "enum": ["celsius", "fahrenheit"],
-                "description": "The temperature unit to use. Infer this from the user's location.",
-            },
-        },
-        required=["location", "format"],
-    )
-    restaurant_function = FunctionSchema(
-        name="get_restaurant_recommendation",
-        description="Get a restaurant recommendation",
-        properties={
-            "location": {
-                "type": "string",
-                "description": "The city and state, e.g. San Francisco, CA",
-            },
-        },
-        required=["location"],
-    )
-    end_conversation_function = FunctionSchema(
-        name="end_conversation",
-        description="Gracefully end the conversation",
-        properties={},
-        required=[],
-    )
-    search_tool = {"google_search": {}}
-    tools = ToolsSchema(
-        standard_tools=[weather_function, restaurant_function, end_conversation_function],
-        custom_tools={AdapterType.GEMINI: [search_tool]},
-    )
-
-    llm = GeminiLiveLLMService(
-        api_key=os.getenv("GOOGLE_API_KEY"),
-        system_instruction=system_instruction,
-        tools=tools,
-    )
-
-    llm.register_function("get_current_weather", fetch_weather_from_api)
-    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
-    llm.register_function("end_conversation", end_conversation)
-
-    context = LLMContext(
-        [{"role": "user", "content": "Say hello."}],
-    )
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),
-            context_aggregator.user(),
-            llm,
-            transport.output(),
-            context_aggregator.assistant(),
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/27-simli-layer.py
+++ b/examples/foundational/27-simli-layer.py
@@ -9,6 +9,7 @@ import os

 from dotenv import load_dotenv
 from loguru import logger
+from simli import SimliConfig

 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
@@ -65,12 +66,11 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",
+        voice_id="a167e0f3-df7e-4d52-a9c3-f949145efdab",
    )

    simli_ai = SimliVideoService(
-        api_key=os.getenv("SIMLI_API_KEY"),
-        face_id="cace3ef7-a4c4-425d-a8cf-a5358eb0c427",
+        SimliConfig(os.getenv("SIMLI_API_KEY"), os.getenv("SIMLI_FACE_ID")),
    )

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o-mini")
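
Note on the 28-transcription-processor.py change that follows: the handler now buffers transcript fragments per role and saves a message only once the buffered text ends a sentence, flushing any leftovers on disconnect. The sketch below illustrates that buffering pattern in isolation, without Pipecat. `SentenceBuffer` and `ends_sentence` are hypothetical names introduced for illustration, and `ends_sentence` is a deliberately simplified stand-in for `match_endofsentence` (it only checks for terminal punctuation), so treat it as an approximation of the idea rather than the behavior of the real utility.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical, simplified stand-in for pipecat's match_endofsentence():
# it only looks for terminal punctuation at the end of the text.
_END_OF_SENTENCE = re.compile(r"[.!?]$")


def ends_sentence(text: str) -> bool:
    """Return True if the text, ignoring trailing whitespace, ends a sentence."""
    return bool(_END_OF_SENTENCE.search(text.rstrip()))


class SentenceBuffer:
    """Accumulates fragments per role and emits whole sentences."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {"user": "", "assistant": ""}
        self.sentences: List[Tuple[str, str]] = []

    def add(self, role: str, fragment: str) -> None:
        # Append the fragment, then flush if the buffer now ends a sentence.
        self._pending[role] += fragment + " "
        if ends_sentence(self._pending[role]):
            self._flush(role)

    def finalize(self) -> None:
        # Flush whatever is left, e.g. when the client disconnects.
        for role in self._pending:
            if self._pending[role].strip():
                self._flush(role)

    def _flush(self, role: str) -> None:
        self.sentences.append((role, self._pending[role].strip()))
        self._pending[role] = ""


buf = SentenceBuffer()
fragments = [
    ("user", "What is"),
    ("user", "the weather?"),
    ("assistant", "It is sunny"),
    ("assistant", "and warm"),
]
for role, fragment in fragments:
    buf.add(role, fragment)
buf.finalize()
print(buf.sentences)
```

Keeping one buffer per role means interleaved user and assistant fragments do not get merged into the same sentence, and the explicit `finalize()` step mirrors the role `finalize_partial_sentences()` plays in the diff.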
--- a/examples/foundational/28-transcription-processor.py
+++ b/examples/foundational/28-transcription-processor.py
@@ -29,6 +29,10 @@ from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+from pipecat.utils.string import match_endofsentence
+
+logger.info("Loading Whisker debugger...")
+from pipecat_whisker import WhiskerObserver

 load_dotenv(override=True)

@@ -52,6 +56,8 @@ class TranscriptHandler:
        """
        self.messages: List[TranscriptionMessage] = []
        self.output_file: Optional[str] = output_file
+        self._current_user_sentence = ""
+        self._current_assistant_sentence = ""
        logger.debug(
            f"TranscriptHandler initialized {'with output_file=' + output_file if output_file else 'with log output only'}"
        )
@@ -78,11 +84,29 @@ class TranscriptHandler:
            except Exception as e:
                logger.error(f"Error saving transcript message to file: {e}")

+    async def _save_sentence(self, role: str, content: str, timestamp: Optional[str] = None):
+        """Save a complete sentence as a transcript message.
+
+        Args:
+            role: The speaker role, either "user" or "assistant"
+            content: The complete sentence content
+            timestamp: Optional timestamp of the fragment that completed the sentence
+        """
+        # Normalize role to the literal type: anything other than "user" becomes "assistant"
+        message_role = "user" if role == "user" else "assistant"
+        sentence_message = TranscriptionMessage(
+            role=message_role, content=content.strip(), timestamp=timestamp
+        )
+        self.messages.append(sentence_message)
+        await self.save_message(sentence_message)
+
    async def on_transcript_update(
        self, processor: TranscriptProcessor, frame: TranscriptionUpdateFrame
    ):
        """Handle new transcript messages.

+        Buffers fragments per role; saves a sentence once match_endofsentence() finds its end.
+
        Args:
            processor: The TranscriptProcessor that emitted the update
            frame: TranscriptionUpdateFrame containing new messages
@@ -90,8 +114,31 @@ class TranscriptHandler:
        logger.debug(f"Received transcript update with {len(frame.messages)} new messages")

        for msg in frame.messages:
-            self.messages.append(msg)
-            await self.save_message(msg)
+            # Accumulate text for the appropriate role
+            if msg.role == "user":
+                self._current_user_sentence += msg.content + " "
+                # Check if we have a complete sentence
+                if match_endofsentence(self._current_user_sentence):
+                    await self._save_sentence("user", self._current_user_sentence, msg.timestamp)
+                    self._current_user_sentence = ""
+            elif msg.role == "assistant":
+                self._current_assistant_sentence += msg.content + " "
+                # Check if we have a complete sentence
+                if match_endofsentence(self._current_assistant_sentence):
+                    await self._save_sentence(
+                        "assistant", self._current_assistant_sentence, msg.timestamp
+                    )
+                    self._current_assistant_sentence = ""
+
+    async def finalize_partial_sentences(self):
+        """Save any remaining partial sentences when the conversation ends."""
+        if self._current_user_sentence.strip():
+            await self._save_sentence("user", self._current_user_sentence)
+            self._current_user_sentence = ""
+
+        if self._current_assistant_sentence.strip():
+            await self._save_sentence("assistant", self._current_assistant_sentence)
+            self._current_assistant_sentence = ""


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -160,12 +207,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ]
    )

+    # Create Whisker debugger observer
+    whisker = WhiskerObserver(pipeline)
+
    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True,
        ),
+        observers=[whisker],
        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
    )

@@ -183,6 +234,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info(f"Client disconnected")
+        # Finalize any partial sentences before canceling
+        await transcript_handler.finalize_partial_sentences()
        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
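Reviewer note: the buffering behaviour added to `TranscriptHandler` in this file can be tried out in isolation. The sketch below is a minimal standalone approximation, not the code from the patch. `SentenceBuffer` and `ends_sentence` are hypothetical names, and `ends_sentence` is a simple regex stand-in for pipecat's `match_endofsentence`, whose real rules are assumed to be more elaborate.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical stand-in for pipecat's match_endofsentence: treats ".", "!" or "?"
# at the end of the right-stripped text as a sentence boundary.
_END_RE = re.compile(r"[.!?]$")


def ends_sentence(text: str) -> bool:
    return bool(_END_RE.search(text.rstrip()))


class SentenceBuffer:
    """Accumulates per-role fragments and emits complete sentences."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {"user": "", "assistant": ""}
        self.sentences: List[Tuple[str, str]] = []

    def add(self, role: str, fragment: str) -> None:
        # Append the fragment, then flush once the buffer ends a sentence.
        self._pending[role] += fragment + " "
        if ends_sentence(self._pending[role]):
            self._flush(role)

    def finalize(self) -> None:
        # Mirror of finalize_partial_sentences(): save whatever is left over.
        for role in self._pending:
            if self._pending[role].strip():
                self._flush(role)

    def _flush(self, role: str) -> None:
        self.sentences.append((role, self._pending[role].strip()))
        self._pending[role] = ""


buf = SentenceBuffer()
for role, frag in [
    ("user", "What's the"),
    ("user", "weather today?"),
    ("assistant", "It is sunny"),
]:
    buf.add(role, frag)
buf.finalize()
print(buf.sentences)
```

The assistant fragment never reaches a sentence boundary, so it is only saved by `finalize()`. This is the case the new `on_client_disconnected` call is meant to cover.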
--- a/examples/foundational/39-mcp-stdio.py
+++ b/examples/foundational/39-mcp-stdio.py
@@ -206,14 +206,6 @@ async def bot(runner_args: RunnerArguments):


 if __name__ == "__main__":
-    if not os.getenv("NASA_API_KEY"):
-        logger.error(
-            f"Please set NASA_API_KEY environment variable for this example. See https://api.nasa.gov"
-        )
-        import sys
-
-        sys.exit(1)
-
    from pipecat.runner.run import main

    main()
--- a/examples/foundational/39a-mcp-run-sse.py
+++ b/examples/foundational/39a-mcp-run-sse.py
@@ -141,14 +141,6 @@ async def bot(runner_args: RunnerArguments):


 if __name__ == "__main__":
-    if not os.getenv("MCP_RUN_SSE_URL"):
-        logger.error(
-            f"Please set MCP_RUN_SSE_URL environment variable for this example. See https://mcp.run"
-        )
-        import sys
-
-        sys.exit(1)
-
    from pipecat.runner.run import main

    main()
--- a/examples/foundational/39b-multiple-mcp.py
+++ b/examples/foundational/39b-multiple-mcp.py
@@ -219,14 +219,6 @@ async def bot(runner_args: RunnerArguments):


 if __name__ == "__main__":
-    if not os.getenv("NASA_API_KEY") or not os.getenv("MCP_RUN_SSE_URL"):
-        logger.error(
-            f"Please set NASA_API_KEY and MCP_RUN_SSE_URL environment variables. See https://api.nasa.gov and https://mcp.run"
-        )
-        import sys
-
-        sys.exit(1)
-
    from pipecat.runner.run import main

    main()
--- a/examples/foundational/39c-mcp-run-http.py
+++ b/examples/foundational/39c-mcp-run-http.py
@@ -145,14 +145,6 @@ async def bot(runner_args: RunnerArguments):


 if __name__ == "__main__":
-    if not os.getenv("GITHUB_PERSONAL_ACCESS_TOKEN"):
-        logger.error(
-            f"Please set GITHUB_PERSONAL_ACCESS_TOKEN environment variable for this example."
-        )
-        import sys
-
-        sys.exit(1)
-
    from pipecat.runner.run import main

    main()
--- a/examples/foundational/40-aws-nova-sonic.py
+++ b/examples/foundational/40-aws-nova-sonic.py
@@ -18,11 +18,10 @@ from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.aws.nova_sonic.llm import AWSNovaSonicLLMService
+from pipecat.services.aws_nova_sonic import AWSNovaSonicLLMService
 from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -120,7 +119,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    llm.register_function("get_current_weather", fetch_weather_from_api)

    # Set up context and context management.
-    context = LLMContext(
+    # AWSNovaSonicLLMService adapts OpenAI LLM context objects, which use the standard message
+    # format, to what Nova Sonic expects.
+    context = OpenAILLMContext(
        messages=[
            {"role": "system", "content": f"{system_instruction}"},
            {
@@ -130,7 +131,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        ],
        tools=tools,
    )
-    context_aggregator = LLMContextAggregatorPair(context)
+    context_aggregator = llm.create_context_aggregator(context)

    # Build the pipeline
    pipeline = Pipeline(
--- a/examples/foundational/46-video-processing.py
+++ b/examples/foundational/46-video-processing.py
@@ -15,14 +15,12 @@ from pipecat.frames.frames import Frame, InputImageRawFrame, LLMRunFrame, Output
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.processors.frameworks.rtvi import RTVIObserver, RTVIProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.gemini_multimodal_live import GeminiMultimodalLiveLLMService
 from pipecat.transports.base_transport import TransportParams
 from pipecat.transports.daily.transport import DailyParams, DailyTransport

@@ -96,7 +94,7 @@ Respond to what the user said in a creative and helpful way. Keep your responses


 async def run_bot(pipecat_transport):
-    llm = GeminiLiveLLMService(
+    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
        transcribe_user_audio=True,
@@ -110,8 +108,8 @@ async def run_bot(pipecat_transport):
        }
    ]

-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
+    context = OpenAILLMContext(messages)
+    context_aggregator = llm.create_context_aggregator(context)

    # RTVI events for Pipecat client UI
    rtvi = RTVIProcessor()
--- a/examples/foundational/47-sentry-metrics.py
+++ b/examples/foundational/47-sentry-metrics.py
@@ -1,142 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import os
-
-import sentry_sdk
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.metrics.sentry import SentryMetrics
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    # Initialize Sentry
-    sentry_sdk.init(
-        dsn=os.getenv("SENTRY_DSN"),
-        traces_sample_rate=1.0,
-    )
-
-    stt = DeepgramSTTService(
-        api_key=os.getenv("DEEPGRAM_API_KEY"),
-        metrics=SentryMetrics(),
-    )
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-        metrics=SentryMetrics(),
-    )
-
-    llm = OpenAILLMService(
-        api_key=os.getenv("OPENAI_API_KEY"),
-        metrics=SentryMetrics(),
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/48-service-switcher.py
+++ b/examples/foundational/48-service-switcher.py
@@ -1,153 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import asyncio
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame, ManuallySwitchServiceFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.service_switcher import ServiceSwitcher, ServiceSwitcherStrategyManual
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.stt import CartesiaSTTService
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.deepgram.tts import DeepgramTTSService
-from pipecat.services.google.llm import GoogleLLMService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt_cartesia = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))
-    stt_deepgram = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-    stt_switcher = ServiceSwitcher(
-        services=[stt_cartesia, stt_deepgram], strategy_type=ServiceSwitcherStrategyManual
-    )
-
-    tts_cartesia = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"),
-        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",
-    )
-    tts_deepgram = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"))
-    tts_switcher = ServiceSwitcher(
-        services=[tts_cartesia, tts_deepgram], strategy_type=ServiceSwitcherStrategyManual
-    )
-
-    llm_openai = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-    llm_google = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
-    llm_switcher = ServiceSwitcher(
-        services=[llm_openai, llm_google], strategy_type=ServiceSwitcherStrategyManual
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt_switcher,
-            context_aggregator.user(),  # User responses
-            llm_switcher,  # LLM
-            tts_switcher,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-        await asyncio.sleep(15)
-        print(f"Switching to {stt_deepgram}")
-        await task.queue_frames([ManuallySwitchServiceFrame(service=stt_deepgram)])
-        await asyncio.sleep(15)
-        print(f"Switching to {llm_google}")
-        await task.queue_frames([ManuallySwitchServiceFrame(service=llm_google)])
-        await asyncio.sleep(15)
-        print(f"Switching to {tts_deepgram}")
-        await task.queue_frames([ManuallySwitchServiceFrame(service=tts_deepgram)])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/README.md
+++ b/examples/foundational/README.md
@@ -105,7 +105,7 @@ uv run 07-interruptible.py -t twilio -x NGROK_HOST_NAME
 ### Vision & Multimodal

 - **[12a-describe-video-gemini-flash.py](./12a-describe-video-gemini-flash.py)**: Bot describes user's video (Video input, Multimodal LLMs)
-- **[26c-gemini-live-video.py](./26c-gemini-live-video.py)**: Gemini with video input (Streaming video, Function calls)
+- **[26c-gemini-multimodal-live-video.py](./26c-gemini-multimodal-live-video.py)**: Gemini with video input (Streaming video, Function calls)

 ### Voice & Language

--- a/examples/foundational/assets/cat.jpg
+++ b/examples/foundational/assets/cat.jpg
--- a/examples/foundational/assets/moondream.png
+++ b/examples/foundational/assets/moondream.png
--- a/examples/quickstart/README.md
+++ b/examples/quickstart/README.md
@@ -73,13 +73,13 @@ Transform your local bot into a production-ready service. Pipecat Cloud handles

 1. [Sign up for Pipecat Cloud](https://pipecat.daily.co/sign-up).

-2. Install the Pipecat CLI:
+2. Install the Pipecat Cloud CLI:

   ```bash
-   uv tool install pipecat-ai-cli
+   uv add pipecatcloud
   ```

-> 💡 Tip: You can run the `pipecat` CLI using the `pc` alias.
+> 💡 Tip: You can run the `pipecatcloud` CLI using the `pcc` alias.

 3. Set up Docker for building your bot image:

@@ -113,22 +113,12 @@ secret_set = "quickstart-secrets"

 > 💡 Tip: [Set up `image_credentials`](https://docs.pipecat.ai/deployment/pipecat-cloud/fundamentals/secrets#image-pull-secrets) in your TOML file for authenticated image pulls

-### Log in to Pipecat Cloud
-
-To start using the CLI, authenticate to Pipecat Cloud:
-
-```bash
-pipecat cloud auth login
-```
-
-You'll be presented with a link that you can click to authenticate your client.
-
 ### Configure secrets

 Upload your API keys to Pipecat Cloud's secure storage:

 ```bash
-pipecat cloud secrets set quickstart-secrets --file .env
+uv run pcc secrets set quickstart-secrets --file .env
 ```

 This creates a secret set called `quickstart-secrets` (matching your TOML file) and uploads all your API keys from `.env`.
@@ -138,13 +128,13 @@ This creates a secret set called `quickstart-secrets` (matching your TOML file)
 Build your Docker image and push to Docker Hub:

 ```bash
-pipecat cloud docker build-push
+uv run pcc docker build-push
 ```

 Deploy to Pipecat Cloud:

 ```bash
-pipecat cloud deploy
+uv run pcc deploy
 ```

 ### Connect to your agent
--- a/examples/quickstart/pcc-deploy.toml
+++ b/examples/quickstart/pcc-deploy.toml
@@ -1,11 +1,6 @@
 agent_name = "quickstart"
 image = "your_username/quickstart:0.1"
 secret_set = "quickstart-secrets"
-agent_profile = "agent-1x"
-
-# RECOMMENDED: Set an image pull secret:
-# https://docs.pipecat.ai/deployment/pipecat-cloud/fundamentals/secrets#image-pull-secrets
-# image_credentials = "your_image_pull_secret"

 [scaling]
 	min_agents = 1
--- a/examples/quickstart/pyproject.toml
+++ b/examples/quickstart/pyproject.toml
@@ -4,14 +4,13 @@ version = "0.1.0"
 description = "Quickstart example for building voice AI bots with Pipecat"
 requires-python = ">=3.10"
 dependencies = [
-    "pipecat-ai[webrtc,daily,silero,deepgram,openai,cartesia,local-smart-turn-v3,runner]",
-    "pipecat-ai-cli"
+    "pipecat-ai[webrtc,daily,silero,deepgram,openai,cartesia,local-smart-turn-v3,runner]>=0.0.85",
+    "pipecatcloud>=0.2.4"
 ]

 [dependency-groups]
 dev = [
-    "pyright>=1.1.404,<2",
-    "ruff>=0.12.11,<1",
+    "ruff~=0.12.1",
 ]

 [tool.ruff]
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -34,7 +34,7 @@ dependencies = [
    "pyloudnorm~=0.1.1",
    "resampy~=0.4.3",
    "soxr~=0.5.0",
-    "openai>=1.74.0,<3",
+    "openai>=1.74.0,<=1.99.1",
    # Pinning numba to resolve package dependencies
    "numba==0.61.2",
    "wait_for2>=0.4.1; python_version<'3.12'",
@@ -50,24 +50,23 @@ anthropic = [ "anthropic~=0.49.0" ]
 assemblyai = [ "pipecat-ai[websockets-base]" ]
 asyncai = [ "pipecat-ai[websockets-base]" ]
 aws = [ "aioboto3~=15.0.0", "pipecat-ai[websockets-base]" ]
-aws-nova-sonic = [ "aws_sdk_bedrock_runtime~=0.1.1; python_version>='3.12'" ]
+aws-nova-sonic = [ "aws_sdk_bedrock_runtime~=0.0.2; python_version>='3.12'" ]
 azure = [ "azure-cognitiveservices-speech~=1.42.0"]
 cartesia = [ "cartesia~=2.0.3", "pipecat-ai[websockets-base]" ]
 cerebras = []
 deepseek = []
-daily = [ "daily-python~=0.21.0" ]
+daily = [ "daily-python~=0.19.9" ]
 deepgram = [ "deepgram-sdk~=4.7.0" ]
 elevenlabs = [ "pipecat-ai[websockets-base]" ]
 fal = [ "fal-client~=0.5.9" ]
 fireworks = []
 fish = [ "ormsgpack~=1.7.0", "pipecat-ai[websockets-base]" ]
 gladia = [ "pipecat-ai[websockets-base]" ]
-google = [ "google-cloud-speech>=2.33.0,<3", "google-cloud-texttospeech>=2.31.0,<3", "google-genai>=1.41.0,<2", "pipecat-ai[websockets-base]" ]
+google = [ "google-cloud-speech~=2.32.0", "google-cloud-texttospeech~=2.26.0", "google-genai~=1.24.0", "pipecat-ai[websockets-base]" ]
 grok = []
 groq = [ "groq~=0.23.0" ]
 gstreamer = [ "pygobject~=3.50.0" ]
 heygen = [ "livekit>=1.0.13", "pipecat-ai[websockets-base]" ]
-hume = [ "hume>=0.11.2" ]
 inworld = []
 krisp = [ "pipecat-ai-krisp~=0.4.0" ]
 koala = [ "pvkoala~=2.0.3" ]
@@ -84,7 +83,7 @@ nim = []
 neuphonic = [ "pipecat-ai[websockets-base]" ]
 noisereduce = [ "noisereduce~=3.0.3" ]
 openai = [ "pipecat-ai[websockets-base]" ]
-openpipe = [ "openpipe>=4.50.0,<6" ]
+openpipe = [ "openpipe~=4.50.0" ]
 openrouter = []
 perplexity = []
 playht = [ "pipecat-ai[websockets-base]" ]
@@ -102,7 +101,7 @@ silero = [ "onnxruntime>=1.20.1,<2" ]
 simli = [ "simli-ai~=0.1.10"]
 soniox = [ "pipecat-ai[websockets-base]" ]
 soundfile = [ "soundfile~=0.13.0" ]
-speechmatics = [ "speechmatics-rt>=0.5.0" ]
+speechmatics = [ "speechmatics-rt>=0.4.0" ]
 strands = [ "strands-agents>=1.9.1,<2" ]
 tavus=[]
 together = []
--- a/scripts/evals/eval.py
+++ b/scripts/evals/eval.py
@@ -10,10 +10,9 @@ import os
 import re
 import time
 import wave
-from dataclasses import dataclass
 from datetime import datetime
 from pathlib import Path
-from typing import Any, List, Optional, Tuple
+from typing import List, Optional, Tuple

 import aiofiles
 from deepgram import LiveOptions
@@ -35,8 +34,7 @@ from pipecat.frames.frames import EndTaskFrame, LLMRunFrame, OutputImageRawFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
 from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
@@ -54,14 +52,6 @@ EVAL_TIMEOUT_SECS = 120
 EvalPrompt = str | Tuple[str, ImageFile]


-@dataclass
-class EvalConfig:
-    prompt: EvalPrompt
-    eval: str
-    eval_speaks_first: bool = False
-    runner_args_body: Optional[Any] = None
-
-
 class EvalRunner:
    def __init__(
        self,
@@ -102,7 +92,9 @@ class EvalRunner:
    async def run_eval(
        self,
        example_file: str,
-        eval_config: EvalConfig,
+        prompt: EvalPrompt,
+        eval: str,
+        user_speaks_first: bool = False,
    ):
        if not re.match(self._pattern, example_file):
            return
@@ -119,8 +111,10 @@ class EvalRunner:

        try:
            tasks = [
-                asyncio.create_task(run_example_pipeline(script_path, eval_config)),
-                asyncio.create_task(run_eval_pipeline(self, example_file, eval_config)),
+                asyncio.create_task(run_example_pipeline(script_path)),
+                asyncio.create_task(
+                    run_eval_pipeline(self, example_file, prompt, eval, user_speaks_first)
+                ),
            ]
            _, pending = await asyncio.wait(tasks, timeout=EVAL_TIMEOUT_SECS)
            if pending:
@@ -182,7 +176,7 @@ class EvalRunner:
        return os.path.join(self._recordings_dir, f"{base_name}.wav")


-async def run_example_pipeline(script_path: Path, eval_config: EvalConfig):
+async def run_example_pipeline(script_path: Path):
    room_url = os.getenv("DAILY_SAMPLE_ROOM_URL")

    module = load_module_from_path(script_path)
@@ -201,7 +195,6 @@ async def run_example_pipeline(script_path: Path, eval_config: EvalConfig):

    runner_args = RunnerArguments()
    runner_args.pipeline_idle_timeout_secs = PIPELINE_IDLE_TIMEOUT_SECS
-    runner_args.body = eval_config.runner_args_body

    await module.run_bot(transport, runner_args)

@@ -209,7 +202,9 @@ async def run_example_pipeline(script_path: Path, eval_config: EvalConfig):
 async def run_eval_pipeline(
    eval_runner: EvalRunner,
    example_file: str,
-    eval_config: EvalConfig,
+    prompt: EvalPrompt,
+    eval: str,
+    user_speaks_first: bool = False,
 ):
    logger.info(f"Starting eval bot")

@@ -266,16 +261,17 @@ async def run_eval_pipeline(
    # Load example prompt depending on image.
    example_prompt = ""
    example_image: Optional[ImageFile] = None
-    if isinstance(eval_config.prompt, str):
-        example_prompt = eval_config.prompt
-    elif isinstance(eval_config.prompt, tuple):
-        example_prompt, example_image = eval_config.prompt
+    if isinstance(prompt, str):
+        example_prompt = prompt
+    elif isinstance(prompt, tuple):
+        example_prompt, example_image = prompt

+    eval_prompt = f"The answer is correct if it matches: {eval}."
    common_system_prompt = (
        "The user might say things other than the answer and that's allowed. "
-        f"You should only call the eval function when the user: {eval_config.eval}"
+        f"You should only call the eval function with your assessment when the user actually answers the question. {eval_prompt}"
    )
-    if eval_config.eval_speaks_first:
+    if user_speaks_first:
        system_prompt = f"You are an LLM eval, be extremly brief. You will start the conversation by saying: '{example_prompt}'. {common_system_prompt}"
    else:
        system_prompt = f"You are an LLM eval, be extremly brief. Your goal is to first ask one question: {example_prompt}. {common_system_prompt}"
@@ -287,8 +283,8 @@ async def run_eval_pipeline(
        },
    ]

-    context = LLMContext(messages, tools)
-    context_aggregator = LLMContextAggregatorPair(context)
+    context = OpenAILLMContext(messages, tools)
+    context_aggregator = llm.create_context_aggregator(context)

    audio_buffer = AudioBufferProcessor()

@@ -333,9 +329,9 @@ async def run_eval_pipeline(

        # Default behavior is for the bot to speak first
        # If the eval bot speaks first, we append the prompt to the messages
-        if eval_config.eval_speaks_first:
+        if user_speaks_first:
            messages.append(
-                {"role": "user", "content": f"Start by saying this exactly: '{eval_config.prompt}'"}
+                {"role": "user", "content": f"Start by saying this exactly: '{example_prompt}'"}
            )
            await task.queue_frames([LLMRunFrame()])

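Reviewer note: with `EvalConfig` removed, `run_eval_pipeline` now builds the system prompt from three loose arguments. The sketch below is a simplified illustration of that assembly, not the code from `eval.py`. `split_prompt`, `build_system_prompt` and `eval_criteria` are hypothetical names, and the image is typed as `Any` so the example does not require PIL.

```python
from typing import Any, Optional, Tuple, Union

# Stand-in for EvalPrompt: either plain text or a (text, image) pair.
EvalPrompt = Union[str, Tuple[str, Any]]


def split_prompt(prompt: EvalPrompt) -> Tuple[str, Optional[Any]]:
    """Return (text, image) for either prompt form."""
    if isinstance(prompt, tuple):
        text, image = prompt
        return text, image
    return prompt, None


def build_system_prompt(
    prompt: EvalPrompt, eval_criteria: str, user_speaks_first: bool = False
) -> str:
    # Only the text part of the prompt goes into the system prompt.
    text, _ = split_prompt(prompt)
    common = (
        "The user might say things other than the answer and that's allowed. "
        "You should only call the eval function with your assessment when the user "
        f"actually answers the question. The answer is correct if it matches: {eval_criteria}."
    )
    if user_speaks_first:
        opening = f"You will start the conversation by saying: '{text}'."
    else:
        opening = f"Your goal is to first ask one question: {text}."
    return f"You are an LLM eval, be extremely brief. {opening} {common}"


print(build_system_prompt(("What do you see?", object()), "A cat description"))
```

Unpacking the prompt once and reusing only the text avoids interpolating a `(text, image)` tuple into a message string.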
--- a/scripts/evals/run-release-evals.py
+++ b/scripts/evals/run-release-evals.py
@@ -11,7 +11,7 @@ from datetime import datetime, timezone
 from pathlib import Path

 from dotenv import load_dotenv
-from eval import EvalConfig, EvalRunner
+from eval import EvalRunner
 from loguru import logger
 from PIL import Image
 from utils import check_env_variables
@@ -24,184 +24,179 @@ ASSETS_DIR = SCRIPT_DIR / "assets"

 FOUNDATIONAL_DIR = SCRIPT_DIR.parent.parent / "examples" / "foundational"

-EVAL_SIMPLE_MATH = EvalConfig(
-    prompt="A simple math addition.",
-    eval="The user answers the math addition correctly.",
+# Speaking order constants
+USER_SPEAKS_FIRST = True
+BOT_SPEAKS_FIRST = False
+
+# Math
+PROMPT_SIMPLE_MATH = "A simple math addition."
+EVAL_SIMPLE_MATH = "Correct math addition."
+
+# Weather
+PROMPT_WEATHER = "What's the weather in San Francisco?"
+EVAL_WEATHER = (
+    "Something specific about the current weather in San Francisco, including the degrees."
 )

-EVAL_WEATHER = EvalConfig(
-    prompt="What's the weather in San Francisco?",
-    eval="The user says something specific about the current weather in San Francisco, including the degrees.",
-)
+# Online search
+PROMPT_ONLINE_SEARCH = "What's the date right now in London?"
+EVAL_ONLINE_SEARCH = f"Today is {datetime.now(timezone.utc).strftime('%B %d, %Y')}."

-EVAL_ONLINE_SEARCH = EvalConfig(
-    prompt="What's the date right now in London?",
-    eval=f"The user says today is {datetime.now(timezone.utc).strftime('%B %d, %Y')} in London.",
-)
+# Switch language
+PROMPT_SWITCH_LANGUAGE = "Say something in Spanish."
+EVAL_SWITCH_LANGUAGE = "The user is now talking in Spanish."

-EVAL_SWITCH_LANGUAGE = EvalConfig(
-    prompt="Say something in Spanish.",
-    eval="The user talks in Spanish.",
-)
-
-EVAL_VISION_CAMERA = EvalConfig(
-    prompt=("Briefly describe what you see.", Image.open(ASSETS_DIR / "cat.jpg")),
-    eval="The user provides a cat description.",
-)
-
-
-def EVAL_VISION_IMAGE(*, eval_speaks_first: bool = False):
-    return EvalConfig(
-        prompt="Briefly describe this image.",
-        eval="The user provides a cat description.",
-        eval_speaks_first=eval_speaks_first,
-        runner_args_body={
-            "image_path": ASSETS_DIR / "cat.jpg",
-            "question": "Briefly describe this image.",
-        },
-    )
-
-
-EVAL_VOICEMAIL = EvalConfig(
-    prompt="Please leave a message.",
-    eval="The user leaves a voicemail message.",
-    eval_speaks_first=True,
-)
-
-EVAL_CONVERSATION = EvalConfig(
-    prompt="Hello, this is Mark.",
-    eval="The user replies with a greeting.",
-    eval_speaks_first=True,
-)
+# Vision
+PROMPT_VISION = ("What do you see?", Image.open(ASSETS_DIR / "cat.jpg"))
+EVAL_VISION = "A cat description."

+# Voicemail
+PROMPT_VOICEMAIL = "Please leave a message after the beep."
+EVAL_VOICEMAIL = "Assess the conversation and determine if it is a voicemail."
+PROMPT_CONVERSATION = "Hello, this is Mark."
+EVAL_CONVERSATION = "A start of a conversation, not a voicemail."

 TESTS_07 = [
    # 07 series
-    ("07-interruptible.py", EVAL_SIMPLE_MATH),
-    ("07-interruptible-cartesia-http.py", EVAL_SIMPLE_MATH),
-    ("07a-interruptible-speechmatics.py", EVAL_SIMPLE_MATH),
-    ("07aa-interruptible-soniox.py", EVAL_SIMPLE_MATH),
-    ("07ab-interruptible-inworld-http.py", EVAL_SIMPLE_MATH),
-    ("07ac-interruptible-asyncai.py", EVAL_SIMPLE_MATH),
-    ("07ac-interruptible-asyncai-http.py", EVAL_SIMPLE_MATH),
-    ("07b-interruptible-langchain.py", EVAL_SIMPLE_MATH),
-    ("07c-interruptible-deepgram.py", EVAL_SIMPLE_MATH),
-    ("07c-interruptible-deepgram-flux.py", EVAL_SIMPLE_MATH),
-    ("07c-interruptible-deepgram-http.py", EVAL_SIMPLE_MATH),
-    ("07d-interruptible-elevenlabs.py", EVAL_SIMPLE_MATH),
-    ("07d-interruptible-elevenlabs-http.py", EVAL_SIMPLE_MATH),
-    ("07f-interruptible-azure.py", EVAL_SIMPLE_MATH),
-    ("07g-interruptible-openai.py", EVAL_SIMPLE_MATH),
-    ("07h-interruptible-openpipe.py", EVAL_SIMPLE_MATH),
-    ("07j-interruptible-gladia.py", EVAL_SIMPLE_MATH),
-    ("07k-interruptible-lmnt.py", EVAL_SIMPLE_MATH),
-    ("07l-interruptible-groq.py", EVAL_SIMPLE_MATH),
-    ("07m-interruptible-aws.py", EVAL_SIMPLE_MATH),
-    ("07m-interruptible-aws-strands.py", EVAL_WEATHER),
-    ("07n-interruptible-gemini.py", EVAL_SIMPLE_MATH),
-    ("07n-interruptible-google.py", EVAL_SIMPLE_MATH),
-    ("07o-interruptible-assemblyai.py", EVAL_SIMPLE_MATH),
-    ("07q-interruptible-rime.py", EVAL_SIMPLE_MATH),
-    ("07q-interruptible-rime-http.py", EVAL_SIMPLE_MATH),
-    ("07r-interruptible-riva-nim.py", EVAL_SIMPLE_MATH),
-    ("07s-interruptible-google-audio-in.py", EVAL_SIMPLE_MATH),
-    ("07t-interruptible-fish.py", EVAL_SIMPLE_MATH),
-    ("07v-interruptible-neuphonic.py", EVAL_SIMPLE_MATH),
-    ("07v-interruptible-neuphonic-http.py", EVAL_SIMPLE_MATH),
-    ("07w-interruptible-fal.py", EVAL_SIMPLE_MATH),
-    ("07y-interruptible-minimax.py", EVAL_SIMPLE_MATH),
-    ("07z-interruptible-sarvam.py", EVAL_SIMPLE_MATH),
-    ("07ae-interruptible-hume.py", EVAL_SIMPLE_MATH),
+    ("07-interruptible.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07-interruptible-cartesia-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07a-interruptible-speechmatics.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07aa-interruptible-soniox.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07ab-interruptible-inworld-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07ac-interruptible-asyncai.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07ac-interruptible-asyncai-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07b-interruptible-langchain.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07c-interruptible-deepgram.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07d-interruptible-elevenlabs.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    (
+        "07d-interruptible-elevenlabs-http.py",
+        PROMPT_SIMPLE_MATH,
+        EVAL_SIMPLE_MATH,
+        BOT_SPEAKS_FIRST,
+    ),
+    ("07e-interruptible-playht.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07e-interruptible-playht-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07f-interruptible-azure.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07g-interruptible-openai.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07h-interruptible-openpipe.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07j-interruptible-gladia.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07k-interruptible-lmnt.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07l-interruptible-groq.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07m-interruptible-aws.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07m-interruptible-aws-strands.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("07n-interruptible-gemini.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07n-interruptible-google.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07o-interruptible-assemblyai.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07q-interruptible-rime.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07q-interruptible-rime-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07r-interruptible-riva-nim.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    (
+        "07s-interruptible-google-audio-in.py",
+        PROMPT_SIMPLE_MATH,
+        EVAL_SIMPLE_MATH,
+        BOT_SPEAKS_FIRST,
+    ),
+    ("07t-interruptible-fish.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07v-interruptible-neuphonic.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07v-interruptible-neuphonic-http.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07w-interruptible-fal.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07y-interruptible-minimax.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07z-interruptible-sarvam.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    # Needs a local XTTS docker instance running.
-    # ("07i-interruptible-xtts.py", EVAL_SIMPLE_MATH),
+    # ("07i-interruptible-xtts.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    # Needs a Krisp license.
-    # ("07p-interruptible-krisp.py", EVAL_SIMPLE_MATH),
+    # ("07p-interruptible-krisp.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    # Needs GPU resources.
-    # ("07u-interruptible-ultravox.py", EVAL_SIMPLE_MATH),
+    # ("07u-interruptible-ultravox.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
 ]

 TESTS_12 = [
-    ("12-describe-image-openai.py", EVAL_VISION_IMAGE(eval_speaks_first=True)),
-    ("12a-describe-image-anthropic.py", EVAL_VISION_IMAGE(eval_speaks_first=True)),
-    ("12b-describe-image-aws.py", EVAL_VISION_IMAGE(eval_speaks_first=True)),
-    ("12c-describe-image-gemini-flash.py", EVAL_VISION_IMAGE(eval_speaks_first=True)),
-    ("12d-describe-image-moondream.py", EVAL_VISION_IMAGE()),
+    ("12-describe-video.py", PROMPT_VISION, EVAL_VISION, BOT_SPEAKS_FIRST),
+    ("12a-describe-video-gemini-flash.py", PROMPT_VISION, EVAL_VISION, BOT_SPEAKS_FIRST),
+    ("12b-describe-video-gpt-4o.py", PROMPT_VISION, EVAL_VISION, BOT_SPEAKS_FIRST),
+    ("12c-describe-video-anthropic.py", PROMPT_VISION, EVAL_VISION, BOT_SPEAKS_FIRST),
 ]

 TESTS_14 = [
-    ("14-function-calling.py", EVAL_WEATHER),
-    ("14a-function-calling-anthropic.py", EVAL_WEATHER),
-    ("14e-function-calling-google.py", EVAL_WEATHER),
-    ("14f-function-calling-groq.py", EVAL_WEATHER),
-    ("14g-function-calling-grok.py", EVAL_WEATHER),
-    ("14h-function-calling-azure.py", EVAL_WEATHER),
-    ("14i-function-calling-fireworks.py", EVAL_WEATHER),
-    ("14j-function-calling-nim.py", EVAL_WEATHER),
-    ("14k-function-calling-cerebras.py", EVAL_WEATHER),
-    ("14m-function-calling-openrouter.py", EVAL_WEATHER),
-    ("14n-function-calling-perplexity.py", EVAL_WEATHER),
-    ("14p-function-calling-gemini-vertex-ai.py", EVAL_WEATHER),
-    ("14q-function-calling-qwen.py", EVAL_WEATHER),
-    ("14r-function-calling-aws.py", EVAL_WEATHER),
-    ("14v-function-calling-openai.py", EVAL_WEATHER),
-    ("14w-function-calling-mistral.py", EVAL_WEATHER),
-    ("14x-function-calling-openpipe.py", EVAL_WEATHER),
-    # Video
-    ("14d-function-calling-anthropic-video.py", EVAL_VISION_CAMERA),
-    ("14d-function-calling-aws-video.py", EVAL_VISION_CAMERA),
-    ("14d-function-calling-gemini-flash-video.py", EVAL_VISION_CAMERA),
-    ("14d-function-calling-moondream-video.py", EVAL_VISION_CAMERA),
-    ("14d-function-calling-openai-video.py", EVAL_VISION_CAMERA),
+    ("14-function-calling.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14a-function-calling-anthropic.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14b-function-calling-anthropic-video.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14d-function-calling-video.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14e-function-calling-google.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14f-function-calling-groq.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14g-function-calling-grok.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14h-function-calling-azure.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14i-function-calling-fireworks.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14j-function-calling-nim.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14k-function-calling-cerebras.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14m-function-calling-openrouter.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14n-function-calling-perplexity.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14p-function-calling-gemini-vertex-ai.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14q-function-calling-qwen.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14r-function-calling-aws.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14v-function-calling-openai.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14w-function-calling-mistral.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    # Currently not working.
-    # ("14c-function-calling-together.py", EVAL_WEATHER),
-    # ("14l-function-calling-deepseek.py", EVAL_WEATHER),
-    # ("14o-function-calling-gemini-openai-format.py", EVAL_WEATHER),
+    # ("14c-function-calling-together.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    # ("14l-function-calling-deepseek.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    # ("14o-function-calling-gemini-openai-format.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
 ]

 TESTS_15 = [
-    ("15a-switch-languages.py", EVAL_SWITCH_LANGUAGE),
+    ("15a-switch-languages.py", PROMPT_SWITCH_LANGUAGE, EVAL_SWITCH_LANGUAGE, BOT_SPEAKS_FIRST),
 ]

 TESTS_19 = [
-    ("19-openai-realtime.py", EVAL_WEATHER),
-    ("19-openai-realtime-beta.py", EVAL_WEATHER),
-    # OpenAI Realtime not released on Azure yet
-    # ("19a-azure-realtime.py", EVAL_WEATHER),
-    ("19a-azure-realtime-beta.py", EVAL_WEATHER),
-    ("19b-openai-realtime-text.py", EVAL_WEATHER),
-    ("19b-openai-realtime-beta-text.py", EVAL_WEATHER),
+    ("19-openai-realtime-beta.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("19a-azure-realtime-beta.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("19b-openai-realtime-text.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("19b-openai-realtime-beta-text.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
 ]

 TESTS_21 = [
-    ("21a-tavus-video-service.py", EVAL_SIMPLE_MATH),
+    ("21a-tavus-video-service.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
 ]

 TESTS_26 = [
-    ("26-gemini-live.py", EVAL_SIMPLE_MATH),
-    ("26a-gemini-live-transcription.py", EVAL_SIMPLE_MATH),
-    ("26b-gemini-live-function-calling.py", EVAL_WEATHER),
-    ("26c-gemini-live-video.py", EVAL_SIMPLE_MATH),
-    ("26e-gemini-live-google-search.py", EVAL_ONLINE_SEARCH),
-    ("26h-gemini-live-vertex-function-calling.py", EVAL_WEATHER),
+    ("26-gemini-multimodal-live.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    (
+        "26a-gemini-multimodal-live-transcription.py",
+        PROMPT_SIMPLE_MATH,
+        EVAL_SIMPLE_MATH,
+        BOT_SPEAKS_FIRST,
+    ),
+    (
+        "26b-gemini-multimodal-live-function-calling.py",
+        PROMPT_WEATHER,
+        EVAL_WEATHER,
+        BOT_SPEAKS_FIRST,
+    ),
+    ("26c-gemini-multimodal-live-video.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    (
+        "26e-gemini-multimodal-google-search.py",
+        PROMPT_ONLINE_SEARCH,
+        EVAL_ONLINE_SEARCH,
+        BOT_SPEAKS_FIRST,
+    ),
    # Currently not working.
-    # ("26d-gemini-live-text.py", EVAL_SIMPLE_MATH),
+    # ("26d-gemini-multimodal-live-text.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
 ]

 TESTS_27 = [
-    ("27-simli-layer.py", EVAL_SIMPLE_MATH),
+    ("27-simli-layer.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
 ]

 TESTS_40 = [
-    ("40-aws-nova-sonic.py", EVAL_SIMPLE_MATH),
+    ("40-aws-nova-sonic.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
 ]

 TESTS_43 = [
-    ("43a-heygen-video-service.py", EVAL_SIMPLE_MATH),
+    ("43a-heygen-video-service.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
 ]

 TESTS_44 = [
-    ("44-voicemail-detection.py", EVAL_VOICEMAIL),
-    ("44-voicemail-detection.py", EVAL_CONVERSATION),
+    ("44-voicemail-detection.py", PROMPT_VOICEMAIL, EVAL_VOICEMAIL, USER_SPEAKS_FIRST),
+    ("44-voicemail-detection.py", PROMPT_CONVERSATION, EVAL_CONVERSATION, USER_SPEAKS_FIRST),
 ]

 TESTS = [
@@ -239,9 +234,9 @@ async def main(args: argparse.Namespace):

    # Parse test config: (test, prompt, eval, user_speaks_first)
    for test_config in TESTS:
-        test, eval_config = test_config
+        test, prompt, eval, user_speaks_first = test_config

-        await runner.run_eval(test, eval_config)
+        await runner.run_eval(test, prompt, eval, user_speaks_first)

    runner.print_results()

--- a/src/pipecat/adapters/schemas/tools_schema.py
+++ b/src/pipecat/adapters/schemas/tools_schema.py
@@ -22,12 +22,9 @@ class AdapterType(Enum):

    Parameters:
        GEMINI: Google Gemini adapter - currently the only service supporting custom tools.
-        SHIM: Backward compatibility shim for creating ToolsSchemas from lists of tools in
-              any format, used by LLMContext.from_openai_context.
    """

    GEMINI = "gemini"  # that is the only service where we are able to add custom tools for now
-    SHIM = "shim"  # for use as backward compatibility shim for creating ToolsSchemas from list of tools in any format


 class ToolsSchema:
--- a/src/pipecat/adapters/services/anthropic_adapter.py
+++ b/src/pipecat/adapters/services/anthropic_adapter.py
@@ -110,7 +110,7 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
        system = NOT_GIVEN
        messages = []

-        # First, map messages using self._from_universal_context_message(m)
+        # first, map messages using self._from_universal_context_message(m)
        try:
            messages = [self._from_universal_context_message(m) for m in universal_context_messages]
        except Exception as e:
@@ -245,25 +245,13 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
                    item["text"] = "(empty)"
                # handle image_url -> image conversion
                if item["type"] == "image_url":
-                    if item["image_url"]["url"].startswith("data:"):
-                        item["type"] = "image"
-                        item["source"] = {
-                            "type": "base64",
-                            "media_type": "image/jpeg",
-                            "data": item["image_url"]["url"].split(",")[1],
-                        }
-                        del item["image_url"]
-                    elif item["image_url"]["url"].startswith("http"):
-                        item["type"] = "image"
-                        item["source"] = {
-                            "type": "url",
-                            "url": item["image_url"]["url"],
-                        }
-                        del item["image_url"]
-                    else:
-                        url = item["image_url"]["url"]
-                        logger.warning(f"Unsupported 'image_url': {url}")
-
+                    item["type"] = "image"
+                    item["source"] = {
+                        "type": "base64",
+                        "media_type": "image/jpeg",
+                        "data": item["image_url"]["url"].split(",")[1],
+                    }
+                    del item["image_url"]
            # In the case where there's a single image in the list (like what
            # would result from a UserImageRawFrame), ensure that the image
            # comes before text, as recommended by Anthropic docs
--- a/src/pipecat/adapters/services/aws_nova_sonic_adapter.py
+++ b/src/pipecat/adapters/services/aws_nova_sonic_adapter.py
@@ -6,47 +6,13 @@

 """AWS Nova Sonic LLM adapter for Pipecat."""

-import copy
 import json
-from dataclasses import dataclass
-from enum import Enum
-from typing import Any, Dict, List, Optional, TypedDict
-
-from loguru import logger
+from typing import Any, Dict, List, TypedDict

 from pipecat.adapters.base_llm_adapter import BaseLLMAdapter
 from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
-from pipecat.processors.aggregators.llm_context import LLMContext, LLMContextMessage
-
-
-class Role(Enum):
-    """Roles supported in AWS Nova Sonic conversations.
-
-    Parameters:
-        SYSTEM: System-level messages (not used in conversation history).
-        USER: Messages sent by the user.
-        ASSISTANT: Messages sent by the assistant.
-        TOOL: Messages sent by tools (not used in conversation history).
-    """
-
-    SYSTEM = "SYSTEM"
-    USER = "USER"
-    ASSISTANT = "ASSISTANT"
-    TOOL = "TOOL"
-
-
-@dataclass
-class AWSNovaSonicConversationHistoryMessage:
-    """A single message in AWS Nova Sonic conversation history.
-
-    Parameters:
-        role: The role of the message sender (USER or ASSISTANT only).
-        text: The text content of the message.
-    """
-
-    role: Role  # only USER and ASSISTANT
-    text: str
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.processors.aggregators.llm_context import LLMContext


 class AWSNovaSonicLLMInvocationParams(TypedDict):
@@ -55,9 +21,7 @@ class AWSNovaSonicLLMInvocationParams(TypedDict):
    This is a placeholder until support for universal LLMContext machinery is added for AWS Nova Sonic.
    """

-    system_instruction: Optional[str]
-    messages: List[AWSNovaSonicConversationHistoryMessage]
-    tools: List[Dict[str, Any]]
+    pass


 class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
@@ -70,7 +34,7 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
    @property
    def id_for_llm_specific_messages(self) -> str:
        """Get the identifier used in LLMSpecificMessage instances for AWS Nova Sonic."""
-        return "aws-nova-sonic"
+        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")

    def get_llm_invocation_params(self, context: LLMContext) -> AWSNovaSonicLLMInvocationParams:
        """Get AWS Nova Sonic-specific LLM invocation parameters from a universal LLM context.
@@ -83,13 +47,7 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
        Returns:
            Dictionary of parameters for invoking AWS Nova Sonic's LLM API.
        """
-        messages = self._from_universal_context_messages(self.get_messages(context))
-        return {
-            "system_instruction": messages.system_instruction,
-            "messages": messages.messages,
-            # NOTE: LLMContext's tools are guaranteed to be a ToolsSchema (or NOT_GIVEN)
-            "tools": self.from_standard_tools(context.tools) or [],
-        }
+        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")

    def get_messages_for_logging(self, context) -> List[Dict[str, Any]]:
        """Get messages from a universal LLM context in a format ready for logging about AWS Nova Sonic.
@@ -104,75 +62,7 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
        Returns:
            List of messages in a format ready for logging about AWS Nova Sonic.
        """
-        return self._from_universal_context_messages(self.get_messages(context)).messages
-
-    @dataclass
-    class ConvertedMessages:
-        """Container for Google-formatted messages converted from universal context."""
-
-        messages: List[AWSNovaSonicConversationHistoryMessage]
-        system_instruction: Optional[str] = None
-
-    def _from_universal_context_messages(
-        self, universal_context_messages: List[LLMContextMessage]
-    ) -> ConvertedMessages:
-        system_instruction = None
-        messages = []
-
-        # Bail if there are no messages
-        if not universal_context_messages:
-            return self.ConvertedMessages()
-
-        universal_context_messages = copy.deepcopy(universal_context_messages)
-
-        # If we have a "system" message as our first message, let's pull that out into "instruction"
-        if universal_context_messages[0].get("role") == "system":
-            system = universal_context_messages.pop(0)
-            content = system.get("content")
-            if isinstance(content, str):
-                system_instruction = content
-            elif isinstance(content, list):
-                system_instruction = content[0].get("text")
-            if system_instruction:
-                self._system_instruction = system_instruction
-
-        # Process remaining messages to fill out conversation history.
-        # Nova Sonic supports "user" and "assistant" messages in history.
-        for universal_context_message in universal_context_messages:
-            message = self._from_universal_context_message(universal_context_message)
-            if message:
-                messages.append(message)
-
-        return self.ConvertedMessages(messages=messages, system_instruction=system_instruction)
-
-    def _from_universal_context_message(self, message) -> AWSNovaSonicConversationHistoryMessage:
-        """Convert standard message format to Nova Sonic format.
-
-        Args:
-            message: Standard message dictionary to convert.
-
-        Returns:
-            Nova Sonic conversation history message, or None if not convertible.
-        """
-        role = message.get("role")
-        if message.get("role") == "user" or message.get("role") == "assistant":
-            content = message.get("content")
-            if isinstance(message.get("content"), list):
-                content = ""
-                for c in message.get("content"):
-                    if c.get("type") == "text":
-                        content += " " + c.get("text")
-                    else:
-                        logger.error(
-                            f"Unhandled content type in context message: {c.get('type')} - {message}"
-                        )
-            # There won't be content if this is an assistant tool call entry.
-            # We're ignoring those since they can't be loaded into AWS Nova Sonic conversation
-            # history
-            if content:
-                return AWSNovaSonicConversationHistoryMessage(role=Role[role.upper()], text=content)
-        # NOTE: we're ignoring messages with role "tool" since they can't be loaded into AWS Nova
-        # Sonic conversation history
+        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")

    @staticmethod
    def _to_aws_nova_sonic_function_format(function: FunctionSchema) -> Dict[str, Any]:
@@ -210,18 +100,4 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
            List of dictionaries in AWS Nova Sonic function format.
        """
        functions_schema = tools_schema.standard_tools
-        standard_tools = [
-            self._to_aws_nova_sonic_function_format(func) for func in functions_schema
-        ]
-
-        # For backward compatibility, AWS Nova Sonic can still be used with
-        # tools in dict format, even though it always uses `LLMContext` under
-        # the hood (via `LLMContext.from_openai_context()`).
-        # To support this behavior, we use "shimmed" custom tools here.
-        # (We maintain this backward compatibility because users aren't
-        # *knowingly* opting into the new `LLMContext`.)
-        shimmed_tools = []
-        if tools_schema.custom_tools:
-            shimmed_tools = tools_schema.custom_tools.get(AdapterType.SHIM, [])
-
-        return standard_tools + shimmed_tools
+        return [self._to_aws_nova_sonic_function_format(func) for func in functions_schema]
--- a/src/pipecat/adapters/services/bedrock_adapter.py
+++ b/src/pipecat/adapters/services/bedrock_adapter.py
@@ -107,7 +107,7 @@ class AWSBedrockLLMAdapter(BaseLLMAdapter[AWSBedrockLLMInvocationParams]):
        system = None
        messages = []

-        # First, map messages using self._from_universal_context_message(m)
+        # first, map messages using self._from_universal_context_message(m)
        try:
            messages = [self._from_universal_context_message(m) for m in universal_context_messages]
        except Exception as e:
@@ -256,22 +256,15 @@ class AWSBedrockLLMAdapter(BaseLLMAdapter[AWSBedrockLLMInvocationParams]):
                    new_content.append({"text": text_content})
                # handle image_url -> image conversion
                if item["type"] == "image_url":
-                    if item["image_url"]["url"].startswith("data:"):
-                        new_item = {
-                            "image": {
-                                "format": "jpeg",
-                                "source": {
-                                    "bytes": base64.b64decode(
-                                        item["image_url"]["url"].split(",")[1]
-                                    )
-                                },
-                            }
+                    new_item = {
+                        "image": {
+                            "format": "jpeg",
+                            "source": {
+                                "bytes": base64.b64decode(item["image_url"]["url"].split(",")[1])
+                            },
                        }
-                        new_content.append(new_item)
-                    else:
-                        url = item["image_url"]["url"]
-                        logger.warning(f"Unsupported 'image_url': {url}")
-
+                    }
+                    new_content.append(new_item)
            # In the case where there's a single image in the list (like what
            # would result from a UserImageRawFrame), ensure that the image
            # comes before text
--- a/src/pipecat/adapters/services/gemini_adapter.py
+++ b/src/pipecat/adapters/services/gemini_adapter.py
@@ -8,8 +8,8 @@

 import base64
 import json
-from dataclasses import dataclass, field
-from typing import Any, Dict, List, Optional, Tuple, TypedDict
+from dataclasses import dataclass
+from typing import Any, Dict, List, Optional, TypedDict

 from loguru import logger
 from openai import NotGiven
@@ -24,7 +24,13 @@ from pipecat.processors.aggregators.llm_context import (
 )

 try:
-    from google.genai.types import Blob, Content, FileData, FunctionCall, FunctionResponse, Part
+    from google.genai.types import (
+        Blob,
+        Content,
+        FunctionCall,
+        FunctionResponse,
+        Part,
+    )
 except ModuleNotFoundError as e:
    logger.error(f"Exception: {e}")
    logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
@@ -81,11 +87,9 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
            Includes both converted standard tools and any custom Gemini-specific tools.
        """
        functions_schema = tools_schema.standard_tools
-        formatted_standard_tools = (
-            [{"function_declarations": [func.to_default_dict() for func in functions_schema]}]
-            if functions_schema
-            else []
-        )
+        formatted_standard_tools = [
+            {"function_declarations": [func.to_default_dict() for func in functions_schema]}
+        ]
        custom_gemini_tools = []
        if tools_schema.custom_tools:
            custom_gemini_tools = tools_schema.custom_tools.get(AdapterType.GEMINI, [])
@@ -127,28 +131,6 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
        messages: List[Content]
        system_instruction: Optional[str] = None

-    @dataclass
-    class MessageConversionResult:
-        """Result of converting a single universal context message to Google format.
-
-        Either content (a Google Content object) or a system instruction string
-        is guaranteed to be set.
-
-        Also returns a tool call ID to name mapping for any tool calls
-        discovered in the message.
-        """
-
-        content: Optional[Content] = None
-        system_instruction: Optional[str] = None
-        tool_call_id_to_name_mapping: Dict[str, str] = field(default_factory=dict)
-
-    @dataclass
-    class MessageConversionParams:
-        """Parameters for converting a single universal context message to Google format."""
-
-        already_have_system_instruction: bool
-        tool_call_id_to_name_mapping: Dict[str, str]
-
    def _from_universal_context_messages(
        self, universal_context_messages: List[LLMContextMessage]
    ) -> ConvertedMessages:
@@ -172,26 +154,24 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
        """
        system_instruction = None
        messages = []
-        tool_call_id_to_name_mapping = {}

        # Process each message, preserving Google-formatted messages and converting others
        for message in universal_context_messages:
-            result = self._from_universal_context_message(
-                message,
-                params=self.MessageConversionParams(
-                    already_have_system_instruction=bool(system_instruction),
-                    tool_call_id_to_name_mapping=tool_call_id_to_name_mapping,
-                ),
-            )
-            # Each result is either a Content or a system instruction
-            if result.content:
-                messages.append(result.content)
-            elif result.system_instruction:
-                system_instruction = result.system_instruction
+            if isinstance(message, LLMSpecificMessage):
+                # Assume that LLMSpecificMessage wraps a message in Google format
+                messages.append(message.message)
+                continue

-            # Merge tool call ID to name mapping
-            if result.tool_call_id_to_name_mapping:
-                tool_call_id_to_name_mapping.update(result.tool_call_id_to_name_mapping)
+            # Convert standard format to Google format
+            converted = self._from_standard_message(
+                message, already_have_system_instruction=bool(system_instruction)
+            )
+            if isinstance(converted, Content):
+                # Regular (non-system) message
+                messages.append(converted)
+            else:
+                # System instruction
+                system_instruction = converted

        # Check if we only have function-related messages (no regular text)
        has_regular_messages = any(
@@ -211,16 +191,9 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):

        return self.ConvertedMessages(messages=messages, system_instruction=system_instruction)

-    def _from_universal_context_message(
-        self, message: LLMContextMessage, *, params: MessageConversionParams
-    ) -> MessageConversionResult:
-        if isinstance(message, LLMSpecificMessage):
-            return self.MessageConversionResult(content=message.message)
-        return self._from_standard_message(message, params=params)
-
    def _from_standard_message(
-        self, message: LLMStandardMessage, *, params: MessageConversionParams
-    ) -> MessageConversionResult:
+        self, message: LLMStandardMessage, already_have_system_instruction: bool
+    ) -> Content | str:
        """Convert standard universal context message to Google Content object.

        Handles conversion of text, images, and function calls to Google's
@@ -230,11 +203,10 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
        Args:
            message: Message in standard universal context format.
            already_have_system_instruction: Whether we already have a system instruction
-            params: Parameters for conversion.

        Returns:
-            MessageConversionResult containing either a Content object or a
-            system instruction string.
+            Content object with role and parts, or a plain string for system
+            messages.

        Examples:
            Standard text message::
@@ -268,49 +240,38 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
            Converts to Google Content with::

                Content(
-                    role="user",
+                    role="model",
                    parts=[Part(function_call=FunctionCall(name="search", args={"query": "test"}))]
                )
        """
        role = message["role"]
        content = message.get("content", [])
-
        if role == "system":
-            if params.already_have_system_instruction:
+            if already_have_system_instruction:
                role = "user"  # Convert system message to user role if we already have a system instruction
            else:
-                system_instruction: str = None
+                # System instructions are returned as plain text
                if isinstance(content, str):
-                    system_instruction = content
+                    return content
                elif isinstance(content, list):
                    # If content is a list, we assume it's a list of text parts, per the standard
-                    system_instruction = " ".join(
-                        part["text"] for part in content if part.get("type") == "text"
-                    )
-                if system_instruction:
-                    return self.MessageConversionResult(system_instruction=system_instruction)
+                    return " ".join(part["text"] for part in content if part.get("type") == "text")
        elif role == "assistant":
            role = "model"

        parts = []
-        tool_call_id_to_name_mapping = {}
-
        if message.get("tool_calls"):
            for tc in message["tool_calls"]:
-                id = tc["id"]
-                name = tc["function"]["name"]
-                tool_call_id_to_name_mapping[id] = name
                parts.append(
                    Part(
                        function_call=FunctionCall(
-                            id=id,
-                            name=name,
+                            name=tc["function"]["name"],
                            args=json.loads(tc["function"]["arguments"]),
                        )
                    )
                )
        elif role == "tool":
-            role = "user"
+            role = "model"
            try:
                response = json.loads(message["content"])
                if isinstance(response, dict):
@@ -321,18 +282,10 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
                # Response might not be JSON-deserializable.
                # This occurs with a UserImageFrame, for example, where we get a plain "COMPLETED" string.
                response_dict = {"value": message["content"]}
-
-            # Get function name from mapping using tool_call_id, or fallback
-            tool_call_id = message.get("tool_call_id")
-            function_name = "tool_call_result"  # Default fallback
-            if tool_call_id and tool_call_id in params.tool_call_id_to_name_mapping:
-                function_name = params.tool_call_id_to_name_mapping[tool_call_id]
-
            parts.append(
                Part(
                    function_response=FunctionResponse(
-                        id=tool_call_id,
-                        name=function_name,
+                        name="tool_call_result",  # seems to work to hard-code the same name every time
                        response=response_dict,
                    )
                )
@@ -343,7 +296,7 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
            for c in content:
                if c["type"] == "text":
                    parts.append(Part(text=c["text"]))
-                elif c["type"] == "image_url" and c["image_url"]["url"].startswith("data:"):
+                elif c["type"] == "image_url":
                    parts.append(
                        Part(
                            inline_data=Blob(
@@ -352,25 +305,9 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
                            )
                        )
                    )
-                elif c["type"] == "image_url":
-                    url = c["image_url"]["url"]
-                    logger.warning(f"Unsupported 'image_url': {url}")
                elif c["type"] == "input_audio":
                    input_audio = c["input_audio"]
                    audio_bytes = base64.b64decode(input_audio["data"])
                    parts.append(Part(inline_data=Blob(mime_type="audio/wav", data=audio_bytes)))
-                elif c["type"] == "file_data":
-                    file_data = c["file_data"]
-                    parts.append(
-                        Part(
-                            file_data=FileData(
-                                mime_type=file_data.get("mime_type"),
-                                file_uri=file_data.get("file_uri"),
-                            )
-                        )
-                    )

-        return self.MessageConversionResult(
-            content=Content(role=role, parts=parts),
-            tool_call_id_to_name_mapping=tool_call_id_to_name_mapping,
-        )
+        return Content(role=role, parts=parts)
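
Review note on the hunks above: the removed lines tracked a tool call ID to function name mapping so that each function response carried the name of the call it answers, while the added lines hard-code `"tool_call_result"`. The sketch below illustrates the removed bookkeeping with plain dictionaries only. `pair_tool_results` and `FALLBACK_NAME` are hypothetical names introduced for this note, the message shapes are assumed from the fields the diff reads (`tool_calls`, `tool_call_id`, `content`), and no Google SDK types are involved.

```python
import json
from typing import Any, Dict, List

# Same fallback string that the diff uses when no name is known.
FALLBACK_NAME = "tool_call_result"


def pair_tool_results(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one record per "tool" message, labelled with the calling function's name."""
    id_to_name: Dict[str, str] = {}
    results: List[Dict[str, Any]] = []
    for message in messages:
        # Remember which function each tool call ID belongs to.
        for tc in message.get("tool_calls") or []:
            id_to_name[tc["id"]] = tc["function"]["name"]
        if message.get("role") == "tool":
            try:
                response = json.loads(message["content"])
            except (TypeError, ValueError):
                # Content is not JSON, e.g. a plain "COMPLETED" string.
                response = message["content"]
            if not isinstance(response, dict):
                response = {"value": response}
            tool_call_id = message.get("tool_call_id")
            results.append(
                {
                    "id": tool_call_id,
                    "name": id_to_name.get(tool_call_id, FALLBACK_NAME),
                    "response": response,
                }
            )
    return results
```

With the mapping in place, a result whose `tool_call_id` matches an earlier call is labelled with that call's function name, and only unmatched results fall back to the fixed string.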
--- a/src/pipecat/adapters/services/open_ai_realtime_adapter.py
+++ b/src/pipecat/adapters/services/open_ai_realtime_adapter.py
@@ -6,18 +6,12 @@

 """OpenAI Realtime LLM adapter for Pipecat."""

-import copy
-import json
-from dataclasses import dataclass
-from typing import Any, Dict, List, Optional, TypedDict
-
-from loguru import logger
+from typing import Any, Dict, List, TypedDict

 from pipecat.adapters.base_llm_adapter import BaseLLMAdapter
 from pipecat.adapters.schemas.function_schema import FunctionSchema
-from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
-from pipecat.processors.aggregators.llm_context import LLMContext, LLMContextMessage
-from pipecat.services.openai.realtime import events
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.processors.aggregators.llm_context import LLMContext


 class OpenAIRealtimeLLMInvocationParams(TypedDict):
@@ -26,9 +20,7 @@ class OpenAIRealtimeLLMInvocationParams(TypedDict):
    This is a placeholder until support for universal LLMContext machinery is added for OpenAI Realtime.
    """

-    system_instruction: Optional[str]
-    messages: List[events.ConversationItem]
-    tools: List[Dict[str, Any]]
+    pass


 class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
@@ -41,7 +33,7 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
    @property
    def id_for_llm_specific_messages(self) -> str:
        """Get the identifier used in LLMSpecificMessage instances for OpenAI Realtime."""
-        return "openai-realtime"
+        raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")

    def get_llm_invocation_params(self, context: LLMContext) -> OpenAIRealtimeLLMInvocationParams:
        """Get OpenAI Realtime-specific LLM invocation parameters from a universal LLM context.
@@ -54,13 +46,7 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
        Returns:
            Dictionary of parameters for invoking OpenAI Realtime's API.
        """
-        messages = self._from_universal_context_messages(self.get_messages(context))
-        return {
-            "system_instruction": messages.system_instruction,
-            "messages": messages.messages,
-            # NOTE: LLMContext's tools are guaranteed to be a ToolsSchema (or NOT_GIVEN)
-            "tools": self.from_standard_tools(context.tools) or [],
-        }
+        raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")

    def get_messages_for_logging(self, context) -> List[Dict[str, Any]]:
        """Get messages from a universal LLM context in a format ready for logging about OpenAI Realtime.
@@ -75,124 +61,7 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
        Returns:
            List of messages in a format ready for logging about OpenAI Realtime.
        """
-        # NOTE: this is the same as in OpenAIAdapter, as that's what it was
-        # prior to a refactor. Worth noting that for OpenAI Realtime
-        # specifically, not everything handled here is necessarily supported
-        # (or supported yet).
-        msgs = []
-        for message in self.get_messages(context):
-            msg = copy.deepcopy(message)
-            if "content" in msg:
-                if isinstance(msg["content"], list):
-                    for item in msg["content"]:
-                        if item["type"] == "image_url":
-                            if item["image_url"]["url"].startswith("data:image/"):
-                                item["image_url"]["url"] = "data:image/..."
-                        if item["type"] == "input_audio":
-                            item["input_audio"]["data"] = "..."
-            if "mime_type" in msg and msg["mime_type"].startswith("image/"):
-                msg["data"] = "..."
-            msgs.append(msg)
-        return msgs
-
-    @dataclass
-    class ConvertedMessages:
-        """Container for OpenAI-formatted messages converted from universal context."""
-
-        messages: List[events.ConversationItem]
-        system_instruction: Optional[str] = None
-
-    def _from_universal_context_messages(
-        self, universal_context_messages: List[LLMContextMessage]
-    ) -> ConvertedMessages:
-        # We can't load a long conversation history into the openai realtime api yet. (The API/model
-        # forgets that it can do audio, if you do a series of `conversation.item.create` calls.) So
-        # our general strategy until this is fixed is just to put everything into a first "user"
-        # message as a single input.
-
-        if not universal_context_messages:
-            return self.ConvertedMessages(messages=[])
-
-        messages = copy.deepcopy(universal_context_messages)
-        system_instruction = None
-
-        # If we have a "system" message as our first message, let's pull that out into session
-        # "instructions"
-        if messages[0].get("role") == "system":
-            system = messages.pop(0)
-            content = system.get("content")
-            if isinstance(content, str):
-                system_instruction = content
-            elif isinstance(content, list):
-                system_instruction = content[0].get("text")
-            if not messages:
-                return self.ConvertedMessages(messages=[], system_instruction=system_instruction)
-
-        # If we have just a single "user" item, we can just send it normally
-        if len(messages) == 1 and messages[0].get("role") == "user":
-            return self.ConvertedMessages(
-                messages=[self._from_universal_context_message(messages[0])],
-                system_instruction=system_instruction,
-            )
-
-        # Otherwise, let's pack everything into a single "user" message with a bit of
-        # explanation for the LLM
-        intro_text = """
-        This is a previously saved conversation. Please treat this conversation history as a
-        starting point for the current conversation."""
-
-        trailing_text = """
-        This is the end of the previously saved conversation. Please continue the conversation
-        from here. If the last message is a user instruction or question, act on that instruction
-        or answer the question. If the last message is an assistant response, simple say that you
-        are ready to continue the conversation."""
-
-        return self.ConvertedMessages(
-            messages=[
-                {
-                    "role": "user",
-                    "type": "message",
-                    "content": [
-                        {
-                            "type": "input_text",
-                            "text": "\n\n".join(
-                                [intro_text, json.dumps(messages, indent=2), trailing_text]
-                            ),
-                        }
-                    ],
-                }
-            ],
-            system_instruction=system_instruction,
-        )
-
-    def _from_universal_context_message(
-        self, message: LLMContextMessage
-    ) -> events.ConversationItem:
-        if message.get("role") == "user":
-            content = message.get("content")
-            if isinstance(message.get("content"), list):
-                content = ""
-                for c in message.get("content"):
-                    if c.get("type") == "text":
-                        content += " " + c.get("text")
-                    else:
-                        logger.error(
-                            f"Unhandled content type in context message: {c.get('type')} - {message}"
-                        )
-            return events.ConversationItem(
-                role="user",
-                type="message",
-                content=[events.ItemContent(type="input_text", text=content)],
-            )
-        if message.get("role") == "assistant" and message.get("tool_calls"):
-            tc = message.get("tool_calls")[0]
-            return events.ConversationItem(
-                type="function_call",
-                call_id=tc["id"],
-                name=tc["function"]["name"],
-                arguments=tc["function"]["arguments"],
-            )
-        logger.error(f"Unhandled message type in _from_universal_context_message: {message}")
+        raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")

    @staticmethod
    def _to_openai_realtime_function_format(function: FunctionSchema) -> Dict[str, Any]:
@@ -225,18 +94,4 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
            List of function definitions in OpenAI Realtime format.
        """
        functions_schema = tools_schema.standard_tools
-        standard_tools = [
-            self._to_openai_realtime_function_format(func) for func in functions_schema
-        ]
-
-        # For backward compatibility, OpenAI Realtime can still be used with
-        # tools in dict format, even though it always uses `LLMContext` under
-        # the hood (via `LLMContext.from_openai_context()`).
-        # To support this behavior, we use "shimmed" custom tools here.
-        # (We maintain this backward compatibility because users aren't
-        # *knowingly* opting into the new `LLMContext`.)
-        shimmed_tools = []
-        if tools_schema.custom_tools:
-            shimmed_tools = tools_schema.custom_tools.get(AdapterType.SHIM, [])
-
-        return standard_tools + shimmed_tools
+        return [self._to_openai_realtime_function_format(func) for func in functions_schema]
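
Review note on the hunks above: the removed `_from_universal_context_messages` pulled a leading system message out as session instructions and packed any longer history into a single "user" message wrapped in explanatory text. The sketch below restates that strategy with plain dictionaries. `pack_history`, `INTRO`, and `OUTRO` are hypothetical names, the wrapper texts are shortened paraphrases, and, unlike the removed code, a lone user message is returned unchanged rather than converted to an `events.ConversationItem`.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

# Shortened stand-ins for the intro and trailing texts in the removed code.
INTRO = "This is a previously saved conversation. Please treat it as a starting point."
OUTRO = "This is the end of the previously saved conversation. Please continue from here."


def pack_history(
    messages: List[Dict[str, Any]],
) -> Tuple[Optional[str], List[Dict[str, Any]]]:
    """Split off a leading system message and pack the rest into one user message."""
    remaining = list(messages)
    system_instruction: Optional[str] = None

    # A leading "system" message becomes the session instructions.
    if remaining and remaining[0].get("role") == "system":
        content = remaining.pop(0).get("content")
        if isinstance(content, str):
            system_instruction = content
        elif isinstance(content, list) and content:
            system_instruction = content[0].get("text")

    if not remaining:
        return system_instruction, []

    # A single user message can be sent as it is.
    if len(remaining) == 1 and remaining[0].get("role") == "user":
        return system_instruction, remaining

    # Otherwise serialize the whole history into one input_text part.
    text = "\n\n".join([INTRO, json.dumps(remaining, indent=2), OUTRO])
    packed = {
        "role": "user",
        "type": "message",
        "content": [{"type": "input_text", "text": text}],
    }
    return system_instruction, [packed]
```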
--- a/src/pipecat/audio/filters/krisp_viva_filter.py
+++ b/src/pipecat/audio/filters/krisp_viva_filter.py
@@ -1,193 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-"""Krisp noise reduction audio filter for Pipecat.
-
-This module provides an audio filter implementation using Krisp VIVA SDK.
-"""
-
-import os
-
-import numpy as np
-from loguru import logger
-
-from pipecat.audio.filters.base_audio_filter import BaseAudioFilter
-from pipecat.frames.frames import FilterControlFrame, FilterEnableFrame
-
-try:
-    import krisp_audio
-except ModuleNotFoundError as e:
-    logger.error(f"Exception: {e}")
-    logger.error("In order to use the Krisp filter, you need to install krisp_audio.")
-    raise Exception(f"Missing module: {e}")
-
-
-def _log_callback(log_message, log_level):
-    logger.info(f"[{log_level}] {log_message}")
-
-
-class KrispVivaFilter(BaseAudioFilter):
-    """Audio filter using the Krisp VIVA SDK.
-
-    Provides real-time noise reduction for audio streams using Krisp's
-    proprietary noise suppression algorithms. This filter requires a
-    valid Krisp model file to operate.
-
-    Supported sample rates:
-        - 8000 Hz
-        - 16000 Hz
-        - 24000 Hz
-        - 32000 Hz
-        - 44100 Hz
-        - 48000 Hz
-    """
-
-    # Initialize Krisp Audio SDK globally
-    krisp_audio.globalInit("", _log_callback, krisp_audio.LogLevel.Off)
-    SDK_VERSION = krisp_audio.getVersion()
-    logger.debug(
-        f"Krisp Audio Python SDK Version: {SDK_VERSION.major}."
-        f"{SDK_VERSION.minor}.{SDK_VERSION.patch}"
-    )
-
-    SAMPLE_RATES = {
-        8000: krisp_audio.SamplingRate.Sr8000Hz,
-        16000: krisp_audio.SamplingRate.Sr16000Hz,
-        24000: krisp_audio.SamplingRate.Sr24000Hz,
-        32000: krisp_audio.SamplingRate.Sr32000Hz,
-        44100: krisp_audio.SamplingRate.Sr44100Hz,
-        48000: krisp_audio.SamplingRate.Sr48000Hz,
-    }
-
-    FRAME_SIZE_MS = 10  # Krisp requires audio frames of 10ms duration for processing.
-
-    def __init__(self, model_path: str = None, noise_suppression_level: int = 100) -> None:
-        """Initialize the Krisp noise reduction filter.
-
-        Args:
-            model_path: Path to the Krisp model file (.kef extension).
-                If None, uses KRISP_VIVA_MODEL_PATH environment variable.
-            noise_suppression_level: Noise suppression level.
-
-        Raises:
-            ValueError: If model_path is not provided and KRISP_VIVA_MODEL_PATH is not set.
-            Exception: If model file doesn't have .kef extension.
-            FileNotFoundError: If model file doesn't exist.
-        """
-        super().__init__()
-
-        # Set model path, checking environment if not specified
-        self._model_path = model_path or os.getenv("KRISP_VIVA_MODEL_PATH")
-        if not self._model_path:
-            logger.error("Model path is not provided and KRISP_VIVA_MODEL_PATH is not set.")
-            raise ValueError("Model path for KrispAudioProcessor must be provided.")
-
-        if not self._model_path.endswith(".kef"):
-            raise Exception("Model is expected with .kef extension")
-
-        if not os.path.isfile(self._model_path):
-            raise FileNotFoundError(f"Model file not found: {self._model_path}")
-
-        self._filtering = True
-        self._session = None
-        self._samples_per_frame = None
-        self._noise_suppression_level = noise_suppression_level
-
-        # Audio buffer to accumulate samples for complete frames
-        self._audio_buffer = bytearray()
-
-    def _int_to_sample_rate(self, sample_rate):
-        """Convert integer sample rate to krisp_audio SamplingRate enum.
-
-        Args:
-            sample_rate: Sample rate as integer
-
-        Returns:
-            krisp_audio.SamplingRate enum value
-
-        Raises:
-            ValueError: If sample rate is not supported
-        """
-        if sample_rate not in self.SAMPLE_RATES:
-            raise ValueError("Unsupported sample rate")
-        return self.SAMPLE_RATES[sample_rate]
-
-    async def start(self, sample_rate: int):
-        """Initialize the Krisp processor with the transport's sample rate.
-
-        Args:
-            sample_rate: The sample rate of the input transport in Hz.
-        """
-        model_info = krisp_audio.ModelInfo()
-        model_info.path = self._model_path
-
-        nc_cfg = krisp_audio.NcSessionConfig()
-        nc_cfg.inputSampleRate = self._int_to_sample_rate(sample_rate)
-        nc_cfg.inputFrameDuration = krisp_audio.FrameDuration.Fd10ms
-        nc_cfg.outputSampleRate = nc_cfg.inputSampleRate
-        nc_cfg.modelInfo = model_info
-
-        self._samples_per_frame = int((sample_rate * self.FRAME_SIZE_MS) / 1000)
-        self._session = krisp_audio.NcInt16.create(nc_cfg)
-
-    async def stop(self):
-        """Clean up the Krisp processor when stopping."""
-        self._session = None
-
-    async def process_frame(self, frame: FilterControlFrame):
-        """Process control frames to enable/disable filtering.
-
-        Args:
-            frame: The control frame containing filter commands.
-        """
-        if isinstance(frame, FilterEnableFrame):
-            self._filtering = frame.enable
-
-    async def filter(self, audio: bytes) -> bytes:
-        """Apply Krisp noise reduction to audio data.
-
-        Args:
-            audio: Raw audio data as bytes to be filtered.
-
-        Returns:
-            Noise-reduced audio data as bytes.
-        """
-        if not self._filtering:
-            return audio
-
-        # Add incoming audio to our buffer
-        self._audio_buffer.extend(audio)
-
-        # Calculate how many complete frames we can process
-        total_samples = len(self._audio_buffer) // 2  # 2 bytes per int16 sample
-        num_complete_frames = total_samples // self._samples_per_frame
-
-        if num_complete_frames == 0:
-            # Not enough samples for a complete frame yet, return empty
-            return b""
-
-        # Calculate how many bytes we need for complete frames
-        complete_samples_count = num_complete_frames * self._samples_per_frame
-        bytes_to_process = complete_samples_count * 2  # 2 bytes per sample
-
-        # Extract the bytes we can process
-        audio_to_process = bytes(self._audio_buffer[:bytes_to_process])
-
-        # Remove processed bytes from buffer, keep the remainder
-        self._audio_buffer = self._audio_buffer[bytes_to_process:]
-
-        # Process the complete frames
-        samples = np.frombuffer(audio_to_process, dtype=np.int16)
-        frames = samples.reshape(-1, self._samples_per_frame)
-        processed_samples = np.empty_like(samples)
-
-        for i, frame in enumerate(frames):
-            cleaned_frame = self._session.process(frame, self._noise_suppression_level)
-            processed_samples[i * self._samples_per_frame : (i + 1) * self._samples_per_frame] = (
-                cleaned_frame
-            )
-
-        return processed_samples.tobytes()
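
Review note on the deleted file above: its `filter()` method accumulated incoming bytes and only processed whole 10 ms frames, keeping the remainder for the next call. The sketch below shows that buffering pattern on its own, using only the standard library. `FrameBuffer` and its `process` callback are hypothetical stand-ins, and the callback takes the place of the Krisp session call, which is not reproduced here.

```python
from typing import Callable

BYTES_PER_SAMPLE = 2  # 16-bit PCM, as in the deleted filter
FRAME_MS = 10  # frame duration used by the deleted filter


class FrameBuffer:
    """Accumulate audio and hand only complete frames to a processing callback."""

    def __init__(self, sample_rate: int, process: Callable[[bytes], bytes]) -> None:
        samples_per_frame = sample_rate * FRAME_MS // 1000
        self._frame_bytes = samples_per_frame * BYTES_PER_SAMPLE
        self._process = process
        self._pending = bytearray()

    def feed(self, audio: bytes) -> bytes:
        """Return processed audio for every complete frame available so far."""
        self._pending.extend(audio)
        usable = len(self._pending) // self._frame_bytes * self._frame_bytes
        if usable == 0:
            # Not enough data for one frame yet.
            return b""
        chunk = bytes(self._pending[:usable])
        del self._pending[:usable]  # keep the partial frame for the next call
        out = bytearray()
        for start in range(0, usable, self._frame_bytes):
            out += self._process(chunk[start : start + self._frame_bytes])
        return bytes(out)
```

One consequence of this design is that a call can return fewer bytes than it received, or none at all, so callers should not assume the output length matches the input length.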
--- a/src/pipecat/audio/turn/base_turn_analyzer.py
+++ b/src/pipecat/audio/turn/base_turn_analyzer.py
@@ -14,8 +14,6 @@ from abc import ABC, abstractmethod
 from enum import Enum
 from typing import Optional, Tuple

-from pydantic import BaseModel
-
 from pipecat.metrics.metrics import MetricsData


@@ -31,12 +29,6 @@ class EndOfTurnState(Enum):
    INCOMPLETE = 2


-class BaseTurnParams(BaseModel):
-    """Base class for turn analyzer parameters."""
-
-    pass
-
-
 class BaseTurnAnalyzer(ABC):
    """Abstract base class for analyzing user end of turn.

@@ -86,7 +78,7 @@ class BaseTurnAnalyzer(ABC):

    @property
    @abstractmethod
-    def params(self) -> BaseTurnParams:
+    def params(self):
        """Get the current turn analyzer parameters.

        Returns:
--- a/src/pipecat/audio/turn/smart_turn/base_smart_turn.py
+++ b/src/pipecat/audio/turn/smart_turn/base_smart_turn.py
@@ -11,17 +11,15 @@ machine learning models to determine when a user has finished speaking, going
 beyond simple silence-based detection.
 """

-import asyncio
 import time
 from abc import abstractmethod
-from concurrent.futures import ThreadPoolExecutor
 from typing import Any, Dict, Optional, Tuple

 import numpy as np
 from loguru import logger
 from pydantic import BaseModel

-from pipecat.audio.turn.base_turn_analyzer import BaseTurnAnalyzer, BaseTurnParams, EndOfTurnState
+from pipecat.audio.turn.base_turn_analyzer import BaseTurnAnalyzer, EndOfTurnState
 from pipecat.metrics.metrics import MetricsData, SmartTurnMetricsData

 # Default timing parameters
@@ -31,7 +29,7 @@ MAX_DURATION_SECONDS = 8  # Max allowed segment duration
 USE_ONLY_LAST_VAD_SEGMENT = True


-class SmartTurnParams(BaseTurnParams):
+class SmartTurnParams(BaseModel):
    """Configuration parameters for smart turn analysis.

    Parameters:
@@ -79,9 +77,6 @@ class BaseSmartTurn(BaseTurnAnalyzer):
        self._speech_triggered = False
        self._silence_ms = 0
        self._speech_start_time = 0
-        # Thread executor that will run the model. We only need one thread per
-        # analyzer because one analyzer just handles one audio stream.
-        self._executor = ThreadPoolExecutor(max_workers=1)

    @property
    def speech_triggered(self) -> bool:
@@ -156,10 +151,7 @@ class BaseSmartTurn(BaseTurnAnalyzer):
            Tuple containing the end-of-turn state and optional metrics data
            from the ML model analysis.
        """
-        loop = asyncio.get_running_loop()
-        state, result = await loop.run_in_executor(
-            self._executor, self._process_speech_segment, self._audio_buffer
-        )
+        state, result = await self._process_speech_segment(self._audio_buffer)
        if state == EndOfTurnState.COMPLETE or USE_ONLY_LAST_VAD_SEGMENT:
            self._clear(state)
        logger.debug(f"End of Turn result: {state}")
@@ -177,7 +169,9 @@ class BaseSmartTurn(BaseTurnAnalyzer):
        self._speech_start_time = 0
        self._silence_ms = 0

-    def _process_speech_segment(self, audio_buffer) -> Tuple[EndOfTurnState, Optional[MetricsData]]:
+    async def _process_speech_segment(
+        self, audio_buffer
+    ) -> Tuple[EndOfTurnState, Optional[MetricsData]]:
        """Process accumulated audio segment using ML model."""
        state = EndOfTurnState.INCOMPLETE

@@ -209,7 +203,7 @@ class BaseSmartTurn(BaseTurnAnalyzer):
        if len(segment_audio) > 0:
            start_time = time.perf_counter()
            try:
-                result = self._predict_endpoint(segment_audio)
+                result = await self._predict_endpoint(segment_audio)
                state = (
                    EndOfTurnState.COMPLETE
                    if result["prediction"] == 1
@@ -255,6 +249,6 @@ class BaseSmartTurn(BaseTurnAnalyzer):
        return state, result_data

    @abstractmethod
-    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using ML model from audio data."""
        pass
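
Review note on the hunks above: the removed lines ran `_process_speech_segment` on a single-worker `ThreadPoolExecutor` through `loop.run_in_executor`, whereas the added lines await it directly. As far as this diff shows, a subclass whose `_predict_endpoint` does blocking model inference would then run that work on the event loop thread. The sketch below illustrates the removed pattern in isolation. `Analyzer`, `blocking_predict`, and the heartbeat task are hypothetical, and the `time.sleep` call stands in for inference.

```python
import asyncio
import contextlib
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_predict(audio: bytes) -> dict:
    """Stand-in for a blocking model call."""
    time.sleep(0.05)
    return {"prediction": 1 if audio else 0}


class Analyzer:
    def __init__(self) -> None:
        # One worker is enough when one analyzer handles one audio stream.
        self._executor = ThreadPoolExecutor(max_workers=1)

    async def analyze(self, audio: bytes) -> dict:
        loop = asyncio.get_running_loop()
        # The blocking call runs on the worker thread; the loop stays free.
        return await loop.run_in_executor(self._executor, blocking_predict, audio)

    def close(self) -> None:
        self._executor.shutdown()


async def main() -> dict:
    analyzer = Analyzer()
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets to run during the analysis.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(heartbeat())
    result = await analyzer.analyze(b"\x00\x01")
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    analyzer.close()
    return {"result": result, "ticks": ticks}
```

Running `asyncio.run(main())` should show the heartbeat advancing while the prediction is in flight, which is the property the executor provided.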
--- a/src/pipecat/audio/turn/smart_turn/http_smart_turn.py
+++ b/src/pipecat/audio/turn/smart_turn/http_smart_turn.py
@@ -104,15 +104,11 @@ class HttpSmartTurnAnalyzer(BaseSmartTurn):
            logger.error(f"Failed to send raw request to Daily Smart Turn: {e}")
            raise Exception("Failed to send raw request to Daily Smart Turn.")

-    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using remote HTTP ML service."""
        try:
            serialized_array = self._serialize_array(audio_array)
-            loop = asyncio.get_running_loop()
-            future = asyncio.run_coroutine_threadsafe(
-                self._send_raw_request(serialized_array), loop
-            )
-            return future.result()
+            return await self._send_raw_request(serialized_array)
        except Exception as e:
            logger.error(f"Smart turn prediction failed: {str(e)}")
            # Return an incomplete prediction when a failure occurs
--- a/src/pipecat/audio/turn/smart_turn/local_smart_turn.py
+++ b/src/pipecat/audio/turn/smart_turn/local_smart_turn.py
@@ -64,7 +64,7 @@ class LocalSmartTurnAnalyzer(BaseSmartTurn):
        self._turn_model.eval()
        logger.debug("Loaded Local Smart Turn")

-    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using local PyTorch model."""
        inputs = self._turn_processor(
            audio_array,
--- a/src/pipecat/audio/turn/smart_turn/local_smart_turn_v2.py
+++ b/src/pipecat/audio/turn/smart_turn/local_smart_turn_v2.py
@@ -73,7 +73,7 @@ class LocalSmartTurnAnalyzerV2(BaseSmartTurn):
        self._turn_model.eval()
        logger.debug("Loaded Local Smart Turn v2")

-    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using local PyTorch model."""
        inputs = self._turn_processor(
            audio_array,
--- a/src/pipecat/audio/turn/smart_turn/local_smart_turn_v3.py
+++ b/src/pipecat/audio/turn/smart_turn/local_smart_turn_v3.py
@@ -77,7 +77,7 @@ class LocalSmartTurnAnalyzerV3(BaseSmartTurn):

        logger.debug("Loaded Local Smart Turn v3")

-    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using local ONNX model."""

        def truncate_audio_to_last_n_seconds(audio_array, n_seconds=8, sample_rate=16000):
--- a/src/pipecat/audio/vad/vad_analyzer.py
+++ b/src/pipecat/audio/vad/vad_analyzer.py
@@ -11,9 +11,7 @@ data structures for voice activity detection in audio streams. Includes state
 management, parameter configuration, and audio analysis framework.
 """

-import asyncio
 from abc import ABC, abstractmethod
-from concurrent.futures import ThreadPoolExecutor
 from enum import Enum
 from typing import Optional

@@ -86,10 +84,6 @@ class VADAnalyzer(ABC):
        self._smoothing_factor = 0.2
        self._prev_volume = 0

-        # Thread executor that will run the model. We only need one thread per
-        # analyzer because one analyzer just handles one audio stream.
-        self._executor = ThreadPoolExecutor(max_workers=1)
-
    @property
    def sample_rate(self) -> int:
        """Get the current sample rate.
@@ -171,7 +165,7 @@ class VADAnalyzer(ABC):
        volume = calculate_audio_volume(audio, self.sample_rate)
        return exp_smoothing(volume, self._prev_volume, self._smoothing_factor)

-    async def analyze_audio(self, buffer: bytes) -> VADState:
+    def analyze_audio(self, buffer) -> VADState:
        """Analyze audio buffer and return current VAD state.

        Processes incoming audio data, maintains internal state, and determines
@@ -183,12 +177,6 @@ class VADAnalyzer(ABC):
        Returns:
            Current VAD state after processing the buffer.
        """
-        loop = asyncio.get_running_loop()
-        state = await loop.run_in_executor(self._executor, self._run_analyzer, buffer)
-        return state
-
-    def _run_analyzer(self, buffer: bytes) -> VADState:
-        """Analyze audio buffer and return current VAD state."""
        self._vad_buffer += buffer

        num_required_bytes = self._vad_frames_num_bytes
--- a/src/pipecat/frames/frames.py
+++ b/src/pipecat/frames/frames.py
@@ -672,7 +672,7 @@ class TTSSpeakFrame(DataFrame):


@dataclass
-class OutputTransportMessageFrame(DataFrame):
+class TransportMessageFrame(DataFrame):
    """Frame containing transport-specific message data.

    Parameters:
@@ -685,32 +685,6 @@ class OutputTransportMessageFrame(DataFrame):
        return f"{self.name}(message: {self.message})"


-@dataclass
-class TransportMessageFrame(OutputTransportMessageFrame):
-    """Frame containing transport-specific message data.
-
-    .. deprecated:: 0.0.87
-        This frame is deprecated and will be removed in a future version.
-        Instead, use `OutputTransportMessageFrame`.
-
-    Parameters:
-        message: The transport message payload.
-    """
-
-    def __post_init__(self):
-        super().__post_init__()
-        import warnings
-
-        with warnings.catch_warnings():
-            warnings.simplefilter("always")
-            warnings.warn(
-                "TransportMessageFrame is deprecated and will be removed in a future version. "
-                "Instead, use OutputTransportMessageFrame.",
-                DeprecationWarning,
-                stacklevel=2,
-            )
-
-
@dataclass
 class DTMFFrame:
    """Base class for DTMF (Dual-Tone Multi-Frequency) keypad frames.
@@ -1118,8 +1092,8 @@ class STTMuteFrame(SystemFrame):


@dataclass
-class InputTransportMessageFrame(SystemFrame):
-    """Frame for transport messages received from external sources.
+class TransportMessageUrgentFrame(SystemFrame):
+    """Frame for urgent transport messages that need immediate processing.

    Parameters:
        message: The urgent transport message payload.
@@ -1132,92 +1106,46 @@ class InputTransportMessageFrame(SystemFrame):


@dataclass
-class InputTransportMessageUrgentFrame(InputTransportMessageFrame):
+class InputTransportMessageUrgentFrame(TransportMessageUrgentFrame):
    """Frame for transport messages received from external sources.

-    .. deprecated:: 0.0.87
-        This frame is deprecated and will be removed in a future version.
-        Instead, use `InputTransportMessageFrame`.
+    This frame wraps incoming transport messages to distinguish them from outgoing
+    urgent transport messages (TransportMessageUrgentFrame), preventing infinite
+    message loops in the transport layer. It inherits the message payload from
+    TransportMessageUrgentFrame while marking the message as having been received
+    rather than generated locally.

-    Parameters:
-        message: The urgent transport message payload.
+    Used by transport implementations to properly handle bidirectional message
+    flow without creating feedback loops.
    """

-    def __post_init__(self):
-        super().__post_init__()
-        import warnings
-
-        with warnings.catch_warnings():
-            warnings.simplefilter("always")
-            warnings.warn(
-                "InputTransportMessageUrgentFrame is deprecated and will be removed in a future version. "
-                "Instead, use InputTransportMessageFrame.",
-                DeprecationWarning,
-                stacklevel=2,
-            )
-
-
-@dataclass
-class OutputTransportMessageUrgentFrame(SystemFrame):
-    """Frame for urgent transport messages that need to be sent immediately.
-
-    Parameters:
-        message: The urgent transport message payload.
-    """
-
-    message: Any
-
-    def __str__(self):
-        return f"{self.name}(message: {self.message})"
-
-
-@dataclass
-class TransportMessageUrgentFrame(OutputTransportMessageUrgentFrame):
-    """Frame for urgent transport messages that need to be sent immediately.
-
-    .. deprecated:: 0.0.87
-        This frame is deprecated and will be removed in a future version.
-        Instead, use `OutputTransportMessageUrgentFrame`.
-
-    Parameters:
-        message: The urgent transport message payload.
-    """
-
-    def __post_init__(self):
-        super().__post_init__()
-        import warnings
-
-        with warnings.catch_warnings():
-            warnings.simplefilter("always")
-            warnings.warn(
-                "TransportMessageUrgentFrame is deprecated and will be removed in a future version. "
-                "Instead, use OutputTransportMessageFrame.",
-                DeprecationWarning,
-                stacklevel=2,
-            )
+    pass


@dataclass
 class UserImageRequestFrame(SystemFrame):
    """Frame requesting an image from a specific user.

-    A frame to request an image from the given user. The request might come with
-    a text that can be later used to describe the requested image.
+    A frame to request an image from the given user. The frame might be
+    generated by a function call, in which case the corresponding fields will be
+    properly set.

    Parameters:
        user_id: Identifier of the user to request image from.
-        text: An optional text associated to the image request.
-        append_to_context: Whether the requested image should be appended to the LLM context.
+        context: Optional context for the image request.
+        function_name: Name of function that generated this request (if any).
+        tool_call_id: Tool call ID if generated by function call.
        video_source: Specific video source to capture from.
    """

    user_id: str
-    text: Optional[str] = None
-    append_to_context: Optional[bool] = None
+    context: Optional[Any] = None
+    function_name: Optional[str] = None
+    tool_call_id: Optional[str] = None
    video_source: Optional[str] = None

    def __str__(self):
-        return f"{self.name}(user: {self.user_id}, text: {self.text}, append_to_context: {self.append_to_context}, {self.video_source})"
+        return f"{self.name}(user: {self.user_id}, video_source: {self.video_source}, function: {self.function_name}, request: {self.tool_call_id})"


@dataclass
@@ -1291,17 +1219,15 @@ class UserImageRawFrame(InputImageRawFrame):

    Parameters:
        user_id: Identifier of the user who provided this image.
-        text: An optional text associated to this image.
-        append_to_context: Whether the requested image should be appended to the LLM context.
+        request: The original image request frame if this is a response.
    """

    user_id: str = ""
-    text: Optional[str] = None
-    append_to_context: Optional[bool] = None
+    request: Optional[UserImageRequestFrame] = None

    def __str__(self):
        pts = format_pts(self.pts)
-        return f"{self.name}(pts: {pts}, user: {self.user_id}, source: {self.transport_source}, size: {self.size}, format: {self.format}, text: {self.text}, append_to_context: {self.append_to_context})"
+        return f"{self.name}(pts: {pts}, user: {self.user_id}, source: {self.transport_source}, size: {self.size}, format: {self.format}, request: {self.request})"


@dataclass
--- a/src/pipecat/pipeline/llm_switcher.py
+++ b/src/pipecat/pipeline/llm_switcher.py
@@ -14,41 +14,20 @@ from pipecat.services.llm_service import LLMService


 class LLMSwitcher(ServiceSwitcher[StrategyType]):
-    """A pipeline that switches between different LLMs at runtime.
-
-    Example::
-
-        llm_switcher = LLMSwitcher(
-            llms=[openai_llm, anthropic_llm],
-            strategy_type=ServiceSwitcherStrategyManual
-        )
-    """
+    """A pipeline that switches between different LLMs at runtime."""

    def __init__(self, llms: List[LLMService], strategy_type: Type[StrategyType]):
-        """Initialize the service switcher with a list of LLMs and a switching strategy.
-
-        Args:
-            llms: List of LLM services to switch between.
-            strategy_type: The strategy class to use for switching between LLMs.
-        """
+        """Initialize the service switcher with a list of LLMs and a switching strategy."""
        super().__init__(llms, strategy_type)

    @property
    def llms(self) -> List[LLMService]:
-        """Get the list of LLMs managed by this switcher.
-
-        Returns:
-            List of LLM services managed by this switcher.
-        """
+        """Get the list of LLMs managed by this switcher."""
        return self.services

    @property
    def active_llm(self) -> Optional[LLMService]:
-        """Get the currently active LLM.
-
-        Returns:
-            The currently active LLM service, or None if no LLM is active.
-        """
+        """Get the currently active LLM, if any."""
        return self.strategy.active_service

    async def run_inference(self, context: LLMContext) -> Optional[str]:
--- a/src/pipecat/pipeline/runner.py
+++ b/src/pipecat/pipeline/runner.py
@@ -70,15 +70,11 @@ class PipelineRunner(BaseObject):
        """
        logger.debug(f"Runner {self} started running {task}")
        self._tasks[task.name] = task
-
-        # PipelineTask handles asyncio.CancelledError to shutdown the pipeline
-        # properly and re-raises it in case there's more cleanup to do.
+        params = PipelineTaskParams(loop=self._loop)
        try:
-            params = PipelineTaskParams(loop=self._loop)
            await task.run(params)
        except asyncio.CancelledError:
-            pass
-
+            await self._cancel()
        del self._tasks[task.name]

        # Cleanup base object.
--- a/src/pipecat/pipeline/service_switcher.py
+++ b/src/pipecat/pipeline/service_switcher.py
@@ -21,22 +21,10 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


 class ServiceSwitcherStrategy:
-    """Base class for service switching strategies.
-
-    Note:
-        Strategy classes are instantiated internally by ServiceSwitcher.
-        Developers should pass the strategy class (not an instance) to ServiceSwitcher.
-    """
+    """Base class for service switching strategies."""

    def __init__(self, services: List[FrameProcessor]):
-        """Initialize the service switcher strategy with a list of services.
-
-        Note:
-            This is called internally by ServiceSwitcher. Do not instantiate directly.
-
-        Args:
-            services: List of frame processors to switch between.
-        """
+        """Initialize the service switcher strategy with a list of services."""
        self.services = services
        self.active_service: Optional[FrameProcessor] = None

@@ -58,24 +46,10 @@ class ServiceSwitcherStrategyManual(ServiceSwitcherStrategy):

    This strategy allows the user to manually select which service is active.
    The initial active service is the first one in the list.
-
-    Example::
-
-        stt_switcher = ServiceSwitcher(
-            services=[stt_1, stt_2],
-            strategy_type=ServiceSwitcherStrategyManual
-        )
    """

    def __init__(self, services: List[FrameProcessor]):
-        """Initialize the manual service switcher strategy with a list of services.
-
-        Note:
-            This is called internally by ServiceSwitcher. Do not instantiate directly.
-
-        Args:
-            services: List of frame processors to switch between.
-        """
+        """Initialize the manual service switcher strategy with a list of services."""
        super().__init__(services)
        self.active_service = services[0] if services else None

@@ -111,12 +85,7 @@ class ServiceSwitcher(ParallelPipeline, Generic[StrategyType]):
    """A pipeline that switches between different services at runtime."""

    def __init__(self, services: List[FrameProcessor], strategy_type: Type[StrategyType]):
-        """Initialize the service switcher with a list of services and a switching strategy.
-
-        Args:
-            services: List of frame processors to switch between.
-            strategy_type: The strategy class to use for switching between services.
-        """
+        """Initialize the service switcher with a list of services and a switching strategy."""
        strategy = strategy_type(services)
        super().__init__(*self._make_pipeline_definitions(services, strategy))
        self.services = services
@@ -131,20 +100,14 @@ class ServiceSwitcher(ParallelPipeline, Generic[StrategyType]):
            active_service: FrameProcessor,
            direction: FrameDirection,
        ):
-            """Initialize the service switcher filter with a strategy and direction.
-
-            Args:
-                wrapped_service: The service that this filter wraps.
-                active_service: The currently active service.
-                direction: The direction of frame flow to filter.
-            """
-            self._wrapped_service = wrapped_service
-            self._active_service = active_service
+            """Initialize the service switcher filter with a strategy and direction."""

            async def filter(_: Frame) -> bool:
                return self._wrapped_service == self._active_service

-            super().__init__(filter, direction, filter_system_frames=True)
+            super().__init__(filter, direction)
+            self._wrapped_service = wrapped_service
+            self._active_service = active_service

        async def process_frame(self, frame, direction):
            """Process a frame through the filter, handling special internal filter-updating frames."""
--- a/src/pipecat/pipeline/task.py
+++ b/src/pipecat/pipeline/task.py
@@ -12,7 +12,9 @@ including heartbeats, idle detection, and observer integration.
 """

 import asyncio
-from typing import Any, AsyncIterable, Dict, Iterable, List, Optional, Tuple, Type
+import time
+from collections import deque
+from typing import Any, AsyncIterable, Deque, Dict, Iterable, List, Optional, Tuple, Type

 from loguru import logger
 from pydantic import BaseModel, ConfigDict, Field
@@ -29,6 +31,7 @@ from pipecat.frames.frames import (
    ErrorFrame,
    Frame,
    HeartbeatFrame,
+    InputAudioRawFrame,
    InterruptionFrame,
    InterruptionTaskFrame,
    MetricsFrame,
@@ -38,7 +41,7 @@ from pipecat.frames.frames import (
    UserSpeakingFrame,
 )
 from pipecat.metrics.metrics import ProcessingMetricsData, TTFBMetricsData
-from pipecat.observers.base_observer import BaseObserver, FramePushed
+from pipecat.observers.base_observer import BaseObserver
 from pipecat.observers.turn_tracking_observer import TurnTrackingObserver
 from pipecat.pipeline.base_task import BasePipelineTask, PipelineTaskParams
 from pipecat.pipeline.pipeline import Pipeline, PipelineSink, PipelineSource
@@ -56,43 +59,6 @@ IDLE_TIMEOUT_SECS = 300
 CANCEL_TIMEOUT_SECS = 20.0


-class IdleFrameObserver(BaseObserver):
-    """Idle timeout observer.
-
-    This observer waits for specific frames being generated in the pipeline. If
-    the frames are generated the given asyncio event is set. If the event is not
-    set it means the pipeline is probably idle.
-
-    """
-
-    def __init__(self, *, idle_event: asyncio.Event, idle_timeout_frames: Tuple[Type[Frame], ...]):
-        """Initialize the observer.
-
-        Args:
-            idle_event: The event to set if the idle timeout frames are being pushed.
-            idle_timeout_frames: A tuple with the frames that should set the event when received
-        """
-        super().__init__()
-        self._idle_event = idle_event
-        self._idle_timeout_frames = idle_timeout_frames
-        self._processed_frames = set()
-
-    async def on_push_frame(self, data: FramePushed):
-        """Callback executed when a frame is pushed in the pipeline.
-
-        Args:
-            data: The frame push event data.
-        """
-        # Skip already processed frames
-        if data.frame.id in self._processed_frames:
-            return
-
-        self._processed_frames.add(data.frame.id)
-
-        if isinstance(data.frame, StartFrame) or isinstance(data.frame, self._idle_timeout_frames):
-            self._idle_event.set()
-
-
 class PipelineParams(BaseModel):
    """Configuration parameters for pipeline execution.

@@ -166,16 +132,12 @@ class PipelineTask(BasePipelineTask):

    - on_pipeline_finished: Called after the pipeline has reached any terminal state.
          This includes:
-
              - StopFrame: pipeline was stopped (processors keep connections open)
              - EndFrame: pipeline ended normally
              - CancelFrame: pipeline was cancelled
-
          Use this event for cleanup, logging, or post-processing tasks. Users can inspect
          the frame if they need to handle specific cases.

-    - on_pipeline_error: Called when an error occurs with ErrorFrame
-
    Example::

        @task.event_handler("on_frame_reached_upstream")
@@ -186,17 +148,9 @@ class PipelineTask(BasePipelineTask):
        async def on_pipeline_idle_timeout(task):
            ...

-        @task.event_handler("on_pipeline_started")
-        async def on_pipeline_started(task, frame):
-            ...
-
        @task.event_handler("on_pipeline_finished")
        async def on_pipeline_finished(task, frame):
            ...
-
-        @task.event_handler("on_pipeline_error")
-        async def on_pipeline_error(task, frame):
-            ...
    """

    def __init__(
@@ -251,6 +205,7 @@ class PipelineTask(BasePipelineTask):
        self._conversation_id = conversation_id
        self._enable_tracing = enable_tracing and is_tracing_available()
        self._enable_turn_tracking = enable_turn_tracking
+        self._idle_timeout_frames = idle_timeout_frames
        self._idle_timeout_secs = idle_timeout_secs
        if self._params.observers:
            import warnings
@@ -285,24 +240,16 @@ class PipelineTask(BasePipelineTask):
        # This queue is the queue used to push frames to the pipeline.
        self._push_queue = asyncio.Queue()
        self._process_push_task: Optional[asyncio.Task] = None
-
        # This is the heartbeat queue. When a heartbeat frame is received in the
        # down queue we add it to the heartbeat queue for processing.
        self._heartbeat_queue = asyncio.Queue()
        self._heartbeat_push_task: Optional[asyncio.Task] = None
        self._heartbeat_monitor_task: Optional[asyncio.Task] = None
-
-        # This is the idle event. When selected frames are pushed from any
-        # processor we consider the pipeline is not idle. We use an observer
-        # which will be listening any part of the pipeline.
-        self._idle_event = asyncio.Event()
+        # This is the idle queue. When frames are received downstream they are
+        # put in the queue. If no frame is received the pipeline is considered
+        # idle.
+        self._idle_queue = asyncio.Queue()
        self._idle_monitor_task: Optional[asyncio.Task] = None
-        if self._idle_timeout_secs:
-            idle_frame_observer = IdleFrameObserver(
-                idle_event=self._idle_event,
-                idle_timeout_frames=idle_timeout_frames,
-            )
-            observers.append(idle_frame_observer)

        # This event is used to indicate the StartFrame has been received at the
        # end of the pipeline.
@@ -312,9 +259,6 @@ class PipelineTask(BasePipelineTask):
        # StopFrame) has been received at the end of the pipeline.
        self._pipeline_end_event = asyncio.Event()

-        # This event is set when the pipeline truly finishes.
-        self._pipeline_finished_event = asyncio.Event()
-
        # This is the final pipeline. It is composed of a source processor,
        # followed by the user pipeline, and ending with a sink processor. The
        # source allows us to receive and react to upstream frames, and the sink
@@ -344,7 +288,6 @@ class PipelineTask(BasePipelineTask):
        self._register_event_handler("on_pipeline_ended")
        self._register_event_handler("on_pipeline_cancelled")
        self._register_event_handler("on_pipeline_finished")
-        self._register_event_handler("on_pipeline_error")

    @property
    def params(self) -> PipelineParams:
@@ -447,9 +390,12 @@ class PipelineTask(BasePipelineTask):
        await self.queue_frame(EndFrame())

    async def cancel(self):
-        """Request the running pipeline to cancel."""
-        if not self._finished:
-            await self._cancel()
+        """Immediately stop the running pipeline.
+
+        Cancels all running tasks and stops frame processing without
+        waiting for completion.
+        """
+        await self._cancel()

    async def run(self, params: PipelineTaskParams):
        """Start and manage the pipeline execution until completion or cancellation.
@@ -459,38 +405,51 @@ class PipelineTask(BasePipelineTask):
        """
        if self.has_finished():
            return
-
-        # Setup processors.
-        await self._setup(params)
-
-        # Create all main tasks and wait for the main push task. This is the
-        # task that pushes frames to the very beginning of our pipeline (i.e. to
-        # our controlled source processor).
-        await self._create_tasks()
-
+        cleanup_pipeline = True
        try:
-            # Wait for pipeline to finish.
-            await self._wait_for_pipeline_finished()
+            # Setup processors.
+            await self._setup(params)
+
+            # Create all main tasks and wait for the main push task. This is the
+            # task that pushes frames to the very beginning of our pipeline (our
+            # controlled source processor).
+            push_task = await self._create_tasks()
+            await push_task
+
+            # We have already cleaned up the pipeline inside the task.
+            cleanup_pipeline = False
+
+            # Pipeline has finished nicely.
+            self._finished = True
        except asyncio.CancelledError:
-            logger.debug(f"Pipeline task {self} got cancelled from outside...")
-            # We have been cancelled from outside, let's just cancel everything.
-            await self._cancel()
-            # Wait again for pipeline to finish. This time we have really
-            # cancelled, so it should really finish.
-            await self._wait_for_pipeline_finished()
-            # Re-raise in case there's more cleanup to do.
+            # Raise exception back to the pipeline runner so it can cancel this
+            # task properly.
            raise
        finally:
            # We can reach this point for different reasons:
            #
-            # 1. The pipeline task has finished (try case).
-            # 2. By an asyncio task cancellation (except case).
-            logger.debug(f"Pipeline task {self} is finishing...")
-            await self._cancel_tasks()
-            if self._check_dangling_tasks:
-                self._print_dangling_tasks()
-            self._finished = True
-            logger.debug(f"Pipeline task {self} has finished")
+            # 1. The task has finished properly (e.g. `EndFrame`).
+            # 2. By calling `PipelineTask.cancel()`.
+            # 3. By asyncio task cancellation.
+            #
+            # Case (1) will execute the code below without issues because
+            # `self._finished` is true.
+            #
+            # Case (2) will execute the code below without issues because
+            # `self._cancelled` is true.
+            #
+            # Case (3) will raise the exception above (because we are cancelling
+            # the asyncio task). This will then be caught by the
+            # `PipelineRunner`, which will call `PipelineTask.cancel()`,
+            # therefore becoming case (2).
+            if self._finished or self._cancelled:
+                logger.debug(f"Pipeline task {self} is finishing cleanup...")
+                await self._cancel_tasks()
+                await self._cleanup(cleanup_pipeline)
+                if self._check_dangling_tasks:
+                    self._print_dangling_tasks()
+                self._finished = True
+                logger.debug(f"Pipeline task {self} has finished")

    async def queue_frame(self, frame: Frame):
        """Queue a single frame to be pushed down the pipeline.
@@ -518,7 +477,19 @@ class PipelineTask(BasePipelineTask):
        if not self._cancelled:
            logger.debug(f"Cancelling pipeline task {self}")
            self._cancelled = True
-            await self.queue_frame(CancelFrame())
+            cancel_frame = CancelFrame()
+            # Make sure everything is cleaned up downstream. This is sent
+            # out-of-band from the main streaming task which is what we want since
+            # we want to cancel right away.
+            await self._pipeline.queue_frame(cancel_frame)
+            # Wait for CancelFrame to make it through the pipeline.
+            await self._wait_for_pipeline_end(cancel_frame)
+            # Only cancel the push task, we don't want to be able to process any
+            # other frame after cancel. Everything else will be cancelled in
+            # run().
+            if self._process_push_task:
+                await self._task_manager.cancel_task(self._process_push_task)
+                self._process_push_task = None

    async def _create_tasks(self):
        """Create and start all pipeline processing tasks."""
@@ -573,7 +544,7 @@ class PipelineTask(BasePipelineTask):

    async def _maybe_cancel_idle_task(self):
        """Cancel idle monitoring task if it is running."""
-        if self._idle_monitor_task:
+        if self._idle_timeout_secs and self._idle_monitor_task:
            await self._task_manager.cancel_task(self._idle_monitor_task)
            self._idle_monitor_task = None

@@ -620,17 +591,6 @@ class PipelineTask(BasePipelineTask):

        self._pipeline_end_event.clear()

-        # We are really done.
-        self._pipeline_finished_event.set()
-
-    async def _wait_for_pipeline_finished(self):
-        await self._pipeline_finished_event.wait()
-        self._pipeline_finished_event.clear()
-        # Make sure we wait for the main task to complete.
-        if self._process_push_task:
-            await self._process_push_task
-            self._process_push_task = None
-
    async def _setup(self, params: PipelineTaskParams):
        """Set up the pipeline task and all processors."""
        mgr_params = TaskManagerParams(loop=params.loop)
@@ -733,11 +693,12 @@ class PipelineTask(BasePipelineTask):
            logger.debug(f"{self}: received interruption task frame {frame}")
            await self._pipeline.queue_frame(InterruptionFrame())
        elif isinstance(frame, ErrorFrame):
-            await self._call_event_handler("on_pipeline_error", frame)
            if frame.fatal:
                logger.error(f"A fatal error occurred: {frame}")
                # Cancel all tasks downstream.
                await self.queue_frame(CancelFrame())
+                # Tell the task we should stop.
+                await self.queue_frame(StopTaskFrame())
            else:
                logger.warning(f"{self}: Something went wrong: {frame}")

@@ -749,6 +710,10 @@ class PipelineTask(BasePipelineTask):
        processors have handled the EndFrame and therefore we can exit the task
        cleanly.
        """
+        # Queue received frame to the idle queue so we can monitor idle
+        # pipelines.
+        await self._idle_queue.put(frame)
+
        if isinstance(frame, self._reached_downstream_types):
            await self._call_event_handler("on_frame_reached_downstream", frame)

@@ -811,10 +776,33 @@ class PipelineTask(BasePipelineTask):
        Note: Heartbeats are excluded from idle detection.
        """
        running = True
+        last_frame_time = 0
+
        while running:
            try:
-                await asyncio.wait_for(self._idle_event.wait(), timeout=self._idle_timeout_secs)
-                self._idle_event.clear()
+                frame = await asyncio.wait_for(
+                    self._idle_queue.get(), timeout=self._idle_timeout_secs
+                )
+
+                if isinstance(frame, StartFrame) or isinstance(frame, self._idle_timeout_frames):
+                    # If we find a StartFrame or one of the frames that prevents a
+                    # timeout we update the time.
+                    last_frame_time = time.time()
+                else:
+                    # If we find any other frame we check if the pipeline is
+                    # idle by checking the last time we received one of the
+                    # valid frames.
+                    diff_time = time.time() - last_frame_time
+                    if diff_time >= self._idle_timeout_secs:
+                        running = await self._idle_timeout_detected()
+                        # Reset `last_frame_time` so we don't trigger another
+                        # immediate idle timeout if we are not cancelling. For
+                        # example, we might want to force the bot to say goodbye
+                        # and then clean up nicely with an `EndFrame`.
+                        last_frame_time = time.time()
+
+                self._idle_queue.task_done()
+
            except asyncio.TimeoutError:
                running = await self._idle_timeout_detected()

@@ -826,7 +814,7 @@ class PipelineTask(BasePipelineTask):
        """
        # If we are cancelling, just exit the task.
        if self._cancelled:
-            return False
+            return True

        logger.warning("Idle timeout detected.")
        await self._call_event_handler("on_idle_timeout")
--- a/src/pipecat/pipeline/task_observer.py
+++ b/src/pipecat/pipeline/task_observer.py
@@ -129,7 +129,7 @@ class TaskObserver(BaseObserver):
        for proxy in self._proxies:
            await proxy.cleanup()

-    async def on_process_frame(self, data: FrameProcessed):
+    async def on_process_frame(self, data: FramePushed):
        """Queue frame data for all managed observers.

        Args:
@@ -189,7 +189,7 @@ class TaskObserver(BaseObserver):
            if isinstance(data, FramePushed):
                if on_push_frame_deprecated:
                    await observer.on_push_frame(
-                        data.source, data.destination, data.frame, data.direction, data.timestamp
+                        data.src, data.dst, data.frame, data.direction, data.timestamp
                    )
                else:
                    await observer.on_push_frame(data)
--- a/src/pipecat/processors/aggregators/llm_context.py
+++ b/src/pipecat/processors/aggregators/llm_context.py
@@ -16,9 +16,8 @@ service-specific adapter.

 import base64
 import io
-import wave
 from dataclasses import dataclass
-from typing import TYPE_CHECKING, Any, List, Optional, TypeAlias, Union
+from typing import Any, List, Optional, TypeAlias, Union

 from loguru import logger
 from openai._types import NOT_GIVEN as OPEN_AI_NOT_GIVEN
@@ -29,12 +28,9 @@ from openai.types.chat import (
 )
 from PIL import Image

-from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.frames.frames import AudioRawFrame

-if TYPE_CHECKING:
-    from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
-
 # "Re-export" types from OpenAI that we're using as universal context types.
 # NOTE: if universal message types need to someday diverge from OpenAI's, we
 # should consider managing our own definitions. But we should do so carefully,
@@ -69,34 +65,6 @@ class LLMContext:
    and content formatting.
    """

-    @staticmethod
-    def from_openai_context(openai_context: "OpenAILLMContext") -> "LLMContext":
-        """Create a universal LLM context from an OpenAI-specific context.
-
-        NOTE: this should only be used internally, for facilitating migration
-        from OpenAILLMContext to LLMContext. New user code should use
-        LLMContext directly.
-
-        Args:
-            openai_context: The OpenAI LLM context to convert.
-
-        Returns:
-            New LLMContext instance with converted messages and settings.
-        """
-        # Convert tools to ToolsSchema if needed.
-        # If the tools are already a ToolsSchema, this is a no-op.
-        # Otherwise, we wrap them in a shim ToolsSchema.
-        converted_tools = openai_context.tools
-        if isinstance(converted_tools, list):
-            converted_tools = ToolsSchema(
-                standard_tools=[], custom_tools={AdapterType.SHIM: converted_tools}
-            )
-        return LLMContext(
-            messages=openai_context.get_messages(),
-            tools=converted_tools,
-            tool_choice=openai_context.tool_choice,
-        )
-
    def __init__(
        self,
        messages: Optional[List[LLMContextMessage]] = None,
@@ -114,129 +82,6 @@ class LLMContext:
        self._tools: ToolsSchema | NotGiven = LLMContext._normalize_and_validate_tools(tools)
        self._tool_choice: LLMContextToolChoice | NotGiven = tool_choice

-    @staticmethod
-    def create_image_url_message(
-        *,
-        role: str = "user",
-        url: str,
-        text: Optional[str] = None,
-    ) -> LLMContextMessage:
-        """Create a context message containing an image URL.
-
-        Args:
-            role: The role of this message (defaults to "user").
-            url: The URL of the image.
-            text: Optional text to include with the image.
-        """
-        content = []
-        if text:
-            content.append({"type": "text", "text": text})
-
-        content.append({"type": "image_url", "image_url": {"url": url}})
-
-        return {"role": role, "content": content}
-
-    @staticmethod
-    def create_image_message(
-        *,
-        role: str = "user",
-        format: str,
-        size: tuple[int, int],
-        image: bytes,
-        text: Optional[str] = None,
-    ) -> LLMContextMessage:
-        """Create a context message containing an image.
-
-        Args:
-            role: The role of this message (defaults to "user").
-            format: Image format (e.g., 'RGB', 'RGBA').
-            size: Image dimensions as (width, height) tuple.
-            image: Raw image bytes.
-            text: Optional text to include with the image.
-        """
-        buffer = io.BytesIO()
-        Image.frombytes(format, size, image).save(buffer, format="JPEG")
-        encoded_image = base64.b64encode(buffer.getvalue()).decode("utf-8")
-        url = f"data:image/jpeg;base64,{encoded_image}"
-
-        return LLMContext.create_image_url_message(role=role, url=url, text=text)
-
-    @staticmethod
-    def create_audio_message(
-        *, role: str = "user", audio_frames: list[AudioRawFrame], text: str = "Audio follows"
-    ) -> LLMContextMessage:
-        """Create a context message containing audio.
-
-        Args:
-            role: The role of this message (defaults to "user").
-            audio_frames: List of audio frame objects to include.
-            text: Optional text to include with the audio.
-        """
-        sample_rate = audio_frames[0].sample_rate
-        num_channels = audio_frames[0].num_channels
-
-        content = []
-        content.append({"type": "text", "text": text})
-        data = b"".join(frame.audio for frame in audio_frames)
-
-        with io.BytesIO() as buffer:
-            with wave.open(buffer, "wb") as wf:
-                wf.setsampwidth(2)
-                wf.setnchannels(num_channels)
-                wf.setframerate(sample_rate)
-                wf.writeframes(data)
-
-        encoded_audio = base64.b64encode(buffer.getvalue()).decode("utf-8")
-
-        content.append(
-            {
-                "type": "input_audio",
-                "input_audio": {"data": encoded_audio, "format": "wav"},
-            }
-        )
-
-        return {"role": role, "content": content}
-
-    @property
-    def messages(self) -> List[LLMContextMessage]:
-        """Get the current messages list.
-
-        NOTE: This is equivalent to calling `get_messages()` with no filter. If
-        you want to filter out LLM-specific messages that don't pertain to your
-        LLM, use `get_messages()` directly.
-
-        Returns:
-            List of conversation messages.
-        """
-        return self.get_messages()
-
-    def get_messages_for_persistent_storage(self) -> List[LLMContextMessage]:
-        """Get messages suitable for persistent storage.
-
-        NOTE: the only reason this method exists is because we're "silently"
-        switching from OpenAILLMContext to LLMContext under the hood in some
-        services and don't want to trip up users who may have been relying on
-        this method, which is part of the public API of OpenAILLMContext but
-        doesn't need to be for LLMContext.
-
-        .. deprecated::
-            Use `get_messages()` instead.
-
-        Returns:
-            List of conversation messages.
-        """
-        import warnings
-
-        with warnings.catch_warnings():
-            warnings.simplefilter("always")
-            warnings.warn(
-                "get_messages_for_persistent_storage() is deprecated, use get_messages() instead.",
-                DeprecationWarning,
-                stacklevel=2,
-            )
-
-        return self.get_messages()
-
    def get_messages(self, llm_specific_filter: Optional[str] = None) -> List[LLMContextMessage]:
        """Get the current messages list.

@@ -244,8 +89,7 @@ class LLMContext:
            llm_specific_filter: Optional filter to return LLM-specific
                messages for the given LLM, in addition to the standard
                messages. If messages end up being filtered, an error will be
-                logged; this is intended to catch accidental use of
-                incompatible LLM-specific messages.
+                logged.

        Returns:
            List of conversation messages.
@@ -322,7 +166,7 @@ class LLMContext:
        self._tool_choice = tool_choice

    def add_image_frame_message(
-        self, *, format: str, size: tuple[int, int], image: bytes, text: Optional[str] = None
+        self, *, format: str, size: tuple[int, int], image: bytes, text: str = None
    ):
        """Add a message containing an image frame.

@@ -332,8 +176,17 @@ class LLMContext:
            image: Raw image bytes.
            text: Optional text to include with the image.
        """
-        message = LLMContext.create_image_message(format=format, size=size, image=image, text=text)
-        self.add_message(message)
+        buffer = io.BytesIO()
+        Image.frombytes(format, size, image).save(buffer, format="JPEG")
+        encoded_image = base64.b64encode(buffer.getvalue()).decode("utf-8")
+
+        content = []
+        if text:
+            content.append({"type": "text", "text": text})
+        content.append(
+            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"}},
+        )
+        self.add_message({"role": "user", "content": content})

    def add_audio_frames_message(
        self, *, audio_frames: list[AudioRawFrame], text: str = "Audio follows"
@@ -344,8 +197,66 @@ class LLMContext:
            audio_frames: List of audio frame objects to include.
            text: Optional text to include with the audio.
        """
-        message = LLMContext.create_audio_message(audio_frames=audio_frames, text=text)
-        self.add_message(message)
+        if not audio_frames:
+            return
+
+        sample_rate = audio_frames[0].sample_rate
+        num_channels = audio_frames[0].num_channels
+
+        content = []
+        content.append({"type": "text", "text": text})
+        data = b"".join(frame.audio for frame in audio_frames)
+        data = bytes(
+            self._create_wav_header(
+                sample_rate,
+                num_channels,
+                16,
+                len(data),
+            )
+            + data
+        )
+        encoded_audio = base64.b64encode(data).decode("utf-8")
+        content.append(
+            {
+                "type": "input_audio",
+                "input_audio": {"data": encoded_audio, "format": "wav"},
+            }
+        )
+        self.add_message({"role": "user", "content": content})
+
+    def _create_wav_header(self, sample_rate, num_channels, bits_per_sample, data_size):
+        """Create a WAV file header for audio data.
+
+        Args:
+            sample_rate: Audio sample rate in Hz.
+            num_channels: Number of audio channels.
+            bits_per_sample: Bits per audio sample.
+            data_size: Size of audio data in bytes.
+
+        Returns:
+            WAV header as a bytearray.
+        """
+        # RIFF chunk descriptor
+        header = bytearray()
+        header.extend(b"RIFF")  # ChunkID
+        header.extend((data_size + 36).to_bytes(4, "little"))  # ChunkSize: total size - 8
+        header.extend(b"WAVE")  # Format
+        # "fmt " sub-chunk
+        header.extend(b"fmt ")  # Subchunk1ID
+        header.extend((16).to_bytes(4, "little"))  # Subchunk1Size (16 for PCM)
+        header.extend((1).to_bytes(2, "little"))  # AudioFormat (1 for PCM)
+        header.extend(num_channels.to_bytes(2, "little"))  # NumChannels
+        header.extend(sample_rate.to_bytes(4, "little"))  # SampleRate
+        # Calculate byte rate and block align
+        byte_rate = sample_rate * num_channels * (bits_per_sample // 8)
+        block_align = num_channels * (bits_per_sample // 8)
+        header.extend(byte_rate.to_bytes(4, "little"))  # ByteRate
+        header.extend(block_align.to_bytes(2, "little"))  # BlockAlign
+        header.extend(bits_per_sample.to_bytes(2, "little"))  # BitsPerSample
+        # "data" sub-chunk
+        header.extend(b"data")  # Subchunk2ID
+        header.extend(data_size.to_bytes(4, "little"))  # Subchunk2Size
+        return header

    @staticmethod
    def _normalize_and_validate_tools(tools: ToolsSchema | NotGiven) -> ToolsSchema | NotGiven:
--- a/src/pipecat/processors/aggregators/llm_response.py
+++ b/src/pipecat/processors/aggregators/llm_response.py
@@ -89,9 +89,7 @@ class LLMAssistantAggregatorParams:

    Parameters:
        expect_stripped_words: Whether to expect and handle stripped words
-            in text frames by adding spaces between tokens. This parameter is
-            ignored when used with the newer LLMAssistantAggregator, which
-            handles word spacing automatically.
+            in text frames by adding spaces between tokens.
    """

    expect_stripped_words: bool = True
--- a/src/pipecat/processors/aggregators/llm_response_universal.py
+++ b/src/pipecat/processors/aggregators/llm_response_universal.py
@@ -13,8 +13,6 @@ LLM processing, and text-to-speech components in conversational AI pipelines.

 import asyncio
 import json
-import warnings
-from abc import abstractmethod
 from typing import Any, Dict, List, Literal, Optional, Set

 from loguru import logger
@@ -66,7 +64,6 @@ from pipecat.processors.aggregators.llm_response import (
    LLMUserAggregatorParams,
 )
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
-from pipecat.utils.string import concatenate_aggregated_text
 from pipecat.utils.time import time_now_iso8601


@@ -90,7 +87,7 @@ class LLMContextAggregator(FrameProcessor):
        self._context = context
        self._role = role

-        self._aggregation: List[str] = []
+        self._aggregation: str = ""

    @property
    def messages(self) -> List[LLMContextMessage]:
@@ -170,20 +167,7 @@ class LLMContextAggregator(FrameProcessor):

    async def reset(self):
        """Reset the aggregation state."""
-        self._aggregation = []
-
-    @abstractmethod
-    async def push_aggregation(self):
-        """Push the current aggregation downstream."""
-        pass
-
-    def aggregation_string(self) -> str:
-        """Get the current aggregation as a string.
-
-        Returns:
-            The concatenated aggregation string.
-        """
-        return concatenate_aggregated_text(self._aggregation)
+        self._aggregation = ""


 class LLMUserAggregator(LLMContextAggregator):
@@ -222,6 +206,8 @@ class LLMUserAggregator(LLMContextAggregator):
        self._turn_params: Optional[SmartTurnParams] = None

        if "aggregation_timeout" in kwargs:
+            import warnings
+
            with warnings.catch_warnings():
                warnings.simplefilter("always")
                warnings.warn(
@@ -298,12 +284,6 @@ class LLMUserAggregator(LLMContextAggregator):
            await self._handle_llm_messages_update(frame)
        elif isinstance(frame, LLMSetToolsFrame):
            self.set_tools(frame.tools)
-            # Push the LLMSetToolsFrame as well, since speech-to-speech LLM
-            # services (like OpenAI Realtime) may need to know about tool
-            # changes; unlike text-based LLM services they won't just "pick up
-            # the change" on the next LLM run, as the LLM is continuously
-            # running.
-            await self.push_frame(frame, direction)
        elif isinstance(frame, LLMSetToolChoiceFrame):
            self.set_tool_choice(frame.tool_choice)
        elif isinstance(frame, SpeechControlParamsFrame):
@@ -315,13 +295,13 @@ class LLMUserAggregator(LLMContextAggregator):

    async def _process_aggregation(self):
        """Process the current aggregation and push it downstream."""
-        aggregation = self.aggregation_string()
+        aggregation = self._aggregation
        await self.reset()
        self._context.add_message({"role": self.role, "content": aggregation})
        frame = LLMContextFrame(self._context)
        await self.push_frame(frame)

-    async def push_aggregation(self):
+    async def _push_aggregation(self):
        """Push the current aggregation based on interruption strategies and conditions."""
        if len(self._aggregation) > 0:
            if self.interruption_strategies and self._bot_speaking:
@@ -363,7 +343,7 @@ class LLMUserAggregator(LLMContextAggregator):
        """

        async def should_interrupt(strategy: BaseInterruptionStrategy):
-            await strategy.append_text(self.aggregation_string())
+            await strategy.append_text(self._aggregation)
            return await strategy.should_interrupt()

        return any([await should_interrupt(s) for s in self._interruption_strategies])
@@ -412,7 +392,7 @@ class LLMUserAggregator(LLMContextAggregator):
        # pushing the aggregation as we will probably get a final transcription.
        if len(self._aggregation) > 0:
            if not self._seen_interim_results:
-                await self.push_aggregation()
+                await self._push_aggregation()
        # Handles the case where both the user and the bot are not speaking,
        # and the bot was previously speaking before the user interruption.
        # So in this case we are resetting the aggregation timer
@@ -433,7 +413,7 @@ class LLMUserAggregator(LLMContextAggregator):
        if not text.strip():
            return

-        self._aggregation.append(text)
+        self._aggregation += f" {text}" if self._aggregation else text
        # We just got a final result, so let's reset interim results.
        self._seen_interim_results = False
        # Reset aggregation timer.
@@ -491,7 +471,7 @@ class LLMUserAggregator(LLMContextAggregator):
                await self._maybe_emulate_user_speaking()
            except asyncio.TimeoutError:
                if not self._user_speaking:
-                    await self.push_aggregation()
+                    await self._push_aggregation()

                # If we are emulating VAD we still need to send the user stopped
                # speaking frame.
@@ -558,31 +538,23 @@ class LLMAssistantAggregator(LLMContextAggregator):
        Args:
            context: The OpenAI LLM context for conversation storage.
            params: Configuration parameters for aggregation behavior.
-            **kwargs: Additional arguments.
+            **kwargs: Additional arguments. Supports deprecated 'expect_stripped_words'.
        """
        super().__init__(context=context, role="assistant", **kwargs)
        self._params = params or LLMAssistantAggregatorParams()

        if "expect_stripped_words" in kwargs:
+            import warnings
+
            with warnings.catch_warnings():
                warnings.simplefilter("always")
                warnings.warn(
-                    "Parameter 'expect_stripped_words' is deprecated. "
-                    "LLMAssistantAggregator now handles word spacing automatically.",
+                    "Parameter 'expect_stripped_words' is deprecated, use 'params' instead.",
                    DeprecationWarning,
                )

            self._params.expect_stripped_words = kwargs["expect_stripped_words"]

-        if params and not params.expect_stripped_words:
-            with warnings.catch_warnings():
-                warnings.simplefilter("always")
-                warnings.warn(
-                    "params.expect_stripped_words is deprecated. "
-                    "LLMAssistantAggregator now handles word spacing automatically.",
-                    DeprecationWarning,
-                )
-
        self._started = 0
        self._function_calls_in_progress: Dict[str, Optional[FunctionCallInProgressFrame]] = {}
        self._context_updated_tasks: Set[asyncio.Task] = set()
@@ -632,20 +604,20 @@ class LLMAssistantAggregator(LLMContextAggregator):
            await self._handle_function_call_result(frame)
        elif isinstance(frame, FunctionCallCancelFrame):
            await self._handle_function_call_cancel(frame)
-        elif isinstance(frame, UserImageRawFrame):
+        elif isinstance(frame, UserImageRawFrame) and frame.request and frame.request.tool_call_id:
            await self._handle_user_image_frame(frame)
        elif isinstance(frame, BotStoppedSpeakingFrame):
-            await self.push_aggregation()
+            await self._push_aggregation()
            await self.push_frame(frame, direction)
        else:
            await self.push_frame(frame, direction)

-    async def push_aggregation(self):
+    async def _push_aggregation(self):
        """Push the current assistant aggregation with timestamp."""
        if not self._aggregation:
            return

-        aggregation = self.aggregation_string()
+        aggregation = self._aggregation.strip()
        await self.reset()

        if aggregation:
@@ -672,7 +644,7 @@ class LLMAssistantAggregator(LLMContextAggregator):
            await self.push_context_frame(FrameDirection.UPSTREAM)

    async def _handle_interruptions(self, frame: InterruptionFrame):
-        await self.push_aggregation()
+        await self._push_aggregation()
        self._started = 0
        await self.reset()

@@ -783,19 +755,30 @@ class LLMAssistantAggregator(LLMContextAggregator):
                message["content"] = result

    async def _handle_user_image_frame(self, frame: UserImageRawFrame):
-        if not frame.append_to_context:
+        logger.debug(
+            f"{self} UserImageRawFrame: [{frame.request.function_name}:{frame.request.tool_call_id}]"
+        )
+
+        if frame.request.tool_call_id not in self._function_calls_in_progress:
+            logger.warning(
+                f"UserImageRawFrame tool_call_id [{frame.request.tool_call_id}] is not running"
+            )
            return

-        logger.debug(f"{self} Appending UserImageRawFrame to LLM context (size: {frame.size})")
+        del self._function_calls_in_progress[frame.request.tool_call_id]

+        # Update context with the image frame
+        self._update_function_call_result(
+            frame.request.function_name, frame.request.tool_call_id, "COMPLETED"
+        )
        self._context.add_image_frame_message(
            format=frame.format,
            size=frame.size,
            image=frame.image,
-            text=frame.text,
+            text=frame.request.context,
        )

-        await self.push_aggregation()
+        await self._push_aggregation()
        await self.push_context_frame(FrameDirection.UPSTREAM)

    async def _handle_llm_start(self, _: LLMFullResponseStartFrame):
@@ -803,17 +786,16 @@ class LLMAssistantAggregator(LLMContextAggregator):

    async def _handle_llm_end(self, _: LLMFullResponseEndFrame):
        self._started -= 1
-        await self.push_aggregation()
+        await self._push_aggregation()

    async def _handle_text(self, frame: TextFrame):
        if not self._started:
            return

-        # Make sure we really have text (spaces count, too!)
-        if len(frame.text) == 0:
-            return
-
-        self._aggregation.append(frame.text)
+        if self._params.expect_stripped_words:
+            self._aggregation += f" {frame.text}" if self._aggregation else frame.text
+        else:
+            self._aggregation += frame.text

    def _context_updated_task_finished(self, task: asyncio.Task):
        self._context_updated_tasks.discard(task)
--- a/src/pipecat/processors/aggregators/user_response.py
+++ b/src/pipecat/processors/aggregators/user_response.py
@@ -12,14 +12,14 @@ in conversational pipelines.
 """

 from pipecat.frames.frames import TextFrame
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMUserAggregator
+from pipecat.processors.aggregators.llm_response import LLMUserContextAggregator
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext


-class UserResponseAggregator(LLMUserAggregator):
+class UserResponseAggregator(LLMUserContextAggregator):
    """Aggregates user responses into TextFrame objects.

-    This aggregator extends LLMUserAggregator to specifically handle
+    This aggregator extends LLMUserContextAggregator to specifically handle
    user input by collecting text responses and outputting them as TextFrame
    objects when the aggregation is complete.
    """
@@ -27,23 +27,10 @@ class UserResponseAggregator(LLMUserAggregator):
    def __init__(self, **kwargs):
        """Initialize the user response aggregator.

-        .. deprecated:: 0.0.92
-            `UserResponseAggregator` is deprecated and will be removed in a future version.
-
        Args:
-            **kwargs: Additional arguments passed to parent LLMUserAggregator.
+            **kwargs: Additional arguments passed to parent LLMUserContextAggregator.
        """
-        super().__init__(context=LLMContext(), **kwargs)
-
-        import warnings
-
-        with warnings.catch_warnings():
-            warnings.simplefilter("always")
-            warnings.warn(
-                "`UserResponseAggregator` is deprecated and will be removed in a future version.",
-                DeprecationWarning,
-                stacklevel=2,
-            )
+        super().__init__(context=OpenAILLMContext(), **kwargs)

    async def push_aggregation(self):
        """Push the aggregated user response as a TextFrame.
--- a/src/pipecat/processors/filters/function_filter.py
+++ b/src/pipecat/processors/filters/function_filter.py
@@ -12,7 +12,7 @@ allowing for flexible frame filtering logic in processing pipelines.

 from typing import Awaitable, Callable

-from pipecat.frames.frames import CancelFrame, EndFrame, Frame, StartFrame, SystemFrame
+from pipecat.frames.frames import EndFrame, Frame, SystemFrame
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


@@ -28,7 +28,6 @@ class FunctionFilter(FrameProcessor):
        self,
        filter: Callable[[Frame], Awaitable[bool]],
        direction: FrameDirection = FrameDirection.DOWNSTREAM,
-        filter_system_frames: bool = False,
    ):
        """Initialize the function filter.

@@ -37,32 +36,22 @@ class FunctionFilter(FrameProcessor):
                frame should pass through, False otherwise.
            direction: The direction to apply filtering. Only frames moving in
                this direction will be filtered. Defaults to DOWNSTREAM.
-            filter_system_frames: Whether to filter system frames. Defaults to False.
        """
        super().__init__()
        self._filter = filter
        self._direction = direction
-        self._filter_system_frames = filter_system_frames

    #
    # Frame processor
    #

+    # Ignore system frames, end frames and frames that are not following the
+    # direction of this gate
    def _should_passthrough_frame(self, frame, direction):
        """Check if a frame should pass through without filtering."""
-        # Always passthrough frames in the wrong direction
-        if direction != self._direction:
-            return True
-
-        # Always passthrough lifecycle frames
-        if isinstance(frame, (StartFrame, EndFrame, CancelFrame)):
-            return True
-
-        # If not filtering system frames, passthrough all other system frames
-        if not self._filter_system_frames and isinstance(frame, SystemFrame):
-            return True
-
-        return False
+        # Ignore system frames, end frames and frames that are not following the
+        # direction of this gate
+        return isinstance(frame, (SystemFrame, EndFrame)) or direction != self._direction

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        """Process a frame through the filter.
--- a/src/pipecat/processors/frame_processor.py
+++ b/src/pipecat/processors/frame_processor.py
@@ -877,8 +877,6 @@ class FrameProcessor(BaseObject):

        """
        while True:
-            (frame, direction, callback) = await self.__input_queue.get()
-
            if self.__should_block_system_frames and self.__input_event:
                logger.trace(f"{self}: system frame processing paused")
                await self.__input_event.wait()
@@ -886,6 +884,8 @@ class FrameProcessor(BaseObject):
                self.__should_block_system_frames = False
                logger.trace(f"{self}: system frame processing resumed")

+            (frame, direction, callback) = await self.__input_queue.get()
+
            if isinstance(frame, SystemFrame):
                await self.__process_frame(frame, direction, callback)
            elif self.__process_queue:
@@ -900,8 +900,6 @@ class FrameProcessor(BaseObject):
    async def __process_frame_task_handler(self):
        """Handle non-system frames from the process queue."""
        while True:
-            (frame, direction, callback) = await self.__process_queue.get()
-
            if self.__should_block_frames and self.__process_event:
                logger.trace(f"{self}: frame processing paused")
                await self.__process_event.wait()
@@ -909,6 +907,8 @@ class FrameProcessor(BaseObject):
                self.__should_block_frames = False
                logger.trace(f"{self}: frame processing resumed")

+            (frame, direction, callback) = await self.__process_queue.get()
+
            await self.__process_frame(frame, direction, callback)

            self.__process_queue.task_done()