introduce pipeline nodes

Make FrameProcessors async iterators and decouple them from pipeline. A new PipelineNode now handles routing frames between processors.
2025-11-03 18:07:09 -08:00
283 changed files with 3581 additions and 9008 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,613 +9,35 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

- Added `wait_for_all` argument to the base `LLMService`. When enabled, this
-  ensures all function calls complete before returning results to the LLM (i.e.,
-  before running a new inference with those results).
-
-### Changed
-
- Improved interruption handling to prevent bots from repeating themselves.
-  LLM services that return multiple sentences in a single response (e.g.,
-  `GoogleLLMService`) are now split into individual sentences before being sent
-  to TTS. This ensures interruptions occur at sentence boundaries, preventing
-  the bot from repeating content after being interrupted during long responses.
-
- Text Aggregation Improvements:
-
-  - **Breaking Change**: `BaseTextAggregator.aggregate()` now returns
-    `AsyncIterator[Aggregation]` instead of `Optional[Aggregation]`. This
-    enables the aggregator to return multiple results based on the provided
-    text.
-  - Refactored text aggregators to use inheritance: `SkipTagsAggregator` and
-    `PatternPairAggregator` now inherit from `SimpleTextAggregator`, reusing
-    the base class's sentence detection logic.
-
- Updated `AICFilter` to use Quail STT as the default model
-  (`AICModelType.QUAIL_STT`). Quail STT is optimized for human-to-machine
-  interaction (e.g., voice agents, speech-to-text) and operates at a native
-  sample rate of 16 kHz with fixed enhancement parameters.
-
- Updated Deepgram logging to include Deepgram request IDs for improved debugging.
-
-### Deprecated
-
- Package `pipecat.sync` is deprecated, use `pipecat.utils.sync` instead.
-
- The `noise_gate_enable` parameter in `AICFilter` is deprecated and no longer
-  has any effect. Noise gating is now handled automatically by the AIC VAD
-  system. Use `AICFilter.create_vad_analyzer()` for VAD functionality instead.
-
- NVIDIA Services name changes (all functionality is unchanged):
-
-  - `NimLLMService` is now deprecated, use `NvidiaLLMService` instead.
-  - `RivaSTTService` is now deprecated, use `NvidiaSTTService` instead.
-  - `RivaTTSService` is now deprecated, use `NvidiaTTSService` instead.
-  - Use `uv pip install pipecat-ai[nvidia]` instead of
-    `uv pip install pipecat-ai[riva]`
-
-### Fixed
-
- Fixed an issue where `LLMTextFrame.skip_tts` was being overwritten by LLM
-  services.
-
- Fixed sentence aggregation to correctly handle ambiguous punctuation in
-  streaming text, such as currency ("$29.95") and abbreviations ("Mr. Smith").
-
- Fixed bug in `PatternPairAggregator` where pattern handlers could be called
-  multiple times for `KEEP` or `AGGREGATE` patterns.
-
- Fixed an issue in `SarvamTTSService` where the last sentence was not being
-  spoken. Now, audio is flushed when the TTS services receives the
-  `LLMFullResponseEndFrame` or `EndFrame`.
-
- Fixed an issue in `AWSTranscribeSTTService` where the `region` arg was
-  always set to `us-east-1` when providing an AWS_REGION env var.
-
- Fixed an issue in `DeepgramTTSService` where a `TTSStoppedFrame` was
-  incorrectly pushed after a functional call. This caused an issue with the
-  voice-ui-kit's conversational panel rending of the LLM output after a
-  function call.
-
-## [0.0.96] - 2025-11-26 🦃 "Happy Thanksgiving!" 🦃
-
-### Added
-
- Added `AWSBedrockAgentCoreProcessor` to support invoking an AgentCore-hosted
-  agent in a Pipecat pipeline.
-
- Enhanced error handling across the framework:
-
-  - Added `on_error` callback to `FrameProcessor` for centralized error
-    handling.
-
-  - Renamed `push_error(error: ErrorFrame)` to `push_error_frame(error: ErrorFrame)`
-    for clarity.
-
-  - Added new `push_error` method for simplified error reporting:
-
-    ```python
-    async def push_error(error_msg: str,
-                         exception: Optional[Exception] = None,
-                         fatal: bool = False)
-    ```
-
-  - Standardized error logging by replacing `logger.exception` calls with
-    `logger.error` throughout the codebase.
-
- Added `cache_read_input_tokens`, `cache_creation_input_tokens` and
-  `reasoning_tokens` to OTel spans for LLM call
-
- Added `LiveKitRESTHelper` utility class for managing LiveKit rooms via REST API.
-
- Added `DeepgramSageMakerSTTService` which connects to a SageMaker hosted
-  Deepgram STT model. Added `07c-interruptible-deepgram-sagemaker.py`
-  foundational example.
-
- Added `SageMakerBidiClient` to connect to SageMaker hosted BiDi compatible
-  services.
-
- Added support for `include_timestamps` and `enable_logging` in
-  `ElevenLabsRealtimeSTTService`. When `include_timestamps` is enabled,
-  timestamp data is included in the `TranscriptionFrame`'s `result`
-  parameter.
-
- Added optional speaking rate control to `InworldTTSService`.
-
- Introduced a new `AggregatedTextFrame` type to support passing text along with
-  an `aggregated_by` field to describe the type of text
-  included. `TTSTextFrame`s now inherit from `AggregatedTextFrame`. With this
-  inheritance, an observer can watch for `AggregatedTextFrame`s to accumlate the
-  perceived output and determine whether or not the text was spoken based on if
-  that frame is also a `TTSTextFrame`.
-
-  With this frame, the llm token stream can be transformed into custom
-  composable chunks, allowing for aggregation outside the TTS service. This
-  makes it possible to listen for or handle those aggregations and sets the
-  stage for doing things like composing a best effort of the perceived llm
-  output in a more digestable form and to do so whether or not it is processed
-  by a TTS or if even a TTS exists.
-
- Introduced `LLMTextProcessor`: A new processor meant to allow customization
-  for how LLMTextFrames should be aggregated and considered. It's purpose is to
-  turn `LLMTextFrame`s into `AggregatedTextFrame`s. By default, a TTSService
-  will still aggregate `LLMTextFrame`s by sentence for the service to
-  consume. However, if you wish to override how the llm text is aggregated, you
-  should no longer override the TTS's internal text_aggregator, but instead,
-  insert this processor between your LLM and TTS in the pipeline.
-
- New `bot-output` RTVI message to represent what the bot actually "says".
-
-  - The `RTVIObserver` now emits `bot-output` messages based off the new
-    `AggregatedTextFrame`s (`bot-tts-text` and `bot-llm-text` are still
-    supported and generated, but `bot-transcript` is now deprecated in lieu of
-    this new, more thorough, message).
-
-  - The new `RTVIBotOutputMessage` includes the fields:
-
-    - `spoken`: A boolean indicating whether the text was spoken by TTS
-
-    - `aggregated_by`: A string representing how the text was aggregated
-      ("sentence", "word", "my custom aggregation")
-
-  - Introduced new fields to `RTVIObserver` to support the new `bot-output`
-    messaging:
-
-    - `bot_output_enabled`: Defaults to True. Set to false to disable bot-output
-      messages.
-
-    - `skip_aggregator_types`: Defaults to `None`. Set to a list of strings that
-      match aggregation types that should not be included in bot-output
-      messages. (Ex. `credit_card`)
-
-  - Introduced new methods, `add_text_transformer()` and
-    `remove_text_transformer()`, to `RTVIObserver` to support providing (and
-    subsequently removing) callbacks for various types of aggregations (or all
-    aggregations with `*`) that can modify the text before being sent as a
-    `bot-output` or `tts-text` message. (Think obscuring the credit card or
-    inserting extra detail the client might want that the context doesn't need.)
-
- In `MiniMaxHttpTTSService`:
-
-  - Added support for speech-2.6-hd and speech-2.6-turbo models
-
-  - Added languages: Afrikaans, Bulgarian, Catalan, Danish, Persian, Filipino,
-    Hebrew, Croatian, Hungarian, Malay, Norwegian, Nynorsk, Slovak, Slovenian,
-    Swedish, and Tamil
-
-  - Added new emotions: calm and fluent
-
- Added `enable_logging` to `SimliVideoService` input parameters. It's disabled
-  by default.
-
-### Changed
-
- Updated `FishAudioTTSService` default model to `s1`.
-
- Updated `DeepgramTTSService` to use Deepgram's TTS websocket API. ⚠️ This is
-  a potential breaking change, which only affects you if you're self-hosting
-  `DeepgramTTSService`. The new service uses Websockets and improves TTFB
-  latency.
-
- Updated `daily-python` to 0.22.0.
-
- `BaseTextAggregator` changes:
-
-  Modified the BaseTextAggregator type so that when text gets aggregated,
-  metadata can be associated with it. Currently, that just means a `type`, so
-  that the aggregation can be classified or described. Changes made to support
-  this:
-
-  - ⚠️ IMPORTANT: Aggregators are now expected to strip leading/trailing white
-    space characters before returning their aggregation from `aggregation()` or
-    `.text`. This way all aggregators have a consistent contract allowing
-    downstream use to know how to stitch aggregations back together.
-
-  - Introduced a new `Aggregation` dataclass to represent both the aggregated
-    `text` and a string identifying the `type` of aggregation (ex. "sentence",
-    "word", "my custom aggregation")
-
-  - ⚠️ Breaking change: `BaseTextAggregator.text` now returns an `Aggregation`
-    (instead of `str`).
-
-    Before:
-
-    ```python
-    aggregated_text = myAggregator.text
-    ```
-
-    Now:
-
-    ```python
-    aggregated_text = myAggregator.text.text
-    ```
-
-  - ⚠️ Breaking change: `BaseTextAggregator.aggregate()` now returns
-    `Optional[Aggregation]` (instead of `Optional[str]`).
-
-    Before:
-
-    ```python
-    aggregation = myAggregator.aggregate(text)
-    print(f"successfully aggregated text: {aggregation}")
-    ```
-
-    Now:
-
-    ```python
-    aggregation = myAggregator.aggregate(text)
-    if aggregation:
-      print(f"successfully aggregated text: {aggregation.text}")
-    ```
-
-  - `SimpleTextAggregator`, `SkipTagsAggregator`, `PatternPairAggregator`
-    updated to produce/consume `Aggregation` objects.
-
-  - All uses of the above Aggregators have been updated accordingly.
-
- Augmented the `PatternPairAggregator` so that matched patterns can be treated
-  as their own aggregation, taking advantage of the new. To that end:
-
-  - Introduced a new, preferred version of `add_pattern` to support a new option
-    for treating a match as a separate aggregation returned from
-    `aggregate()`. This replaces the now deprecated `add_pattern_pair` method
-    and you provide a `MatchAction` in lieu of the `remove_match` field.
-
-    - `MatchAction` enum: `REMOVE`, `KEEP`, `AGGREGATE`, allowing customization
-      for how a match should be handled.
-
-      - `REMOVE`: The text along with its delimiters will be removed from the
-        streaming text. Sentence aggregation will continue on as if this text
-        did not exist.
-
-      - `KEEP`: The delimiters will be removed, but the content between them
-        will be kept. Sentence aggregation will continue on with the internal
-        text included.
-
-      - `AGGREGATE`: The delimiters will be removed and the content between will
-        be treated as a separate aggregation. Any text before the start of the
-        pattern will be returned early, whether or not a complete sentence was
-        found. Then the pattern will be returned. Then the aggregation will
-        continue on sentence matching after the closing delimiter is found. The
-        content between the delimiters is not aggregated by sentence. It is
-        aggregated as one single block of text.
-
-    - `PatternMatch` now extends `Aggregation` and provides richer info to
-      handlers.
-
-  - ⚠️ Breaking change: The `PatternMatch` type returned to handlers registered
-    via `on_pattern_match` has been updated to subclass from the new
-    `Aggregation` type, which means that `content` has been replaced with
-    `text` and `pattern_id` has been replaced with `type`:
-
-    ```python
-    async dev on_match_tag(match: PatternMatch):
-       pattern = match.type # instead of match.pattern_id
-       text = match.text # instead of match.content
-    ```
-
- `TextFrame` now includes the field `append_to_context` to support setting
-  whether or not the encompassing text should be added to the LLM context (by
-  the LLM assistant aggregator). It defaults to `True`.
-
- `TTSService` base class updates:
-
-  - `TTSService`s now accept a new `skip_aggregator_types` to avoid speaking
-    certain aggregation types (now determined/returned by the aggregator)
-
-  - Introduced the ability to do a just-in-time transform of text before it gets
-    sent to the TTS service via callbacks you can set up via a new init field,
-    `text_transforms` or a new method `add_text_transformer()`. This makes it
-    possible to do things like introduce TTS-specific tags for spelling or
-    emotion or change the pronunciation of something on the
-    fly. `remove_text_transformer` has also been added to support removing a
-    registered transform callback.
-
-  - TTS services push `AggregatedTextFrame` in addition to `TTSTextFrame`s when
-    either an aggregation occurs that should not be spoken or when the TTS
-    service supports word-by-word timestamping. In the latter case, the
-    `TTSService` preliminarily generates an `AggregatedTextFrame`, aggregated by
-    sentence to generate the full sentence content as early as possible.
-
- Updated `CartesiaTTSService`:
-
-  - Modified use of custom default text_aggregator to avoid deprecation warnings
-    and push users towards use of transformers or the `LLMTextProcessor`
-
-  - Added convenience methods for taking advantage of Cartesia's SSML tags:
-    spell, emotion, pauses, volume, and speed.
-
- Updated `RimeTTSService`:
-
-  - Modified use of custom default text_aggregator to avoid deprecation warnings
-    and push users towards use of transformers or the `LLMTextProcessor`
-
-  - Added convenience methods for taking advantage of Rime's customization
-    options: spell, pauses, pronunciations, and inline speed control.
-
-### Deprecated
-
- The TTS constructor field, `text_aggregator` is deprecated in favor of the new
-  `LLMTextProcessor`. TTSServices still have an internal aggregator for support
-  of default behavior, but if you want to override the aggregation behavior, you
-  should use the new processor.
-
- The RTVI `bot-transcription` event is deprecated in favor of the new
-  `bot-output` message which is the canonical representation of bot output
-  (spoken or not). The code still emits a transcription message for backwards
-  compatibility while transition occurs.
-
- Deprecated `add_pattern_pair` in the `PatternPairAggregator` which takes a
-  `pattern_id` and `remove_match` field in favor of the new `add_pattern` method
-  which takes a `type` and an `action`
-
- `english_normalization` input parameter for `MiniMaxHttpTTSService` is
-  deprecated, use `test_normalization` instead.
-
-### Fixed
-
- Fixed an issue in `AWSBedrockLLMService` where the `aws_region` arg was
-  always set to `us-east-1` when providing an AWS_REGION env var.
-
- Fixed an issue with `DeepgramFluxSTTService` where it sometimes failed to reconnect.
-
- Fixed an issue in `ElevenLabsRealtimeSTTService` where dynamic language
-  updates were not working.
-
- Fixed an issue in `ElevenLabsRealtimeSTTService` where setting the sample
-  rate would result in transcripts failing.
-
- Fixed `InworldTTSService` audio config payload to use camelCase keys expected
-  by the Inworld API.
-
-## [0.0.95] - 2025-11-18
-
-### Added
-
- Added ai-coustics integrated VAD (`AICVADAnalyzer`) with `AICFilter` factory and
-  example wiring; leverages the enhancement model for robust detection with no
-  ONNX dependency or added processing complexity.
-
- Added a watchdog to `DeepgramFluxSTTService` to prevent dangling tasks in case the
-  user was speaking and we stop receiving audio.
-
- Introduced a minimum confidence parameter in `DeepgramFluxSTTService` to avoid
-  generating transcriptions below a defined threshold.
-
- Added `ElevenLabsRealtimeSTTService` which implements the Realtime STT
-  service from ElevenLabs.
-
- Added word-level timestamps support to Hume TTS service
-
-### Changed
-
- ⚠️ Breaking change: `LLMContext.create_image_message()`,
-  `LLMContext.create_audio_message()`, `LLMContext.add_image_frame_message()`
-  and `LLMContext.add_audio_frames_message()` are now async methods. This fixes
-  an issue where the asyncio event loop would be blocked while encoding audio or
-  images.
-
- `ConsumerProcessor` now queues frames from the producer internally instead of
-  pushing them directly. This allows us to subclass consumer processors and
-  manipulate frames before they are pushed.
-
- `BaseTextFilter` only require subclasses to implement the `filter()` method.
-
- Extracted the logic for retrying connections, and create a new `send_with_retry`
-  method inside `WebSocketService`.
-
- Refactored `DeepgramFluxSTTService` to automatically reconnect if sending a
-  message fails.
-
- Updated all STT and TTS services to use consistent error handling pattern with
-  `push_error()` method for better pipeline error event integration.
-
- Added support for `maybe_capture_participant_camera()` and
-  `maybe_capture_participant_screen()` for `SmallWebRTCTransport` in the runner
-  utils.
-
- Added Hindi support for Rime TTS services.
-
- Updated `GeminiTTSService` to use Google Cloud Text-to-Speech streaming API
-  instead of the deprecated Gemini API. Now uses `credentials` /
-  `credentials_path` for authentication. The `api_key` parameter is deprecated.
-  Also, added support for `prompt` parameter for style instructions and
-  expressive markup tags. Significantly improved latency with streaming
-  synthesis.
-
- Updated language mappings for the Google and Gemini TTS services to match
-  official documentation.
-
-### Deprecated
-
- The `api_key` parameter in `GeminiTTSService` is deprecated. Use
-  `credentials` or `credentials_path` instead for Google Cloud authentication.
-
-### Fixed
-
- Fixed a `SimliVideoService` connection issue.
-
- Fixed an issue in the `Runner` where, when using `SmallWebRTCTransport`, the
-  `request_data` was not being passed to the `SmallWebRTCRunnerArguments` body.
-
- Fixed subtle issue of assistant context messages ending up with double spaces
-  between words or sentences.
-
- Fixed an issue where `NeuphonicTTSService` wasn't pushing `TTSTextFrame`s,
-  meaning assistant messages weren't being written to context.
-
- Fixed an issue with OpenTelemetry where tracing wasn't correctly displaying
-  LLM completions and tools when using the universal `LLMContext`.
-
- Fixed issue where `DeepgramFluxSTTService` failed to connect if passing a
-  `keyterm` or `tag` containing a space.
-
- Prevented `HeyGenVideoService` from automatically disconnecting after 5 minutes.
-
-## [0.0.94] - 2025-11-10
-
-### Changed
-
- Added support for retrying `SpeechmaticsTTSService` when it returns a 503
-  error. Default values in `InputParams`.
-
-### Deprecated
-
- The `KrispFilter` is deprecated and will be removed in a future version. Use
-  the `KrispVivaFilter` instead.
-
-### Removed
-
- `LivekitFrameSerializer` has been removed. Use `LiveKitTransport` instead.
-
-### Fixed
-
- Fixed a bug related to `LLMAssistantAggregator` where spaces were sometimes
-  missing from assistant messages in context.
-
-## [0.0.93] - 2025-11-07
-
-### Added
-
- Added support for Sarvam Speech-to-Text service (`SarvamSTTService`) with
-  streaming WebSocket support for `saarika` (STT) and `saaras` (STT-translate)
-  models.
-
- Added support for passing in a `ToolsSchema` in lieu of a list of provider-
-  specific dicts when initializing `OpenAIRealtimeLLMService` or when updating
-  it using `LLMUpdateSettingsFrame`.
-
- Added `TransportParams.audio_out_silence_secs`, which specifies how many
-  seconds of silence to output when an `EndFrame` reaches the output
-  transport. This can help ensure that all audio data is fully delivered to
-  clients.
-
- Added new `FrameProcessor.broadcast_frame()` method. This will push two
-  instances of a given frame class, one upstream and the other downstream.
-
-  ```python
-  await self.broadcast_frame(UserSpeakingFrame)
-  ```
-
- Added `MetricsLogObserver` for logging performance metrics from `MetricsFrame`
-  instances. Supports filtering via `include_metrics` parameter to control which
-  metrics types are logged (TTFB, processing time, LLM token usage, TTS usage,
-  smart turn metrics).
-
- Added `pronunciation_dictionary_locators` to `ElevenLabsTTSService` and
-  `ElevenLabsHttpTTSService`.
-
- Added support for loading external observers. You can now register custom
-  pipeline observers by setting the `PIPECAT_OBSERVER_FILES` environment
-  variable. This variable should contain a colon-separated list of Python files
-  (e.g. `export PIPECAT_OBSERVER_FILES="observer1.py:observer2.py:..."`). Each
-  file must define a function with the following signature:
-
-  ```python
-  async def create_observers(task: PipelineTask) -> Iterable[BaseObserver]:
-      ...
-  ```
-
- Added support for new sonic-3 languages in `CartesiaTTSService` and
-  `CartesiaHttpTTSService`.
+- Refactored pipeline architecture by introducing a new `PipelineNode`
+  abstraction.  Frame processors are now standalone async iterators, and
+  `PipelineNode` is responsible for routing frames upstream or downstream. This
+  decouples frame processors from direct linking, simplifies processor reuse,
+  and provides a clearer separation between processing logic and pipeline
+  wiring. This is an internal, transparent improvement and does not require any
+  changes to existing frame processor code.

 - `EndFrame` and `EndTaskFrame` have an optional `reason` field to indicate why
  the pipeline is being ended.

 - `CancelFrame` and `CancelTaskFrame` have an optional `reason` field to
  indicate why the pipeline is being canceled. This can be also specified when
-  you cancel a task with `PipelineTask.cancel(reason="cancellation reason")`.
-
- Added `include_prob_metrics` parameter to Whisper STT services to enable access
-  to probability metrics from transcription results.
-
- Added utility functions `extract_whisper_probability()`,
-  `extract_openai_gpt4o_probability()`, and `extract_deepgram_probability()` to
-  extract probability metrics from `TranscriptionFrame` objects for Whisper-based,
-  OpenAI GPT-4o-transcribe, and Deepgram STT services respectively.
-
- Added `LLMSwitcher.register_direct_function()`. It works much like
-  `LLMSwitcher.register_function()` in that it's a shorthand for registering
-  functions on all LLMs in the switcher, but for direct functions.
-
- Added `LLMSwitcher.register_direct_function()`. It works much like
-  `LLMSwitcher.register_function()` in that it's a shorthand for registering
-  a function on all LLMs in the switcher, except this new method takes a direct
-  function (a `FunctionSchema`-less function).
-
- Added `MCPClient.get_tools_schema()` and `MCPClient.register_tools_schema()`
-  as a two-step alternative to `MCPClient.register_tools()`, to allow users to
-  pass MCP tools to, say, `GeminiLiveLLMService` (as well as other
-  speech-to-speech services) in the constructor.
-
- Added support for passing in an `LLMSwicher` to `MCPClient.register_tools()`
-  (as well as the new `MCPClient.register_tools_schema()`).
-
- Added `cpu_count` parameter to `LocalSmartTurnAnalyzerV3`. This is set to `1`
-  by default for more predictable performance on low-CPU systems.
-
-### Changed
-
- Updated `simli-ai` to 0.1.25.
-
- `STTMuteFilter` no longer sends `STTMuteFrame` to the STT service. The filter
-  now blocks frames locally without instructing the STT service to stop
-  processing audio. This prevents inactivity-related errors (such as 409 errors
-  from Google STT) while maintaining the same muting behavior at the application
-  level. Important: The STTMuteFilter should be placed _after_ the STT service
-  itself.
-
- Improved `GoogleSTTService` error handling to properly catch gRPC `Aborted`
-  exceptions (corresponding to 409 errors) caused by stream inactivity. These
-  exceptions are now logged at DEBUG level instead of ERROR level, since they
-  indicate expected behavior when no audio is sent for 10+ seconds (e.g., during
-  long silences or when audio input is blocked). The service automatically
-  reconnects when this occurs.
-
- Bumped the `fastapi` dependency's upperbound to `<0.122.0`.
-
- Updated the default model for `GoogleVertexLLMService` to `gemini-2.5-flash`.
-
- Updated the `GoogleVertexLLMService` to use the `GoogleLLMService` as a base
-  class instead of the `OpenAILLMService`.
-
- Updated STT and TTS services to pass through unverified language codes with a
-  warning instead of returning None. This allows developers to use newly
-  supported languages before Pipecat's service classes are updated, while still
-  providing guidance on verified languages.
-
-### Removed
-
- Removed `needs_mcp_alternate_schema()` from `LLMService`. The mechanism that
-  relied on it went away.
+  you cancel a task with `PipelineTask.cancel(reason="cancellation your
+  reason")`.

 ### Fixed

- Restore backwards compatibility for vision/image features (broken in 0.0.92)
-  when using non-universal context and assistant aggregators.
-
- Fixed `DeepgramSTTService._disconnect()` to properly await `is_connected()`
-  method call, which is an async coroutine in the Deepgram SDK.
-
- Fixed an issue where the `SmallWebRTCRequest` dataclass in runner would scrub
-  arbitrary request data from client due to camelCase typing. This fixes data
-  passthrough for JS clients where `APIRequest` is used.
-
- Fixed a bug in `GeminiLiveLLMService` where in some circumstances it wouldn't
-  respond after a tool call.
-
- Fixed `GeminiLiveLLMService` session resumption after a connection timeout.
-
 - `GeminiLiveLLMService` now properly supports context-provided system
-  instruction and tools.
-
- Fixed `GoogleLLMService` token counting to avoid double-counting tokens when
-  Gemini sends usage metadata across multiple streaming chunks.
+  instruction and tools

 ## [0.0.92] - 2025-10-31 🎃 "The Haunted Edition" 👻

 ### Added

+- Added supprt for Sarvam Speech-to-Text service (`SarvamSTTService`) with
+  streaming WebSocket support for `saarika` (STT) and `saaras` (STT-translate)
+  models.
+
 - Added a new `DeepgramHttpTTSService`, which delivers a meaningful reduction
  in latency when compared to the `DeepgramTTSService`.

--- a/COMMUNITY_INTEGRATIONS.md
+++ b/COMMUNITY_INTEGRATIONS.md
@@ -79,7 +79,7 @@ Once your PR is submitted, post in the `#community-integrations` Discord channel

 **Examples:**

- [NvidiaSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/nvidia/stt.py)
+- [RivaSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/riva/stt.py)
 - [FalSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/fal/stt.py)

 #### Key requirements:
--- a/README.md
+++ b/README.md
@@ -74,7 +74,7 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout

 | Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
 | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper)                                                                                                                                                                                          |
+| Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper)                                                                                                                                                                                                                                                        |
 | LLMs                | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together)                                                                                                                                                                                                                              |
 | Text-to-Speech      | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
 | Speech-to-Speech    | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
--- a/docs/api/conf.py
+++ b/docs/api/conf.py
@@ -119,6 +119,7 @@ def import_core_modules():
        "pipecat.observers",
        "pipecat.runner",
        "pipecat.serializers",
+        "pipecat.sync",
        "pipecat.transcriptions",
        "pipecat.utils",
    ]
--- a/docs/api/index.rst
+++ b/docs/api/index.rst
@@ -30,6 +30,7 @@ Quick Links
   Runner <api/pipecat.runner>
   Serializers <api/pipecat.serializers>
   Services <api/pipecat.services>
+   Sync <api/pipecat.sync>
   Transcriptions <api/pipecat.transcriptions>
   Transports <api/pipecat.transports>
-   Utils <api/pipecat.utils>
+   Utils <api/pipecat.utils>
--- a/env.example
+++ b/env.example
@@ -44,7 +44,6 @@ DAILY_SAMPLE_ROOM_URL=https://...

 # Deepgram
 DEEPGRAM_API_KEY=...
-SAGEMAKER_ENDPOINT_NAME=...

 # DeepSeek
 DEEPSEEK_API_KEY=...
--- a/examples/foundational/01c-nvidia-riva-tts.py
+++ b/examples/foundational/01c-nvidia-riva-tts.py
@@ -15,7 +15,7 @@ from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineTask
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.nvidia.tts import NvidiaTTSService
+from pipecat.services.riva.tts import FastPitchTTSService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -36,7 +36,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    tts = NvidiaTTSService(api_key=os.getenv("NVIDIA_API_KEY"))
+    tts = FastPitchTTSService(api_key=os.getenv("NVIDIA_API_KEY"))

    task = PipelineTask(
        Pipeline([tts, transport.output()]),
--- a/examples/foundational/04-transports-small-webrtc.py
+++ b/examples/foundational/04-transports-small-webrtc.py
@@ -77,7 +77,7 @@ async def run_example(webrtc_connection: SmallWebRTCConnection):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/04a-transports-daily.py
+++ b/examples/foundational/04a-transports-daily.py
@@ -60,7 +60,7 @@ async def main():
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/04b-transports-livekit.py
+++ b/examples/foundational/04b-transports-livekit.py
@@ -69,7 +69,7 @@ async def main():
            "role": "system",
            "content": "You are a helpful LLM in a WebRTC call. "
            "Your goal is to demonstrate your capabilities in a succinct way. "
-            "Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. "
+            "Your output will be converted to audio so don't include special characters in your answers. "
            "Respond to what the user said in a creative and helpful way.",
        },
    ]
--- a/examples/foundational/06-listen-and-respond.py
+++ b/examples/foundational/06-listen-and-respond.py
@@ -100,7 +100,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/06a-image-sync.py
+++ b/examples/foundational/06a-image-sync.py
@@ -113,7 +113,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07-interruptible-cartesia-http.py
+++ b/examples/foundational/07-interruptible-cartesia-http.py
@@ -21,8 +21,8 @@ from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.stt import CartesiaSTTService
 from pipecat.services.cartesia.tts import CartesiaHttpTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -59,7 +59,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

    tts = CartesiaHttpTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
@@ -71,7 +71,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07-interruptible.py
+++ b/examples/foundational/07-interruptible.py
@@ -13,17 +13,16 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import Frame, LLMContextFrame, LLMRunFrame
+from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.stt import CartesiaSTTService
 from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -31,44 +30,6 @@ from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams

 load_dotenv(override=True)

-FILTERED_WORDS = ["apple", "banana", "car"]
-
-
-class ContentFilterProcessor(FrameProcessor):
-    """Processor that filters LLMContextFrames containing specific words.
-
-    If the user's message contains any of the filtered words, the context
-    is replaced with a message indicating the assistant cannot respond.
-    """
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        if isinstance(frame, LLMContextFrame):
-            # Check the last user message for filtered words
-            messages = frame.context.messages
-            if messages:
-                last_message = messages[-1]
-                content = last_message.get("content", "")
-                if isinstance(content, str):
-                    content_lower = content.lower()
-                    if any(word in content_lower for word in FILTERED_WORDS):
-                        logger.info(f"Filtered content detected: {content}")
-                        # Create a new context with a filtered response instruction
-                        filtered_context = LLMContext(
-                            messages=[
-                                {
-                                    "role": "system",
-                                    "content": "The user is asking about something you cannot give an answer about. Tell them you don't know how to respond.",
-                                }
-                            ]
-                        )
-                        await self.push_frame(LLMContextFrame(filtered_context), direction)
-                        return
-
-        await self.push_frame(frame, direction)
-
-
 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
 # selected.
@@ -97,7 +58,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+    stt = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))

    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
@@ -109,20 +70,18 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

    context = LLMContext(messages)
    context_aggregator = LLMContextAggregatorPair(context)
-    content_filter = ContentFilterProcessor()

    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
            stt,
            context_aggregator.user(),  # User responses
-            content_filter,  # Content filter
            llm,  # LLM
            tts,  # TTS
            transport.output(),  # Transport bot output
--- a/examples/foundational/07a-interruptible-speechmatics-vad.py
+++ b/examples/foundational/07a-interruptible-speechmatics-vad.py
@@ -121,7 +121,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                "content": (
                    "You are a helpful British assistant called Sarah. "
                    "Your goal is to demonstrate your capabilities in a succinct way. "
-                    "Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. "
+                    "Your output will be converted to audio so don't include special characters in your answers. "
                    "Always include punctuation in your responses. "
                    "Give very short replies - do not give longer replies unless strictly necessary. "
                    "Respond to what the user said in a concise, funny, creative and helpful way. "
--- a/examples/foundational/07a-interruptible-speechmatics.py
+++ b/examples/foundational/07a-interruptible-speechmatics.py
@@ -111,7 +111,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                "content": (
                    "You are a helpful British assistant called Sarah. "
                    "Your goal is to demonstrate your capabilities in a succinct way. "
-                    "Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. "
+                    "Your output will be converted to audio so don't include special characters in your answers. "
                    "Always include punctuation in your responses. "
                    "Give very short replies - do not give longer replies unless strictly necessary. "
                    "Respond to what the user said in a concise, funny, creative and helpful way. "
--- a/examples/foundational/07aa-interruptible-soniox.py
+++ b/examples/foundational/07aa-interruptible-soniox.py
@@ -70,7 +70,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07ab-interruptible-inworld-http.py
+++ b/examples/foundational/07ab-interruptible-inworld-http.py
@@ -81,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are very knowledgable about dogs. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are very knowledgable about dogs. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/07ac-interruptible-asyncai-http.py
+++ b/examples/foundational/07ac-interruptible-asyncai-http.py
@@ -76,7 +76,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/07ac-interruptible-asyncai.py
+++ b/examples/foundational/07ac-interruptible-asyncai.py
@@ -72,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07ad-interruptible-aicoustics.py
+++ b/examples/foundational/07ad-interruptible-aicoustics.py
@@ -15,6 +15,7 @@ from loguru import logger
 from pipecat.audio.filters.aic_filter import AICFilter
 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
@@ -47,7 +48,7 @@ def _create_aic_filter() -> AICFilter:

    return AICFilter(
        license_key=license_key,
-        enhancement_level=0.5,
+        enhancement_level=1.0,
    )


@@ -55,33 +56,27 @@ def _create_aic_filter() -> AICFilter:
 # instantiated. The function will be called when the desired transport gets
 # selected.
 transport_params = {
-    "daily": lambda: (
-        lambda aic: DailyParams(
-            audio_in_enabled=True,
-            audio_out_enabled=True,
-            vad_analyzer=aic.create_vad_analyzer(lookback_buffer_size=6.0, sensitivity=6.0),
-            turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-            audio_in_filter=aic,
-        )
-    )(_create_aic_filter()),
-    "twilio": lambda: (
-        lambda aic: FastAPIWebsocketParams(
-            audio_in_enabled=True,
-            audio_out_enabled=True,
-            vad_analyzer=aic.create_vad_analyzer(lookback_buffer_size=6.0, sensitivity=6.0),
-            turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-            audio_in_filter=aic,
-        )
-    )(_create_aic_filter()),
-    "webrtc": lambda: (
-        lambda aic: TransportParams(
-            audio_in_enabled=True,
-            audio_out_enabled=True,
-            vad_analyzer=aic.create_vad_analyzer(lookback_buffer_size=6.0, sensitivity=6.0),
-            turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-            audio_in_filter=aic,
-        )
-    )(_create_aic_filter()),
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+        audio_in_filter=_create_aic_filter(),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+        audio_in_filter=_create_aic_filter(),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+        audio_in_filter=_create_aic_filter(),
+    ),
 }


@@ -100,7 +95,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07ae-interruptible-hume.py
+++ b/examples/foundational/07ae-interruptible-hume.py
@@ -13,29 +13,24 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame, TTSTextFrame
-from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
+from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import (
-    LLMContextAggregatorPair,
-)
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIObserver, RTVIProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.hume.tts import HUME_SAMPLE_RATE, HumeTTSService
 from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.base_output import BaseOutputTransport
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams

 load_dotenv(override=True)

-
 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
 # selected.
@@ -77,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

@@ -93,7 +88,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            stt,
            context_aggregator.user(),  # User responses
            llm,  # LLM
-            tts,  # TTS (HumeTTSService with word timestamps)
+            tts,  # TTS
            transport.output(),  # Transport bot output
            context_aggregator.assistant(),  # Assistant spoken responses
        ]
@@ -107,14 +102,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            audio_out_sample_rate=HUME_SAMPLE_RATE,
        ),
        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-        observers=[
-            RTVIObserver(rtvi),
-            DebugLogObserver(
-                frame_types={
-                    TTSTextFrame: (BaseOutputTransport, FrameEndpoint.SOURCE),
-                }
-            ),
-        ],
+        observers=[RTVIObserver(rtvi)],
    )

    @rtvi.event_handler("on_client_ready")
@@ -124,9 +112,6 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
-        logger.info(
-            "💡 Word timestamps are enabled! Watch the console for TTSTextFrame logs showing each word with its PTS."
-        )
        # Kick off the conversation.
        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
        await task.queue_frames([LLMRunFrame()])
--- a/examples/foundational/07c-interruptible-deepgram-flux.py
+++ b/examples/foundational/07c-interruptible-deepgram-flux.py
@@ -52,10 +52,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = DeepgramFluxSTTService(
-        api_key=os.getenv("DEEPGRAM_API_KEY"),
-        params=DeepgramFluxSTTService.InputParams(min_confidence=0.3),
-    )
+    stt = DeepgramFluxSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")

@@ -64,7 +61,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07c-interruptible-deepgram-http.py
+++ b/examples/foundational/07c-interruptible-deepgram-http.py
@@ -75,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/07c-interruptible-deepgram-sagemaker.py
+++ b/examples/foundational/07c-interruptible-deepgram-sagemaker.py
@@ -1,137 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.aws.llm import AWSBedrockLLMService
-from pipecat.services.deepgram.stt_sagemaker import DeepgramSageMakerSTTService
-from pipecat.services.deepgram.tts import DeepgramTTSService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    # Initialize Deepgram SageMaker STT Service
-    # This requires:
-    # - AWS credentials configured (via environment variables or AWS CLI)
-    # - A deployed SageMaker endpoint with Deepgram model
-    stt = DeepgramSageMakerSTTService(
-        endpoint_name=os.getenv("SAGEMAKER_ENDPOINT_NAME"),
-        region=os.getenv("AWS_REGION"),
-    )
-
-    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")
-
-    llm = AWSBedrockLLMService(
-        aws_region=os.getenv("AWS_REGION"),
-        model="us.amazon.nova-pro-v1:0",
-        params=AWSBedrockLLMService.InputParams(temperature=0.8),
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/07c-interruptible-deepgram-vad.py
+++ b/examples/foundational/07c-interruptible-deepgram-vad.py
@@ -68,7 +68,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07c-interruptible-deepgram.py
+++ b/examples/foundational/07c-interruptible-deepgram.py
@@ -69,7 +69,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07d-interruptible-elevenlabs-http.py
+++ b/examples/foundational/07d-interruptible-elevenlabs-http.py
@@ -79,7 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/07d-interruptible-elevenlabs.py
+++ b/examples/foundational/07d-interruptible-elevenlabs.py
@@ -22,7 +22,7 @@ from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.elevenlabs.stt import ElevenLabsRealtimeSTTService
+from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
@@ -60,7 +60,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = ElevenLabsRealtimeSTTService(api_key=os.getenv("ELEVENLABS_API_KEY"))
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

    tts = ElevenLabsTTSService(
        api_key=os.getenv("ELEVENLABS_API_KEY", ""),
@@ -72,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07e-interruptible-playht-http.py
+++ b/examples/foundational/07e-interruptible-playht-http.py
@@ -72,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07e-interruptible-playht.py
+++ b/examples/foundational/07e-interruptible-playht.py
@@ -74,7 +74,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07f-interruptible-azure-http.py
+++ b/examples/foundational/07f-interruptible-azure-http.py
@@ -1,135 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.azure.llm import AzureLLMService
-from pipecat.services.azure.stt import AzureSTTService
-from pipecat.services.azure.tts import AzureHttpTTSService
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = AzureSTTService(
-        api_key=os.getenv("AZURE_SPEECH_API_KEY"),
-        region=os.getenv("AZURE_SPEECH_REGION"),
-    )
-
-    tts = AzureHttpTTSService(
-        api_key=os.getenv("AZURE_SPEECH_API_KEY"),
-        region=os.getenv("AZURE_SPEECH_REGION"),
-    )
-
-    llm = AzureLLMService(
-        api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
-        endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
-        model=os.getenv("AZURE_CHATGPT_MODEL"),
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User responses
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/07f-interruptible-azure.py
+++ b/examples/foundational/07f-interruptible-azure.py
@@ -78,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07g-interruptible-openai.py
+++ b/examples/foundational/07g-interruptible-openai.py
@@ -72,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are very knowledgable about dogs. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are very knowledgable about dogs. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07h-interruptible-openpipe.py
+++ b/examples/foundational/07h-interruptible-openpipe.py
@@ -77,7 +77,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07i-interruptible-xtts.py
+++ b/examples/foundational/07i-interruptible-xtts.py
@@ -75,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/07j-interruptible-gladia.py
+++ b/examples/foundational/07j-interruptible-gladia.py
@@ -81,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": f"You are a helpful LLM. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": f"You are a helpful LLM. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07k-interruptible-lmnt.py
+++ b/examples/foundational/07k-interruptible-lmnt.py
@@ -68,7 +68,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07l-interruptible-groq.py
+++ b/examples/foundational/07l-interruptible-groq.py
@@ -71,7 +71,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07m-interruptible-aws.py
+++ b/examples/foundational/07m-interruptible-aws.py
@@ -74,7 +74,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07n-interruptible-gemini-image.py
+++ b/examples/foundational/07n-interruptible-gemini-image.py
@@ -94,7 +94,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07n-interruptible-gemini.py
+++ b/examples/foundational/07n-interruptible-gemini.py
@@ -4,6 +4,24 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+"""
+A conversational AI bot using Gemini for both LLM and TTS.
+
+This example demonstrates how to use Gemini's TTS capabilities with the new
+GeminiTTSService, which uses Gemini's TTS-specific models instead of Google Cloud TTS.
+
+Features showcased:
+- Gemini LLM for conversation
+- Gemini TTS with natural voice control
+- Support for different voice personalities
+- Style and tone control through natural language prompts
+
+Run with:
+    python examples/foundational/gemini-tts.py
+
+Make sure to set your environment variables:
+    export GOOGLE_API_KEY=your_api_key_here
+"""

 import os

@@ -66,13 +84,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    )

    tts = GeminiTTSService(
-        credentials=os.getenv("GOOGLE_TEST_CREDENTIALS"),
-        model="gemini-2.5-flash-tts",
+        api_key=os.getenv("GOOGLE_API_KEY"),
+        model="gemini-2.5-flash-preview-tts",  # TTS-specific model
        voice_id="Charon",
-        params=GeminiTTSService.InputParams(
-            language=Language.EN_US,
-            prompt="You are a helpful AI assistant. Speak in a natural, conversational tone.",
-        ),
+        params=GeminiTTSService.InputParams(language=Language.EN_US),
    )

    llm = GoogleLLMService(
@@ -86,22 +101,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            "role": "system",
            "content": """You are a helpful AI assistant in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way.

-            IMPORTANT: You're using Gemini TTS which supports expressive markup tags. You can use these tags in your responses:
-            - [sigh] - Insert a sigh sound
-            - [laughing] - Insert a laugh
-            - [uhm] - Insert a hesitation sound
-            - [whispering] - Speak the next part in a whisper
-            - [shouting] - Speak the next part louder
-            - [extremely fast] - Speak the next part very quickly
-            - [short pause], [medium pause], [long pause] - Add pauses for dramatic effect
+            IMPORTANT: Since you're using Gemini TTS which supports natural voice control, you can include speaking instructions in your responses. For example:
+            - "Say cheerfully: Welcome to our conversation!"
+            - "Read this in a calm, professional tone: Here are the details you requested."
+            - "Speak in an excited whisper: I have some great news to share!"
+            - "Say slowly and clearly: Let me explain this step by step."

-            Examples:
-            - "Well [sigh] that's a tricky question."
-            - "[laughing] That's a great joke!"
-            - "[whispering] Let me tell you a secret."
-            - "The answer is... [long pause] ...42!"
+            Feel free to use natural language instructions to control your voice style, tone, pace, and emotion. The TTS system will interpret these instructions and adjust the speech accordingly.

-            Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.""",
+            Your output will be converted to audio, so avoid special characters in your answers. Respond to what the user said in a creative and helpful way.""",
        },
    ]

@@ -132,11 +140,11 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
-        # Kick off the conversation
+        # Kick off the conversation with a styled introduction
        messages.append(
            {
                "role": "system",
-                "content": "Hello! I'm your AI assistant. I can help you with a variety of tasks. What would you like to know?",
+                "content": "Say cheerfully and warmly: Hello! I'm your AI assistant powered by Gemini's new TTS technology. I can speak with different voices, tones, and styles. How can I help you today?",
            }
        )
        await task.queue_frames([LLMRunFrame()])
--- a/examples/foundational/07n-interruptible-google-http.py
+++ b/examples/foundational/07n-interruptible-google-http.py
@@ -1,139 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-
-import os
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_context import LLMContext
-from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.runner.types import RunnerArguments
-from pipecat.runner.utils import create_transport
-from pipecat.services.google.llm import GoogleLLMService
-from pipecat.services.google.stt import GoogleSTTService
-from pipecat.services.google.tts import GoogleHttpTTSService, GoogleTTSService
-from pipecat.transcriptions.language import Language
-from pipecat.transports.base_transport import BaseTransport, TransportParams
-from pipecat.transports.daily.transport import DailyParams
-from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-
-load_dotenv(override=True)
-
-# We store functions so objects (e.g. SileroVADAnalyzer) don't get
-# instantiated. The function will be called when the desired transport gets
-# selected.
-transport_params = {
-    "daily": lambda: DailyParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "twilio": lambda: FastAPIWebsocketParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-    "webrtc": lambda: TransportParams(
-        audio_in_enabled=True,
-        audio_out_enabled=True,
-        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
-    ),
-}
-
-
-async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    logger.info(f"Starting bot")
-
-    stt = GoogleSTTService(
-        params=GoogleSTTService.InputParams(languages=Language.EN_US, model="chirp_3"),
-        credentials=os.getenv("GOOGLE_TEST_CREDENTIALS"),
-        location="us",
-    )
-
-    tts = GoogleHttpTTSService(
-        voice_id="en-US-Chirp3-HD-Charon",
-        params=GoogleHttpTTSService.InputParams(language=Language.EN_US),
-        credentials=os.getenv("GOOGLE_TEST_CREDENTIALS"),
-    )
-
-    llm = GoogleLLMService(
-        api_key=os.getenv("GOOGLE_API_KEY"),
-        model="gemini-2.5-flash",
-        # turn on thinking if you want it
-        # params=GoogleLLMService.InputParams(extra={"thinking_config": {"thinking_budget": 4096}}),)
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = LLMContext(messages)
-    context_aggregator = LLMContextAggregatorPair(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),  # Transport user input
-            stt,  # STT
-            context_aggregator.user(),  # User respones
-            llm,  # LLM
-            tts,  # TTS
-            transport.output(),  # Transport bot output
-            context_aggregator.assistant(),  # Assistant spoken responses
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            enable_metrics=True,
-            enable_usage_metrics=True,
-        ),
-        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
-    )
-
-    @transport.event_handler("on_client_connected")
-    async def on_client_connected(transport, client):
-        logger.info(f"Client connected")
-        # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
-
-    @transport.event_handler("on_client_disconnected")
-    async def on_client_disconnected(transport, client):
-        logger.info(f"Client disconnected")
-        await task.cancel()
-
-    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
-
-    await runner.run(task)
-
-
-async def bot(runner_args: RunnerArguments):
-    """Main bot entry point compatible with Pipecat Cloud."""
-    transport = await create_transport(runner_args, transport_params)
-    await run_bot(transport, runner_args)
-
-
-if __name__ == "__main__":
-    from pipecat.runner.run import main
-
-    main()
--- a/examples/foundational/07n-interruptible-google.py
+++ b/examples/foundational/07n-interruptible-google.py
@@ -61,9 +61,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

    stt = GoogleSTTService(
-        params=GoogleSTTService.InputParams(languages=Language.EN_US, model="chirp_3"),
+        params=GoogleSTTService.InputParams(languages=Language.EN_US),
        credentials=os.getenv("GOOGLE_TEST_CREDENTIALS"),
-        location="us",
    )

    tts = GoogleTTSService(
@@ -82,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07o-interruptible-assemblyai.py
+++ b/examples/foundational/07o-interruptible-assemblyai.py
@@ -74,7 +74,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07p-interruptible-krisp-viva.py
+++ b/examples/foundational/07p-interruptible-krisp-viva.py
@@ -72,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07p-interruptible-krisp.py
+++ b/examples/foundational/07p-interruptible-krisp.py
@@ -72,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07q-interruptible-rime-http.py
+++ b/examples/foundational/07q-interruptible-rime-http.py
@@ -77,7 +77,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/07q-interruptible-rime.py
+++ b/examples/foundational/07q-interruptible-rime.py
@@ -71,7 +71,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07r-interruptible-riva-nim.py
+++ b/examples/foundational/07r-interruptible-riva-nim.py
@@ -22,9 +22,9 @@ from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.nvidia.llm import NvidiaLLMService
-from pipecat.services.nvidia.stt import NvidiaSTTService
-from pipecat.services.nvidia.tts import NvidiaTTSService
+from pipecat.services.nim.llm import NimLLMService
+from pipecat.services.riva.stt import RivaSTTService
+from pipecat.services.riva.tts import RivaTTSService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -59,18 +59,16 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    stt = NvidiaSTTService(api_key=os.getenv("NVIDIA_API_KEY"))
+    stt = RivaSTTService(api_key=os.getenv("NVIDIA_API_KEY"))

-    llm = NvidiaLLMService(
-        api_key=os.getenv("NVIDIA_API_KEY"), model="meta/llama-3.1-405b-instruct"
-    )
+    llm = NimLLMService(api_key=os.getenv("NVIDIA_API_KEY"), model="meta/llama-3.1-405b-instruct")

-    tts = NvidiaTTSService(api_key=os.getenv("NVIDIA_API_KEY"))
+    tts = RivaTTSService(api_key=os.getenv("NVIDIA_API_KEY"))

    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07s-interruptible-google-audio-in.py
+++ b/examples/foundational/07s-interruptible-google-audio-in.py
@@ -53,7 +53,7 @@ You are a helpful LLM in a WebRTC call. Your goals are to be helpful and brief i
 You are expert at transcribing audio to text. You will receive a mixture of audio and text input. When
 asked to transcribe what the user said, output an exact, word-for-word transcription.

-Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points.
+Your output will be converted to audio so don't include special characters in your answers.

 Each time you answer, you should respond in three parts.

--- a/examples/foundational/07t-interruptible-fish.py
+++ b/examples/foundational/07t-interruptible-fish.py
@@ -72,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07v-interruptible-neuphonic-http.py
+++ b/examples/foundational/07v-interruptible-neuphonic-http.py
@@ -76,7 +76,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/07v-interruptible-neuphonic.py
+++ b/examples/foundational/07v-interruptible-neuphonic.py
@@ -71,7 +71,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07w-interruptible-fal.py
+++ b/examples/foundational/07w-interruptible-fal.py
@@ -74,7 +74,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07x-interruptible-local.py
+++ b/examples/foundational/07x-interruptible-local.py
@@ -56,7 +56,7 @@ async def main():
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/07y-interruptible-minimax.py
+++ b/examples/foundational/07y-interruptible-minimax.py
@@ -78,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/07z-interruptible-sarvam-http.py
+++ b/examples/foundational/07z-interruptible-sarvam-http.py
@@ -15,7 +15,6 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
@@ -80,7 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

@@ -113,7 +112,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            logger.info(f"Client connected")
            # Kick off the conversation.
            messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-            await task.queue_frames([LLMRunFrame()])
+            await task.queue_frames([context_aggregator.user().get_context_frame()])

        @transport.event_handler("on_client_disconnected")
        async def on_client_disconnected(transport, client):
--- a/examples/foundational/07z-interruptible-sarvam.py
+++ b/examples/foundational/07z-interruptible-sarvam.py
@@ -77,7 +77,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/08-custom-frame-processor.py
+++ b/examples/foundational/08-custom-frame-processor.py
@@ -110,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/11-sound-effects.py
+++ b/examples/foundational/11-sound-effects.py
@@ -121,7 +121,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/12-describe-image-openai.py
+++ b/examples/foundational/12-describe-image-openai.py
@@ -66,7 +66,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
        },
    ]

@@ -110,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        # Kick off the conversation.
        image = Image.open(image_path)
-        message = await LLMContext.create_image_message(
+        message = LLMContext.create_image_message(
            image=image.tobytes(),
            format="RGB",
            size=image.size,
--- a/examples/foundational/12a-describe-image-anthropic.py
+++ b/examples/foundational/12a-describe-image-anthropic.py
@@ -66,7 +66,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
        },
    ]

@@ -110,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        # Kick off the conversation.
        image = Image.open(image_path)
-        message = await LLMContext.create_image_message(
+        message = LLMContext.create_image_message(
            image=image.tobytes(),
            format="RGB",
            size=image.size,
--- a/examples/foundational/12b-describe-image-aws.py
+++ b/examples/foundational/12b-describe-image-aws.py
@@ -73,7 +73,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
        },
    ]

@@ -117,7 +117,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        # Kick off the conversation.
        image = Image.open(image_path)
-        message = await LLMContext.create_image_message(
+        message = LLMContext.create_image_message(
            image=image.tobytes(),
            format="RGB",
            size=image.size,
--- a/examples/foundational/12c-describe-image-gemini-flash.py
+++ b/examples/foundational/12c-describe-image-gemini-flash.py
@@ -66,7 +66,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are also able to describe images.",
        },
    ]

@@ -110,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

        # Kick off the conversation.
        image = Image.open(image_path)
-        message = await LLMContext.create_image_message(
+        message = LLMContext.create_image_message(
            image=image.tobytes(),
            format="RGB",
            size=image.size,
--- a/examples/foundational/14-function-calling.py
+++ b/examples/foundational/14-function-calling.py
@@ -120,7 +120,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14c-function-calling-together.py
+++ b/examples/foundational/14c-function-calling-together.py
@@ -106,7 +106,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14d-function-calling-anthropic-video.py
+++ b/examples/foundational/14d-function-calling-anthropic-video.py
@@ -119,7 +119,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
        },
    ]

--- a/examples/foundational/14d-function-calling-aws-video.py
+++ b/examples/foundational/14d-function-calling-aws-video.py
@@ -126,7 +126,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
        },
    ]

--- a/examples/foundational/14d-function-calling-gemini-flash-video.py
+++ b/examples/foundational/14d-function-calling-gemini-flash-video.py
@@ -119,7 +119,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
        },
    ]

--- a/examples/foundational/14d-function-calling-moondream-video.py
+++ b/examples/foundational/14d-function-calling-moondream-video.py
@@ -15,21 +15,14 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import (
-    Frame,
-    LLMFullResponseEndFrame,
-    LLMFullResponseStartFrame,
-    LLMRunFrame,
-    TextFrame,
-    UserImageRequestFrame,
-)
+from pipecat.frames.frames import LLMRunFrame, UserImageRequestFrame
 from pipecat.pipeline.parallel_pipeline import ParallelPipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
    create_transport,
@@ -73,27 +66,6 @@ async def fetch_user_image(params: FunctionCallParams):
    # await params.result_callback({"result": "Image is being captured."})


-class MoondreamTextFrameWrapper(FrameProcessor):
-    """Wraps Moondream-provided TextFrames with LLM response start/end frames.
-
-    This processor detects TextFrames and automatically wraps them with
-    LLMFullResponseStartFrame and LLMFullResponseEndFrame to provide proper
-    response boundaries for downstream processors.
-    """
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        await super().process_frame(frame, direction)
-
-        # If we receive a TextFrame, wrap it with response start/end frames
-        if isinstance(frame, TextFrame):
-            await self.push_frame(LLMFullResponseStartFrame(), direction)
-            await self.push_frame(frame, direction)
-            await self.push_frame(LLMFullResponseEndFrame(), direction)
-        else:
-            # For all other frames, just pass them through
-            await self.push_frame(frame, direction)
-
-
 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
 # selected.
@@ -148,7 +120,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
        },
    ]

@@ -158,12 +130,6 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    # If you run into weird description, try with use_cpu=True
    moondream = MoondreamService()

-    # Wrap TextFrames with LLM response start/end frames, which makes Moondream
-    # output be treated like LLM responses for the purpose of context
-    # aggregation. Without this, the assistant context aggregator would ignore
-    # Moondream output (if the TTS service is disabled).
-    moondream_text_wrapper = MoondreamTextFrameWrapper()
-
    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
@@ -171,7 +137,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            context_aggregator.user(),  # User responses
            ParallelPipeline(
                [llm],  # LLM
-                [moondream, moondream_text_wrapper],
+                [moondream],
            ),
            tts,  # TTS
            transport.output(),  # Transport bot output
--- a/examples/foundational/14d-function-calling-openai-video.py
+++ b/examples/foundational/14d-function-calling-openai-video.py
@@ -119,7 +119,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. You are able to describe images from the user camera.",
        },
    ]

--- a/examples/foundational/14f-function-calling-groq.py
+++ b/examples/foundational/14f-function-calling-groq.py
@@ -104,7 +104,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14g-function-calling-grok.py
+++ b/examples/foundational/14g-function-calling-grok.py
@@ -99,7 +99,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14h-function-calling-azure.py
+++ b/examples/foundational/14h-function-calling-azure.py
@@ -107,7 +107,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14i-function-calling-fireworks.py
+++ b/examples/foundational/14i-function-calling-fireworks.py
@@ -110,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way. Start by saying hello.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. Start by saying hello.",
        },
    ]

--- a/examples/foundational/14j-function-calling-nvidia.py
+++ b/examples/foundational/14j-function-calling-nvidia.py
@@ -27,7 +27,7 @@ from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.nvidia.llm import NvidiaLLMService
+from pipecat.services.nim.llm import NimLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -75,12 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        # text_filters=[MarkdownTextFilter()],
    )

-    llm = NvidiaLLMService(
-        api_key=os.getenv("NVIDIA_API_KEY"),
-        model="nvidia/llama-3.3-nemotron-super-49b-v1.5",
-        # Recommended when turning thinking off
-        params=NvidiaLLMService.InputParams(temperature=0.0),
-    )
+    llm = NimLLMService(api_key=os.getenv("NVIDIA_API_KEY"), model="meta/llama-3.3-70b-instruct")
    # You can also register a function_name of None to get all functions
    # sent to the same callback with an additional function_name parameter.
    llm.register_function("get_current_weather", fetch_weather_from_api)
@@ -107,12 +102,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    )
    tools = ToolsSchema(standard_tools=[weather_function])
    messages = [
-        # Disable thinking by sending this message first
-        # Check the model for the corresponding "no thinking" message
-        {"role": "system", "content": "/no_think"},
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14m-function-calling-openrouter.py
+++ b/examples/foundational/14m-function-calling-openrouter.py
@@ -107,7 +107,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14n-function-calling-perplexity.py
+++ b/examples/foundational/14n-function-calling-perplexity.py
@@ -77,7 +77,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "user",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way, but try to be brief.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14q-function-calling-qwen.py
+++ b/examples/foundational/14q-function-calling-qwen.py
@@ -105,7 +105,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14r-function-calling-aws.py
+++ b/examples/foundational/14r-function-calling-aws.py
@@ -120,7 +120,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14s-function-calling-sambanova.py
+++ b/examples/foundational/14s-function-calling-sambanova.py
@@ -110,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14t-function-calling-direct.py
+++ b/examples/foundational/14t-function-calling-direct.py
@@ -106,7 +106,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14u-function-calling-ollama.py
+++ b/examples/foundational/14u-function-calling-ollama.py
@@ -122,7 +122,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14v-function-calling-openai.py
+++ b/examples/foundational/14v-function-calling-openai.py
@@ -128,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14w-function-calling-mistral.py
+++ b/examples/foundational/14w-function-calling-mistral.py
@@ -116,7 +116,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/14x-function-calling-openpipe.py
+++ b/examples/foundational/14x-function-calling-openpipe.py
@@ -126,7 +126,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/16-gpu-container-local-bot.py
+++ b/examples/foundational/16-gpu-container-local-bot.py
@@ -82,7 +82,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/17-detect-user-idle.py
+++ b/examples/foundational/17-detect-user-idle.py
@@ -72,7 +72,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/19-openai-realtime.py
+++ b/examples/foundational/19-openai-realtime.py
@@ -21,6 +21,7 @@ from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
@@ -147,8 +148,6 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
                noise_reduction=InputAudioNoiseReduction(type="near_field"),
            )
        ),
-        # In this example we provide tools through the context, but you could
-        # alternatively provide them here.
        # tools=tools,
        instructions="""You are a helpful and friendly AI.

@@ -224,15 +223,6 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
            standard_tools=[weather_function, restaurant_function, get_news_function]
        )
        await task.queue_frames([LLMSetToolsFrame(tools=new_tools)])
-        # Alternative pattern, useful if you're changing other session properties, too.
-        # (Though note that tools in your LLMContext take precedence over those
-        # in session properties, so if you have context-provided tools, prefer
-        # LLMSetToolsFrame instead, as it updates your context. Ditto for
-        # updating system instructions: send an LLMMessagesUpdateFrame with
-        # context messages updated with your new desired system message.)
-        # await task.queue_frames(
-        #     [LLMUpdateSettingsFrame(settings=SessionProperties(tools=new_tools).model_dump())]
-        # )

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/19a-azure-realtime.py
+++ b/examples/foundational/19a-azure-realtime.py
@@ -19,6 +19,7 @@ from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
--- a/examples/foundational/20a-persistent-context-openai.py
+++ b/examples/foundational/20a-persistent-context-openai.py
@@ -98,7 +98,7 @@ async def load_conversation(params: FunctionCallParams):
 messages = [
    {
        "role": "system",
-        "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+        "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
    },
 ]

--- a/examples/foundational/20c-persistent-context-anthropic.py
+++ b/examples/foundational/20c-persistent-context-anthropic.py
@@ -100,7 +100,7 @@ async def load_conversation(params: FunctionCallParams):
 messages = [
    {
        "role": "system",
-        "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a succinct, creative and helpful way. Prefer responses that are one sentence long unless you are asked for a longer or more detailed response.",
+        "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a succinct, creative and helpful way. Prefer responses that are one sentence long unless you are asked for a longer or more detailed response.",
    },
    {"role": "user", "content": "Start the call by saying the word 'hello'. Say only that word."},
    # {"role": "user", "content": ""},
--- a/examples/foundational/20d-persistent-context-gemini.py
+++ b/examples/foundational/20d-persistent-context-gemini.py
@@ -121,9 +121,8 @@ messages = [
    {
        "role": "system",
        "content": """You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your
-capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that 
-can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative
-and helpful way.
+capabilities in a succinct way. Your output will be converted to audio so don't include special
+characters in your answers. Respond to what the user said in a creative and helpful way.

 You have several tools you can use to help you.

--- a/examples/foundational/21-tavus-transport.py
+++ b/examples/foundational/21-tavus-transport.py
@@ -61,7 +61,7 @@ async def main():
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/21a-tavus-video-service.py
+++ b/examples/foundational/21a-tavus-video-service.py
@@ -80,7 +80,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        messages = [
            {
                "role": "system",
-                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
            },
        ]

--- a/examples/foundational/22-natural-conversation.py
+++ b/examples/foundational/22-natural-conversation.py
@@ -28,10 +28,10 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import LLMService
 from pipecat.services.openai.llm import OpenAIContextAggregatorPair, OpenAILLMService
+from pipecat.sync.event_notifier import EventNotifier
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-from pipecat.utils.sync.event_notifier import EventNotifier

 load_dotenv(override=True)

@@ -142,7 +142,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/22b-natural-conversation-proposal.py
+++ b/examples/foundational/22b-natural-conversation-proposal.py
@@ -45,11 +45,11 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import FunctionCallParams, LLMService
 from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.sync.base_notifier import BaseNotifier
+from pipecat.sync.event_notifier import EventNotifier
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-from pipecat.utils.sync.base_notifier import BaseNotifier
-from pipecat.utils.sync.event_notifier import EventNotifier
 from pipecat.utils.time import time_now_iso8601

 load_dotenv(override=True)
@@ -327,7 +327,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    messages = [
        {
            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
        },
    ]

--- a/examples/foundational/22c-natural-conversation-mixed-llms.py
+++ b/examples/foundational/22c-natural-conversation-mixed-llms.py
@@ -46,11 +46,11 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import FunctionCallParams, LLMService
 from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.sync.base_notifier import BaseNotifier
+from pipecat.sync.event_notifier import EventNotifier
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-from pipecat.utils.sync.base_notifier import BaseNotifier
-from pipecat.utils.sync.event_notifier import EventNotifier
 from pipecat.utils.time import time_now_iso8601

 load_dotenv(override=True)
@@ -258,7 +258,7 @@ Output: YES
 Output: NO
 """

-conversational_system_message = """You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.
+conversational_system_message = """You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.

 Please be very concise in your responses. Unless you are explicitly asked to do otherwise, give me the shortest complete answer possible without unnecessary elaboration. Generally you should answer with a single sentence.
 """
--- a/examples/foundational/22d-natural-conversation-gemini-audio.py
+++ b/examples/foundational/22d-natural-conversation-gemini-audio.py
@@ -47,11 +47,11 @@ from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.google.llm import GoogleLLMService
 from pipecat.services.llm_service import LLMService
+from pipecat.sync.base_notifier import BaseNotifier
+from pipecat.sync.event_notifier import EventNotifier
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
-from pipecat.utils.sync.base_notifier import BaseNotifier
-from pipecat.utils.sync.event_notifier import EventNotifier
 from pipecat.utils.time import time_now_iso8601

 load_dotenv(override=True)
@@ -340,7 +340,7 @@ Output: NO

 conversation_system_instruction = """You are a helpful assistant participating in a voice converation.

-Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.
+Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.

 If you know that a number string is a phone number from the context of the conversation, write it as a phone number. For example 210-333-4567.

@@ -391,7 +391,7 @@ class AudioAccumulator(FrameProcessor):
            )
            self._user_speaking = False
            context = LLMContext()
-            await context.add_audio_frames_message(audio_frames=self._audio_frames)
+            context.add_audio_frames_message(audio_frames=self._audio_frames)
            await self.push_frame(LLMContextFrame(context=context))
        elif isinstance(frame, InputAudioRawFrame):
            # Append the audio frame to our buffer. Treat the buffer as a ring buffer, dropping the oldest
--- a/Show More
+++ b/Show More