Remove log

Send message to client
Show email
2025-05-25 15:17:16 +08:00 · 2025-05-25 15:17:05 +08:00 · 2025-05-25 11:28:52 +08:00 · 2025-05-25 11:27:25 +08:00 · 2025-05-25 11:11:17 +08:00 · 2025-05-25 11:02:14 +08:00
365 changed files with 21606 additions and 2830 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,250 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

+- Added `SarvamTTSService`, which implements Sarvam AI's TTS API:
+  https://docs.sarvam.ai/api-reference-docs/text-to-speech/convert.
+
+- Added `PipelineTask.add_observer()` and `PipelineTask.remove_observer()` to
+  allow managing observers at runtime. This is useful for cases where the task
+  is passed around to other code components that might want to observe the
+  pipeline dynamically.
+
+- Added `user_id` field to `TranscriptionMessage`. This allows identifying the
+  user in a multi-user scenario. Note that this requires that
+  `TranscriptionFrame` has the `user_id` properly set.
+
+- Added new `PipelineTask` event handlers `on_pipeline_started`,
+  `on_pipeline_stopped`, `on_pipeline_ended` and `on_pipeline_cancelled`, which
+  correspond to the `StartFrame`, `StopFrame`, `EndFrame` and `CancelFrame`
+  respectively.
+
+- Added additional languages to `LmntTTSService`. Languages include: `hi`, `id`,
+  `it`, `ja`, `nl`, `pl`, `ru`, `sv`, `th`, `tr`, `uk`, `vi`.
+
+- Added a `model` parameter to the `LmntTTSService` constructor, allowing
+  switching between LMNT models.
+
+- Added `MiniMaxHttpTTSService`, which implements MiniMax's T2A API for TTS.
+  Learn more: https://www.minimax.io/platform_overview
+
+- A new function `FrameProcessor.setup()` has been added to allow setting up
+  frame processors before receiving a `StartFrame`. This is what's happening
+  internally: `FrameProcessor.setup()` is called, `StartFrame` is pushed from
+  the beginning of the pipeline, your regular pipeline operations run,
+  `EndFrame` or `CancelFrame` is pushed from the beginning of the pipeline and
+  finally `FrameProcessor.cleanup()` is called.
+
+- Added support for OpenTelemetry tracing in Pipecat. This initial
+  implementation includes:
+
+  - A `setup_tracing` method where you can specify your OpenTelemetry exporter
+  - Service decorators for STT (`@traced_stt`), LLM (`@traced_llm`), and TTS
+    (`@traced_tts`) which trace the execution and collect properties and
+    metrics (TTFB, token usage, character counts, etc.)
+  - Class decorators that provide execution tracking; these are generic and can
+    be used for service tracking as needed
+  - Spans that help track traces on a per-conversation and per-turn basis:
+
+  ```
+  conversation-uuid
+  ├── turn-1
+  │   ├── stt_deepgramsttservice
+  │   ├── llm_openaillmservice
+  │   └── tts_cartesiattsservice
+  ...
+  └── turn-n
+      └── ...
+  ```
+
+  By default, Pipecat has implemented service decorators to trace execution of
+  STT, LLM, and TTS services. You can enable tracing by setting `enable_tracing`
+  to `True` in the PipelineTask.
+
+- Added `TurnTrackingObserver`, which tracks the start and end of a user/bot
+  turn pair and emits events `on_turn_started` and `on_turn_stopped`
+  corresponding to the start and end of a turn, respectively.
+
+- Allow passing observers to `run_test()` while running unit tests.
+
+### Changed
+
+- Updated the default model for `AnthropicLLMService` to
+  `claude-sonnet-4-20250514`.
+
+- Updated the default model for `GeminiMultimodalLiveLLMService` to
+  `models/gemini-2.5-flash-preview-native-audio-dialog`.
+
+- `BaseTextFilter` methods `filter()`, `update_settings()`,
+  `handle_interruption()` and `reset_interruption()` are now async.
+
+- `BaseTextAggregator` methods `aggregate()`, `handle_interruption()` and
+  `reset()` are now async.
+
+- The API version for `CartesiaTTSService` and `CartesiaHttpTTSService` has
+  been updated. Also, the `cartesia` dependency has been updated to 2.x.
+
+- `CartesiaTTSService` and `CartesiaHttpTTSService` now support Cartesia's new
+  `speed` parameter which accepts values of `slow`, `normal`, and `fast`.
+
+- `GeminiMultimodalLiveLLMService` now uses the user transcription and usage
+  metrics provided by Gemini Live.
+
+- `GoogleLLMService` has been updated to use `google-genai` instead of the
+  deprecated `google-generativeai`.
+
+### Deprecated
+
+- In `CartesiaTTSService` and `CartesiaHttpTTSService`, `emotion` has been
+  deprecated by Cartesia. Pipecat is following suit and deprecating `emotion`
+  as well.
+
+### Removed
+
+- Since `GeminiMultimodalLiveLLMService` now transcribes its own audio, the
+  `transcribe_user_audio` arg has been removed. Audio is now transcribed
+  automatically.
+
+- Removed the `SileroVAD` frame processor; use `SileroVADAnalyzer` instead.
+  Also removed the `07a-interruptible-vad.py` example.
+
+### Fixed
+
+- Fixed an issue with `ElevenLabsTTSService` where changing the model or voice
+  while the service is running wasn't working.
+
+- Fixed an issue that would cause multiple instances of the same class to behave
+  incorrectly if any of the given constructor arguments defaulted to a mutable
+  value (e.g. lists, dictionaries, objects).
+
+- Fixed an issue with `CartesiaTTSService` where `TTSTextFrame` messages weren't
+  being emitted when the model was set to `sonic`. This resulted in the
+  assistant context not being updated with assistant messages.
+
+### Performance
+
+- Don't create event handler tasks if no user event handlers have been
+  registered.
+
+### Other
+
+- Added foundational examples `07y-interruptible-minimax.py` and
+  `07z-interruptible-sarvam.py` to show how to use the `MiniMaxHttpTTSService`
+  and `SarvamTTSService`, respectively.
+
+- Added an `open-telemetry-tracing` example, showing how to set up tracing. The
+  example also includes Jaeger as an open source OpenTelemetry client to review
+  traces from the example runs.
+
+- Added foundational example `29-turn-tracking-observer.py` to show how to use
+  the `TurnTrackingObserver`.
+
+## [0.0.67] - 2025-05-07
+
+### Added
+
+- Added `DebugLogObserver` for detailed frame logging with configurable
+  filtering by frame type and endpoint. This observer automatically extracts
+  and formats all frame data fields for debug logging.
+
+- `UserImageRequestFrame.video_source` field has been added to request an image
+  from the desired video source.
+
+- Added support for the AWS Nova Sonic speech-to-speech model with the new
+  `AWSNovaSonicLLMService`.
+  See https://docs.aws.amazon.com/nova/latest/userguide/speech.html.
+  Note that it requires Python >= 3.12 and `pip install pipecat-ai[aws-nova-sonic]`.
+
+- Added new AWS services `AWSBedrockLLMService` and `AWSTranscribeSTTService`.
+
+- Added `on_active_speaker_changed` event handler to the `DailyTransport` class.
+
+- Added `enable_ssml_parsing` and `enable_logging` to `InputParams` in
+  `ElevenLabsTTSService`.
+
+- Added support to `RimeHttpTTSService` for the `arcana` model.
+
+### Changed
+
+- Updated `ElevenLabsTTSService` to use the beta websocket API
+  (multi-stream-input). This new API supports context_ids and cancelling those
+  contexts, which greatly improves interruption handling.
+
+- Observers `on_push_frame()` now take a single argument `FramePushed` instead
+  of multiple arguments.
+
+- Updated the default voice for `DeepgramTTSService` to `aura-2-helena-en`.
+
+### Deprecated
+
+- `PollyTTSService` is now deprecated, use `AWSPollyTTSService` instead.
+
+- Observer `on_push_frame(src, dst, frame, direction, timestamp)` is now
+  deprecated, use `on_push_frame(data: FramePushed)` instead.
+
+### Fixed
+
+- Fixed a `DailyTransport` issue that was causing problems when multiple audio
+  or video sources were being captured.
+
+- Fixed an `UltravoxSTTService` issue that would cause the service to generate
+  all tokens as one word.
+
+- Fixed a `PipelineTask` issue that would cause tasks to not be cancelled if
+  the task was cancelled from outside of Pipecat.
+
+- Fixed a `TaskManager` issue that was causing dangling tasks to be reported.
+
+- Fixed an issue that could cause data to be sent to the transports when they
+  were still not ready.
+
+- Remove custom audio tracks from `DailyTransport` before leaving.
+
+### Removed
+
+- Removed `CanonicalMetricsService` as it's no longer maintained.
+
+## [0.0.66] - 2025-05-02
+
+### Added
+
+- Added two new input parameters to `RimeTTSService`: `pause_between_brackets`
+  and `phonemize_between_brackets`.
+
+- Added support for cross-platform local smart turn detection. You can use
+  `LocalSmartTurnAnalyzer` for on-device inference using Torch.
+
+- `BaseOutputTransport` now allows multiple destinations if the transport
+  implementation supports it (e.g. Daily's custom tracks). With multiple
+  destinations it is possible to send different audio or video tracks with a
+  single transport simultaneously. To do that, you need to set the new
+  `Frame.transport_destination` field with your desired transport destination
+  (e.g. custom track name), tell the transport you want a new destination with
+  `TransportParams.audio_out_destinations` or
+  `TransportParams.video_out_destinations` and the transport should take care of
+  the rest.
+
+- Similar to the new `Frame.transport_destination`, there's a new
+  `Frame.transport_source` field which is set by the `BaseInputTransport` if the
+  incoming data comes from a non-default source (e.g. custom tracks).
+
+- `TTSService` has a new `transport_destination` constructor parameter. This
+  parameter will be used to update the `Frame.transport_destination` field for
+  each generated `TTSAudioRawFrame`. This allows sending multiple bots' audio to
+  multiple destinations in the same pipeline.
+
+- Added `DailyTransportParams.camera_out_enabled` and
+  `DailyTransportParams.microphone_out_enabled` which allows you to
+  enable/disable the main output camera or microphone tracks. This is useful if
+  you only want to use custom tracks and not send the main tracks. Note that you
+  still need `audio_out_enabled=True` or `video_out_enabled=True`.
+
+- Added `DailyTransport.capture_participant_audio()` which allows you to capture
+  an audio source (e.g. "microphone", "screenAudio" or a custom track name) from
+  a remote participant.
+
+- Added `DailyTransport.update_publishing()` which allows you to update the call
+  video and audio publishing settings (e.g. audio and video quality).
+
 - Added `RTVIObserverParams` which allows you to configure what RTVI messages
  are sent to the clients.

@@ -37,6 +281,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Changed

+- `TransportParams.audio_out_mixer` now supports a string and also a dictionary to
+  provide a mixer per destination. For example:
+
+```python
+  audio_out_mixer={
+      "track-1": SoundfileMixer(...),
+      "track-2": SoundfileMixer(...),
+      "track-N": SoundfileMixer(...),
+  },
+```
+
 - The `STTMuteFilter` now mutes `InterimTranscriptionFrame` and
  `TranscriptionFrame` which allows the `STTMuteFilter` to be used in
  conjunction with transports that generate transcripts, e.g. `DailyTransport`.
@@ -70,6 +325,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  case there's no need to push audio to the rest of the pipeline, but this is
  not a very common case.

+- Added `RivaSegmentedSTTService`, which allows Riva offline/batch models, such
+  as "canary-1b-asr", to be used in Pipecat.
+
 ### Deprecated

 - Function calls with parameters
@@ -85,8 +343,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `TransportParams.vad_audio_passthrough` parameter is now deprecated, use
  `TransportParams.audio_in_passthrough` instead.

+- `ParakeetSTTService` is now deprecated, use `RivaSTTService` instead, which uses
+  the model "parakeet-ctc-1.1b-asr" by default.
+
+- `FastPitchTTSService` is now deprecated, use `RivaTTSService` instead, which uses
+  the model "magpie-tts-multilingual" by default.
+
 ### Fixed

+- Fixed an issue with `SimliVideoService` where the bot was continuously outputting
+  audio, which prevented the `BotStoppedSpeakingFrame` from being emitted.
+
+- Fixed an issue where `OpenAIRealtimeBetaLLMService` would add two assistant
+  messages to the context.
+
 - Fixed an issue with `GeminiMultimodalLiveLLMService` where the context
  contained tokens instead of words.

@@ -102,6 +372,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Other

+- Added `examples/daily-custom-tracks` to show how to send and receive Daily
+  custom tracks.
+
+- Added `examples/daily-multi-translation` to showcase how to send multiple
+  simultaneous translations with the same transport.
+
 - Added 04 foundational examples for client/server transports. Also, renamed
  `29-livekit-audio-chat.py` to `04b-transports-livekit.py`.

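The "Fixed" entry in the CHANGELOG.md diff about constructor arguments that default to a mutable value describes a general Python pitfall. The following is a minimal sketch of that pitfall and one common way to avoid it, not Pipecat's actual code or fix; `SharedDefault` and `FreshDefault` are hypothetical names used only for illustration.

```python
from typing import List, Optional


class SharedDefault:
    """Buggy pattern: the default list is created once, at function definition."""

    def __init__(self, items: List[str] = []):
        # Every instance built without `items` stores the very same list object.
        self.items = items


class FreshDefault:
    """Common fix: use None as a sentinel and build a new list per instance."""

    def __init__(self, items: Optional[List[str]] = None):
        self.items = [] if items is None else items


# Two instances of the buggy class share state through the default list.
a, b = SharedDefault(), SharedDefault()
a.items.append("x")
shared_leak = b.items  # b sees the item appended through a

# Two instances of the fixed class each get their own list.
c, d = FreshDefault(), FreshDefault()
c.items.append("x")
fresh_ok = d.items  # d is unaffected by the change made through c
```

The `None` sentinel is only one option; `dataclasses.field(default_factory=list)` achieves the same isolation for dataclasses.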
--- a/README.md
+++ b/README.md
@@ -49,18 +49,18 @@ You can connect to Pipecat from any platform using our official SDKs:

 ## 🧩 Available services

-| Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
-| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Parakeet (NVIDIA)](https://docs.pipecat.ai/server/services/stt/parakeet), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper)                                                                                                                                                                                                                                            |
-| LLMs                | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
-| Text-to-Speech      | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [FastPitch (NVIDIA)](https://docs.pipecat.ai/server/services/tts/fastpitch), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts)                       |
-| Speech-to-Speech    | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
-| Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
-| Video               | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
-| Memory              | [mem0](https://docs.pipecat.ai/server/services/memory/mem0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
-| Vision & Image      | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
-| Audio Processing    | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
-| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/server/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+| Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Parakeet (NVIDIA)](https://docs.pipecat.ai/server/services/stt/parakeet), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper)                                                                                                                                                                                                                                                                                            |
+| LLMs                | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [Together AI](https://docs.pipecat.ai/server/services/llm/together)                                                 |
+| Text-to-Speech      | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [FastPitch (NVIDIA)](https://docs.pipecat.ai/server/services/tts/fastpitch), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
+| Speech-to-Speech    | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+| Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+| Video               | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+| Memory              | [mem0](https://docs.pipecat.ai/server/services/memory/mem0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
+| Vision & Image      | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
+| Audio Processing    | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+| Analytics & Metrics | [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |

 📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)

--- a/docs/api/requirements.txt
+++ b/docs/api/requirements.txt
@@ -10,7 +10,6 @@ pipecat-ai[anthropic]
 pipecat-ai[assemblyai]
 pipecat-ai[aws]
 pipecat-ai[azure]
-pipecat-ai[canonical]
 pipecat-ai[cartesia]
 pipecat-ai[cerebras]
 pipecat-ai[deepseek]
--- a/dot-env.template
+++ b/dot-env.template
@@ -95,9 +95,16 @@ OPENROUTER_API_KEY=...
 PIPER_BASE_URL=...

 # Smart turn
-LOCAL_SMART_TURN_MODEL_PATH=
+LOCAL_SMART_TURN_MODEL_PATH=...
 FAL_SMART_TURN_API_KEY=...

 # Twilio
-TWILIO_ACCOUNT_SID=
-TWILIO_AUTH_TOKEN=
+TWILIO_ACCOUNT_SID=...
+TWILIO_AUTH_TOKEN=...
+
+# MiniMax
+MINIMAX_API_KEY=...
+MINIMAX_GROUP_ID=...
+
+# Sarvam AI
+SARVAM_API_KEY=...
--- a/examples/bot-ready-signalling/client/javascript/package-lock.json
+++ b/examples/bot-ready-signalling/client/javascript/package-lock.json
@@ -12,7 +12,7 @@
        "@daily-co/daily-js": "0.74.0"
      },
      "devDependencies": {
-        "vite": "^6.0.9"
+        "vite": "^6.3.5"
      }
    },
    "node_modules/@babel/runtime": {
@@ -999,9 +999,9 @@
      }
    },
    "node_modules/vite": {
-      "version": "6.3.3",
-      "resolved": "https://registry.npmjs.org/vite/-/vite-6.3.3.tgz",
-      "integrity": "sha512-5nXH+QsELbFKhsEfWLkHrvgRpTdGJzqOZ+utSdmPTvwHmvU6ITTm3xx+mRusihkcI8GeC7lCDyn3kDtiki9scw==",
+      "version": "6.3.5",
+      "resolved": "https://registry.npmjs.org/vite/-/vite-6.3.5.tgz",
+      "integrity": "sha512-cZn6NDFE7wdTpINgs++ZJ4N49W2vRp8LCKrn3Ob1kYNtOo21vfDoaV5GzBfLU4MovSAB8uNRm4jgzVQZ+mBzPQ==",
      "dev": true,
      "dependencies": {
        "esbuild": "^0.25.0",
--- a/examples/bot-ready-signalling/client/javascript/package.json
+++ b/examples/bot-ready-signalling/client/javascript/package.json
@@ -12,7 +12,7 @@
  "license": "ISC",
  "description": "",
  "devDependencies": {
-    "vite": "^6.0.9"
+    "vite": "^6.3.5"
  },
  "dependencies": {
    "@daily-co/daily-js": "0.74.0"
--- a/examples/canonical-metrics/.gitignore
+++ b/examples/canonical-metrics/.gitignore
@@ -1,161 +0,0 @@
-# Byte-compiled / optimized / DLL files
-__pycache__/
-*.py[cod]
-*$py.class
-recordings/
-# C extensions
-*.so
-
-# Distribution / packaging
-.Python
-build/
-develop-eggs/
-dist/
-downloads/
-eggs/
-.eggs/
-lib/
-lib64/
-parts/
-sdist/
-var/
-wheels/
-share/python-wheels/
-*.egg-info/
-.installed.cfg
-*.egg
-MANIFEST
-
-# PyInstaller
-#  Usually these files are written by a python script from a template
-#  before PyInstaller builds the exe, so as to inject date/other infos into it.
-*.manifest
-*.spec
-
-# Installer logs
-pip-log.txt
-pip-delete-this-directory.txt
-
-# Unit test / coverage reports
-htmlcov/
-.tox/
-.nox/
-.coverage
-.coverage.*
-.cache
-nosetests.xml
-coverage.xml
-*.cover
-*.py,cover
-.hypothesis/
-.pytest_cache/
-cover/
-
-# Translations
-*.mo
-*.pot
-
-# Django stuff:
-*.log
-local_settings.py
-db.sqlite3
-db.sqlite3-journal
-
-# Flask stuff:
-instance/
-.webassets-cache
-
-# Scrapy stuff:
-.scrapy
-
-# Sphinx documentation
-docs/_build/
-
-# PyBuilder
-.pybuilder/
-target/
-
-# Jupyter Notebook
-.ipynb_checkpoints
-
-# IPython
-profile_default/
-ipython_config.py
-
-# pyenv
-#   For a library or package, you might want to ignore these files since the code is
-#   intended to run in multiple environments; otherwise, check them in:
-# .python-version
-
-# pipenv
-#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
-#   However, in case of collaboration, if having platform-specific dependencies or dependencies
-#   having no cross-platform support, pipenv may install dependencies that don't work, or not
-#   install all needed dependencies.
-#Pipfile.lock
-
-# poetry
-#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
-#   This is especially recommended for binary packages to ensure reproducibility, and is more
-#   commonly ignored for libraries.
-#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
-#poetry.lock
-
-# pdm
-#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
-#pdm.lock
-#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
-#   in version control.
-#   https://pdm.fming.dev/#use-with-ide
-.pdm.toml
-
-# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
-__pypackages__/
-
-# Celery stuff
-celerybeat-schedule
-celerybeat.pid
-
-# SageMath parsed files
-*.sage.py
-
-# Environments
-.env
-.venv
-env/
-venv/
-ENV/
-env.bak/
-venv.bak/
-
-# Spyder project settings
-.spyderproject
-.spyproject
-
-# Rope project settings
-.ropeproject
-
-# mkdocs documentation
-/site
-
-# mypy
-.mypy_cache/
-.dmypy.json
-dmypy.json
-
-# Pyre type checker
-.pyre/
-
-# pytype static type analyzer
-.pytype/
-
-# Cython debug symbols
-cython_debug/
-
-# PyCharm
-#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
-#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
-#  and can be added to the global gitignore or merged into this file.  For a more nuclear
-#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
-runpod.toml
--- a/examples/canonical-metrics/README.md
+++ b/examples/canonical-metrics/README.md
@@ -1,66 +0,0 @@
-# Chatbot with canonical-metrics
-
-This project implements a chatbot using a pipeline architecture that integrates audio processing, transcription, and a language model for conversational interactions. The chatbot operates within a daily communication environment, utilizing various services for text-to-speech and language model responses.
-
-## Features
-
- **Audio Input and Output**: Captures microphone input and plays back audio responses.
- **Voice Activity Detection**: Utilizes Silero VAD to manage audio input intelligently.
- **Text-to-Speech**: Integrates ElevenLabs TTS service to convert text responses into audio.
- **Language Model Interaction**: Uses OpenAI's GPT-4 model to generate responses based on user input.
- **Transcription Services**: Captures and transcribes participant speech for analytics.
- **Metrics Collection**: Sends audio data for analysis via Canonical Metrics Service.
-
-## Requirements
-
- Python 3.10+
- `python-dotenv`
- Additional libraries from the `pipecat` package.
-
-## Setup
-
-1. Clone the repository.
-2. Install the required packages.
-3. Set up environment variables for API keys:
-   - `OPENAI_API_KEY`
-   - `ELEVENLABS_API_KEY`
-   - `CANONICAL_API_KEY`
-   - `CANONICAL_API_URL`
-4. Run the script.
-
-## Usage
-
-The chatbot introduces itself and engages in conversations, providing brief and creative responses. Designed for flexibility, it can support multiple languages with appropriate configuration.
-
-## Events
-
- Participants joining or leaving the call are handled dynamically, adjusting the chatbot's behavior accordingly.
-
-
-ℹ️ The first time, things might take extra time to get started since VAD (Voice Activity Detection) model needs to be downloaded.
-
-## Get started
-
-```python
-python3 -m venv venv
-source venv/bin/activate
-pip install -r requirements.txt
-
-cp env.example .env # and add your credentials
-
-```
-
-## Run the server
-
-```bash
-python server.py
-```
-
-Then, visit `http://localhost:7860/` in your browser to start a chatbot session.
-
-## Build and test the Docker image
-
-```
-docker build -t chatbot .
-docker run --env-file .env -p 7860:7860 chatbot
-```
--- a/examples/canonical-metrics/bot.py
+++ b/examples/canonical-metrics/bot.py
@@ -1,146 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import asyncio
-import os
-import sys
-import uuid
-
-import aiohttp
-from dotenv import load_dotenv
-from loguru import logger
-from runner import configure
-
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import EndFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
-from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
-from pipecat.services.canonical.metrics import CanonicalMetricsService
-from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.services.daily import DailyParams, DailyTransport
-
-load_dotenv(override=True)
-
-logger.remove(0)
-logger.add(sys.stderr, level="DEBUG")
-
-
-async def main():
-    async with aiohttp.ClientSession() as session:
-        (room_url, token) = await configure(session)
-
-        transport = DailyTransport(
-            room_url,
-            token,
-            "Chatbot",
-            DailyParams(
-                audio_out_enabled=True,
-                audio_in_enabled=True,
-                video_out_enabled=False,
-                vad_analyzer=SileroVADAnalyzer(),
-                transcription_enabled=True,
-                #
-                # Spanish
-                #
-                # transcription_settings=DailyTranscriptionSettings(
-                #     language="es",
-                #     tier="nova",
-                #     model="2-general"
-                # )
-            ),
-        )
-
-        tts = ElevenLabsTTSService(
-            api_key=os.getenv("ELEVENLABS_API_KEY"),
-            #
-            # English
-            #
-            voice_id="cgSgspJ2msm6clMCkdW9",
-            #
-            # Spanish
-            #
-            # model="eleven_multilingual_v2",
-            # voice_id="gD1IexrzCvsXPHUuT0s3",
-        )
-
-        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
-        messages = [
-            {
-                "role": "system",
-                #
-                # English
-                #
-                "content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your responses to 12 words or fewer.",
-                #
-                # Spanish
-                #
-                # "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
-            },
-        ]
-
-        context = OpenAILLMContext(messages)
-        context_aggregator = llm.create_context_aggregator(context)
-
-        """
-        CanonicalMetrics uses AudioBufferProcessor under the hood to buffer the audio. On
-        call completion, CanonicalMetrics will send the audio buffer to Canonical for
-        analysis. Visit https://voice.canonical.chat to learn more.
-        """
-        audio_buffer_processor = AudioBufferProcessor(num_channels=2)
-        canonical = CanonicalMetricsService(
-            audio_buffer_processor=audio_buffer_processor,
-            aiohttp_session=session,
-            api_key=os.getenv("CANONICAL_API_KEY"),
-            call_id=str(uuid.uuid4()),
-            assistant="pipecat-chatbot",
-            assistant_speaks_first=True,
-            context=context,
-        )
-        pipeline = Pipeline(
-            [
-                transport.input(),  # microphone
-                context_aggregator.user(),
-                llm,
-                tts,
-                transport.output(),
-                canonical,  # uploads audio buffer to Canonical AI for metrics
-                audio_buffer_processor,  # captures audio into a buffer
-                context_aggregator.assistant(),
-            ]
-        )
-
-        task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
-
-        @transport.event_handler("on_first_participant_joined")
-        async def on_first_participant_joined(transport, participant):
-            await audio_buffer_processor.start_recording()
-            await transport.capture_participant_transcription(participant["id"])
-            await task.queue_frames([context_aggregator.user().get_context_frame()])
-
-        @transport.event_handler("on_participant_left")
-        async def on_participant_left(transport, participant, reason):
-            print(f"Participant left: {participant}")
-            await task.cancel()
-
-        @transport.event_handler("on_call_state_updated")
-        async def on_call_state_updated(transport, state):
-            if state == "left":
-                # Here we don't want to cancel, we just want to finish sending
-                # whatever is queued, so we use an EndFrame().
-                await task.queue_frame(EndFrame())
-
-        runner = PipelineRunner()
-
-        await runner.run(task)
-
-
-if __name__ == "__main__":
-    asyncio.run(main())
--- a/examples/canonical-metrics/requirements.txt
+++ b/examples/canonical-metrics/requirements.txt
@@ -1,5 +0,0 @@
-python-dotenv
-fastapi[all]
-uvicorn
-pipecat-ai[daily,openai,silero,elevenlabs,canonical]
-
--- a/examples/chatbot-audio-recording/runner.py
+++ b/examples/chatbot-audio-recording/runner.py
@@ -53,4 +53,3 @@ async def configure(aiohttp_session: aiohttp.ClientSession):
    token = await daily_rest_helper.get_token(url, expiry_time)

    return (url, token)
-    return (url, token)
--- a/examples/daily-custom-tracks/README.md
+++ b/examples/daily-custom-tracks/README.md
@@ -0,0 +1,39 @@
+# Daily Custom Tracks
+
+This example shows how to send and receive Daily custom tracks. We will run a simple `daily-python` application that sends an audio file to a room as a custom track (named "pipecat"). Then, the Pipecat bot will mirror that custom track into another custom track (named "pipecat-mirror") in the same room.
+
+## Get started
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
+```
+
+## Run the bot
+
+Start the bot by giving it a Daily room URL.
+
+```bash
+python bot.py -u ROOM_URL
+```
+
+The bot will wait for the first participant to join. Then, it will mirror a custom track named "pipecat" into a new custom track named "pipecat-mirror".
+
+## Run the sender
+
+Now, run the custom track sender. This is a simple `daily-python` application that opens an audio file and sends it as a custom track to the same Daily room.
+
+```bash
+python custom_track_sender.py -u ROOM_URL -i office-ambience-mono-16000.mp3
+```
+
+## Open client
+
+Finally, open the client so you can hear both custom tracks.
+
+```bash
+open index.html
+```
+
+Once the client is opened, copy the URL of the Daily room and join it. You should be able to select which custom track you want to hear.
--- a/examples/daily-custom-tracks/bot.py
+++ b/examples/daily-custom-tracks/bot.py
@@ -0,0 +1,87 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import sys
+
+import aiohttp
+from loguru import logger
+from runner import configure
+
+from pipecat.frames.frames import Frame, InputAudioRawFrame, OutputAudioRawFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+class CustomTrackMirrorProcessor(FrameProcessor):
+    def __init__(self, transport_destination: str, **kwargs):
+        super().__init__(**kwargs)
+        self._transport_destination = transport_destination
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, InputAudioRawFrame) and frame.transport_source:
+            output_frame = OutputAudioRawFrame(
+                audio=frame.audio,
+                sample_rate=frame.sample_rate,
+                num_channels=frame.num_channels,
+            )
+            output_frame.transport_destination = self._transport_destination
+            await self.push_frame(output_frame)
+        else:
+            await self.push_frame(frame, direction)
+
+
+async def main():
+    async with aiohttp.ClientSession() as session:
+        (room_url, _) = await configure(session)
+
+        transport = DailyTransport(
+            room_url,
+            None,
+            "Custom tracks mirror",
+            DailyParams(
+                audio_in_enabled=True,
+                audio_out_enabled=True,
+                microphone_out_enabled=False,  # Disable since we just use custom tracks
+                audio_out_destinations=["pipecat-mirror"],
+            ),
+        )
+
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                CustomTrackMirrorProcessor("pipecat-mirror"),
+                transport.output(),  # Transport bot output
+            ]
+        )
+
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                audio_in_sample_rate=16000,
+                audio_out_sample_rate=16000,
+            ),
+        )
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            await transport.capture_participant_audio(participant["id"], audio_source="pipecat")
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
--- a/examples/daily-custom-tracks/custom_track_sender.py
+++ b/examples/daily-custom-tracks/custom_track_sender.py
@@ -0,0 +1,74 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import time
+
+from daily import CallClient, CustomAudioSource, Daily
+from pydub import AudioSegment
+
+parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
+parser.add_argument("-u", "--url", type=str, required=True, help="URL of the Daily room to join")
+parser.add_argument(
+    "-i", "--input", type=str, required=True, help="Input audio file (needs 16000 sample rate)"
+)
+
+args, _ = parser.parse_known_args()
+
+audio = AudioSegment.from_mp3(args.input)
+
+raw_bytes = audio.raw_data
+sample_rate = audio.frame_rate
+channels = audio.channels
+
+print(f"Length: {len(raw_bytes)} bytes")
+print(f"Sample rate: {sample_rate}, Channels: {channels}")
+
+# Initialize the Daily context & create call client
+Daily.init()
+
+client = CallClient()
+
+# Join the room and indicate we have a custom track named "pipecat".
+client.join(
+    args.url,
+    client_settings={
+        "publishing": {
+            "camera": False,
+            "microphone": False,
+            "customAudio": {"pipecat": True},
+        },
+    },
+)
+
+# Just sleep for a couple of seconds. To do this well we should really use
+# completions.
+time.sleep(2)
+
+# Create the custom audio source. This is where we will write our audio.
+audio_source = CustomAudioSource(sample_rate, channels)
+
+# Create an audio track and assign it our audio source.
+client.add_custom_audio_track("pipecat", audio_source)
+
+# Just sleep for a second. To do this well we should really use completions.
+time.sleep(1)
+
+try:
+    # Write one second of audio at a time (assuming 16-bit samples) until the file is fully sent.
+    chunk_size = sample_rate * channels * 2
+    while len(raw_bytes) > 0:
+        chunk = raw_bytes[:chunk_size]
+        raw_bytes = raw_bytes[chunk_size:]
+        audio_source.write_frames(chunk)
+
+except KeyboardInterrupt:
+    client.leave()
+
+# Just sleep for a second. To do this well we should really use completions.
+time.sleep(1)
+
+client.release()
--- a/examples/daily-custom-tracks/index.html
+++ b/examples/daily-custom-tracks/index.html
@@ -0,0 +1,173 @@
+<html>
+  <head>
+    <title>daily custom tracks</title>
+  </head>
+  <script crossorigin src="https://unpkg.com/@daily-co/daily-js"></script>
+  <script src="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.js"></script>
+  <link
+    rel="stylesheet"
+    type="text/css"
+    href="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.css"
+    />
+  <script>
+    function enableButton(buttonId, enable) {
+        const button = document.getElementById(buttonId);
+        button.disabled = !enable;
+    }
+
+    function enableJoinButton(enable) {
+        enableButton("join-button", enable);
+    }
+
+    function enableLeaveButton(enable) {
+        enableButton("leave-button", enable);
+    }
+
+    function destroyPlayers(query) {
+        const items = document.querySelectorAll(query);
+        if (items) {
+            for (const item of items) {
+                item.remove();
+            }
+        }
+    }
+
+    function destroyParticipantPlayers(participantId) {
+        destroyPlayers(`audio[data-participant-id="${participantId}"]`);
+        destroyPlayers(`button[data-participant-id="${participantId}"]`);
+    }
+
+    async function startPlayer(player, track) {
+        player.muted = false;
+        player.autoplay = true;
+        if (track != null) {
+            player.srcObject = new MediaStream([track]);
+        }
+    }
+
+    async function buildAudioPlayer(track, participantId) {
+        const audioContainer = document.getElementById("audio-container");
+        const player = document.createElement("audio");
+        player.dataset.participantId = participantId;
+
+        // Create a new button for controlling audio
+        const audioControlButton = document.createElement("button");
+        audioControlButton.className = "ui primary green button"
+        audioControlButton.innerText = track._mediaTag == "cam-audio" ? "english" : track._mediaTag;
+        audioControlButton.dataset.participantId = participantId;
+        audioControlButton.onclick = () => {
+            if (player.paused) {
+
+                player.play();
+                audioControlButton.className = "ui primary red button"
+            } else {
+                player.pause();
+                audioControlButton.className = "ui primary green button"
+            }
+        };
+
+        audioContainer.appendChild(player);
+        audioContainer.appendChild(audioControlButton);
+
+        await startPlayer(player, track);
+        player.pause()
+
+        return player;
+    }
+
+    function subscribeToTracks(participantId) {
+        console.log(`subscribing to track`);
+
+        if (participantId === "local") {
+            return;
+        }
+
+        callObject.updateParticipant(participantId, {
+            setSubscribedTracks: {
+                audio: true,
+                video: false,
+                custom: true,
+            },
+        });
+    }
+
+    function startDaily() {
+        enableJoinButton(true);
+        enableLeaveButton(false);
+
+        window.callObject = window.DailyIframe.createCallObject({});
+
+        callObject.on("participant-joined", (e) => {
+            if (!e.participant.local) {
+                console.log("participant-joined", e.participant);
+                subscribeToTracks(e.participant.session_id);
+            }
+        });
+
+        callObject.on("participant-left", (e) => {
+            console.log("participant-left", e.participant.session_id);
+            destroyParticipantPlayers(e.participant.session_id);
+        });
+
+        callObject.on("track-started", async (e) => {
+            console.log("track-started", e.track);
+            if (e.track.kind === "audio") {
+                await buildAudioPlayer(e.track, e.participant.session_id);
+            }
+        });
+    }
+
+    async function joinRoom() {
+        enableJoinButton(false);
+        enableLeaveButton(true);
+
+        const meetingUrl = document.getElementById("meeting-url").value;
+
+        callObject.join({
+            url: meetingUrl,
+            startVideoOff: true,
+            startAudioOff: true,
+            subscribeToTracksAutomatically: false,
+            receiveSettings: {
+                base: { video: { layer: 0 } },
+            },
+        });
+    }
+
+    async function leaveRoom() {
+        enableJoinButton(true);
+        enableLeaveButton(false);
+
+        callObject.leave();
+
+        const audioContainer = document.getElementById("audio-container");
+        audioContainer.replaceChildren();
+    }
+  </script>
+
+  <body onload="startDaily()">
+    <div class="ui centered page grid" style="margin-top: 30px">
+      <div class="ten wide column">
+        <div class="ui form" style="margin-top: 30px">
+          <div class="field">
+            <label>Meeting URL</label>
+            <input id="meeting-url" value="" />
+          </div>
+        </div>
+      </div>
+    </div>
+    <div class="ui centered aligned header" style="margin-top: 30px">
+      <button id="join-button" class="ui primary button" onclick="joinRoom()">
+        Join
+      </button>
+      <button id="leave-button" class="ui button" onclick="leaveRoom()">
+        Leave
+      </button>
+    </div>
+    <div id="tile" class="ui container" style="margin-top: 30px">
+      <div id="tile" class="ui center aligned grid">
+        <div id="audio-container"></div><br/>
+      </div>
+    </div>
+  </body>
+</html>
--- a/examples/daily-custom-tracks/office-ambience-mono-16000.mp3
+++ b/examples/daily-custom-tracks/office-ambience-mono-16000.mp3
--- a/examples/daily-custom-tracks/requirements.txt
+++ b/examples/daily-custom-tracks/requirements.txt
@@ -0,0 +1,2 @@
+pydub
+pipecat-ai[daily]
--- a/examples/daily-custom-tracks/runner.py
+++ b/examples/daily-custom-tracks/runner.py
--- a/examples/daily-multi-translation/Dockerfile
+++ b/examples/daily-multi-translation/Dockerfile
@@ -1,7 +1,12 @@
 FROM python:3.10-bullseye
+
 RUN mkdir /app
+RUN mkdir /app/assets
+RUN mkdir /app/utils
 COPY *.py /app/
 COPY requirements.txt /app/
+
+
 WORKDIR /app
 RUN pip3 install -r requirements.txt

--- a/examples/daily-multi-translation/README.md
+++ b/examples/daily-multi-translation/README.md
@@ -0,0 +1,39 @@
+# Daily Multi Translation
+
+This example shows how to use Daily to stream multiple simultaneous translations over a single transport. Daily supports custom tracks; this example translates incoming English audio into Spanish, French and German simultaneously and sends each translation to its own custom track.
+
+## Get started
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
+
+cp env.example .env # and add your credentials
+
+```
+
+## Run the server
+
+```bash
+python server.py
+```
+
+Then, visit `http://localhost:7860/` in your browser. This will open a Daily Prebuilt room where you will speak in English (make sure you are not muted).
+
+## Open client
+
+Next, you need to open the client that will listen to the translations.
+
+```bash
+open index.html
+```
+
+Once the client is opened, copy the URL of the Daily room created above and join it. You should be able to select which translation you want to hear.
+
+## Build and test the Docker image
+
+```bash
+docker build -t daily-multi-translation .
+docker run --env-file .env -p 7860:7860 daily-multi-translation
+```
--- a/examples/daily-multi-translation/bot.py
+++ b/examples/daily-multi-translation/bot.py
@@ -0,0 +1,165 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import os
+import sys
+
+import aiohttp
+from dotenv import load_dotenv
+from loguru import logger
+from runner import configure
+
+from pipecat.audio.mixers.soundfile_mixer import SoundfileMixer
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.observers.loggers.transcription_log_observer import TranscriptionLogObserver
+from pipecat.pipeline.parallel_pipeline import ParallelPipeline
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+BACKGROUND_SOUND_FILE = "office-ambience-mono-16000.mp3"
+
+
+async def main():
+    async with aiohttp.ClientSession() as session:
+        (room_url, token) = await configure(session)
+
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Multi translation bot",
+            DailyParams(
+                audio_in_enabled=True,
+                audio_out_enabled=True,
+                audio_out_mixer={
+                    "spanish": SoundfileMixer(
+                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
+                    ),
+                    "french": SoundfileMixer(
+                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
+                    ),
+                    "german": SoundfileMixer(
+                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
+                    ),
+                },
+                audio_out_destinations=["spanish", "french", "german"],
+                microphone_out_enabled=False,  # Disable since we just use custom tracks
+                vad_analyzer=SileroVADAnalyzer(),
+            ),
+        )
+
+        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+        tts_spanish = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_id="cefcb124-080b-4655-b31f-932f3ee743de",
+            transport_destination="spanish",
+        )
+        tts_french = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_id="8832a0b5-47b2-4751-bb22-6a8e2149303d",
+            transport_destination="french",
+        )
+        tts_german = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_id="38aabb6a-f52b-4fb0-a3d1-988518f4dc06",
+            transport_destination="german",
+        )
+
+        messages_spanish = [
+            {
+                "role": "system",
+                "content": "You will be provided with a sentence in English, and your task is to only translate it into Spanish.",
+            },
+        ]
+        messages_french = [
+            {
+                "role": "system",
+                "content": "You will be provided with a sentence in English, and your task is to only translate it into French.",
+            },
+        ]
+        messages_german = [
+            {
+                "role": "system",
+                "content": "You will be provided with a sentence in English, and your task is to only translate it into German.",
+            },
+        ]
+
+        llm_spanish = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+        llm_french = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+        llm_german = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
+        context_spanish = OpenAILLMContext(messages_spanish)
+        context_aggregator_spanish = llm_spanish.create_context_aggregator(context_spanish)
+
+        context_french = OpenAILLMContext(messages_french)
+        context_aggregator_french = llm_french.create_context_aggregator(context_french)
+
+        context_german = OpenAILLMContext(messages_german)
+        context_aggregator_german = llm_german.create_context_aggregator(context_german)
+
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                stt,
+                ParallelPipeline(
+                    # Spanish pipeline.
+                    [
+                        context_aggregator_spanish.user(),
+                        llm_spanish,
+                        tts_spanish,
+                        context_aggregator_spanish.assistant(),
+                    ],
+                    # French pipeline.
+                    [
+                        context_aggregator_french.user(),
+                        llm_french,
+                        tts_french,
+                        context_aggregator_french.assistant(),
+                    ],
+                    # German pipeline.
+                    [
+                        context_aggregator_german.user(),
+                        llm_german,
+                        tts_german,
+                        context_aggregator_german.assistant(),
+                    ],
+                ),
+                transport.output(),  # Transport bot output
+            ]
+        )
+
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                audio_in_sample_rate=16000,
+                audio_out_sample_rate=16000,
+                allow_interruptions=True,
+                enable_metrics=True,
+                enable_usage_metrics=True,
+                report_only_initial_ttfb=True,
+            ),
+            observers=[TranscriptionLogObserver()],
+        )
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
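Review note on `bot.py`: the Spanish, French and German branches differ only in the language name and the custom track they write to. A minimal sketch of deriving both from one table is shown here, assuming a hypothetical helper `build_translation_configs` and constant `PROMPT_TEMPLATE` that are not part of this change or of the Pipecat API. It only builds the per-language system prompts and destination names; wiring them into `OpenAILLMContext`, `CartesiaTTSService` and `ParallelPipeline` would stay as in the file above.

```python
# Hypothetical helper for illustration; not part of bot.py or the Pipecat API.
LANGUAGES = ["Spanish", "French", "German"]

# Same wording as the system prompts in bot.py, with the language factored out.
PROMPT_TEMPLATE = (
    "You will be provided with a sentence in English, and your task is to "
    "only translate it into {language}."
)


def build_translation_configs(languages):
    """Return one config dict per language, keyed by its custom track name."""
    configs = {}
    for language in languages:
        # The track name is the lowercase language, matching the names
        # listed in audio_out_destinations in bot.py.
        destination = language.lower()
        configs[destination] = {
            "destination": destination,
            "messages": [
                {
                    "role": "system",
                    "content": PROMPT_TEMPLATE.format(language=language),
                }
            ],
        }
    return configs


configs = build_translation_configs(LANGUAGES)
print(list(configs))
```

With a table like this, adding a fourth language would mean adding one entry (plus a voice ID) rather than copying another block.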
--- a/examples/daily-multi-translation/env.example
+++ b/examples/daily-multi-translation/env.example
@@ -1,6 +1,5 @@
 DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
 DAILY_API_KEY=7df...
 OPENAI_API_KEY=sk-PL...
-ELEVENLABS_API_KEY=aeb...
-CANONICAL_API_KEY=can...
-CANONICAL_API_URL=
+DEEPGRAM_API_KEY=efb...
+CARTESIA_API_KEY=aeb...
--- a/examples/daily-multi-translation/index.html
+++ b/examples/daily-multi-translation/index.html
@@ -0,0 +1,202 @@
+<html>
+  <head>
+    <title>daily multi translation</title>
+  </head>
+  <script crossorigin src="https://unpkg.com/@daily-co/daily-js"></script>
+  <script
+    src="https://code.jquery.com/jquery-3.1.1.min.js"
+    integrity="sha256-hVVnYaiADRTO2PzUGmuLJr8BLUSjGIZsDYGmIJLv2b8="
+    crossorigin="anonymous"
+    ></script>
+  <script src="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.js"></script>
+  <link
+    rel="stylesheet"
+    type="text/css"
+    href="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.css"
+    />
+  <script>
+    function enableButton(buttonId, enable) {
+        const button = document.getElementById(buttonId);
+        button.disabled = !enable;
+    }
+
+    function enableJoinButton(enable) {
+        enableButton("join-button", enable);
+    }
+
+    function enableLeaveButton(enable) {
+        enableButton("leave-button", enable);
+    }
+
+    function destroyPlayers(query) {
+        const items = document.querySelectorAll(query);
+        if (items) {
+            for (const item of items) {
+                item.remove();
+            }
+        }
+    }
+
+    function destroyParticipantPlayers(participantId) {
+        destroyPlayers(`video[data-participant-id="${participantId}"]`);
+        destroyPlayers(`audio[data-participant-id="${participantId}"]`);
+        destroyPlayers(`button[data-participant-id="${participantId}"]`);
+    }
+
+    async function startPlayer(player, track) {
+        player.muted = false;
+        player.autoplay = true;
+        if (track != null) {
+            player.srcObject = new MediaStream([track]);
+        }
+    }
+
+    async function buildVideoPlayer(track, participantId) {
+        const videoContainer = document.getElementById("video-container");
+        const player = document.createElement("video");
+        player.dataset.participantId = participantId;
+
+        videoContainer.appendChild(player);
+
+        await startPlayer(player, track);
+        await player.play();
+
+        return player;
+    }
+
+    async function buildAudioPlayer(track, participantId) {
+        const audioContainer = document.getElementById("audio-container");
+        const player = document.createElement("audio");
+        player.dataset.participantId = participantId;
+
+        // Create a new button for controlling audio
+        const audioControlButton = document.createElement("button");
+        audioControlButton.className = "ui primary green button"
+        audioControlButton.innerText = track._mediaTag == "cam-audio" ? "english" : track._mediaTag;
+        audioControlButton.dataset.participantId = participantId;
+        audioControlButton.onclick = () => {
+            if (player.paused) {
+
+                player.play();
+                audioControlButton.className = "ui primary red button"
+            } else {
+                player.pause();
+                audioControlButton.className = "ui primary green button"
+            }
+        };
+
+        audioContainer.appendChild(player);
+        audioContainer.appendChild(audioControlButton);
+
+        await startPlayer(player, track);
+        player.pause()
+
+        return player;
+    }
+
+    function subscribeToTracks(participantId) {
+        console.log(`subscribing to track`);
+
+        if (participantId === "local") {
+            return;
+        }
+
+        callObject.updateParticipant(participantId, {
+            setSubscribedTracks: {
+                audio: true,
+                video: true,
+                custom: true,
+            },
+        });
+    }
+
+    function startDaily() {
+        enableJoinButton(true);
+        enableLeaveButton(false);
+
+        window.callObject = window.DailyIframe.createCallObject({});
+
+        callObject.on("participant-joined", (e) => {
+            if (!e.participant.local) {
+                console.log("participant-joined", e.participant);
+                subscribeToTracks(e.participant.session_id);
+            }
+        });
+
+        callObject.on("participant-left", (e) => {
+            console.log("participant-left", e.participant.session_id);
+            destroyParticipantPlayers(e.participant.session_id);
+        });
+
+        callObject.on("track-started", async (e) => {
+            console.log("track-started", e.track);
+            if (e.track.kind === "video") {
+                await buildVideoPlayer(e.track, e.participant.session_id);
+            } else if (e.track.kind === "audio") {
+                await buildAudioPlayer(e.track, e.participant.session_id);
+            }
+        });
+    }
+
+    async function joinRoom() {
+        enableJoinButton(false);
+        enableLeaveButton(true);
+
+        const meetingUrl = document.getElementById("meeting-url").value;
+
+        callObject.join({
+            url: meetingUrl,
+            startVideoOff: true,
+            startAudioOff: true,
+            subscribeToTracksAutomatically: false,
+            receiveSettings: {
+                base: { video: { layer: 0 } },
+            },
+        });
+    }
+
+    async function leaveRoom() {
+        enableJoinButton(true);
+        enableLeaveButton(false);
+
+        callObject.leave();
+
+        const videoContainer = document.getElementById("video-container");
+        videoContainer.replaceChildren();
+
+        const audioContainer = document.getElementById("audio-container");
+        audioContainer.replaceChildren();
+    }
+  </script>
+
+  <body onload="startDaily()">
+    <div class="ui centered page grid" style="margin-top: 30px">
+      <div class="ten wide column">
+        <div class="ui form" style="margin-top: 30px">
+          <div class="field">
+            <label>Meeting URL</label>
+            <input id="meeting-url" value="" />
+          </div>
+        </div>
+      </div>
+    </div>
+    <div class="ui centered aligned header" style="margin-top: 30px">
+      <button id="join-button" class="ui primary button" onclick="joinRoom()">
+        Join
+      </button>
+      <button id="leave-button" class="ui button" onclick="leaveRoom()">
+        Leave
+      </button>
+    </div>
+    <div id="tile" class="ui container" style="margin-top: 30px">
+      <div id="tile" class="ui center aligned grid">
+        <div id="audio-container"></div><br/>
+      </div>
+    </div>
+    <div id="tile" class="ui container" style="margin-top: 30px">
+      <div id="tile" class="ui center aligned grid">
+        <div id="video-container" class="ui segment"></div>
+      </div>
+    </div>
+  </body>
+</html>
--- a/examples/daily-multi-translation/office-ambience-mono-16000.mp3
+++ b/examples/daily-multi-translation/office-ambience-mono-16000.mp3
--- a/examples/daily-multi-translation/requirements.txt
+++ b/examples/daily-multi-translation/requirements.txt
@@ -0,0 +1,5 @@
+aiofiles
+python-dotenv
+fastapi[all]
+uvicorn
+pipecat-ai[daily,deepgram,openai,silero,cartesia]
--- a/examples/daily-multi-translation/runner.py
+++ b/examples/daily-multi-translation/runner.py
@@ -0,0 +1,55 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import os
+
+import aiohttp
+
+from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
+
+
+async def configure(aiohttp_session: aiohttp.ClientSession):
+    parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
+    parser.add_argument(
+        "-u", "--url", type=str, required=False, help="URL of the Daily room to join"
+    )
+    parser.add_argument(
+        "-k",
+        "--apikey",
+        type=str,
+        required=False,
+        help="Daily API Key (needed to create an owner token for the room)",
+    )
+
+    args, unknown = parser.parse_known_args()
+
+    url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
+    key = args.apikey or os.getenv("DAILY_API_KEY")
+
+    if not url:
+        raise Exception(
+            "No Daily room specified. Use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
+        )
+
+    if not key:
+        raise Exception(
+            "No Daily API key specified. Use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
+        )
+
+    daily_rest_helper = DailyRESTHelper(
+        daily_api_key=key,
+        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
+        aiohttp_session=aiohttp_session,
+    )
+
+    # Create a meeting token for the given room with an expiration 1 hour in
+    # the future.
+    expiry_time: float = 60 * 60
+
+    token = await daily_rest_helper.get_token(url, expiry_time)
+
+    return (url, token)
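Review note on `runner.py`: `configure()` resolves both the room URL and the API key with the same precedence, command-line value first and environment variable second. A minimal sketch of that rule in isolation follows, assuming a hypothetical helper `resolve_setting` that does not exist in this change; the optional `environ` argument is only there so the behavior can be checked without touching the real environment.

```python
import os


def resolve_setting(cli_value, env_name, environ=None):
    """Return the CLI value if given, else the environment variable, else None.

    Hypothetical helper mirroring `args.url or os.getenv(...)` in runner.py.
    """
    environ = os.environ if environ is None else environ
    # An empty or missing CLI value falls through to the environment.
    return cli_value or environ.get(env_name)


fake_env = {"DAILY_SAMPLE_ROOM_URL": "https://yourdomain.daily.co/yourroom"}

# The command-line value takes precedence over the environment.
from_cli = resolve_setting("https://yourdomain.daily.co/other", "DAILY_SAMPLE_ROOM_URL", fake_env)

# With no command-line value, the environment variable is used.
from_env = resolve_setting(None, "DAILY_SAMPLE_ROOM_URL", fake_env)

# With neither, the result is None and the caller should raise, as runner.py does.
missing = resolve_setting(None, "DAILY_API_KEY", fake_env)
```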
--- a/examples/daily-multi-translation/server.py
+++ b/examples/daily-multi-translation/server.py
--- a/examples/deployment/modal-example/.gitignore
+++ b/examples/deployment/modal-example/.gitignore
@@ -1,3 +1,6 @@
+# Modal clone
+modal-examples
+
 # Python
 __pycache__/
 *.py[cod]
--- a/examples/deployment/modal-example/README.md
+++ b/examples/deployment/modal-example/README.md
@@ -1,37 +1,91 @@
 # Deploying Pipecat to Modal.com

-Barebones deployment example for [modal.com](https://www.modal.com)
+Deployment example for [modal.com](https://www.modal.com). This example demonstrates how to deploy a FastAPI webapp to Modal with an RTVI-compatible `/connect` endpoint that launches a Pipecat pipeline in a separate Modal container and returns a room/token for the client to join. The `/connect` endpoint also accepts a parameter specifying which Pipecat pipeline to launch: openai, gemini, or vllm. The vllm pipeline points to a self-hosted, OpenAI-compatible LLM, using a Llama model (neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16), deployed to Modal.

-1. Install dependencies
+![](diagram.jpg)

-```bash
-python -m venv venv
-source venv/bin/active # or OS equivalent
-pip install -r requirements.txt
-```
+# Running this Example

-2. Setup .env
+## Install the Modal CLI

-```bash
-cp env.example .env
-```
+Set up a Modal account and install the Modal CLI on your machine if you have not already, following the three steps in Modal's [Getting Started Guide](https://modal.com/docs/guide#getting-started).

-Alternatively, you can configure your Modal app to use [secrets](https://modal.com/docs/guide/secrets)
+## Deploy a self-hosted LLM

-3. Test the app locally
+1. Deploy Modal's OpenAI-compatible LLM service:

-```bash
-modal serve app.py
-```
+   ```bash
+   git clone https://github.com/modal-labs/modal-examples
+   cd modal-examples
+   modal deploy 06_gpu_and_ml/llm-serving/vllm_inference.py
+   ```
+
+   Refer to Modal's guide and example for [Deploying an OpenAI-compatible LLM service with vLLM](https://modal.com/docs/examples/vllm_inference) for more details.
+
+2. Take note of the endpoint URL from the previous step, which will look like:
+   ```
+   https://{your-workspace}--example-vllm-openai-compatible-serve.modal.run
+   ```
+   You'll need this for the `bot_vllm.py` file in the next section.
+
+   **Note:** The default Modal LLM example uses Llama-3.1 and will shut down after 15 minutes of inactivity. Cold starts take 5-10 minutes. To prepare the service, we recommend visiting the `/docs` endpoint (`https://{your-workspace}--example-vllm-openai-compatible-serve.modal.run/docs`) for your deployed LLM and waiting for it to fully load before connecting your client.
+
+## Deploy FastAPI App and Pipecat pipeline to Modal
+
+1. Set up environment variables
+
+   ```bash
+   cd server
+   cp env.example .env
+   # Modify .env to provide your service API Keys
+   ```
+
+   Alternatively, you can configure your Modal app to use [secrets](https://modal.com/docs/guide/secrets)
+
+2. Update the `modal_url` in `server/src/bot_vllm.py` to point to the URL produced by the LLM deployment described above.
+
+3. From within the `server` directory, test the app locally:
+
+   ```bash
+   modal serve app.py
+   ```

 4. Deploy to production

-```bash
-modal deploy app.py
-```
+   ```bash
+   modal deploy app.py
+   ```

-## Configuration options
+5. Note the endpoint URL produced from this deployment. It will look like:

-This app sets some sensible defaults for reducing cold starts, such as `minkeep_warm=1`, which will keep at least 1 warm instance ready for your bot function.
+   ```
+   https://{your-workspace}--pipecat-modal-fastapi-app.modal.run
+   ```

-It has been configured to only allow a concurrency of 1 (`max_inputs=1`) as each user will require their own running function.
+   You'll need this URL for the client's `app.js` configuration mentioned in its README.
+
+## Launch your bots on Modal
+
+### Option 1: Direct Link
+
+Click the URL displayed after running the serve or deploy step to launch an agent and be redirected to a Daily room where you can talk with the launched bot. This uses the OpenAI pipeline.
+
+### Option 2: Connect via an RTVI Client
+
+Follow the instructions in the [client folder's README](client/javascript/README.md) to build and run a custom client that connects to your Modal endpoint. The client includes a dropdown for choosing which bot pipeline to run.
+
+# Navigating your LLM, server, and Pipecat logs
+
+In your [Modal dashboard](https://modal.com/apps), you should have two Apps listed under Live Apps:
+
+1. `example-vllm-openai-compatible`: This App contains the containers and logs used to run your self-hosted LLM. There will be just one App Function listed: `serve`. Click on this function to view logs for your LLM.
+2. `pipecat-modal`: This App contains the containers and logs used to run your `connect` endpoints and Pipecat pipelines. It will list two App Functions:
+    1. `fastapi_app`: This function runs the endpoints that your client interacts with to start a new pipeline (`/`, `/connect`, `/status`). Click on this function to see logs for each endpoint hit.
+    2. `bot_runner`: This function handles launching and running a bot pipeline. Click on this function to get a list of all pipeline runs and access each run's logs.
+
+# Modal + Pipecat Tips
+
+- In most other Pipecat examples, we use `Popen` to launch the pipeline process from the `/connect` endpoint. In this example, we use a Modal function instead. This allows us to run the pipelines using a separately defined Modal image as well as run each pipeline in an isolated container.
+- For the FastAPI container and most common Pipecat pipeline containers, a default CPU-only `debian_slim` image should be all that's required. GPU containers are needed for self-hosted services.
+- To minimize cold starts of the pipeline and reduce latency for users, set `min_containers=1` on the Modal Function that launches the pipeline to ensure at least one warm instance of your function is always available.
+- For next steps on running a self-hosted LLM and reducing latency, check out [Modal's LLM examples](https://modal.com/docs/examples/vllm_inference).
--- a/examples/deployment/modal-example/app.py
+++ b/examples/deployment/modal-example/app.py
@@ -1,80 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import os
-
-import aiohttp
-import modal
-from bot import _voice_bot_process
-from fastapi import HTTPException
-from fastapi.responses import JSONResponse
-from loguru import logger
-
-MAX_SESSION_TIME = 15 * 60  # 15 minutes
-
-app = modal.App("pipecat-modal")
-
-
-image = modal.Image.debian_slim(python_version="3.12").pip_install_from_requirements(
-    "requirements.txt"
-)
-
-
-@app.function(
-    image=image,
-    cpu=1.0,
-    secrets=[modal.Secret.from_dotenv()],
-    keep_warm=1,
-    enable_memory_snapshot=True,
-    max_inputs=1,  # Do not reuse instances across requests
-    retries=0,
-)
-def launch_bot_process(room_url: str, token: str):
-    _voice_bot_process(room_url, token)
-
-
-@app.function(
-    image=image,
-    secrets=[modal.Secret.from_dotenv()],
-)
-@modal.web_endpoint(method="POST")
-async def start():
-    from pipecat.transports.services.helpers.daily_rest import (
-        DailyRESTHelper,
-        DailyRoomParams,
-    )
-
-    logger.info("Request received")
-
-    async with aiohttp.ClientSession() as session:
-        daily_rest_helper = DailyRESTHelper(
-            daily_api_key=os.getenv("DAILY_API_KEY", ""),
-            daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
-            aiohttp_session=session,
-        )
-
-        # Create new Daily room
-        room = await daily_rest_helper.create_room(DailyRoomParams())
-        if not room.url:
-            raise HTTPException(
-                status_code=500,
-                detail="Unable to create room",
-            )
-        logger.info(f"Created room: {room.url}")
-
-        # Create bot token for room
-        token = await daily_rest_helper.get_token(room.url, MAX_SESSION_TIME)
-        if not token:
-            raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
-
-        logger.info(f"Bot token created: {token}")
-
-        # Spawn a new bot process
-        launch_bot_process.spawn(room_url=room.url, token=token)
-
-        # Return room URL to the user to join
-        # Note: in production, you would want to return a token to the user
-        return JSONResponse(content={"room_url": room.url, token: token})
--- a/examples/deployment/modal-example/bot.py
+++ b/examples/deployment/modal-example/bot.py
@@ -1,95 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import asyncio
-import os
-import sys
-
-from dotenv import load_dotenv
-from loguru import logger
-
-from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
-from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.openai.llm import OpenAILLMService
-from pipecat.transports.services.daily import DailyParams, DailyTransport
-
-load_dotenv(override=True)
-
-logger.remove(0)
-logger.add(sys.stderr, level="DEBUG")
-
-
-async def main(room_url: str, token: str):
-    transport = DailyTransport(
-        room_url,
-        token,
-        "bot",
-        DailyParams(
-            audio_in_enabled=True,
-            audio_out_enabled=True,
-            transcription_enabled=True,
-            vad_analyzer=SileroVADAnalyzer(),
-        ),
-    )
-
-    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY", ""), voice_id="71a7ad14-091c-4e8e-a314-022ece01c121"
-    )
-
-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-        },
-    ]
-
-    context = OpenAILLMContext(messages)
-    context_aggregator = llm.create_context_aggregator(context)
-
-    pipeline = Pipeline(
-        [
-            transport.input(),
-            context_aggregator.user(),
-            llm,
-            tts,
-            transport.output(),
-            context_aggregator.assistant(),
-        ]
-    )
-
-    task = PipelineTask(
-        pipeline,
-        params=PipelineParams(
-            allow_interruptions=True,
-            enable_metrics=True,
-            enable_usage_metrics=True,
-            report_only_initial_ttfb=True,
-        ),
-    )
-
-    @transport.event_handler("on_first_participant_joined")
-    async def on_first_participant_joined(transport, participant):
-        await transport.capture_participant_transcription(participant["id"])
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([context_aggregator.user().get_context_frame()])
-
-    @transport.event_handler("on_participant_left")
-    async def on_participant_left(transport, participant, reason):
-        await task.cancel()
-
-    runner = PipelineRunner()
-
-    await runner.run(task)
-
-
-def _voice_bot_process(room_url: str, token: str):
-    asyncio.run(main(room_url, token))
--- a/examples/deployment/modal-example/client/javascript/.gitignore
+++ b/examples/deployment/modal-example/client/javascript/.gitignore
@@ -0,0 +1 @@
+node_modules
--- a/examples/deployment/modal-example/client/javascript/README.md
+++ b/examples/deployment/modal-example/client/javascript/README.md
@@ -0,0 +1,29 @@
+# JavaScript Implementation
+
+Basic implementation using the [Pipecat JavaScript SDK](https://docs.pipecat.ai/client/js/introduction).
+
+## Setup
+
+1. Deploy the Modal server. See the main [README](../../README.md).
+
+2. Navigate to the `client/javascript` directory:
+
+```bash
+cd client/javascript
+```
+
+3. Modify the `baseUrl` in `src/app.js` to point to your deployed Modal endpoint.
+
+4. Install dependencies:
+
+```bash
+npm install
+```
+
+5. Run the client app:
+
+```bash
+npm run dev
+```
+
+6. Visit http://localhost:5173 in your browser.
--- a/examples/deployment/modal-example/client/javascript/index.html
+++ b/examples/deployment/modal-example/client/javascript/index.html
@@ -0,0 +1,49 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>AI Chatbot</title>
+  </head>
+
+  <body>
+    <div class="container">
+      <div class="status-bar">
+        <div class="status">
+          Status: <span id="connection-status">Disconnected</span>
+        </div>
+        <div class="controls">
+          <select id="bot-selector">
+            <option value="openai">OpenAI</option>
+            <option value="gemini">Gemini</option>
+            <option value="vllm">Llama</option>
+          </select>
+          <button id="connect-btn">Connect</button>
+          <button id="disconnect-btn" disabled>Disconnect</button>
+        </div>
+      </div>
+
+      <div class="main-content">
+        <div class="bot-container">
+          <div id="bot-video-container"></div>
+          <audio id="bot-audio" autoplay></audio>
+        </div>
+      </div>
+
+      <div class="device-bar">
+        <div class="device-controls">
+          <select id="device-selector"></select>
+          <button id="mic-toggle-btn">Mute Mic</button>
+        </div>
+      </div>
+
+      <div class="debug-panel">
+        <h3>Debug Info</h3>
+        <div id="debug-log"></div>
+      </div>
+    </div>
+
+    <script type="module" src="/src/app.js"></script>
+    <link rel="stylesheet" href="/src/style.css" />
+  </body>
+</html>
--- a/examples/deployment/modal-example/client/javascript/package-lock.json
+++ b/examples/deployment/modal-example/client/javascript/package-lock.json
--- a/examples/deployment/modal-example/client/javascript/package.json
+++ b/examples/deployment/modal-example/client/javascript/package.json
@@ -0,0 +1,21 @@
+{
+  "name": "client",
+  "version": "1.0.0",
+  "main": "index.js",
+  "scripts": {
+    "dev": "vite",
+    "build": "vite build",
+    "preview": "vite preview"
+  },
+  "keywords": [],
+  "author": "",
+  "license": "ISC",
+  "description": "",
+  "devDependencies": {
+    "vite": "^6.3.5"
+  },
+  "dependencies": {
+    "@pipecat-ai/client-js": "^0.3.5",
+    "@pipecat-ai/daily-transport": "^0.3.10"
+  }
+}
--- a/examples/deployment/modal-example/client/javascript/src/app.js
+++ b/examples/deployment/modal-example/client/javascript/src/app.js
@@ -0,0 +1,381 @@
+/**
+ * Copyright (c) 2024–2025, Daily
+ *
+ * SPDX-License-Identifier: BSD 2-Clause License
+ */
+
+/**
+ * RTVI Client Implementation
+ *
+ * This client connects to an RTVI-compatible bot server using WebRTC (via Daily).
+ * It handles audio/video streaming and manages the connection lifecycle.
+ *
+ * Requirements:
+ * - A running RTVI bot server (set baseUrl below to your deployed Modal endpoint)
+ * - The server must implement the /connect endpoint that returns Daily.co room credentials
+ * - Browser with WebRTC support
+ */
+
+import { RTVIClient, RTVIEvent } from '@pipecat-ai/client-js';
+import { DailyTransport } from '@pipecat-ai/daily-transport';
+
+/**
+ * ChatbotClient handles the connection and media management for a real-time
+ * voice and video interaction with an AI bot.
+ */
+class ChatbotClient {
+  constructor() {
+    // Initialize client state
+    this.rtviClient = null;
+    this.setupDOMElements();
+    this.initializeClientAndTransport();
+    this.setupEventListeners();
+  }
+
+  /**
+   * Set up references to DOM elements and create necessary media elements
+   */
+  setupDOMElements() {
+    // Get references to UI control elements
+    this.connectBtn = document.getElementById('connect-btn');
+    this.disconnectBtn = document.getElementById('disconnect-btn');
+    this.statusSpan = document.getElementById('connection-status');
+    this.debugLog = document.getElementById('debug-log');
+    this.botVideoContainer = document.getElementById('bot-video-container');
+    this.deviceSelector = document.getElementById('device-selector');
+
+    // Create an audio element for bot's voice output
+    this.botAudio = document.createElement('audio');
+    this.botAudio.autoplay = true;
+    this.botAudio.playsInline = true;
+    document.body.appendChild(this.botAudio);
+  }
+
+  /**
+   * Set up event listeners for the connect/disconnect buttons, the mic selector, and the mic toggle
+   */
+  setupEventListeners() {
+    this.connectBtn.addEventListener('click', () => this.connect());
+    this.disconnectBtn.addEventListener('click', () => this.disconnect());
+
+    // Populate device selector
+    this.rtviClient.getAllMics().then((mics) => {
+      console.log('Available mics:', mics);
+      mics.forEach((device) => {
+        const option = document.createElement('option');
+        option.value = device.deviceId;
+        option.textContent = device.label || `Microphone ${device.deviceId}`;
+        this.deviceSelector.appendChild(option);
+      });
+    });
+    this.deviceSelector.addEventListener('change', (event) => {
+      const selectedDeviceId = event.target.value;
+      console.log('Selected device ID:', selectedDeviceId);
+      this.rtviClient.updateMic(selectedDeviceId);
+    });
+
+    // Handle mic mute/unmute toggle
+    const micToggleBtn = document.getElementById('mic-toggle-btn');
+
+    micToggleBtn.addEventListener('click', () => {
+      // Read the state before toggling; the button label names the next action
+      const micEnabled = this.rtviClient.isMicEnabled;
+      micToggleBtn.textContent = micEnabled ? 'Unmute Mic' : 'Mute Mic';
+      this.rtviClient.enableMic(!micEnabled);
+
+      // enableMic() above performs the actual mute/unmute; only log here
+      if (micEnabled) {
+        console.log('Mic muted');
+      } else {
+        console.log('Mic unmuted');
+      }
+    });
+  }
+
+  /**
+   * Set up the RTVI client and Daily transport
+   */
+  async initializeClientAndTransport() {
+    // Initialize the RTVI client with a DailyTransport and our configuration
+    this.rtviClient = new RTVIClient({
+      transport: new DailyTransport(),
+      params: {
+        // REPLACE WITH YOUR MODAL URL ENDPOINT
+        baseUrl:
+          'https://<Modal workspace>--pipecat-modal-bot-launcher.modal.run',
+        endpoints: {
+          connect: '/connect',
+        },
+        requestData: {
+          bot_name: 'openai',
+        },
+      },
+      enableMic: true, // Enable microphone for user input
+      enableCam: false,
+      callbacks: {
+        // Handle connection state changes
+        onConnected: () => {
+          this.updateStatus('Connected');
+          this.connectBtn.disabled = true;
+          this.disconnectBtn.disabled = false;
+          this.log('Client connected');
+        },
+        onDisconnected: () => {
+          this.updateStatus('Disconnected');
+          this.connectBtn.disabled = false;
+          this.disconnectBtn.disabled = true;
+          this.log('Client disconnected');
+        },
+        // Handle transport state changes
+        onTransportStateChanged: (state) => {
+          this.updateStatus(`Transport: ${state}`);
+          this.log(`Transport state changed: ${state}`);
+          if (state === 'connecting') {
+            window.startTime = Date.now();
+          }
+          if (state === 'ready') {
+            this.setupMediaTracks();
+            console.warn('TIME TO BOT READY:', Date.now() - window.startTime);
+          }
+        },
+        // Handle bot connection events
+        onBotConnected: (participant) => {
+          this.log(`Bot connected: ${JSON.stringify(participant)}`);
+        },
+        onBotDisconnected: (participant) => {
+          this.log(`Bot disconnected: ${JSON.stringify(participant)}`);
+        },
+        onBotReady: (data) => {
+          this.log(`Bot ready: ${JSON.stringify(data)}`);
+          this.setupMediaTracks();
+        },
+        // Transcript events
+        onUserTranscript: (data) => {
+          // Only log final transcripts
+          if (data.final) {
+            this.log(`User: ${data.text}`);
+          }
+        },
+        onBotTranscript: (data) => {
+          this.log(`Bot: ${data.text}`);
+        },
+        // Error handling
+        onMessageError: (error) => {
+          console.log('Message error:', error);
+        },
+        onMicUpdated: (data) => {
+          console.log('Mic updated:', data);
+          this.deviceSelector.value = data.deviceId;
+        },
+        onError: (error) => {
+          console.log('Error:', JSON.stringify(error));
+        },
+      },
+    });
+
+    // Set up listeners for media track events
+    this.setupTrackListeners();
+
+    await this.rtviClient.initDevices();
+    window.client = this.rtviClient;
+  }
+
+  /**
+   * Add a timestamped message to the debug log
+   */
+  log(message) {
+    const entry = document.createElement('div');
+    entry.textContent = `${new Date().toISOString()} - ${message}`;
+
+    // Add styling based on message type
+    if (message.startsWith('User: ')) {
+      entry.style.color = '#2196F3'; // blue for user
+    } else if (message.startsWith('Bot: ')) {
+      entry.style.color = '#4CAF50'; // green for bot
+    }
+
+    this.debugLog.appendChild(entry);
+    this.debugLog.scrollTop = this.debugLog.scrollHeight;
+    console.log(message);
+  }
+
+  /**
+   * Update the connection status display
+   */
+  updateStatus(status) {
+    this.statusSpan.textContent = status;
+    this.log(`Status: ${status}`);
+  }
+
+  /**
+   * Check for available media tracks and set them up if present
+   * This is called when the bot is ready or when the transport state changes to ready
+   */
+  setupMediaTracks() {
+    if (!this.rtviClient) return;
+
+    // Get current tracks from the client
+    const tracks = this.rtviClient.tracks();
+
+    // Set up any available bot tracks
+    if (tracks.bot?.audio) {
+      this.setupAudioTrack(tracks.bot.audio);
+    }
+    if (tracks.bot?.video) {
+      this.setupVideoTrack(tracks.bot.video);
+    }
+  }
+
+  /**
+   * Set up listeners for track events (start/stop)
+   * This handles new tracks being added during the session
+   */
+  setupTrackListeners() {
+    if (!this.rtviClient) return;
+
+    // Listen for new tracks starting
+    this.rtviClient.on(RTVIEvent.TrackStarted, (track, participant) => {
+      // Only handle non-local (bot) tracks
+      if (!participant?.local) {
+        if (track.kind === 'audio') {
+          this.setupAudioTrack(track);
+        } else if (track.kind === 'video') {
+          this.setupVideoTrack(track);
+        }
+        this.log(
+          `Track started event: ${track.kind} from ${
+            participant?.name || 'unknown'
+          }`
+        );
+      } else {
+        this.log('Local mic unmuted');
+      }
+    });
+
+    // Listen for tracks stopping
+    this.rtviClient.on(RTVIEvent.TrackStopped, (track, participant) => {
+      if (participant?.local) {
+        this.log('Local mic muted');
+        return;
+      }
+      this.log(
+        `Track stopped event: ${track.kind} from ${
+          participant?.name || 'unknown'
+        }`
+      );
+    });
+  }
+
+  /**
+   * Set up an audio track for playback
+   * Handles both initial setup and track updates
+   */
+  setupAudioTrack(track) {
+    this.log('Setting up audio track');
+    // Check if we're already playing this track
+    if (this.botAudio.srcObject) {
+      const oldTrack = this.botAudio.srcObject.getAudioTracks()[0];
+      if (oldTrack?.id === track.id) return;
+    }
+    // Create a new MediaStream with the track and set it as the audio source
+    this.botAudio.srcObject = new MediaStream([track]);
+  }
+
+  /**
+   * Set up a video track for display
+   * Handles both initial setup and track updates
+   */
+  setupVideoTrack(track) {
+    this.log('Setting up video track');
+    const videoEl = document.createElement('video');
+    videoEl.autoplay = true;
+    videoEl.playsInline = true;
+    videoEl.muted = true;
+    videoEl.style.width = '100%';
+    videoEl.style.height = '100%';
+    videoEl.style.objectFit = 'cover';
+
+    // Check if we're already displaying this track
+    if (this.botVideoContainer.querySelector('video')?.srcObject) {
+      const oldTrack = this.botVideoContainer
+        .querySelector('video')
+        .srcObject.getVideoTracks()[0];
+      if (oldTrack?.id === track.id) return;
+    }
+
+    // Create a new MediaStream with the track and set it as the video source
+    videoEl.srcObject = new MediaStream([track]);
+    this.botVideoContainer.innerHTML = '';
+    this.botVideoContainer.appendChild(videoEl);
+  }
+
+  /**
+   * Initialize and connect to the bot
+   * This applies the selected bot, initializes devices, and establishes the connection
+   */
+  async connect() {
+    try {
+      const botSelector = document.getElementById('bot-selector');
+      const selectedBot = botSelector.value;
+      this.rtviClient.params.requestData.bot_name = selectedBot;
+
+      // Initialize audio/video devices
+      this.log('Initializing devices...');
+      await this.rtviClient.initDevices();
+
+      // Connect to the bot
+      this.log(`Connecting to bot: ${selectedBot}`);
+      await this.rtviClient.connect();
+
+      this.log('Connection complete');
+    } catch (error) {
+      // Handle any errors during connection
+      console.error('Connection error:', error);
+      this.log(`Error connecting: ${JSON.stringify(error.message)}`);
+      this.log(`Error stack: ${error.stack}`);
+      this.updateStatus('Error');
+
+      // Clean up if there's an error
+      if (this.rtviClient) {
+        try {
+          await this.rtviClient.disconnect();
+        } catch (disconnectError) {
+          this.log(`Error during disconnect: ${disconnectError.message}`);
+        }
+      }
+    }
+  }
+
+  /**
+   * Disconnect from the bot and clean up media resources
+   */
+  async disconnect() {
+    if (this.rtviClient) {
+      try {
+        // Disconnect the RTVI client
+        await this.rtviClient.disconnect();
+
+        // Clean up audio
+        if (this.botAudio.srcObject) {
+          this.botAudio.srcObject.getTracks().forEach((track) => track.stop());
+          this.botAudio.srcObject = null;
+        }
+
+        // Clean up video
+        if (this.botVideoContainer.querySelector('video')?.srcObject) {
+          const video = this.botVideoContainer.querySelector('video');
+          video.srcObject.getTracks().forEach((track) => track.stop());
+          video.srcObject = null;
+        }
+        this.botVideoContainer.innerHTML = '';
+      } catch (error) {
+        this.log(`Error disconnecting: ${error.message}`);
+      }
+    }
+  }
+}
+
+// Initialize the client when the page loads
+window.addEventListener('DOMContentLoaded', () => {
+  new ChatbotClient();
+});
--- a/examples/deployment/modal-example/client/javascript/src/style.css
+++ b/examples/deployment/modal-example/client/javascript/src/style.css
@@ -0,0 +1,135 @@
+body {
+  margin: 0;
+  padding: 20px;
+  font-family: Arial, sans-serif;
+  background-color: #f0f0f0;
+}
+
+.container {
+  max-width: 1200px;
+  margin: 0 auto;
+}
+
+.status-bar,
+.device-bar {
+  display: flex;
+  justify-content: space-between;
+  align-items: center;
+  padding: 10px;
+  background-color: #fff;
+  border-radius: 8px;
+  margin-bottom: 20px;
+}
+
+.controls,
+.device-controls {
+  display: flex;
+  align-items: center;
+  gap: 10px; /* Adds spacing between elements */
+}
+
+.device-controls {
+  margin-left: auto;
+}
+
+.controls button,
+.device-controls button {
+  padding: 8px 16px;
+  margin-left: 10px;
+  border: none;
+  border-radius: 4px;
+  cursor: pointer;
+}
+
+#bot-selector,
+#device-selector {
+  padding: 8px 16px;
+  padding-right: 40px;
+  border: none;
+  border-radius: 4px;
+  background-color: #6c757d; /* Gray background */
+  color: white; /* White text */
+  cursor: pointer;
+  appearance: none; /* Removes default browser styling for dropdowns */
+  background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 24 24' fill='white'%3E%3Cpath d='M7 10l5 5 5-5z'/%3E%3C/svg%3E"); /* Custom arrow */
+  background-repeat: no-repeat;
+  background-position: right 8px center; /* Position the arrow */
+}
+
+#bot-selector:focus,
+#device-selector:focus {
+  outline: none;
+  box-shadow: 0 0 4px rgba(0, 0, 0, 0.3); /* Add a subtle focus effect */
+}
+
+#connect-btn {
+  background-color: #4caf50;
+  color: white;
+}
+
+#disconnect-btn {
+  background-color: #f44336;
+  color: white;
+}
+
+#mic-toggle-btn {
+}
+
+button:disabled {
+  opacity: 0.5;
+  cursor: not-allowed;
+}
+
+.main-content {
+  background-color: #fff;
+  border-radius: 8px;
+  padding: 20px;
+  margin-bottom: 20px;
+}
+
+.bot-container {
+  display: flex;
+  flex-direction: column;
+  align-items: center;
+}
+
+#bot-video-container {
+  width: 640px;
+  height: 360px;
+  background-color: #e0e0e0;
+  border-radius: 8px;
+  margin: 20px auto;
+  overflow: hidden;
+  display: flex;
+  align-items: center;
+  justify-content: center;
+}
+
+#bot-video-container video {
+  width: 100%;
+  height: 100%;
+  object-fit: cover;
+}
+
+.debug-panel {
+  background-color: #fff;
+  border-radius: 8px;
+  padding: 20px;
+}
+
+.debug-panel h3 {
+  margin: 0 0 10px 0;
+  font-size: 16px;
+  font-weight: bold;
+}
+
+#debug-log {
+  height: 200px;
+  overflow-y: auto;
+  background-color: #f8f8f8;
+  padding: 10px;
+  border-radius: 4px;
+  font-family: monospace;
+  font-size: 12px;
+  line-height: 1.4;
+}
--- a/examples/deployment/modal-example/diagram.jpg
+++ b/examples/deployment/modal-example/diagram.jpg
--- a/examples/deployment/modal-example/env.example
+++ b/examples/deployment/modal-example/env.example
@@ -1,3 +0,0 @@
-DAILY_API_KEY=
-OPENAI_API_KEY=
-CARTESIA_API_KEY=
--- a/examples/deployment/modal-example/requirements.txt
+++ b/examples/deployment/modal-example/requirements.txt
@@ -1,4 +0,0 @@
-python-dotenv==1.0.1
-modal==0.71.3
-pipecat-ai[daily,silero,cartesia,openai]
-fastapi==0.115.6
--- a/examples/deployment/modal-example/server/init.py
+++ b/examples/deployment/modal-example/server/init.py
--- a/examples/deployment/modal-example/server/app.py
+++ b/examples/deployment/modal-example/server/app.py
@@ -0,0 +1,307 @@
+"""modal_example.
+
+This module shows a simple example of how to deploy a bot using Modal and FastAPI.
+
+It includes:
+- FastAPI endpoints for starting agents and checking bot statuses.
+- Dynamic loading of bot implementations.
+- Use of a Daily transport for bot communication.
+"""
+
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import importlib
+import os
+from contextlib import asynccontextmanager
+from typing import Any, Dict, Literal
+
+import aiohttp
+import modal
+from fastapi import APIRouter, FastAPI, HTTPException
+from fastapi.responses import JSONResponse, RedirectResponse
+from pydantic import BaseModel
+
+# container specifications for the FastAPI web server
+web_image = (
+    modal.Image.debian_slim(python_version="3.13")
+    .pip_install_from_requirements("requirements.txt")
+    .pip_install("pipecat-ai[daily]")
+    .add_local_dir("src", remote_path="/root/src")
+)
+
+# container specifications for the Pipecat pipeline
+bot_image = (
+    modal.Image.debian_slim(python_version="3.13")
+    .apt_install("ffmpeg")
+    .pip_install_from_requirements("requirements.txt")
+    .pip_install("pipecat-ai[daily,elevenlabs,openai,silero,google]")
+    .add_local_dir("src", remote_path="/root/src")
+)
+
+app = modal.App("pipecat-modal", secrets=[modal.Secret.from_dotenv()])
+
+router = APIRouter()
+
+bot_jobs = {}
+daily_helpers = {}
+
+# Names of all supported bot implementations
+# These correspond to the bot files in the src directory
+BotName = Literal["openai", "gemini", "vllm"]
+
+
+def cleanup():
+    """Cleanup function to terminate all bot processes.
+
+    Called during server shutdown.
+    """
+    for entry in bot_jobs.values():
+        func = modal.FunctionCall.from_id(entry[0])
+        if func:
+            func.cancel()
+
+
+def get_bot_file(bot_name: BotName) -> str:
+    """Retrieve the bot file name corresponding to the provided bot_name.
+
+    Args:
+        bot_name (BotName): The name of the bot (e.g., 'openai', 'gemini', 'vllm').
+
+    Returns:
+        str: The file name corresponding to the bot implementation.
+
+    Raises:
+        ValueError: If the bot name is invalid or not supported.
+    """
+    # bot_implementation = os.getenv("BOT_IMPLEMENTATION", "openai").lower().strip()
+    bot_implementation = bot_name.lower().strip()
+    if not bot_implementation:
+        bot_implementation = "openai"
+    if bot_implementation not in ["openai", "gemini", "vllm"]:
+        raise ValueError(
+            f"Invalid BOT_IMPLEMENTATION: {bot_implementation}. Must be 'openai' or 'gemini' or 'vllm'"
+        )
+
+    return f"bot_{bot_implementation}"
+
+
+def get_runner(path: str, bot_file: str) -> callable:
+    """Dynamically import the run_bot function based on the bot name.
+
+    Args:
+        path (str): The path to the bot files (e.g., 'src').
+        bot_file (str): The module name of the bot implementation (e.g., 'bot_openai', 'bot_gemini', 'bot_vllm').
+
+    Returns:
+        function: The run_bot function from the specified bot module.
+
+    Raises:
+        ImportError: If the specified bot module or run_bot function is not found.
+    """
+    try:
+        # Dynamically construct the module name
+        module_name = f"{path}.{bot_file}"
+        # Import the module
+        module = importlib.import_module(module_name)
+        # Get the run_bot function from the module
+        return getattr(module, "run_bot")
+    except (ImportError, AttributeError) as e:
+        raise ImportError(f"Failed to import run_bot from {module_name}: {e}") from e
+
+
+async def create_room_and_token() -> tuple[str, str]:
+    """Create a Daily room and generate an authentication token.
+
+    This function checks for existing room URL and token in the environment variables.
+    If not found, it creates a new room using the Daily API and generates a token for it.
+
+    Returns:
+        tuple[str, str]: A tuple containing the room URL and the authentication token.
+
+    Raises:
+        HTTPException: If room creation or token generation fails.
+    """
+    from pipecat.transports.services.helpers.daily_rest import DailyRoomParams
+
+    room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", None)
+    token = os.getenv("DAILY_SAMPLE_ROOM_TOKEN", None)
+    if not room_url:
+        room = await daily_helpers["rest"].create_room(DailyRoomParams())
+        if not room.url:
+            raise HTTPException(status_code=500, detail="Failed to create room")
+        room_url = room.url
+
+        token = await daily_helpers["rest"].get_token(room_url)
+        if not token:
+            raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room_url}")
+
+    return room_url, token
+
+
+@app.function(image=bot_image, min_containers=1)
+async def bot_runner(room_url, token, bot_name: BotName = "openai"):
+    """Launch the provided bot process, providing the given room URL and token for the bot to join.
+
+    Args:
+        room_url (str): The URL of the Daily room where the bot and client will communicate.
+        token (str): The authentication token for the room.
+        bot_name (BotName): The name of the bot implementation to use. Defaults to "openai".
+
+    Raises:
+        HTTPException: If the bot pipeline fails to start.
+    """
+    try:
+        path = "src"
+        bot_file = get_bot_file(bot_name)
+        run_bot = get_runner(path, bot_file)
+
+        print(f"Starting bot process: {bot_file} -u {room_url}")  # token omitted from logs
+        await run_bot(room_url, token)
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Failed to start bot pipeline: {e}")
+
+
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    """FastAPI lifespan manager that handles startup and shutdown tasks.
+
+    - Creates aiohttp session
+    - Initializes Daily API helper
+    - Cleans up resources on shutdown
+    """
+    from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
+
+    aiohttp_session = aiohttp.ClientSession()
+    daily_helpers["rest"] = DailyRESTHelper(
+        daily_api_key=os.getenv("DAILY_API_KEY", ""),
+        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
+        aiohttp_session=aiohttp_session,
+    )
+    yield
+    await aiohttp_session.close()
+    cleanup()
+
+
+class ConnectData(BaseModel):
+    """Data provided by client to specify the bot pipeline.
+
+    Attributes:
+        bot_name (BotName): The name of the bot to connect to. Defaults to "openai".
+    """
+
+    bot_name: BotName = "openai"
+
+
+async def start(data: ConnectData):
+    """Internal method to start a bot agent and return the room URL and token.
+
+    Args:
+        data (ConnectData): The data containing the bot name to use.
+
+    Returns:
+        tuple[str, str]: A tuple containing the room URL and token.
+    """
+    room_url, token = await create_room_and_token()
+    launch_bot_func = modal.Function.from_name("pipecat-modal", "bot_runner")
+    function_call = launch_bot_func.spawn(room_url, token, data.bot_name)
+    bot_jobs[function_call.object_id] = (function_call.object_id, room_url)
+
+    return room_url, token
+
+
+@router.get("/")
+async def start_agent():
+    """A user endpoint for launching a bot agent and redirecting to the created room URL.
+
+    This function retrieves the bot implementation from the environment,
+    starts the bot agent, and redirects the user to the room URL to
+    interact with the bot through a Daily Prebuilt Interface.
+
+    Returns:
+        RedirectResponse: A response that redirects to the room URL.
+    """
+    bot_name = os.getenv("BOT_IMPLEMENTATION", "openai").lower().strip()
+    print(f"Starting bot: {bot_name}")
+    room_url, token = await start(ConnectData(bot_name=bot_name))
+
+    return RedirectResponse(room_url)
+
+
+@router.post("/connect")
+async def rtvi_connect(data: ConnectData) -> Dict[Any, Any]:
+    """A user endpoint for launching a bot agent and retrieving the room/token credentials.
+
+    This function retrieves the bot implementation from the request, if provided,
+    starts the bot agent, and returns the room URL and token for the bot. This allows the
+    client to then connect to the bot using their own RTVI interface.
+
+    Args:
+        data (ConnectData): Optional. The data containing the bot name to use.
+
+    Returns:
+        Dict[Any, Any]: A dictionary containing the room URL and token.
+    """
+    if not data.bot_name:
+        data.bot_name = os.getenv("BOT_IMPLEMENTATION", "openai").lower().strip()
+    print(f"Starting bot: {data.bot_name}")
+    room_url, token = await start(data)
+
+    return {"room_url": room_url, "token": token}
+
+
+@router.get("/status/{fid}")
+def get_status(fid: str):
+    """Retrieve the status of a bot process by its function ID.
+
+    Args:
+        fid (str): The function ID of the bot process.
+
+    Returns:
+        JSONResponse: A JSON response containing the bot's status and result code.
+
+    Raises:
+        HTTPException: If the bot process with the given ID is not found.
+    """
+    func = modal.FunctionCall.from_id(fid)
+    if not func:
+        raise HTTPException(status_code=404, detail=f"Bot with process id: {fid} not found")
+
+    try:
+        result = func.get(timeout=0)
+        return JSONResponse({"bot_id": fid, "status": "finished", "code": result})
+    except modal.exception.OutputExpiredError:
+        return JSONResponse({"bot_id": fid, "status": "finished", "code": 404})
+    except TimeoutError:
+        return JSONResponse({"bot_id": fid, "status": "running", "code": 202})
+
+
+@app.function(image=web_image, min_containers=1)
+@modal.concurrent(max_inputs=1)
+@modal.asgi_app()
+def fastapi_app():
+    """Create and configure the FastAPI application.
+
+    This function initializes the FastAPI app with middleware, routes, and lifespan management.
+    It is decorated to be used as a Modal ASGI app.
+    """
+    from fastapi.middleware.cors import CORSMiddleware
+
+    # Initialize FastAPI app
+    web_app = FastAPI(lifespan=lifespan)
+
+    web_app.add_middleware(
+        CORSMiddleware,
+        allow_origins=["*"],
+        allow_credentials=True,
+        allow_methods=["*"],
+        allow_headers=["*"],
+    )
+
+    # Register the endpoints defined on the module-level router above
+    web_app.include_router(router)
+
+    return web_app
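The `/connect` contract implemented in app.py above can be exercised without the browser client. The following is a minimal sketch and not part of this commit: `build_connect_request` and `parse_connect_response` are hypothetical helper names, and the base URL is a placeholder for your own Modal endpoint. It assumes only what the code above shows, namely a JSON request body with a `bot_name` field and a JSON response carrying `room_url` and `token`.

```python
import json
import urllib.request


def build_connect_request(base_url: str, bot_name: str = "openai") -> urllib.request.Request:
    """Build (but do not send) a POST request for the /connect endpoint.

    base_url is a placeholder for the deployed web endpoint; bot_name mirrors
    the ConnectData.bot_name field ('openai', 'gemini', or 'vllm').
    """
    body = json.dumps({"bot_name": bot_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/connect",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_connect_response(raw: bytes) -> tuple[str, str]:
    """Extract (room_url, token) from the JSON body returned by /connect."""
    payload = json.loads(raw)
    return payload["room_url"], payload["token"]
```

Sending the request with `urllib.request.urlopen(req)` and passing the response body to `parse_connect_response` should yield the credentials a client needs to join the Daily room, assuming the server is deployed and reachable.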
--- a/examples/deployment/modal-example/server/env.example
+++ b/examples/deployment/modal-example/server/env.example
@@ -0,0 +1,14 @@
+DAILY_API_KEY=
+
+# determines which bot file to default to: 'openai', 'gemini', or 'vllm'
+BOT_IMPLEMENTATION=openai
+
+# needed for the openai bot pipeline
+OPENAI_API_KEY=
+ELEVENLABS_API_KEY=
+
+# needed for the gemini live bot pipeline
+GOOGLE_API_KEY=
+
+# needed if you modified the API Key for your self-hosted LLM
+VLLM_API_KEY=
--- a/examples/deployment/modal-example/server/requirements.txt
+++ b/examples/deployment/modal-example/server/requirements.txt
@@ -0,0 +1,2 @@
+python-dotenv==1.0.1
+modal==0.71.3
--- a/examples/deployment/modal-example/server/src/init.py
+++ b/examples/deployment/modal-example/server/src/init.py
--- a/examples/deployment/modal-example/server/src/assets/robot01.png
+++ b/examples/deployment/modal-example/server/src/assets/robot01.png
--- a/examples/deployment/modal-example/server/src/assets/robot010.png
+++ b/examples/deployment/modal-example/server/src/assets/robot010.png
--- a/examples/deployment/modal-example/server/src/assets/robot011.png
+++ b/examples/deployment/modal-example/server/src/assets/robot011.png
--- a/examples/deployment/modal-example/server/src/assets/robot012.png
+++ b/examples/deployment/modal-example/server/src/assets/robot012.png
--- a/examples/deployment/modal-example/server/src/assets/robot013.png
+++ b/examples/deployment/modal-example/server/src/assets/robot013.png
--- a/examples/deployment/modal-example/server/src/assets/robot014.png
+++ b/examples/deployment/modal-example/server/src/assets/robot014.png
--- a/examples/deployment/modal-example/server/src/assets/robot015.png
+++ b/examples/deployment/modal-example/server/src/assets/robot015.png
--- a/examples/deployment/modal-example/server/src/assets/robot016.png
+++ b/examples/deployment/modal-example/server/src/assets/robot016.png
--- a/examples/deployment/modal-example/server/src/assets/robot017.png
+++ b/examples/deployment/modal-example/server/src/assets/robot017.png
--- a/examples/deployment/modal-example/server/src/assets/robot018.png
+++ b/examples/deployment/modal-example/server/src/assets/robot018.png
--- a/examples/deployment/modal-example/server/src/assets/robot019.png
+++ b/examples/deployment/modal-example/server/src/assets/robot019.png
--- a/examples/deployment/modal-example/server/src/assets/robot02.png
+++ b/examples/deployment/modal-example/server/src/assets/robot02.png
--- a/examples/deployment/modal-example/server/src/assets/robot020.png
+++ b/examples/deployment/modal-example/server/src/assets/robot020.png
--- a/examples/deployment/modal-example/server/src/assets/robot021.png
+++ b/examples/deployment/modal-example/server/src/assets/robot021.png
--- a/examples/deployment/modal-example/server/src/assets/robot022.png
+++ b/examples/deployment/modal-example/server/src/assets/robot022.png
--- a/examples/deployment/modal-example/server/src/assets/robot023.png
+++ b/examples/deployment/modal-example/server/src/assets/robot023.png
--- a/examples/deployment/modal-example/server/src/assets/robot024.png
+++ b/examples/deployment/modal-example/server/src/assets/robot024.png
--- a/examples/deployment/modal-example/server/src/assets/robot025.png
+++ b/examples/deployment/modal-example/server/src/assets/robot025.png
--- a/examples/deployment/modal-example/server/src/assets/robot03.png
+++ b/examples/deployment/modal-example/server/src/assets/robot03.png
--- a/examples/deployment/modal-example/server/src/assets/robot04.png
+++ b/examples/deployment/modal-example/server/src/assets/robot04.png
--- a/examples/deployment/modal-example/server/src/assets/robot05.png
+++ b/examples/deployment/modal-example/server/src/assets/robot05.png
--- a/examples/deployment/modal-example/server/src/assets/robot06.png
+++ b/examples/deployment/modal-example/server/src/assets/robot06.png
--- a/examples/deployment/modal-example/server/src/assets/robot07.png
+++ b/examples/deployment/modal-example/server/src/assets/robot07.png
--- a/examples/deployment/modal-example/server/src/assets/robot08.png
+++ b/examples/deployment/modal-example/server/src/assets/robot08.png
--- a/examples/deployment/modal-example/server/src/assets/robot09.png
+++ b/examples/deployment/modal-example/server/src/assets/robot09.png
--- a/examples/deployment/modal-example/server/src/bot_gemini.py
+++ b/examples/deployment/modal-example/server/src/bot_gemini.py
@@ -0,0 +1,198 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Gemini Bot Implementation.
+
+This module implements a chatbot using Google's Gemini Multimodal Live model.
+It includes:
+- Real-time audio/video interaction through Daily
+- Animated robot avatar
+- Speech-to-speech model
+
+The bot runs as part of a pipeline that processes audio/video frames and manages
+the conversation flow using Gemini's streaming capabilities.
+"""
+
+import os
+import sys
+
+from dotenv import load_dotenv
+from loguru import logger
+from PIL import Image
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import (
+    BotStartedSpeakingFrame,
+    BotStoppedSpeakingFrame,
+    Frame,
+    OutputImageRawFrame,
+    SpriteFrame,
+)
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIObserver, RTVIProcessor
+from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+
+load_dotenv(override=True)
+
+try:
+    logger.remove(0)
+    logger.add(sys.stderr, level="DEBUG")
+except ValueError:
+    # Handle the case where logger is already initialized
+    pass
+
+sprites = []
+script_dir = os.path.dirname(__file__)
+
+for i in range(1, 26):
+    # Build the full path to the image file
+    full_path = os.path.join(script_dir, f"assets/robot0{i}.png")
+    # Open the image and convert it to raw bytes; each frame is appended
+    # to `sprites` in sequence order
+    with Image.open(full_path) as img:
+        sprites.append(OutputImageRawFrame(image=img.tobytes(), size=img.size, format=img.format))
+
+# Create a smooth animation by adding reversed frames
+flipped = sprites[::-1]
+sprites.extend(flipped)
+
+# Define static and animated states
+quiet_frame = sprites[0]  # Static frame for when bot is listening
+talking_frame = SpriteFrame(images=sprites)  # Animation sequence for when bot is talking
+
+
+class TalkingAnimation(FrameProcessor):
+    """Manages the bot's visual animation states.
+
+    Switches between static (listening) and animated (talking) states based on
+    the bot's current speaking status.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self._is_talking = False
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        """Process incoming frames and update animation state.
+
+        Args:
+            frame: The incoming frame to process
+            direction: The direction of frame flow in the pipeline
+        """
+        await super().process_frame(frame, direction)
+
+        # Switch to talking animation when bot starts speaking
+        if isinstance(frame, BotStartedSpeakingFrame):
+            if not self._is_talking:
+                await self.push_frame(talking_frame)
+                self._is_talking = True
+        # Return to static frame when bot stops speaking
+        elif isinstance(frame, BotStoppedSpeakingFrame):
+            await self.push_frame(quiet_frame)
+            self._is_talking = False
+
+        await self.push_frame(frame, direction)
+
+
+async def run_bot(room_url: str, token: str):
+    """Main bot execution function.
+
+    Sets up and runs the bot pipeline including:
+    - Daily video transport with specific audio parameters
+    - Gemini Live multimodal model integration
+    - Voice activity detection
+    - Animation processing
+    - RTVI event handling
+    """
+    # Set up Daily transport with specific audio/video parameters for Gemini
+    transport = DailyTransport(
+        room_url,
+        token,
+        "Chatbot",
+        DailyParams(
+            audio_out_enabled=True,
+            camera_out_enabled=True,
+            camera_out_width=1024,
+            camera_out_height=576,
+            vad_enabled=True,
+            vad_audio_passthrough=True,
+            vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
+        ),
+    )
+
+    # Initialize the Gemini Multimodal Live model
+    llm = GeminiMultimodalLiveLLMService(
+        api_key=os.getenv("GOOGLE_API_KEY"),
+        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
+        transcribe_user_audio=True,
+    )
+
+    messages = [
+        {
+            "role": "user",
+            "content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself.",
+        },
+    ]
+
+    # Set up conversation context and management
+    # The context_aggregator will automatically collect conversation context
+    context = OpenAILLMContext(messages)
+    context_aggregator = llm.create_context_aggregator(context)
+
+    ta = TalkingAnimation()
+
+    #
+    # RTVI events for Pipecat client UI
+    #
+    rtvi = RTVIProcessor(config=RTVIConfig(config=[]))
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            rtvi,
+            context_aggregator.user(),
+            llm,
+            ta,
+            transport.output(),
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        observers=[RTVIObserver(rtvi)],
+    )
+    await task.queue_frame(quiet_frame)
+
+    @rtvi.event_handler("on_client_ready")
+    async def on_client_ready(rtvi):
+        await rtvi.set_bot_ready()
+        # Kick off the conversation
+        await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+    @transport.event_handler("on_first_participant_joined")
+    async def on_first_participant_joined(transport, participant):
+        await transport.capture_participant_transcription(participant["id"])
+
+    @transport.event_handler("on_participant_left")
+    async def on_participant_left(transport, participant, reason):
+        print(f"Participant left: {participant}")
+        await task.cancel()
+
+    runner = PipelineRunner()
+
+    await runner.run(task)
--- a/examples/deployment/modal-example/server/src/bot_openai.py
+++ b/examples/deployment/modal-example/server/src/bot_openai.py
@@ -0,0 +1,226 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""OpenAI Bot Implementation.
+
+This module implements a chatbot using OpenAI's GPT-4 model for natural language
+processing. It includes:
+- Real-time audio/video interaction through Daily
+- Animated robot avatar
+- Text-to-speech using ElevenLabs
+- Support for both English and Spanish
+
+The bot runs as part of a pipeline that processes audio/video frames and manages
+the conversation flow.
+"""
+
+import os
+import sys
+
+from dotenv import load_dotenv
+from loguru import logger
+from PIL import Image
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import (
+    BotStartedSpeakingFrame,
+    BotStoppedSpeakingFrame,
+    Frame,
+    OutputImageRawFrame,
+    SpriteFrame,
+)
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIObserver, RTVIProcessor
+from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+
+load_dotenv(override=True)
+
+try:
+    logger.remove(0)
+    logger.add(sys.stderr, level="DEBUG")
+except ValueError:
+    # Handle the case where logger is already initialized
+    pass
+
+sprites = []
+script_dir = os.path.dirname(__file__)
+
+# Load sequential animation frames
+for i in range(1, 26):
+    # Build the full path to the image file
+    full_path = os.path.join(script_dir, f"assets/robot0{i}.png")
+    # Open the image and convert it to raw bytes; each frame is appended
+    # to `sprites` in sequence order
+    with Image.open(full_path) as img:
+        sprites.append(OutputImageRawFrame(image=img.tobytes(), size=img.size, format=img.format))
+
+# Create a smooth animation by adding reversed frames
+flipped = sprites[::-1]
+sprites.extend(flipped)
+
+# Define static and animated states
+quiet_frame = sprites[0]  # Static frame for when bot is listening
+talking_frame = SpriteFrame(images=sprites)  # Animation sequence for when bot is talking
+
+
+class TalkingAnimation(FrameProcessor):
+    """Manages the bot's visual animation states.
+
+    Switches between static (listening) and animated (talking) states based on
+    the bot's current speaking status.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self._is_talking = False
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        """Process incoming frames and update animation state.
+
+        Args:
+            frame: The incoming frame to process
+            direction: The direction of frame flow in the pipeline
+        """
+        await super().process_frame(frame, direction)
+
+        # Switch to talking animation when bot starts speaking
+        if isinstance(frame, BotStartedSpeakingFrame):
+            if not self._is_talking:
+                await self.push_frame(talking_frame)
+                self._is_talking = True
+        # Return to static frame when bot stops speaking
+        elif isinstance(frame, BotStoppedSpeakingFrame):
+            await self.push_frame(quiet_frame)
+            self._is_talking = False
+
+        await self.push_frame(frame, direction)
+
+
+async def run_bot(room_url: str, token: str):
+    """Main bot execution function.
+
+    Sets up and runs the bot pipeline including:
+    - Daily video transport
+    - Speech-to-text and text-to-speech services
+    - Language model integration
+    - Animation processing
+    - RTVI event handling
+    """
+    # Set up Daily transport with video/audio parameters
+    transport = DailyTransport(
+        room_url,
+        token,
+        "Chatbot",
+        DailyParams(
+            audio_out_enabled=True,
+            camera_out_enabled=True,
+            camera_out_width=1024,
+            camera_out_height=576,
+            vad_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(),
+            transcription_enabled=True,
+            #
+            # Spanish
+            #
+            # transcription_settings=DailyTranscriptionSettings(
+            #     language="es",
+            #     tier="nova",
+            #     model="2-general"
+            # )
+        ),
+    )
+
+    # Initialize text-to-speech service
+    tts = ElevenLabsTTSService(
+        api_key=os.getenv("ELEVENLABS_API_KEY"),
+        #
+        # English
+        #
+        voice_id="SAz9YHcvj6GT2YYXdXww",
+        #
+        # Spanish
+        #
+        # model="eleven_multilingual_v2",
+        # voice_id="gD1IexrzCvsXPHUuT0s3",
+    )
+
+    # Initialize LLM service
+    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
+    messages = [
+        {
+            "role": "system",
+            #
+            # English
+            #
+            "content": "You are an incessant one-upper. Start by asking the user how their day is going.",
+            #
+            # Spanish
+            #
+            # "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
+        },
+    ]
+
+    # Set up conversation context and management
+    # The context_aggregator will automatically collect conversation context
+    context = OpenAILLMContext(messages)
+    context_aggregator = llm.create_context_aggregator(context)
+
+    ta = TalkingAnimation()
+
+    #
+    # RTVI events for Pipecat client UI
+    #
+    rtvi = RTVIProcessor(config=RTVIConfig(config=[]))
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            rtvi,
+            context_aggregator.user(),
+            llm,
+            tts,
+            ta,
+            transport.output(),
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        observers=[RTVIObserver(rtvi)],
+    )
+    await task.queue_frame(quiet_frame)
+
+    @rtvi.event_handler("on_client_ready")
+    async def on_client_ready(rtvi):
+        await rtvi.set_bot_ready()
+        # Kick off the conversation
+        await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+    @transport.event_handler("on_first_participant_joined")
+    async def on_first_participant_joined(transport, participant):
+        await transport.capture_participant_transcription(participant["id"])
+
+    @transport.event_handler("on_participant_left")
+    async def on_participant_left(transport, participant, reason):
+        print(f"Participant left: {participant}")
+        await task.cancel()
+
+    runner = PipelineRunner()
+
+    await runner.run(task)
--- a/examples/deployment/modal-example/server/src/bot_vllm.py
+++ b/examples/deployment/modal-example/server/src/bot_vllm.py
@@ -0,0 +1,239 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""vLLM Bot Implementation.
+
+This module implements a chatbot using an OpenAI-compatible vLLM server hosted
+on Modal for natural language processing. It includes:
+- Real-time audio/video interaction through Daily
+- Animated robot avatar
+- Text-to-speech using ElevenLabs
+- Support for both English and Spanish
+
+The bot runs as part of a pipeline that processes audio/video frames and manages
+the conversation flow.
+"""
+
+import os
+import sys
+from typing import List
+
+from dotenv import load_dotenv
+from loguru import logger
+from openai.types.chat import ChatCompletionMessageParam
+from PIL import Image
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import (
+    BotStartedSpeakingFrame,
+    BotStoppedSpeakingFrame,
+    Frame,
+    OutputImageRawFrame,
+    SpriteFrame,
+)
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIObserver, RTVIProcessor
+from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+
+load_dotenv(override=True)
+
+try:
+    logger.remove(0)
+    logger.add(sys.stderr, level="DEBUG")
+except ValueError:
+    # Handle the case where logger is already initialized
+    pass
+
+# REPLACE WITH YOUR MODAL URL ENDPOINT
+modal_url = "https://<Modal workspace>--example-vllm-openai-compatible-serve.modal.run"
+api_key = os.getenv("VLLM_API_KEY", "super-secret-key")
+
+
+sprites = []
+script_dir = os.path.dirname(__file__)
+
+# Load sequential animation frames
+for i in range(1, 26):
+    # Build the full path to the image file
+    full_path = os.path.join(script_dir, f"assets/robot0{i}.png")
+    # Open the image and convert it to raw bytes; each frame is appended
+    # to `sprites` in sequence order
+    with Image.open(full_path) as img:
+        sprites.append(OutputImageRawFrame(image=img.tobytes(), size=img.size, format=img.format))
+
+# Create a smooth animation by adding reversed frames
+flipped = sprites[::-1]
+sprites.extend(flipped)
+
+# Define static and animated states
+quiet_frame = sprites[0]  # Static frame for when bot is listening
+talking_frame = SpriteFrame(images=sprites)  # Animation sequence for when bot is talking
+
+
+class TalkingAnimation(FrameProcessor):
+    """Manages the bot's visual animation states.
+
+    Switches between static (listening) and animated (talking) states based on
+    the bot's current speaking status.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self._is_talking = False
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        """Process incoming frames and update animation state.
+
+        Args:
+            frame: The incoming frame to process
+            direction: The direction of frame flow in the pipeline
+        """
+        await super().process_frame(frame, direction)
+
+        # Switch to talking animation when bot starts speaking
+        if isinstance(frame, BotStartedSpeakingFrame):
+            if not self._is_talking:
+                await self.push_frame(talking_frame)
+                self._is_talking = True
+        # Return to static frame when bot stops speaking
+        elif isinstance(frame, BotStoppedSpeakingFrame):
+            await self.push_frame(quiet_frame)
+            self._is_talking = False
+
+        await self.push_frame(frame, direction)
+
+
+async def run_bot(room_url: str, token: str):
+    """Main bot execution function.
+
+    Sets up and runs the bot pipeline including:
+    - Daily video transport
+    - Speech-to-text and text-to-speech services
+    - Language model integration
+    - Animation processing
+    - RTVI event handling
+    """
+    # Set up Daily transport with video/audio parameters
+    transport = DailyTransport(
+        room_url,
+        token,
+        "Chatbot",
+        DailyParams(
+            audio_out_enabled=True,
+            camera_out_enabled=True,
+            camera_out_width=1024,
+            camera_out_height=576,
+            vad_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(),
+            transcription_enabled=True,
+            #
+            # Spanish
+            #
+            # transcription_settings=DailyTranscriptionSettings(
+            #     language="es",
+            #     tier="nova",
+            #     model="2-general"
+            # )
+        ),
+    )
+
+    # Initialize text-to-speech service
+    tts = ElevenLabsTTSService(
+        api_key=os.getenv("ELEVENLABS_API_KEY"),
+        #
+        # English
+        #
+        voice_id="D38z5RcWu1voky8WS1ja",
+        #
+        # Spanish
+        #
+        # model="eleven_multilingual_v2",
+        # voice_id="gD1IexrzCvsXPHUuT0s3",
+    )
+
+    # Initialize LLM service
+    llm = OpenAILLMService(
+        # API key for the vLLM server (read from VLLM_API_KEY)
+        api_key=api_key,
+        # Model and base URL of the OpenAI-compatible vLLM server on Modal
+        model="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16",
+        base_url=f"{modal_url}/v1",
+    )
+
+    messages = [
+        {
+            "role": "system",
+            #
+            # English
+            #
+            "content": "You are a salesman for Modal, the cloud-native serverless Python computing platform.",
+            #
+            # Spanish
+            #
+            # "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
+        },
+    ]
+
+    # Set up conversation context and management
+    # The context_aggregator will automatically collect conversation context
+    context = OpenAILLMContext(messages)
+    context_aggregator = llm.create_context_aggregator(context)
+
+    ta = TalkingAnimation()
+
+    #
+    # RTVI events for Pipecat client UI
+    #
+    rtvi = RTVIProcessor(config=RTVIConfig(config=[]))
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            rtvi,
+            context_aggregator.user(),
+            llm,
+            tts,
+            ta,
+            transport.output(),
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        observers=[RTVIObserver(rtvi)],
+    )
+    await task.queue_frame(quiet_frame)
+
+    @rtvi.event_handler("on_client_ready")
+    async def on_client_ready(rtvi):
+        await rtvi.set_bot_ready()
+        # Kick off the conversation
+        await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+    @transport.event_handler("on_first_participant_joined")
+    async def on_first_participant_joined(transport, participant):
+        await transport.capture_participant_transcription(participant["id"])
+
+    @transport.event_handler("on_participant_left")
+    async def on_participant_left(transport, participant, reason):
+        print(f"Participant left: {participant}")
+        await task.cancel()
+
+    runner = PipelineRunner()
+
+    await runner.run(task)
--- a/examples/deployment/modal-example/server/src/runner.py
+++ b/examples/deployment/modal-example/server/src/runner.py
@@ -0,0 +1,84 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import asyncio
+import importlib
+import os
+
+
+def get_bot_file(arg_bot: str | None) -> str:
+    bot_implementation = (arg_bot or os.getenv("BOT_IMPLEMENTATION", "openai")).lower().strip()
+    if not bot_implementation:
+        bot_implementation = "openai"
+    if bot_implementation not in ["openai", "gemini", "vllm"]:
+        raise ValueError(
+            f"Invalid BOT_IMPLEMENTATION: {bot_implementation}. Must be 'openai', 'gemini' or 'vllm'"
+        )
+    return f"bot_{bot_implementation}"
+
+
+def get_runner(bot_file: str):
+    """Dynamically import the run_bot function based on the bot name.
+
+    Args:
+        bot_file (str): The module name of the bot implementation (e.g., 'bot_openai', 'bot_gemini').
+
+    Returns:
+        function: The run_bot function from the specified bot module.
+
+    Raises:
+        ImportError: If the specified bot module or run_bot function is not found.
+    """
+    try:
+        # Dynamically construct the module name
+        module_name = f"{bot_file}"
+        # Import the module
+        module = importlib.import_module(module_name)
+        # Get the run_bot function from the module
+        return getattr(module, "run_bot")
+    except (ImportError, AttributeError) as e:
+        raise ImportError(f"Failed to import run_bot from {module_name}: {e}")
+
+
+def main():
+    """Parse the args to launch the appropriate bot using the given room/token."""
+    parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
+    parser.add_argument(
+        "-u", "--url", type=str, required=False, help="URL of the Daily room to join"
+    )
+    parser.add_argument(
+        "-t",
+        "--token",
+        type=str,
+        required=False,
+        help="Daily room token",
+    )
+    parser.add_argument(
+        "-b",
+        "--bot",
+        type=str,
+        required=False,
+        help="Bot implementation to use (openai, gemini or vllm)",
+    )
+
+    args, unknown = parser.parse_known_args()
+
+    url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
+    token = args.token or os.getenv("DAILY_SAMPLE_ROOM_TOKEN")
+    bot_file = get_bot_file(args.bot)
+
+    if not url:
+        raise Exception(
+            "No Daily room specified. Use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
+        )
+
+    run_bot = get_runner(bot_file)
+    asyncio.run(run_bot(url, token))
+
+
+if __name__ == "__main__":
+    main()
--- a/examples/deployment/pipecat-cloud-daily-pstn-server/README.md
+++ b/examples/deployment/pipecat-cloud-daily-pstn-server/README.md
@@ -100,7 +100,28 @@ phone numbers with valid values for your use case.

 ### Dialin Request

-The server will receive a request when a call is received from Daily.
+The server will receive a request when a call is received from Daily.
+The webhook receives the following payload:
+```json
+{
+  // for dial-in from webhook
+  "To": "+14152251493",
+  "From": "+14158483432",
+  "callId": "string-contains-uuid",
+  "callDomain": "string-contains-uuid",
+  "sipHeaders": {
+    "X-My-Custom-Header": "value",
+    "x-caller": "+1234567890",
+    "x-called": "+1987654321"
+  }
+}
+```
+The `To`, `From`, `callId` and `callDomain` fields are converted to
+`snake_case` and mapped to `dialin_settings`. In addition, `sipHeaders`
+contains any custom SIP headers received by Daily on the SIP
+interconnect address (`sip_uri`). These headers are sent by
+Twilio or other external SIP platforms, for example to pass along the
+caller's phone number.

 ### Dialout Request

@@ -158,6 +179,7 @@ curl -X POST http://localhost:3000/api/dial \
    "From": "+1987654321",
    "callId": "call-uuid-123",
    "callDomain": "domain-uuid-456",
+    "sipHeaders": {},
    "dialout_settings": [
      {
        "phoneNumber": "+1234567890",
--- a/examples/deployment/pipecat-cloud-daily-pstn-server/fastapi-webhook-server/server.py
+++ b/examples/deployment/pipecat-cloud-daily-pstn-server/fastapi-webhook-server/server.py
@@ -39,6 +39,11 @@ class RoomRequest(BaseModel):
        None, description="A flag to perform voicemail or answeing-machine detection"
    )
    call_transfer: Optional[Dict[str, Any]] = Field(None, description="to initiate a call transfer")
+    sipHeaders: Optional[Dict[str, Any]] = Field(
+        None,
+        alias="sip_headers",
+        description="Custom SIP headers received from the external SIP provider",
+    )

    class Config:
        populate_by_name = True
@@ -57,6 +62,14 @@ class RoomRequest(BaseModel):
    "callDomain": "string-contains-uuid"
    These need to be remapped to dialin_settings

+    In addition, the body may include custom SIP headers, which are
+    sent to the bot as a custom field, sip_headers:
+    "sipHeaders": {
+        "X-My-Custom-Header": "value",
+        "x-caller": "+14158483432",
+        "x-called": "+14152251493",
+    },
+
    "dialout_settings": [
        {"phoneNumber": "+14158483432", "callerId": "+14152251493"}, 
        {"sipUri": "sip:username@sip.hostname"}
@@ -157,6 +170,7 @@ async def dial(request: RoomRequest, raw_request: Request):
            "dialout_settings": request.dialout_settings,
            "voicemail_detection": request.voicemail_detection,
            "call_transfer": request.call_transfer,
+            "sip_headers": request.sipHeaders,  # passing the SIP headers to the bot
        },
    }

--- a/examples/deployment/pipecat-cloud-daily-pstn-server/nextjs-webhook-server/pages/api/dial.js
+++ b/examples/deployment/pipecat-cloud-daily-pstn-server/nextjs-webhook-server/pages/api/dial.js
@@ -65,6 +65,7 @@ export default async function handler(req, res) {
      From,
      callId,
      callDomain,
+      sipHeaders,
      dialout_settings,
      voicemail_detection,
      call_transfer
@@ -117,6 +118,7 @@ export default async function handler(req, res) {
        dialout_settings,
        voicemail_detection,
        call_transfer,
+        sip_headers: sipHeaders,
      },
    };

--- a/examples/deployment/pipecat-cloud-example/bot.py
+++ b/examples/deployment/pipecat-cloud-example/bot.py
@@ -4,6 +4,7 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+import asyncio
 import os

 import aiohttp
@@ -21,44 +22,23 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.services.daily import DailyParams, DailyTransport

-# Check if we're in local development mode
-LOCAL_RUN = os.getenv("LOCAL_RUN")
-if LOCAL_RUN:
-    import asyncio
-    import webbrowser
-
-    try:
-        from local_runner import configure
-    except ImportError:
-        logger.error("Could not import local_runner module. Local development mode may not work.")
-
 # Load environment variables
 load_dotenv(override=True)

+# Check if we're in local development mode
+LOCAL_RUN = os.getenv("LOCAL_RUN")

-async def main(room_url: str, token: str):
+
+async def main(transport: DailyTransport):
    """Main pipeline setup and execution function.

    Args:
-        room_url: The Daily room URL
-        token: The Daily room token
+        transport: The DailyTransport object for the bot
    """
-    logger.debug("Starting bot in room: {}", room_url)
-
-    transport = DailyTransport(
-        room_url,
-        token,
-        "bot",
-        DailyParams(
-            audio_in_enabled=True,
-            audio_out_enabled=True,
-            transcription_enabled=True,
-            vad_analyzer=SileroVADAnalyzer(),
-        ),
-    )
+    logger.debug("Starting bot")

    tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"), voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22"
+        api_key=os.getenv("CARTESIA_API_KEY"), voice_id="71a7ad14-091c-4e8e-a314-022ece01c121"
    )

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
@@ -126,10 +106,25 @@ async def bot(args: DailySessionArguments):
        body: The configuration object from the request body
        session_id: The session ID for logging
    """
+    from pipecat.audio.filters.krisp_filter import KrispFilter
+
    logger.info(f"Bot process initialized {args.room_url} {args.token}")

+    transport = DailyTransport(
+        args.room_url,
+        args.token,
+        "Pipecat Bot",
+        DailyParams(
+            audio_in_enabled=True,
+            audio_in_filter=None if LOCAL_RUN else KrispFilter(),
+            audio_out_enabled=True,
+            transcription_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(),
+        ),
+    )
+
    try:
-        await main(args.room_url, args.token)
+        await main(transport)
        logger.info("Bot process completed")
    except Exception as e:
        logger.exception(f"Error in bot process: {str(e)}")
@@ -137,18 +132,27 @@ async def bot(args: DailySessionArguments):


 # Local development functions
-async def local_main():
+async def local_daily():
    """Function for local development testing."""
+    from local_runner import configure
+
    try:
        async with aiohttp.ClientSession() as session:
            (room_url, token) = await configure(session)
-            logger.warning("_")
-            logger.warning("_")
-            logger.warning(f"Talk to your voice agent here: {room_url}")
-            logger.warning("_")
-            logger.warning("_")
-            webbrowser.open(room_url)
-            await main(room_url, token)
+            transport = DailyTransport(
+                room_url,
+                token,
+                "Pipecat Bot",
+                DailyParams(
+                    audio_in_enabled=True,
+                    audio_out_enabled=True,
+                    transcription_enabled=True,
+                    vad_analyzer=SileroVADAnalyzer(),
+                ),
+            )
+
+            await main(transport)
+
    except Exception as e:
        logger.exception(f"Error in local development mode: {e}")

@@ -156,6 +160,6 @@ async def local_main():
 # Local development entry point
 if LOCAL_RUN and __name__ == "__main__":
    try:
-        asyncio.run(local_main())
+        asyncio.run(local_daily())
    except Exception as e:
        logger.exception(f"Failed to run in local mode: {e}")
--- a/examples/deployment/pipecat-cloud-example/env.example
+++ b/examples/deployment/pipecat-cloud-example/env.example
@@ -1,2 +1,4 @@
 CARTESIA_API_KEY=
-OPENAI_API_KEY=
+OPENAI_API_KEY=
+# Local dev only
+DAILY_API_KEY=
--- a/examples/deployment/pipecat-cloud-example/local_runner.py
+++ b/examples/deployment/pipecat-cloud-example/local_runner.py
@@ -7,6 +7,7 @@
 import os

 import aiohttp
+from fastapi import HTTPException

 from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams

--- a/examples/deployment/pipecat-cloud-example/pcc-deploy.toml
+++ b/examples/deployment/pipecat-cloud-example/pcc-deploy.toml
@@ -1,6 +1,8 @@
 agent_name = "my-first-agent"
 image = "your-username/my-first-agent:0.1"
+image_credentials = "your-dockerhub-creds"
 secret_set = "my-first-agent-secrets"
+enable_krisp = true

 [scaling]
 	min_instances = 0
--- a/examples/foundational/04c-transports-daily-audio-source.py
+++ b/examples/foundational/04c-transports-daily-audio-source.py
@@ -0,0 +1,111 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import os
+import sys
+
+import aiohttp
+from daily_runner import configure
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService, Language, LiveOptions
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def main():
+    async with aiohttp.ClientSession() as session:
+        (room_url, token) = await configure(session)
+
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Respond bot",
+            DailyParams(
+                audio_in_enabled=True,
+                audio_in_passthrough=False,
+                audio_out_enabled=True,
+                audio_out_sample_rate=16000,
+                transcription_enabled=False,
+                vad_analyzer=SileroVADAnalyzer(),
+            ),
+        )
+
+        stt = DeepgramSTTService(
+            api_key=os.getenv("DEEPGRAM_API_KEY"),
+            live_options=LiveOptions(language=Language.EN),
+        )
+
+        tts = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+        )
+
+        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        context = OpenAILLMContext(messages)
+        context_aggregator = llm.create_context_aggregator(context)
+
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                stt,
+                context_aggregator.user(),  # User responses
+                llm,  # LLM
+                tts,  # TTS
+                transport.output(),  # Transport bot output
+                context_aggregator.assistant(),  # Assistant spoken responses
+            ]
+        )
+
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                allow_interruptions=True,
+                enable_metrics=True,
+                enable_usage_metrics=True,
+                report_only_initial_ttfb=True,
+            ),
+        )
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            await transport.capture_participant_audio(participant["id"])
+            # Kick off the conversation.
+            messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+        @transport.event_handler("on_participant_left")
+        async def on_participant_left(transport, participant, reason):
+            await task.cancel()
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
--- a/examples/foundational/07-interruptible-cartesia-http.py
+++ b/examples/foundational/07-interruptible-cartesia-http.py
@@ -15,7 +15,7 @@ from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
-from pipecat.services.aws.tts import PollyTTSService
+from pipecat.services.cartesia.tts import CartesiaHttpTTSService
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import TransportParams
@@ -39,12 +39,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac

    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

-    tts = PollyTTSService(
-        api_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
-        aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
-        region=os.getenv("AWS_REGION"),
-        voice_id="Amy",
-        params=PollyTTSService.InputParams(engine="neural", language="en-GB", rate="1.05"),
+    tts = CartesiaHttpTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
@@ -62,7 +59,7 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
-            stt,  # STT
+            stt,
            context_aggregator.user(),  # User responses
            llm,  # LLM
            tts,  # TTS
--- a/examples/foundational/07c-interruptible-deepgram-vad.py
+++ b/examples/foundational/07c-interruptible-deepgram-vad.py
@@ -47,7 +47,7 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
        live_options=LiveOptions(vad_events=True, utterance_end_ms="1000"),
    )

-    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-helios-en")
+    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

--- a/examples/foundational/07c-interruptible-deepgram.py
+++ b/examples/foundational/07c-interruptible-deepgram.py
@@ -39,7 +39,7 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac

    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

-    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-helios-en")
+    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

--- a/examples/foundational/07l-interruptible-groq.py
+++ b/examples/foundational/07l-interruptible-groq.py
@@ -14,6 +14,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import LLMUserAggregatorParams
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.services.groq.llm import GroqLLMService
 from pipecat.services.groq.stt import GroqSTTService
@@ -39,7 +40,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac

    stt = GroqSTTService(api_key=os.getenv("GROQ_API_KEY"))

-    llm = GroqLLMService(api_key=os.getenv("GROQ_API_KEY"), model="llama-3.3-70b-versatile")
+    llm = GroqLLMService(
+        api_key=os.getenv("GROQ_API_KEY"), model="meta-llama/llama-4-maverick-17b-128e-instruct"
+    )

    tts = GroqTTSService(api_key=os.getenv("GROQ_API_KEY"))

@@ -51,7 +54,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
    ]

    context = OpenAILLMContext(messages)
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = llm.create_context_aggregator(
+        context, user_params=LLMUserAggregatorParams(aggregation_timeout=0.05)
+    )

    pipeline = Pipeline(
        [
--- a/examples/foundational/07m-interruptible-aws.py
+++ b/examples/foundational/07m-interruptible-aws.py
@@ -0,0 +1,109 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.aws.llm import AWSBedrockLLMService
+from pipecat.services.aws.stt import AWSTranscribeSTTService
+from pipecat.services.aws.tts import AWSPollyTTSService
+from pipecat.transports.base_transport import TransportParams
+from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
+from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
+
+load_dotenv(override=True)
+
+
+async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
+    logger.info(f"Starting bot")
+
+    transport = SmallWebRTCTransport(
+        webrtc_connection=webrtc_connection,
+        params=TransportParams(
+            audio_in_enabled=True,
+            audio_out_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(),
+        ),
+    )
+
+    stt = AWSTranscribeSTTService()
+
+    tts = AWSPollyTTSService(
+        region="us-west-2",  # only specific regions support generative TTS
+        voice_id="Joanna",
+        params=AWSPollyTTSService.InputParams(engine="generative", rate="1.1"),
+    )
+
+    llm = AWSBedrockLLMService(
+        aws_region="us-west-2",
+        model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
+        params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
+    )
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = OpenAILLMContext(messages)
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            enable_usage_metrics=True,
+            report_only_initial_ttfb=True,
+        ),
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected")
+        # Kick off the conversation.
+        messages.append({"role": "user", "content": "Please introduce yourself to the user."})
+        await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+
+    @transport.event_handler("on_client_closed")
+    async def on_client_closed(transport, client):
+        logger.info(f"Client closed connection")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=False)
+
+    await runner.run(task)
+
+
+if __name__ == "__main__":
+    from run import main
+
+    main()
--- a/examples/foundational/07q-interruptible-rime-http.py
+++ b/examples/foundational/07q-interruptible-rime-http.py
@@ -44,7 +44,8 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac

        tts = RimeHttpTTSService(
            api_key=os.getenv("RIME_API_KEY", ""),
-            voice_id="rex",
+            voice_id="luna",
+            model="arcana",
            aiohttp_session=session,
        )

--- a/examples/foundational/07r-interruptible-riva-nim.py
+++ b/examples/foundational/07r-interruptible-riva-nim.py
@@ -16,8 +16,12 @@ from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.services.nim.llm import NimLLMService
-from pipecat.services.riva.stt import ParakeetSTTService
-from pipecat.services.riva.tts import FastPitchTTSService
+from pipecat.services.riva.stt import (
+    ParakeetSTTService,
+    RivaSegmentedSTTService,
+    RivaSTTService,
+)
+from pipecat.services.riva.tts import FastPitchTTSService, RivaTTSService
 from pipecat.transports.base_transport import TransportParams
 from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
 from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
@@ -37,11 +41,11 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
        ),
    )

-    stt = ParakeetSTTService(api_key=os.getenv("NVIDIA_API_KEY"))
+    stt = RivaSTTService(api_key=os.getenv("NVIDIA_API_KEY"))

    llm = NimLLMService(api_key=os.getenv("NVIDIA_API_KEY"), model="meta/llama-3.1-405b-instruct")

-    tts = FastPitchTTSService(api_key=os.getenv("NVIDIA_API_KEY"))
+    tts = RivaTTSService(api_key=os.getenv("NVIDIA_API_KEY"))

    messages = [
        {
--- a/examples/foundational/07y-interruptible-minimax.py
+++ b/examples/foundational/07y-interruptible-minimax.py
@@ -0,0 +1,111 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import os
+
+import aiohttp
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.minimax.tts import MiniMaxHttpTTSService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transcriptions.language import Language
+from pipecat.transports.base_transport import TransportParams
+from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
+from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
+
+load_dotenv(override=True)
+
+
+async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
+    logger.info(f"Starting bot")
+
+    # Create an HTTP session
+    async with aiohttp.ClientSession() as session:
+        transport = SmallWebRTCTransport(
+            webrtc_connection=webrtc_connection,
+            params=TransportParams(
+                audio_in_enabled=True,
+                audio_out_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(),
+            ),
+        )
+
+        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+        tts = MiniMaxHttpTTSService(
+            api_key=os.getenv("MINIMAX_API_KEY", ""),
+            group_id=os.getenv("MINIMAX_GROUP_ID", ""),
+            aiohttp_session=session,
+            params=MiniMaxHttpTTSService.InputParams(language=Language.EN),
+        )
+
+        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        context = OpenAILLMContext(messages)
+        context_aggregator = llm.create_context_aggregator(context)
+
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                stt,
+                context_aggregator.user(),  # User responses
+                llm,  # LLM
+                tts,  # TTS
+                transport.output(),  # Transport bot output
+                context_aggregator.assistant(),  # Assistant spoken responses
+            ]
+        )
+
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                allow_interruptions=True,
+                enable_metrics=True,
+                enable_usage_metrics=True,
+                report_only_initial_ttfb=True,
+            ),
+        )
+
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            logger.info(f"Client connected")
+            # Kick off the conversation.
+            messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+        @transport.event_handler("on_client_disconnected")
+        async def on_client_disconnected(transport, client):
+            logger.info(f"Client disconnected")
+
+        @transport.event_handler("on_client_closed")
+        async def on_client_closed(transport, client):
+            logger.info(f"Client closed connection")
+            await task.cancel()
+
+        runner = PipelineRunner(handle_sigint=False)
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    from run import main
+
+    main()
--- a/examples/foundational/07z-interruptible-sarvam.py
+++ b/examples/foundational/07z-interruptible-sarvam.py
@@ -0,0 +1,109 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import os
+
+import aiohttp
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.services.sarvam.tts import SarvamTTSService
+from pipecat.transcriptions.language import Language
+from pipecat.transports.base_transport import TransportParams
+from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
+from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
+
+load_dotenv(override=True)
+
+
+async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
+    logger.info(f"Starting bot")
+
+    transport = SmallWebRTCTransport(
+        webrtc_connection=webrtc_connection,
+        params=TransportParams(
+            audio_in_enabled=True,
+            audio_out_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(),
+        ),
+    )
+    # Create an HTTP session
+    async with aiohttp.ClientSession() as session:
+        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+        tts = SarvamTTSService(
+            api_key=os.getenv("SARVAM_API_KEY"),
+            aiohttp_session=session,
+            params=SarvamTTSService.InputParams(language=Language.EN),
+        )
+
+        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        context = OpenAILLMContext(messages)
+        context_aggregator = llm.create_context_aggregator(context)
+
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                stt,
+                context_aggregator.user(),  # User responses
+                llm,  # LLM
+                tts,  # TTS
+                transport.output(),  # Transport bot output
+                context_aggregator.assistant(),  # Assistant spoken responses
+            ]
+        )
+
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                allow_interruptions=True,
+                enable_metrics=True,
+                enable_usage_metrics=True,
+                report_only_initial_ttfb=True,
+            ),
+        )
+
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            logger.info(f"Client connected")
+            # Kick off the conversation.
+            messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+        @transport.event_handler("on_client_disconnected")
+        async def on_client_disconnected(transport, client):
+            logger.info(f"Client disconnected")
+
+        @transport.event_handler("on_client_closed")
+        async def on_client_closed(transport, client):
+            logger.info(f"Client closed connection")
+            await task.cancel()
+
+        runner = PipelineRunner(handle_sigint=False)
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    from run import main
+
+    main()
--- a/examples/foundational/13c-gladia-translation.py
+++ b/examples/foundational/13c-gladia-translation.py
@@ -4,6 +4,7 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+import argparse
 import os

 from dotenv import load_dotenv
@@ -39,7 +40,7 @@ class TranscriptionLogger(FrameProcessor):
            print(f"Translation ({frame.language}): {frame.text}")


-async def run_bot(webrtc_connection: SmallWebRTCConnection):
+async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
    logger.info(f"Starting bot")

    transport = SmallWebRTCTransport(
--- a/examples/foundational/14f-function-calling-groq.py
+++ b/examples/foundational/14f-function-calling-groq.py
@@ -17,6 +17,7 @@ from pipecat.frames.frames import TTSSpeakFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import LLMUserAggregatorParams
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.groq.llm import GroqLLMService
@@ -53,7 +54,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
    )

-    llm = GroqLLMService(api_key=os.getenv("GROQ_API_KEY"), model="llama-3.3-70b-versatile")
+    llm = GroqLLMService(
+        api_key=os.getenv("GROQ_API_KEY"), model="meta-llama/llama-4-maverick-17b-128e-instruct"
+    )
    # You can also register a function_name of None to get all functions
    # sent to the same callback with an additional function_name parameter.
    llm.register_function("get_current_weather", fetch_weather_from_api)
@@ -83,7 +86,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
    ]

    context = OpenAILLMContext(messages, tools)
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = llm.create_context_aggregator(
+        context, user_params=LLMUserAggregatorParams(aggregation_timeout=0.05)
+    )

    pipeline = Pipeline(
        [
--- a/examples/foundational/14r-function-calling-aws.py
+++ b/examples/foundational/14r-function-calling-aws.py
@@ -0,0 +1,139 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.aws.llm import AWSBedrockLLMService
+from pipecat.services.aws.stt import AWSTranscribeSTTService
+from pipecat.services.aws.tts import AWSPollyTTSService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.transports.base_transport import TransportParams
+from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
+from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
+
+load_dotenv(override=True)
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    await params.result_callback({"conditions": "nice", "temperature": "75"})
+
+
+async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
+    logger.info(f"Starting bot")
+
+    transport = SmallWebRTCTransport(
+        webrtc_connection=webrtc_connection,
+        params=TransportParams(
+            audio_in_enabled=True,
+            audio_out_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(),
+        ),
+    )
+
+    stt = AWSTranscribeSTTService()
+
+    tts = AWSPollyTTSService(
+        region="us-west-2",  # only specific regions support generative TTS
+        voice_id="Joanna",
+        params=AWSPollyTTSService.InputParams(engine="generative", rate="1.1"),
+    )
+
+    llm = AWSBedrockLLMService(
+        aws_region="us-west-2",
+        model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
+        params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
+    )
+
+    # You can also register a function_name of None to get all functions
+    # sent to the same callback with an additional function_name parameter.
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+
+    weather_function = FunctionSchema(
+        name="get_current_weather",
+        description="Get the current weather",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+            "format": {
+                "type": "string",
+                "enum": ["celsius", "fahrenheit"],
+                "description": "The temperature unit to use. Infer this from the user's location.",
+            },
+        },
+        required=["location", "format"],
+    )
+    tools = ToolsSchema(standard_tools=[weather_function])
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = OpenAILLMContext(messages, tools)
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            stt,
+            context_aggregator.user(),
+            llm,
+            tts,
+            transport.output(),
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            enable_usage_metrics=True,
+            report_only_initial_ttfb=True,
+        ),
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected")
+        # Kick off the conversation.
+        messages.append({"role": "user", "content": "Please introduce yourself to the user."})
+        await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+
+    @transport.event_handler("on_client_closed")
+    async def on_client_closed(transport, client):
+        logger.info(f"Client closed connection")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=False)
+
+    await runner.run(task)
+
+
+if __name__ == "__main__":
+    from run import main
+
+    main()
--- a/examples/foundational/18b-gstreamer.py
+++ b/examples/foundational/18b-gstreamer.py
@@ -0,0 +1,165 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+from typing import Optional
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.frames.frames import (
+    EndFrame,
+    Frame,
+    InputImageRawFrame,
+    OutputImageRawFrame,
+    TextFrame,
+    TTSTextFrame,
+    UserImageRequestFrame,
+    UserStartedSpeakingFrame,
+)
+from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineTask
+from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.frameworks.rtvi import (
+    RTVIConfig,
+    RTVIObserver,
+    RTVIProcessor,
+    RTVIServerMessageFrame,
+)
+from pipecat.processors.gstreamer.pipeline_source import GStreamerPipelineSource
+from pipecat.services.moondream.vision import MoondreamService
+from pipecat.transports.base_input import BaseInputTransport
+from pipecat.transports.base_output import BaseOutputTransport
+from pipecat.transports.base_transport import TransportParams
+from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
+from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
+
+load_dotenv(override=True)
+
+
+class AlertProcessor(FrameProcessor):
+    def __init__(self, connection: SmallWebRTCConnection):
+        super().__init__()
+        self._connection = connection
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, TextFrame):
+            text = frame.text.strip().upper()
+            message_frame = RTVIServerMessageFrame(data=text)
+            await self.push_frame(message_frame)
+
+        await self.push_frame(frame, direction)
+
+
+class UserImageRequester(FrameProcessor):
+    def __init__(self, participant_id: Optional[str] = None):
+        super().__init__()
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, OutputImageRawFrame):
+            await self.push_frame(frame)
+            # logger.info(f"UserImageRequester received image frame with size: {frame.size}")
+            text_frame = TextFrame(
+                "Are there people in the bottom right corner of the image? Only answer with YES or NO."
+            )
+            await self.push_frame(text_frame)
+            input_frame = InputImageRawFrame(
+                image=frame.image,
+                size=frame.size,
+                format=frame.format,
+            )
+            await self.push_frame(input_frame)
+        else:
+            await self.push_frame(frame, direction)
+
+
+async def run_bot(webrtc_connection: SmallWebRTCConnection, args: argparse.Namespace):
+    logger.info(f"Starting bot with video input: {args.input}")
+
+    transport = SmallWebRTCTransport(
+        webrtc_connection=webrtc_connection,
+        params=TransportParams(
+            audio_out_enabled=True,
+            video_out_enabled=True,
+            video_out_is_live=True,
+            video_out_width=1280,
+            video_out_height=720,
+        ),
+    )
+
+    gst = GStreamerPipelineSource(
+        pipeline=(f"rtspsrc location={args.input} ! decodebin ! autovideosink"),
+        out_params=GStreamerPipelineSource.OutputParams(
+            video_width=1280,
+            video_height=720,
+        ),
+    )
+
+    rtvi = RTVIProcessor(config=RTVIConfig(config=[]))
+
+    # If you run into weird descriptions, try with use_cpu=True
+    moondream = MoondreamService()
+
+    ir = UserImageRequester()
+    va = VisionImageFrameAggregator()
+    alert = AlertProcessor(connection=webrtc_connection)
+
+    pipeline = Pipeline(
+        [
+            gst,  # GStreamer RTSP source
+            rtvi,
+            ir,
+            # debug,
+            va,
+            moondream,
+            alert,  # Relay Moondream's YES/NO answer to the client (e.g. to trigger an alert)
+            transport.output(),  # Transport bot output
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        observers=[
+            RTVIObserver(rtvi),
+            DebugLogObserver(
+                frame_types={
+                    # TextFrame: None,
+                    TextFrame: (MoondreamService, FrameEndpoint.SOURCE),
+                    # InputImageRawFrame: None,
+                    EndFrame: None,
+                }
+            ),
+        ],
+    )
+
+    @rtvi.event_handler("on_client_ready")
+    async def on_client_ready(rtvi):
+        logger.info(f"Bot ready: {rtvi}")
+        await rtvi.set_bot_ready()
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected: {client}")
+
+    runner = PipelineRunner(handle_sigint=False)
+
+    await runner.run(task)
+
+
+if __name__ == "__main__":
+    from run import main
+
+    parser = argparse.ArgumentParser(description="Pipecat Bot Runner")
+    parser.add_argument("-i", "--input", type=str, required=True, help="Input RTSP stream URL")
+
+    main(parser)
--- a/examples/foundational/20e-persistent-context-aws-nova-sonic.py
+++ b/examples/foundational/20e-persistent-context-aws-nova-sonic.py
@@ -0,0 +1,267 @@
+#
+# Copyright (c) 2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import asyncio
+import glob
+import json
+import os
+from datetime import datetime
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.aws_nova_sonic.aws import AWSNovaSonicLLMService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.transports.base_transport import TransportParams
+from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
+from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
+
+load_dotenv(override=True)
+
+BASE_FILENAME = "/tmp/pipecat_conversation_"
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
+    await params.result_callback(
+        {
+            "conditions": "nice",
+            "temperature": temperature,
+            "format": params.arguments["format"],
+            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
+        }
+    )
+
+
+async def get_saved_conversation_filenames(params: FunctionCallParams):
+    # Construct the full pattern including the BASE_FILENAME
+    full_pattern = f"{BASE_FILENAME}*.json"
+
+    # Use glob to find all matching files
+    matching_files = glob.glob(full_pattern)
+    logger.debug(f"matching files: {matching_files}")
+
+    await params.result_callback({"filenames": matching_files})
+
+
+# async def get_saved_conversation_filenames(
+#     function_name, tool_call_id, args, llm, context, result_callback
+# ):
+#     pattern = re.compile(re.escape(BASE_FILENAME) + "\\d{8}_\\d{6}\\.json$")
+#     matching_files = []
+
+#     for filename in os.listdir("."):
+#         if pattern.match(filename):
+#             matching_files.append(filename)
+
+#     await result_callback({"filenames": matching_files})
+
+
+async def save_conversation(params: FunctionCallParams):
+    timestamp = datetime.now().strftime("%Y-%m-%d_%H:%M:%S")
+    filename = f"{BASE_FILENAME}{timestamp}.json"
+    try:
+        with open(filename, "w") as file:
+            messages = params.context.get_messages_for_persistent_storage()
+            # Remove the last few messages. In reverse order, they are:
+            # - the in-progress save tool call
+            # - the invocation of the save tool call
+            # - the user's request to save (which may span one or more messages)
+            # The simplest approach is to pop messages until the last one is an assistant
+            # response.
+            while messages and not (
+                messages[-1].get("role") == "assistant" and "content" in messages[-1]
+            ):
+                messages.pop()
+            if messages:  # we never expect this to be empty
+                logger.debug(
+                    f"writing conversation to {filename}\n{json.dumps(messages, indent=4)}"
+                )
+                json.dump(messages, file, indent=2)
+        await params.result_callback({"success": True})
+    except Exception as e:
+        await params.result_callback({"success": False, "error": str(e)})
+
+
+async def load_conversation(params: FunctionCallParams):
+    async def _reset():
+        filename = params.arguments["filename"]
+        logger.debug(f"loading conversation from {filename}")
+        try:
+            with open(filename, "r") as file:
+                messages = json.load(file)
+                messages.append(
+                    {
+                        "role": "user",
+                        "content": f"{AWSNovaSonicLLMService.AWAIT_TRIGGER_ASSISTANT_RESPONSE_INSTRUCTION}",
+                    }
+                )
+                params.context.set_messages(messages)
+                await params.llm.reset_conversation()
+                await params.llm.trigger_assistant_response()
+        except Exception as e:
+            await params.result_callback({"success": False, "error": str(e)})
+
+    asyncio.create_task(_reset())
+
+
+get_current_weather_tool = FunctionSchema(
+    name="get_current_weather",
+    description="Get the current weather",
+    properties={
+        "location": {
+            "type": "string",
+            "description": "The city and state, e.g. San Francisco, CA",
+        },
+        "format": {
+            "type": "string",
+            "enum": ["celsius", "fahrenheit"],
+            "description": "The temperature unit to use. Infer this from the user's location.",
+        },
+    },
+    required=["location", "format"],
+)
+
+save_conversation_tool = FunctionSchema(
+    name="save_conversation",
+    description="Save the current conversation. Use this function to persist the current conversation to external storage.",
+    properties={},
+    required=[],
+)
+
+get_saved_conversation_filenames_tool = FunctionSchema(
+    name="get_saved_conversation_filenames",
+    description="Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
+    properties={},
+    required=[],
+)
+
+load_conversation_tool = FunctionSchema(
+    name="load_conversation",
+    description="Load a conversation history. Use this function to load a conversation history into the current session.",
+    properties={
+        "filename": {
+            "type": "string",
+            "description": "The filename of the conversation history to load.",
+        }
+    },
+    required=["filename"],
+)
+
+tools = ToolsSchema(
+    standard_tools=[
+        get_current_weather_tool,
+        save_conversation_tool,
+        get_saved_conversation_filenames_tool,
+        load_conversation_tool,
+    ]
+)
+
+
+async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
+    logger.info("Starting bot")
+
+    transport = SmallWebRTCTransport(
+        webrtc_connection=webrtc_connection,
+        params=TransportParams(
+            audio_in_enabled=True,
+            audio_out_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.8)),
+        ),
+    )
+
+    # Specify initial system instruction.
+    # HACK: note that, for now, we need to inject a special bit of text into this instruction to
+    # allow the first assistant response to be programmatically triggered (which happens in the
+    # on_client_connected handler, below)
+    system_instruction = (
+        "You are a friendly assistant. The user and you will engage in a spoken dialog exchanging "
+        "the transcripts of a natural real-time conversation. Keep your responses short, generally "
+        "two or three sentences for chatty scenarios. "
+        f"{AWSNovaSonicLLMService.AWAIT_TRIGGER_ASSISTANT_RESPONSE_INSTRUCTION}"
+    )
+
+    llm = AWSNovaSonicLLMService(
+        secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
+        access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
+        region=os.getenv("AWS_REGION"),  # as of 2025-05-06, us-east-1 is the only supported region
+        voice_id="tiffany",  # matthew, tiffany, amy
+        # you could choose to pass instruction here rather than via context
+        # system_instruction=system_instruction,
+        # you could choose to pass tools here rather than via context
+        # tools=tools
+    )
+
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("save_conversation", save_conversation)
+    llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
+    llm.register_function("load_conversation", load_conversation)
+
+    context = OpenAILLMContext(
+        messages=[
+            {"role": "system", "content": f"{system_instruction}"},
+        ],
+        tools=tools,
+    )
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            context_aggregator.user(),
+            llm,  # LLM
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            enable_usage_metrics=True,
+            report_only_initial_ttfb=True,
+        ),
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        # Kick off the conversation.
+        await task.queue_frames([context_aggregator.user().get_context_frame()])
+        # HACK: for now, we need this special way of triggering the first assistant response in AWS
+        # Nova Sonic. Note that this trigger requires a special corresponding bit of text in the
+        # system instruction. In the future, simply queueing the context frame should be sufficient.
+        await llm.trigger_assistant_response()
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+
+    @transport.event_handler("on_client_closed")
+    async def on_client_closed(transport, client):
+        logger.info("Client closed connection")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=False)
+
+    await runner.run(task)
+
+
+if __name__ == "__main__":
+    from run import main
+
+    main()
--- a/examples/foundational/26-gemini-multimodal-live.py
+++ b/examples/foundational/26-gemini-multimodal-live.py
@@ -53,7 +53,6 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=system_instruction,
        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
-        transcribe_user_audio=True,
    )

    # Build the pipeline
--- a/examples/foundational/26a-gemini-multimodal-live-transcription.py
+++ b/examples/foundational/26a-gemini-multimodal-live-transcription.py
@@ -12,10 +12,12 @@ from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import TranscriptionMessage
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
 from pipecat.transports.base_transport import TransportParams
 from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
@@ -45,7 +47,6 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Aoede",  # Puck, Charon, Kore, Fenrir, Aoede
        # system_instruction="Talk like a pirate."
-        transcribe_user_audio=True,
        # inference_on_context_initialization=False,
    )

@@ -69,12 +70,16 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
    )
    context_aggregator = llm.create_context_aggregator(context)

+    transcript = TranscriptProcessor()
+
    pipeline = Pipeline(
        [
            transport.input(),
            context_aggregator.user(),
+            transcript.user(),
            llm,
            transport.output(),
+            transcript.assistant(),
            context_aggregator.assistant(),
        ]
    )
@@ -103,6 +108,15 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
        logger.info(f"Client closed connection")
        await task.cancel()

+    # Register event handler for transcript updates
+    @transcript.event_handler("on_transcript_update")
+    async def on_transcript_update(processor, frame):
+        for msg in frame.messages:
+            if isinstance(msg, TranscriptionMessage):
+                timestamp = f"[{msg.timestamp}] " if msg.timestamp else ""
+                line = f"{timestamp}{msg.role}: {msg.content}"
+                logger.info(f"Transcript: {line}")
+
    runner = PipelineRunner(handle_sigint=False)

    await runner.run(task)
--- a/examples/foundational/26b-gemini-multimodal-live-function-calling.py
+++ b/examples/foundational/26b-gemini-multimodal-live-function-calling.py
@@ -89,7 +89,6 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
    llm = GeminiMultimodalLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=system_instruction,
-        transcribe_user_audio=True,
        tools=tools,
    )
