Working on the 46 example

This is working
2025-09-17 11:59:16 +08:00 · 2025-09-17 11:59:10 +08:00 · 2025-09-17 11:53:07 +08:00 · 2025-09-17 11:39:21 +08:00 · 2025-09-17 11:29:11 +08:00 · 2025-09-17 11:09:03 +08:00
131 changed files with 12779 additions and 7770 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,285 @@
+# AGENTS.md
+
+## Project Overview
+
+Pipecat is an open-source Python framework for building real-time voice and multimodal conversational AI agents. The codebase is organized around a pipeline architecture where data flows through connected services (STT → LLM → TTS).
+
+## Development Environment Setup
+
+### Prerequisites
+- **Minimum Python Version:** 3.10
+- **Recommended Python Version:** 3.12
+- **Package Manager:** uv (recommended) or pip
+
+### Setup Commands
+
+```bash
+# Clone the repository
+git clone https://github.com/pipecat-ai/pipecat.git
+cd pipecat
+
+# Install dependencies with uv (recommended)
+uv sync --group dev --all-extras \
+  --no-extra gstreamer \
+  --no-extra krisp \
+  --no-extra local \
+  --no-extra ultravox
+
+# Or with pip
+pip install -e ".[dev]"
+
+# Install pre-commit hooks
+uv run pre-commit install
+
+# Set up environment variables
+cp env.example .env
+```
+
+## Build and Test Commands
+
+### Running Tests
+```bash
+# Run all tests
+uv run pytest
+
+# Run specific test file
+uv run pytest tests/test_name.py
+
+# Run tests with coverage
+uv run pytest --cov=pipecat --cov-report=html
+```
+
+### Code Quality
+```bash
+# Format code (required before commits)
+uv run ruff format
+
+# Lint code
+uv run ruff check
+
+# Type checking
+uv run mypy src/pipecat
+
+# Run pre-commit checks manually
+uv run pre-commit run --all-files
+```
+
+### Documentation
+```bash
+# Build API documentation
+cd docs/api
+./build-docs.sh
+
+# Build docs manually
+sphinx-build -b html . _build/html -W --keep-going
+```
+
+## Code Style Guidelines
+
+### Python Standards
+- **Formatting:** Strict PEP 8 via Ruff
+- **Docstrings:** Google-style format
+- **Type Hints:** Required for all public APIs
+- **Import Organization:** Automated via Ruff
+
+### Docstring Conventions
+- **Classes:** Describe purpose + `__init__` with complete `Args:` section
+- **Dataclasses:** Use `Parameters:` section, no `__init__` docstring
+- **Methods:** Include `Args:` and `Returns:` sections
+- **Properties:** Must have `Returns:` section
+- **Examples:** Use `Examples:` section with `::` syntax
+
+### File Organization
+```
+src/pipecat/           # Main package
+├── processors/        # Frame processors
+├── services/          # AI service integrations
+├── transports/        # Communication layers
+├── frames/            # Data frame definitions
+└── pipeline/          # Pipeline orchestration
+
+examples/foundational/ # Step-by-step tutorials
+tests/                 # Test suite
+```
+
+## Testing Instructions
+
+### Test Structure
+- **Unit Tests:** Test individual components in isolation
+- **Integration Tests:** Test service interactions
+- **Example Tests:** Validate foundational examples work
+
+### Adding Tests
+```bash
+# Test naming convention
+test_<component>_<functionality>.py
+
+# Run specific test pattern
+uv run pytest -k "test_pipeline"
+
+# Run with debugging
+uv run pytest -s -vv tests/test_name.py::test_function
+```
+
+### Pre-commit Requirements
+All commits must pass:
+- Ruff formatting
+- Ruff linting
+- Type checking
+- Basic test suite
+
+## Dependency Management
+
+### Using uv (Recommended)
+```bash
+# Add runtime dependency
+uv add package-name
+
+# Add optional dependency
+uv add --optional service package-name
+
+# Add development dependency
+uv add --group dev package-name
+
+# Update lockfile
+uv lock
+
+# Sync dependencies
+uv sync
+```
+
+### Important Notes
+- **Always commit both `pyproject.toml` and `uv.lock` together**
+- **Never manually edit `uv.lock`** - it's auto-generated
+- **Use extras for optional service dependencies** (e.g., `[openai]`, `[cartesia]`)
+
+## Project Structure Guidelines
+
+### Service Integration
+When adding new AI services:
+1. Create service class in `src/pipecat/services/<provider>/`
+2. Follow existing patterns (e.g., STTService, LLMService)
+3. Add to appropriate extras in `pyproject.toml`
+4. Include tests in `tests/`
+5. Add documentation examples
+
+### Frame Processing
+For custom processors:
+1. Inherit from `FrameProcessor`
+2. Implement `process_frame()` method. ALWAYS explicitly call `await super().process_frame(frame, direction)` at the top of this method.
+3. Handle frame direction (FrameDirection.UPSTREAM/DOWNSTREAM)
+4. Add proper type hints and docstrings
+
+### Transport Implementation
+For new transport layers:
+1. Inherit from `BaseTransport`
+2. Implement required abstract methods
+3. Handle connection lifecycle
+4. Support both input and output streams
+
+## Security Considerations
+
+### API Keys
+- **Never commit API keys** to the repository
+- **Use environment variables** for all secrets
+- **Reference `env.example`** for required variables
+- **Use `.env` files** for local development
+
+### Input Validation
+- **Validate all external inputs** (audio, text, API responses)
+- **Sanitize user data** before processing
+- **Handle rate limiting** for external services
+- **Implement proper timeout handling**
+
+## Performance Guidelines
+
+### Memory Management
+- **Clean up resources** in transport disconnection handlers
+- **Use async context managers** for service connections
+- **Implement proper frame lifecycle** management
+
+### Latency Optimization
+- **Choose appropriate STT services** for latency requirements
+- **Use streaming TTS** when possible
+- **Implement connection pooling** for HTTP services
+- **Consider WebRTC** for real-time applications
+
+## Common Patterns
+
+### Error Handling
+```python
+@transport.event_handler("on_error")
+async def on_error(transport, error):
+    logger.error(f"Transport error: {error}")
+
+    # Shutdown the pipeline
+    await task.queue_frame(EndFrame())
+ 
+```
+
+### Service Configuration
+```python
+# Use environment variables for configuration
+service = OpenAILLMService(
+    api_key=os.getenv("OPENAI_API_KEY", ""),
+    model="gpt-4o",
+    params=OpenAILLMService.InputParams(temperature=0.7),
+)
+```
+
+### Pipeline Assembly
+```python
+pipeline = Pipeline([
+    transport.input(),
+    stt_service,
+    context_aggregator.user(),
+    llm_service,
+    tts_service,
+    transport.output(),
+    context_aggregator.assistant(),
+])
+```
+
+## Commit and PR Guidelines
+
+### Commit Message Format
+```
+<type>(<scope>): <description>
+
+[optional body]
+
+[optional footer]
+```
+
+Types: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore`
+
+### PR Requirements
+- **All tests must pass**
+- **Code must be properly formatted** (Ruff)
+- **Include appropriate tests** for new functionality
+- **Update documentation** if needed
+- **Reference related issues** in description
+
+### Review Process
+1. Automated checks must pass
+2. Manual code review by maintainers
+3. Documentation review for user-facing changes
+4. Integration testing for service additions
+
+## Troubleshooting
+
+### Common Issues
+- **Import errors:** Run `uv sync` to ensure dependencies are installed
+- **Test failures:** Check environment variables in `.env`
+- **Format errors:** Run `uv run ruff format` before committing
+- **Type errors:** Ensure all public methods have type hints
+
+### Development Tips
+- **Use foundational examples** as starting points for testing
+- **Check existing services** for integration patterns
+- **Run tests frequently** during development
+- **Use IDE integration** for Ruff formatting
+
+### Getting Help
+- **Documentation:** [docs.pipecat.ai](https://docs.pipecat.ai)
+- **Issues:** [GitHub Issues](https://github.com/pipecat-ai/pipecat/issues)
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,136 @@ All notable changes to **Pipecat** will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [Unreleased]
+
+### Added
+
+- Added `on_pipeline_finished` event to `PipelineTask`. This event will get
+  fired when the pipeline is done running. This can be the result of a
+  `StopFrame`, `CancelFrame` or `EndFrame`.
+
+  ```python
+  @task.event_handler("on_pipeline_finished")
+  async def on_pipeline_finished(task: PipelineTask, frame: Frame):
+      ...
+  ```
+
+### Deprecated
+
+- `PipelineTask` events `on_pipeline_stopped`, `on_pipeline_ended` and
+  `on_pipeline_cancelled` are now deprecated. Use `on_pipeline_finished`
+  instead.
+
+### Fixed
+
+- Fixed an issue in `AudioBufferProcessor` where a recording is not created
+  when a bot speaks and user input is blocked.
+
+- Fixed a `FastAPIWebsocketTransport` and `SmallWebRTCTransport` issue where
+  `on_client_disconnected` would be triggered when the bot ends the
+  conversation. That is, `on_client_disconnected` should only be triggered when
+  the remote client actually disconnects.
+
+- Fixed an issue in `HeyGenVideoService` where the `BotStartedSpeakingFrame`
+  was blocked from moving through the Pipeline.
+
+## [0.0.85] - 2025-09-12
+
+### Added
+
+- `AzureSTTService` now pushes interim transcriptions.
+
+- Added `voice_cloning_key` to `GoogleTTSService` to support custom cloned
+  voices.
+
+- Added `speaking_rate` to `GoogleTTSService.InputParams` to control the
+  speaking rate.
+
+- Added a `speed` arg to `OpenAITTSService` to control the speed of the voice
+  response.
+
+- Added `FrameProcessor.push_interruption_task_frame_and_wait()`. Use this
+  method to programmatically interrupt the bot from any part of the
+  pipeline. This guarantees that all the processors in the pipeline are
+  interrupted in order (from upstream to downstream). Internally, this works by
+  first pushing an `InterruptionTaskFrame` upstream until it reaches the
+  pipeline task. The pipeline task then generates an `InterruptionFrame`, which
+  flows downstream through all processors. Once the `InterruptionFrame`
+  reaches the processor waiting for the interruption, the function returns and
+  execution continues after the call. Think of it as sending an upstream request
+  for interruption and waiting until the acknowledgment flows back downstream.
+
+- Added new base `TaskFrame` (which is a system frame). This is the base class
+  for all task frames (`EndTaskFrame`, `CancelTaskFrame`, etc.) that are meant
+  to be pushed upstream to reach the pipeline task.
+
+- Expanded support for universal `LLMContext` to the AWS Bedrock LLM service.
+  Using the universal `LLMContext` and associated `LLMContextAggregatorPair` is
+  a prerequisite for using `LLMSwitcher` to switch between LLMs at runtime.
+
+- Added new fields to the development runner's `parse_telephony_websocket`
+  method in support of providing dynamic data to a bot.
+
+  - Twilio: Added a new `body` parameter, which parses the websocket message
+    for `customParameters`. Provide data via the `Parameter` nouns in your
+    TwiML to use this feature.
+  - Telnyx & Exotel: Both providers make the `to` and `from` phone numbers
+    available in the websocket messages. You can now access these numbers as
+    `call_data["to"]` and `call_data["from"]`.
+
+  Note: Each telephony provider offers different features. Refer to the
+  corresponding example in `pipecat-examples` to see how to pass custom data
+  to your bot.
+
+- Added `body` to the `WebsocketRunnerArguments` as an optional parameter.
+  Custom `body` information can be passed from the server into the bot file via
+  the `bot()` method using this new parameter.
+
+- Added video streaming support to `LiveKitTransport`.
+
+- Added `OpenAIRealtimeLLMService` and `AzureRealtimeLLMService` which provide
+  access to OpenAI Realtime.
+
+### Changed
+
+- `pipeline.tests.utils.run_test()` now allows passing `PipelineParams` instead
+  of individual parameters.
+
+### Removed
+
+- Removed `VisionImageRawFrame` in favor of context frames (`LLMContextFrame` or
+  `OpenAILLMContextFrame`).
+
+### Deprecated
+
+- `BotInterruptionFrame` is now deprecated, use `InterruptionTaskFrame` instead.
+
+- `StartInterruptionFrame` is now deprecated, use `InterruptionFrame` instead.
+
+- Deprecated `VisionImageFrameAggregator` because `VisionImageRawFrame` has been
+  removed. See the `12*` examples for the new recommended replacement pattern.
+
+- `NoisereduceFilter` is now deprecated and will be removed in a future
+  version. Use other audio filters like `KrispFilter` or `AICFilter`.
+
+- Deprecated `OpenAIRealtimeBetaLLMService` and `AzureRealtimeBetaLLMService`.
+  Use `OpenAIRealtimeLLMService` and `AzureRealtimeLLMService`, respectively.
+  The Beta services will be removed in an upcoming version, 1.0.0.
+
+### Fixed
+
+- Fixed a `BaseOutputTransport` issue that caused incorrect detection of when
+  the bot stopped talking while using an audio mixer.
+
+- Fixed a `LiveKitTransport` issue where RTVI messages were not properly
+  encoded.
+
+- Added additional fixups to Mistral context messages to ensure they meet
+  Mistral-specific requirements, avoiding Mistral "invalid request" errors.
+
+- Fixed `DailyTransport` transcription handling to gracefully handle missing
+  `rawResponse` field in transcription messages, preventing KeyError crashes.
+
 ## [0.0.84] - 2025-09-05

 ### Added
--- a/README.md
+++ b/README.md
@@ -28,6 +28,41 @@
 - **Composable Pipelines**: Build complex behavior from modular components
 - **Real-Time**: Ultra-low latency interaction with different transports (e.g. WebSockets or WebRTC)

+## 📱 Client SDKs
+
+You can connect to Pipecat from any platform using our official SDKs:
+
+<table>
+  <tr>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/javascript/javascript-original.svg" width="40" height="40" alt="JavaScript"/>
+      <a href="https://docs.pipecat.ai/client/js/introduction">JavaScript</a>
+    </td>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React"/>
+      <a href="https://docs.pipecat.ai/client/react/introduction">React</a>
+    </td>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React Native"/>
+      <a href="https://docs.pipecat.ai/client/react-native/introduction">React Native</a>
+    </td>
+  </tr>
+  <tr>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/swift/swift-original.svg" width="40" height="40" alt="Swift"/>
+      <a href="https://docs.pipecat.ai/client/ios/introduction">Swift</a>
+    </td>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/kotlin/kotlin-original.svg" width="40" height="40" alt="Kotlin"/>
+      <a href="https://docs.pipecat.ai/client/android/introduction">Kotlin</a>
+    </td>
+    <td>
+      <img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/cplusplus/cplusplus-original.svg" width="40" height="40" alt="C++"/>
+      <a href="https://docs.pipecat.ai/client/c++/introduction">C++</a>
+    </td>
+  </tr>
+</table>
+
 ## 🎬 See it in action

 <p float="left">
@@ -38,17 +73,6 @@
    <a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/moondream-chatbot/image.png" width="400" /></a>
 </p>

-## 📱 Client SDKs
-
-You can connect to Pipecat from any platform using our official SDKs:
-
-| Platform | SDK Repo                                                                       | Description                      |
-| -------- | ------------------------------------------------------------------------------ | -------------------------------- |
-| Web      | [pipecat-client-web](https://github.com/pipecat-ai/pipecat-client-web)         | JavaScript and React client SDKs |
-| iOS      | [pipecat-client-ios](https://github.com/pipecat-ai/pipecat-client-ios)         | Swift SDK for iOS                |
-| Android  | [pipecat-client-android](https://github.com/pipecat-ai/pipecat-client-android) | Kotlin SDK for Android           |
-| C++      | [pipecat-client-cxx](https://github.com/pipecat-ai/pipecat-client-cxx)         | C++ client SDK                   |
-
 ## 🧩 Available services

 | Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
@@ -62,7 +86,7 @@ You can connect to Pipecat from any platform using our official SDKs:
 | Video               | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
 | Memory              | [mem0](https://docs.pipecat.ai/server/services/memory/mem0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
 | Vision & Image      | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
-| Audio Processing    | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
+| Audio Processing    | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
 | Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |

 📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
@@ -129,7 +153,11 @@ You can get started with Pipecat running on your local machine, then move your a
 2. Install development and testing dependencies:

   ```bash
-   uv sync --group dev --all-extras --no-extra gstreamer --no-extra krisp --no-extra local
+   uv sync --group dev --all-extras \
+     --no-extra gstreamer \
+     --no-extra krisp \
+     --no-extra local \
+     --no-extra ultravox # (ultravox not fully supported on macOS)
   ```

 3. Install the git pre-commit hooks:
@@ -138,23 +166,6 @@ You can get started with Pipecat running on your local machine, then move your a
   uv run pre-commit install
   ```

-### Python 3.13+ Compatibility
-
-Some features require PyTorch, which doesn't yet support Python 3.13+. Install using:
-
-```bash
-uv sync --group dev --all-extras \
-  --no-extra gstreamer \
-  --no-extra krisp \
-  --no-extra local \
-  --no-extra local-smart-turn \
-  --no-extra mlx-whisper \
-  --no-extra moondream \
-  --no-extra ultravox
-```
-
-> **Tip:** For full compatibility, use Python 3.12: `uv python pin 3.12`
-
 > **Note**: Some extras (local, gstreamer) require system dependencies. See documentation if you encounter build errors.

 ### Running tests
--- a/examples/foundational/04b-transports-livekit.py
+++ b/examples/foundational/04b-transports-livekit.py
@@ -14,7 +14,7 @@ from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import (
-    BotInterruptionFrame,
+    InterruptionFrame,
    TextFrame,
    TranscriptionFrame,
    UserStartedSpeakingFrame,
@@ -115,7 +115,7 @@ async def main():

        await task.queue_frames(
            [
-                BotInterruptionFrame(),
+                InterruptionFrame(),
                UserStartedSpeakingFrame(),
                TranscriptionFrame(
                    user_id=participant_id,
--- a/examples/foundational/07ad-interruptible-aicoustics.py
+++ b/examples/foundational/07ad-interruptible-aicoustics.py
@@ -36,7 +36,6 @@ load_dotenv(override=True)
 audiobuffer = AudioBufferProcessor(
    num_channels=2,  # 1 for mono, 2 for stereo (user left, bot right)
    enable_turn_audio=False,  # Enable per-turn audio recording
-    user_continuous_stream=True,  # User has continuous audio stream
 )


--- a/examples/foundational/07c-interruptible-deepgram-vad.py
+++ b/examples/foundational/07c-interruptible-deepgram-vad.py
@@ -12,8 +12,8 @@ from dotenv import load_dotenv
 from loguru import logger

 from pipecat.frames.frames import (
+    InterruptionFrame,
    LLMRunFrame,
-    StartInterruptionFrame,
    UserStartedSpeakingFrame,
    UserStoppedSpeakingFrame,
 )
@@ -97,7 +97,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    @stt.event_handler("on_speech_started")
    async def on_speech_started(stt, *args, **kwargs):
-        await task.queue_frames([StartInterruptionFrame(), UserStartedSpeakingFrame()])
+        await task.queue_frames([InterruptionFrame(), UserStartedSpeakingFrame()])

    @stt.event_handler("on_utterance_end")
    async def on_utterance_end(stt, *args, **kwargs):
--- a/examples/foundational/07s-interruptible-google-audio-in.py
+++ b/examples/foundational/07s-interruptible-google-audio-in.py
@@ -16,10 +16,10 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import (
    Frame,
    InputAudioRawFrame,
+    InterruptionFrame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
    LLMRunFrame,
-    StartInterruptionFrame,
    TextFrame,
    TranscriptionFrame,
    UserStartedSpeakingFrame,
@@ -181,9 +181,7 @@ class TranscriptionContextFixup(FrameProcessor):

        if isinstance(frame, MagicDemoTranscriptionFrame):
            self._transcript = frame.text
-        elif isinstance(frame, LLMFullResponseEndFrame) or isinstance(
-            frame, StartInterruptionFrame
-        ):
+        elif isinstance(frame, LLMFullResponseEndFrame) or isinstance(frame, InterruptionFrame):
            self.swap_user_audio()
            self.add_transcript_back_to_inference_output()
            self._transcript = ""
--- a/examples/foundational/12-describe-video.py
+++ b/examples/foundational/12-describe-video.py
@@ -11,12 +11,19 @@ from dotenv import load_dotenv
 from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.user_response import UserResponseAggregator
-from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
@@ -34,6 +41,8 @@ load_dotenv(override=True)


 class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""
+
    def __init__(self, participant_id: Optional[str] = None):
        super().__init__()
        self._participant_id = participant_id
@@ -46,9 +55,32 @@ class UserImageRequester(FrameProcessor):

        if self._participant_id and isinstance(frame, TextFrame):
            await self.push_frame(
-                UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
            )
-        await self.push_frame(frame, direction)
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -78,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    # Initialize the image requester without setting the participant ID yet
    image_requester = UserImageRequester()

-    vision_aggregator = VisionImageFrameAggregator()
+    image_processor = UserImageProcessor()

    # If you run into weird description, try with use_cpu=True
    moondream = MoondreamService()
@@ -96,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            stt,
            user_response,
            image_requester,
-            vision_aggregator,
+            image_processor,
            moondream,
            tts,
            transport.output(),
@@ -119,7 +151,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        image_requester.set_participant_id(client_id)

        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12a-describe-video-gemini-flash.py
+++ b/examples/foundational/12a-describe-video-gemini-flash.py
@@ -11,12 +11,19 @@ from dotenv import load_dotenv
 from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.user_response import UserResponseAggregator
-from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
@@ -34,6 +41,8 @@ load_dotenv(override=True)


 class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""
+
    def __init__(self, participant_id: Optional[str] = None):
        super().__init__()
        self._participant_id = participant_id
@@ -46,9 +55,32 @@ class UserImageRequester(FrameProcessor):

        if self._participant_id and isinstance(frame, TextFrame):
            await self.push_frame(
-                UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
            )
-        await self.push_frame(frame, direction)
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -78,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    # Initialize the image requester without setting the participant ID yet
    image_requester = UserImageRequester()

-    vision_aggregator = VisionImageFrameAggregator()
+    image_processor = UserImageProcessor()

    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

@@ -96,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            stt,
            user_response,
            image_requester,
-            vision_aggregator,
+            image_processor,
            google,
            tts,
            transport.output(),
@@ -123,7 +155,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        image_requester.set_participant_id(client_id)

        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12b-describe-video-gpt-4o.py
+++ b/examples/foundational/12b-describe-video-gpt-4o.py
@@ -11,12 +11,19 @@ from dotenv import load_dotenv
 from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.user_response import UserResponseAggregator
-from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
@@ -34,6 +41,8 @@ load_dotenv(override=True)


 class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""
+
    def __init__(self, participant_id: Optional[str] = None):
        super().__init__()
        self._participant_id = participant_id
@@ -46,9 +55,32 @@ class UserImageRequester(FrameProcessor):

        if self._participant_id and isinstance(frame, TextFrame):
            await self.push_frame(
-                UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
            )
-        await self.push_frame(frame, direction)
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -78,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    # Initialize the image requester without setting the participant ID yet
    image_requester = UserImageRequester()

-    vision_aggregator = VisionImageFrameAggregator()
+    image_processor = UserImageProcessor()

    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

@@ -96,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            stt,
            user_response,
            image_requester,
-            vision_aggregator,
+            image_processor,
            openai,
            tts,
            transport.output(),
@@ -123,7 +155,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        image_requester.set_participant_id(client_id)

        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12c-describe-video-anthropic.py
+++ b/examples/foundational/12c-describe-video-anthropic.py
@@ -11,12 +11,19 @@ from dotenv import load_dotenv
 from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.user_response import UserResponseAggregator
-from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import (
@@ -34,6 +41,8 @@ load_dotenv(override=True)


 class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""
+
    def __init__(self, participant_id: Optional[str] = None):
        super().__init__()
        self._participant_id = participant_id
@@ -46,9 +55,32 @@ class UserImageRequester(FrameProcessor):

        if self._participant_id and isinstance(frame, TextFrame):
            await self.push_frame(
-                UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
            )
-        await self.push_frame(frame, direction)
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)


 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -78,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    # Initialize the image requester without setting the participant ID yet
    image_requester = UserImageRequester()

-    vision_aggregator = VisionImageFrameAggregator()
+    image_processor = UserImageProcessor()

    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

@@ -96,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            stt,
            user_response,
            image_requester,
-            vision_aggregator,
+            image_processor,
            anthropic,
            tts,
            transport.output(),
@@ -123,7 +155,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        image_requester.set_participant_id(client_id)

        # Welcome message
-        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/12d-describe-video-aws.py
+++ b/examples/foundational/12d-describe-video-aws.py
@@ -0,0 +1,187 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import os
+from typing import Optional
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import (
+    Frame,
+    LLMContextFrame,
+    TextFrame,
+    TTSSpeakFrame,
+    UserImageRawFrame,
+    UserImageRequestFrame,
+)
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.user_response import UserResponseAggregator
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import (
+    create_transport,
+    get_transport_client_id,
+    maybe_capture_participant_camera,
+)
+from pipecat.services.aws.llm import AWSBedrockLLMService
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+class UserImageRequester(FrameProcessor):
+    """Converts incoming text into requests for user images."""
+
+    def __init__(self, participant_id: Optional[str] = None):
+        super().__init__()
+        self._participant_id = participant_id
+
+    def set_participant_id(self, participant_id: str):
+        self._participant_id = participant_id
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if self._participant_id and isinstance(frame, TextFrame):
+            await self.push_frame(
+                UserImageRequestFrame(self._participant_id, context=frame.text),
+                FrameDirection.UPSTREAM,
+            )
+        else:
+            await self.push_frame(frame, direction)
+
+
+class UserImageProcessor(FrameProcessor):
+    """Converts incoming user images into context frames."""
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, UserImageRawFrame):
+            if frame.request and frame.request.context:
+                # Build a one-off context that pairs the image with the user's question
+                context = LLMContext()
+                context.add_image_frame_message(
+                    image=frame.image,
+                    text=frame.request.context,
+                    size=frame.size,
+                    format=frame.format,
+                )
+                frame = LLMContextFrame(context)
+                await self.push_frame(frame)
+        else:
+            await self.push_frame(frame, direction)
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    user_response = UserResponseAggregator()
+
+    # Initialize the image requester without setting the participant ID yet
+    image_requester = UserImageRequester()
+
+    image_processor = UserImageProcessor()
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    # AWS for vision analysis
+    aws = AWSBedrockLLMService(
+        aws_region="us-west-2",
+        model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+        # Note: usually, prefer providing latency="optimized" param.
+        # Here we can't because AWS Bedrock doesn't support it for Claude 3.7,
+        # which we need for image input.
+        params=AWSBedrockLLMService.InputParams(temperature=0.8),
+    )
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            stt,
+            user_response,
+            image_requester,
+            image_processor,
+            aws,
+            tts,
+            transport.output(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected: {client}")
+
+        await maybe_capture_participant_camera(transport, client)
+
+        # Set the participant ID in the image requester
+        client_id = get_transport_client_id(transport, client)
+        image_requester.set_participant_id(client_id)
+
+        # Welcome message
+        await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
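Review note on the 12-series hunks above: the change threads the user's question through the image request (`context=frame.text`) and only turns an image into an LLM context when that request comes back attached to it. The sketch below mirrors that pairing in plain Python so it can be run without Pipecat. Every class and function name in it (`ImageRequest`, `RawImage`, `ContextMessage`, `make_request`, `fake_transport_capture`, `to_context`) is a hypothetical stand-in invented for illustration, not part of Pipecat's API, and the transport behavior is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


# Hypothetical stand-ins for illustration only; these are NOT Pipecat classes.
@dataclass
class ImageRequest:
    user_id: str
    context: Optional[str] = None  # the user's question travels with the request


@dataclass
class RawImage:
    image: bytes
    size: Tuple[int, int]
    format: str
    request: Optional[ImageRequest] = None  # assumed to be echoed back by the transport


@dataclass
class ContextMessage:
    text: str
    image: bytes


def make_request(user_id: Optional[str], text: str) -> Optional[ImageRequest]:
    """Requester step: text in, image request out (only once a user is known)."""
    if not user_id:
        return None
    return ImageRequest(user_id=user_id, context=text)


def fake_transport_capture(request: ImageRequest) -> RawImage:
    """Simulated transport: returns an image that carries the original request."""
    return RawImage(image=b"\x00" * 12, size=(2, 2), format="RGB", request=request)


def to_context(frame: RawImage) -> Optional[ContextMessage]:
    """Processor step: only images tied to a question become context."""
    if frame.request and frame.request.context:
        return ContextMessage(text=frame.request.context, image=frame.image)
    return None  # images without a question attached are dropped


if __name__ == "__main__":
    request = make_request("user-1", "What am I holding?")
    message = to_context(fake_transport_capture(request))
    print(message.text)
```

One consequence worth checking in the real examples: an image that arrives without a request context is neither converted nor forwarded, which matches the `if`/`else` structure in `UserImageProcessor` but may be surprising if other processors downstream expect raw images.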
--- a/examples/foundational/13-whisper-transcription.py
+++ b/examples/foundational/13-whisper-transcription.py
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
--- a/examples/foundational/13a-whisper-local.py
+++ b/examples/foundational/13a-whisper-local.py
@@ -32,6 +32,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 async def main():
    transport = LocalAudioTransport(
--- a/examples/foundational/13b-deepgram-transcription.py
+++ b/examples/foundational/13b-deepgram-transcription.py
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
--- a/examples/foundational/13c-gladia-transcription.py
+++ b/examples/foundational/13c-gladia-transcription.py
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
--- a/examples/foundational/13c-gladia-translation.py
+++ b/examples/foundational/13c-gladia-translation.py
@@ -40,6 +40,9 @@ class TranscriptionLogger(FrameProcessor):
        elif isinstance(frame, TranslationFrame):
            print(f"Translation ({frame.language}): {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
--- a/examples/foundational/13d-assemblyai-transcription.py
+++ b/examples/foundational/13d-assemblyai-transcription.py
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
--- a/examples/foundational/13e-whisper-mlx.py
+++ b/examples/foundational/13e-whisper-mlx.py
@@ -52,6 +52,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            self._last_transcription_time = time.time()

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
--- a/examples/foundational/13f-cartesia-transcription.py
+++ b/examples/foundational/13f-cartesia-transcription.py
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
--- a/examples/foundational/13g-sambanova-transcription.py
+++ b/examples/foundational/13g-sambanova-transcription.py
@@ -53,6 +53,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            self._last_transcription_time = time.time()

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
--- a/examples/foundational/13h-speechmatics-transcription.py
+++ b/examples/foundational/13h-speechmatics-transcription.py
@@ -32,6 +32,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
 # instantiated. The function will be called when the desired transport gets
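Review note on the `TranscriptionLogger` hunks in the 13-series examples: each one adds a trailing `push_frame` call, presumably because a processor that inspects a frame without forwarding it ends that frame's journey there. The sketch below shows the difference with a toy chain. `Stage`, `SwallowingLogger`, `ForwardingLogger`, `Sink`, and `run` are hypothetical names made up for this note and only approximate how a processor chain behaves; they are not Pipecat classes.

```python
from typing import List, Optional


class Stage:
    """Hypothetical minimal processor; not Pipecat's FrameProcessor."""

    def __init__(self) -> None:
        self.next: Optional["Stage"] = None

    def push(self, frame: str) -> None:
        # Hand the frame to the next stage, if there is one.
        if self.next is not None:
            self.next.process(frame)

    def process(self, frame: str) -> None:
        self.push(frame)


class SwallowingLogger(Stage):
    """Logs frames but forgets to forward them (the pre-fix behavior)."""

    def __init__(self) -> None:
        super().__init__()
        self.seen: List[str] = []

    def process(self, frame: str) -> None:
        self.seen.append(frame)


class ForwardingLogger(Stage):
    """Logs frames and then pushes every one of them onward (the fix)."""

    def __init__(self) -> None:
        super().__init__()
        self.seen: List[str] = []

    def process(self, frame: str) -> None:
        self.seen.append(frame)
        self.push(frame)


class Sink(Stage):
    """Collects whatever reaches the end of the chain."""

    def __init__(self) -> None:
        super().__init__()
        self.received: List[str] = []

    def process(self, frame: str) -> None:
        self.received.append(frame)


def run(logger: Stage, frames: List[str]) -> List[str]:
    """Wire logger -> sink, feed the frames, and report what the sink received."""
    sink = Sink()
    logger.next = sink
    for frame in frames:
        logger.process(frame)
    return sink.received


if __name__ == "__main__":
    frames = ["start", "transcription", "end"]
    print(run(SwallowingLogger(), frames))
    print(run(ForwardingLogger(), frames))
```

With the swallowing variant nothing reaches the sink, including control frames such as the hypothetical "end" marker; the forwarding variant delivers all three in order.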
--- a/examples/foundational/13i-soniox-transcription.py
+++ b/examples/foundational/13i-soniox-transcription.py
@@ -32,6 +32,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 transport_params = {
    "daily": lambda: DailyParams(
--- a/examples/foundational/13j-azure-transcription.py
+++ b/examples/foundational/13j-azure-transcription.py
@@ -32,6 +32,9 @@ class TranscriptionLogger(FrameProcessor):
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

+        # Push all frames through
+        await self.push_frame(frame, direction)
+

 transport_params = {
    "daily": lambda: DailyParams(
--- a/examples/foundational/14aa-function-calling-aws-universal-context.py
+++ b/examples/foundational/14aa-function-calling-aws-universal-context.py
@@ -0,0 +1,214 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import asyncio
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import (
+    create_transport,
+    get_transport_client_id,
+    maybe_capture_participant_camera,
+)
+from pipecat.services.aws.llm import AWSBedrockLLMService
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+
+load_dotenv(override=True)
+
+
+# Global variable to store the client ID
+client_id = ""
+
+
+async def get_weather(params: FunctionCallParams):
+    location = params.arguments["location"]
+    await params.result_callback(f"The weather in {location} is currently 72 degrees and sunny.")
+
+
+async def get_image(params: FunctionCallParams):
+    question = params.arguments["question"]
+    logger.debug(f"Requesting image with user_id={client_id}, question={question}")
+
+    # Request the image frame
+    await params.llm.request_image_frame(
+        user_id=client_id,
+        function_name=params.function_name,
+        tool_call_id=params.tool_call_id,
+        text_content=question,
+    )
+
+    # Wait a short time for the frame to be processed
+    await asyncio.sleep(0.5)
+
+    # Return a result to complete the function call
+    await params.result_callback(
+        f"I've captured an image from your camera and I'm analyzing what you asked about: {question}"
+    )
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        video_in_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    llm = AWSBedrockLLMService(
+        aws_region="us-west-2",
+        model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+        # Note: usually, prefer providing latency="optimized" param.
+        # Here we can't because AWS Bedrock doesn't support it for Claude 3.7,
+        # which we need for image input.
+        params=AWSBedrockLLMService.InputParams(temperature=0.8),
+    )
+    llm.register_function("get_weather", get_weather)
+    llm.register_function("get_image", get_image)
+
+    weather_function = FunctionSchema(
+        name="get_weather",
+        description="Get the current weather",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+        },
+        required=["location"],
+    )
+    get_image_function = FunctionSchema(
+        name="get_image",
+        description="Get an image from the video stream.",
+        properties={
+            "question": {
+                "type": "string",
+                "description": "The question that the user is asking about the image.",
+            }
+        },
+        required=["question"],
+    )
+    tools = ToolsSchema(standard_tools=[weather_function, get_image_function])
+
+    system_prompt = """\
+You are a helpful assistant who converses with a user and answers questions. Respond concisely to general questions.
+
+Your response will be turned into speech so use only simple words and punctuation.
+
+You have access to two tools: get_weather and get_image.
+
+You can respond to questions about the weather using the get_weather tool.
+
+You can answer questions about the user's video stream using the get_image tool. Some examples of phrases that \
+indicate you should use the get_image tool are:
+- What do you see?
+- What's in the video?
+- Can you describe the video?
+- Tell me about what you see.
+- Tell me something interesting about what you see.
+- What's happening in the video?
+
+If you need to use a tool, simply use the tool. Do not tell the user the tool you are using. Be brief and concise.
+    """
+
+    messages = [
+        {"role": "system", "content": system_prompt},
+        {"role": "user", "content": "Start the conversation by introducing yourself."},
+    ]
+
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User speech to text
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses and tool context
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected: {client}")
+
+        await maybe_capture_participant_camera(transport, client)
+
+        global client_id
+        client_id = get_transport_client_id(transport, client)
+
+        # Kick off the conversation.
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/19-openai-realtime.py
+++ b/examples/foundational/19-openai-realtime.py
@@ -0,0 +1,228 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+from datetime import datetime
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
+from pipecat.observers.loggers.transcription_log_observer import TranscriptionLogObserver
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.transcript_processor import TranscriptProcessor
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.services.openai_realtime import (
+    InputAudioNoiseReduction,
+    InputAudioTranscription,
+    OpenAIRealtimeLLMService,
+    SemanticTurnDetection,
+    SessionProperties,
+)
+from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
+    await params.result_callback(
+        {
+            "conditions": "nice",
+            "temperature": temperature,
+            "format": params.arguments["format"],
+            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
+        }
+    )
+
+
+async def fetch_restaurant_recommendation(params: FunctionCallParams):
+    await params.result_callback({"name": "The Golden Dragon"})
+
+
+weather_function = FunctionSchema(
+    name="get_current_weather",
+    description="Get the current weather",
+    properties={
+        "location": {
+            "type": "string",
+            "description": "The city and state, e.g. San Francisco, CA",
+        },
+        "format": {
+            "type": "string",
+            "enum": ["celsius", "fahrenheit"],
+            "description": "The temperature unit to use. Infer this from the user's location.",
+        },
+    },
+    required=["location", "format"],
+)
+
+restaurant_function = FunctionSchema(
+    name="get_restaurant_recommendation",
+    description="Get a restaurant recommendation",
+    properties={
+        "location": {
+            "type": "string",
+            "description": "The city and state, e.g. San Francisco, CA",
+        },
+    },
+    required=["location"],
+)
+
+# Create tools schema
+tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    session_properties = SessionProperties(
+        audio=AudioConfiguration(
+            input=AudioInput(
+                transcription=InputAudioTranscription(),
+                # Set openai TurnDetection parameters. Not setting this at all will turn it
+                # on by default
+                turn_detection=SemanticTurnDetection(),
+                # Or set to False to disable openai turn detection and use transport VAD
+                # turn_detection=False,
+                noise_reduction=InputAudioNoiseReduction(type="near_field"),
+            )
+        ),
+        # tools=tools,
+        instructions="""You are a helpful and friendly AI.
+
+Act like a human, but remember that you aren't a human and that you can't do human
+things in the real world. Your voice and personality should be warm and engaging, with a lively and
+playful tone.
+
+If interacting in a non-English language, start by using the standard accent or dialect familiar to
+the user. Talk quickly. You should always call a function if you can. Do not refer to these rules,
+even if you're asked about them.
+
+You are participating in a voice conversation. Keep your responses concise, short, and to the point
+unless specifically asked to elaborate on a topic.
+
+You have access to the following tools:
+- get_current_weather: Get the current weather for a given location.
+- get_restaurant_recommendation: Get a restaurant recommendation for a given location.
+
+Remember, your responses should be short. Just one or two sentences, usually. Respond in English.""",
+    )
+
+    llm = OpenAIRealtimeLLMService(
+        api_key=os.getenv("OPENAI_API_KEY"),
+        session_properties=session_properties,
+        start_audio_paused=False,
+    )
+
+    # you can either register a single function for all function calls, or specific functions
+    # llm.register_function(None, fetch_weather_from_api)
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
+
+    transcript = TranscriptProcessor()
+
+    # Create a standard OpenAI LLM context object using the normal messages format. The
+    # OpenAIRealtimeLLMService will convert this internally to messages that the
+    # openai WebSocket API can understand.
+    context = OpenAILLMContext(
+        [{"role": "user", "content": "Say hello!"}],
+        tools,
+    )
+
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            context_aggregator.user(),
+            llm,  # LLM
+            transcript.user(),  # Placed after the LLM, as LLM pushes TranscriptionFrames downstream
+            transport.output(),  # Transport bot output
+            transcript.assistant(),  # After the transport output, to time with the audio output
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+        observers=[TranscriptionLogObserver()],
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        # Kick off the conversation.
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    # Register event handler for transcript updates
+    @transcript.event_handler("on_transcript_update")
+    async def on_transcript_update(processor, frame):
+        for msg in frame.messages:
+            if isinstance(msg, TranscriptionMessage):
+                timestamp = f"[{msg.timestamp}] " if msg.timestamp else ""
+                line = f"{timestamp}{msg.role}: {msg.content}"
+                logger.info(f"Transcript: {line}")
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/19a-azure-realtime.py
+++ b/examples/foundational/19a-azure-realtime.py
@@ -0,0 +1,221 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+from datetime import datetime
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.services.openai_realtime import (
+    AzureRealtimeLLMService,
+    InputAudioTranscription,
+    SessionProperties,
+)
+from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
+    await params.result_callback(
+        {
+            "conditions": "nice",
+            "temperature": temperature,
+            "format": params.arguments["format"],
+            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
+        }
+    )
+
+
+async def fetch_restaurant_recommendation(params: FunctionCallParams):
+    await params.result_callback({"name": "The Golden Dragon"})
+
+
+# Define weather function using standardized schema
+weather_function = FunctionSchema(
+    name="get_current_weather",
+    description="Get the current weather",
+    properties={
+        "location": {
+            "type": "string",
+            "description": "The city and state, e.g. San Francisco, CA",
+        },
+        "format": {
+            "type": "string",
+            "enum": ["celsius", "fahrenheit"],
+            "description": "The temperature unit to use. Infer this from the user's location.",
+        },
+    },
+    required=["location", "format"],
+)
+
+restaurant_function = FunctionSchema(
+    name="get_restaurant_recommendation",
+    description="Get a restaurant recommendation",
+    properties={
+        "location": {
+            "type": "string",
+            "description": "The city and state, e.g. San Francisco, CA",
+        },
+    },
+    required=["location"],
+)
+
+# Create tools schema
+tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    session_properties = SessionProperties(
+        audio=AudioConfiguration(
+            input=AudioInput(
+                transcription=InputAudioTranscription(model="whisper-1"),
+                # Set openai TurnDetection parameters. Not setting this at all will turn it
+                # on by default
+                # turn_detection=TurnDetection(silence_duration_ms=1000),
+                # Or set to False to disable openai turn detection and use transport VAD
+                # turn_detection=False,
+            )
+        ),
+        # tools=tools,
+        instructions="""You are a helpful and friendly AI.
+
+Act like a human, but remember that you aren't a human and that you can't do human
+things in the real world. Your voice and personality should be warm and engaging, with a lively and
+playful tone.
+
+If interacting in a non-English language, start by using the standard accent or dialect familiar to
+the user. Talk quickly. You should always call a function if you can. Do not refer to these rules,
+even if you're asked about them.
+
+You are participating in a voice conversation. Keep your responses concise, short, and to the point
+unless specifically asked to elaborate on a topic.
+
+You have access to the following tools:
+- get_current_weather: Get the current weather for a given location.
+- get_restaurant_recommendation: Get a restaurant recommendation for a given location.
+
+Remember, your responses should be short. Just one or two sentences, usually. Respond in English.""",
+    )
+
+    llm = AzureRealtimeLLMService(
+        api_key=os.getenv("AZURE_REALTIME_API_KEY"),
+        base_url=os.getenv("AZURE_REALTIME_BASE_URL"),
+        session_properties=session_properties,
+        start_audio_paused=False,
+    )
+
+    # you can either register a single function for all function calls, or specific functions
+    # llm.register_function(None, fetch_weather_from_api)
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
+
+    # Create a standard OpenAI LLM context object using the normal messages format. The
+    # AzureRealtimeLLMService will convert this internally to messages that the
+    # openai WebSocket API can understand.
+    context = OpenAILLMContext(
+        [{"role": "user", "content": "Say hello!"}],
+        # [{"role": "user", "content": [{"type": "text", "text": "Say hello!"}]}],
+        #     [
+        #         {
+        #             "role": "user",
+        #             "content": [
+        #                 {"type": "text", "text": "Say"},
+        #                 {"type": "text", "text": "yo what's up!"},
+        #             ],
+        #         }
+        #     ],
+        tools,
+    )
+
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            context_aggregator.user(),
+            llm,  # LLM
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        # Kick off the conversation.
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/19b-openai-realtime-beta-text.py
+++ b/examples/foundational/19b-openai-realtime-beta-text.py
@@ -22,7 +22,7 @@ from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia import CartesiaTTSService
+from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.llm_service import FunctionCallParams
 from pipecat.services.openai_realtime_beta import (
    InputAudioNoiseReduction,
--- a/examples/foundational/19b-openai-realtime-text.py
+++ b/examples/foundational/19b-openai-realtime-text.py
@@ -0,0 +1,234 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+from datetime import datetime
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.transcript_processor import TranscriptProcessor
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.services.openai_realtime import (
+    InputAudioNoiseReduction,
+    InputAudioTranscription,
+    OpenAIRealtimeLLMService,
+    SemanticTurnDetection,
+    SessionProperties,
+)
+from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
+    await params.result_callback(
+        {
+            "conditions": "nice",
+            "temperature": temperature,
+            "format": params.arguments["format"],
+            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
+        }
+    )
+
+
+async def fetch_restaurant_recommendation(params: FunctionCallParams):
+    await params.result_callback({"name": "The Golden Dragon"})
+
+
+weather_function = FunctionSchema(
+    name="get_current_weather",
+    description="Get the current weather",
+    properties={
+        "location": {
+            "type": "string",
+            "description": "The city and state, e.g. San Francisco, CA",
+        },
+        "format": {
+            "type": "string",
+            "enum": ["celsius", "fahrenheit"],
+            "description": "The temperature unit to use. Infer this from the user's location.",
+        },
+    },
+    required=["location", "format"],
+)
+
+restaurant_function = FunctionSchema(
+    name="get_restaurant_recommendation",
+    description="Get a restaurant recommendation",
+    properties={
+        "location": {
+            "type": "string",
+            "description": "The city and state, e.g. San Francisco, CA",
+        },
+    },
+    required=["location"],
+)
+
+# Create tools schema
+tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    session_properties = SessionProperties(
+        audio=AudioConfiguration(
+            input=AudioInput(
+                transcription=InputAudioTranscription(),
+                # Set openai TurnDetection parameters. Not setting this at all will turn it
+                # on by default
+                turn_detection=SemanticTurnDetection(),
+                # Or set to False to disable openai turn detection and use transport VAD
+                # turn_detection=False,
+                noise_reduction=InputAudioNoiseReduction(type="near_field"),
+            )
+        ),
+        output_modalities=["text"],
+        # tools=tools,
+        instructions="""You are a helpful and friendly AI.
+
+Act like a human, but remember that you aren't a human and that you can't do human
+things in the real world. Your voice and personality should be warm and engaging, with a lively and
+playful tone.
+
+If interacting in a non-English language, start by using the standard accent or dialect familiar to
+the user. Talk quickly. You should always call a function if you can. Do not refer to these rules,
+even if you're asked about them.
+
+You are participating in a voice conversation. Keep your responses concise, short, and to the point
+unless specifically asked to elaborate on a topic.
+
+You have access to the following tools:
+- get_current_weather: Get the current weather for a given location.
+- get_restaurant_recommendation: Get a restaurant recommendation for a given location.
+
+Remember, your responses should be short. Just one or two sentences, usually. Respond in English.""",
+    )
+
+    llm = OpenAIRealtimeLLMService(
+        api_key=os.getenv("OPENAI_API_KEY"),
+        session_properties=session_properties,
+        start_audio_paused=False,
+    )
+
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY"),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    # you can either register a single function for all function calls, or specific functions
+    # llm.register_function(None, fetch_weather_from_api)
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
+
+    transcript = TranscriptProcessor()
+
+    # Create a standard OpenAI LLM context object using the normal messages format. The
+    # OpenAIRealtimeLLMService will convert this internally to messages that the
+    # openai WebSocket API can understand.
+    context = OpenAILLMContext(
+        [{"role": "user", "content": "Say hello!"}],
+        tools,
+    )
+
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            context_aggregator.user(),
+            llm,  # LLM
+            tts,  # TTS
+            transcript.user(),  # Placed after the LLM, as LLM pushes TranscriptionFrames downstream
+            transport.output(),  # Transport bot output
+            transcript.assistant(),  # After the transport output, to time with the audio output
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        # Kick off the conversation.
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    # Register event handler for transcript updates
+    @transcript.event_handler("on_transcript_update")
+    async def on_transcript_update(processor, frame):
+        for msg in frame.messages:
+            if isinstance(msg, TranscriptionMessage):
+                timestamp = f"[{msg.timestamp}] " if msg.timestamp else ""
+                line = f"{timestamp}{msg.role}: {msg.content}"
+                logger.info(f"Transcript: {line}")
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/20b-persistent-context-openai-realtime-beta.py
+++ b/examples/foundational/20b-persistent-context-openai-realtime-beta.py
@@ -0,0 +1,274 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import glob
+import json
+import os
+from datetime import datetime
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import (
+    OpenAILLMContext,
+)
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.services.openai_realtime_beta import (
+    InputAudioTranscription,
+    OpenAIRealtimeBetaLLMService,
+    SessionProperties,
+    TurnDetection,
+)
+from pipecat.services.openai_realtime_beta.events import AudioConfiguration, AudioInput
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+BASE_FILENAME = "/tmp/pipecat_conversation_"
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
+    await params.result_callback(
+        {
+            "conditions": "nice",
+            "temperature": temperature,
+            "format": params.arguments["format"],
+            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
+        }
+    )
+
+
+async def get_saved_conversation_filenames(params: FunctionCallParams):
+    # Construct the full pattern including the BASE_FILENAME
+    full_pattern = f"{BASE_FILENAME}*.json"
+
+    # Use glob to find all matching files
+    matching_files = glob.glob(full_pattern)
+    logger.debug(f"matching files: {matching_files}")
+
+    await params.result_callback({"filenames": matching_files})
+
+
+async def save_conversation(params: FunctionCallParams):
+    timestamp = datetime.now().strftime("%Y-%m-%d_%H:%M:%S")
+    filename = f"{BASE_FILENAME}{timestamp}.json"
+    logger.debug(
+        f"writing conversation to {filename}\n{json.dumps(params.context.messages, indent=4)}"
+    )
+    try:
+        with open(filename, "w") as file:
+            messages = params.context.get_messages_for_persistent_storage()
+            # remove the last message, which is the instruction we just gave to save the conversation
+            messages.pop()
+            json.dump(messages, file, indent=2)
+        await params.result_callback({"success": True})
+    except Exception as e:
+        await params.result_callback({"success": False, "error": str(e)})
+
+
+async def load_conversation(params: FunctionCallParams):
+    async def _reset():
+        filename = params.arguments["filename"]
+        logger.debug(f"loading conversation from {filename}")
+        try:
+            with open(filename, "r") as file:
+                params.context.set_messages(json.load(file))
+                await params.llm.reset_conversation()
+                await params.llm._create_response()
+        except Exception as e:
+            await params.result_callback({"success": False, "error": str(e)})
+
+    asyncio.create_task(_reset())
+
+
+tools = [
+    {
+        "type": "function",
+        "name": "get_current_weather",
+        "description": "Get the current weather",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "location": {
+                    "type": "string",
+                    "description": "The city and state, e.g. San Francisco, CA",
+                },
+                "format": {
+                    "type": "string",
+                    "enum": ["celsius", "fahrenheit"],
+                    "description": "The temperature unit to use. Infer this from the user's location.",
+                },
+            },
+            "required": ["location", "format"],
+        },
+    },
+    {
+        "type": "function",
+        "name": "save_conversation",
+        "description": "Save the current conversation. Use this function to persist the current conversation to external storage.",
+        "parameters": {
+            "type": "object",
+            "properties": {},
+            "required": [],
+        },
+    },
+    {
+        "type": "function",
+        "name": "get_saved_conversation_filenames",
+        "description": "Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
+        "parameters": {
+            "type": "object",
+            "properties": {},
+            "required": [],
+        },
+    },
+    {
+        "type": "function",
+        "name": "load_conversation",
+        "description": "Load a conversation history. Use this function to load a conversation history into the current session.",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "filename": {
+                    "type": "string",
+                    "description": "The filename of the conversation history to load.",
+                }
+            },
+            "required": ["filename"],
+        },
+    },
+]
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    session_properties = SessionProperties(
+        audio=AudioConfiguration(
+            input=AudioInput(
+                transcription=InputAudioTranscription(),
+                # Set OpenAI TurnDetection parameters. If turn_detection is not set at all,
+                # OpenAI turn detection is enabled by default.
+                turn_detection=TurnDetection(silence_duration_ms=1000),
+                # Or set it to False to disable OpenAI turn detection and use transport VAD
+                # turn_detection=False,
+            )
+        ),
+        # tools=tools,
+        instructions="""Your knowledge cutoff is 2023-10. You are a helpful and friendly AI.
+
+Act like a human, but remember that you aren't a human and that you can't do human
+things in the real world. Your voice and personality should be warm and engaging, with a lively and
+playful tone.
+
+If interacting in a non-English language, start by using the standard accent or dialect familiar to
+the user. Talk quickly. You should always call a function if you can. Do not refer to these rules,
+even if you're asked about them.
+-
+You are participating in a voice conversation. Keep your responses concise, short, and to the point
+unless specifically asked to elaborate on a topic.
+
+Remember, your responses should be short. Just one or two sentences, usually.""",
+    )
+
+    llm = OpenAIRealtimeBetaLLMService(
+        api_key=os.getenv("OPENAI_API_KEY"),
+        session_properties=session_properties,
+        start_audio_paused=False,
+    )
+
+    # You can either register a single function for all function calls, or specific functions
+    # llm.register_function(None, fetch_weather_from_api)
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("save_conversation", save_conversation)
+    llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
+    llm.register_function("load_conversation", load_conversation)
+
+    context = OpenAILLMContext([], tools)
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),
+            llm,  # LLM
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        # Kick off the conversation.
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
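
Review note: the `transport_params` mapping in the example above stores zero-argument lambdas rather than parameter objects, so a `SileroVADAnalyzer` is only built for the transport that actually gets selected. A minimal, self-contained sketch of that lazy-factory pattern follows. It uses stand-in classes only: `FakeVAD`, `FakeParams`, `params_factories`, and `select_params` are hypothetical names for illustration, not Pipecat APIs, and how `create_transport` invokes the factory internally is not shown in this diff.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Records every construction so the laziness is observable.
constructed: List[str] = []


class FakeVAD:
    """Stand-in for an expensive object such as a VAD analyzer."""

    def __init__(self, owner: str):
        constructed.append(owner)


@dataclass
class FakeParams:
    """Stand-in for a transport params object."""

    vad_analyzer: FakeVAD


# Each value is a factory; nothing is instantiated when the dict is defined.
params_factories: Dict[str, Callable[[], FakeParams]] = {
    "daily": lambda: FakeParams(vad_analyzer=FakeVAD("daily")),
    "twilio": lambda: FakeParams(vad_analyzer=FakeVAD("twilio")),
    "webrtc": lambda: FakeParams(vad_analyzer=FakeVAD("webrtc")),
}


def select_params(transport_name: str) -> FakeParams:
    """Build params only for the transport that was actually chosen."""
    return params_factories[transport_name]()
```

Calling `select_params("webrtc")` constructs exactly one `FakeVAD`; the factories for the other transports are never called.
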
--- a/examples/foundational/20b-persistent-context-openai-realtime.py
+++ b/examples/foundational/20b-persistent-context-openai-realtime.py
@@ -25,12 +25,13 @@ from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai_realtime_beta import (
+from pipecat.services.openai_realtime import (
    InputAudioTranscription,
-    OpenAIRealtimeBetaLLMService,
+    OpenAIRealtimeLLMService,
    SessionProperties,
    TurnDetection,
 )
+from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -182,12 +183,16 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

    session_properties = SessionProperties(
-        input_audio_transcription=InputAudioTranscription(),
-        # Set openai TurnDetection parameters. Not setting this at all will turn it
-        # on by default
-        turn_detection=TurnDetection(silence_duration_ms=1000),
-        # Or set to False to disable openai turn detection and use transport VAD
-        # turn_detection=False,
+        audio=AudioConfiguration(
+            input=AudioInput(
+                transcription=InputAudioTranscription(),
+                # Set OpenAI TurnDetection parameters. If turn_detection is not set at all,
+                # OpenAI turn detection is enabled by default.
+                turn_detection=TurnDetection(silence_duration_ms=1000),
+                # Or set it to False to disable OpenAI turn detection and use transport VAD
+                # turn_detection=False,
+            )
+        ),
        # tools=tools,
        instructions="""Your knowledge cutoff is 2023-10. You are a helpful and friendly AI.

@@ -205,7 +210,7 @@ unless specifically asked to elaborate on a topic.
 Remember, your responses should be short. Just one or two sentences, usually.""",
    )

-    llm = OpenAIRealtimeBetaLLMService(
+    llm = OpenAIRealtimeLLMService(
        api_key=os.getenv("OPENAI_API_KEY"),
        session_properties=session_properties,
        start_audio_paused=False,
--- a/examples/foundational/22b-natural-conversation-proposal.py
+++ b/examples/foundational/22b-natural-conversation-proposal.py
@@ -18,9 +18,9 @@ from pipecat.frames.frames import (
    Frame,
    FunctionCallInProgressFrame,
    FunctionCallResultFrame,
+    InterruptionFrame,
    LLMRunFrame,
    StartFrame,
-    StartInterruptionFrame,
    SystemFrame,
    TextFrame,
    TranscriptionFrame,
@@ -144,7 +144,7 @@ class OutputGate(FrameProcessor):
                await self._start()
            if isinstance(frame, (EndFrame, CancelFrame)):
                await self._stop()
-            if isinstance(frame, StartInterruptionFrame):
+            if isinstance(frame, InterruptionFrame):
                self._frames_buffer = []
                self.close_gate()
            await self.push_frame(frame, direction)
@@ -232,7 +232,7 @@ class TurnDetectionLLM(Pipeline):
        async def pass_only_llm_trigger_frames(frame):
            return (
                isinstance(frame, OpenAILLMContextFrame)
-                or isinstance(frame, StartInterruptionFrame)
+                or isinstance(frame, InterruptionFrame)
                or isinstance(frame, FunctionCallInProgressFrame)
                or isinstance(frame, FunctionCallResultFrame)
            )
--- a/examples/foundational/22c-natural-conversation-mixed-llms.py
+++ b/examples/foundational/22c-natural-conversation-mixed-llms.py
@@ -18,9 +18,9 @@ from pipecat.frames.frames import (
    Frame,
    FunctionCallInProgressFrame,
    FunctionCallResultFrame,
+    InterruptionFrame,
    LLMRunFrame,
    StartFrame,
-    StartInterruptionFrame,
    SystemFrame,
    TextFrame,
    TranscriptionFrame,
@@ -347,7 +347,7 @@ class OutputGate(FrameProcessor):
                await self._start()
            if isinstance(frame, (EndFrame, CancelFrame)):
                await self._stop()
-            if isinstance(frame, StartInterruptionFrame):
+            if isinstance(frame, InterruptionFrame):
                self._frames_buffer = []
                self.close_gate()
            await self.push_frame(frame, direction)
@@ -426,7 +426,7 @@ class TurnDetectionLLM(Pipeline):
        async def pass_only_llm_trigger_frames(frame):
            return (
                isinstance(frame, OpenAILLMContextFrame)
-                or isinstance(frame, StartInterruptionFrame)
+                or isinstance(frame, InterruptionFrame)
                or isinstance(frame, FunctionCallInProgressFrame)
                or isinstance(frame, FunctionCallResultFrame)
            )
--- a/examples/foundational/22d-natural-conversation-gemini-audio.py
+++ b/examples/foundational/22d-natural-conversation-gemini-audio.py
@@ -20,10 +20,10 @@ from pipecat.frames.frames import (
    FunctionCallInProgressFrame,
    FunctionCallResultFrame,
    InputAudioRawFrame,
+    InterruptionFrame,
    LLMFullResponseStartFrame,
    LLMRunFrame,
    StartFrame,
-    StartInterruptionFrame,
    SystemFrame,
    TextFrame,
    TranscriptionFrame,
@@ -570,7 +570,7 @@ class OutputGate(FrameProcessor):
                await self._start()
            if isinstance(frame, (EndFrame, CancelFrame)):
                await self._stop()
-            if isinstance(frame, StartInterruptionFrame):
+            if isinstance(frame, InterruptionFrame):
                self._frames_buffer = []
                self.close_gate()
            await self.push_frame(frame, direction)
--- a/examples/foundational/30-observer.py
+++ b/examples/foundational/30-observer.py
@@ -15,8 +15,8 @@ from pipecat.frames.frames import (
    BotStartedSpeakingFrame,
    BotStoppedSpeakingFrame,
    EndFrame,
+    InterruptionFrame,
    LLMRunFrame,
-    StartInterruptionFrame,
    TTSTextFrame,
    UserStartedSpeakingFrame,
 )
@@ -48,7 +48,7 @@ class CustomObserver(BaseObserver):
    """Observer to log interruptions and bot speaking events to the console.

    Logs all frame instances of:
-    - StartInterruptionFrame
+    - InterruptionFrame
    - BotStartedSpeakingFrame
    - BotStoppedSpeakingFrame

@@ -69,7 +69,7 @@ class CustomObserver(BaseObserver):
        # Create direction arrow
        arrow = "→" if direction == FrameDirection.DOWNSTREAM else "←"

-        if isinstance(frame, StartInterruptionFrame) and isinstance(src, BaseOutputTransport):
+        if isinstance(frame, InterruptionFrame) and isinstance(src, BaseOutputTransport):
            logger.info(f"⚡ INTERRUPTION START: {src} {arrow} {dst} at {time_sec:.2f}s")
        elif isinstance(frame, BotStartedSpeakingFrame):
            logger.info(f"🤖 BOT START SPEAKING: {src} {arrow} {dst} at {time_sec:.2f}s")
--- a/examples/foundational/38b-smart-turn-local.py
+++ b/examples/foundational/38b-smart-turn-local.py
@@ -11,7 +11,7 @@ from dotenv import load_dotenv
 from loguru import logger

 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
-from pipecat.audio.turn.smart_turn.local_smart_turn_v2 import LocalSmartTurnAnalyzerV2
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import LLMRunFrame
@@ -31,20 +31,7 @@ from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
 load_dotenv(override=True)

 # To use this locally, set the environment variable LOCAL_SMART_TURN_MODEL_PATH
-# to the path where the smart-turn repo is cloned.
-#
-# Example setup:
-#
-#   # Git LFS (Large File Storage)
-#   brew install git-lfs
-#   # Hugging Face uses LFS to store large model files, including .mlpackage
-#   git lfs install
-#   # Clone the repo with the smart_turn_classifier.mlpackage
-#   git clone https://huggingface.co/pipecat-ai/smart-turn-v2
-#
-# Then set the env variable:
-#   export LOCAL_SMART_TURN_MODEL_PATH=./smart-turn
-# or add it to your .env file
+# to the path of the Smart Turn v3 ONNX model file.
 smart_turn_model_path = os.getenv("LOCAL_SMART_TURN_MODEL_PATH")

 # We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -55,7 +42,7 @@ transport_params = {
        audio_in_enabled=True,
        audio_out_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV2(
+        turn_analyzer=LocalSmartTurnAnalyzerV3(
            smart_turn_model_path=smart_turn_model_path, params=SmartTurnParams()
        ),
    ),
@@ -63,7 +50,7 @@ transport_params = {
        audio_in_enabled=True,
        audio_out_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV2(
+        turn_analyzer=LocalSmartTurnAnalyzerV3(
            smart_turn_model_path=smart_turn_model_path, params=SmartTurnParams()
        ),
    ),
@@ -71,7 +58,7 @@ transport_params = {
        audio_in_enabled=True,
        audio_out_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV2(
+        turn_analyzer=LocalSmartTurnAnalyzerV3(
            smart_turn_model_path=smart_turn_model_path, params=SmartTurnParams()
        ),
    ),
--- a/examples/foundational/45-openai-agent-basic.py
+++ b/examples/foundational/45-openai-agent-basic.py
@@ -0,0 +1,205 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""
+Basic OpenAI Agent service example.
+
+This example demonstrates how to use the OpenAI Agents SDK within a Pipecat
+pipeline to create an interactive agent with tool calling capabilities.
+
+Requirements:
+- OpenAI API key
+- OpenAI Agents SDK: pip install openai-agents
+"""
+
+import os
+import random
+from typing import Any, List
+
+# Import agents SDK for tools and agent creation
+from agents import Agent, function_tool
+from dotenv import load_dotenv
+from loguru import logger
+from openai.types.chat import ChatCompletionMessageParam
+
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.openai_agent.agent_service import OpenAIAgentService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+# Transport configuration
+transport_params = {
+    "daily": lambda: DailyParams(audio_out_enabled=True, audio_in_enabled=True),
+    "twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True, audio_in_enabled=True),
+    "webrtc": lambda: TransportParams(audio_out_enabled=True, audio_in_enabled=True),
+}
+
+
+@function_tool
+def get_weather(location: str) -> str:
+    """Get the current weather for a location.
+
+    Args:
+        location: The location to get weather for
+
+    Returns:
+        A weather description string
+    """
+    # Mock weather data - in real usage, integrate with weather API
+    weather_data = {
+        "San Francisco": "Foggy, 65°F",
+        "New York": "Sunny, 72°F",
+        "London": "Rainy, 59°F",
+        "Tokyo": "Partly cloudy, 68°F",
+    }
+    return weather_data.get(location, f"Weather data not available for {location}")
+
+
+@function_tool
+def get_random_fact() -> str:
+    """Get a random interesting fact.
+
+    Returns:
+        A random fact string
+    """
+    facts = [
+        "Honey never spoils. Archaeologists have found edible honey in ancient Egyptian tombs.",
+        "Octopuses have three hearts and blue blood.",
+        "The Great Wall of China isn't visible from space with the naked eye.",
+        "Bananas are berries, but strawberries aren't.",
+    ]
+    return random.choice(facts)
+
+
+def get_random_fact_tool():
+    """Example tool function for random facts."""
+
+    def get_random_fact() -> str:
+        """Get a random interesting fact.
+
+        Returns:
+            A random fact string.
+        """
+        facts = [
+            "Honey never spoils. Archaeologists have found edible honey in ancient Egyptian tombs.",
+            "A group of flamingos is called a 'flamboyance'.",
+            "Octopuses have three hearts and blue blood.",
+            "The Great Wall of China isn't visible from space with the naked eye.",
+            "Bananas are berries, but strawberries aren't.",
+        ]
+        return random.choice(facts)
+
+    return get_random_fact
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting OpenAI Agent bot")
+
+    # Set up STT for speech recognition
+    stt = DeepgramSTTService(
+        api_key=os.getenv("DEEPGRAM_API_KEY", ""),
+        model="nova-2",
+    )
+
+    # Set up TTS for voice output
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY", ""),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    # Create tools for the agent
+    tools: list[Any] = [
+        get_weather,
+        get_random_fact,
+    ]
+
+    # Create the agent with tools
+    agent = Agent(
+        name="Assistant",
+        instructions="""You are a helpful assistant with access to weather information and random facts. 
+        You can:
+        - Check weather for any location using the get_weather tool
+        - Share interesting facts using the get_random_fact tool
+        - Have natural conversations
+        
+        Be friendly, informative, and engaging in your responses.""",
+        tools=tools,
+    )
+
+    # Initialize the OpenAI Agent service with the pre-configured agent
+    agent_service = OpenAIAgentService(
+        agent=agent,
+        api_key=os.getenv("OPENAI_API_KEY"),
+        streaming=True,
+    )
+
+    # Set up conversation context with initial system message
+    messages: List[ChatCompletionMessageParam] = [
+        {
+            "role": "system",
+            "content": "You are a helpful assistant with access to weather information and random facts. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = OpenAILLMContext(messages)
+    context_aggregator = agent_service.create_context_aggregator(context)
+
+    # Create the processing pipeline with context aggregators
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # Speech to text
+            context_aggregator.user(),  # User responses
+            agent_service,  # OpenAI Agent processing
+            tts,  # Text to speech
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    # Send an initial greeting when client connects
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected, sending greeting")
+        # Kick off the conversation by adding system message and running LLM
+        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/46-openai-agent-handoffs.py
+++ b/examples/foundational/46-openai-agent-handoffs.py
@@ -0,0 +1,276 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""
+Advanced OpenAI Agent service example with handoffs.
+
+This example demonstrates how to use multiple agents with handoffs in the
+OpenAI Agents SDK within a Pipecat pipeline, showcasing agent orchestration
+and specialization.
+
+Requirements:
+- OpenAI API key
+- OpenAI Agents SDK: pip install openai-agents
+"""
+
+import os
+import random
+from typing import List
+
+from dotenv import load_dotenv
+from loguru import logger
+from openai.types.chat import ChatCompletionMessageParam
+
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.openai_agent.agent_service import OpenAIAgentService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+# Transport configuration
+transport_params = {
+    "daily": lambda: DailyParams(audio_out_enabled=True, audio_in_enabled=True),
+    "twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True, audio_in_enabled=True),
+    "webrtc": lambda: TransportParams(audio_out_enabled=True, audio_in_enabled=True),
+}
+
+
+def create_weather_tools():
+    """Create weather-related tools."""
+
+    def get_weather(location: str) -> str:
+        """Get current weather for a location."""
+        conditions = ["sunny", "cloudy", "rainy", "snowy", "windy"]
+        temp = random.randint(-10, 35)
+        condition = random.choice(conditions)
+        return f"The weather in {location} is {condition} with a temperature of {temp}°C."
+
+    def get_forecast(location: str, days: int = 3) -> str:
+        """Get weather forecast for multiple days."""
+        forecast = []
+        for i in range(days):
+            conditions = ["sunny", "cloudy", "rainy", "snowy"]
+            temp = random.randint(-5, 30)
+            condition = random.choice(conditions)
+            day = "today" if i == 0 else f"in {i} day{'s' if i > 1 else ''}"
+            forecast.append(f"{day.capitalize()}: {condition}, {temp}°C")
+        return f"Weather forecast for {location}:\n" + "\n".join(forecast)
+
+    return [get_weather, get_forecast]
+
+
+def create_trivia_tools():
+    """Create trivia and fact tools."""
+
+    def get_random_fact() -> str:
+        """Get a random interesting fact."""
+        facts = [
+            "Honey never spoils. Archaeologists have found edible honey in ancient Egyptian tombs.",
+            "A group of flamingos is called a 'flamboyance'.",
+            "Octopuses have three hearts and blue blood.",
+            "The Great Wall of China isn't visible from space with the naked eye.",
+            "Bananas are berries, but strawberries aren't.",
+            "Wombat poop is cube-shaped.",
+            "A shrimp's heart is in its head.",
+            "It's impossible to hum while holding your nose.",
+        ]
+        return random.choice(facts)
+
+    def get_science_fact() -> str:
+        """Get a random science fact."""
+        facts = [
+            "The speed of light in a vacuum is approximately 299,792,458 meters per second.",
+            "DNA stands for Deoxyribonucleic Acid.",
+            "The human brain uses about 20% of the body's total energy.",
+            "There are more possible games of chess than atoms in the observable universe.",
+            "A single bolt of lightning contains enough energy to toast 100,000 slices of bread.",
+        ]
+        return random.choice(facts)
+
+    return [get_random_fact, get_science_fact]
+
+
+def create_math_tools():
+    """Create math calculation tools."""
+
+    def calculate(expression: str) -> str:
+        """Safely calculate a mathematical expression."""
+        try:
+            # Only allow basic math operations for safety
+            allowed_chars = set("0123456789+-*/.() ")
+            if "**" in expression or not all(c in allowed_chars for c in expression):
+                return "Sorry, I can only calculate basic math expressions with +, -, *, /, and parentheses."
+
+            result = eval(expression)
+            return f"{expression} = {result}"
+        except Exception as e:
+            return f"Error calculating '{expression}': {str(e)}"
+
+    def generate_math_problem() -> str:
+        """Generate a random math problem."""
+        operations = ["+", "-", "*"]
+        a = random.randint(1, 20)
+        b = random.randint(1, 20)
+        op = random.choice(operations)
+
+        if op == "+":
+            answer = a + b
+        elif op == "-":
+            answer = a - b
+        else:  # multiplication
+            answer = a * b
+
+        return f"Here's a math problem for you: {a} {op} {b} = ? (The answer is {answer}.)"
+
+    return [calculate, generate_math_problem]
+
+
+async def create_specialist_agents():
+    """Create specialized agents for different domains."""
+
+    # Weather specialist agent
+    weather_agent = OpenAIAgentService(
+        name="Weather Specialist",
+        instructions="""You are a weather specialist. You provide detailed weather information,
+        forecasts, and weather-related advice. Use your tools to get accurate weather data.
+        Be informative and helpful about weather conditions and what they might mean for
+        outdoor activities.""",
+        tools=create_weather_tools(),
+        api_key=os.getenv("OPENAI_API_KEY"),
+        streaming=True,
+    )
+
+    # Trivia specialist agent
+    trivia_agent = OpenAIAgentService(
+        name="Trivia Master",
+        instructions="""You are a trivia and facts specialist. You love sharing interesting
+        facts, trivia, and educational content. Use your tools to provide fascinating
+        information and engage users with fun facts. Make learning enjoyable!""",
+        tools=create_trivia_tools(),
+        api_key=os.getenv("OPENAI_API_KEY"),
+        streaming=True,
+    )
+
+    # Math specialist agent
+    math_agent = OpenAIAgentService(
+        name="Math Helper",
+        instructions="""You are a mathematics specialist. You help with calculations,
+        math problems, and mathematical concepts. Use your tools to solve problems
+        and generate practice questions. Make math accessible and fun!""",
+        tools=create_math_tools(),
+        api_key=os.getenv("OPENAI_API_KEY"),
+        streaming=True,
+    )
+
+    return weather_agent, trivia_agent, math_agent
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting OpenAI Agent bot with handoffs")
+
+    # Set up STT for speech recognition
+    stt = DeepgramSTTService(
+        api_key=os.getenv("DEEPGRAM_API_KEY", ""),
+        model="nova-2",
+    )
+
+    # Set up TTS for voice output
+    tts = CartesiaTTSService(
+        api_key=os.getenv("CARTESIA_API_KEY", ""),
+        voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+    )
+
+    # Create specialist agents
+    weather_agent, trivia_agent, math_agent = await create_specialist_agents()
+
+    # Create the main triage agent that can hand off to specialists
+    triage_agent = OpenAIAgentService(
+        name="Assistant Coordinator",
+        instructions="""You are a helpful assistant coordinator. Your role is to understand
+        what the user needs and direct them to the right specialist:
+        
+        - For weather questions, forecasts, or outdoor activity planning -> Weather Specialist
+        - For interesting facts, trivia, or educational content -> Trivia Master  
+        - For calculations, math problems, or mathematical help -> Math Helper
+        
+        If the request doesn't clearly fit a specialist, you can handle general conversation
+        yourself. Always be friendly and explain when you're connecting them to a specialist.""",
+        handoffs=[weather_agent.agent, trivia_agent.agent, math_agent.agent],  # type: ignore
+        api_key=os.getenv("OPENAI_API_KEY"),
+        streaming=True,
+    )
+
+    # Set up conversation context with initial system message
+    messages: List[ChatCompletionMessageParam] = [
+        {
+            "role": "system",
+            "content": "You are a helpful assistant coordinator with access to weather information, trivia, and math tools. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = OpenAILLMContext(messages)
+    context_aggregator = triage_agent.create_context_aggregator(context)
+
+    # Create the processing pipeline with context aggregators
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # Speech to text
+            context_aggregator.user(),  # User responses
+            triage_agent,  # OpenAI Agent processing
+            tts,  # Text to speech
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    # Send an initial greeting when client connects
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected, sending greeting")
+        # Kick off the conversation by adding system message and running LLM
+        messages.append(
+            {
+                "role": "system",
+                "content": "Please introduce yourself to the user as an AI assistant coordinator who works with specialists for weather, trivia, and math topics.",
+            }
+        )
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/quickstart/uv.lock
+++ b/examples/quickstart/uv.lock
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -34,7 +34,7 @@ dependencies = [
    "pyloudnorm~=0.1.1",
    "resampy~=0.4.3",
    "soxr~=0.5.0",
-    "openai>=1.74.0,<=1.99.1",
+    "openai>=1.74.0,<2.0.0",
    # Pinning numba to resolve package dependencies
    "numba==0.61.2",
    "wait_for2>=0.4.1; python_version<'3.12'",
@@ -74,7 +74,7 @@ langchain = [ "langchain~=0.3.20", "langchain-community~=0.3.20", "langchain-ope
 livekit = [ "livekit~=0.22.0", "livekit-api~=0.8.2", "tenacity>=8.2.3,<10.0.0" ]
 lmnt = [ "websockets>=13.1,<15.0" ]
 local = [ "pyaudio~=0.2.14" ]
-mcp = [ "mcp[cli]~=1.9.4" ]
+mcp = [ "mcp[cli]>=1.11.0,<2.0.0" ]
 mem0 = [ "mem0ai~=0.1.94" ]
 mistral = []
 mlx-whisper = [ "mlx-whisper~=0.4.2" ]
@@ -83,7 +83,8 @@ nim = []
 neuphonic = [ "websockets>=13.1,<15.0" ]
 noisereduce = [ "noisereduce~=3.0.3" ]
 openai = [ "websockets>=13.1,<15.0" ]
-openpipe = [ "openpipe~=4.50.0" ]
+openai-agent = [ "openai-agents~=0.3.0" ]
+# openpipe = [ "openpipe~=4.50.0" ]  # Temporarily disabled due to openai version conflict
 openrouter = []
 perplexity = []
 playht = [ "websockets>=13.1,<15.0" ]
@@ -95,8 +96,9 @@ sambanova = []
 sarvam = [ "websockets>=13.1,<15.0" ]
 sentry = [ "sentry-sdk~=2.23.1" ]
 local-smart-turn = [ "coremltools>=8.0", "transformers", "torch>=2.5.0,<3", "torchaudio>=2.5.0,<3" ]
+local-smart-turn-v3 = [ "transformers", "torch>=2.5.0,<3", "torchaudio>=2.5.0,<3", "onnxruntime>=1.20.1, <2" ]
 remote-smart-turn = []
-silero = [ "onnxruntime~=1.20.1" ]
+silero = [ "onnxruntime>=1.20.1, <2" ]
 simli = [ "simli-ai~=0.1.10"]
 soniox = [ "websockets>=13.1,<15.0" ]
 soundfile = [ "soundfile~=0.13.0" ]
@@ -154,6 +156,7 @@ where = ["src"]
    "src/pipecat/audio/dtmf/dtmf-star.wav",
 ]
 "pipecat.services.aws_nova_sonic" = ["src/pipecat/services/aws_nova_sonic/ready.wav"]
+"pipecat.audio.turn.smart_turn.data" = ["src/pipecat/audio/turn/smart_turn/data/smart-turn-v3.0.onnx"]

 [tool.pytest.ini_options]
 addopts = "--verbose"
--- a/scripts/evals/eval.py
+++ b/scripts/evals/eval.py
@@ -47,7 +47,7 @@ from pipecat.transports.daily.transport import DailyParams, DailyTransport
 SCRIPT_DIR = Path(__file__).resolve().parent

 PIPELINE_IDLE_TIMEOUT_SECS = 60
-EVAL_TIMEOUT_SECS = 90
+EVAL_TIMEOUT_SECS = 120

 EvalPrompt = str | Tuple[str, ImageFile]

@@ -266,8 +266,11 @@ async def run_eval_pipeline(
    elif isinstance(prompt, tuple):
        example_prompt, example_image = prompt

-    eval_prompt = f"The answer is correct if it's appropriate for the context and matches: {eval}."
-    common_system_prompt = f"Call the eval function with your assessment only if the user answers the question. {eval_prompt}"
+    eval_prompt = f"The answer is correct if it matches: {eval}."
+    common_system_prompt = (
+        "The user might say things other than the answer and that's allowed. "
+        f"You should only call the eval function with your assessment when the user actually answers the question. {eval_prompt}"
+    )
    if user_speaks_first:
        system_prompt = f"You are an LLM eval, be extremly brief. You will start the conversation by saying: '{example_prompt}'. {common_system_prompt}"
    else:
--- a/scripts/evals/run-release-evals.py
+++ b/scripts/evals/run-release-evals.py
@@ -135,6 +135,25 @@ TESTS_14 = [
    ("14r-function-calling-aws.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("14v-function-calling-openai.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("14w-function-calling-mistral.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("14x-function-calling-universal-context.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    (
+        "14y-function-calling-google-universal-context.py",
+        PROMPT_WEATHER,
+        EVAL_WEATHER,
+        BOT_SPEAKS_FIRST,
+    ),
+    (
+        "14z-function-calling-anthropic-universal-context.py",
+        PROMPT_WEATHER,
+        EVAL_WEATHER,
+        BOT_SPEAKS_FIRST,
+    ),
+    (
+        "14aa-function-calling-aws-universal-context.py",
+        PROMPT_WEATHER,
+        EVAL_WEATHER,
+        BOT_SPEAKS_FIRST,
+    ),
    # Currently not working.
    # ("14c-function-calling-together.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    # ("14l-function-calling-deepseek.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
@@ -148,6 +167,7 @@ TESTS_15 = [
 TESTS_19 = [
    ("19-openai-realtime-beta.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("19a-azure-realtime-beta.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    ("19b-openai-realtime-text.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("19b-openai-realtime-beta-text.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
 ]

--- a/src/pipecat/adapters/base_llm_adapter.py
+++ b/src/pipecat/adapters/base_llm_adapter.py
@@ -16,7 +16,12 @@ from typing import Any, Dict, Generic, List, TypeVar
 from loguru import logger

 from pipecat.adapters.schemas.tools_schema import ToolsSchema
-from pipecat.processors.aggregators.llm_context import LLMContext, NotGiven
+from pipecat.processors.aggregators.llm_context import (
+    LLMContext,
+    LLMContextMessage,
+    LLMSpecificMessage,
+    NotGiven,
+)

 # Should be a TypedDict
 TLLMInvocationParams = TypeVar("TLLMInvocationParams", bound=dict[str, Any])
@@ -38,6 +43,16 @@ class BaseLLMAdapter(ABC, Generic[TLLMInvocationParams]):
    Subclasses must implement provider-specific conversion logic.
    """

+    @property
+    @abstractmethod
+    def id_for_llm_specific_messages(self) -> str:
+        """Get the identifier used in LLMSpecificMessage instances for this LLM provider.
+
+        Returns:
+            The identifier string.
+        """
+        pass
+
    @abstractmethod
    def get_llm_invocation_params(self, context: LLMContext, **kwargs) -> TLLMInvocationParams:
        """Get provider-specific LLM invocation parameters from a universal LLM context.
@@ -76,6 +91,28 @@ class BaseLLMAdapter(ABC, Generic[TLLMInvocationParams]):
        """
        pass

+    def create_llm_specific_message(self, message: Any) -> LLMSpecificMessage:
+        """Create an LLM-specific message (as opposed to a standard message) for use in an LLMContext.
+
+        Args:
+            message: The message content.
+
+        Returns:
+            An LLMSpecificMessage instance.
+        """
+        return LLMSpecificMessage(llm=self.id_for_llm_specific_messages, message=message)
+
+    def get_messages(self, context: LLMContext) -> List[LLMContextMessage]:
+        """Get messages from the LLM context, including standard and LLM-specific messages.
+
+        Args:
+            context: The LLM context containing messages.
+
+        Returns:
+            List of messages including standard and LLM-specific messages.
+        """
+        return context.get_messages(self.id_for_llm_specific_messages)
+
    def from_standard_tools(self, tools: Any) -> List[Any] | NotGiven:
        """Convert tools from standard format to provider format.

--- a/src/pipecat/adapters/services/anthropic_adapter.py
+++ b/src/pipecat/adapters/services/anthropic_adapter.py
@@ -9,7 +9,7 @@
 import copy
 import json
 from dataclasses import dataclass
-from typing import Any, Dict, List, Optional, TypedDict
+from typing import Any, Dict, List, TypedDict

 from anthropic import NOT_GIVEN, NotGiven
 from anthropic.types.message_param import MessageParam
@@ -28,10 +28,7 @@ from pipecat.processors.aggregators.llm_context import (


 class AnthropicLLMInvocationParams(TypedDict):
-    """Context-based parameters for invoking Anthropic's LLM API.
-
-    This is a placeholder until support for universal LLMContext machinery is added for Anthropic.
-    """
+    """Context-based parameters for invoking Anthropic's LLM API."""

    system: str | NotGiven
    messages: List[MessageParam]
@@ -45,13 +42,16 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
    to the specific format required by Anthropic's Claude models for function calling.
    """

+    @property
+    def id_for_llm_specific_messages(self) -> str:
+        """Get the identifier used in LLMSpecificMessage instances for Anthropic."""
+        return "anthropic"
+
    def get_llm_invocation_params(
        self, context: LLMContext, enable_prompt_caching: bool
    ) -> AnthropicLLMInvocationParams:
        """Get Anthropic-specific LLM invocation parameters from a universal LLM context.

-        This is a placeholder until support for universal LLMContext machinery is added for Anthropic.
-
        Args:
            context: The LLM context containing messages, tools, etc.
            enable_prompt_caching: Whether prompt caching should be enabled.
@@ -59,7 +59,7 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
        Returns:
            Dictionary of parameters for invoking Anthropic's LLM API.
        """
-        messages = self._from_universal_context_messages(self._get_messages(context))
+        messages = self._from_universal_context_messages(self.get_messages(context))
        return {
            "system": messages.system,
            "messages": (
@@ -76,8 +76,6 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):

        Removes or truncates sensitive data like image content for safe logging.

-        This is a placeholder until support for universal LLMContext machinery is added for Anthropic.
-
        Args:
            context: The LLM context containing messages.

@@ -85,7 +83,7 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
            List of messages in a format ready for logging about Anthropic.
        """
        # Get messages in Anthropic's format
-        messages = self._from_universal_context_messages(self._get_messages(context)).messages
+        messages = self._from_universal_context_messages(self.get_messages(context)).messages

        # Sanitize messages for logging
        messages_for_logging = []
@@ -99,9 +97,6 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
            messages_for_logging.append(msg)
        return messages_for_logging

-    def _get_messages(self, context: LLMContext) -> List[LLMContextMessage]:
-        return context.get_messages("anthropic")
-
    @dataclass
    class ConvertedMessages:
        """Container for Anthropic-formatted messages converted from universal context."""
--- a/src/pipecat/adapters/services/aws_nova_sonic_adapter.py
+++ b/src/pipecat/adapters/services/aws_nova_sonic_adapter.py
@@ -31,6 +31,11 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
    specific function-calling format, enabling tool use with Nova Sonic models.
    """

+    @property
+    def id_for_llm_specific_messages(self) -> str:
+        """Get the identifier used in LLMSpecificMessage instances for AWS Nova Sonic."""
+        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")
+
    def get_llm_invocation_params(self, context: LLMContext) -> AWSNovaSonicLLMInvocationParams:
        """Get AWS Nova Sonic-specific LLM invocation parameters from a universal LLM context.

--- a/src/pipecat/adapters/services/bedrock_adapter.py
+++ b/src/pipecat/adapters/services/bedrock_adapter.py
@@ -6,21 +6,33 @@

 """AWS Bedrock LLM adapter for Pipecat."""

-from typing import Any, Dict, List, TypedDict
+import base64
+import copy
+import json
+from dataclasses import dataclass
+from typing import Any, Dict, List, Literal, Optional, TypedDict
+
+from loguru import logger

 from pipecat.adapters.base_llm_adapter import BaseLLMAdapter
 from pipecat.adapters.schemas.function_schema import FunctionSchema
 from pipecat.adapters.schemas.tools_schema import ToolsSchema
-from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_context import (
+    LLMContext,
+    LLMContextMessage,
+    LLMContextToolChoice,
+    LLMSpecificMessage,
+    LLMStandardMessage,
+)


 class AWSBedrockLLMInvocationParams(TypedDict):
-    """Context-based parameters for invoking AWS Bedrock's LLM API.
+    """Context-based parameters for invoking AWS Bedrock's LLM API."""

-    This is a placeholder until support for universal LLMContext machinery is added for Bedrock.
-    """
-
-    pass
+    system: Optional[List[dict[str, Any]]]  # [{"text": "system message"}]
+    messages: List[dict[str, Any]]
+    tools: List[dict[str, Any]]
+    tool_choice: LLMContextToolChoice


 class AWSBedrockLLMAdapter(BaseLLMAdapter[AWSBedrockLLMInvocationParams]):
@@ -30,33 +42,244 @@ class AWSBedrockLLMAdapter(BaseLLMAdapter[AWSBedrockLLMInvocationParams]):
    into AWS Bedrock's expected tool format for function calling capabilities.
    """

+    @property
+    def id_for_llm_specific_messages(self) -> str:
+        """Get the identifier used in LLMSpecificMessage instances for AWS Bedrock."""
+        return "aws"
+
    def get_llm_invocation_params(self, context: LLMContext) -> AWSBedrockLLMInvocationParams:
        """Get AWS Bedrock-specific LLM invocation parameters from a universal LLM context.

-        This is a placeholder until support for universal LLMContext machinery is added for Bedrock.
-
        Args:
            context: The LLM context containing messages, tools, etc.

        Returns:
            Dictionary of parameters for invoking AWS Bedrock's LLM API.
        """
-        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Bedrock.")
+        messages = self._from_universal_context_messages(self.get_messages(context))
+        return {
+            "system": messages.system,
+            "messages": messages.messages,
+            # NOTE: LLMContext's tools are guaranteed to be a ToolsSchema (or NOT_GIVEN)
+            "tools": self.from_standard_tools(context.tools) or [],
+            # To avoid refactoring in AWSBedrockLLMService, we just pass through tool_choice.
+            # Eventually (when we don't have to maintain the non-LLMContext code path) we should do
+            # the conversion to Bedrock's expected format here rather than in AWSBedrockLLMService.
+            "tool_choice": context.tool_choice,
+        }

    def get_messages_for_logging(self, context) -> List[Dict[str, Any]]:
        """Get messages from a universal LLM context in a format ready for logging about AWS Bedrock.

        Removes or truncates sensitive data like image content for safe logging.

-        This is a placeholder until support for universal LLMContext machinery is added for Bedrock.
-
        Args:
            context: The LLM context containing messages.

        Returns:
            List of messages in a format ready for logging about AWS Bedrock.
        """
-        raise NotImplementedError("Universal LLMContext is not yet supported for AWS Bedrock.")
+        # Get messages in AWS Bedrock's format
+        messages = self._from_universal_context_messages(self.get_messages(context)).messages
+
+        # Sanitize messages for logging
+        messages_for_logging = []
+        for message in messages:
+            msg = copy.deepcopy(message)
+            if "content" in msg:
+                if isinstance(msg["content"], list):
+                    for item in msg["content"]:
+                        if item.get("image"):
+                            item["image"]["source"]["bytes"] = "..."
+            messages_for_logging.append(msg)
+        return messages_for_logging
+
+    @dataclass
+    class ConvertedMessages:
+        """Container for AWS Bedrock-formatted messages converted from universal context."""
+
+        messages: List[dict[str, Any]]
+        system: Optional[List[dict[str, Any]]]  # [{"text": "system message"}]
+
+    def _from_universal_context_messages(
+        self, universal_context_messages: List[LLMContextMessage]
+    ) -> ConvertedMessages:
+        system = None
+        messages = []
+
+        # first, map messages using self._from_universal_context_message(m)
+        try:
+            messages = [self._from_universal_context_message(m) for m in universal_context_messages]
+        except Exception as e:
+            logger.error(f"Error mapping messages: {e}")
+
+        # See if we should pull the system message out of our messages list
+        if messages and messages[0]["role"] == "system":
+            system = messages[0]["content"]
+            messages.pop(0)
+
+        # Convert any subsequent "system"-role messages to "user"-role
+        # messages, as AWS Bedrock doesn't support system input messages.
+        for message in messages:
+            if message["role"] == "system":
+                message["role"] = "user"
+
+        # Merge consecutive messages with the same role.
+        i = 0
+        while i < len(messages) - 1:
+            current_message = messages[i]
+            next_message = messages[i + 1]
+            if current_message["role"] == next_message["role"]:
+                # Convert content to a list of Bedrock text blocks if it's a string
+                if isinstance(current_message["content"], str):
+                    current_message["content"] = [
+                        {"text": current_message["content"]}
+                    ]
+                if isinstance(next_message["content"], str):
+                    next_message["content"] = [{"text": next_message["content"]}]
+                # Concatenate the content
+                current_message["content"].extend(next_message["content"])
+                # Remove the next message from the list
+                messages.pop(i + 1)
+            else:
+                i += 1
+
+        # Avoid empty content in messages
+        for message in messages:
+            if isinstance(message["content"], str) and message["content"] == "":
+                message["content"] = "(empty)"
+            elif isinstance(message["content"], list) and len(message["content"]) == 0:
+                message["content"] = [{"text": "(empty)"}]
+
+        return self.ConvertedMessages(messages=messages, system=system)
+
+    def _from_universal_context_message(self, message: LLMContextMessage) -> dict[str, Any]:
+        if isinstance(message, LLMSpecificMessage):
+            return copy.deepcopy(message.message)
+        return self._from_standard_message(message)
+
+    def _from_standard_message(self, message: LLMStandardMessage) -> dict[str, Any]:
+        """Convert standard format message to AWS Bedrock format.
+
+        Handles conversion of text content, tool calls, and tool results.
+        Empty text content is converted to "(empty)".
+
+        Args:
+            message: Message in standard format.
+
+        Returns:
+            Message in AWS Bedrock format.
+
+        Examples:
+            Standard format input::
+
+                {
+                    "role": "assistant",
+                    "tool_calls": [
+                        {
+                            "id": "123",
+                            "function": {"name": "search", "arguments": '{"q": "test"}'}
+                        }
+                    ]
+                }
+
+            AWS Bedrock format output::
+
+                {
+                    "role": "assistant",
+                    "content": [
+                        {
+                            "toolUse": {
+                                "toolUseId": "123",
+                                "name": "search",
+                                "input": {"q": "test"}
+                            }
+                        }
+                    ]
+                }
+        """
+        message = copy.deepcopy(message)
+        if message["role"] == "tool":
+            # Try to parse the content as JSON if it looks like JSON
+            try:
+                if message["content"].strip().startswith("{") and message[
+                    "content"
+                ].strip().endswith("}"):
+                    content_json = json.loads(message["content"])
+                    tool_result_content = [{"json": content_json}]
+                else:
+                    tool_result_content = [{"text": message["content"]}]
+            except Exception:
+                tool_result_content = [{"text": message["content"]}]
+
+            return {
+                "role": "user",
+                "content": [
+                    {
+                        "toolResult": {
+                            "toolUseId": message["tool_call_id"],
+                            "content": tool_result_content,
+                        },
+                    },
+                ],
+            }
+
+        if message.get("tool_calls"):
+            tc = message["tool_calls"]
+            ret = {"role": "assistant", "content": []}
+            for tool_call in tc:
+                function = tool_call["function"]
+                arguments = json.loads(function["arguments"])
+                new_tool_use = {
+                    "toolUse": {
+                        "toolUseId": tool_call["id"],
+                        "name": function["name"],
+                        "input": arguments,
+                    }
+                }
+                ret["content"].append(new_tool_use)
+            return ret
+
+        # Handle text content
+        content = message.get("content")
+        if isinstance(content, str):
+            if content == "":
+                return {"role": message["role"], "content": [{"text": "(empty)"}]}
+            else:
+                return {"role": message["role"], "content": [{"text": content}]}
+        elif isinstance(content, list):
+            new_content = []
+            for item in content:
+                # fix empty text
+                if item.get("type", "") == "text":
+                    text_content = item["text"] if item["text"] != "" else "(empty)"
+                    new_content.append({"text": text_content})
+                # handle image_url -> image conversion
+                elif item.get("type", "") == "image_url":
+                    new_item = {
+                        "image": {
+                            "format": "jpeg",
+                            "source": {
+                                "bytes": base64.b64decode(item["image_url"]["url"].split(",")[1])
+                            },
+                        }
+                    }
+                    new_content.append(new_item)
+            # In the case where there's a single image in the list (like what
+            # would result from a UserImageRawFrame), ensure that the image
+            # comes before text
+            image_indices = [i for i, item in enumerate(new_content) if "image" in item]
+            text_indices = [i for i, item in enumerate(new_content) if "text" in item]
+            if len(image_indices) == 1 and text_indices:
+                img_idx = image_indices[0]
+                first_txt_idx = text_indices[0]
+                if img_idx > first_txt_idx:
+                    # Move image before the first text
+                    image_item = new_content.pop(img_idx)
+                    new_content.insert(first_txt_idx, image_item)
+            return {"role": message["role"], "content": new_content}
+
+        return message

    @staticmethod
    def _to_bedrock_function_format(function: FunctionSchema) -> Dict[str, Any]:
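
Review note: the tool-call branch of `_from_standard_message` above can be checked in isolation. The following is a minimal sketch, not the adapter itself. `tool_calls_to_bedrock` is a hypothetical standalone helper that assumes the input follows the standard format shown in the docstring, with `arguments` as a JSON string.

```python
import json
from typing import Any


def tool_calls_to_bedrock(message: dict[str, Any]) -> dict[str, Any]:
    """Convert a standard assistant tool-call message to Bedrock's toolUse shape.

    Hypothetical helper for illustration; it mirrors the docstring example only.
    """
    content = []
    for tool_call in message.get("tool_calls", []):
        function = tool_call["function"]
        content.append(
            {
                "toolUse": {
                    "toolUseId": tool_call["id"],
                    "name": function["name"],
                    # Standard messages carry arguments as a JSON string;
                    # the Bedrock shape carries a decoded object instead.
                    "input": json.loads(function["arguments"]),
                }
            }
        )
    return {"role": "assistant", "content": content}


standard = {
    "role": "assistant",
    "tool_calls": [
        {"id": "123", "function": {"name": "search", "arguments": '{"q": "test"}'}}
    ],
}
converted = tool_calls_to_bedrock(standard)
```

Here `converted` has the same shape as the "AWS Bedrock format output" example in the docstring, which makes the branch easy to cover with a unit test.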
--- a/src/pipecat/adapters/services/gemini_adapter.py
+++ b/src/pipecat/adapters/services/gemini_adapter.py
@@ -54,6 +54,11 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
    - Extracting and sanitizing messages from the LLM context for logging with Gemini.
    """

+    @property
+    def id_for_llm_specific_messages(self) -> str:
+        """Get the identifier used in LLMSpecificMessage instances for Google."""
+        return "google"
+
    def get_llm_invocation_params(self, context: LLMContext) -> GeminiLLMInvocationParams:
        """Get Gemini-specific LLM invocation parameters from a universal LLM context.

@@ -63,7 +68,7 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
        Returns:
            Dictionary of parameters for Gemini's API.
        """
-        messages = self._from_universal_context_messages(self._get_messages(context))
+        messages = self._from_universal_context_messages(self.get_messages(context))
        return {
            "system_instruction": messages.system_instruction,
            "messages": messages.messages,
@@ -103,7 +108,7 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
            List of messages in a format ready for logging about Gemini.
        """
        # Get messages in Gemini's format
-        messages = self._from_universal_context_messages(self._get_messages(context)).messages
+        messages = self._from_universal_context_messages(self.get_messages(context)).messages

        # Sanitize messages for logging
        messages_for_logging = []
@@ -119,9 +124,6 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
            messages_for_logging.append(obj)
        return messages_for_logging

-    def _get_messages(self, context: LLMContext) -> List[LLMContextMessage]:
-        return context.get_messages("google")
-
    @dataclass
    class ConvertedMessages:
        """Container for Google-formatted messages converted from universal context."""
--- a/src/pipecat/adapters/services/open_ai_adapter.py
+++ b/src/pipecat/adapters/services/open_ai_adapter.py
@@ -24,6 +24,7 @@ from pipecat.processors.aggregators.llm_context import (
    LLMContext,
    LLMContextMessage,
    LLMContextToolChoice,
+    LLMSpecificMessage,
    NotGiven,
 )

@@ -47,6 +48,11 @@ class OpenAILLMAdapter(BaseLLMAdapter[OpenAILLMInvocationParams]):
    - Extracting and sanitizing messages from the LLM context for logging about OpenAI.
    """

+    @property
+    def id_for_llm_specific_messages(self) -> str:
+        """Get the identifier used in LLMSpecificMessage instances for OpenAI."""
+        return "openai"
+
    def get_llm_invocation_params(self, context: LLMContext) -> OpenAILLMInvocationParams:
        """Get OpenAI-specific LLM invocation parameters from a universal LLM context.

@@ -57,7 +63,7 @@ class OpenAILLMAdapter(BaseLLMAdapter[OpenAILLMInvocationParams]):
            Dictionary of parameters for OpenAI's ChatCompletion API.
        """
        return {
-            "messages": self._from_universal_context_messages(self._get_messages(context)),
+            "messages": self._from_universal_context_messages(self.get_messages(context)),
            # NOTE; LLMContext's tools are guaranteed to be a ToolsSchema (or NOT_GIVEN)
            "tools": self.from_standard_tools(context.tools),
            "tool_choice": context.tool_choice,
@@ -91,7 +97,7 @@ class OpenAILLMAdapter(BaseLLMAdapter[OpenAILLMInvocationParams]):
            List of messages in a format ready for logging about OpenAI.
        """
        msgs = []
-        for message in self._get_messages(context):
+        for message in self.get_messages(context):
            msg = copy.deepcopy(message)
            if "content" in msg:
                if isinstance(msg["content"], list):
@@ -104,14 +110,18 @@ class OpenAILLMAdapter(BaseLLMAdapter[OpenAILLMInvocationParams]):
            msgs.append(msg)
        return msgs

-    def _get_messages(self, context: LLMContext) -> List[LLMContextMessage]:
-        return context.get_messages("openai")
-
    def _from_universal_context_messages(
        self, messages: List[LLMContextMessage]
    ) -> List[ChatCompletionMessageParam]:
-        # Just a pass-through: messages are already the right type
-        return messages
+        result = []
+        for message in messages:
+            if isinstance(message, LLMSpecificMessage):
+                # Extract the actual message content from LLMSpecificMessage
+                result.append(message.message)
+            else:
+                # Standard message, pass through unchanged
+                result.append(message)
+        return result

    def _from_standard_tool_choice(
        self, tool_choice: LLMContextToolChoice | NotGiven
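
Review note: the new `_from_universal_context_messages` unwraps LLM-specific messages and passes standard ones through. The sketch below illustrates that idea with stand-in types. `SpecificMessage` and `messages_for` are hypothetical names, and filtering by provider id is an assumption about what `LLMContext.get_messages` does before the adapter sees the list.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SpecificMessage:
    """Hypothetical stand-in for LLMSpecificMessage: a payload tagged with a provider id."""

    llm: str
    message: Any


def messages_for(llm_id: str, messages: list[Any]) -> list[Any]:
    """Return standard messages plus unwrapped payloads tagged for llm_id."""
    result = []
    for m in messages:
        if isinstance(m, SpecificMessage):
            # Assumption: messages tagged for another provider are skipped.
            if m.llm == llm_id:
                result.append(m.message)
        else:
            # Standard message, passed through unchanged.
            result.append(m)
    return result


history = [
    {"role": "user", "content": "hi"},
    SpecificMessage(llm="openai", message={"role": "assistant", "content": "hello"}),
    SpecificMessage(llm="google", message={"role": "model", "parts": []}),
]
openai_view = messages_for("openai", history)
```

With this split, each adapter only has to declare its identifier (as `id_for_llm_specific_messages` does) and the shared base class can fetch the right view of the context.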
--- a/src/pipecat/adapters/services/open_ai_realtime_adapter.py
+++ b/src/pipecat/adapters/services/open_ai_realtime_adapter.py
@@ -30,6 +30,11 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
    OpenAI's Realtime API for function calling capabilities.
    """

+    @property
+    def id_for_llm_specific_messages(self) -> str:
+        """Get the identifier used in LLMSpecificMessage instances for OpenAI Realtime."""
+        raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")
+
    def get_llm_invocation_params(self, context: LLMContext) -> OpenAIRealtimeLLMInvocationParams:
        """Get OpenAI Realtime-specific LLM invocation parameters from a universal LLM context.

--- a/src/pipecat/audio/filters/noisereduce_filter.py
+++ b/src/pipecat/audio/filters/noisereduce_filter.py
@@ -33,6 +33,10 @@ class NoisereduceFilter(BaseAudioFilter):
    Applies spectral gating noise reduction algorithms to suppress background
    noise in audio streams. Uses the noisereduce library's default noise
    reduction parameters.
+
+    .. deprecated:: 0.0.85
+        `NoisereduceFilter` is deprecated and will be removed in a future version.
+        We recommend using other real-time audio filters like `KrispFilter` or `AICFilter`.
    """

    def __init__(self) -> None:
@@ -40,6 +44,17 @@ class NoisereduceFilter(BaseAudioFilter):
        self._filtering = True
        self._sample_rate = 0

+        import warnings
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "`NoisereduceFilter` is deprecated. "
+                "Use other real-time audio filters like `KrispFilter` or `AICFilter`.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
    async def start(self, sample_rate: int):
        """Initialize the filter with the transport's sample rate.

--- a/src/pipecat/audio/turn/smart_turn/data/__init__.py
+++ b/src/pipecat/audio/turn/smart_turn/data/__init__.py
--- a/src/pipecat/audio/turn/smart_turn/data/smart-turn-v3.0.onnx
+++ b/src/pipecat/audio/turn/smart_turn/data/smart-turn-v3.0.onnx
--- a/src/pipecat/audio/turn/smart_turn/local_smart_turn_v3.py
+++ b/src/pipecat/audio/turn/smart_turn/local_smart_turn_v3.py
@@ -0,0 +1,124 @@
+#
+# Copyright (c) 2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Local turn analyzer for on-device ML inference using the smart-turn-v3 model.
+
+This module provides a smart turn analyzer that uses an ONNX model for
+local end-of-turn detection without requiring network connectivity.
+"""
+
+from typing import Any, Dict, Optional
+
+import numpy as np
+from loguru import logger
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import BaseSmartTurn
+
+try:
+    import onnxruntime as ort
+    from transformers import WhisperFeatureExtractor
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error(
+        "In order to use LocalSmartTurnAnalyzerV3, you need to `pip install pipecat-ai[local-smart-turn-v3]`."
+    )
+    raise Exception(f"Missing module: {e}")
+
+
+class LocalSmartTurnAnalyzerV3(BaseSmartTurn):
+    """Local turn analyzer using the smart-turn-v3 ONNX model.
+
+    Provides end-of-turn detection using a locally stored ONNX model,
+    enabling offline operation without network dependencies.
+    """
+
+    def __init__(self, *, smart_turn_model_path: Optional[str] = None, **kwargs):
+        """Initialize the local ONNX smart-turn-v3 analyzer.
+
+        Args:
+            smart_turn_model_path: Path to the ONNX model file. If this is not
+                set, the bundled smart-turn-v3.0 model will be used.
+            **kwargs: Additional arguments passed to BaseSmartTurn.
+        """
+        super().__init__(**kwargs)
+
+        logger.debug("Loading Local Smart Turn v3 model...")
+
+        if not smart_turn_model_path:
+            # Load bundled model
+            model_name = "smart-turn-v3.0.onnx"
+            package_path = "pipecat.audio.turn.smart_turn.data"
+
+            try:
+                import importlib_resources as impresources
+
+                smart_turn_model_path = str(impresources.files(package_path).joinpath(model_name))
+            except Exception:
+                from importlib import resources as impresources
+
+                try:
+                    with impresources.path(package_path, model_name) as f:
+                        smart_turn_model_path = str(f)
+                except Exception:
+                    smart_turn_model_path = str(
+                        impresources.files(package_path).joinpath(model_name)
+                    )
+
+        so = ort.SessionOptions()
+        so.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL
+        so.inter_op_num_threads = 1
+        so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
+
+        self._feature_extractor = WhisperFeatureExtractor(chunk_length=8)
+        self._session = ort.InferenceSession(smart_turn_model_path, sess_options=so)
+
+        logger.debug("Loaded Local Smart Turn v3")
+
+    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+        """Predict end-of-turn using local ONNX model."""
+
+        def truncate_audio_to_last_n_seconds(audio_array, n_seconds=8, sample_rate=16000):
+            """Truncate audio to last n seconds or pad with zeros to meet n seconds."""
+            max_samples = n_seconds * sample_rate
+            if len(audio_array) > max_samples:
+                return audio_array[-max_samples:]
+            elif len(audio_array) < max_samples:
+                # Pad with zeros at the beginning
+                padding = max_samples - len(audio_array)
+                return np.pad(audio_array, (padding, 0), mode="constant", constant_values=0)
+            return audio_array
+
+        # Truncate to 8 seconds (keeping the end) or pad to 8 seconds
+        audio_array = truncate_audio_to_last_n_seconds(audio_array, n_seconds=8)
+
+        # Process audio using Whisper's feature extractor
+        inputs = self._feature_extractor(
+            audio_array,
+            sampling_rate=16000,
+            return_tensors="pt",
+            padding="max_length",
+            max_length=8 * 16000,
+            truncation=True,
+            do_normalize=True,
+        )
+
+        # Convert to numpy and ensure correct shape for ONNX
+        input_features = inputs.input_features.squeeze(0).numpy().astype(np.float32)
+        input_features = np.expand_dims(input_features, axis=0)  # Add batch dimension
+
+        # Run ONNX inference
+        outputs = self._session.run(None, {"input_features": input_features})
+
+        # Extract probability (ONNX model returns sigmoid probabilities)
+        probability = outputs[0][0].item()
+
+        # Make prediction (1 for Complete, 0 for Incomplete)
+        prediction = 1 if probability > 0.5 else 0
+
+        return {
+            "prediction": prediction,
+            "probability": probability,
+        }
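Note on the hunk above: the nested helper in `_predict_endpoint` always hands the feature extractor a fixed 8-second window. A minimal standalone sketch of that truncate-or-left-pad step follows. The function name `keep_last_n_seconds` is hypothetical and only numpy is used; it is an illustration, not the module's code.

```python
import numpy as np


def keep_last_n_seconds(
    audio: np.ndarray, n_seconds: int = 8, sample_rate: int = 16000
) -> np.ndarray:
    """Return exactly n_seconds of audio (hypothetical helper, not pipecat API).

    Longer input keeps only its tail; shorter input is left-padded with zeros
    so the most recent samples stay at the end of the window.
    """
    max_samples = n_seconds * sample_rate
    if len(audio) > max_samples:
        return audio[-max_samples:]
    padding = max_samples - len(audio)
    return np.pad(audio, (padding, 0), mode="constant", constant_values=0)


short = np.ones(16000, dtype=np.float32)  # 1 second of audio
long = np.arange(10 * 16000, dtype=np.float32)  # 10 seconds of audio

padded = keep_last_n_seconds(short)  # zeros first, then the 1 second of audio
trimmed = keep_last_n_seconds(long)  # only the final 8 seconds
```

Padding at the front rather than the back keeps the end of the utterance aligned with the end of the window, which is presumably what an end-of-turn model cares about most.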
--- a/src/pipecat/extensions/voicemail/voicemail_detector.py
+++ b/src/pipecat/extensions/voicemail/voicemail_detector.py
@@ -21,7 +21,6 @@ from typing import List, Optional
 from loguru import logger

 from pipecat.frames.frames import (
-    BotInterruptionFrame,
    EndFrame,
    Frame,
    LLMFullResponseEndFrame,
@@ -360,7 +359,7 @@ class ClassificationProcessor(FrameProcessor):
            await self._voicemail_notifier.notify()  # Clear buffered TTS frames

            # Interrupt the current pipeline to stop any ongoing processing
-            await self.push_frame(BotInterruptionFrame(), FrameDirection.UPSTREAM)
+            await self.push_interruption_task_frame_and_wait()

            # Set the voicemail event to trigger the voicemail handler
            self._voicemail_event.clear()
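Note on the hunk above: per its docstring in this diff, `push_interruption_task_frame_and_wait()` sends a request upstream and returns only after the resulting interruption has come back through the processor. The request-then-wait shape can be sketched with a plain `asyncio.Event`. Every name below (`SketchProcessor`, `sketch_task`, the string request) is a hypothetical stand-in, not pipecat's implementation.

```python
import asyncio


class SketchProcessor:
    """Hypothetical stand-in for a frame processor (not the pipecat class)."""

    def __init__(self):
        self._wait_for_interruption = False
        self._interruption_event = asyncio.Event()
        self.log = []

    async def request_interruption_and_wait(self, task_inbox: asyncio.Queue):
        # Flag that we are waiting, send the request "upstream", then block
        # until the interruption has come back to this processor.
        self._wait_for_interruption = True
        await task_inbox.put("interruption-task")
        await self._interruption_event.wait()
        self._interruption_event.clear()
        self._wait_for_interruption = False
        self.log.append("resumed")

    async def on_interruption(self):
        # Record the interruption first, then release the waiter, so the
        # waiter never resumes before the interruption has been handled.
        self.log.append("interruption pushed")
        self._interruption_event.set()


async def sketch_task(inbox: asyncio.Queue, processor: SketchProcessor):
    # Stand-in for the pipeline task: turn the request into an interruption.
    request = await inbox.get()
    if request == "interruption-task":
        await processor.on_interruption()


async def main():
    processor = SketchProcessor()
    inbox = asyncio.Queue()
    task = asyncio.create_task(sketch_task(inbox, processor))
    await processor.request_interruption_and_wait(inbox)
    await task
    return processor.log


log = asyncio.run(main())
```

The ordering in `log` is the point of the pattern: the caller continues only after the interruption has been handled.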
--- a/src/pipecat/frames/frames.py
+++ b/src/pipecat/frames/frames.py
@@ -788,43 +788,6 @@ class FatalErrorFrame(ErrorFrame):
    fatal: bool = field(default=True, init=False)


-@dataclass
-class EndTaskFrame(SystemFrame):
-    """Frame to request graceful pipeline task closure.
-
-    This is used to notify the pipeline task that the pipeline should be
-    closed nicely (flushing all the queued frames) by pushing an EndFrame
-    downstream. This frame should be pushed upstream.
-    """
-
-    pass
-
-
-@dataclass
-class CancelTaskFrame(SystemFrame):
-    """Frame to request immediate pipeline task cancellation.
-
-    This is used to notify the pipeline task that the pipeline should be
-    stopped immediately by pushing a CancelFrame downstream. This frame
-    should be pushed upstream.
-    """
-
-    pass
-
-
-@dataclass
-class StopTaskFrame(SystemFrame):
-    """Frame to request pipeline task stop while keeping processors running.
-
-    This is used to notify the pipeline task that it should be stopped as
-    soon as possible (flushing all the queued frames) but that the pipeline
-    processors should be kept in a running state. This frame should be pushed
-    upstream.
-    """
-
-    pass
-
-
@dataclass
 class FrameProcessorPauseUrgentFrame(SystemFrame):
    """Frame to pause frame processing immediately.
@@ -857,7 +820,7 @@ class FrameProcessorResumeUrgentFrame(SystemFrame):


@dataclass
-class StartInterruptionFrame(SystemFrame):
+class InterruptionFrame(SystemFrame):
    """Frame indicating user started speaking (interruption detected).

    Emitted by the BaseInputTransport to indicate that a user has started
@@ -869,6 +832,34 @@ class StartInterruptionFrame(SystemFrame):
    pass


+@dataclass
+class StartInterruptionFrame(InterruptionFrame):
+    """Frame indicating user started speaking (interruption detected).
+
+    .. deprecated:: 0.0.85
+        This frame is deprecated and will be removed in a future version.
+        Instead, use `InterruptionFrame`.
+
+    Emitted by the BaseInputTransport to indicate that a user has started
+    speaking (i.e. is interrupting). This is similar to
+    UserStartedSpeakingFrame except that it should be pushed concurrently
+    with other frames (so the order is not guaranteed).
+    """
+
+    def __post_init__(self):
+        super().__post_init__()
+        import warnings
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "StartInterruptionFrame is deprecated and will be removed in a future version. "
+                "Instead, use InterruptionFrame.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
+
@dataclass
 class UserStartedSpeakingFrame(SystemFrame):
    """Frame indicating user has started speaking.
@@ -944,20 +935,6 @@ class VADUserStoppedSpeakingFrame(SystemFrame):
    pass


-@dataclass
-class BotInterruptionFrame(SystemFrame):
-    """Frame indicating the bot should be interrupted.
-
-    Emitted when the bot should be interrupted. This will mainly cause the
-    same actions as if the user interrupted except that the
-    UserStartedSpeakingFrame and UserStoppedSpeakingFrame won't be generated.
-    This frame should be pushed upstreams. It results in the BaseInputTransport
-    starting an interruption by pushing a StartInterruptionFrame downstream.
-    """
-
-    pass
-
-
@dataclass
 class BotStartedSpeakingFrame(SystemFrame):
    """Frame indicating the bot started speaking.
@@ -1253,23 +1230,6 @@ class UserImageRawFrame(InputImageRawFrame):
        return f"{self.name}(pts: {pts}, user: {self.user_id}, source: {self.transport_source}, size: {self.size}, format: {self.format}, request: {self.request})"


-@dataclass
-class VisionImageRawFrame(InputImageRawFrame):
-    """Image frame for vision/image analysis with associated text prompt.
-
-    An image with an associated text to ask for a description of it.
-
-    Parameters:
-        text: Optional text prompt describing what to analyze in the image.
-    """
-
-    text: Optional[str] = None
-
-    def __str__(self):
-        pts = format_pts(self.pts)
-        return f"{self.name}(pts: {pts}, text: [{self.text}], size: {self.size}, format: {self.format})"
-
-
@dataclass
 class InputDTMFFrame(DTMFFrame, SystemFrame):
    """DTMF keypress input frame from transport."""
@@ -1306,6 +1266,103 @@ class SpeechControlParamsFrame(SystemFrame):
    turn_params: Optional[SmartTurnParams] = None


+#
+# Task frames
+#
+
+
+@dataclass
+class TaskFrame(SystemFrame):
+    """Base frame for task frames.
+
+    This is a base class for frames that are meant to be sent and handled
+    upstream by the pipeline task. This might result in a corresponding frame
+    sent downstream (e.g. `InterruptionTaskFrame` / `InterruptionFrame` or
+    `EndTaskFrame` / `EndFrame`).
+
+    """
+
+    pass
+
+
+@dataclass
+class EndTaskFrame(TaskFrame):
+    """Frame to request graceful pipeline task closure.
+
+    This is used to notify the pipeline task that the pipeline should be
+    closed nicely (flushing all the queued frames) by pushing an EndFrame
+    downstream. This frame should be pushed upstream.
+    """
+
+    pass
+
+
+@dataclass
+class CancelTaskFrame(TaskFrame):
+    """Frame to request immediate pipeline task cancellation.
+
+    This is used to notify the pipeline task that the pipeline should be
+    stopped immediately by pushing a CancelFrame downstream. This frame
+    should be pushed upstream.
+    """
+
+    pass
+
+
+@dataclass
+class StopTaskFrame(TaskFrame):
+    """Frame to request pipeline task stop while keeping processors running.
+
+    This is used to notify the pipeline task that it should be stopped as
+    soon as possible (flushing all the queued frames) but that the pipeline
+    processors should be kept in a running state. This frame should be pushed
+    upstream.
+    """
+
+    pass
+
+
+@dataclass
+class InterruptionTaskFrame(TaskFrame):
+    """Frame indicating the bot should be interrupted.
+
+    Emitted when the bot should be interrupted. This will mainly cause the
+    same actions as if the user interrupted except that the
+    UserStartedSpeakingFrame and UserStoppedSpeakingFrame won't be generated.
+    This frame should be pushed upstream.
+    """
+
+    pass
+
+
+@dataclass
+class BotInterruptionFrame(InterruptionTaskFrame):
+    """Frame indicating the bot should be interrupted.
+
+    .. deprecated:: 0.0.85
+        This frame is deprecated and will be removed in a future version.
+        Instead, use `InterruptionTaskFrame`.
+
+    Emitted when the bot should be interrupted. This will mainly cause the
+    same actions as if the user interrupted except that the
+    UserStartedSpeakingFrame and UserStoppedSpeakingFrame won't be generated.
+    This frame should be pushed upstream.
+    """
+
+    def __post_init__(self):
+        super().__post_init__()
+        import warnings
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "BotInterruptionFrame is deprecated and will be removed in a future version. "
+                "Instead, use InterruptionTaskFrame.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
+
 #
 # Control frames
 #
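Note on the hunks above: `StartInterruptionFrame` and `BotInterruptionFrame` are kept as deprecated subclasses of their replacements, so existing `isinstance` checks against the new names still match old instances while construction emits a `DeprecationWarning`. A minimal sketch of that pattern follows, using hypothetical stand-in classes rather than the real frames.

```python
import warnings
from dataclasses import dataclass


@dataclass
class SketchInterruptionFrame:
    """Hypothetical stand-in for the new frame name."""


@dataclass
class SketchStartInterruptionFrame(SketchInterruptionFrame):
    """Hypothetical stand-in for the deprecated name, kept as a subclass."""

    def __post_init__(self):
        # stacklevel=2 attributes the warning to the code that called this
        # method rather than to this line.
        warnings.warn(
            "SketchStartInterruptionFrame is deprecated, "
            "use SketchInterruptionFrame instead.",
            DeprecationWarning,
            stacklevel=2,
        )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old = SketchStartInterruptionFrame()
    new = SketchInterruptionFrame()

# Code that checks for the new type still accepts the deprecated subclass.
handled_old = isinstance(old, SketchInterruptionFrame)
deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
```

Only constructing the deprecated class warns; constructing the new class stays silent.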
--- a/src/pipecat/observers/loggers/debug_log_observer.py
+++ b/src/pipecat/observers/loggers/debug_log_observer.py
@@ -54,7 +54,7 @@ class DebugLogObserver(BaseObserver):

        Log frames with specific source/destination filters::

-            from pipecat.frames.frames import StartInterruptionFrame, UserStartedSpeakingFrame, LLMTextFrame
+            from pipecat.frames.frames import InterruptionFrame, UserStartedSpeakingFrame, LLMTextFrame
            from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
            from pipecat.transports.base_output import BaseOutputTransport
            from pipecat.services.stt_service import STTService
@@ -62,8 +62,8 @@ class DebugLogObserver(BaseObserver):
            observers=[
                DebugLogObserver(
                    frame_types={
-                        # Only log StartInterruptionFrame when source is BaseOutputTransport
-                        StartInterruptionFrame: (BaseOutputTransport, FrameEndpoint.SOURCE),
+                        # Only log InterruptionFrame when source is BaseOutputTransport
+                        InterruptionFrame: (BaseOutputTransport, FrameEndpoint.SOURCE),
                        # Only log UserStartedSpeakingFrame when destination is STTService
                        UserStartedSpeakingFrame: (STTService, FrameEndpoint.DESTINATION),
                        # Log LLMTextFrame regardless of source or destination type
--- a/src/pipecat/pipeline/task.py
+++ b/src/pipecat/pipeline/task.py
@@ -32,6 +32,8 @@ from pipecat.frames.frames import (
    Frame,
    HeartbeatFrame,
    InputAudioRawFrame,
+    InterruptionFrame,
+    InterruptionTaskFrame,
    MetricsFrame,
    StartFrame,
    StopFrame,
@@ -113,9 +115,28 @@ class PipelineTask(BasePipelineTask):
    - on_frame_reached_downstream: Called when downstream frames reach the sink
    - on_idle_timeout: Called when pipeline is idle beyond timeout threshold
    - on_pipeline_started: Called when pipeline starts with StartFrame
-    - on_pipeline_stopped: Called when pipeline stops with StopFrame
-    - on_pipeline_ended: Called when pipeline ends with EndFrame
-    - on_pipeline_cancelled: Called when pipeline is cancelled
+    - on_pipeline_stopped: [deprecated] Called when pipeline stops with StopFrame
+
+            .. deprecated:: 0.0.86
+                Use `on_pipeline_finished` instead.
+
+    - on_pipeline_ended: [deprecated] Called when pipeline ends with EndFrame
+
+            .. deprecated:: 0.0.86
+                Use `on_pipeline_finished` instead.
+
+    - on_pipeline_cancelled: [deprecated] Called when pipeline is cancelled with CancelFrame
+
+            .. deprecated:: 0.0.86
+                Use `on_pipeline_finished` instead.
+
+    - on_pipeline_finished: Called after the pipeline has reached any terminal state.
+          This includes:
+              - StopFrame: pipeline was stopped (processors keep connections open)
+              - EndFrame: pipeline ended normally
+              - CancelFrame: pipeline was cancelled
+          Use this event for cleanup, logging, or post-processing tasks. Users can inspect
+          the frame if they need to handle specific cases.

    Example::

@@ -126,6 +147,10 @@ class PipelineTask(BasePipelineTask):
        @task.event_handler("on_idle_timeout")
        async def on_pipeline_idle_timeout(task):
            ...
+
+        @task.event_handler("on_pipeline_finished")
+        async def on_pipeline_finished(task, frame):
+            ...
    """

    def __init__(
@@ -262,6 +287,7 @@ class PipelineTask(BasePipelineTask):
        self._register_event_handler("on_pipeline_stopped")
        self._register_event_handler("on_pipeline_ended")
        self._register_event_handler("on_pipeline_cancelled")
+        self._register_event_handler("on_pipeline_finished")

    @property
    def params(self) -> PipelineParams:
@@ -290,6 +316,27 @@ class PipelineTask(BasePipelineTask):
        """
        return self._turn_trace_observer

+    def event_handler(self, event_name: str):
+        """Decorator for registering event handlers.
+
+        Args:
+            event_name: The name of the event to handle.
+
+        Returns:
+            The decorator function that registers the handler.
+        """
+        if event_name in ["on_pipeline_stopped", "on_pipeline_ended", "on_pipeline_cancelled"]:
+            import warnings
+
+            with warnings.catch_warnings():
+                warnings.simplefilter("always")
+                warnings.warn(
+                    f"Event '{event_name}' is deprecated, use 'on_pipeline_finished' instead.",
+                    DeprecationWarning,
+                )
+
+        return super().event_handler(event_name)
+
    def add_observer(self, observer: BaseObserver):
        """Add an observer to monitor pipeline execution.

@@ -532,6 +579,7 @@ class PipelineTask(BasePipelineTask):
                )
            finally:
                await self._call_event_handler("on_pipeline_cancelled", frame)
+                await self._call_event_handler("on_pipeline_finished", frame)

        logger.debug(f"{self}: Closing. Waiting for {frame} to reach the end of the pipeline...")

@@ -627,13 +675,23 @@ class PipelineTask(BasePipelineTask):

        if isinstance(frame, EndTaskFrame):
            # Tell the task we should end nicely.
+            logger.debug(f"{self}: received end task frame {frame}")
            await self.queue_frame(EndFrame())
        elif isinstance(frame, CancelTaskFrame):
            # Tell the task we should end right away.
+            logger.debug(f"{self}: received cancel task frame {frame}")
            await self.queue_frame(CancelFrame())
        elif isinstance(frame, StopTaskFrame):
            # Tell the task we should stop nicely.
+            logger.debug(f"{self}: received stop task frame {frame}")
            await self.queue_frame(StopFrame())
+        elif isinstance(frame, InterruptionTaskFrame):
+            # Tell the task we should interrupt the pipeline. Note that we
+            # bypass the push queue and queue directly into the pipeline, in
+            # case the push task is blocked waiting for a pipeline-ending
+            # frame to finish traversing the pipeline.
+            logger.debug(f"{self}: received interruption task frame {frame}")
+            await self._pipeline.queue_frame(InterruptionFrame())
        elif isinstance(frame, ErrorFrame):
            if frame.fatal:
                logger.error(f"A fatal error occurred: {frame}")
@@ -642,7 +700,7 @@ class PipelineTask(BasePipelineTask):
                # Tell the task we should stop.
                await self.queue_frame(StopTaskFrame())
            else:
-                logger.warning(f"Something went wrong: {frame}")
+                logger.warning(f"{self}: Something went wrong: {frame}")

    async def _sink_push_frame(self, frame: Frame, direction: FrameDirection):
        """Process frames coming downstream from the pipeline.
@@ -669,9 +727,11 @@ class PipelineTask(BasePipelineTask):
            self._pipeline_start_event.set()
        elif isinstance(frame, EndFrame):
            await self._call_event_handler("on_pipeline_ended", frame)
+            await self._call_event_handler("on_pipeline_finished", frame)
            self._pipeline_end_event.set()
        elif isinstance(frame, StopFrame):
            await self._call_event_handler("on_pipeline_stopped", frame)
+            await self._call_event_handler("on_pipeline_finished", frame)
            self._pipeline_end_event.set()
        elif isinstance(frame, CancelFrame):
            self._pipeline_end_event.set()
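Note on the hunks above: `on_pipeline_finished` fires for `EndFrame`, `StopFrame` and `CancelFrame` alike, and the handler receives the frame so it can tell the cases apart. A rough sketch of such a handler follows. The frame and task classes are hypothetical stand-ins defined here so the example runs without pipecat; in real code the handler would be registered with `@task.event_handler("on_pipeline_finished")`, as in the docstring example.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchEndFrame:
    """Stand-in for EndFrame (hypothetical, not imported from pipecat)."""


@dataclass
class SketchStopFrame:
    """Stand-in for StopFrame (hypothetical)."""


@dataclass
class SketchCancelFrame:
    """Stand-in for CancelFrame (hypothetical)."""


@dataclass
class SketchTask:
    """Stand-in for the pipeline task; only collects outcomes."""

    outcomes: List[str] = field(default_factory=list)


async def on_pipeline_finished(task, frame):
    # One handler for every terminal state; branch on the frame type only
    # where the cases need different treatment.
    if isinstance(frame, SketchCancelFrame):
        outcome = "cancelled"
    elif isinstance(frame, SketchStopFrame):
        outcome = "stopped"
    else:
        outcome = "ended"
    task.outcomes.append(outcome)


async def main():
    task = SketchTask()
    for frame in (SketchEndFrame(), SketchStopFrame(), SketchCancelFrame()):
        await on_pipeline_finished(task, frame)
    return task.outcomes


outcomes = asyncio.run(main())
```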
--- a/src/pipecat/processors/aggregators/dtmf_aggregator.py
+++ b/src/pipecat/processors/aggregators/dtmf_aggregator.py
@@ -16,7 +16,6 @@ from typing import Optional

 from pipecat.audio.dtmf.types import KeypadEntry
 from pipecat.frames.frames import (
-    BotInterruptionFrame,
    CancelFrame,
    EndFrame,
    Frame,
@@ -24,7 +23,7 @@ from pipecat.frames.frames import (
    StartFrame,
    TranscriptionFrame,
 )
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor, FrameProcessorSetup
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.utils.time import time_now_iso8601


@@ -105,7 +104,7 @@ class DTMFAggregator(FrameProcessor):

        # For first digit, schedule interruption.
        if is_first_digit:
-            await self.push_frame(BotInterruptionFrame(), FrameDirection.UPSTREAM)
+            await self.push_interruption_task_frame_and_wait()

        # Check for immediate flush conditions
        if frame.button == self._termination_digit:
--- a/src/pipecat/processors/aggregators/llm_response.py
+++ b/src/pipecat/processors/aggregators/llm_response.py
@@ -22,7 +22,6 @@ from pipecat.audio.interruptions.base_interruption_strategy import BaseInterrupt
 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import (
-    BotInterruptionFrame,
    BotStartedSpeakingFrame,
    BotStoppedSpeakingFrame,
    CancelFrame,
@@ -36,6 +35,7 @@ from pipecat.frames.frames import (
    FunctionCallsStartedFrame,
    InputAudioRawFrame,
    InterimTranscriptionFrame,
+    InterruptionFrame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
    LLMMessagesAppendFrame,
@@ -48,7 +48,6 @@ from pipecat.frames.frames import (
    OpenAILLMContextAssistantTimestampFrame,
    SpeechControlParamsFrame,
    StartFrame,
-    StartInterruptionFrame,
    TextFrame,
    TranscriptionFrame,
    UserImageRawFrame,
@@ -138,7 +137,7 @@ class LLMFullResponseAggregator(FrameProcessor):
        """
        await super().process_frame(frame, direction)

-        if isinstance(frame, StartInterruptionFrame):
+        if isinstance(frame, InterruptionFrame):
            await self._call_event_handler("on_completion", self._aggregation, False)
            self._aggregation = ""
            self._started = False
@@ -532,9 +531,9 @@ class LLMUserContextAggregator(LLMContextResponseAggregator):

                if should_interrupt:
                    logger.debug(
-                        "Interruption conditions met - pushing BotInterruptionFrame and aggregation"
+                        "Interruption conditions met - pushing interruption and aggregation"
                    )
-                    await self.push_frame(BotInterruptionFrame(), FrameDirection.UPSTREAM)
+                    await self.push_interruption_task_frame_and_wait()
                    await self._process_aggregation()
                else:
                    logger.debug("Interruption conditions not met - not pushing aggregation")
@@ -838,7 +837,7 @@ class LLMAssistantContextAggregator(LLMContextResponseAggregator):
        """
        await super().process_frame(frame, direction)

-        if isinstance(frame, StartInterruptionFrame):
+        if isinstance(frame, InterruptionFrame):
            await self._handle_interruptions(frame)
            await self.push_frame(frame, direction)
        elif isinstance(frame, LLMFullResponseStartFrame):
@@ -904,7 +903,7 @@ class LLMAssistantContextAggregator(LLMContextResponseAggregator):
        if frame.run_llm:
            await self.push_context_frame(FrameDirection.UPSTREAM)

-    async def _handle_interruptions(self, frame: StartInterruptionFrame):
+    async def _handle_interruptions(self, frame: InterruptionFrame):
        await self.push_aggregation()
        self._started = 0
        await self.reset()
--- a/src/pipecat/processors/aggregators/llm_response_universal.py
+++ b/src/pipecat/processors/aggregators/llm_response_universal.py
@@ -13,7 +13,6 @@ LLM processing, and text-to-speech components in conversational AI pipelines.

 import asyncio
 import json
-from dataclasses import dataclass
 from typing import Any, Dict, List, Literal, Optional, Set

 from loguru import logger
@@ -23,7 +22,6 @@ from pipecat.audio.interruptions.base_interruption_strategy import BaseInterrupt
 from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import (
-    BotInterruptionFrame,
    BotStartedSpeakingFrame,
    BotStoppedSpeakingFrame,
    CancelFrame,
@@ -37,6 +35,7 @@ from pipecat.frames.frames import (
    FunctionCallsStartedFrame,
    InputAudioRawFrame,
    InterimTranscriptionFrame,
+    InterruptionFrame,
    LLMContextAssistantTimestampFrame,
    LLMContextFrame,
    LLMFullResponseEndFrame,
@@ -48,7 +47,6 @@ from pipecat.frames.frames import (
    LLMSetToolsFrame,
    SpeechControlParamsFrame,
    StartFrame,
-    StartInterruptionFrame,
    TextFrame,
    TranscriptionFrame,
    UserImageRawFrame,
@@ -311,9 +309,9 @@ class LLMUserAggregator(LLMContextAggregator):

                if should_interrupt:
                    logger.debug(
-                        "Interruption conditions met - pushing BotInterruptionFrame and aggregation"
+                        "Interruption conditions met - pushing interruption and aggregation"
                    )
-                    await self.push_frame(BotInterruptionFrame(), FrameDirection.UPSTREAM)
+                    await self.push_interruption_task_frame_and_wait()
                    await self._process_aggregation()
                else:
                    logger.debug("Interruption conditions not met - not pushing aggregation")
@@ -579,7 +577,7 @@ class LLMAssistantAggregator(LLMContextAggregator):
        """
        await super().process_frame(frame, direction)

-        if isinstance(frame, StartInterruptionFrame):
+        if isinstance(frame, InterruptionFrame):
            await self._handle_interruptions(frame)
            await self.push_frame(frame, direction)
        elif isinstance(frame, LLMFullResponseStartFrame):
@@ -645,7 +643,7 @@ class LLMAssistantAggregator(LLMContextAggregator):
        if frame.run_llm:
            await self.push_context_frame(FrameDirection.UPSTREAM)

-    async def _handle_interruptions(self, frame: StartInterruptionFrame):
+    async def _handle_interruptions(self, frame: InterruptionFrame):
        await self._push_aggregation()
        self._started = 0
        await self.reset()
--- a/src/pipecat/processors/aggregators/vision_image_frame.py
+++ b/src/pipecat/processors/aggregators/vision_image_frame.py
@@ -10,13 +10,22 @@ This module provides frame aggregation functionality to combine text and image
 frames into vision frames for multimodal processing.
 """

-from pipecat.frames.frames import Frame, InputImageRawFrame, TextFrame, VisionImageRawFrame
+from pipecat.frames.frames import Frame, InputImageRawFrame, TextFrame
+from pipecat.processors.aggregators.openai_llm_context import (
+    OpenAILLMContext,
+    OpenAILLMContextFrame,
+)
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


 class VisionImageFrameAggregator(FrameProcessor):
    """Aggregates consecutive text and image frames into vision frames.

+    .. deprecated:: 0.0.85
+        VisionImageRawFrame has been removed in favor of context frames
+        (LLMContextFrame or OpenAILLMContextFrame), so this aggregator is not
+        needed anymore. See the 12* examples for the new recommended pattern.
+
    This aggregator waits for a consecutive TextFrame and an InputImageRawFrame.
    After the InputImageRawFrame arrives it will output a VisionImageRawFrame
    combining both the text and image data for multimodal processing.
@@ -28,6 +37,17 @@ class VisionImageFrameAggregator(FrameProcessor):
        The aggregator starts with no cached text, waiting for the first
        TextFrame to arrive before it can create vision frames.
        """
+        import warnings
+
+        warnings.warn(
+            "VisionImageFrameAggregator is deprecated. "
+            "VisionImageRawFrame has been removed in favor of context frames "
+            "(LLMContextFrame or OpenAILLMContextFrame), so this aggregator is "
+            "not needed anymore. See the 12* examples for the new recommended "
+            "pattern.",
+            DeprecationWarning,
+            stacklevel=2,
+        )
        super().__init__()
        self._describe_text = None

@@ -47,12 +67,14 @@ class VisionImageFrameAggregator(FrameProcessor):
            self._describe_text = frame.text
        elif isinstance(frame, InputImageRawFrame):
            if self._describe_text:
-                frame = VisionImageRawFrame(
+                context = OpenAILLMContext()
+                context.add_image_frame_message(
                    text=self._describe_text,
                    image=frame.image,
                    size=frame.size,
                    format=frame.format,
                )
+                frame = OpenAILLMContextFrame(context)
                await self.push_frame(frame)
                self._describe_text = None
        else:
--- a/src/pipecat/processors/audio/audio_buffer_processor.py
+++ b/src/pipecat/processors/audio/audio_buffer_processor.py
@@ -137,12 +137,12 @@ class AudioBufferProcessor(FrameProcessor):
        return self._num_channels

    def has_audio(self) -> bool:
-        """Check if both user and bot audio buffers contain data.
+        """Check if either user or bot audio buffers contain data.

        Returns:
-            True if both buffers contain audio data.
+            True if either buffer contains audio data.
        """
-        return self._buffer_has_audio(self._user_audio_buffer) and self._buffer_has_audio(
+        return self._buffer_has_audio(self._user_audio_buffer) or self._buffer_has_audio(
            self._bot_audio_buffer
        )

--- a/src/pipecat/processors/filters/stt_mute_filter.py
+++ b/src/pipecat/processors/filters/stt_mute_filter.py
@@ -25,8 +25,8 @@ from pipecat.frames.frames import (
    FunctionCallResultFrame,
    InputAudioRawFrame,
    InterimTranscriptionFrame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
    STTMuteFrame,
    TranscriptionFrame,
    UserStartedSpeakingFrame,
@@ -204,7 +204,7 @@ class STTMuteFilter(FrameProcessor):
        if isinstance(
            frame,
            (
-                StartInterruptionFrame,
+                InterruptionFrame,
                VADUserStartedSpeakingFrame,
                VADUserStoppedSpeakingFrame,
                UserStartedSpeakingFrame,
--- a/src/pipecat/processors/frame_processor.py
+++ b/src/pipecat/processors/frame_processor.py
@@ -28,8 +28,9 @@ from pipecat.frames.frames import (
    FrameProcessorPauseUrgentFrame,
    FrameProcessorResumeFrame,
    FrameProcessorResumeUrgentFrame,
+    InterruptionFrame,
+    InterruptionTaskFrame,
    StartFrame,
-    StartInterruptionFrame,
    SystemFrame,
 )
 from pipecat.metrics.metrics import LLMTokenUsage, MetricsData
@@ -219,6 +220,9 @@ class FrameProcessor(BaseObject):
        self.__process_event: Optional[asyncio.Event] = None
        self.__process_frame_task: Optional[asyncio.Task] = None

+        self._wait_for_interruption = False
+        self._wait_interruption_event = asyncio.Event()
+
    @property
    def id(self) -> int:
        """Get the unique identifier for this processor.
@@ -542,6 +546,14 @@ class FrameProcessor(BaseObject):
        if self._cancelling:
            return

+        # If we are waiting for an interruption, bypass all queued system
+        # frames and process the `InterruptionFrame` right away. A previously
+        # queued system frame might be the one waiting for the interruption,
+        # and it would otherwise block the input task.
+        if self._wait_for_interruption and isinstance(frame, InterruptionFrame):
+            await self.__process_frame(frame, direction, callback)
+            return
+
        if self._enable_direct_mode:
            await self.__process_frame(frame, direction, callback)
        else:
@@ -588,7 +600,7 @@ class FrameProcessor(BaseObject):

        if isinstance(frame, StartFrame):
            await self.__start(frame)
-        elif isinstance(frame, StartInterruptionFrame):
+        elif isinstance(frame, InterruptionFrame):
            await self._start_interruption()
            await self.stop_all_metrics()
        elif isinstance(frame, CancelFrame):
@@ -620,6 +632,32 @@ class FrameProcessor(BaseObject):

        await self.__internal_push_frame(frame, direction)

+        if isinstance(frame, InterruptionFrame):
+            self._wait_interruption_event.set()
+
+    async def push_interruption_task_frame_and_wait(self):
+        """Push an interruption task frame upstream and wait for the interruption.
+
+        This function sends an `InterruptionTaskFrame` upstream to the pipeline
+        task and waits to receive the corresponding `InterruptionFrame`. When
+        the function finishes it is guaranteed that the `InterruptionFrame` has
+        been pushed downstream.
+        """
+        self._wait_for_interruption = True
+
+        await self.push_frame(InterruptionTaskFrame(), FrameDirection.UPSTREAM)
+
+        # Wait for an `InterruptionFrame` to come to this processor and be
+        # pushed. Take a look at `push_frame()` to see how we first push the
+        # `InterruptionFrame` and then we set the event in order to maintain
+        # frame ordering.
+        await self._wait_interruption_event.wait()
+
+        # Clean the event.
+        self._wait_interruption_event.clear()
+
+        self._wait_for_interruption = False
+
    async def __start(self, frame: StartFrame):
        """Handle the start frame to initialize processor state.

@@ -669,20 +707,22 @@ class FrameProcessor(BaseObject):
    async def _start_interruption(self):
        """Start handling an interruption by cancelling current tasks."""
        try:
-            # Cancel the process task. This will stop processing queued frames.
-            await self.__cancel_process_task()
+            if self._wait_for_interruption:
+                # If we get here we know the process task was just waiting for
+                # an interruption (push_interruption_task_frame_and_wait()), so
+                # we can't cancel the task because it might still need to do
+                # more things (e.g. pushing a frame after the
+                # interruption). Instead we just drain the queue because this is
+                # an interruption.
+                self.__reset_process_task()
+            else:
+                # Cancel and re-create the process task including the queue.
+                await self.__cancel_process_task()
+                self.__create_process_task()
        except Exception as e:
            logger.exception(f"Uncaught exception in {self} when handling _start_interruption: {e}")
            await self.push_error(ErrorFrame(str(e)))

-        # Create a new process queue and task.
-        self.__create_process_task()
-
-    async def _stop_interruption(self):
-        """Stop handling an interruption."""
-        # Nothing to do right now.
-        pass
-
    async def __internal_push_frame(self, frame: Frame, direction: FrameDirection):
        """Internal method to push frames to adjacent processors.

@@ -764,6 +804,17 @@ class FrameProcessor(BaseObject):
            self.__process_queue = asyncio.Queue()
            self.__process_frame_task = self.create_task(self.__process_frame_task_handler())

+    def __reset_process_task(self):
+        """Reset non-system frame processing task."""
+        if self._enable_direct_mode:
+            return
+
+        self.__should_block_frames = False
+        self.__process_event = asyncio.Event()
+        while not self.__process_queue.empty():
+            self.__process_queue.get_nowait()
+            self.__process_queue.task_done()
+
    async def __cancel_process_task(self):
        """Cancel the non-system frame processing task."""
        if self.__process_frame_task:
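
Note on the `frame_processor.py` changes above: `push_interruption_task_frame_and_wait()` sends a request upstream, then blocks on an `asyncio.Event` that `push_frame()` sets only after the `InterruptionFrame` has been pushed. The sketch below shows that ordering in isolation. `MiniProcessor`, its string "frames" and `demo()` are hypothetical stand-ins for illustration, not Pipecat classes or API.

```python
import asyncio


class MiniProcessor:
    """Hypothetical stand-in for a frame processor (not the Pipecat class)."""

    def __init__(self):
        self.waiting = False
        self.event = asyncio.Event()
        self.pushed = []  # frames pushed downstream, in order

    async def push_frame(self, frame: str):
        # Push first, then signal, so the waiter only resumes once the
        # interruption is already downstream.
        self.pushed.append(frame)
        if frame == "interruption":
            self.event.set()

    async def request_interruption_and_wait(self, request):
        self.waiting = True
        await request()          # ask "upstream" for an interruption
        await self.event.wait()  # block until the interruption was pushed
        self.event.clear()       # make the event reusable for the next wait
        self.waiting = False


async def demo():
    proc = MiniProcessor()
    tasks = []  # keep a reference so the background task is not collected

    async def answer_later():
        # Simulates the pipeline task answering the request a bit later.
        await asyncio.sleep(0.01)
        await proc.push_frame("interruption")

    async def request():
        tasks.append(asyncio.create_task(answer_later()))

    await proc.request_interruption_and_wait(request)
    await proc.push_frame("after-interruption")
    return proc.pushed
```

Because the event is set after the push, anything the caller pushes once the wait returns lands behind the interruption, which appears to be the ordering guarantee the docstring describes.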
--- a/src/pipecat/processors/frameworks/rtvi.py
+++ b/src/pipecat/processors/frameworks/rtvi.py
@@ -30,7 +30,6 @@ from loguru import logger
 from pydantic import BaseModel, Field, PrivateAttr, ValidationError

 from pipecat.frames.frames import (
-    BotInterruptionFrame,
    BotStartedSpeakingFrame,
    BotStoppedSpeakingFrame,
    CancelFrame,
@@ -1206,7 +1205,7 @@ class RTVIProcessor(FrameProcessor):

    async def interrupt_bot(self):
        """Send a bot interruption frame upstream."""
-        await self.push_frame(BotInterruptionFrame(), FrameDirection.UPSTREAM)
+        await self.push_interruption_task_frame_and_wait()

    async def send_server_message(self, data: Any):
        """Send a server message to the client."""
--- a/src/pipecat/processors/transcript_processor.py
+++ b/src/pipecat/processors/transcript_processor.py
@@ -19,7 +19,7 @@ from pipecat.frames.frames import (
    CancelFrame,
    EndFrame,
    Frame,
-    StartInterruptionFrame,
+    InterruptionFrame,
    TranscriptionFrame,
    TranscriptionMessage,
    TranscriptionUpdateFrame,
@@ -86,7 +86,7 @@ class AssistantTranscriptProcessor(BaseTranscriptProcessor):
    transcript messages. Utterances are completed when:

    - The bot stops speaking (BotStoppedSpeakingFrame)
-    - The bot is interrupted (StartInterruptionFrame)
+    - The bot is interrupted (InterruptionFrame)
    - The pipeline ends (EndFrame)
    """

@@ -185,7 +185,7 @@ class AssistantTranscriptProcessor(BaseTranscriptProcessor):

        - TTSTextFrame: Aggregates text for current utterance
        - BotStoppedSpeakingFrame: Completes current utterance
-        - StartInterruptionFrame: Completes current utterance due to interruption
+        - InterruptionFrame: Completes current utterance due to interruption
        - EndFrame: Completes current utterance at pipeline end
        - CancelFrame: Completes current utterance due to cancellation

@@ -195,7 +195,7 @@ class AssistantTranscriptProcessor(BaseTranscriptProcessor):
        """
        await super().process_frame(frame, direction)

-        if isinstance(frame, (StartInterruptionFrame, CancelFrame)):
+        if isinstance(frame, (InterruptionFrame, CancelFrame)):
            # Push frame first otherwise our emitted transcription update frame
            # might get cleaned up.
            await self.push_frame(frame, direction)
--- a/src/pipecat/runner/types.py
+++ b/src/pipecat/runner/types.py
@@ -51,9 +51,11 @@ class WebSocketRunnerArguments(RunnerArguments):

    Parameters:
        websocket: WebSocket connection for audio streaming
+        body: Additional request data
    """

    websocket: WebSocket
+    body: Optional[Any] = field(default_factory=dict)


@dataclass
--- a/src/pipecat/runner/utils.py
+++ b/src/pipecat/runner/utils.py
@@ -99,16 +99,35 @@ async def parse_telephony_websocket(websocket: WebSocket):
        tuple: (transport_type: str, call_data: dict)

        call_data contains provider-specific fields:
-        - Twilio: {"stream_id": str, "call_id": str}
-        - Telnyx: {"stream_id": str, "call_control_id": str, "outbound_encoding": str}
-        - Plivo: {"stream_id": str, "call_id": str}
-        - Exotel: {"stream_id": str, "call_id": str, "account_sid": str}
+        - Twilio: {
+            "stream_id": str,
+            "call_id": str,
+            "body": dict,
+        }
+        - Telnyx: {
+            "stream_id": str,
+            "call_control_id": str,
+            "outbound_encoding": str,
+            "from": str,
+            "to": str,
+        }
+        - Plivo: {
+            "stream_id": str,
+            "call_id": str,
+        }
+        - Exotel: {
+            "stream_id": str,
+            "call_id": str,
+            "account_sid": str,
+            "from": str,
+            "to": str,
+        }

    Example usage::

        transport_type, call_data = await parse_telephony_websocket(websocket)
-        if transport_type == "telnyx":
-            outbound_encoding = call_data["outbound_encoding"]
+        if transport_type == "twilio":
+            user_id = call_data["body"]["user_id"]
    """
    # Read first two messages
    start_data = websocket.iter_text()
@@ -151,9 +170,12 @@ async def parse_telephony_websocket(websocket: WebSocket):
        # Extract provider-specific data
        if transport_type == "twilio":
            start_data = call_data_raw.get("start", {})
+            body_data = start_data.get("customParameters", {})
            call_data = {
                "stream_id": start_data.get("streamSid"),
                "call_id": start_data.get("callSid"),
+                # All custom parameters
+                "body": body_data,
            }

        elif transport_type == "telnyx":
@@ -163,6 +185,8 @@ async def parse_telephony_websocket(websocket: WebSocket):
                "outbound_encoding": call_data_raw.get("start", {})
                .get("media_format", {})
                .get("encoding"),
+                "from": call_data_raw.get("start", {}).get("from", ""),
+                "to": call_data_raw.get("start", {}).get("to", ""),
            }

        elif transport_type == "plivo":
@@ -178,6 +202,8 @@ async def parse_telephony_websocket(websocket: WebSocket):
                "stream_id": start_data.get("stream_sid"),
                "call_id": start_data.get("call_sid"),
                "account_sid": start_data.get("account_sid"),
+                "from": start_data.get("from", ""),
+                "to": start_data.get("to", ""),
            }

        else:
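
Note on the `runner/utils.py` changes above: for Twilio, the new `body` field carries whatever arrives under `customParameters` in the stream's `start` message. A minimal sketch of that extraction follows, assuming a message shaped like the one the diff reads. `extract_twilio_call_data` and the sample values are made up for illustration.

```python
from typing import Any


def extract_twilio_call_data(message: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper mirroring the Twilio branch shown in the diff."""
    start = message.get("start", {})
    return {
        "stream_id": start.get("streamSid"),
        "call_id": start.get("callSid"),
        # All custom parameters; an empty dict when none were sent.
        "body": start.get("customParameters", {}),
    }


# Sample message with made-up identifiers.
sample = {
    "event": "start",
    "start": {
        "streamSid": "MZ-example",
        "callSid": "CA-example",
        "customParameters": {"user_id": "user-123"},
    },
}

call_data = extract_twilio_call_data(sample)
# Using .get() avoids a KeyError when the caller did not set the parameter.
user_id = call_data["body"].get("user_id")
```

The docstring example indexes `call_data["body"]["user_id"]` directly, which raises `KeyError` if the parameter was not sent. `.get()` is the more defensive choice when the parameter is optional.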
--- a/src/pipecat/serializers/exotel.py
+++ b/src/pipecat/serializers/exotel.py
@@ -20,8 +20,8 @@ from pipecat.frames.frames import (
    Frame,
    InputAudioRawFrame,
    InputDTMFFrame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
    TransportMessageFrame,
    TransportMessageUrgentFrame,
 )
@@ -98,7 +98,7 @@ class ExotelFrameSerializer(FrameSerializer):
        Returns:
            Serialized data as string or bytes, or None if the frame isn't handled.
        """
-        if isinstance(frame, StartInterruptionFrame):
+        if isinstance(frame, InterruptionFrame):
            answer = {"event": "clear", "streamSid": self._stream_sid}
            return json.dumps(answer)
        elif isinstance(frame, AudioRawFrame):
--- a/src/pipecat/serializers/plivo.py
+++ b/src/pipecat/serializers/plivo.py
@@ -22,8 +22,8 @@ from pipecat.frames.frames import (
    Frame,
    InputAudioRawFrame,
    InputDTMFFrame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
    TransportMessageFrame,
    TransportMessageUrgentFrame,
 )
@@ -122,7 +122,7 @@ class PlivoFrameSerializer(FrameSerializer):
            self._hangup_attempted = True
            await self._hang_up_call()
            return None
-        elif isinstance(frame, StartInterruptionFrame):
+        elif isinstance(frame, InterruptionFrame):
            answer = {"event": "clearAudio", "streamId": self._stream_id}
            return json.dumps(answer)
        elif isinstance(frame, AudioRawFrame):
--- a/src/pipecat/serializers/telnyx.py
+++ b/src/pipecat/serializers/telnyx.py
@@ -29,8 +29,8 @@ from pipecat.frames.frames import (
    Frame,
    InputAudioRawFrame,
    InputDTMFFrame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
 )
 from pipecat.serializers.base_serializer import FrameSerializer, FrameSerializerType

@@ -137,7 +137,7 @@ class TelnyxFrameSerializer(FrameSerializer):
            self._hangup_attempted = True
            await self._hang_up_call()
            return None
-        elif isinstance(frame, StartInterruptionFrame):
+        elif isinstance(frame, InterruptionFrame):
            answer = {"event": "clear"}
            return json.dumps(answer)
        elif isinstance(frame, AudioRawFrame):
--- a/src/pipecat/serializers/twilio.py
+++ b/src/pipecat/serializers/twilio.py
@@ -22,8 +22,8 @@ from pipecat.frames.frames import (
    Frame,
    InputAudioRawFrame,
    InputDTMFFrame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
    TransportMessageFrame,
    TransportMessageUrgentFrame,
 )
@@ -122,7 +122,7 @@ class TwilioFrameSerializer(FrameSerializer):
            self._hangup_attempted = True
            await self._hang_up_call()
            return None
-        elif isinstance(frame, StartInterruptionFrame):
+        elif isinstance(frame, InterruptionFrame):
            answer = {"event": "clear", "streamSid": self._stream_sid}
            return json.dumps(answer)
        elif isinstance(frame, AudioRawFrame):
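
Note on the four serializer changes above: each one now reacts to `InterruptionFrame` by emitting a provider-specific "clear" message. The payloads in the sketch below are copied from the diff. The `clear_message` helper is a hypothetical consolidation for comparison only, since the real serializers are separate classes.

```python
import json
from typing import Optional


def clear_message(provider: str, stream_id: Optional[str] = None) -> str:
    """Hypothetical helper returning the JSON each serializer sends on interruption."""
    if provider in ("twilio", "exotel"):
        payload = {"event": "clear", "streamSid": stream_id}
    elif provider == "plivo":
        payload = {"event": "clearAudio", "streamId": stream_id}
    elif provider == "telnyx":
        # The Telnyx serializer in the diff sends no stream identifier.
        payload = {"event": "clear"}
    else:
        raise ValueError(f"unknown provider: {provider}")
    return json.dumps(payload)
```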
--- a/src/pipecat/services/anthropic/llm.py
+++ b/src/pipecat/services/anthropic/llm.py
@@ -42,7 +42,6 @@ from pipecat.frames.frames import (
    LLMTextFrame,
    LLMUpdateSettingsFrame,
    UserImageRawFrame,
-    VisionImageRawFrame,
 )
 from pipecat.metrics.metrics import LLMTokenUsage
 from pipecat.processors.aggregators.llm_context import LLMContext
@@ -495,12 +494,6 @@ class AnthropicLLMService(LLMService):
            context = frame.context
        elif isinstance(frame, LLMMessagesFrame):
            context = AnthropicLLMContext.from_messages(frame.messages)
-        elif isinstance(frame, VisionImageRawFrame):
-            # This is only useful in very simple pipelines because it creates
-            # a new context. Generally we want a context manager to catch
-            # UserImageRawFrames coming through the pipeline and add them
-            # to the context.
-            context = AnthropicLLMContext.from_image_frame(frame)
        elif isinstance(frame, LLMUpdateSettingsFrame):
            await self._update_settings(frame.settings)
        elif isinstance(frame, LLMEnablePromptCachingFrame):
@@ -626,22 +619,6 @@ class AnthropicLLMContext(OpenAILLMContext):
        self._restructure_from_openai_messages()
        return self

-    @classmethod
-    def from_image_frame(cls, frame: VisionImageRawFrame) -> "AnthropicLLMContext":
-        """Create context from a vision image frame.
-
-        Args:
-            frame: The vision image frame to process.
-
-        Returns:
-            New Anthropic context with the image message.
-        """
-        context = cls()
-        context.add_image_frame_message(
-            format=frame.format, size=frame.size, image=frame.image, text=frame.text
-        )
-        return context
-
    def set_messages(self, messages: List):
        """Set the messages list and reset cache tracking.

--- a/src/pipecat/services/asyncai/tts.py
+++ b/src/pipecat/services/asyncai/tts.py
@@ -20,8 +20,8 @@ from pipecat.frames.frames import (
    EndFrame,
    ErrorFrame,
    Frame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
    TTSAudioRawFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
@@ -275,7 +275,7 @@ class AsyncAITTSService(InterruptibleTTSService):
            direction: The direction to push the frame.
        """
        await super().push_frame(frame, direction)
-        if isinstance(frame, (TTSStoppedFrame, StartInterruptionFrame)):
+        if isinstance(frame, (TTSStoppedFrame, InterruptionFrame)):
            self._started = False

    async def _receive_messages(self):
--- a/src/pipecat/services/aws/llm.py
+++ b/src/pipecat/services/aws/llm.py
@@ -25,7 +25,10 @@ from loguru import logger
 from PIL import Image
 from pydantic import BaseModel, Field

-from pipecat.adapters.services.bedrock_adapter import AWSBedrockLLMAdapter
+from pipecat.adapters.services.bedrock_adapter import (
+    AWSBedrockLLMAdapter,
+    AWSBedrockLLMInvocationParams,
+)
 from pipecat.frames.frames import (
    Frame,
    FunctionCallCancelFrame,
@@ -39,7 +42,6 @@ from pipecat.frames.frames import (
    LLMTextFrame,
    LLMUpdateSettingsFrame,
    UserImageRawFrame,
-    VisionImageRawFrame,
 )
 from pipecat.metrics.metrics import LLMTokenUsage
 from pipecat.processors.aggregators.llm_context import LLMContext
@@ -180,22 +182,6 @@ class AWSBedrockLLMContext(OpenAILLMContext):
        self._restructure_from_openai_messages()
        return self

-    @classmethod
-    def from_image_frame(cls, frame: VisionImageRawFrame) -> "AWSBedrockLLMContext":
-        """Create AWS Bedrock context from vision image frame.
-
-        Args:
-            frame: The vision image frame to convert.
-
-        Returns:
-            New AWS Bedrock LLM context instance.
-        """
-        context = cls()
-        context.add_image_frame_message(
-            format=frame.format, size=frame.size, image=frame.image, text=frame.text
-        )
-        return context
-
    def set_messages(self, messages: List):
        """Set the messages list and restructure for Bedrock format.

@@ -399,9 +385,33 @@ class AWSBedrockLLMContext(OpenAILLMContext):
        elif isinstance(content, list):
            new_content = []
            for item in content:
+                # fix empty text
                if item.get("type", "") == "text":
                    text_content = item["text"] if item["text"] != "" else "(empty)"
                    new_content.append({"text": text_content})
+                # handle image_url -> image conversion
+                elif item.get("type", "") == "image_url":
+                    new_item = {
+                        "image": {
+                            "format": "jpeg",
+                            "source": {
+                                "bytes": base64.b64decode(item["image_url"]["url"].split(",")[1])
+                            },
+                        }
+                    }
+                    new_content.append(new_item)
+            # In the case where there's a single image in the list (like what
+            # would result from a UserImageRawFrame), ensure that the image
+            # comes before text
+            image_indices = [i for i, item in enumerate(new_content) if "image" in item]
+            text_indices = [i for i, item in enumerate(new_content) if "text" in item]
+            if len(image_indices) == 1 and text_indices:
+                img_idx = image_indices[0]
+                first_txt_idx = text_indices[0]
+                if img_idx > first_txt_idx:
+                    # Move image before the first text
+                    image_item = new_content.pop(img_idx)
+                    new_content.insert(first_txt_idx, image_item)
            return {"role": message["role"], "content": new_content}

        return message
@@ -569,7 +579,7 @@ class AWSBedrockLLMContext(OpenAILLMContext):
                if isinstance(msg["content"], list):
                    for item in msg["content"]:
                        if item.get("image"):
-                            item["source"]["bytes"] = "..."
+                            item["image"]["source"]["bytes"] = "..."
            msgs.append(msg)
        return msgs

@@ -801,64 +811,55 @@ class AWSBedrockLLMService(LLMService):
        Returns:
            The LLM's response as a string, or None if no response is generated.
        """
-        try:
-            messages = []
-            system = []
-            if isinstance(context, LLMContext):
-                # Future code will be something like this:
-                # adapter = self.get_llm_adapter()
-                # params: AWSBedrockLLMInvocationParams = adapter.get_llm_invocation_params(context)
-                # messages = params["messages"]
-                # system = params["system_instruction"] # [{"text": "system message"}]
-                raise NotImplementedError(
-                    "Universal LLMContext is not yet supported for AWS Bedrock."
-                )
-            else:
-                context = AWSBedrockLLMContext.upgrade_to_bedrock(context)
-                messages = context.messages
-                system = getattr(context, "system", None)  # [{"text": "system message"}]
+        messages = []
+        system = []
+        if isinstance(context, LLMContext):
+            adapter: AWSBedrockLLMAdapter = self.get_llm_adapter()
+            params: AWSBedrockLLMInvocationParams = adapter.get_llm_invocation_params(context)
+            messages = params["messages"]
+            system = params["system"]  # [{"text": "system message"}]
+        else:
+            context = AWSBedrockLLMContext.upgrade_to_bedrock(context)
+            messages = context.messages
+            system = getattr(context, "system", None)  # [{"text": "system message"}]

-            # Determine if we're using Claude or Nova based on model ID
-            model_id = self.model_name
+        # Determine if we're using Claude or Nova based on model ID
+        model_id = self.model_name

-            # Prepare request parameters
-            request_params = {
-                "modelId": model_id,
-                "messages": messages,
-                "inferenceConfig": {
-                    "maxTokens": 8192,
-                    "temperature": 0.7,
-                    "topP": 0.9,
-                },
-            }
+        # Prepare request parameters
+        request_params = {
+            "modelId": model_id,
+            "messages": messages,
+            "inferenceConfig": {
+                "maxTokens": 8192,
+                "temperature": 0.7,
+                "topP": 0.9,
+            },
+        }

-            if system:
-                request_params["system"] = system
+        if system:
+            request_params["system"] = system

-            async with self._aws_session.client(
-                service_name="bedrock-runtime", **self._aws_params
-            ) as client:
-                # Call Bedrock without streaming
-                response = await client.converse(**request_params)
+        async with self._aws_session.client(
+            service_name="bedrock-runtime", **self._aws_params
+        ) as client:
+            # Call Bedrock without streaming
+            response = await client.converse(**request_params)

-                # Extract the response text
-                if (
-                    "output" in response
-                    and "message" in response["output"]
-                    and "content" in response["output"]["message"]
-                ):
-                    content = response["output"]["message"]["content"]
-                    if isinstance(content, list):
-                        for item in content:
-                            if item.get("text"):
-                                return item["text"]
-                    elif isinstance(content, str):
-                        return content
+            # Extract the response text
+            if (
+                "output" in response
+                and "message" in response["output"]
+                and "content" in response["output"]["message"]
+            ):
+                content = response["output"]["message"]["content"]
+                if isinstance(content, list):
+                    for item in content:
+                        if item.get("text"):
+                            return item["text"]
+                elif isinstance(content, str):
+                    return content

-                return None
-
-        except Exception as e:
-            logger.error(f"Bedrock summary generation failed: {e}", exc_info=True)
            return None

    async def _create_converse_stream(self, client, request_params):
@@ -933,8 +934,25 @@ class AWSBedrockLLMService(LLMService):
            }
        }

+    def _get_llm_invocation_params(
+        self, context: OpenAILLMContext | LLMContext
+    ) -> AWSBedrockLLMInvocationParams:
+        # Universal LLMContext
+        if isinstance(context, LLMContext):
+            adapter: AWSBedrockLLMAdapter = self.get_llm_adapter()
+            params = adapter.get_llm_invocation_params(context)
+            return params
+
+        # AWS Bedrock-specific context
+        return AWSBedrockLLMInvocationParams(
+            system=getattr(context, "system", None),
+            messages=context.messages,
+            tools=context.tools or [],
+            tool_choice=context.tool_choice,
+        )
+
    @traced_llm
-    async def _process_context(self, context: AWSBedrockLLMContext):
+    async def _process_context(self, context: AWSBedrockLLMContext | LLMContext):
        # Usage tracking
        prompt_tokens = 0
        completion_tokens = 0
@@ -951,6 +969,12 @@ class AWSBedrockLLMService(LLMService):

            await self.start_ttfb_metrics()

+            params_from_context = self._get_llm_invocation_params(context)
+            messages = params_from_context["messages"]
+            system = params_from_context["system"]
+            tools = params_from_context["tools"]
+            tool_choice = params_from_context["tool_choice"]
+
            # Set up inference config
            inference_config = {
                "maxTokens": self._settings["max_tokens"],
@@ -961,17 +985,18 @@ class AWSBedrockLLMService(LLMService):
            # Prepare request parameters
            request_params = {
                "modelId": self.model_name,
-                "messages": context.messages,
+                "messages": messages,
                "inferenceConfig": inference_config,
                "additionalModelRequestFields": self._settings["additional_model_request_fields"],
            }

            # Add system message
-            request_params["system"] = context.system
+            if system:
+                request_params["system"] = system

            # Check if messages contain tool use or tool result content blocks
            has_tool_content = False
-            for message in context.messages:
+            for message in messages:
                if isinstance(message.get("content"), list):
                    for content_item in message["content"]:
                        if "toolUse" in content_item or "toolResult" in content_item:
@@ -981,7 +1006,6 @@ class AWSBedrockLLMService(LLMService):
                    break

            # Handle tools: use current tools, or no-op if tool content exists but no current tools
-            tools = context.tools or []
            if has_tool_content and not tools:
                tools = [self._create_no_op_tool()]
                using_noop_tool = True
@@ -990,17 +1014,15 @@ class AWSBedrockLLMService(LLMService):
                tool_config = {"tools": tools}

                # Only add tool_choice if we have real tools (not just no-op)
-                if not using_noop_tool and context.tool_choice:
-                    if context.tool_choice == "auto":
+                if not using_noop_tool and tool_choice:
+                    if tool_choice == "auto":
                        tool_config["toolChoice"] = {"auto": {}}
-                    elif context.tool_choice == "none":
+                    elif tool_choice == "none":
                        # Skip adding toolChoice for "none"
                        pass
-                    elif (
-                        isinstance(context.tool_choice, dict) and "function" in context.tool_choice
-                    ):
+                    elif isinstance(tool_choice, dict) and "function" in tool_choice:
                        tool_config["toolChoice"] = {
-                            "tool": {"name": context.tool_choice["function"]["name"]}
+                            "tool": {"name": tool_choice["function"]["name"]}
                        }

                request_params["toolConfig"] = tool_config
@@ -1009,7 +1031,17 @@ class AWSBedrockLLMService(LLMService):
            if self._settings["latency"] in ["standard", "optimized"]:
                request_params["performanceConfig"] = {"latency": self._settings["latency"]}

-            logger.debug(f"Calling AWS Bedrock model with: {request_params}")
+            # Log the system prompt and the messages, with messages redacted for logging
+            if isinstance(context, LLMContext):
+                adapter = self.get_llm_adapter()
+                context_type_for_logging = "universal"
+                messages_for_logging = adapter.get_messages_for_logging(context)
+            else:
+                context_type_for_logging = "LLM-specific"
+                messages_for_logging = context.get_messages_for_logging()
+            logger.debug(
+                f"{self}: Generating chat from {context_type_for_logging} context [{system}] | {messages_for_logging}"
+            )

            async with self._aws_session.client(
                service_name="bedrock-runtime", **self._aws_params
@@ -1117,15 +1149,9 @@ class AWSBedrockLLMService(LLMService):
        if isinstance(frame, OpenAILLMContextFrame):
            context = AWSBedrockLLMContext.upgrade_to_bedrock(frame.context)
        if isinstance(frame, LLMContextFrame):
-            raise NotImplementedError("Universal LLMContext is not yet supported for AWS Bedrock.")
+            context = frame.context
        elif isinstance(frame, LLMMessagesFrame):
            context = AWSBedrockLLMContext.from_messages(frame.messages)
-        elif isinstance(frame, VisionImageRawFrame):
-            # This is only useful in very simple pipelines because it creates
-            # a new context. Generally we want a context manager to catch
-            # UserImageRawFrames coming through the pipeline and add them
-            # to the context.
-            context = AWSBedrockLLMContext.from_image_frame(frame)
        elif isinstance(frame, LLMUpdateSettingsFrame):
            await self._update_settings(frame.settings)
        else:
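
Note on the `aws/llm.py` changes above: the `AWSBedrockLLMContext` hunk converts OpenAI-style `image_url` items into Bedrock `image` blocks and, when there is exactly one image, moves it ahead of the first text block. The sketch below is a standalone version of that logic, assuming the URL is a base64 data URL and the image is JPEG, as the diff does. `to_bedrock_content` is a hypothetical name.

```python
import base64


def to_bedrock_content(content: list[dict]) -> list[dict]:
    """Hypothetical standalone version of the conversion shown in the diff."""
    converted = []
    for item in content:
        kind = item.get("type", "")
        if kind == "text":
            # Empty text is replaced with a placeholder, as in the diff.
            text = item["text"] if item["text"] != "" else "(empty)"
            converted.append({"text": text})
        elif kind == "image_url":
            # Assumes a data URL of the form "data:image/jpeg;base64,<payload>".
            payload = item["image_url"]["url"].split(",", 1)[1]
            converted.append(
                {"image": {"format": "jpeg", "source": {"bytes": base64.b64decode(payload)}}}
            )

    # With exactly one image, make sure it comes before the first text block.
    image_indices = [i for i, c in enumerate(converted) if "image" in c]
    text_indices = [i for i, c in enumerate(converted) if "text" in c]
    if len(image_indices) == 1 and text_indices and image_indices[0] > text_indices[0]:
        # Popping an index after the first text leaves that text index valid.
        converted.insert(text_indices[0], converted.pop(image_indices[0]))
    return converted
```

The reorder is guarded so that nothing is moved when the image already precedes the text.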
--- a/src/pipecat/services/aws_nova_sonic/aws.py
+++ b/src/pipecat/services/aws_nova_sonic/aws.py
@@ -247,13 +247,14 @@ class AWSNovaSonicLLMService(LLMService):
        self._ready_to_send_context = False
        self._handling_bot_stopped_speaking = False
        self._triggering_assistant_response = False
-        self._assistant_response_trigger_audio: Optional[bytes] = (
-            None  # Not cleared on _disconnect()
-        )
        self._disconnecting = False
        self._connected_time: Optional[float] = None
        self._wants_connection = False

+        file_path = files("pipecat.services.aws_nova_sonic").joinpath("ready.wav")
+        with wave.open(file_path.open("rb"), "rb") as wav_file:
+            self._assistant_response_trigger_audio = wav_file.readframes(wav_file.getnframes())
+
    #
    # standard AIService frame handling
    #
@@ -1099,20 +1100,13 @@ class AWSNovaSonicLLMService(LLMService):

        self._triggering_assistant_response = True

-        # Read audio bytes, if we don't already have them cached
-        if not self._assistant_response_trigger_audio:
-            file_path = files("pipecat.services.aws_nova_sonic").joinpath("ready.wav")
-            with wave.open(file_path.open("rb"), "rb") as wav_file:
-                self._assistant_response_trigger_audio = wav_file.readframes(wav_file.getnframes())
-
        # Send the trigger audio, if we're fully connected and set up
-        if self._connected_time is not None:
+        if self._connected_time:
            await self._send_assistant_response_trigger()

    async def _send_assistant_response_trigger(self):
-        if (
-            not self._assistant_response_trigger_audio or self._connected_time is None
-        ):  # should never happen
+        if not self._connected_time:
+            # should never happen
            return

        try:
--- a/src/pipecat/services/aws_nova_sonic/context.py
+++ b/src/pipecat/services/aws_nova_sonic/context.py
@@ -21,13 +21,13 @@ from pipecat.frames.frames import (
    DataFrame,
    Frame,
    FunctionCallResultFrame,
+    InterruptionFrame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
    LLMMessagesAppendFrame,
    LLMMessagesUpdateFrame,
    LLMSetToolChoiceFrame,
    LLMSetToolsFrame,
-    StartInterruptionFrame,
    TextFrame,
    UserImageRawFrame,
 )
@@ -306,7 +306,7 @@ class AWSNovaSonicAssistantContextAggregator(OpenAIAssistantContextAggregator):
        if isinstance(
            frame,
            (
-                StartInterruptionFrame,
+                InterruptionFrame,
                LLMFullResponseStartFrame,
                LLMFullResponseEndFrame,
                TextFrame,
--- a/src/pipecat/services/azure/stt.py
+++ b/src/pipecat/services/azure/stt.py
@@ -19,6 +19,7 @@ from pipecat.frames.frames import (
    CancelFrame,
    EndFrame,
    Frame,
+    InterimTranscriptionFrame,
    StartFrame,
    TranscriptionFrame,
 )
@@ -140,6 +141,7 @@ class AzureSTTService(STTService):
        self._speech_recognizer = SpeechRecognizer(
            speech_config=self._speech_config, audio_config=audio_config
        )
+        self._speech_recognizer.recognizing.connect(self._on_handle_recognizing)
        self._speech_recognizer.recognized.connect(self._on_handle_recognized)
        self._speech_recognizer.start_continuous_recognition_async()

@@ -197,3 +199,15 @@ class AzureSTTService(STTService):
                self._handle_transcription(event.result.text, True, language), self.get_event_loop()
            )
            asyncio.run_coroutine_threadsafe(self.push_frame(frame), self.get_event_loop())
+
+    def _on_handle_recognizing(self, event):
+        if event.result.reason == ResultReason.RecognizingSpeech and len(event.result.text) > 0:
+            language = getattr(event.result, "language", None) or self._settings.get("language")
+            frame = InterimTranscriptionFrame(
+                event.result.text,
+                self._user_id,
+                time_now_iso8601(),
+                language,
+                result=event,
+            )
+            asyncio.run_coroutine_threadsafe(self.push_frame(frame), self.get_event_loop())
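Note on the hand-off above: the recognizer callbacks are invoked from an SDK worker thread rather than from the event loop, which is why the frame is scheduled with `asyncio.run_coroutine_threadsafe`. The following is a minimal standalone sketch of that pattern, not the service code: `push_frame`, `on_recognizing`, and the plain `threading.Thread` standing in for the Azure SDK are all hypothetical. Unlike the handler in the diff, the sketch waits on the returned future so the result is deterministic.

```python
import asyncio
import threading

received = []


async def push_frame(text: str) -> None:
    # Hypothetical stand-in for the processor's push_frame; runs on the loop thread.
    received.append(text)


def on_recognizing(text: str, loop: asyncio.AbstractEventLoop) -> None:
    # Runs on a worker thread, so the coroutine must be scheduled thread-safely.
    if len(text) > 0:
        future = asyncio.run_coroutine_threadsafe(push_frame(text), loop)
        # The diff fires and forgets; waiting here is only for the sketch.
        future.result(timeout=5)


async def main() -> list:
    loop = asyncio.get_running_loop()
    worker = threading.Thread(target=on_recognizing, args=("hello wor", loop))
    worker.start()
    # Join in another thread so the event loop stays free to run push_frame.
    await asyncio.to_thread(worker.join)
    return received


result = asyncio.run(main())
```

Calling `push_frame` directly from the worker thread would bypass the loop entirely, so the scheduling call is the part that matters here.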
--- a/src/pipecat/services/cartesia/tts.py
+++ b/src/pipecat/services/cartesia/tts.py
@@ -20,8 +20,8 @@ from pipecat.frames.frames import (
    EndFrame,
    ErrorFrame,
    Frame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
    TTSAudioRawFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
@@ -371,7 +371,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
            return self._websocket
        raise Exception("Websocket not connected")

-    async def _handle_interruption(self, frame: StartInterruptionFrame, direction: FrameDirection):
+    async def _handle_interruption(self, frame: InterruptionFrame, direction: FrameDirection):
        await super()._handle_interruption(frame, direction)
        await self.stop_all_metrics()
        if self._context_id:
--- a/src/pipecat/services/elevenlabs/tts.py
+++ b/src/pipecat/services/elevenlabs/tts.py
@@ -25,9 +25,9 @@ from pipecat.frames.frames import (
    EndFrame,
    ErrorFrame,
    Frame,
+    InterruptionFrame,
    LLMFullResponseEndFrame,
    StartFrame,
-    StartInterruptionFrame,
    TTSAudioRawFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
@@ -460,7 +460,7 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
            direction: The direction to push the frame.
        """
        await super().push_frame(frame, direction)
-        if isinstance(frame, (TTSStoppedFrame, StartInterruptionFrame)):
+        if isinstance(frame, (TTSStoppedFrame, InterruptionFrame)):
            self._started = False
            if isinstance(frame, TTSStoppedFrame):
                await self.add_word_timestamps([("Reset", 0)])
@@ -549,7 +549,7 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
            return self._websocket
        raise Exception("Websocket not connected")

-    async def _handle_interruption(self, frame: StartInterruptionFrame, direction: FrameDirection):
+    async def _handle_interruption(self, frame: InterruptionFrame, direction: FrameDirection):
        """Handle interruption by closing the current context."""
        await super()._handle_interruption(frame, direction)

@@ -558,7 +558,7 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
            logger.trace(f"Closing context {self._context_id} due to interruption")
            try:
                # ElevenLabs requires that Pipecat manages the contexts and closes them
-                # when they're not longer in use. Since a StartInterruptionFrame is pushed
+                # when they're no longer in use. Since an InterruptionFrame is pushed
                # every time the user speaks, we'll use this as a trigger to close the context
                # and reset the state.
                # Note: We do not need to call remove_audio_context here, as the context is
@@ -856,7 +856,7 @@ class ElevenLabsHttpTTSService(WordTTSService):
            direction: The direction to push the frame.
        """
        await super().push_frame(frame, direction)
-        if isinstance(frame, (StartInterruptionFrame, TTSStoppedFrame)):
+        if isinstance(frame, (InterruptionFrame, TTSStoppedFrame)):
            # Reset timing on interruption or stop
            self._reset_state()

--- a/src/pipecat/services/fish/tts.py
+++ b/src/pipecat/services/fish/tts.py
@@ -21,8 +21,8 @@ from pipecat.frames.frames import (
    EndFrame,
    ErrorFrame,
    Frame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
    TTSAudioRawFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
@@ -259,7 +259,7 @@ class FishAudioTTSService(InterruptibleTTSService):
            return self._websocket
        raise Exception("Websocket not connected")

-    async def _handle_interruption(self, frame: StartInterruptionFrame, direction: FrameDirection):
+    async def _handle_interruption(self, frame: InterruptionFrame, direction: FrameDirection):
        await super()._handle_interruption(frame, direction)
        await self.stop_all_metrics()
        self._request_id = None
--- a/src/pipecat/services/gemini_multimodal_live/gemini.py
+++ b/src/pipecat/services/gemini_multimodal_live/gemini.py
@@ -33,6 +33,7 @@ from pipecat.frames.frames import (
    InputAudioRawFrame,
    InputImageRawFrame,
    InputTextRawFrame,
+    InterruptionFrame,
    LLMContextFrame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
@@ -41,7 +42,6 @@ from pipecat.frames.frames import (
    LLMTextFrame,
    LLMUpdateSettingsFrame,
    StartFrame,
-    StartInterruptionFrame,
    TranscriptionFrame,
    TTSAudioRawFrame,
    TTSStartedFrame,
@@ -752,7 +752,7 @@ class GeminiMultimodalLiveLLMService(LLMService):
        elif isinstance(frame, InputImageRawFrame):
            await self._send_user_video(frame)
            await self.push_frame(frame, direction)
-        elif isinstance(frame, StartInterruptionFrame):
+        elif isinstance(frame, InterruptionFrame):
            await self._handle_interruption()
            await self.push_frame(frame, direction)
        elif isinstance(frame, UserStartedSpeakingFrame):
--- a/src/pipecat/services/google/llm.py
+++ b/src/pipecat/services/google/llm.py
@@ -36,7 +36,6 @@ from pipecat.frames.frames import (
    LLMTextFrame,
    LLMUpdateSettingsFrame,
    UserImageRawFrame,
-    VisionImageRawFrame,
 )
 from pipecat.metrics.metrics import LLMTokenUsage
 from pipecat.processors.aggregators.llm_context import LLMContext
@@ -1013,15 +1012,6 @@ class GoogleLLMService(LLMService):
            # NOTE: LLMMessagesFrame is deprecated, so we don't support the newer universal
            # LLMContext with it
            context = GoogleLLMContext(frame.messages)
-        elif isinstance(frame, VisionImageRawFrame):
-            # This is only useful in very simple pipelines because it creates
-            # a new context. Generally we want a context manager to catch
-            # UserImageRawFrames coming through the pipeline and add them
-            # to the context.
-            context = GoogleLLMContext()
-            context.add_image_frame_message(
-                format=frame.format, size=frame.size, image=frame.image, text=frame.text
-            )
        elif isinstance(frame, LLMUpdateSettingsFrame):
            await self._update_settings(frame.settings)
        else:
--- a/src/pipecat/services/google/tts.py
+++ b/src/pipecat/services/google/tts.py
@@ -500,9 +500,11 @@ class GoogleTTSService(TTSService):

        Parameters:
            language: Language for synthesis. Defaults to English.
+            speaking_rate: The speaking rate, in the range [0.25, 4.0].
        """

        language: Optional[Language] = Language.EN
+        speaking_rate: Optional[float] = None

    def __init__(
        self,
@@ -510,6 +512,7 @@ class GoogleTTSService(TTSService):
        credentials: Optional[str] = None,
        credentials_path: Optional[str] = None,
        voice_id: str = "en-US-Chirp3-HD-Charon",
+        voice_cloning_key: Optional[str] = None,
        sample_rate: Optional[int] = None,
        params: InputParams = InputParams(),
        **kwargs,
@@ -520,6 +523,7 @@ class GoogleTTSService(TTSService):
            credentials: JSON string containing Google Cloud service account credentials.
            credentials_path: Path to Google Cloud service account JSON file.
            voice_id: Google TTS voice identifier (e.g., "en-US-Chirp3-HD-Charon").
+            voice_cloning_key: The voice cloning key for Chirp 3 custom voices.
            sample_rate: Audio sample rate in Hz. If None, uses default.
            params: Language configuration parameters.
            **kwargs: Additional arguments passed to parent TTSService.
@@ -532,8 +536,10 @@ class GoogleTTSService(TTSService):
            "language": self.language_to_service_language(params.language)
            if params.language
            else "en-US",
+            "speaking_rate": params.speaking_rate,
        }
        self.set_voice(voice_id)
+        self._voice_cloning_key = voice_cloning_key
        self._client: texttospeech_v1.TextToSpeechAsyncClient = self._create_client(
            credentials, credentials_path
        )
@@ -600,15 +606,24 @@ class GoogleTTSService(TTSService):
        try:
            await self.start_ttfb_metrics()

-            voice = texttospeech_v1.VoiceSelectionParams(
-                language_code=self._settings["language"], name=self._voice_id
-            )
+            if self._voice_cloning_key:
+                voice_clone_params = texttospeech_v1.VoiceCloneParams(
+                    voice_cloning_key=self._voice_cloning_key
+                )
+                voice = texttospeech_v1.VoiceSelectionParams(
+                    language_code=self._settings["language"], voice_clone=voice_clone_params
+                )
+            else:
+                voice = texttospeech_v1.VoiceSelectionParams(
+                    language_code=self._settings["language"], name=self._voice_id
+                )

            streaming_config = texttospeech_v1.StreamingSynthesizeConfig(
                voice=voice,
                streaming_audio_config=texttospeech_v1.StreamingAudioConfig(
                    audio_encoding=texttospeech_v1.AudioEncoding.PCM,
                    sample_rate_hertz=self.sample_rate,
+                    speaking_rate=self._settings["speaking_rate"],
                ),
            )
            config_request = texttospeech_v1.StreamingSynthesizeRequest(
--- a/src/pipecat/services/heygen/video.py
+++ b/src/pipecat/services/heygen/video.py
@@ -240,6 +240,7 @@ class HeyGenVideoService(AIService):
            # As soon as we receive actual audio, the base output transport will create a
            # BotStartedSpeakingFrame, which we can use as a signal for the TTFB metrics.
            await self.stop_ttfb_metrics()
+            await self.push_frame(frame, direction)
        else:
            await self.push_frame(frame, direction)

--- a/src/pipecat/services/llm_service.py
+++ b/src/pipecat/services/llm_service.py
@@ -36,15 +36,15 @@ from pipecat.frames.frames import (
    FunctionCallResultFrame,
    FunctionCallResultProperties,
    FunctionCallsStartedFrame,
+    InterruptionFrame,
    LLMConfigureOutputFrame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
    LLMTextFrame,
    StartFrame,
-    StartInterruptionFrame,
    UserImageRequestFrame,
 )
-from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext, LLMSpecificMessage
 from pipecat.processors.aggregators.llm_response import (
    LLMAssistantAggregatorParams,
    LLMUserAggregatorParams,
@@ -195,6 +195,17 @@ class LLMService(AIService):
        """
        return self._adapter

+    def create_llm_specific_message(self, message: Any) -> LLMSpecificMessage:
+        """Create an LLM-specific message (as opposed to a standard message) for use in an LLMContext.
+
+        Args:
+            message: The message content.
+
+        Returns:
+            A LLMSpecificMessage instance.
+        """
+        return self.get_llm_adapter().create_llm_specific_message(message)
+
    async def run_inference(self, context: LLMContext | OpenAILLMContext) -> Optional[str]:
        """Run a one-shot, out-of-band (i.e. out-of-pipeline) inference with the given LLM context.

@@ -269,7 +280,7 @@ class LLMService(AIService):
        """
        await super().process_frame(frame, direction)

-        if isinstance(frame, StartInterruptionFrame):
+        if isinstance(frame, InterruptionFrame):
            await self._handle_interruptions(frame)
        elif isinstance(frame, LLMConfigureOutputFrame):
            self._skip_tts = frame.skip_tts
@@ -286,7 +297,7 @@ class LLMService(AIService):

        await super().push_frame(frame, direction)

-    async def _handle_interruptions(self, _: StartInterruptionFrame):
+    async def _handle_interruptions(self, _: InterruptionFrame):
        for function_name, entry in self._functions.items():
            if entry.cancel_on_interruption:
                await self._cancel_function_call(function_name)
--- a/src/pipecat/services/lmnt/tts.py
+++ b/src/pipecat/services/lmnt/tts.py
@@ -16,8 +16,8 @@ from pipecat.frames.frames import (
    EndFrame,
    ErrorFrame,
    Frame,
+    InterruptionFrame,
    StartFrame,
-    StartInterruptionFrame,
    TTSAudioRawFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
@@ -180,7 +180,7 @@ class LmntTTSService(InterruptibleTTSService):
            direction: The direction to push the frame.
        """
        await super().push_frame(frame, direction)
-        if isinstance(frame, (TTSStoppedFrame, StartInterruptionFrame)):
+        if isinstance(frame, (TTSStoppedFrame, InterruptionFrame)):
            self._started = False

    async def _connect(self):
--- a/src/pipecat/services/mistral/llm.py
+++ b/src/pipecat/services/mistral/llm.py
@@ -57,16 +57,18 @@ class MistralLLMService(OpenAILLMService):
        logger.debug(f"Creating Mistral client with api {base_url}")
        return super().create_client(api_key, base_url, **kwargs)

-    def _apply_mistral_assistant_prefix(
+    def _apply_mistral_fixups(
        self, messages: List[ChatCompletionMessageParam]
    ) -> List[ChatCompletionMessageParam]:
-        """Apply Mistral's assistant message prefix requirement.
+        """Apply fixups to messages to meet Mistral-specific requirements.

-        Mistral requires assistant messages to have prefix=True when they
-        are the final message in a conversation. According to Mistral's API:
-        - Assistant messages with prefix=True MUST be the last message
-        - Only add prefix=True to the final assistant message when needed
-        - This allows assistant messages to be accepted as the last message
+        1. A "tool"-role message must be followed by an assistant message.
+
+        2. "system"-role messages must only appear at the start of a
+           conversation.
+
+        3. Assistant messages must have prefix=True when they are the final
+           message in a conversation (but at no other point).

        Args:
            messages: The original list of messages.
@@ -80,6 +82,25 @@ class MistralLLMService(OpenAILLMService):
        # Create a copy to avoid modifying the original
        fixed_messages = [dict(msg) for msg in messages]

+        # Ensure all tool responses are followed by an assistant message
+        assistant_insert_indices = []
+        for i, msg in enumerate(fixed_messages):
+            if msg.get("role") == "tool":
+                # If this is the last message or the next message is not assistant
+                if i == len(fixed_messages) - 1 or fixed_messages[i + 1].get("role") != "assistant":
+                    assistant_insert_indices.append(i + 1)
+        for idx in reversed(assistant_insert_indices):
+            fixed_messages.insert(idx, {"role": "assistant", "content": " "})
+
+        # Convert any "system" messages that aren't at the start (i.e., after the initial contiguous block) to "user"
+        first_non_system_idx = next(
+            (i for i, msg in enumerate(fixed_messages) if msg.get("role") != "system"),
+            len(fixed_messages),
+        )
+        for i, msg in enumerate(fixed_messages):
+            if msg.get("role") == "system" and i >= first_non_system_idx:
+                msg["role"] = "user"
+
        # Get the last message
        last_message = fixed_messages[-1]

@@ -158,7 +179,7 @@ class MistralLLMService(OpenAILLMService):
        - Core completion settings
        """
        # Apply Mistral's assistant prefix requirement for API compatibility
-        fixed_messages = self._apply_mistral_assistant_prefix(params_from_context["messages"])
+        fixed_messages = self._apply_mistral_fixups(params_from_context["messages"])

        params = {
            "model": self.model_name,
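The first two fixups listed in the `_apply_mistral_fixups` docstring can be exercised in isolation. The sketch below mirrors the logic added in this diff as a free function; the name `apply_fixups` and the sample history are hypothetical, and the third fixup (`prefix=True` on a trailing assistant message) is left out because its implementation is not shown in the hunk.

```python
from typing import Any, Dict, List


def apply_fixups(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Copy each message so the caller's list is left untouched.
    fixed = [dict(m) for m in messages]

    # 1. Every "tool" message must be followed by an assistant message.
    insert_at = [
        i + 1
        for i, m in enumerate(fixed)
        if m.get("role") == "tool"
        and (i == len(fixed) - 1 or fixed[i + 1].get("role") != "assistant")
    ]
    # Insert back to front so earlier indices stay valid.
    for idx in reversed(insert_at):
        fixed.insert(idx, {"role": "assistant", "content": " "})

    # 2. "system" messages after the leading system block become "user".
    first_non_system = next(
        (i for i, m in enumerate(fixed) if m.get("role") != "system"),
        len(fixed),
    )
    for i, m in enumerate(fixed):
        if m.get("role") == "system" and i >= first_non_system:
            m["role"] = "user"
    return fixed


history = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Weather?"},
    {"role": "tool", "content": "sunny"},
    {"role": "system", "content": "Wrap up."},
]
fixed = apply_fixups(history)
roles = [m["role"] for m in fixed]
```

Here the trailing tool result gains a placeholder assistant message, and the late system message is downgraded to a user message, while `history` itself is unchanged.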
--- a/src/pipecat/services/moondream/vision.py
+++ b/src/pipecat/services/moondream/vision.py
@@ -11,17 +11,20 @@ for image analysis and description generation.
 """

 import asyncio
-from typing import AsyncGenerator
+import base64
+from io import BytesIO
+from typing import AsyncGenerator, Optional

 from loguru import logger
 from PIL import Image

-from pipecat.frames.frames import ErrorFrame, Frame, TextFrame, VisionImageRawFrame
+from pipecat.frames.frames import ErrorFrame, Frame, TextFrame
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.services.vision_service import VisionService

 try:
    import torch
-    from transformers import AutoModelForCausalLM, AutoTokenizer
+    from transformers import AutoModelForCausalLM
 except ModuleNotFoundError as e:
    logger.error(f"Exception: {e}")
    logger.error("In order to use Moondream, you need to `pip install pipecat-ai[moondream]`.")
@@ -94,11 +97,11 @@ class MoondreamService(VisionService):

        logger.debug("Loaded Moondream model")

-    async def run_vision(self, frame: VisionImageRawFrame) -> AsyncGenerator[Frame, None]:
+    async def run_vision(self, context: LLMContext) -> AsyncGenerator[Frame, None]:
        """Analyze an image and generate a description.

        Args:
-            frame: Vision frame containing the image data and optional question text.
+            context: The context to process, containing image data.

        Yields:
            Frame: TextFrame containing the generated image description, or ErrorFrame
@@ -109,22 +112,45 @@ class MoondreamService(VisionService):
            yield ErrorFrame("Moondream model not available")
            return

-        logger.debug(f"Analyzing image: {frame}")
+        image_bytes = None
+        text = None
+        try:
+            messages = context.get_messages()
+            last_message = messages[-1]
+            last_message_content = last_message.get("content")

-        def get_image_description(frame: VisionImageRawFrame):
-            """Generate description for the given image frame.
+            for item in last_message_content:
+                if isinstance(item, dict):
+                    if (
+                        "image_url" in item
+                        and isinstance(item["image_url"], dict)
+                        and item["image_url"].get("url")
+                    ):
+                        image_bytes = base64.b64decode(item["image_url"]["url"].split(",")[1])
+                    elif "text" in item and isinstance(item["text"], str):
+                        text = item["text"]

-            Args:
-                frame: Vision frame containing image data and question.
+        except Exception as e:
+            logger.error(f"Exception during image extraction: {e}")
+            yield ErrorFrame("Failed to extract image from context")
+            return

-            Returns:
-                str: Generated description of the image.
-            """
-            image = Image.frombytes(frame.format, frame.size, frame.image)
+        if not image_bytes:
+            logger.error("No image found in context")
+            yield ErrorFrame("No image found in context")
+            return
+
+        logger.debug(
+            f"Analyzing image (bytes length: {len(image_bytes) if image_bytes else 'None'})"
+        )
+
+        def get_image_description(image_bytes: bytes, text: Optional[str]) -> str:
+            image_buffer = BytesIO(image_bytes)
+            image = Image.open(image_buffer)
            image_embeds = self._model.encode_image(image)
-            description = self._model.query(image_embeds, frame.text)["answer"]
+            description = self._model.query(image_embeds, text)["answer"]
            return description

-        description = await asyncio.to_thread(get_image_description, frame)
+        description = await asyncio.to_thread(get_image_description, image_bytes, text)

        yield TextFrame(text=description)
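The new `run_vision` reads the image from the last context message instead of from a dedicated frame. A minimal sketch of that extraction step follows, assuming the content is a list of OpenAI-style parts with a base64 data URL; `extract_image_and_text` is a hypothetical helper name, and it uses `str.partition` where the diff uses `split(",")[1]`, so a URL without a comma yields empty bytes instead of raising.

```python
import base64
from typing import Any, List, Optional, Tuple


def extract_image_and_text(content: List[Any]) -> Tuple[Optional[bytes], Optional[str]]:
    image_bytes: Optional[bytes] = None
    text: Optional[str] = None
    for item in content:
        if not isinstance(item, dict):
            continue
        image_url = item.get("image_url")
        if isinstance(image_url, dict) and image_url.get("url"):
            # Data URL layout: "data:<mime>;base64,<payload>".
            _, _, payload = image_url["url"].partition(",")
            image_bytes = base64.b64decode(payload)
        elif isinstance(item.get("text"), str):
            text = item["text"]
    return image_bytes, text


payload = base64.b64encode(b"fake-jpeg-bytes").decode("ascii")
content = [
    {"type": "text", "text": "What is in this image?"},
    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{payload}"}},
]
image_bytes, text = extract_image_and_text(content)
```

The decoded bytes are an encoded image file (JPEG, PNG, and so on), which is why the service now opens them with `Image.open` on a `BytesIO` buffer instead of `Image.frombytes`.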
--- a/src/pipecat/services/neuphonic/tts.py
+++ b/src/pipecat/services/neuphonic/tts.py
@@ -25,9 +25,9 @@ from pipecat.frames.frames import (
    EndFrame,
    ErrorFrame,
    Frame,
+    InterruptionFrame,
    LLMFullResponseEndFrame,
    StartFrame,
-    StartInterruptionFrame,
    TTSAudioRawFrame,
    TTSSpeakFrame,
    TTSStartedFrame,
@@ -224,7 +224,7 @@ class NeuphonicTTSService(InterruptibleTTSService):
            direction: The direction to push the frame.
        """
        await super().push_frame(frame, direction)
-        if isinstance(frame, (TTSStoppedFrame, StartInterruptionFrame)):
+        if isinstance(frame, (TTSStoppedFrame, InterruptionFrame)):
            self._started = False

    async def process_frame(self, frame: Frame, direction: FrameDirection):
--- a/src/pipecat/services/openai/base_llm.py
+++ b/src/pipecat/services/openai/base_llm.py
@@ -32,7 +32,6 @@ from pipecat.frames.frames import (
    LLMMessagesFrame,
    LLMTextFrame,
    LLMUpdateSettingsFrame,
-    VisionImageRawFrame,
 )
 from pipecat.metrics.metrics import LLMTokenUsage
 from pipecat.processors.aggregators.llm_context import LLMContext
@@ -418,8 +417,8 @@ class BaseOpenAILLMService(LLMService):
        """Process frames for LLM completion requests.

        Handles OpenAILLMContextFrame, LLMContextFrame, LLMMessagesFrame,
-        VisionImageRawFrame, and LLMUpdateSettingsFrame to trigger LLM
-        completions and manage settings.
+        and LLMUpdateSettingsFrame to trigger LLM completions and manage
+        settings.

        Args:
            frame: The frame to process.
@@ -438,16 +437,6 @@ class BaseOpenAILLMService(LLMService):
            # NOTE: LLMMessagesFrame is deprecated, so we don't support the newer universal
            # LLMContext with it
            context = OpenAILLMContext.from_messages(frame.messages)
-        elif isinstance(frame, VisionImageRawFrame):
-            # This is only useful in very simple pipelines because it creates
-            # a new context. Generally we want a context manager to catch
-            # UserImageRawFrames coming through the pipeline and add them
-            # to the context.
-            # TODO: support the newer universal LLMContext with a VisionImageRawFrame equivalent?
-            context = OpenAILLMContext()
-            context.add_image_frame_message(
-                format=frame.format, size=frame.size, image=frame.image, text=frame.text
-            )
        elif isinstance(frame, LLMUpdateSettingsFrame):
            await self._update_settings(frame.settings)
        else:
--- a/src/pipecat/services/openai/tts.py
+++ b/src/pipecat/services/openai/tts.py
@@ -64,6 +64,7 @@ class OpenAITTSService(TTSService):
        model: str = "gpt-4o-mini-tts",
        sample_rate: Optional[int] = None,
        instructions: Optional[str] = None,
+        speed: Optional[float] = None,
        **kwargs,
    ):
        """Initialize OpenAI TTS service.
@@ -75,6 +76,7 @@ class OpenAITTSService(TTSService):
            model: TTS model to use. Defaults to "gpt-4o-mini-tts".
            sample_rate: Output audio sample rate in Hz. If None, uses OpenAI's default 24kHz.
            instructions: Optional instructions to guide voice synthesis behavior.
+            speed: Voice speed control (0.25 to 4.0, default 1.0).
            **kwargs: Additional keyword arguments passed to TTSService.
        """
        if sample_rate and sample_rate != self.OPENAI_SAMPLE_RATE:
@@ -84,6 +86,7 @@ class OpenAITTSService(TTSService):
            )
        super().__init__(sample_rate=sample_rate, **kwargs)

+        self._speed = speed
        self.set_model_name(model)
        self.set_voice(voice)
        self._instructions = instructions
@@ -133,17 +136,22 @@ class OpenAITTSService(TTSService):
        try:
            await self.start_ttfb_metrics()

-            # Setup extra body parameters
-            extra_body = {}
+            # Setup API parameters
+            create_params = {
+                "input": text,
+                "model": self.model_name,
+                "voice": VALID_VOICES[self._voice_id],
+                "response_format": "pcm",
+            }
+
            if self._instructions:
-                extra_body["instructions"] = self._instructions
+                create_params["instructions"] = self._instructions
+
+            if self._speed:
+                create_params["speed"] = self._speed

            async with self._client.audio.speech.with_streaming_response.create(
-                input=text,
-                model=self.model_name,
-                voice=VALID_VOICES[self._voice_id],
-                response_format="pcm",
-                extra_body=extra_body,
+                **create_params
            ) as r:
                if r.status_code != 200:
                    error = await r.text()
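The change above replaces `extra_body` with a keyword dictionary that only carries optional fields when they are set. A small sketch of that construction, assuming a hypothetical helper `build_speech_params` (the service builds the dictionary inline); note that the sketch tests `speed is not None`, whereas the diff uses a truthiness check.

```python
from typing import Any, Dict, Optional


def build_speech_params(
    text: str,
    model: str,
    voice: str,
    instructions: Optional[str] = None,
    speed: Optional[float] = None,
) -> Dict[str, Any]:
    # Required fields are always present.
    params: Dict[str, Any] = {
        "input": text,
        "model": model,
        "voice": voice,
        "response_format": "pcm",
    }
    # Optional fields are added only when provided, so API defaults apply otherwise.
    if instructions:
        params["instructions"] = instructions
    if speed is not None:
        params["speed"] = speed
    return params


default_params = build_speech_params("Hello", "gpt-4o-mini-tts", "alloy")
fast_params = build_speech_params("Hello", "gpt-4o-mini-tts", "alloy", speed=1.5)
```

The resulting dictionary is what gets unpacked into the `create(**create_params)` call.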
--- a/src/pipecat/services/openai_agent/README.md
+++ b/src/pipecat/services/openai_agent/README.md
@@ -0,0 +1,209 @@
+# OpenAI Agents SDK Integration
+
+This service integrates the [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/) with Pipecat, enabling powerful agentic workflows with features like:
+
+- **Agent loops** with tool calling and response streaming
+- **Handoffs** between specialized agents  
+- **Guardrails** for input/output validation
+- **Sessions** with automatic conversation history
+- **Built-in tracing** and monitoring
+
+## Installation
+
+Install the OpenAI Agents SDK dependency:
+
+```bash
+pip install "pipecat-ai[openai-agent]"
+# or
+uv add "pipecat-ai[openai-agent]"
+```
+
+## Basic Usage
+
+```python
+from pipecat.services.openai_agent import OpenAIAgentService
+
+# Create a simple agent
+agent_service = OpenAIAgentService(
+    name="Assistant",
+    instructions="You are a helpful assistant.",
+    api_key=os.getenv("OPENAI_API_KEY"),
+    streaming=True,
+)
+
+# Use in a pipeline
+pipeline = Pipeline([
+    transport.input(),
+    stt,
+    agent_service,
+    tts,
+    transport.output(),
+])
+```
+
+## Features
+
+### Tool Integration
+
+```python
+def get_weather(location: str) -> str:
+    """Get weather for a location."""
+    return f"Weather in {location}: sunny, 22°C"
+
+agent_service = OpenAIAgentService(
+    name="Weather Assistant",
+    instructions="Help users with weather information.",
+    tools=[get_weather],
+    api_key=os.getenv("OPENAI_API_KEY"),
+)
+```
+
+### Agent Handoffs
+
+```python
+# Create specialized agents
+weather_agent = OpenAIAgentService(
+    name="Weather Specialist",
+    instructions="Provide weather information and forecasts.",
+    tools=[get_weather, get_forecast],
+)
+
+trivia_agent = OpenAIAgentService(
+    name="Trivia Master", 
+    instructions="Share interesting facts and trivia.",
+    tools=[get_random_fact],
+)
+
+# Create coordinator that can hand off to specialists
+coordinator = OpenAIAgentService(
+    name="Coordinator",
+    instructions="Route users to the right specialist.",
+    handoffs=[weather_agent.agent, trivia_agent.agent],
+)
+```
+
+### Guardrails
+
+```python
+from agents import InputGuardrail, GuardrailFunctionOutput
+
+async def content_filter(ctx, agent, input_data):
+    # Check input for appropriate content
+    if is_inappropriate(input_data):
+        return GuardrailFunctionOutput(
+            tripwire_triggered=True,
+            output_info="Content not allowed"
+        )
+    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)
+
+agent_service = OpenAIAgentService(
+    name="Safe Assistant",
+    instructions="You are a helpful and safe assistant.",
+    input_guardrails=[InputGuardrail(guardrail_function=content_filter)],
+)
+```
+
+### Session Management
+
+```python
+agent_service = OpenAIAgentService(
+    name="Personal Assistant",
+    instructions="Remember user preferences and context.",
+    session_config={
+        "user_id": "user_123",
+        "memory_enabled": True,
+    }
+)
+
+# Update session context dynamically
+agent_service.update_session_context({
+    "user_preferences": {"language": "en", "style": "formal"}
+})
+```
+
+## Configuration Options
+
+### Basic Parameters
+
+- `name`: Agent identifier for handoffs and tracing
+- `instructions`: System prompt defining agent behavior  
+- `api_key`: OpenAI API key (or use `OPENAI_API_KEY` env var)
+- `streaming`: Enable real-time token streaming (default: True)
+
+### Advanced Configuration
+
+- `tools`: List of callable functions for the agent to use
+- `handoffs`: List of other agents this agent can transfer to
+- `input_guardrails`: Input validation and filtering
+- `output_guardrails`: Output validation and filtering  
+- `model_config`: Model settings (model, temperature, etc.)
+- `session_config`: Session and memory configuration
+
+### Model Configuration
+
+```python
+agent_service = OpenAIAgentService(
+    name="Precise Assistant",
+    instructions="Provide accurate, concise responses.",
+    model_config={
+        "model": "gpt-4o",
+        "temperature": 0.1,
+        "max_tokens": 150,
+    }
+)
+```
+
+## Examples
+
+See the foundational examples:
+
+- [`45-openai-agent-basic.py`](../../../../examples/foundational/45-openai-agent-basic.py) - Basic agent with tools
+- [`46-openai-agent-handoffs.py`](../../../../examples/foundational/46-openai-agent-handoffs.py) - Multi-agent system with handoffs
+
+## Methods
+
+### Core Methods
+
+- `update_agent_config()` - Update instructions and model settings
+- `add_tool()` - Add new tools dynamically
+- `add_handoff_agent()` - Add handoff destinations
+- `get_session_context()` - Get current session state
+- `update_session_context()` - Update session variables
+
+### Lifecycle Methods
+
+Inherited from `AIService`:
+- `start()` - Initialize the agent
+- `stop()` - Clean up resources
+- `cancel()` - Cancel ongoing operations
+
+## Integration with Pipecat
+
+The service processes `TextFrame` inputs and generates:
+- `LLMFullResponseStartFrame` - Response beginning
+- `LLMTextFrame` - Streaming text tokens (if streaming enabled)
+- `LLMFullResponseEndFrame` - Response completion
+
+This integrates seamlessly with Pipecat's conversation pipeline and context aggregators.
+
+## Error Handling
+
+The service handles the following failure cases:
+- Missing API keys or SDK installation
+- Agent processing failures
+- Network connectivity issues
+- Malformed tool responses
+
+Failures during agent processing are logged and emitted as `ErrorFrame` objects in the pipeline. A missing SDK raises at import time, and a missing API key only logs a warning.
+
+## Requirements
+
+- OpenAI API key
+- `openai-agents` package
+- Python 3.10+
+
+## Limitations
+
+- Currently supports OpenAI models only (via Agents SDK)
+- Handoffs work within individual requests (no cross-request state)
+- Real-time voice features require additional setup
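Note on the README above: in `OpenAIAgentService.__init__` as added in this change, the `Agent` is built with `model=model_config.get("model", "gpt-4o")` and no other `model_config` key is forwarded. The sketch below mirrors that lookup using only the standard library. `split_model_config` and `DEFAULT_MODEL` are hypothetical names used for illustration, not part of Pipecat or the Agents SDK.

```python
from typing import Any, Dict, Optional, Tuple

DEFAULT_MODEL = "gpt-4o"  # same default the service falls back to


def split_model_config(
    model_config: Optional[Dict[str, Any]],
) -> Tuple[str, Dict[str, Any]]:
    """Return the model name and the keys that would not reach the Agent."""
    config = model_config or {}
    model = config.get("model", DEFAULT_MODEL)
    # Everything except "model" is accepted by the constructor but unused.
    unused = {key: value for key, value in config.items() if key != "model"}
    return model, unused


model, unused = split_model_config(
    {"model": "gpt-4o", "temperature": 0.1, "max_tokens": 150}
)
print(model)
print(unused)
```

A check like this could back a warning when callers pass settings that have no effect.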
--- a/src/pipecat/services/openai_agent/__init__.py
+++ b/src/pipecat/services/openai_agent/__init__.py
@@ -0,0 +1,11 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""OpenAI Agents SDK service for Pipecat integration."""
+
+from .agent_service import OpenAIAgentService
+
+__all__ = ["OpenAIAgentService"]
--- a/src/pipecat/services/openai_agent/agent_service.py
+++ b/src/pipecat/services/openai_agent/agent_service.py
@@ -0,0 +1,567 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""OpenAI Agents SDK integration service.
+
+Provides integration with the OpenAI Agents SDK for building AI applications
+within Pipecat pipelines. This service allows leveraging agent loops, handoffs,
+guardrails, sessions, and tools from the OpenAI Agents SDK.
+"""
+
+import asyncio
+import os
+from dataclasses import dataclass
+from typing import (
+    Any,
+    Awaitable,
+    Callable,
+    Dict,
+    List,
+    Optional,
+    Protocol,
+    Sequence,
+    Union,
+    runtime_checkable,
+)
+
+from loguru import logger
+from typing_extensions import override  # typing.override requires Python 3.12+
+
+try:
+    from agents import Agent, InputGuardrail, OutputGuardrail, Runner, Tool
+    from agents.result import RunResult, RunResultStreaming
+    from agents.stream_events import StreamEvent
+except ImportError as e:
+    logger.error(f"Exception: {e}")
+    logger.error(
+        "In order to use OpenAI Agents SDK, you need to `pip install openai-agents`. "
+        "Also, set `OPENAI_API_KEY` environment variable."
+    )
+    raise Exception(f"Missing module: {e}")
+
+from pipecat.frames.frames import (
+    CancelFrame,
+    EndFrame,
+    ErrorFrame,
+    Frame,
+    LLMFullResponseEndFrame,
+    LLMFullResponseStartFrame,
+    LLMTextFrame,
+    StartFrame,
+    TextFrame,
+    UserImageRawFrame,
+)
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantAggregatorParams,
+    LLMAssistantContextAggregator,
+    LLMUserAggregatorParams,
+    LLMUserContextAggregator,
+)
+from pipecat.processors.aggregators.openai_llm_context import (
+    OpenAILLMContext,
+    OpenAILLMContextFrame,
+)
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.services.ai_service import AIService
+
+
+@runtime_checkable
+class ToolLike(Protocol):
+    """Protocol for tool-like objects."""
+
+    def __call__(self, *args: Any, **kwargs: Any) -> Any:
+        """Tool call interface."""
+        ...
+
+
+@runtime_checkable
+class AgentLike(Protocol):
+    """Protocol for agent-like objects."""
+
+    name: str
+
+    def __call__(self, *args: Any, **kwargs: Any) -> Any:
+        """Agent call interface."""
+        ...
+
+
+@dataclass
+class OpenAIAgentContextAggregatorPair:
+    """Pair of OpenAI Agent context aggregators for user and assistant messages.
+
+    Parameters:
+        _user: User context aggregator for processing user messages.
+        _assistant: Assistant context aggregator for processing assistant messages.
+    """
+
+    _user: "OpenAIAgentUserContextAggregator"
+    _assistant: "OpenAIAgentAssistantContextAggregator"
+
+    def user(self) -> "OpenAIAgentUserContextAggregator":
+        """Get the user context aggregator.
+
+        Returns:
+            The user context aggregator instance.
+        """
+        return self._user
+
+    def assistant(self) -> "OpenAIAgentAssistantContextAggregator":
+        """Get the assistant context aggregator.
+
+        Returns:
+            The assistant context aggregator instance.
+        """
+        return self._assistant
+
+
+class OpenAIAgentService(AIService):
+    """OpenAI Agents SDK service for Pipecat.
+
+    Integrates the OpenAI Agents SDK with Pipecat's pipeline architecture,
+    enabling advanced agentic workflows with features like handoffs, guardrails,
+    sessions, and tools within real-time conversational AI applications.
+
+    The service processes text input frames and generates streaming responses
+    using the agent's configured capabilities.
+    """
+
+    def __init__(
+        self,
+        *,
+        agent: Optional[Agent] = None,
+        name: str = "Assistant",
+        instructions: Union[str, Sequence[str]] = "You are a helpful assistant.",
+        handoffs: Optional[Sequence[AgentLike]] = None,
+        tools: Optional[Sequence[ToolLike]] = None,
+        input_guardrails: Optional[Sequence[InputGuardrail]] = None,
+        output_guardrails: Optional[Sequence[OutputGuardrail]] = None,
+        model_config: Optional[Dict[str, Any]] = None,
+        session_config: Optional[Dict[str, Any]] = None,
+        api_key: Optional[str] = None,
+        streaming: bool = True,
+        **kwargs,
+    ):
+        """Initialize the OpenAI Agent service.
+
+        Args:
+            agent: Pre-configured Agent instance. If provided, other agent configuration
+                parameters will be ignored.
+            name: Name of the agent for identification and handoffs.
+            instructions: System instructions that define the agent's behavior.
+            handoffs: List of other agents this agent can hand off to.
+            tools: List of callable functions the agent can use as tools.
+            input_guardrails: List of input validation guardrails.
+            output_guardrails: List of output validation guardrails.
+            model_config: Configuration for the underlying language model.
+            session_config: Configuration for session management.
+            api_key: OpenAI API key. If not provided, will use OPENAI_API_KEY env var.
+            streaming: Whether to use streaming responses for real-time output.
+            **kwargs: Additional arguments passed to the parent AIService.
+        """
+        super().__init__(**kwargs)
+
+        # Set up API key
+        if api_key:
+            os.environ["OPENAI_API_KEY"] = api_key
+        elif not os.getenv("OPENAI_API_KEY"):
+            logger.warning("No OpenAI API key provided. Set OPENAI_API_KEY environment variable.")
+
+        # Create or use existing agent
+        if agent:
+            self._agent = agent
+        else:
+            # Convert sequences to lists and handle string instructions
+            agent_handoffs: List[Any] = list(handoffs) if handoffs else []
+            agent_tools: List[Any] = list(tools) if tools else []
+            agent_input_guardrails: List[Any] = list(input_guardrails) if input_guardrails else []
+            agent_output_guardrails: List[Any] = (
+                list(output_guardrails) if output_guardrails else []
+            )
+
+            # Handle instructions - convert sequence to string if needed
+            if isinstance(instructions, str):
+                agent_instructions = instructions
+            else:
+                agent_instructions = " ".join(str(instr) for instr in instructions)
+
+            self._agent = Agent(
+                name=name,
+                instructions=agent_instructions,
+                handoffs=agent_handoffs,
+                tools=agent_tools,
+                input_guardrails=agent_input_guardrails,
+                output_guardrails=agent_output_guardrails,
+                model=model_config.get("model", "gpt-4o") if model_config else "gpt-4o",
+            )
+
+        self._streaming = streaming
+        self._session_config = session_config or {}
+        self._current_session = None
+        self._accumulated_text = ""
+
+        # Set model name for metrics
+        if model_config and "model" in model_config:
+            self.set_model_name(model_config["model"])
+        else:
+            self.set_model_name("gpt-4o")  # Default model
+
+        logger.info(f"Initialized OpenAI Agent service: {self._agent.name}")
+
+    @property
+    def agent(self) -> Agent:
+        """Get the underlying OpenAI Agent.
+
+        Returns:
+            The configured Agent instance.
+        """
+        return self._agent
+
+    def create_context_aggregator(
+        self,
+        context: OpenAILLMContext,
+        *,
+        user_params: LLMUserAggregatorParams = LLMUserAggregatorParams(),
+        assistant_params: LLMAssistantAggregatorParams = LLMAssistantAggregatorParams(),
+    ) -> OpenAIAgentContextAggregatorPair:
+        """Create OpenAI-specific context aggregators for agent interactions.
+
+        Creates a pair of context aggregators optimized for OpenAI Agent interactions,
+        including support for function calls, tool usage, and conversation management.
+
+        Args:
+            context: The LLM context to create aggregators for.
+            user_params: Parameters for user message aggregation.
+            assistant_params: Parameters for assistant message aggregation.
+
+        Returns:
+            OpenAIAgentContextAggregatorPair: A pair of context aggregators, one for
+            the user and one for the assistant, encapsulated in an
+            OpenAIAgentContextAggregatorPair.
+        """
+        user = OpenAIAgentUserContextAggregator(context, params=user_params)
+        assistant = OpenAIAgentAssistantContextAggregator(context, params=assistant_params)
+        return OpenAIAgentContextAggregatorPair(_user=user, _assistant=assistant)
+
+    def update_agent_config(
+        self,
+        *,
+        instructions: Optional[str] = None,
+        model_config: Optional[Dict[str, Any]] = None,
+        **kwargs,
+    ) -> None:
+        """Update agent configuration dynamically.
+
+        Args:
+            instructions: New system instructions for the agent.
+            model_config: Updated model configuration.
+            **kwargs: Additional agent configuration parameters.
+        """
+        if instructions:
+            self._agent.instructions = instructions
+            logger.info(f"Updated agent instructions for {self._agent.name}")
+
+        if model_config:
+            # Note: OpenAI Agents SDK handles model configuration during agent creation
+            # We can't update model_config after agent is created, but we can update our model name
+            if "model" in model_config:
+                self.set_model_name(model_config["model"])
+            logger.info(f"Updated model config for {self._agent.name}")
+
+    async def start(self, frame: StartFrame):
+        """Start the OpenAI Agent service.
+
+        Initializes the agent session and prepares for processing.
+
+        Args:
+            frame: The start frame containing initialization parameters.
+        """
+        logger.info(f"Starting OpenAI Agent service: {self._agent.name}")
+        await super().start(frame)
+
+    async def stop(self, frame: EndFrame):
+        """Stop the OpenAI Agent service.
+
+        Cleans up resources and ends the current session.
+
+        Args:
+            frame: The end frame.
+        """
+        logger.info(f"Stopping OpenAI Agent service: {self._agent.name}")
+        await super().stop(frame)
+
+    async def cancel(self, frame: CancelFrame):
+        """Cancel the OpenAI Agent service.
+
+        Cancels any ongoing operations.
+
+        Args:
+            frame: The cancel frame.
+        """
+        logger.info(f"Cancelling OpenAI Agent service: {self._agent.name}")
+        await super().cancel(frame)
+
+    @override
+    async def process_frame(self, frame: Frame, direction: FrameDirection) -> None:
+        """Process frames and handle agent interactions.
+
+        Processes OpenAILLMContextFrame and TextFrame by running them through the OpenAI Agent
+        and streams the results back as LLM frames.
+
+        Args:
+            frame: The frame to process.
+            direction: The direction of frame processing.
+        """
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, OpenAILLMContextFrame):
+            # Process context frame through the agent
+            try:
+                await self.push_frame(LLMFullResponseStartFrame())
+                # Extract the latest user message from the context
+                messages = frame.context.get_messages()
+                if messages:
+                    # Get the last user message
+                    for message in reversed(messages):
+                        if message.get("role") == "user":
+                            content = message.get("content", "")
+                            if isinstance(content, list):
+                                # Extract text from content array
+                                text_parts = []
+                                for part in content:
+                                    if isinstance(part, dict) and part.get("type") == "text":
+                                        text_parts.append(part.get("text", ""))
+                                user_input = " ".join(text_parts)
+                            else:
+                                user_input = str(content)
+
+                            if user_input.strip():
+                                await self._process_agent_request(user_input)
+                                break
+                await self.push_frame(LLMFullResponseEndFrame())
+            except Exception as e:
+                logger.error(f"Error processing agent context: {e}")
+                await self.push_error(ErrorFrame(f"Agent processing error: {e}"))
+        elif isinstance(frame, TextFrame):
+            # Process text input through the agent directly (for backwards compatibility)
+            try:
+                await self.push_frame(LLMFullResponseStartFrame())
+                await self._process_agent_request(frame.text)
+                await self.push_frame(LLMFullResponseEndFrame())
+            except Exception as e:
+                logger.error(f"Error processing agent request: {e}")
+                await self.push_error(ErrorFrame(f"Agent processing error: {e}"))
+        else:
+            # For frames we don't handle, pass them through with direction
+            await self.push_frame(frame, direction)
+
+    async def _process_agent_request(self, input_text: str):
+        """Process an agent request and stream the results.
+
+        Args:
+            input_text: The user input text to process.
+        """
+        logger.debug(f"Processing agent request: {input_text}")
+
+        if self._streaming:
+            await self._process_streaming_response(input_text)
+        else:
+            await self._process_non_streaming_response(input_text)
+
+    async def _process_streaming_response(self, input_text: str):
+        """Process a streaming agent response.
+
+        Args:
+            input_text: The user input text to process.
+        """
+        try:
+            # Run the agent with streaming
+            result: RunResultStreaming = Runner.run_streamed(
+                self._agent, input_text, context=self._session_config
+            )
+
+            has_streaming_deltas = False
+
+            # Process the stream events
+            async for event in result.stream_events():
+                if event.type == "raw_response_event":
+                    # Handle token-by-token streaming
+                    # Only check for delta on events that are known to have it
+                    if hasattr(event.data, "delta") and getattr(event.data, "delta", None):
+                        delta_text = getattr(event.data, "delta", "")
+                        if delta_text:
+                            has_streaming_deltas = True
+                            self._accumulated_text += delta_text
+                            await self.push_frame(LLMTextFrame(text=delta_text))
+
+                elif event.type == "run_item_stream_event":
+                    # Handle completed items
+                    if event.item.type == "message_output_item":
+                        # Only process complete message if we didn't get streaming deltas
+                        if not has_streaming_deltas:
+                            message_text = self._extract_message_text(event.item)
+                            logger.debug(
+                                f"Processing complete message (no deltas): {message_text[:50]}..."
+                                if len(message_text) > 50
+                                else f"Processing complete message: {message_text}"
+                            )
+                            if message_text:
+                                await self.push_frame(LLMTextFrame(text=message_text))
+
+                    elif event.item.type == "tool_call_item":
+                        # Use getattr for safe attribute access
+                        tool_name = getattr(event.item, "tool_name", "unknown")
+                        logger.debug(f"Tool called: {tool_name}")
+
+                    elif event.item.type == "tool_call_output_item":
+                        output = getattr(event.item, "output", "no output")
+                        logger.debug(f"Tool output: {output}")
+
+                elif event.type == "agent_updated_stream_event":
+                    logger.debug(f"Agent updated: {event.new_agent.name}")
+
+            # Reset accumulated text for next request
+            self._accumulated_text = ""
+
+        except Exception as e:
+            logger.error(f"Error in streaming response: {e}")
+            raise
+
+    async def _process_non_streaming_response(self, input_text: str):
+        """Process a non-streaming agent response.
+
+        Args:
+            input_text: The user input text to process.
+        """
+        try:
+            # Run the agent without streaming
+            result: RunResult = await Runner.run(
+                self._agent, input_text, context=self._session_config
+            )
+
+            # Send the final output
+            if result.final_output:
+                await self.push_frame(LLMTextFrame(text=result.final_output))
+
+        except Exception as e:
+            logger.error(f"Error in non-streaming response: {e}")
+            raise
+
+    def _extract_message_text(self, item) -> str:
+        """Extract text from a message output item.
+
+        Args:
+            item: The message output item from the agent.
+
+        Returns:
+            The extracted text content.
+        """
+        try:
+            # Handle OpenAI Agents SDK MessageOutputItem format
+            if hasattr(item, "raw_item") and hasattr(item.raw_item, "content"):
+                content = item.raw_item.content
+                if isinstance(content, list):
+                    text_parts = []
+                    for content_part in content:
+                        if hasattr(content_part, "text"):
+                            text_parts.append(content_part.text)
+                        elif (
+                            isinstance(content_part, dict)
+                            and content_part.get("type") == "output_text"
+                        ):
+                            text_parts.append(content_part.get("text", ""))
+                        elif isinstance(content_part, dict) and content_part.get("type") == "text":
+                            text_parts.append(content_part.get("text", ""))
+                    return "".join(text_parts)
+                elif isinstance(content, str):
+                    return content
+
+            # Handle direct content attribute
+            elif hasattr(item, "content"):
+                if isinstance(item.content, str):
+                    return item.content
+                elif isinstance(item.content, list):
+                    # Extract text from content array
+                    text_parts = []
+                    for content_part in item.content:
+                        if isinstance(content_part, dict) and content_part.get("type") == "text":
+                            text_parts.append(content_part.get("text", ""))
+                        elif isinstance(content_part, str):
+                            text_parts.append(content_part)
+                    return "".join(text_parts)
+
+            # If no text content found, return empty string instead of str(item)
+            logger.debug(f"No extractable text content found in item: {type(item)}")
+            return ""
+
+        except Exception as e:
+            logger.warning(f"Could not extract text from message item: {e}")
+            return ""
+
+    async def add_tool(self, tool_function: ToolLike):
+        """Add a tool function to the agent.
+
+        Args:
+            tool_function: A callable function or Tool object to add as a tool.
+        """
+        if hasattr(self._agent, "tools"):
+            # Cast to Any to handle the type variance issue
+            tools_list: List[Any] = self._agent.tools
+            tools_list.append(tool_function)
+            tool_name = getattr(
+                tool_function, "__name__", getattr(tool_function, "name", "unknown")
+            )
+            logger.info(f"Added tool {tool_name} to agent {self._agent.name}")
+
+    async def add_handoff_agent(self, agent: AgentLike):
+        """Add a handoff agent.
+
+        Args:
+            agent: Another Agent instance or handoff object that this agent can hand off to.
+        """
+        if hasattr(self._agent, "handoffs"):
+            # Cast to Any to handle the type variance issue
+            handoffs_list: List[Any] = self._agent.handoffs
+            handoffs_list.append(agent)
+            agent_name = getattr(agent, "name", "unknown")
+            logger.info(f"Added handoff agent {agent_name} to agent {self._agent.name}")
+
+    def get_session_context(self) -> Dict[str, Any]:
+        """Get the current session context.
+
+        Returns:
+            Dictionary containing the current session context.
+        """
+        return self._session_config.copy()
+
+    def update_session_context(self, context: Dict[str, Any]):
+        """Update the session context.
+
+        Args:
+            context: Dictionary of context updates to apply.
+        """
+        self._session_config.update(context)
+        logger.debug(f"Updated session context for agent {self._agent.name}")
+
+
+class OpenAIAgentUserContextAggregator(LLMUserContextAggregator):
+    """OpenAI Agent-specific user context aggregator.
+
+    Handles aggregation of user messages for OpenAI Agent services.
+    Inherits all functionality from the base LLMUserContextAggregator.
+    """
+
+    pass
+
+
+class OpenAIAgentAssistantContextAggregator(LLMAssistantContextAggregator):
+    """OpenAI Agent-specific assistant context aggregator.
+
+    Handles aggregation of assistant messages for OpenAI Agent services.
+    Currently adds no behavior of its own and inherits all functionality
+    from the base LLMAssistantContextAggregator.
+    """
+
+    pass
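Note on `process_frame` in `agent_service.py`: the end frame is pushed inside the `try` body, so when `_process_agent_request` raises, `LLMFullResponseStartFrame` is emitted without a matching `LLMFullResponseEndFrame`, and downstream processors that wait for the end frame may be left waiting. A minimal sketch of a `try`/`finally` structure that always closes the response follows. The marker classes and `SketchService` are hypothetical stand-ins, not Pipecat classes.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class StartMarker:  # stand-in for LLMFullResponseStartFrame
    pass


@dataclass
class EndMarker:  # stand-in for LLMFullResponseEndFrame
    pass


@dataclass
class ErrorMarker:  # stand-in for ErrorFrame
    error: str


class SketchService:
    """Hypothetical minimal service that records pushed frames in a list."""

    def __init__(self, fail: bool = False):
        self.pushed: List[object] = []
        self._fail = fail

    async def push_frame(self, frame: object) -> None:
        self.pushed.append(frame)

    async def _process_agent_request(self, text: str) -> None:
        if self._fail:
            raise RuntimeError("agent failed")
        await self.push_frame(text)

    async def handle_text(self, text: str) -> None:
        await self.push_frame(StartMarker())
        try:
            await self._process_agent_request(text)
        except Exception as e:
            await self.push_frame(ErrorMarker(f"Agent processing error: {e}"))
        finally:
            # Runs on success and on failure, so the response is always closed.
            await self.push_frame(EndMarker())


service = SketchService(fail=True)
asyncio.run(service.handle_text("hello"))
print([type(frame).__name__ for frame in service.pushed])
# ['StartMarker', 'ErrorMarker', 'EndMarker']
```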
--- a/src/pipecat/services/openai_realtime/__init__.py
+++ b/src/pipecat/services/openai_realtime/__init__.py
@@ -0,0 +1,9 @@
+from .azure import AzureRealtimeLLMService
+from .events import (
+    InputAudioNoiseReduction,
+    InputAudioTranscription,
+    SemanticTurnDetection,
+    SessionProperties,
+    TurnDetection,
+)
+from .openai import OpenAIRealtimeLLMService
--- a/src/pipecat/services/openai_realtime/azure.py
+++ b/src/pipecat/services/openai_realtime/azure.py
@@ -0,0 +1,67 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Azure OpenAI Realtime LLM service implementation."""
+
+from loguru import logger
+
+from .openai import OpenAIRealtimeLLMService
+
+try:
+    from websockets.asyncio.client import connect as websocket_connect
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error(
+        "In order to use OpenAI, you need to `pip install pipecat-ai[openai]`. Also, set `OPENAI_API_KEY` environment variable."
+    )
+    raise Exception(f"Missing module: {e}")
+
+
+class AzureRealtimeLLMService(OpenAIRealtimeLLMService):
+    """Azure OpenAI Realtime LLM service with Azure-specific authentication.
+
+    Extends the OpenAI Realtime service to work with Azure OpenAI endpoints,
+    using Azure's authentication headers and endpoint format. Provides the same
+    real-time audio and text communication capabilities as the base OpenAI service.
+    """
+
+    def __init__(
+        self,
+        *,
+        api_key: str,
+        base_url: str,
+        **kwargs,
+    ):
+        """Initialize Azure Realtime LLM service.
+
+        Args:
+            api_key: The API key for the Azure OpenAI service.
+            base_url: The full Azure WebSocket endpoint URL including api-version and deployment.
+                Example: "wss://my-project.openai.azure.com/openai/realtime?api-version=2024-10-01-preview&deployment=my-realtime-deployment"
+            **kwargs: Additional arguments passed to parent OpenAIRealtimeLLMService.
+        """
+        super().__init__(base_url=base_url, api_key=api_key, **kwargs)
+        self.api_key = api_key
+        self.base_url = base_url
+
+    async def _connect(self):
+        try:
+            if self._websocket:
+                # Here we assume that if we have a websocket, we are connected. We
+                # handle disconnections in the send/recv code paths.
+                return
+
+            logger.info(f"Connecting to {self.base_url}")
+            self._websocket = await websocket_connect(
+                uri=self.base_url,
+                additional_headers={
+                    "api-key": self.api_key,
+                },
+            )
+            self._receive_task = self.create_task(self._receive_task_handler())
+        except Exception as e:
+            logger.error(f"{self} initialization error: {e}")
+            self._websocket = None
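Note on logging in `azure.py`: when connection details are logged, a credential such as `api_key` is best left out of the message or masked. A minimal sketch of a masking helper follows. `mask_secret` is a hypothetical name for illustration and is not an existing Pipecat utility.

```python
def mask_secret(secret: str, visible: int = 4) -> str:
    """Return a log-safe form of a secret, keeping only the last few characters."""
    if not secret:
        return "<unset>"
    if len(secret) <= visible:
        # Too short to reveal any part safely, so mask everything.
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# The key below is a made-up placeholder, not a real credential.
print(mask_secret("sk-example-1234"))
```

Keeping a short suffix visible still lets an operator tell which key was in use without exposing it.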
--- a/src/pipecat/services/openai_realtime/context.py
+++ b/src/pipecat/services/openai_realtime/context.py
@@ -0,0 +1,272 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""OpenAI Realtime LLM context and aggregator implementations."""
+
+import copy
+import json
+
+from loguru import logger
+
+from pipecat.frames.frames import (
+    Frame,
+    FunctionCallResultFrame,
+    InterimTranscriptionFrame,
+    LLMMessagesUpdateFrame,
+    LLMSetToolsFrame,
+    LLMTextFrame,
+    TranscriptionFrame,
+)
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.services.openai.llm import (
+    OpenAIAssistantContextAggregator,
+    OpenAIUserContextAggregator,
+)
+
+from . import events
+from .frames import RealtimeFunctionCallResultFrame, RealtimeMessagesUpdateFrame
+
+
+class OpenAIRealtimeLLMContext(OpenAILLMContext):
+    """OpenAI Realtime LLM context with session management and message conversion.
+
+    Extends the standard OpenAI LLM context to support real-time session properties,
+    instruction management, and conversion between standard message formats and
+    realtime conversation items.
+    """
+
+    def __init__(self, messages=None, tools=None, **kwargs):
+        """Initialize the OpenAIRealtimeLLMContext.
+
+        Args:
+            messages: Initial conversation messages. Defaults to None.
+            tools: Available function tools. Defaults to None.
+            **kwargs: Additional arguments passed to parent OpenAILLMContext.
+        """
+        super().__init__(messages=messages, tools=tools, **kwargs)
+        self.__setup_local()
+
+    def __setup_local(self):
+        self.llm_needs_settings_update = True
+        self.llm_needs_initial_messages = True
+        self._session_instructions = ""
+
+        return
+
+    @staticmethod
+    def upgrade_to_realtime(obj: OpenAILLMContext) -> "OpenAIRealtimeLLMContext":
+        """Upgrade a standard OpenAI LLM context to a realtime context.
+
+        Args:
+            obj: The OpenAILLMContext instance to upgrade.
+
+        Returns:
+            The upgraded OpenAIRealtimeLLMContext instance.
+        """
+        if isinstance(obj, OpenAILLMContext) and not isinstance(obj, OpenAIRealtimeLLMContext):
+            obj.__class__ = OpenAIRealtimeLLMContext
+            obj.__setup_local()
+        return obj
+
+    # todo
+    #   - finish implementing all frames
+
+    def from_standard_message(self, message):
+        """Convert a standard message format to a realtime conversation item.
+
+        Args:
+            message: The standard message dictionary to convert.
+
+        Returns:
+            A ConversationItem instance for the realtime API.
+        """
+        if message.get("role") == "user":
+            content = message.get("content")
+            if isinstance(message.get("content"), list):
+                content = ""
+                for c in message.get("content"):
+                    if c.get("type") == "text":
+                        content += " " + c.get("text")
+                    else:
+                        logger.error(
+                            f"Unhandled content type in context message: {c.get('type')} - {message}"
+                        )
+            return events.ConversationItem(
+                role="user",
+                type="message",
+                content=[events.ItemContent(type="input_text", text=content)],
+            )
+        if message.get("role") == "assistant" and message.get("tool_calls"):
+            tc = message.get("tool_calls")[0]
+            return events.ConversationItem(
+                type="function_call",
+                call_id=tc["id"],
+                name=tc["function"]["name"],
+                arguments=tc["function"]["arguments"],
+            )
+        logger.error(f"Unhandled message type in from_standard_message: {message}")
+
+    def get_messages_for_initializing_history(self):
+        """Get conversation items for initializing the realtime session history.
+
+        Converts the context's messages to a format suitable for the realtime API,
+        handling system instructions and conversation history packaging.
+
+        Returns:
+            List of conversation items for session initialization.
+        """
+        # We can't load a long conversation history into the openai realtime api yet. (The API/model
+        # forgets that it can do audio, if you do a series of `conversation.item.create` calls.) So
+        # our general strategy until this is fixed is just to put everything into a first "user"
+        # message as a single input.
+        if not self.messages:
+            return []
+
+        messages = copy.deepcopy(self.messages)
+
+        # If we have a "system" message as our first message, let's pull that out into session
+        # "instructions"
+        if messages[0].get("role") == "system":
+            self.llm_needs_settings_update = True
+            system = messages.pop(0)
+            content = system.get("content")
+            if isinstance(content, str):
+                self._session_instructions = content
+            elif isinstance(content, list):
+                self._session_instructions = content[0].get("text")
+            if not messages:
+                return []
+
+        # If we have just a single "user" item, we can just send it normally
+        if len(messages) == 1 and messages[0].get("role") == "user":
+            return [self.from_standard_message(messages[0])]
+
+        # Otherwise, let's pack everything into a single "user" message with a bit of
+        # explanation for the LLM
+        intro_text = """
+        This is a previously saved conversation. Please treat this conversation history as a
+        starting point for the current conversation."""
+
+        trailing_text = """
+        This is the end of the previously saved conversation. Please continue the conversation
+        from here. If the last message is a user instruction or question, act on that instruction
+        or answer the question. If the last message is an assistant response, simply say that you
+        are ready to continue the conversation."""
+
+        return [
+            {
+                "role": "user",
+                "type": "message",
+                "content": [
+                    {
+                        "type": "input_text",
+                        "text": "\n\n".join(
+                            [intro_text, json.dumps(messages, indent=2), trailing_text]
+                        ),
+                    }
+                ],
+            }
+        ]
+
+    def add_user_content_item_as_message(self, item):
+        """Add a user content item as a standard message to the context.
+
+        Args:
+            item: The conversation item to add as a user message.
+        """
+        message = {
+            "role": "user",
+            "content": [{"type": "text", "text": item.content[0].transcript}],
+        }
+        self.add_message(message)
+
+
+class OpenAIRealtimeUserContextAggregator(OpenAIUserContextAggregator):
+    """User context aggregator for OpenAI Realtime API.
+
+    Handles user input frames and generates appropriate context updates
+    for the realtime conversation, including message updates and tool settings.
+
+    Args:
+        context: The OpenAI realtime LLM context.
+        **kwargs: Additional arguments passed to parent aggregator.
+    """
+
+    async def process_frame(
+        self, frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM
+    ):
+        """Process incoming frames and handle realtime-specific frame types.
+
+        Args:
+            frame: The frame to process.
+            direction: The direction of frame flow in the pipeline.
+        """
+        await super().process_frame(frame, direction)
+        # Parent does not push LLMMessagesUpdateFrame. This ensures that in a typical pipeline,
+        # messages are only processed by the user context aggregator, which is generally what we want. But
+        # we also need to send new messages over the websocket, so the openai realtime API has them
+        # in its context.
+        if isinstance(frame, LLMMessagesUpdateFrame):
+            await self.push_frame(RealtimeMessagesUpdateFrame(context=self._context))
+
+        # Parent also doesn't push the LLMSetToolsFrame.
+        if isinstance(frame, LLMSetToolsFrame):
+            await self.push_frame(frame, direction)
+
+    async def push_aggregation(self):
+        """Push user input aggregation.
+
+        Currently ignores all user input coming into the pipeline as realtime
+        audio input is handled directly by the service.
+        """
+        # for the moment, ignore all user input coming into the pipeline.
+        # todo: think about whether/how to fix this to allow for text input from
+        #       upstream (transport/transcription, or other sources)
+        pass
+
+
+class OpenAIRealtimeAssistantContextAggregator(OpenAIAssistantContextAggregator):
+    """Assistant context aggregator for OpenAI Realtime API.
+
+    Handles assistant output frames from the realtime service, filtering
+    out duplicate text frames and managing function call results.
+
+    Args:
+        context: The OpenAI realtime LLM context.
+        **kwargs: Additional arguments passed to parent aggregator.
+    """
+
+    # The LLMAssistantContextAggregator uses TextFrames to aggregate the LLM output,
+    # but the OpenAIRealtimeLLMService pushes LLMTextFrames and TTSTextFrames. We
+    # need to override process_frame to skip LLMTextFrames, so that only the TTSTextFrames
+    # are processed. This ensures that the context gets only one set of messages.
+    # OpenAIRealtimeLLMService also pushes TranscriptionFrames and InterimTranscriptionFrames,
+    # so we need to ignore pushing those as well, as they're also TextFrames.
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        """Process assistant frames, filtering out duplicate text content.
+
+        Args:
+            frame: The frame to process.
+            direction: The direction of frame flow in the pipeline.
+        """
+        if not isinstance(frame, (LLMTextFrame, TranscriptionFrame, InterimTranscriptionFrame)):
+            await super().process_frame(frame, direction)
+
+    async def handle_function_call_result(self, frame: FunctionCallResultFrame):
+        """Handle function call result and notify the realtime service.
+
+        Args:
+            frame: The function call result frame to handle.
+        """
+        await super().handle_function_call_result(frame)
+
+        # The standard function callback code path pushes the FunctionCallResultFrame from the llm itself,
+        # so we didn't have a chance to add the result to the openai realtime api context. Let's push a
+        # special frame to do that.
+        await self.push_frame(
+            RealtimeFunctionCallResultFrame(result_frame=frame), FrameDirection.UPSTREAM
+        )
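
The history-packing strategy in `get_messages_for_initializing_history` above can be sketched as a standalone function. This is a simplified illustration, not the code from the diff: `pack_history`, `INTRO_TEXT`, and `TRAILING_TEXT` are hypothetical names, the prompt texts are shortened, and the single-user-message case returns the message unchanged instead of calling `from_standard_message`.

```python
import json
from typing import Any, Optional

# Shortened stand-ins for the intro/trailing prompt text used in the diff.
INTRO_TEXT = "This is a previously saved conversation."
TRAILING_TEXT = "This is the end of the previously saved conversation."


def pack_history(
    messages: list[dict[str, Any]],
) -> tuple[Optional[str], list[dict[str, Any]]]:
    """Return (session instructions, conversation items) for a message list."""
    remaining = list(messages)
    instructions = None

    # A leading "system" message becomes session instructions.
    if remaining and remaining[0].get("role") == "system":
        content = remaining.pop(0).get("content")
        if isinstance(content, str):
            instructions = content
        elif isinstance(content, list) and content:
            instructions = content[0].get("text")

    if not remaining:
        return instructions, []

    # A single "user" message can be sent as-is (assumption: no conversion here).
    if len(remaining) == 1 and remaining[0].get("role") == "user":
        return instructions, [remaining[0]]

    # Otherwise, serialize the whole history into one "user" text item.
    packed = "\n\n".join([INTRO_TEXT, json.dumps(remaining, indent=2), TRAILING_TEXT])
    item = {
        "role": "user",
        "type": "message",
        "content": [{"type": "input_text", "text": packed}],
    }
    return instructions, [item]
```

Unlike the method in the diff, this sketch also guards against an empty `content` list before indexing into it.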
--- a/src/pipecat/services/openai_realtime/events.py
+++ b/src/pipecat/services/openai_realtime/events.py
--- a/src/pipecat/services/openai_realtime/frames.py
+++ b/src/pipecat/services/openai_realtime/frames.py
@@ -0,0 +1,37 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Custom frame types for OpenAI Realtime API integration."""
+
+from dataclasses import dataclass
+from typing import TYPE_CHECKING
+
+from pipecat.frames.frames import DataFrame, FunctionCallResultFrame
+
+if TYPE_CHECKING:
+    from pipecat.services.openai_realtime_beta.context import OpenAIRealtimeLLMContext
+
+
+@dataclass
+class RealtimeMessagesUpdateFrame(DataFrame):
+    """Frame indicating that the realtime context messages have been updated.
+
+    Parameters:
+        context: The updated OpenAI realtime LLM context.
+    """
+
+    context: "OpenAIRealtimeLLMContext"
+
+
+@dataclass
+class RealtimeFunctionCallResultFrame(DataFrame):
+    """Frame containing function call results for the realtime service.
+
+    Parameters:
+        result_frame: The function call result frame to send to the realtime API.
+    """
+
+    result_frame: FunctionCallResultFrame
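
The duplicate-text filtering done by `OpenAIRealtimeAssistantContextAggregator.process_frame` can be illustrated without Pipecat installed. The classes below are minimal stand-ins that only mirror the names and inheritance described in the diff's comments (all of them are `TextFrame`s). They are not the real `pipecat.frames.frames` types, and `aggregate` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Stand-in base frame."""


@dataclass
class TextFrame(Frame):
    text: str = ""


@dataclass
class LLMTextFrame(TextFrame):
    """Stand-in: text emitted by the LLM."""


@dataclass
class TTSTextFrame(TextFrame):
    """Stand-in: text that was spoken."""


@dataclass
class TranscriptionFrame(TextFrame):
    """Stand-in: user speech transcription."""


# Frame types the assistant aggregator skips, so each message is counted once.
IGNORED = (LLMTextFrame, TranscriptionFrame)


def aggregate(frames: list[Frame]) -> str:
    """Join the text of every TextFrame that is not in the ignored set."""
    return " ".join(
        f.text for f in frames if isinstance(f, TextFrame) and not isinstance(f, IGNORED)
    )
```

With an input stream that contains both an `LLMTextFrame` and a `TTSTextFrame` for the same words, only the `TTSTextFrame` text ends up in the aggregated result.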
Author	SHA1	Message	Date
James Hush	29d4a56663	Working on the 46 example	2025-09-17 11:59:16 +08:00
James Hush	373a09ecd6	Working on the 46 example	2025-09-17 11:59:10 +08:00
James Hush	07f54c48f3	This is working	2025-09-17 11:53:07 +08:00
James Hush	c8a3d65aa4	Save progress	2025-09-17 11:39:21 +08:00
James Hush	50a2a0dc86	ok its kinda working	2025-09-17 11:29:11 +08:00
James Hush	0421d97954	Save changes	2025-09-17 11:09:03 +08:00
James Hush	54c8f336c3	Save progress	2025-09-16 16:43:38 +08:00
James Hush	b086fbafe6	feat: Add OpenAI Agents SDK integration service - Create new OpenAIAgentService that integrates OpenAI Agents SDK with Pipecat - Support for agent loops, handoffs, guardrails, and session management - Add streaming and non-streaming response modes - Include comprehensive tool integration and error handling - Add optional dependency for openai-agents package - Create foundational examples showing basic usage and agent handoffs - Add comprehensive tests with mocked dependencies - Include detailed documentation and README Key features: - Real-time streaming responses compatible with Pipecat pipelines - Agent handoffs for specialized task delegation - Tool calling with automatic schema generation - Input/output guardrails for safety and validation - Session context management for conversation continuity - Built-in tracing and monitoring integration Examples: - 45-openai-agent-basic.py: Basic agent with weather and trivia tools - 46-openai-agent-handoffs.py: Multi-agent system with specialist handoffs	2025-09-16 16:20:30 +08:00
Mark Backman	cca90791c4	Merge pull request #2652 from pipecat-ai/mb/fix-audio-buffer-processor-has-audio fix: AudioBufferProcessor has_audio returns based on user or bot audi…	2025-09-15 18:43:59 -07:00
Mark Backman	f2a5d408de	fix: AudioBufferProcessor has_audio returns based on user or bot audio existing	2025-09-15 21:35:35 -04:00
Aleix Conchillo Flaqué	044c6eba46	Merge pull request #2655 from pipecat-ai/aleix/add-on-pipeline-finalized PipelineTask: add on_pipeline_finished event	2025-09-15 15:32:04 -07:00
Aleix Conchillo Flaqué	db71089f5e	PipelineTask: add on_pipeline_finished event This deprecates `on_pipeline_stopped`, `on_pipeline_ended` and `on_pipeline_cancelled`.	2025-09-15 15:28:33 -07:00
Aleix Conchillo Flaqué	f861f5066f	Merge pull request #2654 from pipecat-ai/aleix/unify-on-client-disconnected transports: on_client_disconnected only if remote client disconnects	2025-09-15 15:18:57 -07:00
kompfner	81cede2c60	Merge pull request #2653 from pipecat-ai/pk/llm-context-adapting-tests `LLMContext`-adapting unit tests	2025-09-15 16:38:46 -04:00
kompfner	7603203230	Merge pull request #2644 from pipecat-ai/pk/run-inference-unit-tests `run_inference` unit tests	2025-09-15 16:26:10 -04:00
Aleix Conchillo Flaqué	8569b61598	transports: on_client_disconnected only if remote client disconnects	2025-09-15 11:35:40 -07:00
Paul Kompfner	fe42187dc1	Implement `LLMService.create_llm_specific_message()` so that users don't need to just know what value of `llm` to provide to the `LLMSpecificMessage` constructor	2025-09-15 14:15:22 -04:00
Paul Kompfner	999e88c942	Add unit tests for `AWSBedrockLLMAdapter.get_llm_invocation_params()`, focusing on messages specifically	2025-09-15 12:08:21 -04:00
Paul Kompfner	c04df2f28b	Add unit tests for `AnthropicLLMAdapter.get_llm_invocation_params()`, focusing on messages specifically	2025-09-15 11:55:48 -04:00
Paul Kompfner	100ef0ab5c	Add unit tests for `GeminiLLMAdapter.get_llm_invocation_params()`, focusing on messages specifically	2025-09-15 11:38:23 -04:00
Paul Kompfner	42886d7105	Add unit tests for `OpenAILLMAdapter.get_llm_invocation_params()`, focusing on messages specifically. Also, fix a bug in `OpenAILLMAdapter` (found thanks to the unit tests) where it wasn't "unwrapping" `LLMSpecificMessage`s.	2025-09-15 11:17:11 -04:00
Mark Backman	22cbba002a	Merge pull request #2651 from pipecat-ai/mb/heygen-bot-speaking-frame fix: push BotStartedSpeakingFrame in HeyGenVideoService	2025-09-15 08:02:25 -07:00
Mark Backman	c873798ce5	fix: push BotStartedSpeakingFrame in HeyGenVideoService	2025-09-14 08:12:44 -04:00
Aleix Conchillo Flaqué	d8cd28bb8b	Merge pull request #2640 from pipecat-ai/aleix/pipecat-0.0.85 update CHANGELOG for 0.0.85	2025-09-12 11:06:41 -07:00
Aleix Conchillo Flaqué	c2df6c8aee	update CHANGELOG for 0.0.85	2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué	82478be861	scripts(evals): add 19b-openai-realtime-text	2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué	0f2b7bc01b	examples(foundational): fix 19b-openai-realtime-beta-text	2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué	1b2a5df017	Merge pull request #2622 from pipecat-ai/mb/call-data-runner Add to, from phone info and custom data to the development runner	2025-09-12 10:28:17 -07:00
Mark Backman	2f496ac74f	Add optional body parameter to WebsocketRunnerArguments	2025-09-12 11:28:12 -04:00
Mark Backman	22633a63b0	Update changelog	2025-09-12 11:15:03 -04:00
Mark Backman	e5ed0424e4	Remove to/from data from Plivo, as it will rely on body information	2025-09-12 11:10:03 -04:00
Paul Kompfner	786387722a	Fix an issue in `AWSBedrockLLMService.run_inference`—exceptions should propagate, just like with other LLM services	2025-09-12 11:09:32 -04:00
Paul Kompfner	9f82c6b4a4	Add unit tests for `run_inference`	2025-09-12 11:07:11 -04:00
Mark Backman	99cfcb1d4e	Parsed custom data from Plivo extraHeaders	2025-09-12 08:11:30 -04:00
Mark Backman	d595676436	Add custom data handling for Twilio	2025-09-12 08:11:30 -04:00
Aleix Conchillo Flaqué	0190812ee8	Merge pull request #2639 from pipecat-ai/aleix/min-words-interruption-unit-test MinWordsInterruptionStrategy unit test	2025-09-11 18:52:39 -07:00
Aleix Conchillo Flaqué	2a24061bbb	examples(07ad): remove deprecated user_continuous_stream	2025-09-11 18:50:00 -07:00
Aleix Conchillo Flaqué	89f7e7d199	update CHANGELOG with BaseOutputTransport fix	2025-09-11 16:58:44 -07:00
Aleix Conchillo Flaqué	384814e640	Merge pull request #2456 from a6kme/patch-1 Only set last_frame_time when handling OutputAudioRawFrame	2025-09-11 16:56:25 -07:00
Aleix Conchillo Flaqué	ab4364b833	update CHANGELOG and fix formatting	2025-09-11 15:34:47 -07:00
Aleix Conchillo Flaqué	fafdadad3c	Merge pull request #2473 from TheNotary/adds-interim-transcription-frame-support adds support to Azure STT for creating InterimTranscriptFrames	2025-09-11 15:33:38 -07:00
Aleix Conchillo Flaqué	05dc2fa916	updated CHANGELOG.md with GoogleTTSService updates	2025-09-11 14:36:21 -07:00
Aleix Conchillo Flaqué	0c30cc6ea6	Merge pull request #2547 from manishkjs/feat/google-tts-voice-cloning feat: add voice cloning and speaking rate to GoogleTTSService	2025-09-11 14:32:21 -07:00
Aleix Conchillo Flaqué	c26d336e34	Merge pull request #2545 from pipecat-ai/aleix/aws-nova-sonic-pre-load-cue AWSNovaSonicLLMService: pre-load audio cue in the constructor	2025-09-11 14:31:26 -07:00
Mark Backman	37b6198787	Merge pull request #2635 from pipecat-ai/mb/openai-tts-speed	2025-09-11 14:22:51 -07:00
kompfner	3c271da94c	Merge pull request #2633 from pipecat-ai/pk/uv-readme-updates Updating the README to reflect that:	2025-09-11 17:07:41 -04:00
kompfner	be28d3f93b	Merge pull request #2637 from pipecat-ai/pk/llm-context-evals-and-bug-fix `LLMContext` evals and bug fix	2025-09-11 17:00:07 -04:00
marcus-daily	d2f210e960	Bundle Smart Turn v3 with Pipecat	2025-09-11 21:37:16 +01:00
Aleix Conchillo Flaqué	57add41971	tests: add unit test for MinWordsInterruptionStrategy	2025-09-11 13:07:30 -07:00
Aleix Conchillo Flaqué	74b38b59d6	tests(utils): allow passing PipelineParams to run_test()	2025-09-11 13:02:21 -07:00
kompfner	dac58deffc	Merge pull request #2636 from pipecat-ai/pk/uv-lock-update-for-smart-turn-v3 uv.lock update for Smart Turn v3	2025-09-11 14:35:36 -04:00
Paul Kompfner	aff11f5121	Fix missing import in llm_response_universal.py	2025-09-11 14:33:17 -04:00
Paul Kompfner	a4023d3915	Update evals to include examples that exercise the universal `LLMContext`	2025-09-11 14:32:56 -04:00
Paul Kompfner	d6543d244d	uv.lock update for Smart Turn v3	2025-09-11 14:07:17 -04:00
Mark Backman	fafcd79870	OpenAITTSService: add speed arg	2025-09-11 13:53:52 -04:00
Paul Kompfner	6a717fbbd1	Updating the README to reflect that: - various dependencies that previously didn't work with Python 3.13 now seem to - ultravox isn't fully supported on macOS	2025-09-11 12:27:43 -04:00
Aleix Conchillo Flaqué	9b3f6927c2	Merge pull request #2621 from pipecat-ai/aleix/interruption-task-frame interruption task frame	2025-09-11 09:22:35 -07:00
Aleix Conchillo Flaqué	0b21f8a6bd	FrameProcessor: add push_interruption_task_frame_and_wait()	2025-09-11 09:19:44 -07:00
Aleix Conchillo Flaqué	8249b014f0	frames: BotInterruptionFrame is deprecated, use InterruptionTaskFrame	2025-09-11 09:01:54 -07:00
Aleix Conchillo Flaqué	9d9f10ae0e	frames: StartInterruptionFrame is deprecated, use InterruptionFrame	2025-09-11 09:01:54 -07:00
Aleix Conchillo Flaqué	e27b23694d	frames: add new TaskFrame TaskFrame is a base class for other frames that are meant to be sent to the pipeline task.	2025-09-11 09:01:52 -07:00
marcus-daily	66ce5fe6bd	Ruff fixes	2025-09-11 16:04:56 +01:00
marcus-daily	a9b53dc800	Update inference session options	2025-09-11 16:04:56 +01:00
marcus-daily	818352a300	Formatting	2025-09-11 16:04:56 +01:00
marcus-daily	3e9fc7be19	Update onnxruntime version	2025-09-11 16:04:56 +01:00
marcus-daily	a2e76bcad8	Smart Turn V3 support	2025-09-11 16:04:56 +01:00
Mark Backman	8e8e42717b	Add to and from phone information to the development runner	2025-09-11 10:44:21 -04:00
kompfner	b31322e38e	Merge pull request #2619 from pipecat-ai/pk/aws-universal-context Expand universal `LLMContext` support to AWS Bedrock	2025-09-11 09:33:08 -04:00
Aleix Conchillo Flaqué	908325484d	Merge pull request #2614 from pipecat-ai/aleix/readme-client-sdks-table README: update clients' table	2025-09-10 10:21:18 -07:00
Mark Backman	dd6ff789c7	Merge pull request #2628 from pipecat-ai/mb/fix-13-push-frame fix: 13 foundational examples now push frames from TranscriptionLogger	2025-09-10 09:13:04 -07:00
Mark Backman	f4938e0fad	fix: 13 foundational examples now push frames from TranscriptionLogger	2025-09-10 10:40:10 -04:00
James Hush	e8f60c7c6f	Handle missing rawResponse in transcription messages (#2623) * Handle missing rawResponse in transcription messages - Use message.get('rawResponse', {}) to safely access rawResponse field - Default is_final to False when rawResponse is missing - Add proper type annotations for better code clarity - Minor import formatting cleanup This prevents KeyError crashes when transcription messages from Daily's API don't include the rawResponse field in edge cases. * docs: add changelog line	2025-09-10 15:03:23 +08:00
Paul Kompfner	fedb8a201f	Update 12d example to use `LLMContext`, now that AWS Bedrock supports it	2025-09-09 16:24:13 -04:00
Paul Kompfner	8ccd220a60	Add universal `LLMContext` support to `AWSBedrockLLMService.run_inference()`	2025-09-09 16:00:32 -04:00
Paul Kompfner	fe79de8f27	When converting universal `LLMContext` messages to AWS Bedrock expected format, automatically update non-initial "system"-role messages to "user"-role messages, as we do in other non-OpenAI LLM services	2025-09-09 15:50:03 -04:00
Paul Kompfner	176573c342	Add to CHANGELOG AWS Bedrock's support for universal `LLMContext`	2025-09-09 15:31:56 -04:00
Paul Kompfner	75f9914f49	Add support for universal LLMContext to AWS Bedrock LLM service	2025-09-09 15:25:04 -04:00
Paul Kompfner	f4d6715e32	Add foundational example using AWS Bedrock with universal LLMContext	2025-09-09 10:49:51 -04:00
kompfner	38f6e33f97	Merge pull request #2598 from pipecat-ai/pk/deprecate-vision-image-raw-frame Remove `VisionImageRawFrame`, which was previously being handled dire…	2025-09-08 17:13:28 -04:00
Paul Kompfner	1c3e4e34e5	Minor fix to AWS Bedrock console logging to handle image messages in the context	2025-09-08 17:10:11 -04:00
Paul Kompfner	623c660027	Remove debugging comment	2025-09-08 17:01:51 -04:00
Paul Kompfner	a3e65ab3b5	The `VisionImageRawFrame` removal and corresponding `VisionImageFrameAggregator` deprecation will now happen in version 0.0.85	2025-09-08 17:01:47 -04:00
Paul Kompfner	f3a4b416df	Remove `VisionImageRawFrame`, which was previously being handled directly by the LLM services, and deprecate the associated `VisionImageFrameAggregator`. Removing `VisionImageRawFrame` lets us simplify LLM services' logic, getting us closer to the idealized architecture where all they care about is handling context frames. This change is in service of getting us closer to ready to deprecate usage of `OpenAILLMContext` and subclasses in favor of the universal `LLMContext`, at least for the traditional text-to-text LLMs. Why remove `VisionImageRawFrame` rather than deprecate? It's "internal"—only created by `VisionImageFrameAggregator`—and never intended to be used directly by users (it would be difficult to use directly anyway). Move the logic that was once in `VisionImageFrameAggregator` directly into the examples. Reasoning: - If `UserImageRequester` is defined in the examples, it makes sense for `UserImageProcessor` to be too, as it’s the flip side of the same coin, so to speak - The logic is now pretty trivial - This kind of one-shot, history-less image-describing pipeline shouldn't be common at all; it's ok for it to live in examples rather than as a dedicated class - In the short term, this enables us to create `LLMContext`s for services that support it and `OpenAILLMContext`s for services that don't yet (AWS) This commit also adds missing translation from OpenAI-format image context messages to AWS format. Note that this isn't a wasted effort in the face of the upcoming migration to universal `LLMContext`—this work will be reused as it has to be implemented there too.	2025-09-08 17:00:08 -04:00
Aleix Conchillo Flaqué	aa471a4ef5	update CHANGELOG with LiveKitTransport updates	2025-09-08 13:53:21 -07:00
Aleix Conchillo Flaqué	d55133a44f	Merge pull request #2604 from alexyzhou/feature/livekit_video_and_bug_fix Feature: Add support for livekit video stream and minor bug fixes	2025-09-08 13:51:14 -07:00
Aleix Conchillo Flaqué	0f1cf81691	README: update clients' table	2025-09-08 12:08:32 -07:00
kompfner	ac4d335799	Merge pull request #2613 from pipecat-ai/pk/mistral-message-fixups Apply additional fixups to context messages to meet Mistral-specific …	2025-09-08 13:59:54 -04:00
Paul Kompfner	e65385c151	Tweak the Mistral-specific context messages fixup logic to handle the (mostly academic) possibility of a "tool" message appearing at the end	2025-09-08 13:55:09 -04:00
Paul Kompfner	0bb7df7a6b	Remove stray debugging message	2025-09-08 13:38:26 -04:00
Paul Kompfner	daee1ddf3b	Apply additional fixups to context messages to meet Mistral-specific requirements	2025-09-08 11:26:58 -04:00
Aleix Conchillo Flaqué	1cccb97ccf	Merge pull request #2608 from pipecat-ai/aleix/deprecate-noisereducefilter audio(filters): deprecate NoisereduceFilter	2025-09-07 20:54:09 -07:00
Aleix Conchillo Flaqué	d7794abf21	audio(filters): deprecate NoisereduceFilter	2025-09-07 20:52:17 -07:00
Aleix Conchillo Flaqué	6a6a63a532	Merge pull request #2607 from pipecat-ai/aleix/scripts-evals-improve-eval-prompt scripts(evals): allow user to talk and only eval when needed	2025-09-07 20:49:43 -07:00
Mark Backman	6edb6fed41	Merge pull request #2606 from pipecat-ai/mb/quickstart-lockfile Remove uv.lock from quickstart	2025-09-07 06:10:14 -07:00
Mark Backman	a537382816	Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596) * Add OpenAI Realtime module * Add foundational examples for OpenAI Realtime * Add deprecation warning to OpenAIRealtimeBetaLLMService * Add deprecation warning to AzureRealtimeBetaLLMService * Update Changelog	2025-09-07 09:09:57 -04:00
Aleix Conchillo Flaqué	46deaada70	scripts(evals): allow user to talk and only eval when needed	2025-09-06 19:19:08 -07:00
TheNotary	7366b1aee0	adds missing InterimTranscriptionFrame import	2025-09-06 14:40:19 -05:00
Mark Backman	dbc52bc6b0	Remove uv.lock from quickstart	2025-09-06 11:13:50 -04:00
Alex Zhou	d6432589f6	fix: fix format and lint by ruff	2025-09-06 10:50:47 +08:00
Alex Zhou	13b73d4406	feat: Add support for pipecat video stream; fix the bug of duplicate participants when connecting; fix the bug of RTVI messages sent via livekit messages;	2025-09-06 10:41:33 +08:00
Manish Kumar	4699ee8d86	docs: add docstring for voice_cloning_key and update CHANGELOG	2025-09-04 22:45:51 +05:30
Aleix Conchillo Flaqué	e3597801d4	AWSNovaSonicLLMService: pre-load audio cue in the constructor	2025-09-04 09:31:39 -07:00
Manish Kumar	2ee481d541	feat: add voice cloning and speaking rate to GoogleTTSService	2025-08-30 23:04:59 +05:30
TheNotary	48b3ad8f8f	adds support for creating InterimTranscriptFrames for Azure speech services	2025-08-19 17:00:42 -05:00
Abhishek	8bbdc7c8d1	Only set last_frame_time when handling OutputAudioRawFrame We don't want to set `last_frame_time` on other frames like `HeartBeatFrame`, `LLMGeneratedTextFrame`, `InterruptionFrames` so that we can calculate `diff_time` and compare it against `vad_stop_secs` properly	2025-08-16 16:25:14 +05:30