Remove extra code

Pre recorded message example
Save progress
2025-10-01 14:48:12 +08:00 · 2025-10-01 14:43:36 +08:00 · 2025-10-01 13:58:32 +08:00 · 2025-10-01 13:52:58 +08:00 · 2025-09-30 16:27:34 -07:00 · 2025-09-30 16:26:48 -07:00
53 changed files with 1290 additions and 362 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,40 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

+- Added RTVI messages for user/bot audio levels and system logs.
+
+- Include OpenAI-based LLM services cached tokens to `MetricsFrame`.
+
+### Changed
+
+- Updated the default model for `AnthropicLLMService` to
+  `claude-sonnet-4-5-20250929`.
+
+### Deprecated
+
+- `DailyUpdateRemoteParticipantsFrame` is deprecated and will be removed in a
+  future version. Instead, create your own custom frame and handle it in the
+  `@transport.output().event_handler("on_after_push_frame")` event handler or a
+  custom processor.
+
+## Fixed
+
+- Fixed a `PipelineTask` issue that could prevent the application to exit if
+  `task.cancel()` was called when the task was already finished.
+
+- Fixed an issue where local SmartTurn was not being ran in a separate thread.
+
+## [0.0.86] - 2025-09-24
+
+### Added
+
+- Added `HeyGenTransport`. This is an integration for HeyGen Interactive
+  Avatar. A video service that handles audio streaming and requests HeyGen to
+  generate avatar video responses. (see https://www.heygen.com/). When used, the
+  Pipecat bot joins the same virtual room as the HeyGen Avatar and the user.
+
+- Added support to `TwilioFrameSerializer` for `region` and `edge` settings.
+
 - Added support for using universal `LLMContext` with:

  - `LLMLogObserver`
@@ -68,6 +102,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Changed

+- Updated `aiortc` to 1.13.0.
+
+- Updated `sentry` to 2.38.0.
+
 - `BaseOutputTransport` methods `write_audio_frame` and `write_video_frame` now
  return a boolean to indicate if the transport implementation was able to write
  the given frame or not.
@@ -102,6 +140,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Fixed

+- Fixed an issue where the pipeline could freeze if a task cancellation never
+  completed because a third-party library swallowed asyncio.CancelledError. We
+  now apply a timeout to task cancellations to prevent these freezes. If the
+  timeout is reached, the system logs warnings and leaves dangling tasks behind,
+  which can help diagnose where cancellation is being blocked.
+
+- Fixed an `AudioBufferProcessor` issues that was causing user audio to be
+  missing in stereo recordings causing bot and user overlaps.
+
 - Fixed a `BaseOutputTransport` issue that could produce large saved
  `AudioBufferProcessor` files when using an audio mixer.

--- a/env.example
+++ b/env.example
@@ -66,8 +66,8 @@ LMNT_VOICE_ID=...
 PERPLEXITY_API_KEY=...

 # PlayHT
-PLAY_HT_USER_ID=...
-PLAY_HT_API_KEY=...
+PLAYHT_USER_ID=...
+PLAYHT_API_KEY=...

 # OpenAI
 OPENAI_API_KEY=...
--- a/examples/foundational/07-interruptible.py
+++ b/examples/foundational/07-interruptible.py
@@ -5,6 +5,7 @@
 #

 import os
+import wave

 from dotenv import load_dotenv
 from loguru import logger
@@ -13,7 +14,14 @@ from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
 from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
-from pipecat.frames.frames import LLMRunFrame
+from pipecat.frames.frames import (
+    LLMFullResponseEndFrame,
+    LLMFullResponseStartFrame,
+    LLMRunFrame,
+    LLMTextFrame,
+    OutputAudioRawFrame,
+    TextFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
@@ -103,7 +111,27 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        logger.info(f"Client connected")
        # Kick off the conversation.
        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
-        await task.queue_frames([LLMRunFrame()])
+
+        audio_file_path = os.path.join(os.path.dirname(__file__), "assets", "pre-recorded.wav")
+
+        with wave.open(audio_file_path, "rb") as wav_file:
+            llm_text_frame = TextFrame(text="This is a pre-recorded message.")
+            llm_text_frame.skip_tts = True
+
+            audio_data = wav_file.readframes(wav_file.getnframes())
+            output_audio_raw_frame = OutputAudioRawFrame(
+                audio=audio_data, sample_rate=44100, num_channels=1
+            )
+
+            await task.queue_frames(
+                [
+                    LLMRunFrame(),
+                    LLMFullResponseStartFrame(),
+                    llm_text_frame,
+                    output_audio_raw_frame,
+                    LLMFullResponseEndFrame(),
+                ]
+            )

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/07b-interruptible-langchain.py
+++ b/examples/foundational/07b-interruptible-langchain.py
@@ -129,7 +129,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
        # Kick off the conversation.
-        # An `OpenAILLMContextFrame` will be picked up by the LangchainProcessor using
+        # An `LLMContextFrame` will be picked up by the LangchainProcessor using
        # only the content of the last message to inject it in the prompt defined
        # above. So no role is required here.
        messages = [({"content": "Please briefly introduce yourself to the user."})]
--- a/examples/foundational/07m-interruptible-aws-strands.py
+++ b/examples/foundational/07m-interruptible-aws-strands.py
@@ -9,14 +9,12 @@ from dotenv import load_dotenv
 from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import LLMMessagesAppendFrame, LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_response import (
-    LLMAssistantContextAggregator,
-    LLMUserContextAggregator,
-)
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frameworks.strands_agents import StrandsAgentsProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
@@ -115,19 +113,18 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        )

    # Setup context aggregators for message handling
-    context = OpenAILLMContext()
-    tma_in = LLMUserContextAggregator(context=context)
-    tma_out = LLMAssistantContextAggregator(context=context)
+    context = LLMContext()
+    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
            stt,  # Speech-to-text
-            tma_in,  # User context aggregator
+            context_aggregator.user(),  # User responses
            llm,  # Strands Agents processor
            tts,  # Text-to-speech
            transport.output(),  # Transport bot output
-            tma_out,  # Assistant context aggregator
+            context_aggregator.assistant(),  # Assistant spoken responses
        ]
    )

@@ -143,6 +140,20 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
+        # Kick off the conversation.
+        await task.queue_frames(
+            [
+                LLMMessagesAppendFrame(
+                    messages=[
+                        {
+                            "role": "user",
+                            "content": f"Greet the user and introduce yourself.",
+                        }
+                    ],
+                    run_llm=True,
+                )
+            ]
+        )

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
--- a/examples/foundational/22b-natural-conversation-proposal.py
+++ b/examples/foundational/22b-natural-conversation-proposal.py
@@ -9,8 +9,9 @@ import os

 from dotenv import load_dotenv
 from loguru import logger
-from openai.types.chat import ChatCompletionToolParam

+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import (
    CancelFrame,
@@ -19,6 +20,7 @@ from pipecat.frames.frames import (
    FunctionCallInProgressFrame,
    FunctionCallResultFrame,
    InterruptionFrame,
+    LLMContextFrame,
    LLMRunFrame,
    StartFrame,
    SystemFrame,
@@ -32,10 +34,8 @@ from pipecat.pipeline.parallel_pipeline import ParallelPipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import (
-    OpenAILLMContext,
-    OpenAILLMContextFrame,
-)
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.filters.function_filter import FunctionFilter
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.processors.user_idle_processor import UserIdleProcessor
@@ -66,13 +66,13 @@ class StatementJudgeContextFilter(FrameProcessor):
            await self.push_frame(frame, direction)
            return

-        # We only want to handle OpenAILLMContextFrames, and only want to push through a simplified
+        # We only want to handle LLMContextFrames, and only want to push through a simplified
        # context frame that contains a system prompt and the most recent user messages,
        # concatenated.
-        if isinstance(frame, OpenAILLMContextFrame):
+        if isinstance(frame, LLMContextFrame):
            logger.debug(f"Context Frame: {frame}")
            # Take text content from the most recent user messages.
-            messages = frame.context.messages
+            messages = frame.context.get_messages()
            user_text_messages = []
            last_assistant_message = None
            for message in reversed(messages):
@@ -100,7 +100,7 @@ class StatementJudgeContextFilter(FrameProcessor):
                if last_assistant_message:
                    messages.append(last_assistant_message)
                messages.append({"role": "user", "content": user_message})
-                await self.push_frame(OpenAILLMContextFrame(OpenAILLMContext(messages)))
+                await self.push_frame(LLMContextFrame(LLMContext(messages)))


 class CompletenessCheck(FrameProcessor):
@@ -231,22 +231,26 @@ class TurnDetectionLLM(Pipeline):

        async def pass_only_llm_trigger_frames(frame):
            return (
-                isinstance(frame, OpenAILLMContextFrame)
+                isinstance(frame, LLMContextFrame)
                or isinstance(frame, InterruptionFrame)
                or isinstance(frame, FunctionCallInProgressFrame)
                or isinstance(frame, FunctionCallResultFrame)
            )

+        async def filter_all(frame):
+            return False
+
        super().__init__(
            [
                ParallelPipeline(
                    [
-                        # Ignore everything except an OpenAILLMContextFrame. Pass a specially constructed
+                        # Ignore everything except an LLMContextFrame. Pass a specially constructed
                        # simplified context frame to the statement classifier LLM. The only frame this
                        # sub-pipeline will output is a UserStoppedSpeakingFrame.
                        statement_judge_context_filter,
                        statement_llm,
                        completeness_check,
+                        FunctionFilter(filter=filter_all, direction=FrameDirection.UPSTREAM),
                    ],
                    [
                        # Block everything except frames that trigger LLM inference.
@@ -302,30 +306,23 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    async def on_function_calls_started(service, function_calls):
        await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

-    tools = [
-        ChatCompletionToolParam(
-            type="function",
-            function={
-                "name": "get_current_weather",
-                "description": "Get the current weather",
-                "parameters": {
-                    "type": "object",
-                    "properties": {
-                        "location": {
-                            "type": "string",
-                            "description": "The city and state, e.g. San Francisco, CA",
-                        },
-                        "format": {
-                            "type": "string",
-                            "enum": ["celsius", "fahrenheit"],
-                            "description": "The temperature unit to use. Infer this from the users location.",
-                        },
-                    },
-                    "required": ["location", "format"],
-                },
+    weather_function = FunctionSchema(
+        name="get_current_weather",
+        description="Get the current weather",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
            },
-        )
-    ]
+            "format": {
+                "type": "string",
+                "enum": ["celsius", "fahrenheit"],
+                "description": "The temperature unit to use. Infer this from the users location.",
+            },
+        },
+        required=["location", "format"],
+    )
+    tools = ToolsSchema(standard_tools=[weather_function])

    messages = [
        {
@@ -334,8 +331,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        },
    ]

-    context = OpenAILLMContext(messages, tools)
-    context_aggregator = llm_main.create_context_aggregator(context)
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)

    # LLM + turn detection (with an extra LLM as a judge)
    llm = TurnDetectionLLM(llm_main)
@@ -369,7 +366,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_app_message")
-    async def on_app_message(transport, message):
+    async def on_app_message(transport, message, sender):
        logger.debug(f"Received app message: {message}")
        if "message" not in message:
            return
--- a/examples/foundational/22c-natural-conversation-mixed-llms.py
+++ b/examples/foundational/22c-natural-conversation-mixed-llms.py
@@ -9,8 +9,9 @@ import os

 from dotenv import load_dotenv
 from loguru import logger
-from openai.types.chat import ChatCompletionToolParam

+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import (
    CancelFrame,
@@ -19,6 +20,7 @@ from pipecat.frames.frames import (
    FunctionCallInProgressFrame,
    FunctionCallResultFrame,
    InterruptionFrame,
+    LLMContextFrame,
    LLMRunFrame,
    StartFrame,
    SystemFrame,
@@ -32,10 +34,8 @@ from pipecat.pipeline.parallel_pipeline import ParallelPipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import (
-    OpenAILLMContext,
-    OpenAILLMContextFrame,
-)
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.filters.function_filter import FunctionFilter
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.processors.user_idle_processor import UserIdleProcessor
@@ -272,11 +272,11 @@ class StatementJudgeContextFilter(FrameProcessor):
            await self.push_frame(frame, direction)
            return

-        # We only want to handle OpenAILLMContextFrames, and only want to push through a simplified
+        # We only want to handle LLMContextFrames, and only want to push through a simplified
        # context frame that contains a system prompt and the most recent user messages,
-        if isinstance(frame, OpenAILLMContextFrame):
+        if isinstance(frame, LLMContextFrame):
            # Take text content from the most recent user messages.
-            messages = frame.context.messages
+            messages = frame.context.get_messages()
            user_text_messages = []
            last_assistant_message = None
            for message in reversed(messages):
@@ -303,7 +303,7 @@ class StatementJudgeContextFilter(FrameProcessor):
                if last_assistant_message:
                    messages.append(last_assistant_message)
                messages.append({"role": "user", "content": user_message})
-                await self.push_frame(OpenAILLMContextFrame(OpenAILLMContext(messages)))
+                await self.push_frame(LLMContextFrame(LLMContext(messages)))


 class CompletenessCheck(FrameProcessor):
@@ -425,12 +425,15 @@ class TurnDetectionLLM(Pipeline):

        async def pass_only_llm_trigger_frames(frame):
            return (
-                isinstance(frame, OpenAILLMContextFrame)
+                isinstance(frame, LLMContextFrame)
                or isinstance(frame, InterruptionFrame)
                or isinstance(frame, FunctionCallInProgressFrame)
                or isinstance(frame, FunctionCallResultFrame)
            )

+        async def filter_all(frame):
+            return False
+
        super().__init__(
            [
                ParallelPipeline(
@@ -440,12 +443,13 @@ class TurnDetectionLLM(Pipeline):
                        FunctionFilter(filter=block_user_stopped_speaking),
                    ],
                    [
-                        # Ignore everything except an OpenAILLMContextFrame. Pass a specially constructed
+                        # Ignore everything except an LLMContextFrame. Pass a specially constructed
                        # simplified context frame to the statement classifier LLM. The only frame this
                        # sub-pipeline will output is a UserStoppedSpeakingFrame.
                        statement_judge_context_filter,
                        statement_llm,
                        completeness_check,
+                        FunctionFilter(filter=filter_all, direction=FrameDirection.UPSTREAM),
                    ],
                    [
                        # Block everything except frames that trigger LLM inference.
@@ -505,30 +509,23 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    async def on_function_calls_started(service, function_calls):
        await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

-    tools = [
-        ChatCompletionToolParam(
-            type="function",
-            function={
-                "name": "get_current_weather",
-                "description": "Get the current weather",
-                "parameters": {
-                    "type": "object",
-                    "properties": {
-                        "location": {
-                            "type": "string",
-                            "description": "The city and state, e.g. San Francisco, CA",
-                        },
-                        "format": {
-                            "type": "string",
-                            "enum": ["celsius", "fahrenheit"],
-                            "description": "The temperature unit to use. Infer this from the users location.",
-                        },
-                    },
-                    "required": ["location", "format"],
-                },
+    weather_function = FunctionSchema(
+        name="get_current_weather",
+        description="Get the current weather",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
            },
-        )
-    ]
+            "format": {
+                "type": "string",
+                "enum": ["celsius", "fahrenheit"],
+                "description": "The temperature unit to use. Infer this from the users location.",
+            },
+        },
+        required=["location", "format"],
+    )
+    tools = ToolsSchema(standard_tools=[weather_function])

    messages = [
        {
@@ -537,8 +534,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        },
    ]

-    context = OpenAILLMContext(messages, tools)
-    context_aggregator = llm_main.create_context_aggregator(context)
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)

    # LLM + turn detection (with an extra LLM as a judge)
    llm = TurnDetectionLLM(llm_main)
@@ -577,7 +574,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_app_message")
-    async def on_app_message(transport, message):
+    async def on_app_message(transport, message, sender):
        logger.debug(f"Received app message: {message}")
        if "message" not in message:
            return
--- a/examples/foundational/22d-natural-conversation-gemini-audio.py
+++ b/examples/foundational/22d-natural-conversation-gemini-audio.py
@@ -9,7 +9,6 @@ import os
 import time

 from dotenv import load_dotenv
-from google.genai.types import Content, Part
 from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
@@ -21,6 +20,7 @@ from pipecat.frames.frames import (
    FunctionCallResultFrame,
    InputAudioRawFrame,
    InterruptionFrame,
+    LLMContextFrame,
    LLMFullResponseStartFrame,
    LLMRunFrame,
    StartFrame,
@@ -34,20 +34,18 @@ from pipecat.pipeline.parallel_pipeline import ParallelPipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response import (
    LLMAssistantAggregatorParams,
    LLMAssistantResponseAggregator,
 )
-from pipecat.processors.aggregators.openai_llm_context import (
-    OpenAILLMContext,
-    OpenAILLMContextFrame,
-)
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.filters.function_filter import FunctionFilter
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.google.llm import GoogleLLMContext, GoogleLLMService
+from pipecat.services.google.llm import GoogleLLMService
 from pipecat.services.llm_service import LLMService
 from pipecat.sync.base_notifier import BaseNotifier
 from pipecat.sync.event_notifier import EventNotifier
@@ -375,7 +373,7 @@ class AudioAccumulator(FrameProcessor):
        await super().process_frame(frame, direction)

        # ignore context frame
-        if isinstance(frame, OpenAILLMContextFrame):
+        if isinstance(frame, LLMContextFrame):
            return

        if isinstance(frame, TranscriptionFrame):
@@ -392,9 +390,9 @@ class AudioAccumulator(FrameProcessor):
                f"Processing audio buffer seconds: ({len(self._audio_frames)}) ({len(data)}) {len(data) / 2 / 16000}"
            )
            self._user_speaking = False
-            context = GoogleLLMContext()
+            context = LLMContext()
            context.add_audio_frames_message(audio_frames=self._audio_frames)
-            await self.push_frame(OpenAILLMContextFrame(context=context))
+            await self.push_frame(LLMContextFrame(context=context))
        elif isinstance(frame, InputAudioRawFrame):
            # Append the audio frame to our buffer. Treat the buffer as a ring buffer, dropping the oldest
            # frames as necessary.
@@ -513,7 +511,7 @@ class LLMAggregatorBuffer(LLMAssistantResponseAggregator):
 class ConversationAudioContextAssembler(FrameProcessor):
    """Takes the single-message context generated by the AudioAccumulator and adds it to the conversation LLM's context."""

-    def __init__(self, context: OpenAILLMContext, **kwargs):
+    def __init__(self, context: LLMContext, **kwargs):
        super().__init__(**kwargs)
        self._context = context

@@ -525,11 +523,10 @@ class ConversationAudioContextAssembler(FrameProcessor):
            await self.push_frame(frame, direction)
            return

-        if isinstance(frame, OpenAILLMContextFrame):
-            GoogleLLMContext.upgrade_to_google(self._context)
-            last_message = frame.context.messages[-1]
+        if isinstance(frame, LLMContextFrame):
+            last_message = frame.context.get_messages()[-1]
            self._context._messages.append(last_message)
-            await self.push_frame(OpenAILLMContextFrame(context=self._context))
+            await self.push_frame(LLMContextFrame(context=self._context))


 class OutputGate(FrameProcessor):
@@ -543,7 +540,7 @@ class OutputGate(FrameProcessor):
    def __init__(
        self,
        notifier: BaseNotifier,
-        context: OpenAILLMContext,
+        context: LLMContext,
        llm_transcription_buffer: LLMAggregatorBuffer,
        **kwargs,
    ):
@@ -610,19 +607,23 @@ class OutputGate(FrameProcessor):
            self._gate_task = None

    async def _gate_task_handler(self):
-        await self._notifier.wait()
+        while True:
+            try:
+                await self._notifier.wait()

-        transcription = await self._transcription_buffer.wait_for_transcription() or "-"
-        self._context.add_message(Content(role="user", parts=[Part(text=transcription)]))
+                transcription = await self._transcription_buffer.wait_for_transcription() or "-"
+                self._context.add_message({"role": "user", "content": transcription})

-        self.open_gate()
-        for frame, direction in self._frames_buffer:
-            await self.push_frame(frame, direction)
-        self._frames_buffer = []
+                self.open_gate()
+                for frame, direction in self._frames_buffer:
+                    await self.push_frame(frame, direction)
+                self._frames_buffer = []
+            except asyncio.CancelledError:
+                break


 class TurnDetectionLLM(Pipeline):
-    def __init__(self, llm: LLMService, context: OpenAILLMContext):
+    def __init__(self, llm: LLMService, context: LLMContext):
        # This is the LLM that will transcribe user speech.
        tx_llm = GoogleLLMService(
            name="Transcriber",
@@ -648,10 +649,10 @@ class TurnDetectionLLM(Pipeline):
        # as complete or incomplete.
        # statement_judge_context_filter = StatementJudgeAudioContextAccumulator(notifier=notifier)

-        audio_accumulater = AudioAccumulator()
+        audio_accumulator = AudioAccumulator()
        # This sends a UserStoppedSpeakingFrame and triggers the notifier event
        completeness_check = CompletenessCheck(
-            notifier=notifier, audio_accumulator=audio_accumulater
+            notifier=notifier, audio_accumulator=audio_accumulator
        )

        async def block_user_stopped_speaking(frame):
@@ -667,7 +668,7 @@ class TurnDetectionLLM(Pipeline):

        super().__init__(
            [
-                audio_accumulater,
+                audio_accumulator,
                ParallelPipeline(
                    [
                        # Pass everything except UserStoppedSpeaking to the elements after
@@ -734,8 +735,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        system_instruction=conversation_system_instruction,
    )

-    context = OpenAILLMContext()
-    context_aggregator = conversation_llm.create_context_aggregator(context)
+    context = LLMContext()
+    context_aggregator = LLMContextAggregatorPair(context)

    llm = TurnDetectionLLM(conversation_llm, context)

@@ -761,12 +762,10 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
-        # Kick off the conversation.
-        await task.queue_frames([LLMRunFrame()])

    @transport.event_handler("on_app_message")
-    async def on_app_message(transport, message):
-        logger.debug(f"Received app message: {message}")
+    async def on_app_message(transport, message, sender):
+        logger.debug(f"Received app message: {message}, sender: {sender}")  # TODO: revert
        if "message" not in message:
            return

--- a/examples/foundational/25-google-audio-in.py
+++ b/examples/foundational/25-google-audio-in.py
@@ -8,13 +8,13 @@ import os
 from dataclasses import dataclass

 from dotenv import load_dotenv
-from google.genai.types import Content, Part
 from loguru import logger

 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.frames.frames import (
    Frame,
    InputAudioRawFrame,
+    LLMContextFrame,
    LLMFullResponseEndFrame,
    LLMRunFrame,
    SystemFrame,
@@ -27,15 +27,13 @@ from pipecat.pipeline.parallel_pipeline import ParallelPipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import (
-    OpenAILLMContext,
-    OpenAILLMContextFrame,
-)
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frame_processor import FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.google.llm import GoogleLLMContext, GoogleLLMService
+from pipecat.services.google.llm import GoogleLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -101,9 +99,7 @@ class UserAudioCollector(FrameProcessor):
        elif isinstance(frame, UserStoppedSpeakingFrame):
            self._user_speaking = False
            self._context.add_audio_frames_message(audio_frames=self._audio_frames)
-            await self._user_context_aggregator.push_frame(
-                self._user_context_aggregator.get_context_frame()
-            )
+            await self._user_context_aggregator.push_frame(LLMContextFrame(context=self._context))
        elif isinstance(frame, InputAudioRawFrame):
            if self._user_speaking:
                self._audio_frames.append(frame)
@@ -121,10 +117,10 @@ class UserAudioCollector(FrameProcessor):


 class InputTranscriptionContextFilter(FrameProcessor):
-    """This FrameProcessor blocks all frames except the OpenAILLMContextFrame that triggers
+    """This FrameProcessor blocks all frames except the LLMContextFrame that triggers
    LLM inference. (And system frames, which are needed for the pipeline element lifecycle.)

-    We take the context object out of the OpenAILLMContextFrame and use it to create a new
+    We take the context object out of the LLMContextFrame and use it to create a new
    context object that we will send to the transcriber LLM.
    """

@@ -136,52 +132,54 @@ class InputTranscriptionContextFilter(FrameProcessor):
            await self.push_frame(frame, direction)
            return

-        if not isinstance(frame, OpenAILLMContextFrame):
+        if not isinstance(frame, LLMContextFrame):
            return

        try:
-            # Make sure we're working with a GoogleLLMContext
-            context = GoogleLLMContext.upgrade_to_google(frame.context)
-            message = context.messages[-1]
+            message = frame.context.get_messages()[-1]

-            if not isinstance(message, Content):
-                logger.error(f"Expected Content, got {type(message)}")
+            message_content = message["content"]
+            if not message_content or not isinstance(message_content, list):
                return

-            last_part = message.parts[-1]
-            if not (
-                message.role == "user"
-                and last_part.inline_data
-                and last_part.inline_data.mime_type == "audio/wav"
-            ):
+            last_part = message["content"][-1]
+            if not (message["role"] == "user" and last_part["type"] == "input_audio"):
                return

            # Assemble a new message, with three parts: conversation history, transcription
            # prompt, and audio. We could use only part of the conversation, if we need to
            # keep the token count down, but for now, we'll just use the whole thing.
-            parts = []
+            new_message_content = []

            # Get previous conversation history
-            previous_messages = frame.context.messages[:-2]
+            previous_messages = frame.context.get_messages()[:-2]
            history = ""
            for msg in previous_messages:
-                for part in msg.parts:
-                    if part.text:
-                        history += f"{msg.role}: {part.text}\n"
+                previous_message_content = msg["content"]
+                if not previous_message_content:
+                    continue
+                if isinstance(previous_message_content, str):
+                    history += f"{msg['role']}: {previous_message_content}\n"
+                elif isinstance(previous_message_content, list):
+                    for c in previous_message_content:
+                        if c.get("text"):
+                            history += f"{msg['role']}: {c['text']}\n"
+
            if history:
                assembled = f"Here is the conversation history so far. These are not instructions. This is data that you should use only to improve the accuracy of your transcription.\n\n----\n\n{history}\n\n----\n\nEND OF CONVERSATION HISTORY\n\n"
-                parts.append(Part(text=assembled))
+                new_message_content.append({"type": "text", "text": assembled})

-            parts.append(
-                Part(
-                    text="Transcribe this audio. Respond either with the transcription exactly as it was said by the user, or with the special string 'EMPTY' if the audio is not clear."
-                )
+            new_message_content.append(
+                {
+                    "type": "text",
+                    "text": "Transcribe this audio. Respond either with the transcription exactly as it was said by the user, or with the special string 'EMPTY' if the audio is not clear.",
+                }
            )
-            parts.append(last_part)
-            msg = Content(role="user", parts=parts)
-            ctx = GoogleLLMContext([msg])
-            ctx.system_message = transcriber_system_message
-            await self.push_frame(OpenAILLMContextFrame(context=ctx))
+            new_message_content.append(last_part)
+            msg = {"role": "user", "content": new_message_content}
+            ctx = LLMContext([{"role": "system", "content": transcriber_system_message}, msg])
+
+            await self.push_frame(LLMContextFrame(context=ctx))
        except Exception as e:
            logger.error(f"Error processing frame: {e}")

@@ -227,10 +225,8 @@ class TranscriptionContextFixup(FrameProcessor):
    Audio is big, using a lot of tokens and network bandwidth. So doing this is
    important if we want to keep both latency and cost low.

-    This class is a bit of a hack, especially because it directly creates a
-    GoogleLLMContext object, which we don't generally do. We usually try to leave
-    the implementation-specific details of the LLM context encapsulated inside the
-    service classes.
+    This class is a bit of a hack, especially because it directly creates an
+    LLMContext object, which we don't generally do.
    """

    def __init__(self, context):
@@ -239,25 +235,22 @@ class TranscriptionContextFixup(FrameProcessor):
        self._transcript = "THIS IS A TRANSCRIPT"

    def is_user_audio_message(self, message):
-        last_part = message.parts[-1]
-        return (
-            message.role == "user"
-            and last_part.inline_data
-            and last_part.inline_data.mime_type == "audio/wav"
-        )
+        message_content = message["content"]
+        if not message_content or not isinstance(message_content, list):
+            return False
+        last_part = message["content"][-1]
+        return message["role"] == "user" and last_part["type"] == "input_audio"

    def swap_user_audio(self):
        if not self._transcript:
            return
-        message = self._context.messages[-2]
+        message = self._context.get_messages()[-2]
        if not self.is_user_audio_message(message):
-            message = self._context.messages[-1]
+            message = self._context.get_messages()[-1]
            if not self.is_user_audio_message(message):
                return

-        audio_part = message.parts[-1]
-        audio_part.inline_data = None
-        audio_part.text = self._transcript
+        message["content"] = self._transcript

    async def process_frame(self, frame, direction):
        await super().process_frame(frame, direction)
@@ -327,8 +320,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        },
    ]

-    context = OpenAILLMContext(messages)
-    context_aggregator = conversation_llm.create_context_aggregator(context)
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
    audio_collector = UserAudioCollector(context, context_aggregator.user())
    input_transcription_context_filter = InputTranscriptionContextFilter()
    transcription_frames_emitter = InputTranscriptionFrameEmitter()
--- a/examples/foundational/39-mcp-stdio.py
+++ b/examples/foundational/39-mcp-stdio.py
@@ -206,6 +206,14 @@ async def bot(runner_args: RunnerArguments):


 if __name__ == "__main__":
+    if not os.getenv("NASA_API_KEY"):
+        logger.error(
+            f"Please set NASA_API_KEY environment variable for this example. See https://api.nasa.gov"
+        )
+        import sys
+
+        sys.exit(1)
+
    from pipecat.runner.run import main

    main()
--- a/examples/foundational/39a-mcp-run-sse.py
+++ b/examples/foundational/39a-mcp-run-sse.py
@@ -141,6 +141,14 @@ async def bot(runner_args: RunnerArguments):


 if __name__ == "__main__":
+    if not os.getenv("MCP_RUN_SSE_URL"):
+        logger.error(
+            f"Please set MCP_RUN_SSE_URL environment variable for this example. See https://mcp.run"
+        )
+        import sys
+
+        sys.exit(1)
+
    from pipecat.runner.run import main

    main()
--- a/examples/foundational/39b-multiple-mcp.py
+++ b/examples/foundational/39b-multiple-mcp.py
@@ -219,6 +219,14 @@ async def bot(runner_args: RunnerArguments):


 if __name__ == "__main__":
+    if not os.getenv("NASA_API_KEY") or not os.getenv("MCP_RUN_SSE_URL"):
+        logger.error(
+            f"Please set NASA_API_KEY and MCP_RUN_SSE_URL environment variables. See https://api.nasa.gov and https://mcp.run"
+        )
+        import sys
+
+        sys.exit(1)
+
    from pipecat.runner.run import main

    main()
--- a/examples/foundational/39c-mcp-run-http.py
+++ b/examples/foundational/39c-mcp-run-http.py
@@ -145,6 +145,14 @@ async def bot(runner_args: RunnerArguments):


 if __name__ == "__main__":
+    if not os.getenv("GITHUB_PERSONAL_ACCESS_TOKEN"):
+        logger.error(
+            f"Please set GITHUB_PERSONAL_ACCESS_TOKEN environment variable for this example."
+        )
+        import sys
+
+        sys.exit(1)
+
    from pipecat.runner.run import main

    main()
--- a/examples/foundational/43-heygen-transport.py
+++ b/examples/foundational/43-heygen-transport.py
@@ -0,0 +1,113 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import os
+import sys
+
+import aiohttp
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response_universal import (
+    LLMContext,
+    LLMContextAggregatorPair,
+)
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.google.llm import GoogleLLMService
+from pipecat.transports.heygen.transport import HeyGenParams, HeyGenTransport
+
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def main():
+    async with aiohttp.ClientSession() as session:
+        transport = HeyGenTransport(
+            api_key=os.getenv("HEYGEN_API_KEY"),
+            session=session,
+            params=HeyGenParams(
+                audio_in_enabled=True,
+                audio_out_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+                turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+            ),
+        )
+
+        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+        tts = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_id="00967b2f-88a6-4a31-8153-110a92134b9f",
+        )
+
+        llm = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful assistant. Your output will be converted to audio so don't include special characters in your answers. Be succinct and respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        context = LLMContext(messages)
+        context_aggregator = LLMContextAggregatorPair(context)
+
+        pipeline = Pipeline(
+            [
+                transport.input(),  # Transport user input
+                stt,  # STT
+                context_aggregator.user(),  # User responses
+                llm,  # LLM
+                tts,  # TTS
+                transport.output(),  # Transport bot output
+                context_aggregator.assistant(),  # Assistant spoken responses
+            ]
+        )
+
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                enable_metrics=True,
+                enable_usage_metrics=True,
+            ),
+        )
+
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            logger.info(f"Client connected")
+            # Kick off the conversation.
+            messages.append(
+                {
+                    "role": "system",
+                    "content": "Start by saying 'Hello' and then a short greeting.",
+                }
+            )
+            await task.queue_frames([LLMRunFrame()])
+
+        @transport.event_handler("on_client_disconnected")
+        async def on_client_disconnected(transport, client):
+            logger.info(f"Client disconnected")
+            await task.cancel()
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
--- a/examples/foundational/44-voicemail-detection.py
+++ b/examples/foundational/44-voicemail-detection.py
@@ -14,13 +14,12 @@ from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnal
 from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.extensions.voicemail.voicemail_detector import VoicemailDetector
-from pipecat.frames.frames import EndTaskFrame, TTSSpeakFrame
+from pipecat.frames.frames import TTSSpeakFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
-from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
--- a/examples/foundational/assets/pre-recorded.wav
+++ b/examples/foundational/assets/pre-recorded.wav
--- a/examples/quickstart/pyproject.toml
+++ b/examples/quickstart/pyproject.toml
@@ -4,7 +4,7 @@ version = "0.1.0"
 description = "Quickstart example for building voice AI bots with Pipecat"
 requires-python = ">=3.10"
 dependencies = [
-    "pipecat-ai[webrtc,daily,silero,deepgram,openai,cartesia,local-smart-turn-v3,runner]>=0.0.85",
+    "pipecat-ai[webrtc,daily,silero,deepgram,openai,cartesia,local-smart-turn-v3,runner]>=0.0.86",
    "pipecatcloud>=0.2.4"
 ]

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -93,7 +93,7 @@ riva = [ "nvidia-riva-client~=2.21.1" ]
 runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<0.117.0", "pipecat-ai-small-webrtc-prebuilt>=1.0.0"]
 sambanova = []
 sarvam = [ "pipecat-ai[websockets-base]" ]
-sentry = [ "sentry-sdk~=2.23.1" ]
+sentry = [ "sentry-sdk>=2.28.0,<3" ]
 local-smart-turn = [ "coremltools>=8.0", "transformers", "torch>=2.5.0,<3", "torchaudio>=2.5.0,<3" ]
 local-smart-turn-v3 = [ "transformers", "onnxruntime>=1.20.1,<2" ]
 remote-smart-turn = []
@@ -107,7 +107,7 @@ tavus=[]
 together = []
 tracing = [ "opentelemetry-sdk>=1.33.0", "opentelemetry-api>=1.33.0", "opentelemetry-instrumentation>=0.54b0" ]
 ultravox = [ "transformers>=4.48.0", "vllm>=0.9.0" ]
-webrtc = [ "aiortc~=1.13.0", "opencv-python~=4.11.0.86" ]
+webrtc = [ "aiortc>=1.13.0,<2", "opencv-python>=4.11.0.86,<5" ]
 websocket = [ "pipecat-ai[websockets-base]", "fastapi>=0.115.6,<0.117.0" ]
 websockets-base = [ "websockets>=13.1,<16.0" ]
 whisper = [ "faster-whisper~=1.1.1" ]
--- a/scripts/evals/eval.py
+++ b/scripts/evals/eval.py
@@ -34,7 +34,8 @@ from pipecat.frames.frames import EndTaskFrame, LLMRunFrame, OutputImageRawFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
 from pipecat.processors.frame_processor import FrameDirection
 from pipecat.runner.types import RunnerArguments
@@ -283,8 +284,8 @@ async def run_eval_pipeline(
        },
    ]

-    context = OpenAILLMContext(messages, tools)
-    context_aggregator = llm.create_context_aggregator(context)
+    context = LLMContext(messages, tools)
+    context_aggregator = LLMContextAggregatorPair(context)

    audio_buffer = AudioBufferProcessor()

--- a/scripts/evals/run-release-evals.py
+++ b/scripts/evals/run-release-evals.py
@@ -83,6 +83,7 @@ TESTS_07 = [
    ("07k-interruptible-lmnt.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    ("07l-interruptible-groq.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    ("07m-interruptible-aws.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("07m-interruptible-aws-strands.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("07n-interruptible-gemini.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    ("07n-interruptible-google.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    ("07o-interruptible-assemblyai.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
--- a/src/pipecat/adapters/services/open_ai_adapter.py
+++ b/src/pipecat/adapters/services/open_ai_adapter.py
@@ -105,6 +105,8 @@ class OpenAILLMAdapter(BaseLLMAdapter[OpenAILLMInvocationParams]):
                        if item["type"] == "image_url":
                            if item["image_url"]["url"].startswith("data:image/"):
                                item["image_url"]["url"] = "data:image/..."
+                        if item["type"] == "input_audio":
+                            item["input_audio"]["data"] = "..."
            if "mime_type" in msg and msg["mime_type"].startswith("image/"):
                msg["data"] = "..."
            msgs.append(msg)
--- a/src/pipecat/audio/turn/base_turn_analyzer.py
+++ b/src/pipecat/audio/turn/base_turn_analyzer.py
@@ -14,6 +14,8 @@ from abc import ABC, abstractmethod
 from enum import Enum
 from typing import Optional, Tuple

+from pydantic import BaseModel
+
 from pipecat.metrics.metrics import MetricsData


@@ -29,6 +31,12 @@ class EndOfTurnState(Enum):
    INCOMPLETE = 2


+class BaseTurnParams(BaseModel):
+    """Base class for turn analyzer parameters."""
+
+    pass
+
+
 class BaseTurnAnalyzer(ABC):
    """Abstract base class for analyzing user end of turn.

@@ -78,7 +86,7 @@ class BaseTurnAnalyzer(ABC):

    @property
    @abstractmethod
-    def params(self):
+    def params(self) -> BaseTurnParams:
        """Get the current turn analyzer parameters.

        Returns:
--- a/src/pipecat/audio/turn/smart_turn/base_smart_turn.py
+++ b/src/pipecat/audio/turn/smart_turn/base_smart_turn.py
@@ -11,15 +11,17 @@ machine learning models to determine when a user has finished speaking, going
 beyond simple silence-based detection.
 """

+import asyncio
 import time
 from abc import abstractmethod
+from concurrent.futures import ThreadPoolExecutor
 from typing import Any, Dict, Optional, Tuple

 import numpy as np
 from loguru import logger
 from pydantic import BaseModel

-from pipecat.audio.turn.base_turn_analyzer import BaseTurnAnalyzer, EndOfTurnState
+from pipecat.audio.turn.base_turn_analyzer import BaseTurnAnalyzer, BaseTurnParams, EndOfTurnState
 from pipecat.metrics.metrics import MetricsData, SmartTurnMetricsData

 # Default timing parameters
@@ -29,7 +31,7 @@ MAX_DURATION_SECONDS = 8  # Max allowed segment duration
 USE_ONLY_LAST_VAD_SEGMENT = True


-class SmartTurnParams(BaseModel):
+class SmartTurnParams(BaseTurnParams):
    """Configuration parameters for smart turn analysis.

    Parameters:
@@ -77,6 +79,9 @@ class BaseSmartTurn(BaseTurnAnalyzer):
        self._speech_triggered = False
        self._silence_ms = 0
        self._speech_start_time = 0
+        # Thread executor that will run the model. We only need one thread per
+        # analyzer because one analyzer just handles one audio stream.
+        self._executor = ThreadPoolExecutor(max_workers=1)

    @property
    def speech_triggered(self) -> bool:
@@ -151,7 +156,10 @@ class BaseSmartTurn(BaseTurnAnalyzer):
            Tuple containing the end-of-turn state and optional metrics data
            from the ML model analysis.
        """
-        state, result = await self._process_speech_segment(self._audio_buffer)
+        loop = asyncio.get_running_loop()
+        state, result = await loop.run_in_executor(
+            self._executor, self._process_speech_segment, self._audio_buffer
+        )
        if state == EndOfTurnState.COMPLETE or USE_ONLY_LAST_VAD_SEGMENT:
            self._clear(state)
        logger.debug(f"End of Turn result: {state}")
@@ -169,9 +177,7 @@ class BaseSmartTurn(BaseTurnAnalyzer):
        self._speech_start_time = 0
        self._silence_ms = 0

-    async def _process_speech_segment(
-        self, audio_buffer
-    ) -> Tuple[EndOfTurnState, Optional[MetricsData]]:
+    def _process_speech_segment(self, audio_buffer) -> Tuple[EndOfTurnState, Optional[MetricsData]]:
        """Process accumulated audio segment using ML model."""
        state = EndOfTurnState.INCOMPLETE

@@ -203,7 +209,7 @@ class BaseSmartTurn(BaseTurnAnalyzer):
        if len(segment_audio) > 0:
            start_time = time.perf_counter()
            try:
-                result = await self._predict_endpoint(segment_audio)
+                result = self._predict_endpoint(segment_audio)
                state = (
                    EndOfTurnState.COMPLETE
                    if result["prediction"] == 1
@@ -249,6 +255,6 @@ class BaseSmartTurn(BaseTurnAnalyzer):
        return state, result_data

    @abstractmethod
-    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using ML model from audio data."""
        pass
--- a/src/pipecat/audio/turn/smart_turn/http_smart_turn.py
+++ b/src/pipecat/audio/turn/smart_turn/http_smart_turn.py
@@ -104,11 +104,15 @@ class HttpSmartTurnAnalyzer(BaseSmartTurn):
            logger.error(f"Failed to send raw request to Daily Smart Turn: {e}")
            raise Exception("Failed to send raw request to Daily Smart Turn.")

-    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using remote HTTP ML service."""
        try:
            serialized_array = self._serialize_array(audio_array)
-            return await self._send_raw_request(serialized_array)
+            loop = asyncio.get_running_loop()
+            future = asyncio.run_coroutine_threadsafe(
+                self._send_raw_request(serialized_array), loop
+            )
+            return future.result()
        except Exception as e:
            logger.error(f"Smart turn prediction failed: {str(e)}")
            # Return an incomplete prediction when a failure occurs
--- a/src/pipecat/audio/turn/smart_turn/local_smart_turn.py
+++ b/src/pipecat/audio/turn/smart_turn/local_smart_turn.py
@@ -64,7 +64,7 @@ class LocalSmartTurnAnalyzer(BaseSmartTurn):
        self._turn_model.eval()
        logger.debug("Loaded Local Smart Turn")

-    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using local PyTorch model."""
        inputs = self._turn_processor(
            audio_array,
--- a/src/pipecat/audio/turn/smart_turn/local_smart_turn_v2.py
+++ b/src/pipecat/audio/turn/smart_turn/local_smart_turn_v2.py
@@ -73,7 +73,7 @@ class LocalSmartTurnAnalyzerV2(BaseSmartTurn):
        self._turn_model.eval()
        logger.debug("Loaded Local Smart Turn v2")

-    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using local PyTorch model."""
        inputs = self._turn_processor(
            audio_array,
--- a/src/pipecat/audio/turn/smart_turn/local_smart_turn_v3.py
+++ b/src/pipecat/audio/turn/smart_turn/local_smart_turn_v3.py
@@ -77,7 +77,7 @@ class LocalSmartTurnAnalyzerV3(BaseSmartTurn):

        logger.debug("Loaded Local Smart Turn v3")

-    async def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
+    def _predict_endpoint(self, audio_array: np.ndarray) -> Dict[str, Any]:
        """Predict end-of-turn using local ONNX model."""

        def truncate_audio_to_last_n_seconds(audio_array, n_seconds=8, sample_rate=16000):
--- a/src/pipecat/audio/vad/vad_analyzer.py
+++ b/src/pipecat/audio/vad/vad_analyzer.py
@@ -11,7 +11,9 @@ data structures for voice activity detection in audio streams. Includes state
 management, parameter configuration, and audio analysis framework.
 """

+import asyncio
 from abc import ABC, abstractmethod
+from concurrent.futures import ThreadPoolExecutor
 from enum import Enum
 from typing import Optional

@@ -84,6 +86,10 @@ class VADAnalyzer(ABC):
        self._smoothing_factor = 0.2
        self._prev_volume = 0

+        # Thread executor that will run the model. We only need one thread per
+        # analyzer because one analyzer just handles one audio stream.
+        self._executor = ThreadPoolExecutor(max_workers=1)
+
    @property
    def sample_rate(self) -> int:
        """Get the current sample rate.
@@ -165,7 +171,7 @@ class VADAnalyzer(ABC):
        volume = calculate_audio_volume(audio, self.sample_rate)
        return exp_smoothing(volume, self._prev_volume, self._smoothing_factor)

-    def analyze_audio(self, buffer) -> VADState:
+    async def analyze_audio(self, buffer: bytes) -> VADState:
        """Analyze audio buffer and return current VAD state.

        Processes incoming audio data, maintains internal state, and determines
@@ -177,6 +183,12 @@ class VADAnalyzer(ABC):
        Returns:
            Current VAD state after processing the buffer.
        """
+        loop = asyncio.get_running_loop()
+        state = await loop.run_in_executor(self._executor, self._run_analyzer, buffer)
+        return state
+
+    def _run_analyzer(self, buffer: bytes) -> VADState:
+        """Analyze audio buffer and return current VAD state."""
        self._vad_buffer += buffer

        num_required_bytes = self._vad_frames_num_bytes
--- a/src/pipecat/extensions/voicemail/voicemail_detector.py
+++ b/src/pipecat/extensions/voicemail/voicemail_detector.py
@@ -36,7 +36,8 @@ from pipecat.frames.frames import (
    UserStoppedSpeakingFrame,
 )
 from pipecat.pipeline.parallel_pipeline import ParallelPipeline
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor, FrameProcessorSetup
 from pipecat.services.llm_service import LLMService
 from pipecat.sync.base_notifier import BaseNotifier
@@ -614,8 +615,8 @@ VOICEMAIL SYSTEM (respond "VOICEMAIL"):
        ]

        # Create the LLM context and aggregators for conversation management
-        self._context = OpenAILLMContext(self._messages)
-        self._context_aggregator = llm.create_context_aggregator(self._context)
+        self._context = LLMContext(self._messages)
+        self._context_aggregator = LLMContextAggregatorPair(self._context)

        # Create notification system for coordinating between components
        self._gate_notifier = EventNotifier()  # Signals classification completion
--- a/src/pipecat/pipeline/task.py
+++ b/src/pipecat/pipeline/task.py
@@ -13,8 +13,7 @@ including heartbeats, idle detection, and observer integration.

 import asyncio
 import time
-from collections import deque
-from typing import Any, AsyncIterable, Deque, Dict, Iterable, List, Optional, Tuple, Type
+from typing import Any, AsyncIterable, Dict, Iterable, List, Optional, Tuple, Type

 from loguru import logger
 from pydantic import BaseModel, ConfigDict, Field
@@ -31,7 +30,6 @@ from pipecat.frames.frames import (
    ErrorFrame,
    Frame,
    HeartbeatFrame,
-    InputAudioRawFrame,
    InterruptionFrame,
    InterruptionTaskFrame,
    MetricsFrame,
@@ -395,7 +393,8 @@ class PipelineTask(BasePipelineTask):
        Cancels all running tasks and stops frame processing without
        waiting for completion.
        """
-        await self._cancel()
+        if not self._finished:
+            await self._cancel()

    async def run(self, params: PipelineTaskParams):
        """Start and manage the pipeline execution until completion or cancellation.
--- a/src/pipecat/processors/aggregators/llm_response_universal.py
+++ b/src/pipecat/processors/aggregators/llm_response_universal.py
@@ -13,6 +13,7 @@ LLM processing, and text-to-speech components in conversational AI pipelines.

 import asyncio
 import json
+from abc import abstractmethod
 from typing import Any, Dict, List, Literal, Optional, Set

 from loguru import logger
@@ -169,6 +170,11 @@ class LLMContextAggregator(FrameProcessor):
        """Reset the aggregation state."""
        self._aggregation = ""

+    @abstractmethod
+    async def push_aggregation(self):
+        """Push the current aggregation downstream."""
+        pass
+

 class LLMUserAggregator(LLMContextAggregator):
    """User LLM aggregator that processes speech-to-text transcriptions.
@@ -301,7 +307,7 @@ class LLMUserAggregator(LLMContextAggregator):
        frame = LLMContextFrame(self._context)
        await self.push_frame(frame)

-    async def _push_aggregation(self):
+    async def push_aggregation(self):
        """Push the current aggregation based on interruption strategies and conditions."""
        if len(self._aggregation) > 0:
            if self.interruption_strategies and self._bot_speaking:
@@ -392,7 +398,7 @@ class LLMUserAggregator(LLMContextAggregator):
        # pushing the aggregation as we will probably get a final transcription.
        if len(self._aggregation) > 0:
            if not self._seen_interim_results:
-                await self._push_aggregation()
+                await self.push_aggregation()
        # Handles the case where both the user and the bot are not speaking,
        # and the bot was previously speaking before the user interruption.
        # So in this case we are resetting the aggregation timer
@@ -471,7 +477,7 @@ class LLMUserAggregator(LLMContextAggregator):
                await self._maybe_emulate_user_speaking()
            except asyncio.TimeoutError:
                if not self._user_speaking:
-                    await self._push_aggregation()
+                    await self.push_aggregation()

                # If we are emulating VAD we still need to send the user stopped
                # speaking frame.
@@ -607,12 +613,12 @@ class LLMAssistantAggregator(LLMContextAggregator):
        elif isinstance(frame, UserImageRawFrame) and frame.request and frame.request.tool_call_id:
            await self._handle_user_image_frame(frame)
        elif isinstance(frame, BotStoppedSpeakingFrame):
-            await self._push_aggregation()
+            await self.push_aggregation()
            await self.push_frame(frame, direction)
        else:
            await self.push_frame(frame, direction)

-    async def _push_aggregation(self):
+    async def push_aggregation(self):
        """Push the current assistant aggregation with timestamp."""
        if not self._aggregation:
            return
@@ -644,7 +650,7 @@ class LLMAssistantAggregator(LLMContextAggregator):
            await self.push_context_frame(FrameDirection.UPSTREAM)

    async def _handle_interruptions(self, frame: InterruptionFrame):
-        await self._push_aggregation()
+        await self.push_aggregation()
        self._started = 0
        await self.reset()

@@ -778,7 +784,7 @@ class LLMAssistantAggregator(LLMContextAggregator):
            text=frame.request.context,
        )

-        await self._push_aggregation()
+        await self.push_aggregation()
        await self.push_context_frame(FrameDirection.UPSTREAM)

    async def _handle_llm_start(self, _: LLMFullResponseStartFrame):
@@ -786,7 +792,7 @@ class LLMAssistantAggregator(LLMContextAggregator):

    async def _handle_llm_end(self, _: LLMFullResponseEndFrame):
        self._started -= 1
-        await self._push_aggregation()
+        await self.push_aggregation()

    async def _handle_text(self, frame: TextFrame):
        if not self._started:
--- a/src/pipecat/processors/aggregators/user_response.py
+++ b/src/pipecat/processors/aggregators/user_response.py
@@ -12,14 +12,14 @@ in conversational pipelines.
 """

 from pipecat.frames.frames import TextFrame
-from pipecat.processors.aggregators.llm_response import LLMUserContextAggregator
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMUserAggregator


-class UserResponseAggregator(LLMUserContextAggregator):
+class UserResponseAggregator(LLMUserAggregator):
    """Aggregates user responses into TextFrame objects.

-    This aggregator extends LLMUserContextAggregator to specifically handle
+    This aggregator extends LLMUserAggregator to specifically handle
    user input by collecting text responses and outputting them as TextFrame
    objects when the aggregation is complete.
    """
@@ -28,9 +28,9 @@ class UserResponseAggregator(LLMUserContextAggregator):
        """Initialize the user response aggregator.

        Args:
-            **kwargs: Additional arguments passed to parent LLMUserContextAggregator.
+            **kwargs: Additional arguments passed to parent LLMUserAggregator.
        """
-        super().__init__(context=OpenAILLMContext(), **kwargs)
+        super().__init__(context=LLMContext(), **kwargs)

    async def push_aggregation(self):
        """Push the aggregated user response as a TextFrame.
--- a/src/pipecat/processors/audio/audio_buffer_processor.py
+++ b/src/pipecat/processors/audio/audio_buffer_processor.py
@@ -229,9 +229,12 @@ class AudioBufferProcessor(FrameProcessor):
            # Save time of frame so we can compute silence.
            self._last_bot_frame_at = time.time()

-        if self._buffer_size > 0 and len(self._user_audio_buffer) > self._buffer_size:
+        if self._buffer_size > 0 and (
+            len(self._user_audio_buffer) >= self._buffer_size
+            or len(self._bot_audio_buffer) >= self._buffer_size
+        ):
            await self._call_on_audio_data_handler()
-            self._reset_recording()
+            self._reset_primary_audio_buffers()

        # Process turn recording with preprocessed data.
        if self._enable_turn_audio:
@@ -272,9 +275,15 @@ class AudioBufferProcessor(FrameProcessor):

    async def _call_on_audio_data_handler(self):
        """Call the audio data event handlers with buffered audio."""
-        if not self.has_audio() or not self._recording:
+        if not self._recording:
            return

+        if len(self._user_audio_buffer) == 0 and len(self._bot_audio_buffer) == 0:
+            return
+
+        self._align_track_buffers()
+        flush_time = time.time()
+
        # Call original handler with merged audio
        merged_audio = self.merge_audio_buffers()
        await self._call_event_handler(
@@ -290,23 +299,49 @@ class AudioBufferProcessor(FrameProcessor):
            self._num_channels,
        )

+        self._last_user_frame_at = flush_time
+        self._last_bot_frame_at = flush_time
+
    def _buffer_has_audio(self, buffer: bytearray) -> bool:
        """Check if a buffer contains audio data."""
        return buffer is not None and len(buffer) > 0

    def _reset_recording(self):
        """Reset recording state and buffers."""
-        self._reset_audio_buffers()
+        self._reset_all_audio_buffers()
        self._last_user_frame_at = time.time()
        self._last_bot_frame_at = time.time()

-    def _reset_audio_buffers(self):
+    def _reset_all_audio_buffers(self):
        """Reset all audio buffers to empty state."""
+        self._reset_primary_audio_buffers()
+        self._reset_turn_audio_buffers()
+
+    def _reset_primary_audio_buffers(self):
+        """Clear user and bot buffers while preserving turn buffers and timestamps."""
        self._user_audio_buffer = bytearray()
        self._bot_audio_buffer = bytearray()
+
+    def _reset_turn_audio_buffers(self):
+        """Clear user and bot turn buffers while preserving primary buffers and timestamps."""
        self._user_turn_audio_buffer = bytearray()
        self._bot_turn_audio_buffer = bytearray()

+    def _align_track_buffers(self):
+        """Pad the shorter track with silence so both tracks stay in sync."""
+        user_len = len(self._user_audio_buffer)
+        bot_len = len(self._bot_audio_buffer)
+        if user_len == bot_len:
+            return
+
+        target_len = max(user_len, bot_len)
+        if user_len < target_len:
+            self._user_audio_buffer.extend(b"\x00" * (target_len - user_len))
+            self._last_user_frame_at = max(self._last_user_frame_at, self._last_bot_frame_at)
+        if bot_len < target_len:
+            self._bot_audio_buffer.extend(b"\x00" * (target_len - bot_len))
+            self._last_bot_frame_at = max(self._last_bot_frame_at, self._last_user_frame_at)
+
    async def _resample_input_audio(self, frame: InputAudioRawFrame) -> bytes:
        """Resample audio frame to the target sample rate."""
        return await self._input_resampler.resample(
--- a/src/pipecat/processors/frame_processor.py
+++ b/src/pipecat/processors/frame_processor.py
@@ -455,9 +455,13 @@ class FrameProcessor(BaseObject):
            name = f"{self}::{coroutine.cr_code.co_name}"
        return self.task_manager.create_task(coroutine, name)

-    async def cancel_task(self, task: asyncio.Task, timeout: Optional[float] = None):
+    async def cancel_task(self, task: asyncio.Task, timeout: Optional[float] = 1.0):
        """Cancel a task managed by this processor.

+        A default timeout if 1 second is used in order to avoid potential
+        freezes caused by certain libraries that swallow
+        `asyncio.CancelledError`.
+
        Args:
            task: The task to cancel.
            timeout: Optional timeout for task cancellation.
--- a/src/pipecat/processors/frameworks/rtvi.py
+++ b/src/pipecat/processors/frameworks/rtvi.py
@@ -13,6 +13,7 @@ and frame observation for the RTVI protocol.

 import asyncio
 import base64
+import time
 from dataclasses import dataclass
 from typing import (
    Any,
@@ -29,6 +30,7 @@ from typing import (
 from loguru import logger
 from pydantic import BaseModel, Field, PrivateAttr, ValidationError

+from pipecat.audio.utils import calculate_audio_volume
 from pipecat.frames.frames import (
    BotStartedSpeakingFrame,
    BotStoppedSpeakingFrame,
@@ -52,6 +54,7 @@ from pipecat.frames.frames import (
    SystemFrame,
    TranscriptionFrame,
    TransportMessageUrgentFrame,
+    TTSAudioRawFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
    TTSTextFrame,
@@ -613,9 +616,9 @@ class RTVIAppendToContextData(BaseModel):

    Contains the role, content, and whether to run the message immediately.

-        .. deprecated:: 0.0.85
-            The RTVI message, append-to-context, has been deprecated. Use send-text
-            or custom client and server messages instead.
+    .. deprecated:: 0.0.85
+        The RTVI message, append-to-context, has been deprecated. Use send-text
+        or custom client and server messages instead.
    """

    role: Literal["user", "assistant"] | str
@@ -839,6 +842,36 @@ class RTVIServerMessage(BaseModel):
    data: Any


+class RTVIAudioLevelMessageData(BaseModel):
+    """Data format for sending audio levels."""
+
+    value: float
+
+
+class RTVIUserAudioLevelMessage(BaseModel):
+    """Message indicating user audio level."""
+
+    label: RTVIMessageLiteral = RTVI_MESSAGE_LABEL
+    type: Literal["user-audio-level"] = "user-audio-level"
+    data: RTVIAudioLevelMessageData
+
+
+class RTVIBotAudioLevelMessage(BaseModel):
+    """Message indicating bot audio level."""
+
+    label: RTVIMessageLiteral = RTVI_MESSAGE_LABEL
+    type: Literal["bot-audio-level"] = "bot-audio-level"
+    data: RTVIAudioLevelMessageData
+
+
+class RTVISystemLogMessage(BaseModel):
+    """Message including a system log."""
+
+    label: RTVIMessageLiteral = RTVI_MESSAGE_LABEL
+    type: Literal["system-log"] = "system-log"
+    data: RTVITextMessageData
+
+
@dataclass
 class RTVIServerMessageFrame(SystemFrame):
    """A frame for sending server messages to the client.
@@ -858,25 +891,36 @@ class RTVIServerMessageFrame(SystemFrame):
 class RTVIObserverParams:
    """Parameters for configuring RTVI Observer behavior.

+    .. deprecated:: 0.0.87
+        Parameter `errors_enabled` is deprecated. Error messages are always enabled.
+
    Parameters:
        bot_llm_enabled: Indicates if the bot's LLM messages should be sent.
        bot_tts_enabled: Indicates if the bot's TTS messages should be sent.
        bot_speaking_enabled: Indicates if the bot's started/stopped speaking messages should be sent.
+        bot_audio_level_enabled: Indicates if bot's audio level messages should be sent.
        user_llm_enabled: Indicates if the user's LLM input messages should be sent.
        user_speaking_enabled: Indicates if the user's started/stopped speaking messages should be sent.
        user_transcription_enabled: Indicates if user's transcription messages should be sent.
+        user_audio_level_enabled: Indicates if user's audio level messages should be sent.
        metrics_enabled: Indicates if metrics messages should be sent.
-        errors_enabled: Indicates if errors messages should be sent.
+        system_logs_enabled: Indicates if system logs should be sent.
+        errors_enabled: [Deprecated] Indicates if errors messages should be sent.
+        audio_level_period_secs: How often audio levels should be sent if enabled.
    """

    bot_llm_enabled: bool = True
    bot_tts_enabled: bool = True
    bot_speaking_enabled: bool = True
+    bot_audio_level_enabled: bool = False
    user_llm_enabled: bool = True
    user_speaking_enabled: bool = True
    user_transcription_enabled: bool = True
+    user_audio_level_enabled: bool = False
    metrics_enabled: bool = True
+    system_logs_enabled: bool = False
    errors_enabled: bool = True
+    audio_level_period_secs: float = 0.15


 class RTVIObserver(BaseObserver):
@@ -892,7 +936,11 @@ class RTVIObserver(BaseObserver):
    """

    def __init__(
-        self, rtvi: "RTVIProcessor", *, params: Optional[RTVIObserverParams] = None, **kwargs
+        self,
+        rtvi: Optional["RTVIProcessor"] = None,
+        *,
+        params: Optional[RTVIObserverParams] = None,
+        **kwargs,
    ):
        """Initialize the RTVI observer.

@@ -904,9 +952,50 @@ class RTVIObserver(BaseObserver):
        super().__init__(**kwargs)
        self._rtvi = rtvi
        self._params = params or RTVIObserverParams()
-        self._bot_transcription = ""
+
        self._frames_seen = set()
-        rtvi.set_errors_enabled(self._params.errors_enabled)
+
+        self._bot_transcription = ""
+        self._last_user_audio_level = 0
+        self._last_bot_audio_level = 0
+
+        if self._params.system_logs_enabled:
+            self._system_logger_id = logger.add(self._logger_sink)
+
+        if self._params.errors_enabled:
+            import warnings
+
+            with warnings.catch_warnings():
+                warnings.simplefilter("always")
+                warnings.warn(
+                    "Parameter `errors_enabled` is deprecated. Error messages are always enabled.",
+                    DeprecationWarning,
+                )
+
+    async def _logger_sink(self, message):
+        """Logger sink so we cna send system logs to RTVI clients."""
+        message = RTVISystemLogMessage(data=RTVITextMessageData(text=message))
+        await self.send_rtvi_message(message)
+
+    async def cleanup(self):
+        """Cleanup RTVI observer resources."""
+        await super().cleanup()
+        if self._params.system_logs_enabled:
+            logger.remove(self._system_logger_id)
+
+    async def send_rtvi_message(self, model: BaseModel, exclude_none: bool = True):
+        """Send an RTVI message.
+
+        By default, we push a transport frame. But this function can be
+        overriden by subclass to send RTVI messages in different ways.
+
+        Args:
+            model: The message to send.
+            exclude_none: Whether to exclude None values from the model dump.
+
+        """
+        if self._rtvi:
+            await self._rtvi.push_transport_message(model, exclude_none)

    async def on_push_frame(self, data: FramePushed):
        """Process a frame being pushed through the pipeline.
@@ -948,52 +1037,58 @@ class RTVIObserver(BaseObserver):
        ):
            await self._handle_context(frame)
        elif isinstance(frame, LLMFullResponseStartFrame) and self._params.bot_llm_enabled:
-            await self.push_transport_message_urgent(RTVIBotLLMStartedMessage())
+            await self.send_rtvi_message(RTVIBotLLMStartedMessage())
        elif isinstance(frame, LLMFullResponseEndFrame) and self._params.bot_llm_enabled:
-            await self.push_transport_message_urgent(RTVIBotLLMStoppedMessage())
+            await self.send_rtvi_message(RTVIBotLLMStoppedMessage())
        elif isinstance(frame, LLMTextFrame) and self._params.bot_llm_enabled:
            await self._handle_llm_text_frame(frame)
        elif isinstance(frame, TTSStartedFrame) and self._params.bot_tts_enabled:
-            await self.push_transport_message_urgent(RTVIBotTTSStartedMessage())
+            await self.send_rtvi_message(RTVIBotTTSStartedMessage())
        elif isinstance(frame, TTSStoppedFrame) and self._params.bot_tts_enabled:
-            await self.push_transport_message_urgent(RTVIBotTTSStoppedMessage())
+            await self.send_rtvi_message(RTVIBotTTSStoppedMessage())
        elif isinstance(frame, TTSTextFrame) and self._params.bot_tts_enabled:
            if isinstance(src, BaseOutputTransport):
                message = RTVIBotTTSTextMessage(data=RTVITextMessageData(text=frame.text))
-                await self.push_transport_message_urgent(message)
+                await self.send_rtvi_message(message)
            else:
                mark_as_seen = False
        elif isinstance(frame, MetricsFrame) and self._params.metrics_enabled:
            await self._handle_metrics(frame)
        elif isinstance(frame, RTVIServerMessageFrame):
            message = RTVIServerMessage(data=frame.data)
-            await self.push_transport_message_urgent(message)
+            await self.send_rtvi_message(message)
        elif isinstance(frame, RTVIServerResponseFrame):
            if frame.error is not None:
                await self._send_error_response(frame)
            else:
                await self._send_server_response(frame)
+        elif isinstance(frame, InputAudioRawFrame) and self._params.user_audio_level_enabled:
+            curr_time = time.time()
+            diff_time = curr_time - self._last_user_audio_level
+            if diff_time > self._params.audio_level_period_secs:
+                level = calculate_audio_volume(frame.audio, frame.sample_rate)
+                message = RTVIUserAudioLevelMessage(data=RTVIAudioLevelMessageData(value=level))
+                await self.send_rtvi_message(message)
+                self._last_user_audio_level = curr_time
+        elif isinstance(frame, TTSAudioRawFrame) and self._params.bot_audio_level_enabled:
+            curr_time = time.time()
+            diff_time = curr_time - self._last_bot_audio_level
+            if diff_time > self._params.audio_level_period_secs:
+                level = calculate_audio_volume(frame.audio, frame.sample_rate)
+                message = RTVIBotAudioLevelMessage(data=RTVIAudioLevelMessageData(value=level))
+                await self.send_rtvi_message(message)
+                self._last_bot_audio_level = curr_time

        if mark_as_seen:
            self._frames_seen.add(frame.id)

-    async def push_transport_message_urgent(self, model: BaseModel, exclude_none: bool = True):
-        """Push an urgent transport message to the RTVI processor.
-
-        Args:
-            model: The message model to send.
-            exclude_none: Whether to exclude None values from the model dump.
-        """
-        frame = TransportMessageUrgentFrame(message=model.model_dump(exclude_none=exclude_none))
-        await self._rtvi.push_frame(frame)
-
    async def _push_bot_transcription(self):
        """Push accumulated bot transcription as a message."""
        if len(self._bot_transcription) > 0:
            message = RTVIBotTranscriptionMessage(
                data=RTVITextMessageData(text=self._bot_transcription)
            )
-            await self.push_transport_message_urgent(message)
+            await self.send_rtvi_message(message)
            self._bot_transcription = ""

    async def _handle_interruptions(self, frame: Frame):
@@ -1005,7 +1100,7 @@ class RTVIObserver(BaseObserver):
            message = RTVIUserStoppedSpeakingMessage()

        if message:
-            await self.push_transport_message_urgent(message)
+            await self.send_rtvi_message(message)

    async def _handle_bot_speaking(self, frame: Frame):
        """Handle bot speaking event frames."""
@@ -1016,12 +1111,12 @@ class RTVIObserver(BaseObserver):
            message = RTVIBotStoppedSpeakingMessage()

        if message:
-            await self.push_transport_message_urgent(message)
+            await self.send_rtvi_message(message)

    async def _handle_llm_text_frame(self, frame: LLMTextFrame):
        """Handle LLM text output frames."""
        message = RTVIBotLLMTextMessage(data=RTVITextMessageData(text=frame.text))
-        await self.push_transport_message_urgent(message)
+        await self.send_rtvi_message(message)

        self._bot_transcription += frame.text
        if match_endofsentence(self._bot_transcription):
@@ -1044,7 +1139,7 @@ class RTVIObserver(BaseObserver):
            )

        if message:
-            await self.push_transport_message_urgent(message)
+            await self.send_rtvi_message(message)

    async def _handle_context(self, frame: OpenAILLMContextFrame | LLMContextFrame):
        """Process LLM context frames to extract user messages for the RTVI client."""
@@ -1064,7 +1159,7 @@ class RTVIObserver(BaseObserver):
                text = "".join(part.text for part in message.parts if hasattr(part, "text"))
                if text:
                    rtvi_message = RTVIUserLLMTextMessage(data=RTVITextMessageData(text=text))
-                    await self.push_transport_message_urgent(rtvi_message)
+                    await self.send_rtvi_message(rtvi_message)

            # Handle OpenAI format (original implementation)
            elif isinstance(message, dict):
@@ -1075,7 +1170,7 @@ class RTVIObserver(BaseObserver):
                    else:
                        text = content
                    rtvi_message = RTVIUserLLMTextMessage(data=RTVITextMessageData(text=text))
-                    await self.push_transport_message_urgent(rtvi_message)
+                    await self.send_rtvi_message(rtvi_message)

        except Exception as e:
            logger.warning(f"Caught an error while trying to handle context: {e}")
@@ -1102,7 +1197,7 @@ class RTVIObserver(BaseObserver):
                metrics["characters"].append(d.model_dump(exclude_none=True))

        message = RTVIMetricsMessage(data=metrics)
-        await self.push_transport_message_urgent(message)
+        await self.send_rtvi_message(message)

    async def _send_server_response(self, frame: RTVIServerResponseFrame):
        """Send a response to the client for a specific request."""
@@ -1110,7 +1205,7 @@ class RTVIObserver(BaseObserver):
            id=str(frame.client_msg.msg_id),
            data=RTVIRawServerResponseData(t=frame.client_msg.type, d=frame.data),
        )
-        await self.push_transport_message_urgent(message)
+        await self.send_rtvi_message(message)

    async def _send_error_response(self, frame: RTVIServerResponseFrame):
        """Send a response to the client for a specific request."""
@@ -1118,7 +1213,7 @@ class RTVIObserver(BaseObserver):
            message = RTVIErrorResponse(
                id=str(frame.client_msg.msg_id), data=RTVIErrorResponseData(error=frame.error)
            )
-            await self.push_transport_message_urgent(message)
+            await self.send_rtvi_message(message)


 class RTVIProcessor(FrameProcessor):
@@ -1152,7 +1247,6 @@ class RTVIProcessor(FrameProcessor):
        # Default to 0.3.0 which is the last version before actually having a
        # "client-version".
        self._client_version = [0, 3, 0]
-        self._errors_enabled = True
        self._skip_tts: bool = False  # Keep in sync with llm_service.py

        self._registered_actions: Dict[str, RTVIAction] = {}
@@ -1222,14 +1316,6 @@ class RTVIProcessor(FrameProcessor):
        await self._update_config(self._config, False)
        await self._send_bot_ready()

-    def set_errors_enabled(self, enabled: bool):
-        """Enable or disable error message sending.
-
-        Args:
-            enabled: Whether to send error messages.
-        """
-        self._errors_enabled = enabled
-
    async def interrupt_bot(self):
        """Send a bot interruption frame upstream."""
        await self.push_interruption_task_frame_and_wait()
@@ -1258,6 +1344,11 @@ class RTVIProcessor(FrameProcessor):
        """
        await self._send_error_frame(ErrorFrame(error=error))

+    async def push_transport_message(self, model: BaseModel, exclude_none: bool = True):
+        """Push a transport message frame."""
+        frame = TransportMessageUrgentFrame(message=model.model_dump(exclude_none=exclude_none))
+        await self.push_frame(frame)
+
    async def handle_message(self, message: RTVIMessage):
        """Handle an incoming RTVI message.

@@ -1278,7 +1369,7 @@ class RTVIProcessor(FrameProcessor):
            args=params.arguments,
        )
        message = RTVILLMFunctionCallMessage(data=fn)
-        await self._push_transport_message(message, exclude_none=False)
+        await self.push_transport_message(message, exclude_none=False)

    async def handle_function_call_start(
        self, function_name: str, llm: FrameProcessor, context: OpenAILLMContext
@@ -1305,7 +1396,7 @@ class RTVIProcessor(FrameProcessor):

        fn = RTVILLMFunctionCallStartMessageData(function_name=function_name)
        message = RTVILLMFunctionCallStartMessage(data=fn)
-        await self._push_transport_message(message, exclude_none=False)
+        await self.push_transport_message(message, exclude_none=False)

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        """Process incoming frames through the RTVI processor.
@@ -1377,11 +1468,6 @@ class RTVIProcessor(FrameProcessor):
            await self.cancel_task(self._message_task)
            self._message_task = None

-    async def _push_transport_message(self, model: BaseModel, exclude_none: bool = True):
-        """Push a transport message frame."""
-        frame = TransportMessageUrgentFrame(message=model.model_dump(exclude_none=exclude_none))
-        await self.push_frame(frame)
-
    async def _action_task_handler(self):
        """Handle incoming action frames."""
        while True:
@@ -1518,7 +1604,7 @@ class RTVIProcessor(FrameProcessor):

        services = list(self._registered_services.values())
        message = RTVIDescribeConfig(id=request_id, data=RTVIDescribeConfigData(config=services))
-        await self._push_transport_message(message)
+        await self.push_transport_message(message)

    async def _handle_describe_actions(self, request_id: str):
        """Handle a describe-actions request."""
@@ -1533,7 +1619,7 @@ class RTVIProcessor(FrameProcessor):

        actions = list(self._registered_actions.values())
        message = RTVIDescribeActions(id=request_id, data=RTVIDescribeActionsData(actions=actions))
-        await self._push_transport_message(message)
+        await self.push_transport_message(message)

    async def _handle_get_config(self, request_id: str):
        """Handle a get-config request."""
@@ -1547,7 +1633,7 @@ class RTVIProcessor(FrameProcessor):
            )

        message = RTVIConfigResponse(id=request_id, data=self._config)
-        await self._push_transport_message(message)
+        await self.push_transport_message(message)

    def _update_config_option(self, service: str, config: RTVIServiceOptionConfig):
        """Update a specific configuration option."""
@@ -1672,7 +1758,7 @@ class RTVIProcessor(FrameProcessor):
        # action responses (such as webhooks) don't set a request_id
        if request_id:
            message = RTVIActionResponse(id=request_id, data=RTVIActionResponseData(result=result))
-            await self._push_transport_message(message)
+            await self.push_transport_message(message)

    async def _send_bot_ready(self):
        """Send the bot-ready message to the client."""
@@ -1683,23 +1769,21 @@ class RTVIProcessor(FrameProcessor):
            id=self._client_ready_id,
            data=RTVIBotReadyData(version=RTVI_PROTOCOL_VERSION, config=config),
        )
-        await self._push_transport_message(message)
+        await self.push_transport_message(message)

    async def _send_server_message(self, message: RTVIServerMessage | RTVIServerResponse):
        """Send a message or response to the client."""
-        await self._push_transport_message(message)
+        await self.push_transport_message(message)

    async def _send_error_frame(self, frame: ErrorFrame):
        """Send an error frame as an RTVI error message."""
-        if self._errors_enabled:
-            message = RTVIError(data=RTVIErrorData(error=frame.error, fatal=frame.fatal))
-            await self._push_transport_message(message)
+        message = RTVIError(data=RTVIErrorData(error=frame.error, fatal=frame.fatal))
+        await self.push_transport_message(message)

    async def _send_error_response(self, id: str, error: str):
        """Send an error response message."""
-        if self._errors_enabled:
-            message = RTVIErrorResponse(id=id, data=RTVIErrorResponseData(error=error))
-            await self._push_transport_message(message)
+        message = RTVIErrorResponse(id=id, data=RTVIErrorResponseData(error=error))
+        await self.push_transport_message(message)

    def _action_id(self, service: str, action: str) -> str:
        """Generate an action ID from service and action names."""
--- a/src/pipecat/processors/frameworks/strands_agents.py
+++ b/src/pipecat/processors/frameworks/strands_agents.py
@@ -10,6 +10,7 @@ from loguru import logger

 from pipecat.frames.frames import (
    Frame,
+    LLMContextFrame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
    LLMTextFrame,
@@ -71,9 +72,11 @@ class StrandsAgentsProcessor(FrameProcessor):
            direction: The direction of frame flow in the pipeline.
        """
        await super().process_frame(frame, direction)
-        if isinstance(frame, OpenAILLMContextFrame):
-            text = frame.context.messages[-1]["content"]
-            await self._ainvoke(str(text).strip())
+        if isinstance(frame, (LLMContextFrame, OpenAILLMContextFrame)):
+            messages = frame.context.get_messages()
+            if messages:
+                last_message = messages[-1]
+                await self._ainvoke(str(last_message["content"]).strip())
        else:
            await self.push_frame(frame, direction)

--- a/src/pipecat/serializers/twilio.py
+++ b/src/pipecat/serializers/twilio.py
@@ -61,6 +61,8 @@ class TwilioFrameSerializer(FrameSerializer):
        call_sid: Optional[str] = None,
        account_sid: Optional[str] = None,
        auth_token: Optional[str] = None,
+        region: Optional[str] = None,
+        edge: Optional[str] = None,
        params: Optional[InputParams] = None,
    ):
        """Initialize the TwilioFrameSerializer.
@@ -70,13 +72,42 @@ class TwilioFrameSerializer(FrameSerializer):
            call_sid: The associated Twilio Call SID (optional, but required for auto hang-up).
            account_sid: Twilio account SID (required for auto hang-up).
            auth_token: Twilio auth token (required for auto hang-up).
+            region: Twilio region (e.g., "au1", "ie1"). Must be specified with edge.
+            edge: Twilio edge location (e.g., "sydney", "dublin"). Must be specified with region.
            params: Configuration parameters.
        """
+        self._params = params or TwilioFrameSerializer.InputParams()
+
+        # Validate hangup-related parameters if auto_hang_up is enabled
+        if self._params.auto_hang_up:
+            # Validate required credentials
+            missing_credentials = []
+            if not call_sid:
+                missing_credentials.append("call_sid")
+            if not account_sid:
+                missing_credentials.append("account_sid")
+            if not auth_token:
+                missing_credentials.append("auth_token")
+
+            if missing_credentials:
+                raise ValueError(
+                    f"auto_hang_up is enabled but missing required parameters: {', '.join(missing_credentials)}"
+                )
+
+            # Validate region and edge are both provided if either is specified
+            if (region and not edge) or (edge and not region):
+                raise ValueError(
+                    "Both edge and region parameters are required if one is set. "
+                    f"Twilio's FQDN format requires both: api.{{edge}}.{{region}}.twilio.com. "
+                    f"Got: region='{region}', edge='{edge}'"
+                )
+
        self._stream_sid = stream_sid
        self._call_sid = call_sid
        self._account_sid = account_sid
        self._auth_token = auth_token
-        self._params = params or TwilioFrameSerializer.InputParams()
+        self._region = region
+        self._edge = edge

        self._twilio_sample_rate = self._params.twilio_sample_rate
        self._sample_rate = 0  # Pipeline input rate
@@ -158,25 +189,14 @@ class TwilioFrameSerializer(FrameSerializer):
            account_sid = self._account_sid
            auth_token = self._auth_token
            call_sid = self._call_sid
+            region = self._region
+            edge = self._edge

-            if not call_sid or not account_sid or not auth_token:
-                missing = []
-                if not call_sid:
-                    missing.append("call_sid")
-                if not account_sid:
-                    missing.append("account_sid")
-                if not auth_token:
-                    missing.append("auth_token")
-
-                logger.warning(
-                    f"Cannot hang up Twilio call: missing required parameters: {', '.join(missing)}"
-                )
-                return
+            region_prefix = f"{region}." if region else ""
+            edge_prefix = f"{edge}." if edge else ""

            # Twilio API endpoint for updating calls
-            endpoint = (
-                f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Calls/{call_sid}.json"
-            )
+            endpoint = f"https://api.{edge_prefix}{region_prefix}twilio.com/2010-04-01/Accounts/{account_sid}/Calls/{call_sid}.json"

            # Create basic auth from account_sid and auth_token
            auth = aiohttp.BasicAuth(account_sid, auth_token)
--- a/src/pipecat/services/anthropic/llm.py
+++ b/src/pipecat/services/anthropic/llm.py
@@ -151,7 +151,7 @@ class AnthropicLLMService(LLMService):
        self,
        *,
        api_key: str,
-        model: str = "claude-sonnet-4-20250514",
+        model: str = "claude-sonnet-4-5-20250929",
        params: Optional[InputParams] = None,
        client=None,
        retry_timeout_secs: Optional[float] = 5.0,
@@ -162,7 +162,7 @@ class AnthropicLLMService(LLMService):

        Args:
            api_key: Anthropic API key for authentication.
-            model: Model name to use. Defaults to "claude-sonnet-4-20250514".
+            model: Model name to use. Defaults to "claude-sonnet-4-5-20250929".
            params: Optional model parameters for inference.
            client: Optional custom Anthropic client instance.
            retry_timeout_secs: Request timeout in seconds for retry logic.
--- a/src/pipecat/services/aws_nova_sonic/aws.py
+++ b/src/pipecat/services/aws_nova_sonic/aws.py
@@ -429,7 +429,7 @@ class AWSNovaSonicLLMService(LLMService):
            await self._finish_connecting_if_context_available()
        except Exception as e:
            logger.error(f"{self} initialization error: {e}")
-            self._disconnect()
+            await self._disconnect()

    async def _finish_connecting_if_context_available(self):
        # We can only finish connecting once we've gotten our initial context and we're ready to
--- a/src/pipecat/services/heygen/api.py
+++ b/src/pipecat/services/heygen/api.py
@@ -108,12 +108,14 @@ class HeyGenSession(BaseModel):
    Parameters:
        session_id (str): Unique identifier for the streaming session.
        access_token (str): Token for accessing the session securely.
+        livekit_agent_token (str): Token for HeyGen’s audio agents(Pipecat).
        realtime_endpoint (str): Real-time communication endpoint URL.
        url (str): Direct URL for the session.
    """

    session_id: str
    access_token: str
+    livekit_agent_token: str
    realtime_endpoint: str
    url: str

--- a/src/pipecat/services/heygen/client.py
+++ b/src/pipecat/services/heygen/client.py
@@ -393,7 +393,9 @@ class HeyGenClient:
            participant_id: Identifier of the participant to capture audio from
            callback: Async function to handle received audio frames
        """
-        logger.debug(f"capture_participant_audio: {participant_id}")
+        logger.debug(
+            f"capture_participant_audio: {participant_id}, sample_rate: {self._in_sample_rate}"
+        )
        self._audio_frame_callback = callback
        if self._audio_task is not None:
            logger.warning(
@@ -407,7 +409,9 @@ class HeyGenClient:
            for track_pub in participant.track_publications.values():
                if track_pub.kind == rtc.TrackKind.KIND_AUDIO and track_pub.track is not None:
                    logger.debug(f"Starting audio capture for existing track: {track_pub.sid}")
-                    audio_stream = rtc.AudioStream(track_pub.track)
+                    audio_stream = rtc.AudioStream(
+                        track=track_pub.track, sample_rate=self._in_sample_rate
+                    )
                    self._audio_task = self._task_manager.create_task(
                        self._process_audio_frames(audio_stream), name="HeyGenClient_Receive_Audio"
                    )
@@ -536,7 +540,7 @@ class HeyGenClient:
                    and self._audio_task is None
                ):
                    logger.debug(f"Creating audio stream processor for track: {publication.sid}")
-                    audio_stream = rtc.AudioStream(track)
+                    audio_stream = rtc.AudioStream(track=track, sample_rate=self._in_sample_rate)
                    self._audio_task = self._task_manager.create_task(
                        self._process_audio_frames(audio_stream), name="HeyGenClient_Receive_Audio"
                    )
@@ -559,7 +563,7 @@ class HeyGenClient:
                )

            await self._livekit_room.connect(
-                self._heyGen_session.url, self._heyGen_session.access_token
+                self._heyGen_session.url, self._heyGen_session.livekit_agent_token
            )
            logger.debug(f"Successfully connected to LiveKit room: {self._livekit_room.name}")
            logger.debug(f"Local participant SID: {self._livekit_room.local_participant.sid}")
--- a/src/pipecat/services/heygen/video.py
+++ b/src/pipecat/services/heygen/video.py
@@ -110,6 +110,7 @@ class HeyGenVideoService(AIService):
            api_key=self._api_key,
            session=self._session,
            params=TransportParams(
+                audio_in_sample_rate=48000,
                audio_in_enabled=True,
                video_in_enabled=True,
                audio_out_enabled=True,
--- a/src/pipecat/services/openai/base_llm.py
+++ b/src/pipecat/services/openai/base_llm.py
@@ -337,10 +337,16 @@ class BaseOpenAILLMService(LLMService):

        async for chunk in chunk_stream:
            if chunk.usage:
+                cached_tokens = (
+                    chunk.usage.prompt_tokens_details.cached_tokens
+                    if chunk.usage.prompt_tokens_details
+                    else None
+                )
                tokens = LLMTokenUsage(
                    prompt_tokens=chunk.usage.prompt_tokens,
                    completion_tokens=chunk.usage.completion_tokens,
                    total_tokens=chunk.usage.total_tokens,
+                    cache_read_input_tokens=cached_tokens,
                )
                await self.start_llm_usage_metrics(tokens)

--- a/src/pipecat/transports/base_input.py
+++ b/src/pipecat/transports/base_input.py
@@ -11,7 +11,6 @@ input processing, including VAD, turn analysis, and interruption management.
 """

 import asyncio
-from concurrent.futures import ThreadPoolExecutor
 from typing import Optional

 from loguru import logger
@@ -79,10 +78,6 @@ class BaseInputTransport(FrameProcessor):
        # Track user speaking state for interruption logic
        self._user_speaking = False

-        # We read audio from a single queue one at a time and we then run VAD in
-        # a thread. Therefore, only one thread should be necessary.
-        self._executor = ThreadPoolExecutor(max_workers=1)
-
        # Task to process incoming audio (VAD) and push audio frames downstream
        # if passthrough is enabled.
        self._audio_task = None
@@ -398,9 +393,7 @@ class BaseInputTransport(FrameProcessor):
        """Analyze audio frame for voice activity."""
        state = VADState.QUIET
        if self.vad_analyzer:
-            state = await self.get_event_loop().run_in_executor(
-                self._executor, self.vad_analyzer.analyze_audio, audio_frame.audio
-            )
+            state = await self.vad_analyzer.analyze_audio(audio_frame.audio)
        return state

    async def _handle_vad(self, audio_frame: InputAudioRawFrame, vad_state: VADState) -> VADState:
--- a/src/pipecat/transports/daily/transport.py
+++ b/src/pipecat/transports/daily/transport.py
@@ -110,12 +110,32 @@ class DailyInputTransportMessageUrgentFrame(InputTransportMessageUrgentFrame):
 class DailyUpdateRemoteParticipantsFrame(ControlFrame):
    """Frame to update remote participants in Daily calls.

+    .. deprecated:: 0.0.87
+        `DailyUpdateRemoteParticipantsFrame` is deprecated and will be removed in a future version.
+        Create your own custom frame and use a custom processor to handle it or use, for example,
+        `on_after_push_frame` event instead in the output transport.
+
    Parameters:
        remote_participants: See https://reference-python.daily.co/api_reference.html#daily.CallClient.update_remote_participants.
    """

    remote_participants: Mapping[str, Any] = None

+    def __post_init__(self):
+        super().__post_init__()
+        import warnings
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "DailyUpdateRemoteParticipantsFrame is deprecated and will be removed in a future version."
+                "Instead, create your own custom frame and handle it in the "
+                '`@transport.output().event_handler("on_after_push_frame")` event handler or a '
+                "custom processor.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+

 class WebRTCVADAnalyzer(VADAnalyzer):
    """Voice Activity Detection analyzer using WebRTC.
--- a/src/pipecat/transports/heygen/init.py
+++ b/src/pipecat/transports/heygen/init.py
--- a/src/pipecat/transports/heygen/transport.py
+++ b/src/pipecat/transports/heygen/transport.py
@@ -0,0 +1,381 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""HeyGen implementation for Pipecat.
+
+This module provides integration with the HeyGen platform for creating conversational
+AI applications with avatars. It manages conversation sessions and provides real-time
+audio/video streaming capabilities through the HeyGen API.
+
+The module consists of three main components:
+- HeyGenInputTransport: Handles incoming audio and events from HeyGen conversations
+- HeyGenOutputTransport: Manages outgoing audio and events to HeyGen conversations
+- HeyGenTransport: Main transport implementation that coordinates input/output transports
+"""
+
+from typing import Any, Optional
+
+import aiohttp
+from loguru import logger
+
+from pipecat.frames.frames import (
+    AudioRawFrame,
+    BotStartedSpeakingFrame,
+    BotStoppedSpeakingFrame,
+    CancelFrame,
+    EndFrame,
+    Frame,
+    InputAudioRawFrame,
+    InterruptionFrame,
+    OutputAudioRawFrame,
+    StartFrame,
+    UserStartedSpeakingFrame,
+    UserStoppedSpeakingFrame,
+)
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor, FrameProcessorSetup
+from pipecat.services.heygen.api import NewSessionRequest
+from pipecat.services.heygen.client import HeyGenCallbacks, HeyGenClient
+from pipecat.transports.base_input import BaseInputTransport
+from pipecat.transports.base_output import BaseOutputTransport
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+
+
+class HeyGenInputTransport(BaseInputTransport):
+    """Input transport for receiving audio and events from HeyGen conversations.
+
+    Handles incoming audio streams from participants and manages audio capture
+    from the Daily room connected to the HeyGen conversation.
+    """
+
+    def __init__(
+        self,
+        client: HeyGenClient,
+        params: TransportParams,
+        **kwargs,
+    ):
+        """Initialize the HeyGen input transport.
+
+        Args:
+            client: The HeyGen transport client instance.
+            params: Transport configuration parameters.
+            **kwargs: Additional arguments passed to parent class.
+        """
+        super().__init__(params, **kwargs)
+        self._client = client
+        self._params = params
+        # Whether we have seen a StartFrame already.
+        self._initialized = False
+
+    async def setup(self, setup: FrameProcessorSetup):
+        """Setup the input transport.
+
+        Args:
+            setup: The frame processor setup configuration.
+        """
+        await super().setup(setup)
+        await self._client.setup(setup)
+
+    async def cleanup(self):
+        """Cleanup input transport resources."""
+        await super().cleanup()
+        await self._client.cleanup()
+
+    async def start(self, frame: StartFrame):
+        """Start the input transport.
+
+        Args:
+            frame: The start frame containing initialization parameters.
+        """
+        await super().start(frame)
+
+        if self._initialized:
+            return
+
+        self._initialized = True
+
+        await self.set_transport_ready(frame)
+
+    async def stop(self, frame: EndFrame):
+        """Stop the input transport.
+
+        Args:
+            frame: The end frame signaling transport shutdown.
+        """
+        await super().stop(frame)
+        await self._client.stop()
+
+    async def cancel(self, frame: CancelFrame):
+        """Cancel the input transport.
+
+        Args:
+            frame: The cancel frame signaling immediate cancellation.
+        """
+        await super().cancel(frame)
+        await self._client.stop()
+
+    async def start_capturing_audio(self, participant_id: str):
+        """Start capturing audio from a participant.
+
+        Args:
+            participant_id: The participant to capture audio from.
+        """
+        if self._params.audio_in_enabled:
+            logger.info(f"HeyGenTransport start capturing audio for participant {participant_id}")
+            await self._client.capture_participant_audio(
+                participant_id, self._on_participant_audio_data
+            )
+
+    async def _on_participant_audio_data(self, audio_frame: AudioRawFrame):
+        """Handle received participant audio data."""
+        frame = InputAudioRawFrame(
+            audio=audio_frame.audio,
+            sample_rate=audio_frame.sample_rate,
+            num_channels=audio_frame.num_channels,
+        )
+        await self.push_audio_frame(frame)
+
+
+class HeyGenOutputTransport(BaseOutputTransport):
+    """Output transport for sending audio and events to HeyGen conversations.
+
+    Handles outgoing audio streams to participants and manages the custom
+    audio track expected by the HeyGen platform.
+    """
+
+    def __init__(
+        self,
+        client: HeyGenClient,
+        params: TransportParams,
+        **kwargs,
+    ):
+        """Initialize the HeyGen output transport.
+
+        Args:
+            client: The HeyGen transport client instance.
+            params: Transport configuration parameters.
+            **kwargs: Additional arguments passed to parent class.
+        """
+        super().__init__(params, **kwargs)
+        self._client = client
+        self._params = params
+
+        # Whether we have seen a StartFrame already.
+        self._initialized = False
+        self._event_id = None
+
+    async def setup(self, setup: FrameProcessorSetup):
+        """Setup the output transport.
+
+        Args:
+            setup: The frame processor setup configuration.
+        """
+        await super().setup(setup)
+        await self._client.setup(setup)
+
+    async def cleanup(self):
+        """Cleanup output transport resources."""
+        await super().cleanup()
+        await self._client.cleanup()
+
+    async def start(self, frame: StartFrame):
+        """Start the output transport.
+
+        Args:
+            frame: The start frame containing initialization parameters.
+        """
+        await super().start(frame)
+
+        if self._initialized:
+            return
+
+        self._initialized = True
+        await self._client.start(frame, self.audio_chunk_size)
+        await self.set_transport_ready(frame)
+        self._client.transport_ready()
+
+    async def stop(self, frame: EndFrame):
+        """Stop the output transport.
+
+        Args:
+            frame: The end frame signaling transport shutdown.
+        """
+        await super().stop(frame)
+        await self._client.stop()
+
+    async def cancel(self, frame: CancelFrame):
+        """Cancel the output transport.
+
+        Args:
+            frame: The cancel frame signaling immediate cancellation.
+        """
+        await super().cancel(frame)
+        await self._client.stop()
+
+    async def push_frame(self, frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM):
+        """Push a frame to the next processor in the pipeline.
+
+        Args:
+            frame: The frame to push.
+            direction: The direction to push the frame.
+        """
+        # The BotStartedSpeakingFrame and BotStoppedSpeakingFrame are created inside BaseOutputTransport
+        # This is a workaround, so we can more reliably be aware when the bot has started or stopped speaking
+        if direction == FrameDirection.DOWNSTREAM:
+            if isinstance(frame, BotStartedSpeakingFrame):
+                if self._event_id is not None:
+                    logger.warning("self._event_id is already defined!")
+                self._event_id = str(frame.id)
+            elif isinstance(frame, BotStoppedSpeakingFrame):
+                await self._client.agent_speak_end(self._event_id)
+                self._event_id = None
+        await super().push_frame(frame, direction)
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        """Process frames and handle interruptions.
+
+        Handles various types of frames including interruption events and user speaking states.
+        Updates the HeyGen client state based on the received frames.
+
+        Args:
+            frame: The frame to process
+            direction: The direction of frame flow in the pipeline
+
+        Note:
+            Special handling is implemented for:
+            - InterruptionFrame: Triggers interruption of current speech
+            - UserStartedSpeakingFrame: Initiates agent listening mode
+            - UserStoppedSpeakingFrame: Stops agent listening mode
+        """
+        await super().process_frame(frame, direction)
+        if isinstance(frame, InterruptionFrame):
+            await self._client.interrupt(self._event_id)
+            await self.push_frame(frame, direction)
+        if isinstance(frame, UserStartedSpeakingFrame):
+            await self._client.start_agent_listening()
+            await self.push_frame(frame, direction)
+        elif isinstance(frame, UserStoppedSpeakingFrame):
+            await self._client.stop_agent_listening()
+            await self.push_frame(frame, direction)
+
+    async def write_audio_frame(self, frame: OutputAudioRawFrame) -> bool:
+        """Write an audio frame to the HeyGen transport.
+
+        Args:
+            frame: The audio frame to write.
+        """
+        await self._client.agent_speak(bytes(frame.audio), self._event_id)
+        return True
+
+
+class HeyGenParams(TransportParams):
+    """Configuration parameters for the HeyGen transport.
+
+    Parameters:
+        audio_in_enabled: Whether to enable audio input from participants.
+        audio_out_enabled: Whether to enable audio output to participants.
+    """
+
+    audio_in_enabled: bool = True
+    audio_out_enabled: bool = True
+
+
+class HeyGenTransport(BaseTransport):
+    """Transport implementation for HeyGen video calls.
+
+    When used, the Pipecat bot joins the same virtual room as the HeyGen Avatar and the user.
+    This is achieved by using `HeyGenTransport`, which initiates the conversation via
+    `HeyGenApi` and obtains a room URL that all participants connect to.
+    """
+
+    def __init__(
+        self,
+        session: aiohttp.ClientSession,
+        api_key: str,
+        params: HeyGenParams = HeyGenParams(),
+        input_name: Optional[str] = None,
+        output_name: Optional[str] = None,
+        session_request: NewSessionRequest = NewSessionRequest(
+            avatar_id="Shawn_Therapist_public",
+            version="v2",
+        ),
+    ):
+        """Initialize the HeyGen transport.
+
+        Sets up a new HeyGen transport instance with the specified configuration for
+        handling video calls between the Pipecat bot and HeyGen Avatar.
+
+        Args:
+            session: aiohttp session for making async HTTP requests
+            api_key: HeyGen API key for authentication
+            params: HeyGen-specific configuration parameters (default: HeyGenParams())
+            input_name: Optional custom name for the input transport
+            output_name: Optional custom name for the output transport
+            session_request: Configuration for the HeyGen session (default: uses Shawn_Therapist_public avatar)
+
+        Note:
+            The transport will automatically join the same virtual room as the HeyGen Avatar
+            and user through the HeyGenClient, which handles session initialization via HeyGenApi.
+        """
+        super().__init__(input_name=input_name, output_name=output_name)
+        self._params = params
+        self._client = HeyGenClient(
+            api_key=api_key,
+            session=session,
+            params=params,
+            session_request=session_request,
+            callbacks=HeyGenCallbacks(
+                on_participant_connected=self._on_participant_connected,
+                on_participant_disconnected=self._on_participant_disconnected,
+            ),
+        )
+        self._input: Optional[HeyGenInputTransport] = None
+        self._output: Optional[HeyGenOutputTransport] = None
+        self._HeyGen_participant_id = None
+
+        # Register supported handlers. The user will only be able to register
+        # these handlers.
+        self._register_event_handler("on_client_connected")
+        self._register_event_handler("on_client_disconnected")
+
+    async def _on_participant_disconnected(self, participant_id: str):
+        logger.debug(f"HeyGen participant {participant_id} disconnected")
+        if participant_id != "heygen":
+            await self._on_client_disconnected(participant_id)
+
+    async def _on_participant_connected(self, participant_id: str):
+        logger.debug(f"HeyGen participant {participant_id} connected")
+        if participant_id != "heygen":
+            await self._on_client_connected(participant_id)
+            if self._input:
+                await self._input.start_capturing_audio(participant_id)
+
+    def input(self) -> FrameProcessor:
+        """Get the input transport for receiving media and events.
+
+        Returns:
+            The HeyGen input transport instance.
+        """
+        if not self._input:
+            self._input = HeyGenInputTransport(client=self._client, params=self._params)
+        return self._input
+
+    def output(self) -> FrameProcessor:
+        """Get the output transport for sending media and events.
+
+        Returns:
+            The HeyGen output transport instance.
+        """
+        if not self._output:
+            self._output = HeyGenOutputTransport(client=self._client, params=self._params)
+        return self._output
+
+    async def _on_client_connected(self, participant: Any):
+        """Handle client connected events."""
+        await self._call_event_handler("on_client_connected", participant)
+
+    async def _on_client_disconnected(self, participant: Any):
+        """Handle client disconnected events."""
+        await self._call_event_handler("on_client_disconnected", participant)
--- a/src/pipecat/transports/smallwebrtc/transport.py
+++ b/src/pipecat/transports/smallwebrtc/transport.py
@@ -66,7 +66,7 @@ class SmallWebRTCCallbacks(BaseModel):
        on_client_disconnected: Called when a client disconnects.
    """

-    on_app_message: Callable[[Any], Awaitable[None]]
+    on_app_message: Callable[[Any, str], Awaitable[None]]
    on_client_connected: Callable[[SmallWebRTCConnection], Awaitable[None]]
    on_client_disconnected: Callable[[SmallWebRTCConnection], Awaitable[None]]

@@ -254,7 +254,7 @@ class SmallWebRTCClient:

        @self._webrtc_connection.event_handler("app-message")
        async def on_app_message(connection: SmallWebRTCConnection, message: Any):
-            await self._handle_app_message(message)
+            await self._handle_app_message(message, connection.pc_id)

    def _convert_frame(self, frame_array: np.ndarray, format_name: str) -> np.ndarray:
        """Convert a video frame to RGB format based on the input format.
@@ -512,9 +512,9 @@ class SmallWebRTCClient:
        if not self._closing:
            await self._callbacks.on_client_disconnected(self._webrtc_connection)

-    async def _handle_app_message(self, message: Any):
+    async def _handle_app_message(self, message: Any, sender: str):
        """Handle incoming application messages."""
-        await self._callbacks.on_app_message(message)
+        await self._callbacks.on_app_message(message, sender)

    def _can_send(self):
        """Check if the connection is ready for sending data."""
@@ -935,11 +935,11 @@ class SmallWebRTCTransport(BaseTransport):
        if self._output:
            await self._output.queue_frame(frame, FrameDirection.DOWNSTREAM)

-    async def _on_app_message(self, message: Any):
+    async def _on_app_message(self, message: Any, sender: str):
        """Handle incoming application messages."""
        if self._input:
            await self._input.push_app_message(message)
-        await self._call_event_handler("on_app_message", message)
+        await self._call_event_handler("on_app_message", message, sender)

    async def _on_client_connected(self, webrtc_connection):
        """Handle client connection events."""
--- a/src/pipecat/transports/tavus/transport.py
+++ b/src/pipecat/transports/tavus/transport.py
@@ -221,6 +221,7 @@ class TavusTransportClient:
                ),
                on_joined=self._on_joined,
                on_left=self._on_left,
+                on_before_leave=partial(self._on_handle_callback, "on_before_leave"),
                on_error=partial(self._on_handle_callback, "on_error"),
                on_app_message=partial(self._on_handle_callback, "on_app_message"),
                on_call_state_updated=partial(self._on_handle_callback, "on_call_state_updated"),
--- a/tests/integration/test_integration_unified_function_calling.py
+++ b/tests/integration/test_integration_unified_function_calling.py
@@ -12,14 +12,12 @@ from dotenv import load_dotenv

 from pipecat.adapters.schemas.function_schema import FunctionSchema
 from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.frames.frames import LLMContextFrame
 from pipecat.pipeline.pipeline import Pipeline
-from pipecat.processors.aggregators.openai_llm_context import (
-    OpenAILLMContext,
-    OpenAILLMContextFrame,
-)
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.services.anthropic.llm import AnthropicLLMService
 from pipecat.services.google.llm import GoogleLLMService
-from pipecat.services.llm_service import LLMService
+from pipecat.services.llm_service import FunctionCallParams, LLMService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.tests.utils import run_test

@@ -48,8 +46,13 @@ def standard_tools() -> ToolsSchema:


 async def _test_llm_function_calling(llm: LLMService):
-    # Create an AsyncMock for the function
-    mock_fetch_weather = AsyncMock()
+    # Create a mock weather function
+    call_count = 0
+
+    async def mock_fetch_weather(params: FunctionCallParams):
+        nonlocal call_count
+        call_count += 1
+        pass

    llm.register_function(None, mock_fetch_weather)

@@ -60,21 +63,19 @@ async def _test_llm_function_calling(llm: LLMService):
        },
        {"role": "user", "content": " How is the weather today in San Francisco, California?"},
    ]
-    context = OpenAILLMContext(messages, standard_tools())
-    # This is done by default inside the create_context_aggregator
-    context.set_llm_adapter(llm.get_llm_adapter())
+    context = LLMContext(messages, standard_tools())

    pipeline = Pipeline([llm])

-    frames_to_send = [OpenAILLMContextFrame(context)]
+    frames_to_send = [LLMContextFrame(context)]
    await run_test(
        pipeline,
        frames_to_send=frames_to_send,
        expected_down_frames=None,
    )

-    # Assert that the mock function was called
-    mock_fetch_weather.assert_called_once()
+    # Assert that the weather function was called once
+    assert call_count == 1


@pytest.mark.skipif(os.getenv("OPENAI_API_KEY") is None, reason="OPENAI_API_KEY is not set")
--- a/tests/test_audio_buffer_processor.py
+++ b/tests/test_audio_buffer_processor.py
@@ -0,0 +1,117 @@
+#
+# Copyright (c) 2024-2025 Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import struct
+import unittest
+
+from pipecat.frames.frames import InputAudioRawFrame, OutputAudioRawFrame, StartFrame
+from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
+
+
+class _PassthroughResampler:
+    async def resample(
+        self, audio: bytes, in_rate: int, out_rate: int
+    ) -> bytes:  # pragma: no cover - trivial
+        return audio
+
+
+class TestAudioBufferProcessor(unittest.IsolatedAsyncioTestCase):
+    async def asyncSetUp(self):
+        self.processor = AudioBufferProcessor(sample_rate=16000, num_channels=2, buffer_size=4)
+        self.processor._input_resampler = _PassthroughResampler()
+        self.processor._output_resampler = _PassthroughResampler()
+        self.processor._update_sample_rate(StartFrame(audio_out_sample_rate=16000))
+        await self.processor.start_recording()
+
+    async def asyncTearDown(self):
+        if getattr(self.processor, "_recording", False):
+            await self.processor.stop_recording()
+        await self.processor.cleanup()
+
+    async def test_flush_user_audio_pads_bot_track(self):
+        user_audio = struct.pack("<hh", 1000, -1000)
+        audio_event = asyncio.Event()
+        track_event = asyncio.Event()
+        captured = {}
+
+        async def on_audio_data(_, audio: bytes, sample_rate: int, num_channels: int):
+            captured["merged"] = (audio, sample_rate, num_channels)
+            audio_event.set()
+
+        async def on_track_audio_data(
+            _, user: bytes, bot: bytes, sample_rate: int, num_channels: int
+        ):
+            captured["tracks"] = (user, bot, sample_rate, num_channels)
+            track_event.set()
+
+        self.processor.add_event_handler("on_audio_data", on_audio_data)
+        self.processor.add_event_handler("on_track_audio_data", on_track_audio_data)
+
+        frame = InputAudioRawFrame(audio=user_audio, sample_rate=16000, num_channels=1)
+        await self.processor._process_recording(frame)
+
+        await asyncio.wait_for(audio_event.wait(), timeout=1)
+        await asyncio.wait_for(track_event.wait(), timeout=1)
+
+        merged_audio, merged_sr, merged_channels = captured["merged"]
+        user_track, bot_track, track_sr, track_channels = captured["tracks"]
+
+        self.assertEqual(merged_sr, 16000)
+        self.assertEqual(merged_channels, 2)
+        self.assertEqual(track_sr, 16000)
+        self.assertEqual(track_channels, 2)
+        self.assertEqual(user_track, user_audio)
+        self.assertEqual(bot_track, b"\x00" * len(user_audio))
+        self.assertEqual(len(merged_audio), len(user_audio) * 2)
+        self.assertEqual(merged_audio[0:2], user_audio[0:2])
+        self.assertEqual(merged_audio[2:4], b"\x00\x00")
+        self.assertEqual(merged_audio[4:6], user_audio[2:4])
+        self.assertEqual(merged_audio[6:8], b"\x00\x00")
+        self.assertEqual(len(self.processor._user_audio_buffer), 0)
+        self.assertEqual(len(self.processor._bot_audio_buffer), 0)
+
+    async def test_flush_bot_audio_pads_user_track(self):
+        bot_audio = struct.pack("<hh", -800, 400)
+        audio_event = asyncio.Event()
+        track_event = asyncio.Event()
+        captured = {}
+
+        async def on_audio_data(_, audio: bytes, sample_rate: int, num_channels: int):
+            captured["merged"] = (audio, sample_rate, num_channels)
+            audio_event.set()
+
+        async def on_track_audio_data(
+            _, user: bytes, bot: bytes, sample_rate: int, num_channels: int
+        ):
+            captured["tracks"] = (user, bot, sample_rate, num_channels)
+            track_event.set()
+
+        self.processor.add_event_handler("on_audio_data", on_audio_data)
+        self.processor.add_event_handler("on_track_audio_data", on_track_audio_data)
+
+        frame = OutputAudioRawFrame(audio=bot_audio, sample_rate=16000, num_channels=1)
+        await self.processor._process_recording(frame)
+
+        await asyncio.wait_for(audio_event.wait(), timeout=1)
+        await asyncio.wait_for(track_event.wait(), timeout=1)
+
+        merged_audio, merged_sr, merged_channels = captured["merged"]
+        user_track, bot_track, track_sr, track_channels = captured["tracks"]
+
+        self.assertEqual(merged_sr, 16000)
+        self.assertEqual(merged_channels, 2)
+        self.assertEqual(track_sr, 16000)
+        self.assertEqual(track_channels, 2)
+        self.assertEqual(user_track, b"\x00" * len(bot_audio))
+        self.assertEqual(bot_track, bot_audio)
+        self.assertEqual(len(merged_audio), len(bot_audio) * 2)
+        self.assertEqual(merged_audio[0:2], b"\x00\x00")
+        self.assertEqual(merged_audio[2:4], bot_audio[0:2])
+        self.assertEqual(merged_audio[4:6], b"\x00\x00")
+        self.assertEqual(merged_audio[6:8], bot_audio[2:4])
+        self.assertEqual(len(self.processor._user_audio_buffer), 0)
+        self.assertEqual(len(self.processor._bot_audio_buffer), 0)
--- a/tests/test_langchain.py
+++ b/tests/test_langchain.py
@@ -10,24 +10,21 @@ from langchain.prompts import ChatPromptTemplate
 from langchain_core.language_models import FakeStreamingListLLM

 from pipecat.frames.frames import (
+    LLMContextAssistantTimestampFrame,
+    LLMContextFrame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
-    OpenAILLMContextAssistantTimestampFrame,
    TextFrame,
    TranscriptionFrame,
    UserStartedSpeakingFrame,
    UserStoppedSpeakingFrame,
 )
 from pipecat.pipeline.pipeline import Pipeline
+from pipecat.processors.aggregators.llm_context import LLMContext
 from pipecat.processors.aggregators.llm_response import (
    LLMAssistantAggregatorParams,
-    LLMAssistantContextAggregator,
-    LLMUserContextAggregator,
-)
-from pipecat.processors.aggregators.openai_llm_context import (
-    OpenAILLMContext,
-    OpenAILLMContextFrame,
 )
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.processors.frame_processor import FrameProcessor
 from pipecat.processors.frameworks.langchain import LangchainProcessor
 from pipecat.tests.utils import SleepFrame, run_test
@@ -67,13 +64,14 @@ class TestLangchain(unittest.IsolatedAsyncioTestCase):
        proc = LangchainProcessor(chain=chain)
        self.mock_proc = self.MockProcessor("token_collector")

-        context = OpenAILLMContext()
-        tma_in = LLMUserContextAggregator(context)
-        tma_out = LLMAssistantContextAggregator(
-            context, params=LLMAssistantAggregatorParams(expect_stripped_words=False)
+        context = LLMContext()
+        context_aggregator = LLMContextAggregatorPair(
+            context, assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False)
        )

-        pipeline = Pipeline([tma_in, proc, self.mock_proc, tma_out])
+        pipeline = Pipeline(
+            [context_aggregator.user(), proc, self.mock_proc, context_aggregator.assistant()]
+        )

        frames_to_send = [
            UserStartedSpeakingFrame(),
@@ -84,8 +82,8 @@ class TestLangchain(unittest.IsolatedAsyncioTestCase):
        expected_down_frames = [
            UserStartedSpeakingFrame,
            UserStoppedSpeakingFrame,
-            OpenAILLMContextFrame,
-            OpenAILLMContextAssistantTimestampFrame,
+            LLMContextFrame,
+            LLMContextAssistantTimestampFrame,
        ]
        await run_test(
            pipeline,
@@ -94,4 +92,6 @@ class TestLangchain(unittest.IsolatedAsyncioTestCase):
        )

        self.assertEqual("".join(self.mock_proc.token), self.expected_response)
-        self.assertEqual(tma_out.messages[-1]["content"], self.expected_response)
+        self.assertEqual(
+            context_aggregator.assistant().messages[-1]["content"], self.expected_response
+        )
--- a/uv.lock
+++ b/uv.lock
@@ -4484,7 +4484,7 @@ requires-dist = [
    { name = "aioboto3", marker = "extra == 'aws'", specifier = "~=15.0.0" },
    { name = "aiofiles", specifier = ">=24.1.0,<25" },
    { name = "aiohttp", specifier = ">=3.11.12,<4" },
-    { name = "aiortc", marker = "extra == 'webrtc'", specifier = "~=1.13.0" },
+    { name = "aiortc", marker = "extra == 'webrtc'", specifier = ">=1.13.0,<2" },
    { name = "anthropic", marker = "extra == 'anthropic'", specifier = "~=0.49.0" },
    { name = "audioop-lts", marker = "python_full_version >= '3.13'", specifier = "~=0.2.1" },
    { name = "aws-sdk-bedrock-runtime", marker = "python_full_version >= '3.12' and extra == 'aws-nova-sonic'", specifier = "~=0.0.2" },
@@ -4522,7 +4522,7 @@ requires-dist = [
    { name = "onnxruntime", marker = "extra == 'local-smart-turn-v3'", specifier = ">=1.20.1,<2" },
    { name = "onnxruntime", marker = "extra == 'silero'", specifier = ">=1.20.1,<2" },
    { name = "openai", specifier = ">=1.74.0,<=1.99.1" },
-    { name = "opencv-python", marker = "extra == 'webrtc'", specifier = "~=4.11.0.86" },
+    { name = "opencv-python", marker = "extra == 'webrtc'", specifier = ">=4.11.0.86,<5" },
    { name = "openpipe", marker = "extra == 'openpipe'", specifier = "~=4.50.0" },
    { name = "opentelemetry-api", marker = "extra == 'tracing'", specifier = ">=1.33.0" },
    { name = "opentelemetry-instrumentation", marker = "extra == 'tracing'", specifier = ">=0.54b0" },
@@ -4557,7 +4557,7 @@ requires-dist = [
    { name = "python-dotenv", marker = "extra == 'runner'", specifier = ">=1.0.0,<2.0.0" },
    { name = "pyvips", extras = ["binary"], marker = "extra == 'moondream'", specifier = "~=3.0.0" },
    { name = "resampy", specifier = "~=0.4.3" },
-    { name = "sentry-sdk", marker = "extra == 'sentry'", specifier = "~=2.23.1" },
+    { name = "sentry-sdk", marker = "extra == 'sentry'", specifier = ">=2.28.0,<3" },
    { name = "simli-ai", marker = "extra == 'simli'", specifier = "~=0.1.10" },
    { name = "soundfile", marker = "extra == 'soundfile'", specifier = "~=0.13.0" },
    { name = "soxr", specifier = "~=0.5.0" },
@@ -6389,15 +6389,15 @@ wheels = [

 [[package]]
 name = "sentry-sdk"
-version = "2.23.1"
+version = "2.38.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "certifi" },
    { name = "urllib3" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/60/fd/2c5f7161dbea1fa03381f139c443b4524f3a15d58e50c96a65d19f454ba2/sentry_sdk-2.23.1.tar.gz", hash = "sha256:2288320465065f3f056630ce55936426204f96f63f1208edb79e033ed03774db", size = 316248, upload-time = "2025-03-17T12:52:34.14Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/b2/22/60fd703b34d94d216b2387e048ac82de3e86b63bc28869fb076f8bb0204a/sentry_sdk-2.38.0.tar.gz", hash = "sha256:792d2af45e167e2f8a3347143f525b9b6bac6f058fb2014720b40b84ccbeb985", size = 348116, upload-time = "2025-09-15T15:00:37.846Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/4e/00/9a9a2ab9020ee824d787f7e82a539305bf926393fe139baedbcf34356770/sentry_sdk-2.23.1-py2.py3-none-any.whl", hash = "sha256:42ef3a6cc1db3d22cb2ab24163d75b23f291ad9892b1a8c44075ce809a32b191", size = 336327, upload-time = "2025-03-17T12:52:32.176Z" },
+    { url = "https://files.pythonhosted.org/packages/7a/84/bde4c4bbb269b71bc09316af8eb00da91f67814d40337cc12ef9c8742541/sentry_sdk-2.38.0-py2.py3-none-any.whl", hash = "sha256:2324aea8573a3fa1576df7fb4d65c4eb8d9929c8fa5939647397a07179eef8d0", size = 370346, upload-time = "2025-09-15T15:00:35.821Z" },
 ]

 [[package]]
Author	SHA1	Message	Date
James Hush	a39f8b4882	Remove extra code	2025-10-01 14:48:12 +08:00
James Hush	76fc36f621	Pre recorded message example	2025-10-01 14:43:36 +08:00
James Hush	c0878c5e09	Save progress	2025-10-01 13:58:32 +08:00
James Hush	c6a1013051	Pre-recorded message example	2025-10-01 13:52:58 +08:00
Aleix Conchillo Flaqué	feae3b6d2d	Merge pull request #2742 from pipecat-ai/aleix/deprecate-daily-update-remote-participants-frame DailyTransport: deprecated DailyUpdateRemoteParticipantsFrame	2025-09-30 16:27:34 -07:00
Aleix Conchillo Flaqué	92d3be8975	DailyTransport: deprecated DailyUpdateRemoteParticipantsFrame	2025-09-30 16:26:48 -07:00
Aleix Conchillo Flaqué	0f53e1db2c	Merge pull request #2759 from pipecat-ai/aleix/dont-cancel-if-finished PipelineTask: avoid cancellation if application is finished	2025-09-30 16:21:16 -07:00
Aleix Conchillo Flaqué	d398e8cc10	Merge pull request #2761 from pipecat-ai/aleix/rtvi-tail-updates RTVI updates: audio levels and system logs	2025-09-30 13:55:17 -07:00
Aleix Conchillo Flaqué	e5f263d380	update CHANGELOG	2025-09-30 13:51:35 -07:00
Aleix Conchillo Flaqué	3a4c303c54	RTVIParams: add errors_enabled deprecation warnings	2025-09-30 13:49:51 -07:00
Mark Backman	54a1ef47d0	Merge pull request #2758 from pipecat-ai/mb/claude-sonnet-4.5 Update AnthropicLLMService to use claude-sonnet-4-5-20250929	2025-09-30 16:42:47 -04:00
Aleix Conchillo Flaqué	149ffa4f3c	RTVIObserver: add support system logs	2025-09-30 13:42:40 -07:00
Aleix Conchillo Flaqué	e5465034d9	RTVIObserver: add support for user/bot audio levels	2025-09-30 13:41:26 -07:00
Aleix Conchillo Flaqué	568c7c782d	rtvi: allow None RTVIProcessor and rename to send_rtvi_message()	2025-09-30 13:35:27 -07:00
Aleix Conchillo Flaqué	9851334221	rtvi: deprecate errors_enabled and always send errors	2025-09-30 13:31:30 -07:00
Aleix Conchillo Flaqué	e79c4fc99d	PipelineTask: avoid cancellation if application is finished	2025-09-30 13:18:25 -07:00
Aleix Conchillo Flaqué	55c321f4ff	Merge pull request #2751 from pipecat-ai/aleix/nova-sonic-disconnect-fix AWSNovaSonicLLMService: add missing await	2025-09-30 13:12:22 -07:00
kompfner	a14a53a005	Merge pull request #2735 from pipecat-ai/pk/remove-openaillmcontext-usage Remove remaining usage of `OpenAILLMContext` throughout the codebase …	2025-09-30 10:09:25 -04:00
Mark Backman	a71f937e8f	Update AnthropicLLMService to use claude-sonnet-4-5-20250929	2025-09-30 08:49:30 -04:00
Mark Backman	d0178edad0	Merge pull request #2753 from pipecat-ai/mb/quickstart-0.0.86 Quickstart: Update to 0.0.86, removing pytorch requirements	2025-09-29 09:43:33 -04:00
Mark Backman	795c5e55d9	Quickstart: Update to 0.0.86, removing pytorch requirements	2025-09-27 08:30:37 -04:00
Aleix Conchillo Flaqué	8f8d8ae0d8	AWSNovaSonicLLMService: add missing await	2025-09-26 15:58:05 -07:00
Vanessa Pyne	741f192d04	Merge pull request #2096 from pipecat-ai/vp-mcp-ex-nit mcp examples: check for env vars needed for examples	2025-09-26 10:21:22 -05:00
Aleix Conchillo Flaqué	ee00ee5c57	Merge pull request #2744 from pipecat-ai/aleix/vad-analyzer-thread-executor BaseInputTransport: create VAD thread in VADAnalyzer	2025-09-25 13:43:34 -07:00
Aleix Conchillo Flaqué	f53fd880dc	BaseInputTransport: create VAD thread in VADAnalyzer We move the thread creation to the VADAnalyzer instead of the input transport. This can potentially be useful if we need to analyze multiple audio streams.	2025-09-25 13:41:20 -07:00
Aleix Conchillo Flaqué	de3461e4cc	Merge pull request #2743 from pipecat-ai/aleix/turn-analyzer-fixes turn analyzer fixes	2025-09-25 13:40:43 -07:00
Aleix Conchillo Flaqué	7bafc3a1bb	BaseSmartTurn: process speech in a separate thread	2025-09-25 13:37:28 -07:00
Aleix Conchillo Flaqué	22ef61fe8d	BaseTurnAnalyzer: add BaseTurnParams base class for parameters	2025-09-25 13:37:09 -07:00
Aleix Conchillo Flaqué	7078fb53bd	Merge pull request #2738 from pipecat-ai/aleix/openai-cached-tokens-metrics BaseOpenAILLMService: include cached tokens to metrics frame	2025-09-25 13:36:03 -07:00
Aleix Conchillo Flaqué	33447ad6f2	BaseOpenAILLMService: include cached tokens to metrics frame	2025-09-24 19:32:16 -07:00
Paul Kompfner	6faa50ae5b	Remove remaining usage of `OpenAILLMContext` throughout the codebase in favor of `LLMContext`, except for: - Usage in classes that are already deprecated - Usage related to realtime LLMs, which don't yet support `LLMContext` - Usage in (soon-to-be-deprecated) code paths related to `OpenAILLMContext` itself and associated machinery	2025-09-24 16:35:03 -04:00
Aleix Conchillo Flaqué	3797f41c8c	Merge pull request #2734 from pipecat-ai/aleix/pipecat-0.0.86 update CHANGELOG for 0.0.86	2025-09-24 12:19:16 -07:00
Aleix Conchillo Flaqué	ff919b8c15	update CHANGELOG for 0.0.86	2025-09-24 11:28:14 -07:00
Aleix Conchillo Flaqué	cb048d6c7e	Merge pull request #2733 from pipecat-ai/aleix/tavus-missing-daily-callback TavusTransport: add missing on_before_leave callback	2025-09-24 11:10:32 -07:00
Aleix Conchillo Flaqué	6c2c43ade0	Merge pull request #2724 from pipecat-ai/pk/update-natural-conversation-examples-with-universal-context Update natural conversation examples with universal context	2025-09-24 11:07:50 -07:00
Aleix Conchillo Flaqué	f899c15b03	Merge pull request #2731 from pipecat-ai/pk/update-example-25-to-use-universal-context Update example 25 to use universal `LLMContext`	2025-09-24 11:05:36 -07:00
Aleix Conchillo Flaqué	d10ef08775	Merge pull request #2727 from pipecat-ai/pk/strands-agents-needs-to-support-openaillmcontextframe-for-now `StrandsAgentsProcessor` should still support `OpenAILLMContextFrame`…	2025-09-24 11:05:07 -07:00
Aleix Conchillo Flaqué	27a5af6fa1	Merge pull request #2728 from pipecat-ai/pk/fix-playht-in-env-example Fix PlayHT env variable names in env.example	2025-09-24 11:04:55 -07:00
Aleix Conchillo Flaqué	4bff0a7c49	Merge pull request #2732 from pipecat-ai/pk/update-voicemail-detector-to-use-llm-context Update `VoicemailDetector` to use universal `LLMContext`	2025-09-24 11:04:42 -07:00
Aleix Conchillo Flaqué	508f7d203d	Merge pull request #2729 from pipecat-ai/aleix/frame-processor-cancel-default-timeout FrameProcessor: timeout when cancelling tasks	2025-09-24 10:55:52 -07:00
Filipi da Silva Fuchter	0f87d5342c	Merge pull request #2266 from pipecat-ai/filipi/hey_gen_transport HeyGen implementation for Pipecat - HeyGenTransport	2025-09-24 14:35:50 -03:00
Filipi Fuchter	f6164e3bde	HeyGen implementation for Pipecat - HeyGenTransport.	2025-09-24 14:33:56 -03:00
Aleix Conchillo Flaqué	1a0fb55d0f	TavusTransport: add missing on_before_leave callback	2025-09-24 10:18:56 -07:00
Paul Kompfner	6d0beef944	Update `VoicemailDetector` to use universal `LLMContext`	2025-09-24 12:58:42 -04:00
Paul Kompfner	b9fd6b873b	Update comment in example 07b to reference `LLMContext` rather than `OpenAILLMContext`	2025-09-24 12:49:34 -04:00
Paul Kompfner	dea0f1791f	Update `OpenAILLMAdapter.get_messages_for_logging()` to truncate "input_audio" message data	2025-09-24 12:41:41 -04:00
Paul Kompfner	da66c38795	Update example 25 to use universal `LLMContext`	2025-09-24 12:37:29 -04:00
Paul Kompfner	912f8b96f0	Fix PlayHT env variable names in env.example	2025-09-24 11:24:35 -04:00
Aleix Conchillo Flaqué	f9eb447d82	FrameProcessor: timeout when cancelling tasks	2025-09-24 08:24:28 -07:00
Paul Kompfner	65f5fe8588	`StrandsAgentsProcessor` should still support `OpenAILLMContextFrame` until that frame has been deprecated	2025-09-24 11:05:27 -04:00
Paul Kompfner	817c77f3fe	Update `SmallWebRTCTransport` to pass a sender ID in the "on_app_message" event	2025-09-24 10:24:59 -04:00
Paul Kompfner	8896179b00	Update another "natural conversation" example to use universal `LLMContext`. Note that this one had to also be fixed in various ways, as it wasn't working.	2025-09-24 09:55:11 -04:00
Filipi da Silva Fuchter	463752360b	Merge pull request #2726 from pipecat-ai/mb/twilio-serializer-cleanup Fixup for TwilioFrameSerializer	2025-09-24 10:45:38 -03:00
Paul Kompfner	66b7977a62	Make `SmallWebRTCTransport` adhere to the expected "on_app_message" event signature	2025-09-24 09:34:28 -04:00
Mark Backman	468de68aec	Fixup for TwilioFrameSerializer	2025-09-24 09:32:46 -04:00
Mark Backman	c4762c1a92	Merge pull request #2627 from jessieweiyi/main Support TwilioFrameSerializer region/edge settings. Close #2625.	2025-09-24 09:13:10 -04:00
Aleix Conchillo Flaqué	7f4d3a2f02	pyproject: updated sentry to 2.38.0	2025-09-23 19:12:03 -07:00
Jessie Wei	88614b312f	Merge branch 'pipecat-ai:main' into main	2025-09-24 10:23:52 +10:00
Jessie Wei	5b4655f45a	chore: Update per review comments	2025-09-24 00:22:56 +00:00
Aleix Conchillo Flaqué	d7c8f8df53	update CHANGELOG with AudioBufferProcessor fixes	2025-09-23 15:42:01 -07:00
Aleix Conchillo Flaqué	2571cb2e69	tests: fix formatting	2025-09-23 15:28:07 -07:00
Aleix Conchillo Flaqué	15782be27c	Merge pull request #2676 from golbin/main Fix audio buffer flush and silence handling	2025-09-23 15:27:31 -07:00
Aleix Conchillo Flaqué	997e4b66c6	Merge pull request #2722 from pipecat-ai/aleix/strands-agents-update-and-evals examples: update Strands Agents with universal context and add evals	2025-09-23 15:21:41 -07:00
Paul Kompfner	6ccbfd9b57	Update "natural conversation" examples to use universal `LLMContext`	2025-09-23 16:20:16 -04:00
Paul Kompfner	677f69971c	Add filters in 22b and 22c examples to prevent function call results triggering the "statement judge" LLM from running unnecessarily, and with the wrong system prompt, which would result in garbled output statements comprised of both LLMs outputs combined	2025-09-23 16:14:58 -04:00
Paul Kompfner	678dd22b8e	Add missing `sender` argument to a few "on_app_message" handlers in examples	2025-09-23 15:29:20 -04:00
Aleix Conchillo Flaqué	620b1f785c	examples: update Strands Agents with universal context and add evals	2025-09-23 11:37:57 -07:00
Jessie Wei	392293d55f	Merge branch 'pipecat-ai:main' into main	2025-09-23 07:48:45 +10:00
vipyne	889dc19a27	mcp examples: check for env vars needed for examples	2025-09-19 12:09:50 -05:00
Jin Kim	58f70e7e0d	Add tests for audio buffer processor flush alignment	2025-09-18 22:19:32 +09:00
Jin Kim	d0b573e44f	Fix audio buffer flush and silence handling	2025-09-18 19:40:45 +09:00
Jessie Wei	305108be9a	[TwilioFrameSerializer]: Add parameter validation	2025-09-10 23:00:15 +00:00
Jessie Wei	2e1f397d17	Support TwilioFrameSerializer region/edge settings to that the call is terminated successfully for regions other than default us1 region	2025-09-10 14:30:10 +00:00