docs: add a demo showing how to track usage

Merge pull request #2844 from pipecat-ai/aleix/runner-files-path
runner: allow subdirectories in --folder
2025-10-16 13:45:42 +08:00 · 2025-10-14 08:38:38 -07:00 · 2025-10-13 18:29:19 -07:00 · 2025-10-11 09:31:21 -07:00 · 2025-10-11 09:20:17 -07:00 · 2025-10-10 14:20:44 -07:00
73 changed files with 6550 additions and 5198 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,84 @@ All notable changes to **Pipecat** will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [Unreleased]
+
+### Added
+
+- The runner `--folder` argument now supports downloading files from
+  subdirectories.
+
+### Fixed
+
+- Fixed an issue where `RimeHttpTTSService` and `PiperTTSService` could generate
+  incorrectly 16-bit aligned audio frames, potentially leading to internal
+  errors or static audio.
+
+## [0.0.90] - 2025-10-10
+
+### Added
+
+- Added audio filter `KrispVivaFilter` using the Krisp VIVA SDK.
+
+- Added `--folder` argument to the runner, allowing files saved in that folder
+  to be downloaded from `http://HOST:PORT/file/FILE`.
+
+- Added `GeminiLiveVertexLLMService`, for accessing Gemini Live via Google
+  Vertex AI.
+
+- Added some new configuration options to `GeminiLiveLLMService`:
+
+  - `thinking`
+  - `enable_affective_dialog`
+  - `proactivity`
+
+  Note that these new configuration options require using a newer model than
+  the default, like "gemini-2.5-flash-native-audio-preview-09-2025". The last
+  two require specifying `http_options=HttpOptions(api_version="v1alpha")`.
+
+- Added `on_pipeline_error` event to `PipelineTask`. This event will get fired
+  when an `ErrorFrame` is pushed (use `FrameProcessor.push_error()`).
+
+  ```python
+  @task.event_handler("on_pipeline_error")
+  async def on_pipeline_error(task: PipelineTask, frame: ErrorFrame):
+      ...
+  ```
+
+- Added a `service_tier` `InputParam` to the `BaseOpenAILLMService`. This
+  parameter can influence the latency of the response. For example `"priority"`
+  will result in faster completions, but in exchange for a higher price.
+
+### Changed
+
+- Updated `GeminiLiveLLMService` to use the `google-genai` library rather than
+  use WebSockets directly.
+
+### Deprecated
+
+- `LivekitFrameSerializer` is now deprecated. Use `LiveKitTransport` instead.
+
+- `pipecat.service.openai_realtime` is now deprecated, use
+  `pipecat.services.openai.realtime` instead or
+  `pipecat.services.azure.realtime` for Azure Realtime.
+
+- `pipecat.service.aws_nova_sonic` is now deprecated, use
+  `pipecat.services.aws.nova_sonic` instead.
+
+- `GeminiMultimodalLiveLLMService` is now deprecated, use
+  `GeminiLiveLLMService`.
+
+### Fixed
+
+- Fixed a `GoogleVertexLLMService` issue that would generate an error if no
+  token information was returned.
+
+- `GeminiLiveLLMService` will now end gracefully (i.e. after the bot has
+  finished) upon receiving an `EndFrame`.
+
+- `GeminiLiveLLMService` will try to seamlessly reconnect when it loses its
+  connection.
+
 ## [0.0.89] - 2025-10-07

 ### Fixed
@@ -23,8 +101,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Added `HumeTTSService` for text-to-speech synthesis using Hume AI's expressive
  voice models. Provides high-quality, emotionally expressive speech synthesis
  with support for various voice models. Includes example in
-  `examples/foundational/07ad-interruptible-hume.py`. Use with `uv pip install
-  pipecat-ai[hume]`.
+  `examples/foundational/07ad-interruptible-hume.py`. Use with:
+  `uv pip install pipecat-ai[hume]`.

 ### Changed

@@ -1422,7 +1500,7 @@ quality and critical bugs impacting `ParallelPipelines` functionality.**
 - Added `session_token` parameter to `AWSNovaSonicLLMService`.

 - Added Gemini Multimodal Live File API for uploading, fetching, listing, and
-  deleting files. See `26f-gemini-multimodal-live-files-api.py` for example usage.
+  deleting files. See `26f-gemini-live-files-api.py` for example usage.

 ### Changed

@@ -3428,7 +3506,7 @@ stt = DeepgramSTTService(..., live_options=LiveOptions(model="nova-2-general"))
 - Added the new modalities option and helper function to set Gemini output
  modalities.

- Added `examples/foundational/26d-gemini-multimodal-live-text.py` which is
+- Added `examples/foundational/26d-gemini-live-text.py` which is
  using Gemini as TEXT modality and using another TTS provider for TTS process.

 ### Changed
@@ -3615,9 +3693,9 @@ stt = DeepgramSTTService(..., live_options=LiveOptions(model="nova-2-general"))
 - Added new foundational examples for `GeminiMultimodalLiveLLMService`:

  - `26-gemini-multimodal-live.py`
-  - `26a-gemini-multimodal-live-transcription.py`
-  - `26b-gemini-multimodal-live-video.py`
-  - `26c-gemini-multimodal-live-video.py`
+  - `26a-gemini-live-transcription.py`
+  - `26b-gemini-live-video.py`
+  - `26c-gemini-live-video.py`

 - Added `SimliVideoService`. This is an integration for Simli AI avatars.
  (see https://www.simli.com)
--- a/README.md
+++ b/README.md
@@ -3,6 +3,7 @@
 </div></h1>

 [![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![codecov](https://codecov.io/gh/pipecat-ai/pipecat/graph/badge.svg?token=LNVUIVO4Y9)](https://codecov.io/gh/pipecat-ai/pipecat) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/pipecat-ai/pipecat)
+[![](https://getmanta.ai/api/badges?text=Manta%20Graph&link=manta)](https://getmanta.ai/pipecat)

 # 🎙️ Pipecat: Real-Time Voice & Multimodal AI Agents

@@ -51,6 +52,10 @@ Looking for help debugging your pipeline and processors? Check out [Whisker](htt

 Love terminal applications? Check out [Tail](https://github.com/pipecat-ai/tail), a terminal dashboard for Pipecat.

+### 📺️ Pipecat TV Channel
+
+Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.youtube.com/playlist?list=PLzU2zoMTQIHjqC3v4q2XVSR3hGSzwKFwH) channel.
+
 ## 🎬 See it in action

 <p float="left">
--- a/docs/api/conf.py
+++ b/docs/api/conf.py
@@ -50,6 +50,7 @@ autodoc_mock_imports = [
    # Krisp - has build issues on some platforms
    "pipecat_ai_krisp",
    "krisp",
+    "krisp_audio",
    # System-specific GUI libraries
    "_tkinter",
    "tkinter",
--- a/env.example
+++ b/env.example
@@ -90,6 +90,9 @@ SIMLI_FACE_ID=...
 # Krisp
 KRISP_MODEL_PATH=...

+# Krisp Viva
+KRISP_VIVA_MODEL_PATH=...
+
 # DeepSeek
 DEEPSEEK_API_KEY=...

--- a/examples/foundational/07p-interruptible-krisp-viva.py
+++ b/examples/foundational/07p-interruptible-krisp-viva.py
@@ -0,0 +1,129 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.filters.krisp_viva_filter import KrispVivaFilter
+from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
+from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.deepgram.tts import DeepgramTTSService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+        audio_in_filter=KrispVivaFilter(),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+        audio_in_filter=KrispVivaFilter(),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+        audio_in_filter=KrispVivaFilter(),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info(f"Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-helios-en")
+
+    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = LLMContext(messages)
+    context_aggregator = LLMContextAggregatorPair(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),  # Transport user input
+            stt,  # STT
+            context_aggregator.user(),  # User responses
+            llm,  # LLM
+            tts,  # TTS
+            transport.output(),  # Transport bot output
+            context_aggregator.assistant(),  # Assistant spoken responses
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected")
+        # Kick off the conversation.
+        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/14p-function-calling-gemini-vertex-ai.py
+++ b/examples/foundational/14p-function-calling-gemini-vertex-ai.py
@@ -76,9 +76,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    llm = GoogleVertexLLMService(
        credentials=os.getenv("GOOGLE_VERTEX_TEST_CREDENTIALS"),
-        params=GoogleVertexLLMService.InputParams(
-            project_id=os.getenv("GOOGLE_CLOUD_PROJECT_ID"),
-        ),
+        project_id=os.getenv("GOOGLE_CLOUD_PROJECT_ID"),
+        location=os.getenv("GOOGLE_CLOUD_LOCATION"),
    )
    # You can aslo register a function_name of None to get all functions
    # sent to the same callback with an additional function_name parameter.
--- a/examples/foundational/18-openai-realtime-usage.py
+++ b/examples/foundational/18-openai-realtime-usage.py
@@ -0,0 +1,156 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Example: Print OpenAI Realtime API Token Usage Statistics
+
+This example demonstrates how to access and print token usage statistics
+from the OpenAI Realtime API, including detailed breakdowns of input/output
+tokens, cached tokens, and audio/text token usage.
+"""
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+# We store functions so objects don't get instantiated until the desired
+# transport gets selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    """Main function demonstrating usage statistics tracking."""
+    logger.info(f"Starting bot")
+
+    # Initialize the OpenAI Realtime service
+    llm = OpenAIRealtimeLLMService(
+        api_key=os.getenv("OPENAI_API_KEY") or "",
+        model="gpt-4o-realtime-preview-2024-12-17",
+    )
+
+    # To access usage statistics, we wrap the internal response handler
+    # This is the cleanest way to intercept usage data from the realtime API
+    original_handler = llm._handle_evt_response_done
+
+    async def custom_response_done_handler(evt):
+        """Custom handler that prints usage stats before calling original handler."""
+        # Print usage statistics if available
+        if evt.response.usage:
+            usage = evt.response.usage
+
+            logger.info("\n" + "=" * 50)
+            logger.info("📊 TOKEN USAGE STATISTICS")
+            logger.info("=" * 50)
+            logger.info(f"Total tokens: {usage.total_tokens}")
+            logger.info(f"Input tokens: {usage.input_tokens}")
+            logger.info(f"Output tokens: {usage.output_tokens}")
+
+            # Input token details
+            if usage.input_token_details:
+                logger.info(f"\n📥 Input token breakdown:")
+                logger.info(f"  • Cached tokens: {usage.input_token_details.cached_tokens}")
+                logger.info(f"  • Text tokens: {usage.input_token_details.text_tokens}")
+                logger.info(f"  • Audio tokens: {usage.input_token_details.audio_tokens}")
+
+                # Cached token details if available
+                if usage.input_token_details.cached_tokens_details:
+                    logger.info(
+                        f"  • Cached text tokens: {usage.input_token_details.cached_tokens_details.text_tokens}"
+                    )
+                    logger.info(
+                        f"  • Cached audio tokens: {usage.input_token_details.cached_tokens_details.audio_tokens}"
+                    )
+
+            # Output token details
+            if usage.output_token_details:
+                logger.info(f"\n📤 Output token breakdown:")
+                logger.info(f"  • Text tokens: {usage.output_token_details.text_tokens}")
+                logger.info(f"  • Audio tokens: {usage.output_token_details.audio_tokens}")
+
+            logger.info("=" * 50 + "\n")
+
+        # Call the original handler to maintain normal functionality
+        await original_handler(evt)
+
+    # Replace the handler with our custom one
+    llm._handle_evt_response_done = custom_response_done_handler
+
+    # Create pipeline
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            llm,
+            transport.output(),
+        ]
+    )
+
+    # Create task
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        logger.info("🎤 Speak into your microphone to interact with the assistant")
+        logger.info("📊 Usage statistics will be printed after each response")
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/19-openai-realtime.py
+++ b/examples/foundational/19-openai-realtime.py
@@ -24,14 +24,15 @@ from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai_realtime import (
+from pipecat.services.openai.realtime.events import (
+    AudioConfiguration,
+    AudioInput,
    InputAudioNoiseReduction,
    InputAudioTranscription,
-    OpenAIRealtimeLLMService,
    SemanticTurnDetection,
    SessionProperties,
 )
-from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
+from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
--- a/examples/foundational/19a-azure-realtime.py
+++ b/examples/foundational/19a-azure-realtime.py
@@ -21,13 +21,14 @@ from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
+from pipecat.services.azure.realtime.llm import AzureRealtimeLLMService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai_realtime import (
-    AzureRealtimeLLMService,
+from pipecat.services.openai.realtime.events import (
+    AudioConfiguration,
+    AudioInput,
    InputAudioTranscription,
    SessionProperties,
 )
-from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
--- a/examples/foundational/19b-openai-realtime-text.py
+++ b/examples/foundational/19b-openai-realtime-text.py
@@ -22,16 +22,17 @@ from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia import CartesiaTTSService
+from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai_realtime import (
+from pipecat.services.openai.realtime.events import (
+    AudioConfiguration,
+    AudioInput,
    InputAudioNoiseReduction,
    InputAudioTranscription,
-    OpenAIRealtimeLLMService,
    SemanticTurnDetection,
    SessionProperties,
 )
-from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
+from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
--- a/examples/foundational/20b-persistent-context-openai-realtime.py
+++ b/examples/foundational/20b-persistent-context-openai-realtime.py
@@ -25,13 +25,14 @@ from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.deepgram.stt import DeepgramSTTService
 from pipecat.services.llm_service import FunctionCallParams
-from pipecat.services.openai_realtime import (
+from pipecat.services.openai.realtime.events import (
+    AudioConfiguration,
+    AudioInput,
    InputAudioTranscription,
-    OpenAIRealtimeLLMService,
    SessionProperties,
    TurnDetection,
 )
-from pipecat.services.openai_realtime.events import AudioConfiguration, AudioInput
+from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
--- a/examples/foundational/20e-persistent-context-aws-nova-sonic.py
+++ b/examples/foundational/20e-persistent-context-aws-nova-sonic.py
@@ -23,7 +23,7 @@ from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.aws_nova_sonic.aws import AWSNovaSonicLLMService
+from pipecat.services.aws.nova_sonic.llm import AWSNovaSonicLLMService
 from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
--- a/examples/foundational/26-gemini-multimodal-live.py
+++ b/examples/foundational/26-gemini-multimodal-live.py
@@ -17,7 +17,7 @@ from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -65,7 +65,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    Respond to what the user said in a creative and helpful way.
    """

-    llm = GeminiMultimodalLiveLLMService(
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=system_instruction,
        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
--- a/examples/foundational/26a-gemini-multimodal-live-transcription.py
+++ b/examples/foundational/26a-gemini-multimodal-live-transcription.py
@@ -20,7 +20,7 @@ from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.transcript_processor import TranscriptProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -65,7 +65,7 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    llm = GeminiMultimodalLiveLLMService(
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Aoede",  # Puck, Charon, Kore, Fenrir, Aoede
        # system_instruction="Talk like a pirate."
--- a/examples/foundational/26b-gemini-multimodal-live-function-calling.py
+++ b/examples/foundational/26b-gemini-multimodal-live-function-calling.py
@@ -22,7 +22,7 @@ from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -122,12 +122,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        required=["location"],
    )
    search_tool = {"google_search": {}}
+    # KNOWN ISSUE: If using GeminiVertexLiveLLMService, it appears
+    # you cannot use the "google_search" tool alongside other tools.
+    # See https://github.com/googleapis/python-genai/issues/941.
    tools = ToolsSchema(
        standard_tools=[weather_function, restaurant_function],
        custom_tools={AdapterType.GEMINI: [search_tool]},
    )

-    llm = GeminiMultimodalLiveLLMService(
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=system_instruction,
        tools=tools,
--- a/examples/foundational/26c-gemini-multimodal-live-video.py
+++ b/examples/foundational/26c-gemini-multimodal-live-video.py
@@ -24,7 +24,7 @@ from pipecat.runner.utils import (
    maybe_capture_participant_camera,
    maybe_capture_participant_screen,
 )
-from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams

@@ -58,7 +58,7 @@ transport_params = {


 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
-    llm = GeminiMultimodalLiveLLMService(
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Aoede",  # Puck, Charon, Kore, Fenrir, Aoede
        # system_instruction="Talk like a pirate."
--- a/examples/foundational/26d-gemini-multimodal-live-text.py
+++ b/examples/foundational/26d-gemini-multimodal-live-text.py
@@ -20,9 +20,9 @@ from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
 from pipecat.services.cartesia.tts import CartesiaTTSService
-from pipecat.services.gemini_multimodal_live.gemini import (
-    GeminiMultimodalLiveLLMService,
-    GeminiMultimodalModalities,
+from pipecat.services.google.gemini_live.llm import (
+    GeminiLiveLLMService,
+    GeminiModalities,
    InputParams,
 )
 from pipecat.transports.base_transport import BaseTransport, TransportParams
@@ -80,11 +80,15 @@ transport_params = {
 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

-    llm = GeminiMultimodalLiveLLMService(
+    # KNOWN ISSUE: If using GeminiLiveVertexLLMService, you cannot specify a
+    # modality other than AUDIO (at least not if using the service's default
+    # model, which is a native audio model:
+    # https://cloud.google.com/vertex-ai/generative-ai/docs/live-api/tools#native-audio).
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=SYSTEM_INSTRUCTION,
        tools=[{"google_search": {}}, {"code_execution": {}}],
-        params=InputParams(modalities=GeminiMultimodalModalities.TEXT),
+        params=InputParams(modalities=GeminiModalities.TEXT),
    )

    # Optionally, you can set the response modalities via a function
--- a/examples/foundational/26e-gemini-multimodal-google-search.py
+++ b/examples/foundational/26e-gemini-multimodal-google-search.py
@@ -19,7 +19,7 @@ from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -83,7 +83,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info(f"Starting bot")

    # Initialize the Gemini Multimodal Live model
-    llm = GeminiMultimodalLiveLLMService(
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
        system_instruction=system_instruction,
--- a/examples/foundational/26f-gemini-multimodal-live-files-api.py
+++ b/examples/foundational/26f-gemini-multimodal-live-files-api.py
@@ -19,9 +19,7 @@ from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.gemini_multimodal_live.gemini import (
-    GeminiMultimodalLiveLLMService,
-)
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -110,7 +108,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    """

    # Initialize Gemini service with File API support
-    llm = GeminiMultimodalLiveLLMService(
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=system_instruction,
        voice_id="Charon",  # Aoede, Charon, Fenrir, Kore, Puck
--- a/examples/foundational/26g-gemini-multimodal-live-groundingMetadata.py
+++ b/examples/foundational/26g-gemini-multimodal-live-groundingMetadata.py
@@ -9,13 +9,13 @@ from pipecat.audio.vad.vad_analyzer import VADParams
 from pipecat.frames.frames import Frame, LLMRunFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.pipeline.task import PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
 from pipecat.services.google.frames import LLMSearchResponseFrame
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
 from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -105,7 +105,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        custom_tools={AdapterType.GEMINI: [{"google_search": {}}, {"code_execution": {}}]},
    )

-    llm = GeminiMultimodalLiveLLMService(
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        system_instruction=SYSTEM_INSTRUCTION,
        voice_id="Charon",  # Aoede, Charon, Fenrir, Kore, Puck
--- a/examples/foundational/26h-gemini-live-vertex-function-calling.py
+++ b/examples/foundational/26h-gemini-live-vertex-function-calling.py
@@ -0,0 +1,191 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+from datetime import datetime
+
+from dotenv import load_dotenv
+from google.genai.types import HttpOptions
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.google.gemini_live.llm_vertex import GeminiLiveVertexLLMService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
+    await params.result_callback(
+        {
+            "conditions": "nice",
+            "temperature": temperature,
+            "format": params.arguments["format"],
+            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
+        }
+    )
+
+
+async def fetch_restaurant_recommendation(params: FunctionCallParams):
+    await params.result_callback({"name": "The Golden Dragon"})
+
+
+system_instruction = """
+You are a helpful assistant who can answer questions and use tools.
+
+You have three tools available to you:
+1. get_current_weather: Use this tool to get the current weather in a specific location.
+2. get_restaurant_recommendation: Use this tool to get a restaurant recommendation in a specific location.
+"""
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        # set stop_secs to something roughly similar to the internal setting
+        # of the Multimodal Live api, just to align events. This doesn't really
+        # matter because we can only use the Multimodal Live API's phrase
+        # endpointing, for now.
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        # set stop_secs to something roughly similar to the internal setting
+        # of the Multimodal Live api, just to align events. This doesn't really
+        # matter because we can only use the Multimodal Live API's phrase
+        # endpointing, for now.
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        # set stop_secs to something roughly similar to the internal setting
+        # of the Multimodal Live api, just to align events. This doesn't really
+        # matter because we can only use the Multimodal Live API's phrase
+        # endpointing, for now.
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info(f"Starting bot")
+
+    weather_function = FunctionSchema(
+        name="get_current_weather",
+        description="Get the current weather",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+            "format": {
+                "type": "string",
+                "enum": ["celsius", "fahrenheit"],
+                "description": "The temperature unit to use. Infer this from the user's location.",
+            },
+        },
+        required=["location", "format"],
+    )
+    restaurant_function = FunctionSchema(
+        name="get_restaurant_recommendation",
+        description="Get a restaurant recommendation",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+        },
+        required=["location"],
+    )
+    # KNOWN ISSUE: If using GeminiVertexLiveLLMService, it appears
+    # you cannot use the "google_search" tool alongside other tools.
+    # See https://github.com/googleapis/python-genai/issues/941.
+    tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
+
+    llm = GeminiLiveVertexLLMService(
+        credentials=os.getenv("GOOGLE_VERTEX_TEST_CREDENTIALS"),
+        project_id=os.getenv("GOOGLE_CLOUD_PROJECT_ID"),
+        location=os.getenv("GOOGLE_CLOUD_LOCATION"),
+        system_instruction=system_instruction,
+        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
+        tools=tools,
+    )
+
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
+
+    context = OpenAILLMContext(
+        [{"role": "user", "content": "Say hello."}],
+    )
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            context_aggregator.user(),
+            llm,
+            transport.output(),
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected")
+        # Kick off the conversation.
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/26i-gemini-live-graceful-end.py
+++ b/examples/foundational/26i-gemini-live-graceful-end.py
@@ -0,0 +1,204 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import os
+from datetime import datetime
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.frames.frames import EndTaskFrame, LLMRunFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
+    await params.result_callback(
+        {
+            "conditions": "nice",
+            "temperature": temperature,
+            "format": params.arguments["format"],
+            "timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
+        }
+    )
+
+
+async def fetch_restaurant_recommendation(params: FunctionCallParams):
+    await params.result_callback({"name": "The Golden Dragon"})
+
+
+async def end_conversation(params: FunctionCallParams):
+    await params.result_callback({"success": True})
+    await params.llm.push_frame(EndTaskFrame(), FrameDirection.UPSTREAM)
+
+
+system_instruction = """
+You are a helpful assistant who can answer questions and use tools.
+
+You have three tools available to you:
+1. get_current_weather: Use this tool to get the current weather in a specific location.
+2. get_restaurant_recommendation: Use this tool to get a restaurant recommendation in a specific location.
+3. end_conversation: Use this tool to gracefully end the conversation.
+
+After you've responded to the user three times, do two things, in order:
+1. Politely let them know that that's all the time you have today and say goodbye.
+2. Call the end_conversation tool to gracefully end the conversation.
+"""
+
+
+# We store functions so objects (e.g. SileroVADAnalyzer) don't get
+# instantiated. The function will be called when the desired transport gets
+# selected.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        # set stop_secs to something roughly similar to the internal setting
+        # of the Multimodal Live api, just to align events. This doesn't really
+        # matter because we can only use the Multimodal Live API's phrase
+        # endpointing, for now.
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        # set stop_secs to something roughly similar to the internal setting
+        # of the Multimodal Live api, just to align events. This doesn't really
+        # matter because we can only use the Multimodal Live API's phrase
+        # endpointing, for now.
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        # set stop_secs to something roughly similar to the internal setting
+        # of the Multimodal Live api, just to align events. This doesn't really
+        # matter because we can only use the Multimodal Live API's phrase
+        # endpointing, for now.
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5)),
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info(f"Starting bot")
+
+    weather_function = FunctionSchema(
+        name="get_current_weather",
+        description="Get the current weather",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+            "format": {
+                "type": "string",
+                "enum": ["celsius", "fahrenheit"],
+                "description": "The temperature unit to use. Infer this from the user's location.",
+            },
+        },
+        required=["location", "format"],
+    )
+    restaurant_function = FunctionSchema(
+        name="get_restaurant_recommendation",
+        description="Get a restaurant recommendation",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+        },
+        required=["location"],
+    )
+    end_conversation_function = FunctionSchema(
+        name="end_conversation",
+        description="Gracefully end the conversation",
+        properties={},
+        required=[],
+    )
+    search_tool = {"google_search": {}}
+    tools = ToolsSchema(
+        standard_tools=[weather_function, restaurant_function, end_conversation_function],
+        custom_tools={AdapterType.GEMINI: [search_tool]},
+    )
+
+    llm = GeminiLiveLLMService(
+        api_key=os.getenv("GOOGLE_API_KEY"),
+        system_instruction=system_instruction,
+        tools=tools,
+    )
+
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
+    llm.register_function("end_conversation", end_conversation)
+
+    context = OpenAILLMContext(
+        [{"role": "user", "content": "Say hello."}],
+    )
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            context_aggregator.user(),
+            llm,
+            transport.output(),
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected")
+        # Kick off the conversation.
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/foundational/40-aws-nova-sonic.py
+++ b/examples/foundational/40-aws-nova-sonic.py
@@ -21,7 +21,7 @@ from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.aws_nova_sonic import AWSNovaSonicLLMService
+from pipecat.services.aws.nova_sonic.llm import AWSNovaSonicLLMService
 from pipecat.services.llm_service import FunctionCallParams
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
--- a/examples/foundational/46-video-processing.py
+++ b/examples/foundational/46-video-processing.py
@@ -20,7 +20,7 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.processors.frameworks.rtvi import RTVIObserver, RTVIProcessor
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.gemini_multimodal_live import GeminiMultimodalLiveLLMService
+from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
 from pipecat.transports.base_transport import TransportParams
 from pipecat.transports.daily.transport import DailyParams, DailyTransport

@@ -94,7 +94,7 @@ Respond to what the user said in a creative and helpful way. Keep your responses


 async def run_bot(pipecat_transport):
-    llm = GeminiMultimodalLiveLLMService(
+    llm = GeminiLiveLLMService(
        api_key=os.getenv("GOOGLE_API_KEY"),
        voice_id="Puck",  # Aoede, Charon, Fenrir, Kore, Puck
        transcribe_user_audio=True,
--- a/examples/foundational/README.md
+++ b/examples/foundational/README.md
@@ -105,7 +105,7 @@ uv run 07-interruptible.py -t twilio -x NGROK_HOST_NAME
 ### Vision & Multimodal

 - **[12a-describe-video-gemini-flash.py](./12a-describe-video-gemini-flash.py)**: Bot describes user's video (Video input, Multimodal LLMs)
- **[26c-gemini-multimodal-live-video.py](./26c-gemini-multimodal-live-video.py)**: Gemini with video input (Streaming video, Function calls)
+- **[26c-gemini-live-video.py](./26c-gemini-live-video.py)**: Gemini with video input (Streaming video, Function calls)

 ### Voice & Language

--- a/scripts/evals/run-release-evals.py
+++ b/scripts/evals/run-release-evals.py
@@ -147,7 +147,10 @@ TESTS_15 = [
 ]

 TESTS_19 = [
+    ("19-openai-realtime.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("19-openai-realtime-beta.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
+    # OpenAI Realtime not released on Azure yet
+    # ("19a-azure-realtime.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("19a-azure-realtime-beta.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("19b-openai-realtime-text.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
    ("19b-openai-realtime-beta-text.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
@@ -160,18 +163,18 @@ TESTS_21 = [
 TESTS_26 = [
    ("26-gemini-multimodal-live.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    (
-        "26a-gemini-multimodal-live-transcription.py",
+        "26a-gemini-live-transcription.py",
        PROMPT_SIMPLE_MATH,
        EVAL_SIMPLE_MATH,
        BOT_SPEAKS_FIRST,
    ),
    (
-        "26b-gemini-multimodal-live-function-calling.py",
+        "26b-gemini-live-function-calling.py",
        PROMPT_WEATHER,
        EVAL_WEATHER,
        BOT_SPEAKS_FIRST,
    ),
-    ("26c-gemini-multimodal-live-video.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    ("26c-gemini-live-video.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
    (
        "26e-gemini-multimodal-google-search.py",
        PROMPT_ONLINE_SEARCH,
@@ -179,7 +182,13 @@ TESTS_26 = [
        BOT_SPEAKS_FIRST,
    ),
    # Currently not working.
-    # ("26d-gemini-multimodal-live-text.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    # ("26d-gemini-live-text.py", PROMPT_SIMPLE_MATH, EVAL_SIMPLE_MATH, BOT_SPEAKS_FIRST),
+    (
+        "26h-gemini-live-vertex-function-calling.py",
+        PROMPT_WEATHER,
+        EVAL_WEATHER,
+        BOT_SPEAKS_FIRST,
+    ),
 ]

 TESTS_27 = [
--- a/src/pipecat/adapters/services/gemini_adapter.py
+++ b/src/pipecat/adapters/services/gemini_adapter.py
@@ -87,9 +87,11 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
            Includes both converted standard tools and any custom Gemini-specific tools.
        """
        functions_schema = tools_schema.standard_tools
-        formatted_standard_tools = [
-            {"function_declarations": [func.to_default_dict() for func in functions_schema]}
-        ]
+        formatted_standard_tools = (
+            [{"function_declarations": [func.to_default_dict() for func in functions_schema]}]
+            if functions_schema
+            else []
+        )
        custom_gemini_tools = []
        if tools_schema.custom_tools:
            custom_gemini_tools = tools_schema.custom_tools.get(AdapterType.GEMINI, [])
--- a/src/pipecat/audio/filters/krisp_viva_filter.py
+++ b/src/pipecat/audio/filters/krisp_viva_filter.py
@@ -0,0 +1,193 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Krisp noise reduction audio filter for Pipecat.
+
+This module provides an audio filter implementation using Krisp VIVA SDK.
+"""
+
+import os
+
+import numpy as np
+from loguru import logger
+
+from pipecat.audio.filters.base_audio_filter import BaseAudioFilter
+from pipecat.frames.frames import FilterControlFrame, FilterEnableFrame
+
+try:
+    import krisp_audio
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error("In order to use the Krisp filter, you need to install krisp_audio.")
+    raise Exception(f"Missing module: {e}")
+
+
+def _log_callback(log_message, log_level):
+    logger.info(f"[{log_level}] {log_message}")
+
+
+class KrispVivaFilter(BaseAudioFilter):
+    """Audio filter using the Krisp VIVA SDK.
+
+    Provides real-time noise reduction for audio streams using Krisp's
+    proprietary noise suppression algorithms. This filter requires a
+    valid Krisp model file to operate.
+
+    Supported sample rates:
+        - 8000 Hz
+        - 16000 Hz
+        - 24000 Hz
+        - 32000 Hz
+        - 44100 Hz
+        - 48000 Hz
+    """
+
+    # Initialize Krisp Audio SDK globally
+    krisp_audio.globalInit("", _log_callback, krisp_audio.LogLevel.Off)
+    SDK_VERSION = krisp_audio.getVersion()
+    logger.debug(
+        f"Krisp Audio Python SDK Version: {SDK_VERSION.major}."
+        f"{SDK_VERSION.minor}.{SDK_VERSION.patch}"
+    )
+
+    SAMPLE_RATES = {
+        8000: krisp_audio.SamplingRate.Sr8000Hz,
+        16000: krisp_audio.SamplingRate.Sr16000Hz,
+        24000: krisp_audio.SamplingRate.Sr24000Hz,
+        32000: krisp_audio.SamplingRate.Sr32000Hz,
+        44100: krisp_audio.SamplingRate.Sr44100Hz,
+        48000: krisp_audio.SamplingRate.Sr48000Hz,
+    }
+
+    FRAME_SIZE_MS = 10  # Krisp requires audio frames of 10ms duration for processing.
+
+    def __init__(self, model_path: str = None, noise_suppression_level: int = 100) -> None:
+        """Initialize the Krisp noise reduction filter.
+
+        Args:
+            model_path: Path to the Krisp model file (.kef extension).
+                If None, uses KRISP_VIVA_MODEL_PATH environment variable.
+            noise_suppression_level: Noise suppression level.
+
+        Raises:
+            ValueError: If model_path is not provided and KRISP_VIVA_MODEL_PATH is not set.
+            Exception: If model file doesn't have .kef extension.
+            FileNotFoundError: If model file doesn't exist.
+        """
+        super().__init__()
+
+        # Set model path, checking environment if not specified
+        self._model_path = model_path or os.getenv("KRISP_VIVA_MODEL_PATH")
+        if not self._model_path:
+            logger.error("Model path is not provided and KRISP_VIVA_MODEL_PATH is not set.")
+            raise ValueError("Model path for KrispAudioProcessor must be provided.")
+
+        if not self._model_path.endswith(".kef"):
+            raise Exception("Model is expected with .kef extension")
+
+        if not os.path.isfile(self._model_path):
+            raise FileNotFoundError(f"Model file not found: {self._model_path}")
+
+        self._filtering = True
+        self._session = None
+        self._samples_per_frame = None
+        self._noise_suppression_level = noise_suppression_level
+
+        # Audio buffer to accumulate samples for complete frames
+        self._audio_buffer = bytearray()
+
+    def _int_to_sample_rate(self, sample_rate):
+        """Convert integer sample rate to krisp_audio SamplingRate enum.
+
+        Args:
+            sample_rate: Sample rate as integer
+
+        Returns:
+            krisp_audio.SamplingRate enum value
+
+        Raises:
+            ValueError: If sample rate is not supported
+        """
+        if sample_rate not in self.SAMPLE_RATES:
+            raise ValueError("Unsupported sample rate")
+        return self.SAMPLE_RATES[sample_rate]
+
+    async def start(self, sample_rate: int):
+        """Initialize the Krisp processor with the transport's sample rate.
+
+        Args:
+            sample_rate: The sample rate of the input transport in Hz.
+        """
+        model_info = krisp_audio.ModelInfo()
+        model_info.path = self._model_path
+
+        nc_cfg = krisp_audio.NcSessionConfig()
+        nc_cfg.inputSampleRate = self._int_to_sample_rate(sample_rate)
+        nc_cfg.inputFrameDuration = krisp_audio.FrameDuration.Fd10ms
+        nc_cfg.outputSampleRate = nc_cfg.inputSampleRate
+        nc_cfg.modelInfo = model_info
+
+        self._samples_per_frame = int((sample_rate * self.FRAME_SIZE_MS) / 1000)
+        self._session = krisp_audio.NcInt16.create(nc_cfg)
+
+    async def stop(self):
+        """Clean up the Krisp processor when stopping."""
+        self._session = None
+
+    async def process_frame(self, frame: FilterControlFrame):
+        """Process control frames to enable/disable filtering.
+
+        Args:
+            frame: The control frame containing filter commands.
+        """
+        if isinstance(frame, FilterEnableFrame):
+            self._filtering = frame.enable
+
+    async def filter(self, audio: bytes) -> bytes:
+        """Apply Krisp noise reduction to audio data.
+
+        Args:
+            audio: Raw audio data as bytes to be filtered.
+
+        Returns:
+            Noise-reduced audio data as bytes.
+        """
+        if not self._filtering:
+            return audio
+
+        # Add incoming audio to our buffer
+        self._audio_buffer.extend(audio)
+
+        # Calculate how many complete frames we can process
+        total_samples = len(self._audio_buffer) // 2  # 2 bytes per int16 sample
+        num_complete_frames = total_samples // self._samples_per_frame
+
+        if num_complete_frames == 0:
+            # Not enough samples for a complete frame yet, return empty
+            return b""
+
+        # Calculate how many bytes we need for complete frames
+        complete_samples_count = num_complete_frames * self._samples_per_frame
+        bytes_to_process = complete_samples_count * 2  # 2 bytes per sample
+
+        # Extract the bytes we can process
+        audio_to_process = bytes(self._audio_buffer[:bytes_to_process])
+
+        # Remove processed bytes from buffer, keep the remainder
+        self._audio_buffer = self._audio_buffer[bytes_to_process:]
+
+        # Process the complete frames
+        samples = np.frombuffer(audio_to_process, dtype=np.int16)
+        frames = samples.reshape(-1, self._samples_per_frame)
+        processed_samples = np.empty_like(samples)
+
+        for i, frame in enumerate(frames):
+            cleaned_frame = self._session.process(frame, self._noise_suppression_level)
+            processed_samples[i * self._samples_per_frame : (i + 1) * self._samples_per_frame] = (
+                cleaned_frame
+            )
+
+        return processed_samples.tobytes()
--- a/src/pipecat/pipeline/task.py
+++ b/src/pipecat/pipeline/task.py
@@ -138,6 +138,8 @@ class PipelineTask(BasePipelineTask):
          Use this event for cleanup, logging, or post-processing tasks. Users can inspect
          the frame if they need to handle specific cases.

+    - on_pipeline_error: Called when an error occurs with ErrorFrame
+
    Example::

        @task.event_handler("on_frame_reached_upstream")
@@ -148,9 +150,17 @@ class PipelineTask(BasePipelineTask):
        async def on_pipeline_idle_timeout(task):
            ...

+        @task.event_handler("on_pipeline_started")
+        async def on_pipeline_started(task, frame):
+            ...
+
        @task.event_handler("on_pipeline_finished")
        async def on_pipeline_finished(task, frame):
            ...
+
+        @task.event_handler("on_pipeline_error")
+        async def on_pipeline_error(task, frame):
+            ...
    """

    def __init__(
@@ -288,6 +298,7 @@ class PipelineTask(BasePipelineTask):
        self._register_event_handler("on_pipeline_ended")
        self._register_event_handler("on_pipeline_cancelled")
        self._register_event_handler("on_pipeline_finished")
+        self._register_event_handler("on_pipeline_error")

    @property
    def params(self) -> PipelineParams:
@@ -694,12 +705,11 @@ class PipelineTask(BasePipelineTask):
            logger.debug(f"{self}: received interruption task frame {frame}")
            await self._pipeline.queue_frame(InterruptionFrame())
        elif isinstance(frame, ErrorFrame):
+            await self._call_event_handler("on_pipeline_error", frame)
            if frame.fatal:
                logger.error(f"A fatal error occurred: {frame}")
                # Cancel all tasks downstream.
                await self.queue_frame(CancelFrame())
-                # Tell the task we should stop.
-                await self.queue_frame(StopTaskFrame())
            else:
                logger.warning(f"{self}: Something went wrong: {frame}")

--- a/src/pipecat/processors/frame_processor.py
+++ b/src/pipecat/processors/frame_processor.py
@@ -877,6 +877,8 @@ class FrameProcessor(BaseObject):

        """
        while True:
+            (frame, direction, callback) = await self.__input_queue.get()
+
            if self.__should_block_system_frames and self.__input_event:
                logger.trace(f"{self}: system frame processing paused")
                await self.__input_event.wait()
@@ -884,8 +886,6 @@ class FrameProcessor(BaseObject):
                self.__should_block_system_frames = False
                logger.trace(f"{self}: system frame processing resumed")

-            (frame, direction, callback) = await self.__input_queue.get()
-
            if isinstance(frame, SystemFrame):
                await self.__process_frame(frame, direction, callback)
            elif self.__process_queue:
@@ -900,6 +900,8 @@ class FrameProcessor(BaseObject):
    async def __process_frame_task_handler(self):
        """Handle non-system frames from the process queue."""
        while True:
+            (frame, direction, callback) = await self.__process_queue.get()
+
            if self.__should_block_frames and self.__process_event:
                logger.trace(f"{self}: frame processing paused")
                await self.__process_event.wait()
@@ -907,8 +909,6 @@ class FrameProcessor(BaseObject):
                self.__should_block_frames = False
                logger.trace(f"{self}: frame processing resumed")

-            (frame, direction, callback) = await self.__process_queue.get()
-
            await self.__process_frame(frame, direction, callback)

            self.__process_queue.task_done()
--- a/src/pipecat/runner/run.py
+++ b/src/pipecat/runner/run.py
@@ -67,12 +67,15 @@ To run locally:

 import argparse
 import asyncio
+import mimetypes
 import os
 import sys
 from contextlib import asynccontextmanager
+from pathlib import Path
 from typing import Optional

 import aiohttp
+from fastapi.responses import FileResponse
 from loguru import logger

 from pipecat.runner.types import (
@@ -98,6 +101,12 @@ except ImportError as e:
 load_dotenv(override=True)
 os.environ["ENV"] = "local"

+TELEPHONY_TRANSPORTS = ["twilio", "telnyx", "plivo", "exotel"]
+
+RUNNER_DOWNLOADS_FOLDER: Optional[str] = None
+RUNNER_HOST: str = "localhost"
+RUNNER_PORT: int = 7860
+

 def _get_bot_module():
    """Get the bot module from the calling script."""
@@ -152,7 +161,12 @@ async def _run_telephony_bot(websocket: WebSocket):


 def _create_server_app(
-    transport_type: str, host: str = "localhost", proxy: str = None, esp32_mode: bool = False
+    *,
+    transport_type: str,
+    host: str = "localhost",
+    proxy: str,
+    esp32_mode: bool = False,
+    folder: Optional[str] = None,
 ):
    """Create FastAPI app with transport-specific routes."""
    app = FastAPI()
@@ -167,19 +181,21 @@ def _create_server_app(

    # Set up transport-specific routes
    if transport_type == "webrtc":
-        _setup_webrtc_routes(app, esp32_mode=esp32_mode, host=host)
+        _setup_webrtc_routes(app, esp32_mode=esp32_mode, host=host, folder=folder)
        _setup_whatsapp_routes(app)
    elif transport_type == "daily":
        _setup_daily_routes(app)
-    elif transport_type in ["twilio", "telnyx", "plivo", "exotel"]:
-        _setup_telephony_routes(app, transport_type, proxy)
+    elif transport_type in TELEPHONY_TRANSPORTS:
+        _setup_telephony_routes(app, transport_type=transport_type, proxy=proxy)
    else:
        logger.warning(f"Unknown transport type: {transport_type}")

    return app


-def _setup_webrtc_routes(app: FastAPI, esp32_mode: bool = False, host: str = "localhost"):
+def _setup_webrtc_routes(
+    app: FastAPI, *, esp32_mode: bool = False, host: str = "localhost", folder: Optional[str] = None
+):
    """Set up WebRTC-specific routes."""
    try:
        from pipecat_ai_small_webrtc_prebuilt.frontend import SmallWebRTCPrebuiltUI
@@ -201,6 +217,21 @@ def _setup_webrtc_routes(app: FastAPI, esp32_mode: bool = False, host: str = "lo
        """Redirect root requests to client interface."""
        return RedirectResponse(url="/client/")

+    @app.get("/files/{filename:path}")
+    async def download_file(filename: str):
+        """Handle file downloads."""
+        if not folder:
+            logger.warning(f"Attempting to dowload {filename}, but downloads folder not setup.")
+            return
+
+        file_path = Path(folder) / filename
+        if not os.path.exists(file_path):
+            raise HTTPException(404)
+
+        media_type, _ = mimetypes.guess_type(file_path)
+
+        return FileResponse(path=file_path, media_type=media_type, filename=filename)
+
    # Initialize the SmallWebRTC request handler
    small_webrtc_handler: SmallWebRTCRequestHandler = SmallWebRTCRequestHandler(
        esp32_mode=esp32_mode, host=host
@@ -489,7 +520,7 @@ def _setup_daily_routes(app: FastAPI):
        return await _handle_rtvi_request(request)


-def _setup_telephony_routes(app: FastAPI, transport_type: str, proxy: str):
+def _setup_telephony_routes(app: FastAPI, *, transport_type: str, proxy: str):
    """Set up telephony-specific routes."""
    # XML response templates (Exotel doesn't use XML webhooks)
    XML_TEMPLATES = {
@@ -592,6 +623,21 @@ def _validate_and_clean_proxy(proxy: str) -> str:
    return proxy


+def runner_downloads_folder() -> Optional[str]:
+    """Returns the folder where files are stored for later download."""
+    return RUNNER_DOWNLOADS_FOLDER
+
+
+def runner_host() -> str:
+    """Returns the host name of this runner."""
+    return RUNNER_HOST
+
+
+def runner_port() -> int:
+    """Returns the port of this runner."""
+    return RUNNER_PORT
+
+
 def main():
    """Start the Pipecat development runner.

@@ -612,14 +658,16 @@ def main():

    The bot file must contain a `bot(runner_args)` function as the entry point.
    """
+    global RUNNER_DOWNLOADS_FOLDER, RUNNER_HOST, RUNNER_PORT
+
    parser = argparse.ArgumentParser(description="Pipecat Development Runner")
-    parser.add_argument("--host", type=str, default="localhost", help="Host address")
-    parser.add_argument("--port", type=int, default=7860, help="Port number")
+    parser.add_argument("--host", type=str, default=RUNNER_HOST, help="Host address")
+    parser.add_argument("--port", type=int, default=RUNNER_PORT, help="Port number")
    parser.add_argument(
        "-t",
        "--transport",
        type=str,
-        choices=["daily", "webrtc", "twilio", "telnyx", "plivo", "exotel"],
+        choices=["daily", "webrtc", *TELEPHONY_TRANSPORTS],
        default="webrtc",
        help="Transport type",
    )
@@ -637,6 +685,7 @@ def main():
        default=False,
        help="Connect directly to Daily room (automatically sets transport to daily)",
    )
+    parser.add_argument("-f", "--folder", type=str, help="Path to downloads folder")
    parser.add_argument(
        "--verbose", "-v", action="count", default=0, help="Increase logging verbosity"
    )
@@ -659,6 +708,10 @@ def main():
        logger.error("For ESP32, you need to specify `--host IP` so we can do SDP munging.")
        return

+    if args.transport in TELEPHONY_TRANSPORTS and not args.proxy:
+        logger.error(f"For telephony transports, you need to specify `--proxy PROXY`.")
+        return
+
    # Log level
    logger.remove()
    logger.add(sys.stderr, level="TRACE" if args.verbose else "DEBUG")
@@ -689,8 +742,18 @@ def main():
        print(f"   → Open http://{args.host}:{args.port} in your browser to start a session")
        print()

+    RUNNER_DOWNLOADS_FOLDER = args.folder
+    RUNNER_HOST = args.host
+    RUNNER_PORT = args.port
+
    # Create the app with transport-specific setup
-    app = _create_server_app(args.transport, args.host, args.proxy, args.esp32)
+    app = _create_server_app(
+        transport_type=args.transport,
+        host=args.host,
+        proxy=args.proxy,
+        esp32_mode=args.esp32,
+        folder=args.folder,
+    )

    # Run the server
    uvicorn.run(app, host=args.host, port=args.port)
--- a/src/pipecat/serializers/livekit.py
+++ b/src/pipecat/serializers/livekit.py
@@ -25,11 +25,31 @@ except ModuleNotFoundError as e:
 class LivekitFrameSerializer(FrameSerializer):
    """Serializer for converting between Pipecat frames and LiveKit audio frames.

+    .. deprecated:: 0.0.90
+
+        This class is deprecated and will be removed in a future version.
+        Please use LiveKitTransport instead, which handles audio streaming
+        and frame conversion natively.
+
    This serializer handles the conversion of Pipecat's OutputAudioRawFrame objects
    to LiveKit AudioFrame objects for transmission, and the reverse conversion
    for received audio data.
    """

+    def __init__(self):
+        """Initialize the LiveKit frame serializer."""
+        super().__init__()
+        import warnings
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+            warnings.warn(
+                "LivekitFrameSerializer is deprecated and will be removed in a future version. "
+                "Please use LiveKitTransport instead, which handles audio streaming natively.",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+
    @property
    def type(self) -> FrameSerializerType:
        """Get the serializer type.
--- a/src/pipecat/services/ai_service.py
+++ b/src/pipecat/services/ai_service.py
@@ -97,9 +97,7 @@ class AIService(FrameProcessor):
        pass

    async def _update_settings(self, settings: Mapping[str, Any]):
-        from pipecat.services.openai_realtime_beta.events import (
-            SessionProperties,
-        )
+        from pipecat.services.openai.realtime.events import SessionProperties

        for key, value in settings.items():
            logger.debug("Update request for:", key, value)
@@ -111,9 +109,7 @@ class AIService(FrameProcessor):
                logger.debug("Attempting to update", key, value)

                try:
-                    from pipecat.services.openai_realtime_beta.events import (
-                        TurnDetection,
-                    )
+                    from pipecat.services.openai.realtime.events import TurnDetection

                    if isinstance(self._session_properties, SessionProperties):
                        current_properties = self._session_properties
--- a/src/pipecat/services/aws/init.py
+++ b/src/pipecat/services/aws/init.py
@@ -9,6 +9,7 @@ import sys
 from pipecat.services import DeprecatedModuleProxy

 from .llm import *
+from .nova_sonic import *
 from .stt import *
 from .tts import *

--- a/src/pipecat/services/aws/nova_sonic/init.py
+++ b/src/pipecat/services/aws/nova_sonic/init.py
--- a/src/pipecat/services/aws/nova_sonic/context.py
+++ b/src/pipecat/services/aws/nova_sonic/context.py
@@ -0,0 +1,367 @@
+#
+# Copyright (c) 2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Context management for AWS Nova Sonic LLM service.
+
+This module provides specialized context aggregators and message handling for AWS Nova Sonic,
+including conversation history management and role-specific message processing.
+"""
+
+import copy
+from dataclasses import dataclass, field
+from enum import Enum
+
+from loguru import logger
+
+from pipecat.frames.frames import (
+    BotStoppedSpeakingFrame,
+    DataFrame,
+    Frame,
+    FunctionCallResultFrame,
+    InterruptionFrame,
+    LLMFullResponseEndFrame,
+    LLMFullResponseStartFrame,
+    LLMMessagesAppendFrame,
+    LLMMessagesUpdateFrame,
+    LLMSetToolChoiceFrame,
+    LLMSetToolsFrame,
+    TextFrame,
+    UserImageRawFrame,
+)
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.services.aws.nova_sonic.frames import AWSNovaSonicFunctionCallResultFrame
+from pipecat.services.openai.llm import (
+    OpenAIAssistantContextAggregator,
+    OpenAIUserContextAggregator,
+)
+
+
+class Role(Enum):
+    """Roles supported in AWS Nova Sonic conversations.
+
+    Parameters:
+        SYSTEM: System-level messages (not used in conversation history).
+        USER: Messages sent by the user.
+        ASSISTANT: Messages sent by the assistant.
+        TOOL: Messages sent by tools (not used in conversation history).
+    """
+
+    SYSTEM = "SYSTEM"
+    USER = "USER"
+    ASSISTANT = "ASSISTANT"
+    TOOL = "TOOL"
+
+
+@dataclass
+class AWSNovaSonicConversationHistoryMessage:
+    """A single message in AWS Nova Sonic conversation history.
+
+    Parameters:
+        role: The role of the message sender (USER or ASSISTANT only).
+        text: The text content of the message.
+    """
+
+    role: Role  # only USER and ASSISTANT
+    text: str
+
+
+@dataclass
+class AWSNovaSonicConversationHistory:
+    """Complete conversation history for AWS Nova Sonic initialization.
+
+    Parameters:
+        system_instruction: System-level instruction for the conversation.
+        messages: List of conversation messages between user and assistant.
+    """
+
+    system_instruction: str = None
+    messages: list[AWSNovaSonicConversationHistoryMessage] = field(default_factory=list)
+
+
+class AWSNovaSonicLLMContext(OpenAILLMContext):
+    """Specialized LLM context for AWS Nova Sonic service.
+
+    Extends OpenAI context with Nova Sonic-specific message handling,
+    conversation history management, and text buffering capabilities.
+    """
+
+    def __init__(self, messages=None, tools=None, **kwargs):
+        """Initialize AWS Nova Sonic LLM context.
+
+        Args:
+            messages: Initial messages for the context.
+            tools: Available tools for the context.
+            **kwargs: Additional arguments passed to parent class.
+        """
+        super().__init__(messages=messages, tools=tools, **kwargs)
+        self.__setup_local()
+
+    def __setup_local(self, system_instruction: str = ""):
+        self._assistant_text = ""
+        self._user_text = ""
+        self._system_instruction = system_instruction
+
+    @staticmethod
+    def upgrade_to_nova_sonic(
+        obj: OpenAILLMContext, system_instruction: str
+    ) -> "AWSNovaSonicLLMContext":
+        """Upgrade an OpenAI context to AWS Nova Sonic context.
+
+        Args:
+            obj: The OpenAI context to upgrade.
+            system_instruction: System instruction for the context.
+
+        Returns:
+            The upgraded AWS Nova Sonic context.
+        """
+        if isinstance(obj, OpenAILLMContext) and not isinstance(obj, AWSNovaSonicLLMContext):
+            obj.__class__ = AWSNovaSonicLLMContext
+            obj.__setup_local(system_instruction)
+        return obj
+
+    # NOTE: this method has the side-effect of updating _system_instruction from messages
+    def get_messages_for_initializing_history(self) -> AWSNovaSonicConversationHistory:
+        """Get conversation history for initializing AWS Nova Sonic session.
+
+        Processes stored messages and extracts system instruction and conversation
+        history in the format expected by AWS Nova Sonic.
+
+        Returns:
+            Formatted conversation history with system instruction and messages.
+        """
+        history = AWSNovaSonicConversationHistory(system_instruction=self._system_instruction)
+
+        # Bail if there are no messages
+        if not self.messages:
+            return history
+
+        messages = copy.deepcopy(self.messages)
+
+        # If we have a "system" message as our first message, let's pull that out into "instruction"
+        if messages[0].get("role") == "system":
+            system = messages.pop(0)
+            content = system.get("content")
+            if isinstance(content, str):
+                history.system_instruction = content
+            elif isinstance(content, list):
+                history.system_instruction = content[0].get("text")
+            if history.system_instruction:
+                self._system_instruction = history.system_instruction
+
+        # Process remaining messages to fill out conversation history.
+        # Nova Sonic supports "user" and "assistant" messages in history.
+        for message in messages:
+            history_message = self.from_standard_message(message)
+            if history_message:
+                history.messages.append(history_message)
+
+        return history
+
+    def get_messages_for_persistent_storage(self):
+        """Get messages formatted for persistent storage.
+
+        Returns:
+            List of messages including system instruction if present.
+        """
+        messages = super().get_messages_for_persistent_storage()
+        # If we have a system instruction and messages doesn't already contain it, add it
+        if self._system_instruction and not (messages and messages[0].get("role") == "system"):
+            messages.insert(0, {"role": "system", "content": self._system_instruction})
+        return messages
+
+    def from_standard_message(self, message) -> AWSNovaSonicConversationHistoryMessage:
+        """Convert standard message format to Nova Sonic format.
+
+        Args:
+            message: Standard message dictionary to convert.
+
+        Returns:
+            Nova Sonic conversation history message, or None if not convertible.
+        """
+        role = message.get("role")
+        if message.get("role") == "user" or message.get("role") == "assistant":
+            content = message.get("content")
+            if isinstance(message.get("content"), list):
+                content = ""
+                for c in message.get("content"):
+                    if c.get("type") == "text":
+                        content += " " + c.get("text")
+                    else:
+                        logger.error(
+                            f"Unhandled content type in context message: {c.get('type')} - {message}"
+                        )
+            # There won't be content if this is an assistant tool call entry.
+            # We're ignoring those since they can't be loaded into AWS Nova Sonic conversation
+            # history
+            if content:
+                return AWSNovaSonicConversationHistoryMessage(role=Role[role.upper()], text=content)
+        # NOTE: we're ignoring messages with role "tool" since they can't be loaded into AWS Nova
+        # Sonic conversation history
+
+    def buffer_user_text(self, text):
+        """Buffer user text for later flushing to context.
+
+        Args:
+            text: User text to buffer.
+        """
+        self._user_text += f" {text}" if self._user_text else text
+        # logger.debug(f"User text buffered: {self._user_text}")
+
+    def flush_aggregated_user_text(self) -> str:
+        """Flush buffered user text to context as a complete message.
+
+        Returns:
+            The flushed user text, or empty string if no text was buffered.
+        """
+        if not self._user_text:
+            return ""
+        user_text = self._user_text
+        message = {
+            "role": "user",
+            "content": [{"type": "text", "text": user_text}],
+        }
+        self._user_text = ""
+        self.add_message(message)
+        # logger.debug(f"Context updated (user): {self.get_messages_for_logging()}")
+        return user_text
+
+    def buffer_assistant_text(self, text):
+        """Buffer assistant text for later flushing to context.
+
+        Args:
+            text: Assistant text to buffer.
+        """
+        self._assistant_text += text
+        # logger.debug(f"Assistant text buffered: {self._assistant_text}")
+
+    def flush_aggregated_assistant_text(self):
+        """Flush buffered assistant text to context as a complete message."""
+        if not self._assistant_text:
+            return
+        message = {
+            "role": "assistant",
+            "content": [{"type": "text", "text": self._assistant_text}],
+        }
+        self._assistant_text = ""
+        self.add_message(message)
+        # logger.debug(f"Context updated (assistant): {self.get_messages_for_logging()}")
+
+
+@dataclass
+class AWSNovaSonicMessagesUpdateFrame(DataFrame):
+    """Frame containing updated AWS Nova Sonic context.
+
+    Parameters:
+        context: The updated AWS Nova Sonic LLM context.
+    """
+
+    context: AWSNovaSonicLLMContext
+
+
+class AWSNovaSonicUserContextAggregator(OpenAIUserContextAggregator):
+    """Context aggregator for user messages in AWS Nova Sonic conversations.
+
+    Extends the OpenAI user context aggregator to emit Nova Sonic-specific
+    context update frames.
+    """
+
+    async def process_frame(
+        self, frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM
+    ):
+        """Process frames and emit Nova Sonic-specific context updates.
+
+        Args:
+            frame: The frame to process.
+            direction: The direction the frame is traveling.
+        """
+        await super().process_frame(frame, direction)
+
+        # Parent does not push LLMMessagesUpdateFrame
+        if isinstance(frame, LLMMessagesUpdateFrame):
+            await self.push_frame(AWSNovaSonicMessagesUpdateFrame(context=self._context))
+
+
+class AWSNovaSonicAssistantContextAggregator(OpenAIAssistantContextAggregator):
+    """Context aggregator for assistant messages in AWS Nova Sonic conversations.
+
+    Provides specialized handling for assistant responses and function calls
+    in AWS Nova Sonic context, with custom frame processing logic.
+    """
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        """Process frames with Nova Sonic-specific logic.
+
+        Args:
+            frame: The frame to process.
+            direction: The direction the frame is traveling.
+        """
+        # HACK: For now, disable the context aggregator by making it just pass through all frames
+        # that the parent handles (except the function call stuff, which we still need).
+        # For an explanation of this hack, see
+        # AWSNovaSonicLLMService._report_assistant_response_text_added.
+        if isinstance(
+            frame,
+            (
+                InterruptionFrame,
+                LLMFullResponseStartFrame,
+                LLMFullResponseEndFrame,
+                TextFrame,
+                LLMMessagesAppendFrame,
+                LLMMessagesUpdateFrame,
+                LLMSetToolsFrame,
+                LLMSetToolChoiceFrame,
+                UserImageRawFrame,
+                BotStoppedSpeakingFrame,
+            ),
+        ):
+            await self.push_frame(frame, direction)
+        else:
+            await super().process_frame(frame, direction)
+
+    async def handle_function_call_result(self, frame: FunctionCallResultFrame):
+        """Handle function call results for AWS Nova Sonic.
+
+        Args:
+            frame: The function call result frame to handle.
+        """
+        await super().handle_function_call_result(frame)
+
+        # The standard function callback code path pushes the FunctionCallResultFrame from the LLM
+        # itself, so we didn't have a chance to add the result to the AWS Nova Sonic server-side
+        # context. Let's push a special frame to do that.
+        await self.push_frame(
+            AWSNovaSonicFunctionCallResultFrame(result_frame=frame), FrameDirection.UPSTREAM
+        )
+
+
+@dataclass
+class AWSNovaSonicContextAggregatorPair:
+    """Pair of user and assistant context aggregators for AWS Nova Sonic.
+
+    Parameters:
+        _user: The user context aggregator.
+        _assistant: The assistant context aggregator.
+    """
+
+    _user: AWSNovaSonicUserContextAggregator
+    _assistant: AWSNovaSonicAssistantContextAggregator
+
+    def user(self) -> AWSNovaSonicUserContextAggregator:
+        """Get the user context aggregator.
+
+        Returns:
+            The user context aggregator instance.
+        """
+        return self._user
+
+    def assistant(self) -> AWSNovaSonicAssistantContextAggregator:
+        """Get the assistant context aggregator.
+
+        Returns:
+            The assistant context aggregator instance.
+        """
+        return self._assistant
--- a/src/pipecat/services/aws/nova_sonic/frames.py
+++ b/src/pipecat/services/aws/nova_sonic/frames.py
@@ -0,0 +1,25 @@
+#
+# Copyright (c) 2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Custom frames for AWS Nova Sonic LLM service."""
+
+from dataclasses import dataclass
+
+from pipecat.frames.frames import DataFrame, FunctionCallResultFrame
+
+
+@dataclass
+class AWSNovaSonicFunctionCallResultFrame(DataFrame):
+    """Frame containing function call result for AWS Nova Sonic processing.
+
+    This frame wraps a standard function call result frame to enable
+    AWS Nova Sonic-specific handling and context updates.
+
+    Parameters:
+        result_frame: The underlying function call result frame.
+    """
+
+    result_frame: FunctionCallResultFrame
--- a/src/pipecat/services/aws/nova_sonic/llm.py
+++ b/src/pipecat/services/aws/nova_sonic/llm.py
--- a/src/pipecat/services/aws/nova_sonic/ready.wav
+++ b/src/pipecat/services/aws/nova_sonic/ready.wav
--- a/src/pipecat/services/aws_nova_sonic/init.py
+++ b/src/pipecat/services/aws_nova_sonic/init.py
@@ -1 +1,19 @@
-from .aws import AWSNovaSonicLLMService, Params
+#
+# Copyright (c) 2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import warnings
+
+from pipecat.services.aws.nova_sonic.llm import AWSNovaSonicLLMService, Params
+
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Types in pipecat.services.aws_nova_sonic are deprecated. "
+        "Please use the equivalent types from "
+        "pipecat.services.aws.nova_sonic.llm instead.",
+        DeprecationWarning,
+        stacklevel=2,
+    )
--- a/src/pipecat/services/aws_nova_sonic/aws.py
+++ b/src/pipecat/services/aws_nova_sonic/aws.py
--- a/src/pipecat/services/aws_nova_sonic/context.py
+++ b/src/pipecat/services/aws_nova_sonic/context.py
@@ -10,358 +10,16 @@ This module provides specialized context aggregators and message handling for AW
 including conversation history management and role-specific message processing.
 """

-import copy
-from dataclasses import dataclass, field
-from enum import Enum
-
-from loguru import logger
-
-from pipecat.frames.frames import (
-    BotStoppedSpeakingFrame,
-    DataFrame,
-    Frame,
-    FunctionCallResultFrame,
-    InterruptionFrame,
-    LLMFullResponseEndFrame,
-    LLMFullResponseStartFrame,
-    LLMMessagesAppendFrame,
-    LLMMessagesUpdateFrame,
-    LLMSetToolChoiceFrame,
-    LLMSetToolsFrame,
-    TextFrame,
-    UserImageRawFrame,
-)
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
-from pipecat.processors.frame_processor import FrameDirection
-from pipecat.services.aws_nova_sonic.frames import AWSNovaSonicFunctionCallResultFrame
-from pipecat.services.openai.llm import (
-    OpenAIAssistantContextAggregator,
-    OpenAIUserContextAggregator,
-)
-
-
-class Role(Enum):
-    """Roles supported in AWS Nova Sonic conversations.
-
-    Parameters:
-        SYSTEM: System-level messages (not used in conversation history).
-        USER: Messages sent by the user.
-        ASSISTANT: Messages sent by the assistant.
-        TOOL: Messages sent by tools (not used in conversation history).
-    """
-
-    SYSTEM = "SYSTEM"
-    USER = "USER"
-    ASSISTANT = "ASSISTANT"
-    TOOL = "TOOL"
-
-
-@dataclass
-class AWSNovaSonicConversationHistoryMessage:
-    """A single message in AWS Nova Sonic conversation history.
-
-    Parameters:
-        role: The role of the message sender (USER or ASSISTANT only).
-        text: The text content of the message.
-    """
-
-    role: Role  # only USER and ASSISTANT
-    text: str
-
-
-@dataclass
-class AWSNovaSonicConversationHistory:
-    """Complete conversation history for AWS Nova Sonic initialization.
-
-    Parameters:
-        system_instruction: System-level instruction for the conversation.
-        messages: List of conversation messages between user and assistant.
-    """
-
-    system_instruction: str = None
-    messages: list[AWSNovaSonicConversationHistoryMessage] = field(default_factory=list)
-
-
-class AWSNovaSonicLLMContext(OpenAILLMContext):
-    """Specialized LLM context for AWS Nova Sonic service.
-
-    Extends OpenAI context with Nova Sonic-specific message handling,
-    conversation history management, and text buffering capabilities.
-    """
-
-    def __init__(self, messages=None, tools=None, **kwargs):
-        """Initialize AWS Nova Sonic LLM context.
-
-        Args:
-            messages: Initial messages for the context.
-            tools: Available tools for the context.
-            **kwargs: Additional arguments passed to parent class.
-        """
-        super().__init__(messages=messages, tools=tools, **kwargs)
-        self.__setup_local()
-
-    def __setup_local(self, system_instruction: str = ""):
-        self._assistant_text = ""
-        self._user_text = ""
-        self._system_instruction = system_instruction
-
-    @staticmethod
-    def upgrade_to_nova_sonic(
-        obj: OpenAILLMContext, system_instruction: str
-    ) -> "AWSNovaSonicLLMContext":
-        """Upgrade an OpenAI context to AWS Nova Sonic context.
-
-        Args:
-            obj: The OpenAI context to upgrade.
-            system_instruction: System instruction for the context.
-
-        Returns:
-            The upgraded AWS Nova Sonic context.
-        """
-        if isinstance(obj, OpenAILLMContext) and not isinstance(obj, AWSNovaSonicLLMContext):
-            obj.__class__ = AWSNovaSonicLLMContext
-            obj.__setup_local(system_instruction)
-        return obj
-
-    # NOTE: this method has the side-effect of updating _system_instruction from messages
-    def get_messages_for_initializing_history(self) -> AWSNovaSonicConversationHistory:
-        """Get conversation history for initializing AWS Nova Sonic session.
-
-        Processes stored messages and extracts system instruction and conversation
-        history in the format expected by AWS Nova Sonic.
-
-        Returns:
-            Formatted conversation history with system instruction and messages.
-        """
-        history = AWSNovaSonicConversationHistory(system_instruction=self._system_instruction)
-
-        # Bail if there are no messages
-        if not self.messages:
-            return history
-
-        messages = copy.deepcopy(self.messages)
-
-        # If we have a "system" message as our first message, let's pull that out into "instruction"
-        if messages[0].get("role") == "system":
-            system = messages.pop(0)
-            content = system.get("content")
-            if isinstance(content, str):
-                history.system_instruction = content
-            elif isinstance(content, list):
-                history.system_instruction = content[0].get("text")
-            if history.system_instruction:
-                self._system_instruction = history.system_instruction
-
-        # Process remaining messages to fill out conversation history.
-        # Nova Sonic supports "user" and "assistant" messages in history.
-        for message in messages:
-            history_message = self.from_standard_message(message)
-            if history_message:
-                history.messages.append(history_message)
-
-        return history
-
-    def get_messages_for_persistent_storage(self):
-        """Get messages formatted for persistent storage.
-
-        Returns:
-            List of messages including system instruction if present.
-        """
-        messages = super().get_messages_for_persistent_storage()
-        # If we have a system instruction and messages doesn't already contain it, add it
-        if self._system_instruction and not (messages and messages[0].get("role") == "system"):
-            messages.insert(0, {"role": "system", "content": self._system_instruction})
-        return messages
-
-    def from_standard_message(self, message) -> AWSNovaSonicConversationHistoryMessage:
-        """Convert standard message format to Nova Sonic format.
-
-        Args:
-            message: Standard message dictionary to convert.
-
-        Returns:
-            Nova Sonic conversation history message, or None if not convertible.
-        """
-        role = message.get("role")
-        if message.get("role") == "user" or message.get("role") == "assistant":
-            content = message.get("content")
-            if isinstance(message.get("content"), list):
-                content = ""
-                for c in message.get("content"):
-                    if c.get("type") == "text":
-                        content += " " + c.get("text")
-                    else:
-                        logger.error(
-                            f"Unhandled content type in context message: {c.get('type')} - {message}"
-                        )
-            # There won't be content if this is an assistant tool call entry.
-            # We're ignoring those since they can't be loaded into AWS Nova Sonic conversation
-            # history
-            if content:
-                return AWSNovaSonicConversationHistoryMessage(role=Role[role.upper()], text=content)
-        # NOTE: we're ignoring messages with role "tool" since they can't be loaded into AWS Nova
-        # Sonic conversation history
-
-    def buffer_user_text(self, text):
-        """Buffer user text for later flushing to context.
-
-        Args:
-            text: User text to buffer.
-        """
-        self._user_text += f" {text}" if self._user_text else text
-        # logger.debug(f"User text buffered: {self._user_text}")
-
-    def flush_aggregated_user_text(self) -> str:
-        """Flush buffered user text to context as a complete message.
-
-        Returns:
-            The flushed user text, or empty string if no text was buffered.
-        """
-        if not self._user_text:
-            return ""
-        user_text = self._user_text
-        message = {
-            "role": "user",
-            "content": [{"type": "text", "text": user_text}],
-        }
-        self._user_text = ""
-        self.add_message(message)
-        # logger.debug(f"Context updated (user): {self.get_messages_for_logging()}")
-        return user_text
-
-    def buffer_assistant_text(self, text):
-        """Buffer assistant text for later flushing to context.
-
-        Args:
-            text: Assistant text to buffer.
-        """
-        self._assistant_text += text
-        # logger.debug(f"Assistant text buffered: {self._assistant_text}")
-
-    def flush_aggregated_assistant_text(self):
-        """Flush buffered assistant text to context as a complete message."""
-        if not self._assistant_text:
-            return
-        message = {
-            "role": "assistant",
-            "content": [{"type": "text", "text": self._assistant_text}],
-        }
-        self._assistant_text = ""
-        self.add_message(message)
-        # logger.debug(f"Context updated (assistant): {self.get_messages_for_logging()}")
-
-
-@dataclass
-class AWSNovaSonicMessagesUpdateFrame(DataFrame):
-    """Frame containing updated AWS Nova Sonic context.
-
-    Parameters:
-        context: The updated AWS Nova Sonic LLM context.
-    """
-
-    context: AWSNovaSonicLLMContext
-
-
-class AWSNovaSonicUserContextAggregator(OpenAIUserContextAggregator):
-    """Context aggregator for user messages in AWS Nova Sonic conversations.
-
-    Extends the OpenAI user context aggregator to emit Nova Sonic-specific
-    context update frames.
-    """
-
-    async def process_frame(
-        self, frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM
-    ):
-        """Process frames and emit Nova Sonic-specific context updates.
-
-        Args:
-            frame: The frame to process.
-            direction: The direction the frame is traveling.
-        """
-        await super().process_frame(frame, direction)
-
-        # Parent does not push LLMMessagesUpdateFrame
-        if isinstance(frame, LLMMessagesUpdateFrame):
-            await self.push_frame(AWSNovaSonicMessagesUpdateFrame(context=self._context))
-
-
-class AWSNovaSonicAssistantContextAggregator(OpenAIAssistantContextAggregator):
-    """Context aggregator for assistant messages in AWS Nova Sonic conversations.
-
-    Provides specialized handling for assistant responses and function calls
-    in AWS Nova Sonic context, with custom frame processing logic.
-    """
-
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        """Process frames with Nova Sonic-specific logic.
-
-        Args:
-            frame: The frame to process.
-            direction: The direction the frame is traveling.
-        """
-        # HACK: For now, disable the context aggregator by making it just pass through all frames
-        # that the parent handles (except the function call stuff, which we still need).
-        # For an explanation of this hack, see
-        # AWSNovaSonicLLMService._report_assistant_response_text_added.
-        if isinstance(
-            frame,
-            (
-                InterruptionFrame,
-                LLMFullResponseStartFrame,
-                LLMFullResponseEndFrame,
-                TextFrame,
-                LLMMessagesAppendFrame,
-                LLMMessagesUpdateFrame,
-                LLMSetToolsFrame,
-                LLMSetToolChoiceFrame,
-                UserImageRawFrame,
-                BotStoppedSpeakingFrame,
-            ),
-        ):
-            await self.push_frame(frame, direction)
-        else:
-            await super().process_frame(frame, direction)
-
-    async def handle_function_call_result(self, frame: FunctionCallResultFrame):
-        """Handle function call results for AWS Nova Sonic.
-
-        Args:
-            frame: The function call result frame to handle.
-        """
-        await super().handle_function_call_result(frame)
-
-        # The standard function callback code path pushes the FunctionCallResultFrame from the LLM
-        # itself, so we didn't have a chance to add the result to the AWS Nova Sonic server-side
-        # context. Let's push a special frame to do that.
-        await self.push_frame(
-            AWSNovaSonicFunctionCallResultFrame(result_frame=frame), FrameDirection.UPSTREAM
-        )
-
-
-@dataclass
-class AWSNovaSonicContextAggregatorPair:
-    """Pair of user and assistant context aggregators for AWS Nova Sonic.
-
-    Parameters:
-        _user: The user context aggregator.
-        _assistant: The assistant context aggregator.
-    """
-
-    _user: AWSNovaSonicUserContextAggregator
-    _assistant: AWSNovaSonicAssistantContextAggregator
-
-    def user(self) -> AWSNovaSonicUserContextAggregator:
-        """Get the user context aggregator.
-
-        Returns:
-            The user context aggregator instance.
-        """
-        return self._user
-
-    def assistant(self) -> AWSNovaSonicAssistantContextAggregator:
-        """Get the assistant context aggregator.
-
-        Returns:
-            The assistant context aggregator instance.
-        """
-        return self._assistant
+import warnings
+
+from pipecat.services.aws.nova_sonic.context import *
+
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Types in pipecat.services.aws_nova_sonic.context are deprecated. "
+        "Please use the equivalent types from "
+        "pipecat.services.aws.nova_sonic.context instead.",
+        DeprecationWarning,
+        stacklevel=2,
+    )
--- a/src/pipecat/services/aws_nova_sonic/frames.py
+++ b/src/pipecat/services/aws_nova_sonic/frames.py
@@ -6,20 +6,16 @@

 """Custom frames for AWS Nova Sonic LLM service."""

-from dataclasses import dataclass
+import warnings

-from pipecat.frames.frames import DataFrame, FunctionCallResultFrame
+from pipecat.services.aws.nova_sonic.frames import *

-
-@dataclass
-class AWSNovaSonicFunctionCallResultFrame(DataFrame):
-    """Frame containing function call result for AWS Nova Sonic processing.
-
-    This frame wraps a standard function call result frame to enable
-    AWS Nova Sonic-specific handling and context updates.
-
-    Parameters:
-        result_frame: The underlying function call result frame.
-    """
-
-    result_frame: FunctionCallResultFrame
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Types in pipecat.services.aws_nova_sonic.frames are deprecated. "
+        "Please use the equivalent types from "
+        "pipecat.services.aws.nova_sonic.frames instead.",
+        DeprecationWarning,
+        stacklevel=2,
+    )
--- a/src/pipecat/services/azure/realtime/init.py
+++ b/src/pipecat/services/azure/realtime/init.py
--- a/src/pipecat/services/azure/realtime/llm.py
+++ b/src/pipecat/services/azure/realtime/llm.py
@@ -0,0 +1,65 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Azure OpenAI Realtime LLM service implementation."""
+
+from loguru import logger
+
+from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
+
+try:
+    from websockets.asyncio.client import connect as websocket_connect
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error("In order to use Azure Realtime, you need to `pip install pipecat-ai[openai]`.")
+    raise Exception(f"Missing module: {e}")
+
+
+class AzureRealtimeLLMService(OpenAIRealtimeLLMService):
+    """Azure OpenAI Realtime LLM service with Azure-specific authentication.
+
+    Extends the OpenAI Realtime service to work with Azure OpenAI endpoints,
+    using Azure's authentication headers and endpoint format. Provides the same
+    real-time audio and text communication capabilities as the base OpenAI service.
+    """
+
+    def __init__(
+        self,
+        *,
+        api_key: str,
+        base_url: str,
+        **kwargs,
+    ):
+        """Initialize Azure Realtime LLM service.
+
+        Args:
+            api_key: The API key for the Azure OpenAI service.
+            base_url: The full Azure WebSocket endpoint URL including api-version and deployment.
+                Example: "wss://my-project.openai.azure.com/openai/realtime?api-version=2024-10-01-preview&deployment=my-realtime-deployment"
+            **kwargs: Additional arguments passed to parent OpenAIRealtimeLLMService.
+        """
+        super().__init__(base_url=base_url, api_key=api_key, **kwargs)
+        self.api_key = api_key
+        self.base_url = base_url
+
+    async def _connect(self):
+        try:
+            if self._websocket:
+                # Here we assume that if we have a websocket, we are connected. We
+                # handle disconnections in the send/recv code paths.
+                return
+
+            logger.info(f"Connecting to {self.base_url}, api key: {self.api_key}")
+            self._websocket = await websocket_connect(
+                uri=self.base_url,
+                additional_headers={
+                    "api-key": self.api_key,
+                },
+            )
+            self._receive_task = self.create_task(self._receive_task_handler())
+        except Exception as e:
+            logger.error(f"{self} initialization error: {e}")
+            self._websocket = None
--- a/src/pipecat/services/gemini_multimodal_live/events.py
+++ b/src/pipecat/services/gemini_multimodal_live/events.py
@@ -4,527 +4,41 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-"""Event models and utilities for Google Gemini Multimodal Live API."""
-
-import base64
-import io
-import json
-from enum import Enum
-from typing import List, Literal, Optional
-
-from PIL import Image
-from pydantic import BaseModel, Field
-
-from pipecat.frames.frames import ImageRawFrame
-
-#
-# Client events
-#
-
-
-class MediaChunk(BaseModel):
-    """Represents a chunk of media data for transmission.
-
-    Parameters:
-        mimeType: MIME type of the media content.
-        data: Base64-encoded media data.
-    """
-
-    mimeType: str
-    data: str
-
-
-class ContentPart(BaseModel):
-    """Represents a part of content that can contain text or media.
-
-    Parameters:
-        text: Text content. Defaults to None.
-        inlineData: Inline media data. Defaults to None.
-    """
-
-    text: Optional[str] = Field(default=None, validate_default=False)
-    inlineData: Optional[MediaChunk] = Field(default=None, validate_default=False)
-    fileData: Optional["FileData"] = Field(default=None, validate_default=False)
-
-
-class FileData(BaseModel):
-    """Represents a file reference in the Gemini File API."""
-
-    mimeType: str
-    fileUri: str
-
-
-ContentPart.model_rebuild()  # Rebuild model to resolve forward reference
-
-
-class Turn(BaseModel):
-    """Represents a conversational turn in the dialogue.
-
-    Parameters:
-        role: The role of the speaker, either "user" or "model". Defaults to "user".
-        parts: List of content parts that make up the turn.
-    """
-
-    role: Literal["user", "model"] = "user"
-    parts: List[ContentPart]
-
-
-class StartSensitivity(str, Enum):
-    """Determines how start of speech is detected."""
-
-    UNSPECIFIED = "START_SENSITIVITY_UNSPECIFIED"  # Default is HIGH
-    HIGH = "START_SENSITIVITY_HIGH"  # Detect start of speech more often
-    LOW = "START_SENSITIVITY_LOW"  # Detect start of speech less often
-
-
-class EndSensitivity(str, Enum):
-    """Determines how end of speech is detected."""
-
-    UNSPECIFIED = "END_SENSITIVITY_UNSPECIFIED"  # Default is HIGH
-    HIGH = "END_SENSITIVITY_HIGH"  # End speech more often
-    LOW = "END_SENSITIVITY_LOW"  # End speech less often
-
-
-class AutomaticActivityDetection(BaseModel):
-    """Configures automatic detection of voice activity.
-
-    Parameters:
-        disabled: Whether automatic activity detection is disabled. Defaults to None.
-        start_of_speech_sensitivity: Sensitivity for detecting speech start. Defaults to None.
-        prefix_padding_ms: Padding before speech start in milliseconds. Defaults to None.
-        end_of_speech_sensitivity: Sensitivity for detecting speech end. Defaults to None.
-        silence_duration_ms: Duration of silence to detect speech end. Defaults to None.
-    """
-
-    disabled: Optional[bool] = None
-    start_of_speech_sensitivity: Optional[StartSensitivity] = None
-    prefix_padding_ms: Optional[int] = None
-    end_of_speech_sensitivity: Optional[EndSensitivity] = None
-    silence_duration_ms: Optional[int] = None
-
-
-class RealtimeInputConfig(BaseModel):
-    """Configures the realtime input behavior.
-
-    Parameters:
-        automatic_activity_detection: Voice activity detection configuration. Defaults to None.
-    """
-
-    automatic_activity_detection: Optional[AutomaticActivityDetection] = None
-
-
-class RealtimeInput(BaseModel):
-    """Contains realtime input media chunks and text.
-
-    Parameters:
-        mediaChunks: List of media chunks for realtime processing.
-        text: Text for realtime processing.
-    """
-
-    mediaChunks: Optional[List[MediaChunk]] = None
-    text: Optional[str] = None
-
-
-class ClientContent(BaseModel):
-    """Content sent from client to the Gemini Live API.
-
-    Parameters:
-        turns: List of conversation turns. Defaults to None.
-        turnComplete: Whether the client's turn is complete. Defaults to False.
-    """
-
-    turns: Optional[List[Turn]] = None
-    turnComplete: bool = False
-
-
-class AudioInputMessage(BaseModel):
-    """Message containing audio input data.
-
-    Parameters:
-        realtimeInput: Realtime input containing audio chunks.
-    """
-
-    realtimeInput: RealtimeInput
-
-    @classmethod
-    def from_raw_audio(cls, raw_audio: bytes, sample_rate: int) -> "AudioInputMessage":
-        """Create an audio input message from raw audio data.
-
-        Args:
-            raw_audio: Raw audio bytes.
-            sample_rate: Audio sample rate in Hz.
-
-        Returns:
-            AudioInputMessage instance with encoded audio data.
-        """
-        data = base64.b64encode(raw_audio).decode("utf-8")
-        return cls(
-            realtimeInput=RealtimeInput(
-                mediaChunks=[MediaChunk(mimeType=f"audio/pcm;rate={sample_rate}", data=data)]
-            )
-        )
-
-
-class VideoInputMessage(BaseModel):
-    """Message containing video/image input data.
-
-    Parameters:
-        realtimeInput: Realtime input containing video/image chunks.
-    """
-
-    realtimeInput: RealtimeInput
-
-    @classmethod
-    def from_image_frame(cls, frame: ImageRawFrame) -> "VideoInputMessage":
-        """Create a video input message from an image frame.
-
-        Args:
-            frame: Image frame to encode.
-
-        Returns:
-            VideoInputMessage instance with encoded image data.
-        """
-        buffer = io.BytesIO()
-        Image.frombytes(frame.format, frame.size, frame.image).save(buffer, format="JPEG")
-        data = base64.b64encode(buffer.getvalue()).decode("utf-8")
-        return cls(
-            realtimeInput=RealtimeInput(mediaChunks=[MediaChunk(mimeType=f"image/jpeg", data=data)])
-        )
-
-
-class TextInputMessage(BaseModel):
-    """Message containing text input data."""
-
-    realtimeInput: RealtimeInput
-
-    @classmethod
-    def from_text(cls, text: str) -> "TextInputMessage":
-        """Create a text input message from a string.
-
-        Args:
-            text: The text to send.
-
-        Returns:
-            A TextInputMessage instance.
-        """
-        return cls(realtimeInput=RealtimeInput(text=text))
-
-
-class ClientContentMessage(BaseModel):
-    """Message containing client content for the API.
-
-    Parameters:
-        clientContent: The client content to send.
-    """
-
-    clientContent: ClientContent
-
-
-class SystemInstruction(BaseModel):
-    """System instruction for the model.
-
-    Parameters:
-        parts: List of content parts that make up the system instruction.
-    """
-
-    parts: List[ContentPart]
-
-
-class AudioTranscriptionConfig(BaseModel):
-    """Configuration for audio transcription."""
-
-    pass
-
-
-class Setup(BaseModel):
-    """Setup configuration for the Gemini Live session.
-
-    Parameters:
-        model: Model identifier to use.
-        system_instruction: System instruction for the model. Defaults to None.
-        tools: List of available tools/functions. Defaults to None.
-        generation_config: Generation configuration parameters. Defaults to None.
-        input_audio_transcription: Input audio transcription config. Defaults to None.
-        output_audio_transcription: Output audio transcription config. Defaults to None.
-        realtime_input_config: Realtime input configuration. Defaults to None.
-    """
-
-    model: str
-    system_instruction: Optional[SystemInstruction] = None
-    tools: Optional[List[dict]] = None
-    generation_config: Optional[dict] = None
-    input_audio_transcription: Optional[AudioTranscriptionConfig] = None
-    output_audio_transcription: Optional[AudioTranscriptionConfig] = None
-    realtime_input_config: Optional[RealtimeInputConfig] = None
-
-
-class Config(BaseModel):
-    """Configuration message for session setup.
-
-    Parameters:
-        setup: Setup configuration for the session.
-    """
-
-    setup: Setup
-
-
-#
-# Grounding metadata models
-#
-
-
-class SearchEntryPoint(BaseModel):
-    """Represents the search entry point with rendered content for search suggestions."""
-
-    renderedContent: Optional[str] = None
-
-
-class WebSource(BaseModel):
-    """Represents a web source from grounding chunks."""
-
-    uri: Optional[str] = None
-    title: Optional[str] = None
-
-
-class GroundingChunk(BaseModel):
-    """Represents a grounding chunk containing web source information."""
-
-    web: Optional[WebSource] = None
-
-
-class GroundingSegment(BaseModel):
-    """Represents a segment of text that is grounded."""
-
-    startIndex: Optional[int] = None
-    endIndex: Optional[int] = None
-    text: Optional[str] = None
-
-
-class GroundingSupport(BaseModel):
-    """Represents support information for grounded text segments."""
-
-    segment: Optional[GroundingSegment] = None
-    groundingChunkIndices: Optional[List[int]] = None
-    confidenceScores: Optional[List[float]] = None
-
-
-class GroundingMetadata(BaseModel):
-    """Represents grounding metadata from Google Search."""
-
-    searchEntryPoint: Optional[SearchEntryPoint] = None
-    groundingChunks: Optional[List[GroundingChunk]] = None
-    groundingSupports: Optional[List[GroundingSupport]] = None
-    webSearchQueries: Optional[List[str]] = None
-
-
-#
-# Server events
-#
-
-
-class SetupComplete(BaseModel):
-    """Indicates that session setup is complete."""
-
-    pass
-
-
-class InlineData(BaseModel):
-    """Inline data embedded in server responses.
-
-    Parameters:
-        mimeType: MIME type of the data.
-        data: Base64-encoded data content.
-    """
-
-    mimeType: str
-    data: str
-
-
-class Part(BaseModel):
-    """Part of a server response containing data or text.
-
-    Parameters:
-        inlineData: Inline binary data. Defaults to None.
-        text: Text content. Defaults to None.
-    """
-
-    inlineData: Optional[InlineData] = None
-    text: Optional[str] = None
-
-
-class ModelTurn(BaseModel):
-    """Represents a turn from the model in the conversation.
-
-    Parameters:
-        parts: List of content parts in the model's response.
-    """
-
-    parts: List[Part]
-
-
-class ServerContentInterrupted(BaseModel):
-    """Indicates server content was interrupted.
-
-    Parameters:
-        interrupted: Whether the content was interrupted.
-    """
-
-    interrupted: bool
-
-
-class ServerContentTurnComplete(BaseModel):
-    """Indicates the server's turn is complete.
-
-    Parameters:
-        turnComplete: Whether the turn is complete.
-    """
-
-    turnComplete: bool
-
-
-class BidiGenerateContentTranscription(BaseModel):
-    """Transcription data from bidirectional content generation.
-
-    Parameters:
-        text: The transcribed text content.
-    """
-
-    text: str
-
-
-class ServerContent(BaseModel):
-    """Content sent from server to client.
-
-    Parameters:
-        modelTurn: Model's conversational turn. Defaults to None.
-        interrupted: Whether content was interrupted. Defaults to None.
-        turnComplete: Whether the turn is complete. Defaults to None.
-        inputTranscription: Transcription of input audio. Defaults to None.
-        outputTranscription: Transcription of output audio. Defaults to None.
-    """
-
-    modelTurn: Optional[ModelTurn] = None
-    interrupted: Optional[bool] = None
-    turnComplete: Optional[bool] = None
-    inputTranscription: Optional[BidiGenerateContentTranscription] = None
-    outputTranscription: Optional[BidiGenerateContentTranscription] = None
-    groundingMetadata: Optional[GroundingMetadata] = None
-
-
-class FunctionCall(BaseModel):
-    """Represents a function call from the model.
-
-    Parameters:
-        id: Unique identifier for the function call.
-        name: Name of the function to call.
-        args: Arguments to pass to the function.
-    """
-
-    id: str
-    name: str
-    args: dict
-
-
-class ToolCall(BaseModel):
-    """Contains one or more function calls.
-
-    Parameters:
-        functionCalls: List of function calls to execute.
-    """
-
-    functionCalls: List[FunctionCall]
-
-
-class Modality(str, Enum):
-    """Modality types in token counts."""
-
-    UNSPECIFIED = "MODALITY_UNSPECIFIED"
-    TEXT = "TEXT"
-    IMAGE = "IMAGE"
-    AUDIO = "AUDIO"
-    VIDEO = "VIDEO"
-
-
-class ModalityTokenCount(BaseModel):
-    """Token count for a specific modality.
-
-    Parameters:
-        modality: The modality type.
-        tokenCount: Number of tokens for this modality.
-    """
-
-    modality: Modality
-    tokenCount: int
-
-
-class UsageMetadata(BaseModel):
-    """Usage metadata about the API response.
-
-    Parameters:
-        promptTokenCount: Number of tokens in the prompt. Defaults to None.
-        cachedContentTokenCount: Number of cached content tokens. Defaults to None.
-        responseTokenCount: Number of tokens in the response. Defaults to None.
-        toolUsePromptTokenCount: Number of tokens for tool use prompts. Defaults to None.
-        thoughtsTokenCount: Number of tokens for model thoughts. Defaults to None.
-        totalTokenCount: Total number of tokens used. Defaults to None.
-        promptTokensDetails: Detailed breakdown of prompt tokens by modality. Defaults to None.
-        cacheTokensDetails: Detailed breakdown of cache tokens by modality. Defaults to None.
-        responseTokensDetails: Detailed breakdown of response tokens by modality. Defaults to None.
-        toolUsePromptTokensDetails: Detailed breakdown of tool use tokens by modality. Defaults to None.
-    """
-
-    promptTokenCount: Optional[int] = None
-    cachedContentTokenCount: Optional[int] = None
-    responseTokenCount: Optional[int] = None
-    toolUsePromptTokenCount: Optional[int] = None
-    thoughtsTokenCount: Optional[int] = None
-    totalTokenCount: Optional[int] = None
-    promptTokensDetails: Optional[List[ModalityTokenCount]] = None
-    cacheTokensDetails: Optional[List[ModalityTokenCount]] = None
-    responseTokensDetails: Optional[List[ModalityTokenCount]] = None
-    toolUsePromptTokensDetails: Optional[List[ModalityTokenCount]] = None
-
-
-class ServerEvent(BaseModel):
-    """Server event received from the Gemini Live API.
-
-    Parameters:
-        setupComplete: Setup completion notification. Defaults to None.
-        serverContent: Content from the server. Defaults to None.
-        toolCall: Tool/function call request. Defaults to None.
-        usageMetadata: Token usage metadata. Defaults to None.
-    """
-
-    setupComplete: Optional[SetupComplete] = None
-    serverContent: Optional[ServerContent] = None
-    toolCall: Optional[ToolCall] = None
-    usageMetadata: Optional[UsageMetadata] = None
-
-
-def parse_server_event(str):
-    """Parse a server event from JSON string.
-
-    Args:
-        str: JSON string containing the server event.
-
-    Returns:
-        ServerEvent instance if parsing succeeds, None otherwise.
-    """
-    try:
-        evt = json.loads(str)
-        return ServerEvent.model_validate(evt)
-    except Exception as e:
-        print(f"Error parsing server event: {e}")
-        return None
-
-
-class ContextWindowCompressionConfig(BaseModel):
-    """Configuration for context window compression.
-
-    Parameters:
-        sliding_window: Whether to use sliding window compression. Defaults to True.
-        trigger_tokens: Token count threshold to trigger compression. Defaults to None.
-    """
-
-    sliding_window: Optional[bool] = Field(default=True)
-    trigger_tokens: Optional[int] = Field(default=None)
+"""Event models and utilities for Google Gemini Multimodal Live API.
+
+.. deprecated:: 0.0.90
+    Importing StartSensitivity and EndSensitivity from this module is deprecated.
+    Import them directly from google.genai.types instead.
+"""
+
+import warnings
+
+from loguru import logger
+
+try:
+    from google.genai.types import (
+        EndSensitivity as _EndSensitivity,
+    )
+    from google.genai.types import (
+        StartSensitivity as _StartSensitivity,
+    )
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
+    raise Exception(f"Missing module: {e}")
+
+# These aliases are just here for backward compatibility, since we used to
+# define public-facing StartSensitivity and EndSensitivity enums in this
+# module.
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Importing StartSensitivity and EndSensitivity from "
+        "pipecat.services.gemini_multimodal_live.events is deprecated. "
+        "Please import them directly from google.genai.types instead.",
+        DeprecationWarning,
+        stacklevel=2,
+    )
+
+StartSensitivity = _StartSensitivity
+EndSensitivity = _EndSensitivity
--- a/src/pipecat/services/gemini_multimodal_live/file_api.py
+++ b/src/pipecat/services/gemini_multimodal_live/file_api.py
@@ -9,181 +9,31 @@
 This module provides a client for Google's Gemini File API, enabling file
 uploads, metadata retrieval, listing, and deletion. Files uploaded through
 this API can be referenced in Gemini generative model calls.
+
+.. deprecated:: 0.0.90
+    Importing GeminiFileAPI from this module is deprecated.
+    Import it from pipecat.services.google.gemini_live.file_api instead.
 """

-import mimetypes
-from typing import Any, Dict, Optional
+import warnings

-import aiohttp
 from loguru import logger

+try:
+    from pipecat.services.google.gemini_live.file_api import GeminiFileAPI as _GeminiFileAPI
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
+    raise Exception(f"Missing module: {e}")

-class GeminiFileAPI:
-    """Client for the Gemini File API.
-
-    This class provides methods for uploading, fetching, listing, and deleting files
-    through Google's Gemini File API.
-
-    Files uploaded through this API remain available for 48 hours and can be referenced
-    in calls to the Gemini generative models. Maximum file size is 2GB, with total
-    project storage limited to 20GB.
-    """
-
-    def __init__(
-        self, api_key: str, base_url: str = "https://generativelanguage.googleapis.com/v1beta/files"
-    ):
-        """Initialize the Gemini File API client.
-
-        Args:
-            api_key: Google AI API key
-            base_url: Base URL for the Gemini File API (default is the v1beta endpoint)
-        """
-        self._api_key = api_key
-        self._base_url = base_url
-        # Upload URL uses the /upload/ path
-        self.upload_base_url = "https://generativelanguage.googleapis.com/upload/v1beta/files"
-
-    async def upload_file(
-        self, file_path: str, display_name: Optional[str] = None
-    ) -> Dict[str, Any]:
-        """Upload a file to the Gemini File API using the correct resumable upload protocol.
-
-        Args:
-            file_path: Path to the file to upload
-            display_name: Optional display name for the file
-
-        Returns:
-            File metadata including uri, name, and display_name
-        """
-        logger.info(f"Uploading file: {file_path}")
-
-        async with aiohttp.ClientSession() as session:
-            # Determine the file's MIME type
-            mime_type, _ = mimetypes.guess_type(file_path)
-            if not mime_type:
-                mime_type = "application/octet-stream"
-
-            # Read the file
-            with open(file_path, "rb") as f:
-                file_data = f.read()
-
-            # Create the metadata payload
-            metadata = {}
-            if display_name:
-                metadata = {"file": {"display_name": display_name}}
-
-            # Step 1: Initial resumable request to get upload URL
-            headers = {
-                "X-Goog-Upload-Protocol": "resumable",
-                "X-Goog-Upload-Command": "start",
-                "X-Goog-Upload-Header-Content-Length": str(len(file_data)),
-                "X-Goog-Upload-Header-Content-Type": mime_type,
-                "Content-Type": "application/json",
-            }
-
-            logger.debug(f"Step 1: Getting upload URL from {self.upload_base_url}")
-            async with session.post(
-                f"{self.upload_base_url}?key={self._api_key}", headers=headers, json=metadata
-            ) as response:
-                if response.status != 200:
-                    error_text = await response.text()
-                    logger.error(f"Error initiating file upload: {error_text}")
-                    raise Exception(f"Failed to initiate upload: {response.status} - {error_text}")
-
-                # Get the upload URL from the response header
-                upload_url = response.headers.get("X-Goog-Upload-URL")
-                if not upload_url:
-                    logger.error(f"Response headers: {dict(response.headers)}")
-                    raise Exception("No upload URL in response headers")
-
-                logger.debug(f"Got upload URL: {upload_url}")
-
-            # Step 2: Upload the actual file data
-            upload_headers = {
-                "Content-Length": str(len(file_data)),
-                "X-Goog-Upload-Offset": "0",
-                "X-Goog-Upload-Command": "upload, finalize",
-            }
-
-            logger.debug(f"Step 2: Uploading file data to {upload_url}")
-            async with session.post(upload_url, headers=upload_headers, data=file_data) as response:
-                if response.status != 200:
-                    error_text = await response.text()
-                    logger.error(f"Error uploading file data: {error_text}")
-                    raise Exception(f"Failed to upload file: {response.status} - {error_text}")
-
-                file_info = await response.json()
-                logger.info(f"File uploaded successfully: {file_info.get('file', {}).get('name')}")
-                return file_info
-
-    async def get_file(self, name: str) -> Dict[str, Any]:
-        """Get metadata for a file.
-
-        Args:
-            name: File name (or full path)
-
-        Returns:
-            File metadata
-        """
-        # Extract just the name part if a full path is provided
-        if "/" in name:
-            name = name.split("/")[-1]
-
-        async with aiohttp.ClientSession() as session:
-            async with session.get(f"{self._base_url}/{name}?key={self._api_key}") as response:
-                if response.status != 200:
-                    error_text = await response.text()
-                    logger.error(f"Error getting file metadata: {error_text}")
-                    raise Exception(f"Failed to get file metadata: {response.status}")
-
-                file_info = await response.json()
-                return file_info
-
-    async def list_files(
-        self, page_size: int = 10, page_token: Optional[str] = None
-    ) -> Dict[str, Any]:
-        """List uploaded files.
-
-        Args:
-            page_size: Number of files to return per page
-            page_token: Token for pagination
-
-        Returns:
-            List of files and next page token if available
-        """
-        params = {"key": self._api_key, "pageSize": page_size}
-
-        if page_token:
-            params["pageToken"] = page_token
-
-        async with aiohttp.ClientSession() as session:
-            async with session.get(self._base_url, params=params) as response:
-                if response.status != 200:
-                    error_text = await response.text()
-                    logger.error(f"Error listing files: {error_text}")
-                    raise Exception(f"Failed to list files: {response.status}")
-
-                result = await response.json()
-                return result
-
-    async def delete_file(self, name: str) -> bool:
-        """Delete a file.
-
-        Args:
-            name: File name (or full path)
-
-        Returns:
-            True if deleted successfully
-        """
-        # Extract just the name part if a full path is provided
-        if "/" in name:
-            name = name.split("/")[-1]
-
-        async with aiohttp.ClientSession() as session:
-            async with session.delete(f"{self._base_url}/{name}?key={self._api_key}") as response:
-                if response.status != 200:
-                    error_text = await response.text()
-                    logger.error(f"Error deleting file: {error_text}")
-                    raise Exception(f"Failed to delete file: {response.status}")
-
-                return True
+# These aliases are just here for backward compatibility, since we used to
+# define public-facing StartSensitivity and EndSensitivity enums in this
+# module.
+warnings.warn(
+    "Importing GeminiFileAPI from "
+    "pipecat.services.gemini_multimodal_live.file_api is deprecated. "
+    "Please import it from pipecat.services.google.gemini_live.file_api instead.",
+    DeprecationWarning,
+    stacklevel=2,
+)
+GeminiFileAPI = _GeminiFileAPI
--- a/src/pipecat/services/gemini_multimodal_live/gemini.py
+++ b/src/pipecat/services/gemini_multimodal_live/gemini.py
--- a/src/pipecat/services/google/init.py
+++ b/src/pipecat/services/google/init.py
@@ -9,6 +9,7 @@ import sys
 from pipecat.services import DeprecatedModuleProxy

 from .frames import *
+from .gemini_live import *
 from .image import *
 from .llm import *
 from .llm_openai import *
--- a/src/pipecat/services/google/gemini_live/init.py
+++ b/src/pipecat/services/google/gemini_live/init.py
@@ -0,0 +1,3 @@
+from .file_api import GeminiFileAPI
+from .llm import GeminiLiveLLMService
+from .llm_vertex import GeminiLiveVertexLLMService
--- a/src/pipecat/services/google/gemini_live/file_api.py
+++ b/src/pipecat/services/google/gemini_live/file_api.py
@@ -0,0 +1,189 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Gemini File API client for uploading and managing files.
+
+This module provides a client for Google's Gemini File API, enabling file
+uploads, metadata retrieval, listing, and deletion. Files uploaded through
+this API can be referenced in Gemini generative model calls.
+"""
+
+import mimetypes
+from typing import Any, Dict, Optional
+
+import aiohttp
+from loguru import logger
+
+
+class GeminiFileAPI:
+    """Client for the Gemini File API.
+
+    This class provides methods for uploading, fetching, listing, and deleting files
+    through Google's Gemini File API.
+
+    Files uploaded through this API remain available for 48 hours and can be referenced
+    in calls to the Gemini generative models. Maximum file size is 2GB, with total
+    project storage limited to 20GB.
+    """
+
+    def __init__(
+        self, api_key: str, base_url: str = "https://generativelanguage.googleapis.com/v1beta/files"
+    ):
+        """Initialize the Gemini File API client.
+
+        Args:
+            api_key: Google AI API key
+            base_url: Base URL for the Gemini File API (default is the v1beta endpoint)
+        """
+        self._api_key = api_key
+        self._base_url = base_url
+        # Upload URL uses the /upload/ path
+        self.upload_base_url = "https://generativelanguage.googleapis.com/upload/v1beta/files"
+
+    async def upload_file(
+        self, file_path: str, display_name: Optional[str] = None
+    ) -> Dict[str, Any]:
+        """Upload a file to the Gemini File API using the correct resumable upload protocol.
+
+        Args:
+            file_path: Path to the file to upload
+            display_name: Optional display name for the file
+
+        Returns:
+            File metadata including uri, name, and display_name
+        """
+        logger.info(f"Uploading file: {file_path}")
+
+        async with aiohttp.ClientSession() as session:
+            # Determine the file's MIME type
+            mime_type, _ = mimetypes.guess_type(file_path)
+            if not mime_type:
+                mime_type = "application/octet-stream"
+
+            # Read the file
+            with open(file_path, "rb") as f:
+                file_data = f.read()
+
+            # Create the metadata payload
+            metadata = {}
+            if display_name:
+                metadata = {"file": {"display_name": display_name}}
+
+            # Step 1: Initial resumable request to get upload URL
+            headers = {
+                "X-Goog-Upload-Protocol": "resumable",
+                "X-Goog-Upload-Command": "start",
+                "X-Goog-Upload-Header-Content-Length": str(len(file_data)),
+                "X-Goog-Upload-Header-Content-Type": mime_type,
+                "Content-Type": "application/json",
+            }
+
+            logger.debug(f"Step 1: Getting upload URL from {self.upload_base_url}")
+            async with session.post(
+                f"{self.upload_base_url}?key={self._api_key}", headers=headers, json=metadata
+            ) as response:
+                if response.status != 200:
+                    error_text = await response.text()
+                    logger.error(f"Error initiating file upload: {error_text}")
+                    raise Exception(f"Failed to initiate upload: {response.status} - {error_text}")
+
+                # Get the upload URL from the response header
+                upload_url = response.headers.get("X-Goog-Upload-URL")
+                if not upload_url:
+                    logger.error(f"Response headers: {dict(response.headers)}")
+                    raise Exception("No upload URL in response headers")
+
+                logger.debug(f"Got upload URL: {upload_url}")
+
+            # Step 2: Upload the actual file data
+            upload_headers = {
+                "Content-Length": str(len(file_data)),
+                "X-Goog-Upload-Offset": "0",
+                "X-Goog-Upload-Command": "upload, finalize",
+            }
+
+            logger.debug(f"Step 2: Uploading file data to {upload_url}")
+            async with session.post(upload_url, headers=upload_headers, data=file_data) as response:
+                if response.status != 200:
+                    error_text = await response.text()
+                    logger.error(f"Error uploading file data: {error_text}")
+                    raise Exception(f"Failed to upload file: {response.status} - {error_text}")
+
+                file_info = await response.json()
+                logger.info(f"File uploaded successfully: {file_info.get('file', {}).get('name')}")
+                return file_info
+
+    async def get_file(self, name: str) -> Dict[str, Any]:
+        """Get metadata for a file.
+
+        Args:
+            name: File name (or full path)
+
+        Returns:
+            File metadata
+        """
+        # Extract just the name part if a full path is provided
+        if "/" in name:
+            name = name.split("/")[-1]
+
+        async with aiohttp.ClientSession() as session:
+            async with session.get(f"{self._base_url}/{name}?key={self._api_key}") as response:
+                if response.status != 200:
+                    error_text = await response.text()
+                    logger.error(f"Error getting file metadata: {error_text}")
+                    raise Exception(f"Failed to get file metadata: {response.status}")
+
+                file_info = await response.json()
+                return file_info
+
+    async def list_files(
+        self, page_size: int = 10, page_token: Optional[str] = None
+    ) -> Dict[str, Any]:
+        """List uploaded files.
+
+        Args:
+            page_size: Number of files to return per page
+            page_token: Token for pagination
+
+        Returns:
+            List of files and next page token if available
+        """
+        params = {"key": self._api_key, "pageSize": page_size}
+
+        if page_token:
+            params["pageToken"] = page_token
+
+        async with aiohttp.ClientSession() as session:
+            async with session.get(self._base_url, params=params) as response:
+                if response.status != 200:
+                    error_text = await response.text()
+                    logger.error(f"Error listing files: {error_text}")
+                    raise Exception(f"Failed to list files: {response.status}")
+
+                result = await response.json()
+                return result
+
+    async def delete_file(self, name: str) -> bool:
+        """Delete a file.
+
+        Args:
+            name: File name (or full path)
+
+        Returns:
+            True if deleted successfully
+        """
+        # Extract just the name part if a full path is provided
+        if "/" in name:
+            name = name.split("/")[-1]
+
+        async with aiohttp.ClientSession() as session:
+            async with session.delete(f"{self._base_url}/{name}?key={self._api_key}") as response:
+                if response.status != 200:
+                    error_text = await response.text()
+                    logger.error(f"Error deleting file: {error_text}")
+                    raise Exception(f"Failed to delete file: {response.status}")
+
+                return True
--- a/src/pipecat/services/google/gemini_live/llm.py
+++ b/src/pipecat/services/google/gemini_live/llm.py
--- a/src/pipecat/services/google/gemini_live/llm_vertex.py
+++ b/src/pipecat/services/google/gemini_live/llm_vertex.py
@@ -0,0 +1,184 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Service for accessing Gemini Live via Google Vertex AI.
+
+This module provides integration with Google's Gemini Live model via
+Vertex AI, supporting both text and audio modalities with voice transcription,
+streaming responses, and tool usage.
+"""
+
+import json
+from typing import List, Optional, Union
+
+from loguru import logger
+
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.services.google.gemini_live.llm import (
+    GeminiLiveLLMService,
+    HttpOptions,
+    InputParams,
+)
+
+try:
+    from google.auth import default
+    from google.auth.exceptions import GoogleAuthError
+    from google.auth.transport.requests import Request
+    from google.genai import Client
+    from google.oauth2 import service_account
+
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error("In order to use Google Vertex AI, you need to `pip install pipecat-ai[google]`.")
+    raise Exception(f"Missing module: {e}")
+
+
+class GeminiLiveVertexLLMService(GeminiLiveLLMService):
+    """Provides access to Google's Gemini Live model via Vertex AI.
+
+    This service enables real-time conversations with Gemini, supporting both
+    text and audio modalities. It handles voice transcription, streaming audio
+    responses, and tool usage.
+    """
+
+    def __init__(
+        self,
+        *,
+        credentials: Optional[str] = None,
+        credentials_path: Optional[str] = None,
+        location: str,
+        project_id: str,
+        model="google/gemini-2.0-flash-live-preview-04-09",
+        voice_id: str = "Charon",
+        start_audio_paused: bool = False,
+        start_video_paused: bool = False,
+        system_instruction: Optional[str] = None,
+        tools: Optional[Union[List[dict], ToolsSchema]] = None,
+        params: Optional[InputParams] = None,
+        inference_on_context_initialization: bool = True,
+        file_api_base_url: str = "https://generativelanguage.googleapis.com/v1beta/files",
+        http_options: Optional[HttpOptions] = None,
+        **kwargs,
+    ):
+        """Initialize the service for accessing Gemini Live via Google Vertex AI.
+
+        Args:
+            credentials: JSON string of service account credentials.
+            credentials_path: Path to the service account JSON file.
+            location: GCP region for Vertex AI endpoint (e.g., "us-east4").
+            project_id: Google Cloud project ID.
+            model: Model identifier to use. Defaults to "models/gemini-2.0-flash-live-preview-04-09".
+            voice_id: TTS voice identifier. Defaults to "Charon".
+            start_audio_paused: Whether to start with audio input paused. Defaults to False.
+            start_video_paused: Whether to start with video input paused. Defaults to False.
+            system_instruction: System prompt for the model. Defaults to None.
+            tools: Tools/functions available to the model. Defaults to None.
+            params: Configuration parameters for the model along with Vertex AI
+                location and project ID.
+            inference_on_context_initialization: Whether to generate a response when context
+                is first set. Defaults to True.
+            file_api_base_url: Base URL for the Gemini File API. Defaults to the official endpoint.
+            http_options: HTTP options for the client.
+            **kwargs: Additional arguments passed to parent GeminiLiveLLMService.
+        """
+        # Check if user incorrectly passed api_key, which is used by parent
+        # class but not here.
+        if "api_key" in kwargs:
+            logger.error(
+                "GeminiLiveVertexLLMService does not accept 'api_key' parameter. "
+                "Use 'credentials' or 'credentials_path' instead for Vertex AI authentication."
+            )
+            raise ValueError(
+                "Invalid parameter 'api_key'. Use 'credentials' or 'credentials_path' for Vertex AI authentication."
+            )
+
+        # These need to be set before calling super().__init__() because
+        # super().__init__() invokes create_client(), which needs these.
+        self._credentials = self._get_credentials(credentials, credentials_path)
+        self._project_id = project_id
+        self._location = location
+
+        # Call parent constructor with the obtained API key
+        super().__init__(
+            # api_key is required by parent class, but actually not used with
+            # Vertex
+            api_key="dummy",
+            model=model,
+            voice_id=voice_id,
+            start_audio_paused=start_audio_paused,
+            start_video_paused=start_video_paused,
+            system_instruction=system_instruction,
+            tools=tools,
+            params=params,
+            inference_on_context_initialization=inference_on_context_initialization,
+            file_api_base_url=file_api_base_url,
+            http_options=http_options,
+            **kwargs,
+        )
+
+    def create_client(self):
+        """Create the Gemini client instance."""
+        self._client = Client(
+            vertexai=True,
+            credentials=self._credentials,
+            project=self._project_id,
+            location=self._location,
+        )
+
+    @property
+    def file_api(self):
+        """Gemini File API is not supported with Vertex AI."""
+        raise NotImplementedError(
+            "When using Vertex AI, the recommended approach is to use Google Cloud Storage for file handling. The Gemini File API is not directly supported in this context."
+        )
+
+    @staticmethod
+    def _get_credentials(credentials: Optional[str], credentials_path: Optional[str]) -> str:
+        """Retrieve Credentials using Google service account credentials JSON.
+
+        Supports multiple authentication methods:
+        1. Direct JSON credentials string
+        2. Path to service account JSON file
+        3. Default application credentials (ADC)
+
+        Args:
+            credentials: JSON string of service account credentials.
+            credentials_path: Path to the service account JSON file.
+
+        Returns:
+            OAuth token for API authentication.
+
+        Raises:
+            ValueError: If no valid credentials are provided or found.
+        """
+        creds: Optional[service_account.Credentials] = None
+
+        if credentials:
+            # Parse and load credentials from JSON string
+            creds = service_account.Credentials.from_service_account_info(
+                json.loads(credentials),
+                scopes=["https://www.googleapis.com/auth/cloud-platform"],
+            )
+        elif credentials_path:
+            # Load credentials from JSON file
+            creds = service_account.Credentials.from_service_account_file(
+                credentials_path,
+                scopes=["https://www.googleapis.com/auth/cloud-platform"],
+            )
+        else:
+            try:
+                creds, project_id = default(
+                    scopes=["https://www.googleapis.com/auth/cloud-platform"]
+                )
+            except GoogleAuthError:
+                pass
+
+        if not creds:
+            raise ValueError("No valid credentials provided.")
+
+        creds.refresh(Request())  # Ensure token is up-to-date, lifetime is 1 hour.
+
+        return creds
--- a/src/pipecat/services/google/llm_openai.py
+++ b/src/pipecat/services/google/llm_openai.py
@@ -94,9 +94,9 @@ class GoogleLLMOpenAIBetaService(OpenAILLMService):
        async for chunk in chunk_stream:
            if chunk.usage:
                tokens = LLMTokenUsage(
-                    prompt_tokens=chunk.usage.prompt_tokens,
-                    completion_tokens=chunk.usage.completion_tokens,
-                    total_tokens=chunk.usage.total_tokens,
+                    prompt_tokens=chunk.usage.prompt_tokens or 0,
+                    completion_tokens=chunk.usage.completion_tokens or 0,
+                    total_tokens=chunk.usage.total_tokens or 0,
                )
                await self.start_llm_usage_metrics(tokens)

--- a/src/pipecat/services/google/llm_vertex.py
+++ b/src/pipecat/services/google/llm_vertex.py
@@ -53,12 +53,44 @@ class GoogleVertexLLMService(OpenAILLMService):

        Parameters:
            location: GCP region for Vertex AI endpoint (e.g., "us-east4").
+
+                .. deprecated:: 0.0.90
+                    Use `location` as a direct argument to
+                    `GoogleVertexLLMService.__init__()` instead.
+
            project_id: Google Cloud project ID.
+
+                .. deprecated:: 0.0.90
+                    Use `project_id` as a direct argument to
+                    `GoogleVertexLLMService.__init__()` instead.
        """

        # https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations
-        location: str = "us-east4"
-        project_id: str
+        location: Optional[str] = None
+        project_id: Optional[str] = None
+
+        def __init__(self, **kwargs):
+            """Initializes the InputParams."""
+            import warnings
+
+            with warnings.catch_warnings():
+                warnings.simplefilter("always")
+                if "location" in kwargs and kwargs["location"] is not None:
+                    warnings.warn(
+                        "GoogleVertexLLMService.InputParams.location is deprecated. "
+                        "Please provide 'location' as a direct argument to GoogleVertexLLMService.__init__() instead.",
+                        DeprecationWarning,
+                        stacklevel=2,
+                    )
+
+                if "project_id" in kwargs and kwargs["project_id"] is not None:
+                    warnings.warn(
+                        "GoogleVertexLLMService.InputParams.project_id is deprecated. "
+                        "Please provide 'project_id' as a direct argument to GoogleVertexLLMService.__init__() instead.",
+                        DeprecationWarning,
+                        stacklevel=2,
+                    )
+            super().__init__(**kwargs)

    def __init__(
        self,
@@ -66,7 +98,8 @@ class GoogleVertexLLMService(OpenAILLMService):
        credentials: Optional[str] = None,
        credentials_path: Optional[str] = None,
        model: str = "google/gemini-2.0-flash-001",
-        params: Optional[InputParams] = None,
+        location: Optional[str] = None,
+        project_id: Optional[str] = None,
        **kwargs,
    ):
        """Initializes the VertexLLMService.
@@ -75,33 +108,60 @@ class GoogleVertexLLMService(OpenAILLMService):
            credentials: JSON string of service account credentials.
            credentials_path: Path to the service account JSON file.
            model: Model identifier (e.g., "google/gemini-2.0-flash-001").
-            params: Vertex AI input parameters including location and project.
+            location: GCP region for Vertex AI endpoint (e.g., "us-east4").
+            project_id: Google Cloud project ID.
            **kwargs: Additional arguments passed to OpenAILLMService.
        """
-        params = params or OpenAILLMService.InputParams()
-        base_url = self._get_base_url(params)
+        # Handle deprecated InputParams fields
+        if "params" in kwargs and isinstance(kwargs["params"], GoogleVertexLLMService.InputParams):
+            params = kwargs["params"]
+            # Extract location and project_id from params if not provided
+            # directly, for backward compatibility
+            if project_id is None:
+                project_id = params.project_id
+            if location is None:
+                location = params.location
+            # Convert to base InputParams
+            params = OpenAILLMService.InputParams(
+                **params.model_dump(exclude={"location", "project_id"}, exclude_unset=True)
+            )
+            kwargs["params"] = params
+
+        # Validate project_id and location parameters
+        # NOTE: once we remove Vertex-spcific InputParams class, we can update
+        #       __init__() signature as follows:
+        #       - location: str = "us-east4",
+        #       - project_id: str,
+        #       But for now, we need them as-is to maintain proper backward
+        #       compatibility.
+        if project_id is None:
+            raise ValueError("project_id is required")
+        if location is None:
+            # If location is not provided, default to "us-east4".
+            # Note: this is legacy behavior; ideally location would be
+            # required.
+            logger.warning("location is not provided. Defaulting to 'us-east4'.")
+            location = "us-east4"  # Default location if not provided
+
+        base_url = self._get_base_url(location, project_id)
        self._api_key = self._get_api_token(credentials, credentials_path)

        super().__init__(
            api_key=self._api_key,
            base_url=base_url,
            model=model,
-            params=params,
            **kwargs,
        )

    @staticmethod
-    def _get_base_url(params: InputParams) -> str:
+    def _get_base_url(location: str, project_id: str) -> str:
        """Construct the base URL for Vertex AI API."""
        # Determine the correct API host based on location
-        if params.location == "global":
+        if location == "global":
            api_host = "aiplatform.googleapis.com"
        else:
-            api_host = f"{params.location}-aiplatform.googleapis.com"
-        return (
-            f"https://{api_host}/v1/"
-            f"projects/{params.project_id}/locations/{params.location}/endpoints/openapi"
-        )
+            api_host = f"{location}-aiplatform.googleapis.com"
+        return f"https://{api_host}/v1/projects/{project_id}/locations/{location}/endpoints/openapi"

    @staticmethod
    def _get_api_token(credentials: Optional[str], credentials_path: Optional[str]) -> str:
--- a/src/pipecat/services/hume/tts.py
+++ b/src/pipecat/services/hume/tts.py
@@ -42,7 +42,7 @@ class HumeTTSService(TTSService):
    """Hume Octave Text-to-Speech service.

    Streams PCM audio via Hume's HTTP output streaming (JSON chunks) endpoint
-    using the Python SDK and emits `TTSAudioRawFrame`s suitable for Pipecat transports.
+    using the Python SDK and emits ``TTSAudioRawFrame`` frames suitable for Pipecat transports.

    Supported features:

@@ -78,7 +78,7 @@ class HumeTTSService(TTSService):

        Args:
            api_key: Hume API key. If omitted, reads the ``HUME_API_KEY`` environment variable.
-            voice_id: ID of the voice to use (ID-only; names are not supported here).
+            voice_id: ID of the voice to use. Only voice IDs are supported; voice names are not.
            params: Optional synthesis controls (acting instructions, speed, trailing silence).
            sample_rate: Output sample rate for emitted PCM frames. Defaults to 48_000 (Hume).
            **kwargs: Additional arguments passed to the parent class.
--- a/src/pipecat/services/openai/init.py
+++ b/src/pipecat/services/openai/init.py
@@ -10,6 +10,7 @@ from pipecat.services import DeprecatedModuleProxy

 from .image import *
 from .llm import *
+from .realtime import *
 from .stt import *
 from .tts import *

--- a/src/pipecat/services/openai/base_llm.py
+++ b/src/pipecat/services/openai/base_llm.py
@@ -66,6 +66,7 @@ class BaseOpenAILLMService(LLMService):
            top_p: Top-p (nucleus) sampling parameter (0.0 to 1.0).
            max_tokens: Maximum tokens in response (deprecated, use max_completion_tokens).
            max_completion_tokens: Maximum completion tokens to generate.
+            service_tier: Service tier to use (e.g., "auto", "flex", "priority").
            extra: Additional model-specific parameters.
        """

@@ -83,6 +84,7 @@ class BaseOpenAILLMService(LLMService):
        top_p: Optional[float] = Field(default_factory=lambda: NOT_GIVEN, ge=0.0, le=1.0)
        max_tokens: Optional[int] = Field(default_factory=lambda: NOT_GIVEN, ge=1)
        max_completion_tokens: Optional[int] = Field(default_factory=lambda: NOT_GIVEN, ge=1)
+        service_tier: Optional[str] = Field(default_factory=lambda: NOT_GIVEN)
        extra: Optional[Dict[str, Any]] = Field(default_factory=dict)

    def __init__(
@@ -125,6 +127,7 @@ class BaseOpenAILLMService(LLMService):
            "top_p": params.top_p,
            "max_tokens": params.max_tokens,
            "max_completion_tokens": params.max_completion_tokens,
+            "service_tier": params.service_tier,
            "extra": params.extra if isinstance(params.extra, dict) else {},
        }
        self._retry_timeout_secs = retry_timeout_secs
@@ -236,6 +239,7 @@ class BaseOpenAILLMService(LLMService):
            "top_p": self._settings["top_p"],
            "max_tokens": self._settings["max_tokens"],
            "max_completion_tokens": self._settings["max_completion_tokens"],
+            "service_tier": self._settings["service_tier"],
        }

        # Messages, tools, tool_choice
--- a/src/pipecat/services/openai/realtime/init.py
+++ b/src/pipecat/services/openai/realtime/init.py
--- a/src/pipecat/services/openai/realtime/context.py
+++ b/src/pipecat/services/openai/realtime/context.py
@@ -0,0 +1,272 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""OpenAI Realtime LLM context and aggregator implementations."""
+
+import copy
+import json
+
+from loguru import logger
+
+from pipecat.frames.frames import (
+    Frame,
+    FunctionCallResultFrame,
+    InterimTranscriptionFrame,
+    LLMMessagesUpdateFrame,
+    LLMSetToolsFrame,
+    LLMTextFrame,
+    TranscriptionFrame,
+)
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.services.openai.llm import (
+    OpenAIAssistantContextAggregator,
+    OpenAIUserContextAggregator,
+)
+
+from . import events
+from .frames import RealtimeFunctionCallResultFrame, RealtimeMessagesUpdateFrame
+
+
+class OpenAIRealtimeLLMContext(OpenAILLMContext):
+    """OpenAI Realtime LLM context with session management and message conversion.
+
+    Extends the standard OpenAI LLM context to support real-time session properties,
+    instruction management, and conversion between standard message formats and
+    realtime conversation items.
+    """
+
+    def __init__(self, messages=None, tools=None, **kwargs):
+        """Initialize the OpenAIRealtimeLLMContext.
+
+        Args:
+            messages: Initial conversation messages. Defaults to None.
+            tools: Available function tools. Defaults to None.
+            **kwargs: Additional arguments passed to parent OpenAILLMContext.
+        """
+        super().__init__(messages=messages, tools=tools, **kwargs)
+        self.__setup_local()
+
+    def __setup_local(self):
+        self.llm_needs_settings_update = True
+        self.llm_needs_initial_messages = True
+        self._session_instructions = ""
+
+        return
+
+    @staticmethod
+    def upgrade_to_realtime(obj: OpenAILLMContext) -> "OpenAIRealtimeLLMContext":
+        """Upgrade a standard OpenAI LLM context to a realtime context.
+
+        Args:
+            obj: The OpenAILLMContext instance to upgrade.
+
+        Returns:
+            The upgraded OpenAIRealtimeLLMContext instance.
+        """
+        if isinstance(obj, OpenAILLMContext) and not isinstance(obj, OpenAIRealtimeLLMContext):
+            obj.__class__ = OpenAIRealtimeLLMContext
+            obj.__setup_local()
+        return obj
+
+    # todo
+    #   - finish implementing all frames
+
+    def from_standard_message(self, message):
+        """Convert a standard message format to a realtime conversation item.
+
+        Args:
+            message: The standard message dictionary to convert.
+
+        Returns:
+            A ConversationItem instance for the realtime API.
+        """
+        if message.get("role") == "user":
+            content = message.get("content")
+            if isinstance(message.get("content"), list):
+                content = ""
+                for c in message.get("content"):
+                    if c.get("type") == "text":
+                        content += " " + c.get("text")
+                    else:
+                        logger.error(
+                            f"Unhandled content type in context message: {c.get('type')} - {message}"
+                        )
+            return events.ConversationItem(
+                role="user",
+                type="message",
+                content=[events.ItemContent(type="input_text", text=content)],
+            )
+        if message.get("role") == "assistant" and message.get("tool_calls"):
+            tc = message.get("tool_calls")[0]
+            return events.ConversationItem(
+                type="function_call",
+                call_id=tc["id"],
+                name=tc["function"]["name"],
+                arguments=tc["function"]["arguments"],
+            )
+        logger.error(f"Unhandled message type in from_standard_message: {message}")
+
+    def get_messages_for_initializing_history(self):
+        """Get conversation items for initializing the realtime session history.
+
+        Converts the context's messages to a format suitable for the realtime API,
+        handling system instructions and conversation history packaging.
+
+        Returns:
+            List of conversation items for session initialization.
+        """
+        # We can't load a long conversation history into the openai realtime api yet. (The API/model
+        # forgets that it can do audio, if you do a series of `conversation.item.create` calls.) So
+        # our general strategy until this is fixed is just to put everything into a first "user"
+        # message as a single input.
+        if not self.messages:
+            return []
+
+        messages = copy.deepcopy(self.messages)
+
+        # If we have a "system" message as our first message, let's pull that out into session
+        # "instructions"
+        if messages[0].get("role") == "system":
+            self.llm_needs_settings_update = True
+            system = messages.pop(0)
+            content = system.get("content")
+            if isinstance(content, str):
+                self._session_instructions = content
+            elif isinstance(content, list):
+                self._session_instructions = content[0].get("text")
+            if not messages:
+                return []
+
+        # If we have just a single "user" item, we can just send it normally
+        if len(messages) == 1 and messages[0].get("role") == "user":
+            return [self.from_standard_message(messages[0])]
+
+        # Otherwise, let's pack everything into a single "user" message with a bit of
+        # explanation for the LLM
+        intro_text = """
+        This is a previously saved conversation. Please treat this conversation history as a
+        starting point for the current conversation."""
+
+        trailing_text = """
+        This is the end of the previously saved conversation. Please continue the conversation
+        from here. If the last message is a user instruction or question, act on that instruction
+        or answer the question. If the last message is an assistant response, simple say that you
+        are ready to continue the conversation."""
+
+        return [
+            {
+                "role": "user",
+                "type": "message",
+                "content": [
+                    {
+                        "type": "input_text",
+                        "text": "\n\n".join(
+                            [intro_text, json.dumps(messages, indent=2), trailing_text]
+                        ),
+                    }
+                ],
+            }
+        ]
+
+    def add_user_content_item_as_message(self, item):
+        """Add a user content item as a standard message to the context.
+
+        Args:
+            item: The conversation item to add as a user message.
+        """
+        message = {
+            "role": "user",
+            "content": [{"type": "text", "text": item.content[0].transcript}],
+        }
+        self.add_message(message)
+
+
+class OpenAIRealtimeUserContextAggregator(OpenAIUserContextAggregator):
+    """User context aggregator for OpenAI Realtime API.
+
+    Handles user input frames and generates appropriate context updates
+    for the realtime conversation, including message updates and tool settings.
+
+    Args:
+        context: The OpenAI realtime LLM context.
+        **kwargs: Additional arguments passed to parent aggregator.
+    """
+
+    async def process_frame(
+        self, frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM
+    ):
+        """Process incoming frames and handle realtime-specific frame types.
+
+        Args:
+            frame: The frame to process.
+            direction: The direction of frame flow in the pipeline.
+        """
+        await super().process_frame(frame, direction)
+        # Parent does not push LLMMessagesUpdateFrame. This ensures that in a typical pipeline,
+        # messages are only processed by the user context aggregator, which is generally what we want. But
+        # we also need to send new messages over the websocket, so the openai realtime API has them
+        # in its context.
+        if isinstance(frame, LLMMessagesUpdateFrame):
+            await self.push_frame(RealtimeMessagesUpdateFrame(context=self._context))
+
+        # Parent also doesn't push the LLMSetToolsFrame.
+        if isinstance(frame, LLMSetToolsFrame):
+            await self.push_frame(frame, direction)
+
+    async def push_aggregation(self):
+        """Push user input aggregation.
+
+        Currently ignores all user input coming into the pipeline as realtime
+        audio input is handled directly by the service.
+        """
+        # for the moment, ignore all user input coming into the pipeline.
+        # todo: think about whether/how to fix this to allow for text input from
+        #       upstream (transport/transcription, or other sources)
+        pass
+
+
+class OpenAIRealtimeAssistantContextAggregator(OpenAIAssistantContextAggregator):
+    """Assistant context aggregator for OpenAI Realtime API.
+
+    Handles assistant output frames from the realtime service, filtering
+    out duplicate text frames and managing function call results.
+
+    Args:
+        context: The OpenAI realtime LLM context.
+        **kwargs: Additional arguments passed to parent aggregator.
+    """
+
+    # The LLMAssistantContextAggregator uses TextFrames to aggregate the LLM output,
+    # but the OpenAIRealtimeLLMService pushes LLMTextFrames and TTSTextFrames. We
+    # need to override this proces_frame for LLMTextFrame, so that only the TTSTextFrames
+    # are process. This ensures that the context gets only one set of messages.
+    # OpenAIRealtimeLLMService also pushes TranscriptionFrames and InterimTranscriptionFrames,
+    # so we need to ignore pushing those as well, as they're also TextFrames.
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        """Process assistant frames, filtering out duplicate text content.
+
+        Args:
+            frame: The frame to process.
+            direction: The direction of frame flow in the pipeline.
+        """
+        if not isinstance(frame, (LLMTextFrame, TranscriptionFrame, InterimTranscriptionFrame)):
+            await super().process_frame(frame, direction)
+
+    async def handle_function_call_result(self, frame: FunctionCallResultFrame):
+        """Handle function call result and notify the realtime service.
+
+        Args:
+            frame: The function call result frame to handle.
+        """
+        await super().handle_function_call_result(frame)
+
+        # The standard function callback code path pushes the FunctionCallResultFrame from the llm itself,
+        # so we didn't have a chance to add the result to the openai realtime api context. Let's push a
+        # special frame to do that.
+        await self.push_frame(
+            RealtimeFunctionCallResultFrame(result_frame=frame), FrameDirection.UPSTREAM
+        )
--- a/src/pipecat/services/openai/realtime/events.py
+++ b/src/pipecat/services/openai/realtime/events.py
--- a/src/pipecat/services/openai/realtime/frames.py
+++ b/src/pipecat/services/openai/realtime/frames.py
@@ -0,0 +1,37 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Custom frame types for OpenAI Realtime API integration."""
+
+from dataclasses import dataclass
+from typing import TYPE_CHECKING
+
+from pipecat.frames.frames import DataFrame, FunctionCallResultFrame
+
+if TYPE_CHECKING:
+    from pipecat.services.openai.realtime.context import OpenAIRealtimeLLMContext
+
+
+@dataclass
+class RealtimeMessagesUpdateFrame(DataFrame):
+    """Frame indicating that the realtime context messages have been updated.
+
+    Parameters:
+        context: The updated OpenAI realtime LLM context.
+    """
+
+    context: "OpenAIRealtimeLLMContext"
+
+
+@dataclass
+class RealtimeFunctionCallResultFrame(DataFrame):
+    """Frame containing function call results for the realtime service.
+
+    Parameters:
+        result_frame: The function call result frame to send to the realtime API.
+    """
+
+    result_frame: FunctionCallResultFrame
--- a/src/pipecat/services/openai_realtime/openai.py
+++ b/src/pipecat/services/openai_realtime/openai.py
--- a/src/pipecat/services/openai_realtime/init.py
+++ b/src/pipecat/services/openai_realtime/init.py
@@ -1,9 +1,27 @@
-from .azure import AzureRealtimeLLMService
-from .events import (
+#
+# Copyright (c) 2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import warnings
+
+from pipecat.services.azure.realtime.llm import AzureRealtimeLLMService
+from pipecat.services.openai.realtime.events import (
    InputAudioNoiseReduction,
    InputAudioTranscription,
    SemanticTurnDetection,
    SessionProperties,
    TurnDetection,
 )
-from .openai import OpenAIRealtimeLLMService
+from pipecat.services.openai.realtime.llm import OpenAIRealtimeLLMService
+
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Types in pipecat.services.openai_realtime are deprecated. "
+        "Please use the equivalent types from "
+        "pipecat.services.openai.realtime instead.",
+        DeprecationWarning,
+        stacklevel=2,
+    )
--- a/src/pipecat/services/openai_realtime/azure.py
+++ b/src/pipecat/services/openai_realtime/azure.py
@@ -1,67 +1,21 @@
 #
-# Copyright (c) 2024–2025, Daily
+# Copyright (c) 2025, Daily
 #
 # SPDX-License-Identifier: BSD 2-Clause License
 #

 """Azure OpenAI Realtime LLM service implementation."""

-from loguru import logger
+import warnings

-from .openai import OpenAIRealtimeLLMService
+from pipecat.services.azure.realtime.llm import *

-try:
-    from websockets.asyncio.client import connect as websocket_connect
-except ModuleNotFoundError as e:
-    logger.error(f"Exception: {e}")
-    logger.error(
-        "In order to use OpenAI, you need to `pip install pipecat-ai[openai]`. Also, set `OPENAI_API_KEY` environment variable."
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Types in pipecat.services.openai_realtime.azure are deprecated. "
+        "Please use the equivalent types from "
+        "pipecat.services.azure.realtime.llm instead.",
+        DeprecationWarning,
+        stacklevel=2,
    )
-    raise Exception(f"Missing module: {e}")
-
-
-class AzureRealtimeLLMService(OpenAIRealtimeLLMService):
-    """Azure OpenAI Realtime LLM service with Azure-specific authentication.
-
-    Extends the OpenAI Realtime service to work with Azure OpenAI endpoints,
-    using Azure's authentication headers and endpoint format. Provides the same
-    real-time audio and text communication capabilities as the base OpenAI service.
-    """
-
-    def __init__(
-        self,
-        *,
-        api_key: str,
-        base_url: str,
-        **kwargs,
-    ):
-        """Initialize Azure Realtime LLM service.
-
-        Args:
-            api_key: The API key for the Azure OpenAI service.
-            base_url: The full Azure WebSocket endpoint URL including api-version and deployment.
-                Example: "wss://my-project.openai.azure.com/openai/realtime?api-version=2024-10-01-preview&deployment=my-realtime-deployment"
-            **kwargs: Additional arguments passed to parent OpenAIRealtimeLLMService.
-        """
-        super().__init__(base_url=base_url, api_key=api_key, **kwargs)
-        self.api_key = api_key
-        self.base_url = base_url
-
-    async def _connect(self):
-        try:
-            if self._websocket:
-                # Here we assume that if we have a websocket, we are connected. We
-                # handle disconnections in the send/recv code paths.
-                return
-
-            logger.info(f"Connecting to {self.base_url}, api key: {self.api_key}")
-            self._websocket = await websocket_connect(
-                uri=self.base_url,
-                additional_headers={
-                    "api-key": self.api_key,
-                },
-            )
-            self._receive_task = self.create_task(self._receive_task_handler())
-        except Exception as e:
-            logger.error(f"{self} initialization error: {e}")
-            self._websocket = None
--- a/src/pipecat/services/openai_realtime/context.py
+++ b/src/pipecat/services/openai_realtime/context.py
@@ -1,272 +1,21 @@
 #
-# Copyright (c) 2024–2025, Daily
+# Copyright (c) 2025, Daily
 #
 # SPDX-License-Identifier: BSD 2-Clause License
 #

 """OpenAI Realtime LLM context and aggregator implementations."""

-import copy
-import json
+import warnings

-from loguru import logger
+from pipecat.services.openai.realtime.context import *

-from pipecat.frames.frames import (
-    Frame,
-    FunctionCallResultFrame,
-    InterimTranscriptionFrame,
-    LLMMessagesUpdateFrame,
-    LLMSetToolsFrame,
-    LLMTextFrame,
-    TranscriptionFrame,
-)
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
-from pipecat.processors.frame_processor import FrameDirection
-from pipecat.services.openai.llm import (
-    OpenAIAssistantContextAggregator,
-    OpenAIUserContextAggregator,
-)
-
-from . import events
-from .frames import RealtimeFunctionCallResultFrame, RealtimeMessagesUpdateFrame
-
-
-class OpenAIRealtimeLLMContext(OpenAILLMContext):
-    """OpenAI Realtime LLM context with session management and message conversion.
-
-    Extends the standard OpenAI LLM context to support real-time session properties,
-    instruction management, and conversion between standard message formats and
-    realtime conversation items.
-    """
-
-    def __init__(self, messages=None, tools=None, **kwargs):
-        """Initialize the OpenAIRealtimeLLMContext.
-
-        Args:
-            messages: Initial conversation messages. Defaults to None.
-            tools: Available function tools. Defaults to None.
-            **kwargs: Additional arguments passed to parent OpenAILLMContext.
-        """
-        super().__init__(messages=messages, tools=tools, **kwargs)
-        self.__setup_local()
-
-    def __setup_local(self):
-        self.llm_needs_settings_update = True
-        self.llm_needs_initial_messages = True
-        self._session_instructions = ""
-
-        return
-
-    @staticmethod
-    def upgrade_to_realtime(obj: OpenAILLMContext) -> "OpenAIRealtimeLLMContext":
-        """Upgrade a standard OpenAI LLM context to a realtime context.
-
-        Args:
-            obj: The OpenAILLMContext instance to upgrade.
-
-        Returns:
-            The upgraded OpenAIRealtimeLLMContext instance.
-        """
-        if isinstance(obj, OpenAILLMContext) and not isinstance(obj, OpenAIRealtimeLLMContext):
-            obj.__class__ = OpenAIRealtimeLLMContext
-            obj.__setup_local()
-        return obj
-
-    # todo
-    #   - finish implementing all frames
-
-    def from_standard_message(self, message):
-        """Convert a standard message format to a realtime conversation item.
-
-        Args:
-            message: The standard message dictionary to convert.
-
-        Returns:
-            A ConversationItem instance for the realtime API.
-        """
-        if message.get("role") == "user":
-            content = message.get("content")
-            if isinstance(message.get("content"), list):
-                content = ""
-                for c in message.get("content"):
-                    if c.get("type") == "text":
-                        content += " " + c.get("text")
-                    else:
-                        logger.error(
-                            f"Unhandled content type in context message: {c.get('type')} - {message}"
-                        )
-            return events.ConversationItem(
-                role="user",
-                type="message",
-                content=[events.ItemContent(type="input_text", text=content)],
-            )
-        if message.get("role") == "assistant" and message.get("tool_calls"):
-            tc = message.get("tool_calls")[0]
-            return events.ConversationItem(
-                type="function_call",
-                call_id=tc["id"],
-                name=tc["function"]["name"],
-                arguments=tc["function"]["arguments"],
-            )
-        logger.error(f"Unhandled message type in from_standard_message: {message}")
-
-    def get_messages_for_initializing_history(self):
-        """Get conversation items for initializing the realtime session history.
-
-        Converts the context's messages to a format suitable for the realtime API,
-        handling system instructions and conversation history packaging.
-
-        Returns:
-            List of conversation items for session initialization.
-        """
-        # We can't load a long conversation history into the openai realtime api yet. (The API/model
-        # forgets that it can do audio, if you do a series of `conversation.item.create` calls.) So
-        # our general strategy until this is fixed is just to put everything into a first "user"
-        # message as a single input.
-        if not self.messages:
-            return []
-
-        messages = copy.deepcopy(self.messages)
-
-        # If we have a "system" message as our first message, let's pull that out into session
-        # "instructions"
-        if messages[0].get("role") == "system":
-            self.llm_needs_settings_update = True
-            system = messages.pop(0)
-            content = system.get("content")
-            if isinstance(content, str):
-                self._session_instructions = content
-            elif isinstance(content, list):
-                self._session_instructions = content[0].get("text")
-            if not messages:
-                return []
-
-        # If we have just a single "user" item, we can just send it normally
-        if len(messages) == 1 and messages[0].get("role") == "user":
-            return [self.from_standard_message(messages[0])]
-
-        # Otherwise, let's pack everything into a single "user" message with a bit of
-        # explanation for the LLM
-        intro_text = """
-        This is a previously saved conversation. Please treat this conversation history as a
-        starting point for the current conversation."""
-
-        trailing_text = """
-        This is the end of the previously saved conversation. Please continue the conversation
-        from here. If the last message is a user instruction or question, act on that instruction
-        or answer the question. If the last message is an assistant response, simple say that you
-        are ready to continue the conversation."""
-
-        return [
-            {
-                "role": "user",
-                "type": "message",
-                "content": [
-                    {
-                        "type": "input_text",
-                        "text": "\n\n".join(
-                            [intro_text, json.dumps(messages, indent=2), trailing_text]
-                        ),
-                    }
-                ],
-            }
-        ]
-
-    def add_user_content_item_as_message(self, item):
-        """Add a user content item as a standard message to the context.
-
-        Args:
-            item: The conversation item to add as a user message.
-        """
-        message = {
-            "role": "user",
-            "content": [{"type": "text", "text": item.content[0].transcript}],
-        }
-        self.add_message(message)
-
-
-class OpenAIRealtimeUserContextAggregator(OpenAIUserContextAggregator):
-    """User context aggregator for OpenAI Realtime API.
-
-    Handles user input frames and generates appropriate context updates
-    for the realtime conversation, including message updates and tool settings.
-
-    Args:
-        context: The OpenAI realtime LLM context.
-        **kwargs: Additional arguments passed to parent aggregator.
-    """
-
-    async def process_frame(
-        self, frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM
-    ):
-        """Process incoming frames and handle realtime-specific frame types.
-
-        Args:
-            frame: The frame to process.
-            direction: The direction of frame flow in the pipeline.
-        """
-        await super().process_frame(frame, direction)
-        # Parent does not push LLMMessagesUpdateFrame. This ensures that in a typical pipeline,
-        # messages are only processed by the user context aggregator, which is generally what we want. But
-        # we also need to send new messages over the websocket, so the openai realtime API has them
-        # in its context.
-        if isinstance(frame, LLMMessagesUpdateFrame):
-            await self.push_frame(RealtimeMessagesUpdateFrame(context=self._context))
-
-        # Parent also doesn't push the LLMSetToolsFrame.
-        if isinstance(frame, LLMSetToolsFrame):
-            await self.push_frame(frame, direction)
-
-    async def push_aggregation(self):
-        """Push user input aggregation.
-
-        Currently ignores all user input coming into the pipeline as realtime
-        audio input is handled directly by the service.
-        """
-        # for the moment, ignore all user input coming into the pipeline.
-        # todo: think about whether/how to fix this to allow for text input from
-        #       upstream (transport/transcription, or other sources)
-        pass
-
-
-class OpenAIRealtimeAssistantContextAggregator(OpenAIAssistantContextAggregator):
-    """Assistant context aggregator for OpenAI Realtime API.
-
-    Handles assistant output frames from the realtime service, filtering
-    out duplicate text frames and managing function call results.
-
-    Args:
-        context: The OpenAI realtime LLM context.
-        **kwargs: Additional arguments passed to parent aggregator.
-    """
-
-    # The LLMAssistantContextAggregator uses TextFrames to aggregate the LLM output,
-    # but the OpenAIRealtimeLLMService pushes LLMTextFrames and TTSTextFrames. We
-    # need to override this proces_frame for LLMTextFrame, so that only the TTSTextFrames
-    # are process. This ensures that the context gets only one set of messages.
-    # OpenAIRealtimeLLMService also pushes TranscriptionFrames and InterimTranscriptionFrames,
-    # so we need to ignore pushing those as well, as they're also TextFrames.
-    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        """Process assistant frames, filtering out duplicate text content.
-
-        Args:
-            frame: The frame to process.
-            direction: The direction of frame flow in the pipeline.
-        """
-        if not isinstance(frame, (LLMTextFrame, TranscriptionFrame, InterimTranscriptionFrame)):
-            await super().process_frame(frame, direction)
-
-    async def handle_function_call_result(self, frame: FunctionCallResultFrame):
-        """Handle function call result and notify the realtime service.
-
-        Args:
-            frame: The function call result frame to handle.
-        """
-        await super().handle_function_call_result(frame)
-
-        # The standard function callback code path pushes the FunctionCallResultFrame from the llm itself,
-        # so we didn't have a chance to add the result to the openai realtime api context. Let's push a
-        # special frame to do that.
-        await self.push_frame(
-            RealtimeFunctionCallResultFrame(result_frame=frame), FrameDirection.UPSTREAM
-        )
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Types in pipecat.services.openai_realtime.context are deprecated. "
+        "Please use the equivalent types from "
+        "pipecat.services.openai.realtime.context instead.",
+        DeprecationWarning,
+        stacklevel=2,
+    )
--- a/src/pipecat/services/openai_realtime/events.py
+++ b/src/pipecat/services/openai_realtime/events.py
--- a/src/pipecat/services/openai_realtime/frames.py
+++ b/src/pipecat/services/openai_realtime/frames.py
@@ -1,37 +1,21 @@
 #
-# Copyright (c) 2024–2025, Daily
+# Copyright (c) 2025, Daily
 #
 # SPDX-License-Identifier: BSD 2-Clause License
 #

 """Custom frame types for OpenAI Realtime API integration."""

-from dataclasses import dataclass
-from typing import TYPE_CHECKING
+import warnings

-from pipecat.frames.frames import DataFrame, FunctionCallResultFrame
+from pipecat.services.openai.realtime.frames import *

-if TYPE_CHECKING:
-    from pipecat.services.openai_realtime.context import OpenAIRealtimeLLMContext
-
-
-@dataclass
-class RealtimeMessagesUpdateFrame(DataFrame):
-    """Frame indicating that the realtime context messages have been updated.
-
-    Parameters:
-        context: The updated OpenAI realtime LLM context.
-    """
-
-    context: "OpenAIRealtimeLLMContext"
-
-
-@dataclass
-class RealtimeFunctionCallResultFrame(DataFrame):
-    """Frame containing function call results for the realtime service.
-
-    Parameters:
-        result_frame: The function call result frame to send to the realtime API.
-    """
-
-    result_frame: FunctionCallResultFrame
+with warnings.catch_warnings():
+    warnings.simplefilter("always")
+    warnings.warn(
+        "Types in pipecat.services.openai_realtime.frames are deprecated. "
+        "Please use the equivalent types from "
+        "pipecat.services.openai.realtime.frames instead.",
+        DeprecationWarning,
+        stacklevel=2,
+    )
--- a/src/pipecat/services/piper/tts.py
+++ b/src/pipecat/services/piper/tts.py
@@ -14,7 +14,6 @@ from loguru import logger
 from pipecat.frames.frames import (
    ErrorFrame,
    Frame,
-    TTSAudioRawFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
 )
@@ -99,16 +98,15 @@ class PiperTTSService(TTSService):

                await self.start_tts_usage_metrics(text)

+                yield TTSStartedFrame()
+
                CHUNK_SIZE = self.chunk_size

-                yield TTSStartedFrame()
-                async for chunk in response.content.iter_chunked(CHUNK_SIZE):
-                    # remove wav header if present
-                    if chunk.startswith(b"RIFF"):
-                        chunk = chunk[44:]
-                    if len(chunk) > 0:
-                        await self.stop_ttfb_metrics()
-                        yield TTSAudioRawFrame(chunk, self.sample_rate, 1)
+                async for frame in self._stream_audio_frames_from_iterator(
+                    response.content.iter_chunked(CHUNK_SIZE), strip_wav_header=True
+                ):
+                    await self.stop_ttfb_metrics()
+                    yield frame
        except Exception as e:
            logger.error(f"Error in run_tts: {e}")
            yield ErrorFrame(error=str(e))
--- a/src/pipecat/services/rime/tts.py
+++ b/src/pipecat/services/rime/tts.py
@@ -553,15 +553,13 @@ class RimeHttpTTSService(TTSService):

                CHUNK_SIZE = self.chunk_size

-                async for chunk in response.content.iter_chunked(CHUNK_SIZE):
-                    if need_to_strip_wav_header and chunk.startswith(b"RIFF"):
-                        chunk = chunk[44:]
-                        need_to_strip_wav_header = False
+                async for frame in self._stream_audio_frames_from_iterator(
+                    response.content.iter_chunked(CHUNK_SIZE),
+                    strip_wav_header=need_to_strip_wav_header,
+                ):
+                    await self.stop_ttfb_metrics()
+                    yield frame

-                    if len(chunk) > 0:
-                        await self.stop_ttfb_metrics()
-                        frame = TTSAudioRawFrame(chunk, self.sample_rate, 1)
-                        yield frame
        except Exception as e:
            logger.exception(f"Error generating TTS: {e}")
            yield ErrorFrame(error=f"Rime TTS error: {str(e)}")
--- a/src/pipecat/services/tts_service.py
+++ b/src/pipecat/services/tts_service.py
@@ -8,7 +8,17 @@

 import asyncio
 from abc import abstractmethod
-from typing import Any, AsyncGenerator, Dict, List, Mapping, Optional, Sequence, Tuple
+from typing import (
+    Any,
+    AsyncGenerator,
+    AsyncIterator,
+    Dict,
+    List,
+    Mapping,
+    Optional,
+    Sequence,
+    Tuple,
+)

 from loguru import logger

@@ -374,6 +384,36 @@ class TTSService(AIService):
        ):
            await self._stop_frame_queue.put(frame)

+    async def _stream_audio_frames_from_iterator(
+        self, iterator: AsyncIterator[bytes], *, strip_wav_header: bool
+    ) -> AsyncGenerator[Frame, None]:
+        buffer = bytearray()
+        need_to_strip_wav_header = strip_wav_header
+        async for chunk in iterator:
+            if need_to_strip_wav_header and chunk.startswith(b"RIFF"):
+                chunk = chunk[44:]
+                need_to_strip_wav_header = False
+
+            # Append to current buffer.
+            buffer.extend(chunk)
+
+            # Round to nearest even number.
+            aligned_length = len(buffer) & ~1  # 111111111...11110
+            if aligned_length > 0:
+                aligned_chunk = buffer[:aligned_length]
+                buffer = buffer[aligned_length:]  # keep any leftover byte
+
+                if len(aligned_chunk) > 0:
+                    frame = TTSAudioRawFrame(bytes(aligned_chunk), self.sample_rate, 1)
+                    yield frame
+
+        if len(buffer) > 0:
+            # Make sure we don't need an extra padding byte.
+            if len(buffer) % 2 == 1:
+                buffer.extend(b"\x00")
+            frame = TTSAudioRawFrame(bytes(buffer), self.sample_rate, 1)
+            yield frame
+
    async def _handle_interruption(self, frame: InterruptionFrame, direction: FrameDirection):
        self._processing_text = False
        await self._text_aggregator.handle_interruption()
--- a/src/pipecat/utils/tracing/service_decorators.py
+++ b/src/pipecat/utils/tracing/service_decorators.py
@@ -651,9 +651,9 @@ def traced_gemini_live(operation: str) -> Callable:

                        elif operation == "llm_tool_call" and args:
                            # Extract tool call information
-                            evt = args[0] if args else None
-                            if evt and hasattr(evt, "toolCall") and evt.toolCall.functionCalls:
-                                function_calls = evt.toolCall.functionCalls
+                            msg = args[0] if args else None
+                            if msg and hasattr(msg, "tool_call") and msg.tool_call.function_calls:
+                                function_calls = msg.tool_call.function_calls
                                if function_calls:
                                    # Add information about the first function call
                                    call = function_calls[0]
@@ -722,19 +722,19 @@ def traced_gemini_live(operation: str) -> Callable:

                        elif operation == "llm_response" and args:
                            # Extract usage and response metadata from turn complete event
-                            evt = args[0] if args else None
-                            if evt and hasattr(evt, "usageMetadata") and evt.usageMetadata:
-                                usage = evt.usageMetadata
+                            msg = args[0] if args else None
+                            if msg and hasattr(msg, "usage_metadata") and msg.usage_metadata:
+                                usage = msg.usage_metadata

                                # Token usage - basic attributes for span visibility
-                                if hasattr(usage, "promptTokenCount"):
-                                    operation_attrs["tokens.prompt"] = usage.promptTokenCount or 0
-                                if hasattr(usage, "responseTokenCount"):
+                                if hasattr(usage, "prompt_token_count"):
+                                    operation_attrs["tokens.prompt"] = usage.prompt_token_count or 0
+                                if hasattr(usage, "response_token_count"):
                                    operation_attrs["tokens.completion"] = (
-                                        usage.responseTokenCount or 0
+                                        usage.response_token_count or 0
                                    )
-                                if hasattr(usage, "totalTokenCount"):
-                                    operation_attrs["tokens.total"] = usage.totalTokenCount or 0
+                                if hasattr(usage, "total_token_count"):
+                                    operation_attrs["tokens.total"] = usage.total_token_count or 0

                            # Get output text and modality from service state
                            text = getattr(self, "_bot_text_buffer", "")
@@ -751,9 +751,9 @@ def traced_gemini_live(operation: str) -> Callable:

                            # Add turn completion status
                            if (
-                                evt
-                                and hasattr(evt, "serverContent")
-                                and evt.serverContent.turnComplete
+                                msg
+                                and hasattr(msg, "server_content")
+                                and msg.server_content.turn_complete
                            ):
                                operation_attrs["turn_complete"] = True

@@ -772,16 +772,16 @@ def traced_gemini_live(operation: str) -> Callable:

                        # For llm_response operation, also handle token usage metrics
                        if operation == "llm_response" and hasattr(self, "start_llm_usage_metrics"):
-                            evt = args[0] if args else None
-                            if evt and hasattr(evt, "usageMetadata") and evt.usageMetadata:
-                                usage = evt.usageMetadata
+                            msg = args[0] if args else None
+                            if msg and hasattr(msg, "usage_metadata") and msg.usage_metadata:
+                                usage = msg.usage_metadata
                                # Create LLMTokenUsage object
                                from pipecat.metrics.metrics import LLMTokenUsage

                                tokens = LLMTokenUsage(
-                                    prompt_tokens=usage.promptTokenCount or 0,
-                                    completion_tokens=usage.responseTokenCount or 0,
-                                    total_tokens=usage.totalTokenCount or 0,
+                                    prompt_tokens=usage.prompt_token_count or 0,
+                                    completion_tokens=usage.response_token_count or 0,
+                                    total_tokens=usage.total_token_count or 0,
                                )
                                _add_token_usage_to_span(current_span, tokens)

--- a/tests/test_pipeline.py
+++ b/tests/test_pipeline.py
@@ -11,6 +11,7 @@ import unittest
 from pipecat.frames.frames import (
    CancelFrame,
    EndFrame,
+    ErrorFrame,
    Frame,
    HeartbeatFrame,
    InputAudioRawFrame,
@@ -450,3 +451,34 @@ class TestPipelineTask(unittest.IsolatedAsyncioTestCase):
            await task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))
        except asyncio.CancelledError:
            assert cancelled
+
+    async def test_task_error(self):
+        class ErrorProcessor(FrameProcessor):
+            def __init__(self, **kwargs):
+                super().__init__(**kwargs)
+
+            async def process_frame(self, frame: Frame, direction: FrameDirection):
+                await super().process_frame(frame, direction)
+
+                if isinstance(frame, TextFrame):
+                    await self.push_error(ErrorFrame("Boo!"))
+
+                await self.push_frame(frame, direction)
+
+        error_received = False
+
+        pipeline = Pipeline([ErrorProcessor()])
+        task = PipelineTask(pipeline)
+
+        @task.event_handler("on_pipeline_error")
+        async def on_pipeline_error(task: PipelineTask, frame: ErrorFrame):
+            nonlocal error_received
+            error_received = True
+            await task.cancel()
+
+        await task.queue_frame(TextFrame(text="Hello from Pipecat!"))
+
+        try:
+            await task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))
+        except asyncio.CancelledError:
+            assert error_received
Author	SHA1	Message	Date
James Hush	cb6e86e69f	docs: add a demo showing how to track usage	2025-10-16 13:45:42 +08:00
Aleix Conchillo Flaqué	fce6f55ddb	Merge pull request #2844 from pipecat-ai/aleix/runner-files-path runner: allow subdirectories in --folder	2025-10-14 08:38:38 -07:00
Aleix Conchillo Flaqué	d9580f72a9	runner: allow subdirectories in --folder	2025-10-13 18:29:19 -07:00
Aleix Conchillo Flaqué	0588c82bbf	Merge pull request #2838 from makosst/manta_graph_readme Added Manta Graph to README	2025-10-11 09:31:21 -07:00
makosst	16e9093d5a	Added Manta Graph to README	2025-10-11 09:20:17 -07:00
Aleix Conchillo Flaqué	91a5d580fd	Merge pull request #2835 from pipecat-ai/aleix/tts-http-aligned-audio-frames tts: fix RimeHttpTTSService/PiperTTSService 16-bit audio frames alignment	2025-10-10 14:20:44 -07:00
Aleix Conchillo Flaqué	0473556992	tts: fix RimeHttpTTSService/PiperTTSService 16-bit audio frames alignment	2025-10-10 14:19:22 -07:00
Aleix Conchillo Flaqué	fdaa4e476e	Merge pull request #2830 from pipecat-ai/aleix/pipecat-0.0.90 update CHANGELOG for 0.0.90	2025-10-10 10:22:26 -07:00
Aleix Conchillo Flaqué	502e7e42a7	update CHANGELOG for 0.0.90	2025-10-10 10:20:19 -07:00
kompfner	2ab3d4fb42	Merge pull request #2834 from pipecat-ai/pk/update-vertex-disclaimer Update a Google Vertex disclaimer for accuracy	2025-10-10 13:19:51 -04:00
Paul Kompfner	55014bdd77	Update a Google Vertex disclaimer for accuracy	2025-10-10 13:18:03 -04:00
kompfner	334796bd65	Merge pull request #2833 from pipecat-ai/pk/vertex-non-optional-location `location` should not be optional when using Google Vertex.	2025-10-10 13:02:40 -04:00
Paul Kompfner	1c25b6fb72	`location` should not be optional when using Google Vertex. Also, update `GoogleVertexLLMService` initialization pattern in the example file.	2025-10-10 12:58:45 -04:00
Mark Backman	91b29de7ca	Merge pull request #2832 from pipecat-ai/mb/docs-fixes-0.0.90 Docs fixes for 0.0.90 release	2025-10-10 12:46:40 -04:00
Mark Backman	21d610cd30	Docs fixes for 0.0.90 release	2025-10-10 12:43:31 -04:00
Mark Backman	f7fe673ad1	Merge pull request #2831 from pipecat-ai/mb/update-evals Update release evals for OpenAI Realtime, Gemini Live Vertex; shorten…	2025-10-10 12:34:27 -04:00
Mark Backman	4b415721e2	Update release evals for OpenAI Realtime, Gemini Live Vertex; shorten 26 foundational names	2025-10-10 12:26:23 -04:00
kompfner	8d2a98e0e7	Merge pull request #2829 from pipecat-ai/pk/fix-gemini-live-deprecation-messages Fix deprecation messages pointing users to the new import paths for G…	2025-10-10 10:42:15 -04:00
Paul Kompfner	523e890c8c	Fix deprecation messages pointing users to the new import paths for Gemini Live	2025-10-10 10:30:38 -04:00
kompfner	3c748fe772	Merge pull request #2823 from pipecat-ai/pk/vertex-init-args-fixup Move `location` and `project_id` out of `InputParams` in `GoogleVerte…	2025-10-10 10:18:51 -04:00
kompfner	d293cee372	Merge pull request #2822 from pipecat-ai/pk/make-pause-processing-frames-more-robust Make `pause_processing_frames()` and `pause_processing_system_frames(…	2025-10-10 10:16:27 -04:00
Paul Kompfner	8b62a96878	Improve how we're deprecating `location` and `project_id` in `GoogleVertexLLMService`, allowing user code to (correctly) continue referring to `GoogleVertexLLMService.InputParams`. Also fix the slightly wrong (but so far harmless) pattern of initializing `OpenAILLMService.InputParams()` in the `GoogleVertexLLMService` if `params` wasn't provided—we should be letting the superclass decide what to do if the argument isn't specified.	2025-10-10 10:12:00 -04:00
Mark Backman	0c102ce70b	Merge pull request #2826 from pipecat-ai/mb/deprecate-livekit-frame-serializer Deprecate LivekitFrameSerializer	2025-10-10 10:01:45 -04:00
Mark Backman	3894d2a4b9	Deprecate LivekitFrameSerializer	2025-10-10 09:51:57 -04:00
Aleix Conchillo Flaqué	1f6b61c0db	Merge pull request #2828 from pipecat-ai/aleix/gemini-live-gemini-to-llm google: rename google.gemini_live.gemini to google.gemini_live.llm	2025-10-10 06:42:51 -07:00
Aleix Conchillo Flaqué	8ee28b37cd	google: rename google.gemini_live.vertext to google.gemini_live.llm_vertex	2025-10-10 06:41:19 -07:00
Filipi da Silva Fuchter	e85e7e4d84	Merge pull request #2773 from pipecat-ai/filipi/krisp_viva Added audio filter `KrispVivaFilter` using the Krisp VIVA SDK.	2025-10-10 09:51:15 -03:00
Filipi Fuchter	1b3afb5511	Added audio filter KrispVivaFilter using the Krisp VIVA SDK	2025-10-10 09:44:47 -03:00
Aleix Conchillo Flaqué	7cec013666	google: rename google.gemini_live.gemini to google.gemini_live.llm	2025-10-09 22:20:09 -07:00
Aleix Conchillo Flaqué	86127167fb	Merge pull request #2827 from pipecat-ai/aleix/openai-realtime-move move openai_realtime to openai.realtime	2025-10-09 22:18:04 -07:00
Aleix Conchillo Flaqué	9935a68018	examples(19b): fix deprecations	2025-10-09 22:14:52 -07:00
Aleix Conchillo Flaqué	5679dde70f	ai_service: use openai.realtime.events instead of openai_realtime_beta.events	2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué	d81b0f6368	update CHANGELOG with openai_realtime deprecation	2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué	9698b008da	deprecate openai_realtime	2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué	7b05c9283b	move openai.realtime.azure to azure.realtime.llm	2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué	303dd2ec35	move openai.realtime.openai to openai.realtime.llm	2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué	aa6e81648a	move openai_realtime to openai.realtime	2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué	1a87870ef3	Merge pull request #2825 from pipecat-ai/aleix/aws-nova-sonic-move move aws_nova_sonic to aws.nova_sonic	2025-10-09 18:37:46 -07:00
Aleix Conchillo Flaqué	aac4ce2d12	update CHANGELOG with aws_nova_sonic deprecation	2025-10-09 18:32:26 -07:00
Aleix Conchillo Flaqué	2a79b2c853	aws: deprecate aws_nova_sonic	2025-10-09 17:44:29 -07:00
Aleix Conchillo Flaqué	15bf5b1533	aws: move aws_nova_sonic to aws.nova_sonic	2025-10-09 17:35:47 -07:00
Aleix Conchillo Flaqué	cdc86db8ce	update CHANGELOG with GoogleVertexLLMService token fix	2025-10-09 16:58:22 -07:00
Aleix Conchillo Flaqué	9d2ad750b5	Merge pull request #2779 from LucasStringPay/patch-1 Ignore None value for 'completion_tokens' or similar for Gemini	2025-10-09 16:55:33 -07:00
Aleix Conchillo Flaqué	19ceb1a48f	Merge pull request #2817 from pipecat-ai/aleix/runner-download-folder runner: add --folder argument to allow file downloads	2025-10-09 16:55:17 -07:00
Aleix Conchillo Flaqué	59217eae38	runner: add --folder argument to allow file downloads	2025-10-09 16:49:51 -07:00
Aleix Conchillo Flaqué	bea0aee835	Merge pull request #2824 from pipecat-ai/aleix/gemini-under-google google: move gemini_live inside google service	2025-10-09 16:40:15 -07:00
Aleix Conchillo Flaqué	aeace9b9be	google: move gemini_live inside google service	2025-10-09 16:06:42 -07:00
Paul Kompfner	2994640f47	Move `location` and `project_id` out of `InputParams` in `GoogleVertexLLMService`, making them top-level init args instead. We do this for two reasons: - Conceptually, these args comprise project-level setup, akin to credentials, whereas everything in `InputParams` is concerned with model configuration - Providing a `project_id` when initializing `GoogleVertexLLMService` should not be optional, but prior to the change in this commit it was (erroneously) treated as optional by dint of `InputParams` being optional This improvement was discussed [in this comment](https://github.com/pipecat-ai/pipecat/pull/2795#discussion_r2408279142).	2025-10-09 16:53:21 -04:00
Paul Kompfner	10069719e4	Make `pause_processing_frames()` and `pause_processing_system_frames()` more robust in `FrameProcessor`. To understand this fix, let's look exclusively at `pause_processing_frames()` (`pause_processing_system_frames()` works the same way). `pause_processing_frames()` works by setting a `__should_block_frames` flag, which is then read each time through the loop in the long-running `__process_frame_task_handler`. if `__should_block_frames` is `True`, it pauses processing frames until it's resumed. Prior to this fix, the check for `__should_block_frames` was before `await self.__process_queue.get()`. The problem is that a lot of the time spent in the loop is waiting for a frame from the process queue. So if `pause_processing_frames()` is set at any time other than within `process_frame()` itself, it actually won't have an effect by the next frame, only on the frame after the next, which is later than intended. Because thus far in the Pipecat codebase we've only ever called `pause_processing_frames()` and `pause_processing_system_frames()` from within `process_frame()`, this change should have no behavioral effect. But it will be helpful if we ever need to call it from anywhere else. I noticed this issue while developing a feature that did exactly that (though I later abandoned that code).	2025-10-09 15:57:31 -04:00
kompfner	046b76df60	Merge pull request #2820 from pipecat-ai/pk/gemini-live-vertex-support Support Gemini Live + Vertex AI	2025-10-09 11:53:41 -04:00
Paul Kompfner	f2d9063984	Renames: remove "multimodal" from Gemini Live types	2025-10-09 10:58:36 -04:00
Paul Kompfner	99f008e927	Make a note in our examples that there's an issue with Gemini Live + Vertex around specifying a modality other than AUDIO	2025-10-08 21:03:07 -04:00
Paul Kompfner	2699f0c2a6	Fix tool calls when using Gemini Live + Vertex AI	2025-10-08 21:03:07 -04:00
Paul Kompfner	0b6dd98000	Make a note in our examples that there's an issue with Gemini Live + Vertex around using "google_search" alongside other tools	2025-10-08 21:03:07 -04:00
Paul Kompfner	a14fb20d15	Fix Gemini Live w/Vertex AI not being able to handle an empty list provided for "function_declarations"	2025-10-08 21:03:07 -04:00
Paul Kompfner	728361a6a7	Add `GeminiVertexMultimodalLiveLLMService`	2025-10-08 21:03:01 -04:00
kompfner	106db69e8e	Merge pull request #2816 from pipecat-ai/pk/gemini-live-await-ongoing-response-after-endframe Implement ending `GeminiMultimodalLiveLLMService` gracefully (i.e. af…	2025-10-08 17:20:14 -04:00
Paul Kompfner	cf90071926	Format fix	2025-10-08 17:19:46 -04:00
Paul Kompfner	deaeb75a1f	Fix changelog after rebase (and add a missing item)	2025-10-08 17:16:31 -04:00
Paul Kompfner	a666327d70	Implement ending `GeminiMultimodalLiveLLMService` gracefully (i.e. after the bot is finished)	2025-10-08 17:13:04 -04:00
kompfner	13a0522546	Merge pull request #2804 from pipecat-ai/pk/gemini-live-session-resumption Add (relatively spartan) reconnection logic to `GeminiMultimodalLiveLLMService`	2025-10-08 17:10:45 -04:00
Paul Kompfner	7da37a0d1f	Pull `_connection_established_threshold` and `_max_consecutive_failures` into file-level constants	2025-10-08 17:04:05 -04:00
Paul Kompfner	7efb22a323	Add (relatively spartan) reconnection logic to `GeminiMultimodalLiveLLMService`, leveraging the Gemini Live session resumption mechanism	2025-10-08 16:53:21 -04:00
kompfner	8084e2f909	Merge pull request #2776 from pipecat-ai/pk/gemini-live-gen-ai-library Gemini Live service uses the `genai` library rather than WebSockets directly	2025-10-08 16:50:16 -04:00
Paul Kompfner	86127c6a6e	Add to the changelog the `GeminiMultimodalLiveLLMService` update to use `google-genai`	2025-10-08 16:46:41 -04:00
Paul Kompfner	402e019ae2	Make a bit of code clearer	2025-10-08 16:45:55 -04:00
Paul Kompfner	f09e4e238b	Fix some mishandling of enum values	2025-10-08 16:45:55 -04:00
Paul Kompfner	2921162b3b	Add deprecation warning around importing StartSensitivity and EndSensitivity from pipecat.services.gemini_multimodal_live.events	2025-10-08 16:45:55 -04:00
Paul Kompfner	ac1582c906	Let users directly use `google-genai` types rather than aliased re-exported types	2025-10-08 16:45:55 -04:00
Paul Kompfner	e4b01a5844	Bumping deprecation version of `GeminiMultimodalLiveLLMService`'s `base_url` arg	2025-10-08 16:45:55 -04:00
Paul Kompfner	fa663abbbc	Add CHANGELOG entry for new `GeminiMultimodalLiveLLMService` configuration options	2025-10-08 16:45:55 -04:00
Paul Kompfner	d19e6111c3	Bumping deprecation version of `GeminiMultimodalLiveLLMService`'s `base_url` arg	2025-10-08 16:45:55 -04:00
Paul Kompfner	8a6d504a7e	Add `enable_affective_dialog` and `proactivity` settings to `GeminiMultimodalLiveLLMService`	2025-10-08 16:45:55 -04:00
Paul Kompfner	43915937f2	Update how we import and re-export some types in `GeminiMultimodalLiveLLMService`	2025-10-08 16:45:55 -04:00
Paul Kompfner	48e92a22fe	Add thinking settings to `GeminiMultimodalLiveLLMService`	2025-10-08 16:45:55 -04:00
Paul Kompfner	566af6b0b8	Minor comment improvement	2025-10-08 16:45:55 -04:00
Paul Kompfner	12e7613d5f	Deprecate the `base_url` argument to `GeminiMultimodalLiveLLMService`. It expected a WebSocket URL, but we're no longer (directly) using WebSockets to talk to Gemini. Instead of trying to (potentially erroneously) map a given custom WebSocket URL to an `HttpOptions` object (the new preferred way of customizing requests made by the Gemini API client), we're simply deprecating `base_url` and pointing users to the `http_options` argument instead.	2025-10-08 16:45:55 -04:00
Paul Kompfner	04a68f2c57	Fix tracing in `GeminiMultimodalLiveLLMService`	2025-10-08 16:45:55 -04:00
Paul Kompfner	9b4ca12f49	Revert to only supporting providing a single modality - looks like specifying a list of modalities results in an API error. Also, fix some missing `await`s in error handling.	2025-10-08 16:45:55 -04:00
Paul Kompfner	453ce715a6	Add some error handling to `GeminiMultimodalLiveLLMService`	2025-10-08 16:45:55 -04:00
Paul Kompfner	d87b6189ba	Update `GeminiMultimodalLiveLLMService` to use the `google-genai` library, which is the new recommended approach for interfacing with Gemini Live.	2025-10-08 16:45:55 -04:00
Mark Backman	8293347b77	Merge pull request #2814 from pipecat-ai/mb/openai-service-tier Add service_tier to BaseOpenAILLMService	2025-10-08 16:44:27 -04:00
Mark Backman	c85a3f0b94	Add service_tier to BaseOpenAILLMService	2025-10-08 16:33:36 -04:00
Aleix Conchillo Flaqué	233fb25e6c	Merge pull request #2810 from pipecat-ai/aleix/on-pipeline-error PipelineTask: add on_pipeline_error event	2025-10-08 11:26:46 -07:00
Aleix Conchillo Flaqué	080978daa6	Merge pull request #2808 from pipecat-ai/aleix/readme-pipecat-tv README: add Pipecat TV reference	2025-10-08 11:26:17 -07:00
Aleix Conchillo Flaqué	62b7c3d3b2	PipelineTask: add on_pipeline_error event	2025-10-07 18:36:38 -07:00
Aleix Conchillo Flaqué	066b77fba0	README: add Pipecat TV reference	2025-10-07 15:01:28 -07:00
LucasStringPay	607e3040d4	Ignore None 'completion_tokens' or similar Similar as `144ea36c81` , reported in https://github.com/pipecat-ai/pipecat/issues/2207	2025-10-02 15:16:11 -07:00