Merge pull request #237 from pipecat-ai/aleix/pipecat-0.0.30

update CHANGELOG for 0.0.30
2024-06-14 05:28:14 +08:00 · 2024-06-13 14:27:37 -07:00 · 2024-06-14 05:24:52 +08:00 · 2024-06-13 13:30:31 -07:00 · 2024-06-13 13:30:11 -07:00 · 2024-06-13 13:30:00 -07:00
116 changed files with 4388 additions and 892 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,10 +5,162 @@ All notable changes to **pipecat** will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [Unreleased]
+## [0.0.30] - 2024-06-13

 ### Added

+- Added `report_only_initial_ttfb` to `PipelineParams`. This will make it so
+  only the initial TTFB metrics after the user stops talking are reported.
+
+- Added `OpenPipeLLMService`. This service will let you run OpenAI through
+  OpenPipe's SDK.
+
+- Allow specifying frame processors' name through a new `name` constructor
+  argument.
+
+### Changed
+
+- `FrameSerializer.deserialize()` can now return `None` in case it is not
+  possible to deserialize the given data.
+
+- `daily_rest.DailyRoomProperties` now allows extra unknown parameters.
+
+- Added `DeepgramSTTService`. This service has an ongoing websocket
+  connection. To handle this, it subclasses `AIService` instead of
+  `STTService`. The output of this service will be pushed from the same task,
+  except system frames like `StartFrame`, `CancelFrame` or
+  `StartInterruptionFrame`.
+
+### Fixed
+
+- Fixed an issue where `DailyRoomProperties.exp` always had the same old
+  timestamp unless set by the user.
+
+- Fixed a couple of issues with `WebsocketServerTransport`. It needed to use
+  `push_audio_frame()` and also VAD was not working properly.
+
+- Fixed an issue that would cause LLM aggregator to fail with small
+  `VADParams.stop_secs` values.
+
+- Fixed an issue where `BaseOutputTransport` would send longer audio frames
+  preventing interruptions.
+
+### Other
+
+- Added new `07h-interruptible-openpipe.py` example. This example shows how to
+  use OpenPipe to run OpenAI LLMs and get the logs stored in OpenPipe.
+
+- Added new `dialin-chatbot` example. This example shows how to call the bot
+  using a phone number.
+
+## [0.0.29] - 2024-06-07
+
+### Added
+
+- Added a new `FunctionFilter`. This filter will let you filter frames based on
+  a given function, except system messages which should never be filtered.
+
+- Added `FrameProcessor.can_generate_metrics()` method to indicate if a
+  processor can generate metrics. In the future this might get an extra argument
+  to ask for a specific type of metric.
+
+- Added `BasePipeline`. All pipeline classes should be based on this class. All
+  subclasses should implement a `processors_with_metrics()` method that returns
+  a list of all `FrameProcessor`s in the pipeline that can generate metrics.
+
+- Added `enable_metrics` to `PipelineParams`.
+
+- Added `MetricsFrame`. The `MetricsFrame` will report different metrics in the
+  system. Right now, it can report TTFB (Time To First Byte) values for
+  different services, that is the time spent between the arrival of a `Frame` to
+  the processor/service until the first `DataFrame` is pushed downstream. If
+  metrics are enabled, an initial `MetricsFrame` with all the services in the
+  pipeline will be sent.
+
+- Added TTFB metrics and debug logging for TTS services.
+
+### Changed
+
+- Moved `ParallelTask` to `pipecat.pipeline.parallel_task`.
+
+### Fixed
+
+- Fixed PlayHT TTS service to work properly async.
+
+## [0.0.28] - 2024-06-05
+
+### Fixed
+
+- Fixed an issue with `SileroVADAnalyzer` that would cause memory to keep
+  growing indefinitely.
+
+## [0.0.27] - 2024-06-05
+
+### Added
+
+- Added `DailyTransport.participants()` and `DailyTransport.participant_counts()`.
+
+## [0.0.26] - 2024-06-05
+
+### Added
+
+- Added `OpenAITTSService`.
+
+- Allow passing `output_format` and `model_id` to `CartesiaTTSService` to change
+  audio sample format and the model to use.
+
+- Added `DailyRESTHelper` which helps you create Daily rooms and tokens in an
+  easy way.
+
+- `PipelineTask` now has a `has_finished()` method to indicate if the task has
+  completed. If a task is never run, `has_finished()` will return `False`.
+
+- `PipelineRunner` now supports SIGTERM. If received, the runner will be
+  canceled.
+
+### Fixed
+
+- Fixed an issue where `BaseInputTransport` and `BaseOutputTransport` were
+  stopping push tasks before pushing `EndFrame` frames, which could cause the
+  bots to get stuck.
+
+- Fixed an error closing local audio transports.
+
+- Fixed an issue with Deepgram TTS that was introduced in the previous release.
+
+- Fixed `AnthropicLLMService` interruptions. If an interruption occurred, a
+  `user` message could be appended after the previous `user` message. Anthropic
+  does not allow that because it requires alternate `user` and `assistant`
+  messages.
+
+### Performance
+
+- The `BaseInputTransport` does not pull audio frames from sub-classes any
+  more. Instead, sub-classes now push audio frames into a queue in the base
+  class. Also, `DailyInputTransport` now pushes audio frames every 20ms instead
+  of 10ms.
+
+- Remove redundant camera input thread from `DailyInputTransport`. This should
+  improve performance a little bit when processing participant videos.
+
+- Load Cartesia voice on startup.
+
+## [0.0.25] - 2024-05-31
+
+### Added
+
+- Added `WebsocketServerTransport`. This will create a websocket server and will
+  read messages coming from a client. The messages are serialized/deserialized
+  with protobufs. See `examples/websocket-server` for a detailed example.
+
+- Added function calling (`LLMService.register_function()`). This will allow the
+  LLM to call functions you have registered when needed. For example, if you
+  register a function to get the weather in Los Angeles and ask the LLM about
+  the weather in Los Angeles, the LLM will call your function.
+  See https://platform.openai.com/docs/guides/function-calling
+
+- Added new `LangchainProcessor`.
+
 - Added Cartesia TTS support (https://cartesia.ai/)

 ### Fixed
@@ -18,6 +170,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Fixed an issue where `camera_out_enabled` would cause the highg CPU usage if
  no image was provided.

+### Performance
+
+- Removed unnecessary audio input tasks.

 ## [0.0.24] - 2024-05-29

--- a/2
+++ b/2
@@ -1,6 +1,6 @@
 BSD 2-Clause License

-Copyright (c) 2024, Kwindla Hultman Kramer
+Copyright (c) 2024, Daily

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
--- a/dev-requirements.txt
+++ b/dev-requirements.txt
@@ -1,5 +1,6 @@
 autopep8~=2.1.0
 build~=1.2.1
+grpcio-tools~=1.62.2
 pip-tools~=7.4.1
 pytest~=8.2.0
 setuptools~=69.5.1
--- a/dot-env.template
+++ b/dot-env.template
@@ -33,3 +33,6 @@ PLAY_HT_API_KEY=...

 # OpenAI
 OPENAI_API_KEY=...
+
+# OpenPipe
+OPENPIPE_API_KEY=...
--- a/examples/README.md
+++ b/examples/README.md
@@ -38,7 +38,8 @@ Next, follow the steps in the README for each demo.
 | [Storytelling Chatbot](storytelling-chatbot) | Stitches together multiple third-party services to create a collaborative storytime experience.                                            | Deepgram, ElevenLabs, Open AI, Fal, Daily, Custom UI  |
 | [Translation Chatbot](translation-chatbot)   | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI     |
 | [Moondream Chatbot](moondream-chatbot)       | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU**                                                       | Deepgram, OpenAI, Moondream, Daily, Daily Prebuilt UI |
-| Function-calling Chatbot (TBC)               | A chatbot that can call functions in response to user input                                                                                | Deepgram, OpenAI, Fireworks, Daily, Daily Prebuilt UI |
+| Function-calling Chatbot (TBC)               | A chatbot that can call functions in response to user input.                                                                                | Deepgram, OpenAI, Fireworks, Daily, Daily Prebuilt UI |
+| [Dialin Chatbot](dialin-chatbot)             | A chatbot that connects to an incoming phone call from Daily or Twilio.                                                                                | Deepgram, OpenAI, ElevenLabs, Daily, Twilio |

 > [!IMPORTANT]
 > These example projects use Daily as a WebRTC transport and can be joined using their hosted Prebuilt UI.
--- a/examples/dialin-chatbot/.dockerignore
+++ b/examples/dialin-chatbot/.dockerignore
@@ -0,0 +1,3 @@
+**/.DS_Store
+.env
+.env.*
--- a/examples/dialin-chatbot/.gitignore
+++ b/examples/dialin-chatbot/.gitignore
@@ -0,0 +1,165 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+#   in version control.
+#   https://pdm.fming.dev/#use-with-ide
+.pdm.toml
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+#  and can be added to the global gitignore or merged into this file.  For a more nuclear
+#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/
+runpod.toml
+
+# custom script to recursively upgrade items in requirements.py
+upgrade_requirements.py
+.DS_Store
--- a/examples/dialin-chatbot/Dockerfile
+++ b/examples/dialin-chatbot/Dockerfile
@@ -0,0 +1,40 @@
+FROM python:3.11-bullseye
+
+ARG DEBIAN_FRONTEND=noninteractive
+ARG USE_PERSISTENT_DATA
+ENV PYTHONUNBUFFERED=1
+# Expose FastAPI port
+ENV FAST_API_PORT=7860
+EXPOSE 7860
+
+# Install system dependencies
+RUN apt-get update && apt-get install --no-install-recommends -y \
+    build-essential \
+    git \
+    ffmpeg \
+    google-perftools \
+    ca-certificates curl gnupg \
+    && apt-get clean && rm -rf /var/lib/apt/lists/*
+
+# Set up a new user named "user" with user ID 1000
+RUN useradd -m -u 1000 user
+
+# Set home to the user's home directory
+ENV HOME=/home/user \
+    PATH=/home/user/.local/bin:$PATH \
+    PYTHONPATH=$HOME/app \
+    PYTHONUNBUFFERED=1
+
+# Switch to the "user" user
+USER user
+
+# Set the working directory to the user's home directory
+WORKDIR $HOME/app
+
+# Install Python dependencies
+COPY *.py .
+COPY ./requirements.txt requirements.txt
+RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
+
+# Start the FastAPI server
+CMD python3 bot_runner.py --host "0.0.0.0" --port ${FAST_API_PORT}
--- a/examples/dialin-chatbot/README.md
+++ b/examples/dialin-chatbot/README.md
@@ -0,0 +1,85 @@
+<div align="center">
+ <img alt="pipecat" width="300px" height="auto" src="image.png">
+</div>
+
+# Dialin example
+
+Example project that demonstrates how to add phone number dialin to your Pipecat bots. We include examples for both Daily (`bot_daily.py`) and Twilio (`bot_twilio.py`), depending on who you want to use as a phone vendor.
+
+- 🔁 Transport: Daily WebRTC
+- 💬 Speech-to-Text: Deepgram via Daily transport
+- 🤖 LLM: GPT-4o / OpenAI
+- 🔉 Text-to-Speech: ElevenLabs
+
+#### Should I use Daily or Twilio as a vendor?
+
+If you're starting from scratch, using Daily to provision phone numbers alongside Daily as a transport offers some convenience (such as automatic call forwarding).
+
+If you already have Twilio numbers and workflows that you want to connect to your Pipecat bots, some additional configuration is required (you'll need to create an `on_dialin_ready` event handler and use the Twilio client to trigger the forward).
+
+You can read more about this, and find walkthroughs for each vendor, in our docs.
+
+## Setup
+
+```shell
+# Install the requirements
+pip install -r requirements.txt
+
+# Set up your env
+mv env.example .env
+```
+
+## Using Daily numbers
+
+Run `bot_runner.py` to handle incoming HTTP requests:
+
+`python bot_runner.py --host localhost`
+
+Then target the following URL:
+
+`POST /daily_start_bot`
+
+For more configuration options, please consult Daily's API documentation.
+
+
+## Using Twilio numbers
+
+As above, but target the following URL:
+
+`POST /twilio_start_bot`
+
+For more configuration options, please consult Twilio's API documentation.
+
+## Deployment example
+
+A Dockerfile is included in this demo for convenience. Here is an example of how to build and deploy your bot to [fly.io](https://fly.io).
+
+*Please note: This demo spawns agents as subprocesses for convenience / demonstration purposes. You would likely not want to do this in production as it would limit concurrency to available system resources. For more information on how to deploy your bots using VMs, refer to the Pipecat documentation.*
+
+### Build the docker image
+
+`docker build -t tag:project .`
+
+### Launch the fly project
+
+`mv fly.example.toml fly.toml`
+
+`fly launch` (using the included fly.toml)
+
+### Set up your secrets on Fly
+
+Set the necessary secrets (found in `env.example`)
+
+`fly secrets set DAILY_API_KEY=... OPENAI_API_KEY=... ELEVENLABS_API_KEY=... ELEVENLABS_VOICE_ID=...`
+
+If you're using Twilio as a number vendor:
+
+`fly secrets set TWILIO_ACCOUNT_SID=... TWILIO_AUTH_TOKEN=...`
+
+### Deploy!
+
+`fly deploy`
+
+## Need to do something more advanced?
+
+This demo covers the basics of bot telephony. If you want to know more about working with PSTN / SIP, please ping us on [Discord](https://discord.gg/pipecat).
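For reference, the body that `POST /daily_start_bot` reads can be sketched with the standard library alone. This is a minimal sketch, assuming the runner listens on `localhost:7860` (the Dockerfile's default port) and that the `callId` and `callDomain` values are placeholders; `build_start_request` is a hypothetical helper, not part of the example.

```python
import json
import urllib.request


def build_start_request(call_id: str, call_domain: str,
                        base_url: str = "http://localhost:7860") -> urllib.request.Request:
    """Build (but do not send) the JSON POST that /daily_start_bot parses."""
    # bot_runner.py reads the "callId" and "callDomain" keys from the JSON body.
    body = json.dumps({"callId": call_id, "callDomain": call_domain}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/daily_start_bot",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; a real call supplies its own identifiers.
    req = build_start_request("example-call-id", "example-call-domain")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request with `urllib.request.urlopen(req)` should return the `room_url` and `sipUri` fields that `daily_start_bot` places in its JSON response.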
--- a/examples/dialin-chatbot/bot_daily.py
+++ b/examples/dialin-chatbot/bot_daily.py
@@ -0,0 +1,111 @@
+import asyncio
+import aiohttp
+import os
+import sys
+import argparse
+
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import LLMAssistantResponseAggregator, LLMUserResponseAggregator
+from pipecat.frames.frames import (
+    LLMMessagesFrame,
+    EndFrame
+)
+from pipecat.services.elevenlabs import ElevenLabsTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport, DailyDialinSettings
+from pipecat.vad.silero import SileroVADAnalyzer
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+daily_api_key = os.getenv("DAILY_API_KEY", "")
+daily_api_url = os.getenv("DAILY_API_URL", "https://api.daily.co/v1")
+
+
+async def main(room_url: str, token: str, callId: str, callDomain: str):
+    async with aiohttp.ClientSession() as session:
+        # dialin_settings are only needed if Daily's SIP URI is used.
+        # If you are handling this via Twilio, Telnyx, etc., set this to None
+        # and handle call forwarding when on_dialin_ready fires.
+        dialin_settings = DailyDialinSettings(
+            call_id=callId,
+            call_domain=callDomain
+        )
+
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Chatbot",
+            DailyParams(
+                api_url=daily_api_url,
+                api_key=daily_api_key,
+                dialin_settings=dialin_settings,
+                audio_in_enabled=True,
+                audio_out_enabled=True,
+                camera_out_enabled=False,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(),
+                transcription_enabled=True,
+            )
+        )
+
+        tts = ElevenLabsTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("ELEVENLABS_API_KEY", ""),
+            voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by saying 'Oh, hello! Who dares dial me at this hour?!'.",
+            },
+        ]
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),
+            tma_in,
+            llm,
+            tts,
+            transport.output(),
+            tma_out,
+        ])
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        @transport.event_handler("on_participant_left")
+        async def on_participant_left(transport, participant, reason):
+            await task.queue_frame(EndFrame())
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="Pipecat Simple ChatBot")
+    parser.add_argument("-u", type=str, help="Room URL")
+    parser.add_argument("-t", type=str, help="Token")
+    parser.add_argument("-i", type=str, help="Call ID")
+    parser.add_argument("-d", type=str, help="Call Domain")
+    config = parser.parse_args()
+
+    asyncio.run(main(config.u, config.t, config.i, config.d))
--- a/examples/dialin-chatbot/bot_runner.py
+++ b/examples/dialin-chatbot/bot_runner.py
@@ -0,0 +1,220 @@
+"""
+bot_runner.py
+
+HTTP service that listens for incoming calls from either Daily or Twilio,
+provisioning a room and starting a Pipecat bot in response.
+
+Refer to README for more information.
+"""
+import os
+import argparse
+import subprocess
+from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomObject, DailyRoomProperties, DailyRoomSipParams, DailyRoomParams
+from fastapi import FastAPI, Request, HTTPException
+from fastapi.middleware.cors import CORSMiddleware
+from fastapi.responses import JSONResponse, PlainTextResponse
+from twilio.twiml.voice_response import VoiceResponse
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+
+# ------------ Configuration ------------ #
+
+MAX_SESSION_TIME = 5 * 60  # 5 minutes
+REQUIRED_ENV_VARS = ['OPENAI_API_KEY', 'DAILY_API_KEY',
+                     'ELEVENLABS_API_KEY', 'ELEVENLABS_VOICE_ID']
+
+daily_rest_helper = DailyRESTHelper(
+    os.getenv("DAILY_API_KEY", ""),
+    os.getenv("DAILY_API_URL", 'https://api.daily.co/v1'))
+
+
+# ----------------- API ----------------- #
+
+app = FastAPI()
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"]
+)
+
+"""
+Create Daily room, tell the bot if the room is created for Twilio's SIP or Daily's SIP (vendor).
+When the vendor is Daily, the bot handles the call forwarding automatically,
+i.e, forwards the call from the "hold music state" to the Daily Room's SIP URI.
+
+Alternatively, when the vendor is Twilio (not Daily), the bot is responsible for
+updating the state on Twilio. So when `dialin-ready` fires, it takes appropriate
+action using the Twilio Client library.
+"""
+
+
+def _create_daily_room(room_url, callId, callDomain=None, vendor="daily"):
+    if not room_url:
+        params = DailyRoomParams(
+            properties=DailyRoomProperties(
+                # Note: these are the default values, except for the display name
+                sip=DailyRoomSipParams(
+                    display_name="dialin-user",
+                    video=False,
+                    sip_mode="dial-in",
+                    num_endpoints=1
+                )
+            )
+        )
+
+        print(f"Creating new room...")
+        room: DailyRoomObject = daily_rest_helper.create_room(params=params)
+
+    else:
+        # Check that the passed room URL exists (we assume it already has SIP set up!)
+        try:
+            print(f"Joining existing room: {room_url}")
+            room: DailyRoomObject = daily_rest_helper.get_room_from_url(
+                room_url)
+        except Exception:
+            raise HTTPException(
+                status_code=500, detail=f"Room not found: {room_url}")
+
+    print(f"Daily room: {room.url} {room.config.sip_endpoint}")
+
+    # Give the agent a token to join the session
+    token = daily_rest_helper.get_token(room.url, MAX_SESSION_TIME)
+
+    if not room or not token:
+        raise HTTPException(
+            status_code=500, detail="Failed to get room or token")
+
+    # Spawn a new agent, and join the user session
+    # Note: this is mostly for demonstration purposes (refer to 'deployment' in docs)
+    if vendor == "daily":
+        bot_proc = (f"python3 -m bot_daily -u {room.url} -t {token} "
+                    f"-i {callId} -d {callDomain}")
+    else:
+        bot_proc = (f"python3 -m bot_twilio -u {room.url} -t {token} "
+                    f"-i {callId} -s {room.config.sip_endpoint}")
+
+    try:
+        subprocess.Popen(
+            [bot_proc],
+            shell=True,
+            bufsize=1,
+            cwd=os.path.dirname(os.path.abspath(__file__))
+        )
+    except Exception as e:
+        raise HTTPException(
+            status_code=500, detail=f"Failed to start subprocess: {e}")
+
+    return room
+
+
+@app.post("/twilio_start_bot", response_class=PlainTextResponse)
+async def twilio_start_bot(request: Request):
+    print("POST /twilio_start_bot")
+
+    # twilio_start_bot is invoked directly by Twilio (as a web hook).
+    # On Twilio, under Active Numbers, pick the phone number
+    # Click Configure and under Voice Configuration,
+    # "a call comes in" choose webhook and point the URL to
+    # where this code is hosted.
+    data = {}
+    try:
+        # Twilio sends form data, not JSON
+        form_data = await request.form()
+        data = dict(form_data)
+    except Exception:
+        pass
+
+    room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", None)
+    callId = data.get('CallSid')
+
+    if not callId:
+        raise HTTPException(
+            status_code=500, detail="Missing 'CallSid' in request")
+
+    print("CallId: %s" % callId)
+
+    # create room and tell the bot to join the created room
+    # note: Twilio does not require a callDomain
+    room: DailyRoomObject = _create_daily_room(
+        room_url, callId, None, "twilio")
+
+    print(f"Put Twilio on hold...")
+    # We have the room and the SIP URI, but we do not know whether the
+    # Daily SIP worker and the bot have joined the call yet, so
+    # put the call on hold until 'on_dialin_ready' fires.
+    # Then, the bot will update the call SID with the SIP URI.
+    # http://com.twilio.music.classical.s3.amazonaws.com/BusyStrings.mp3
+    resp = VoiceResponse()
+    resp.play(
+        url="http://com.twilio.sounds.music.s3.amazonaws.com/MARKOVICHAMP-Borghestral.mp3", loop=10)
+    return str(resp)
+
+
+@app.post("/daily_start_bot")
+async def daily_start_bot(request: Request) -> JSONResponse:
+    # The /daily_start_bot is invoked when a call is received on Daily's SIP URI
+    # daily_start_bot will create the room, put the call on hold until
+    # the bot and sip worker are ready. Daily will automatically
+    # forward the call to the SIP URI when dialin_ready fires.
+
+    # Use specified room URL, or create a new one if not specified
+    room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", None)
+    # Get the dial-in properties from the request
+    try:
+        data = await request.json()
+        if "test" in data:
+            # Pass through any webhook checks
+            return JSONResponse({"test": True})
+        callId = data.get("callId", None)
+        callDomain = data.get("callDomain", None)
+    except Exception:
+        raise HTTPException(
+            status_code=500,
+            detail="Missing properties 'callId' or 'callDomain'")
+
+    print(f"CallId: {callId}, CallDomain: {callDomain}")
+    room: DailyRoomObject = _create_daily_room(
+        room_url, callId, callDomain, "daily")
+
+    # Return the room URL and SIP URI so the call can be forwarded
+    return JSONResponse({
+        "room_url": room.url,
+        "sipUri": room.config.sip_endpoint
+    })
+
+# ----------------- Main ----------------- #
+
+
+if __name__ == "__main__":
+    # Check environment variables
+    for env_var in REQUIRED_ENV_VARS:
+        if env_var not in os.environ:
+            raise Exception(f"Missing environment variable: {env_var}.")
+
+    parser = argparse.ArgumentParser(description="Pipecat Bot Runner")
+    parser.add_argument("--host", type=str,
+                        default=os.getenv("HOST", "0.0.0.0"), help="Host address")
+    parser.add_argument("--port", type=int,
+                        default=os.getenv("PORT", 7860), help="Port number")
+    parser.add_argument("--reload", action="store_true",
+                        default=True, help="Reload code on change")
+
+    config = parser.parse_args()
+
+    try:
+        import uvicorn
+
+        uvicorn.run(
+            "bot_runner:app",
+            host=config.host,
+            port=config.port,
+            reload=config.reload
+        )
+
+    except KeyboardInterrupt:
+        print("Pipecat runner shutting down...")
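Because `callId` and `callDomain` arrive in the request body and end up in a shell command, one alternative is to pass the bot's arguments as a list so that no shell parsing is involved. The following is a sketch only, assuming the same `-u`, `-t`, `-i`, `-d` and `-s` flags that `bot_daily.py` and `bot_twilio.py` define; `build_bot_argv` is a hypothetical helper.

```python
import sys
from typing import List, Optional


def build_bot_argv(vendor: str, room_url: str, token: str, call_id: str,
                   call_domain: Optional[str] = None,
                   sip_uri: Optional[str] = None) -> List[str]:
    """Return an argv list for the bot process; no shell quoting is needed."""
    module = "bot_daily" if vendor == "daily" else "bot_twilio"
    argv = [sys.executable, "-m", module, "-u", room_url, "-t", token, "-i", call_id]
    if vendor == "daily":
        # The Daily bot expects the call domain.
        argv += ["-d", str(call_domain)]
    else:
        # The Twilio bot expects the room's SIP URI.
        argv += ["-s", str(sip_uri)]
    return argv


if __name__ == "__main__":
    # Placeholder values; the call ID deliberately contains a space.
    print(build_bot_argv("daily", "https://example.daily.co/room", "tok", "id with spaces", "dom"))
```

The list can be handed to `subprocess.Popen(argv, cwd=...)` without `shell=True`, so a value containing spaces or shell metacharacters is passed through as a single argument.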
--- a/examples/dialin-chatbot/bot_twilio.py
+++ b/examples/dialin-chatbot/bot_twilio.py
@@ -0,0 +1,125 @@
+import asyncio
+import aiohttp
+import os
+import sys
+import argparse
+
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import LLMAssistantResponseAggregator, LLMUserResponseAggregator
+from pipecat.frames.frames import (
+    LLMMessagesFrame,
+    EndFrame
+)
+from pipecat.services.elevenlabs import ElevenLabsTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from twilio.rest import Client
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+twilio_account_sid = os.getenv('TWILIO_ACCOUNT_SID')
+twilio_auth_token = os.getenv('TWILIO_AUTH_TOKEN')
+twilioclient = Client(twilio_account_sid, twilio_auth_token)
+
+daily_api_key = os.getenv("DAILY_API_KEY", "")
+
+
+async def main(room_url: str, token: str, callId: str, sipUri: str):
+    async with aiohttp.ClientSession() as session:
+        # dialin_settings are only needed if Daily's SIP URI is used.
+        # Twilio handles the call here, so it is set to None below
+        # and call forwarding is handled when on_dialin_ready fires.
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Chatbot",
+            DailyParams(
+                api_key=daily_api_key,
+                dialin_settings=None,  # Not required for Twilio
+                audio_in_enabled=True,
+                audio_out_enabled=True,
+                camera_out_enabled=False,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(),
+                transcription_enabled=True,
+            )
+        )
+
+        tts = ElevenLabsTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("ELEVENLABS_API_KEY", ""),
+            voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by saying 'Hello! Who dares dial me at this hour?!'.",
+            },
+        ]
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),
+            tma_in,
+            llm,
+            tts,
+            transport.output(),
+            tma_out,
+        ])
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        @transport.event_handler("on_participant_left")
+        async def on_participant_left(transport, participant, reason):
+            await task.queue_frame(EndFrame())
+
+        @transport.event_handler("on_dialin_ready")
+        async def on_dialin_ready(transport, cdata):
+            # For Twilio, Telnyx, etc., you need to update the state of the call
+            # and forward it to the sip_uri.
+            print(f"Forwarding call: {callId} {sipUri}")
+
+            try:
+                # The TwiML is updated using Twilio's client library
+                call = twilioclient.calls(callId).update(
+                    twiml=f'<Response><Dial><Sip>{sipUri}</Sip></Dial></Response>'
+                )
+            except Exception as e:
+                raise Exception(f"Failed to forward call: {str(e)}") from e
+
+        runner = PipelineRunner()
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="Pipecat Simple ChatBot")
+    parser.add_argument("-u", type=str, help="Room URL")
+    parser.add_argument("-t", type=str, help="Token")
+    parser.add_argument("-i", type=str, help="Call ID")
+    parser.add_argument("-s", type=str, help="SIP URI")
+    config = parser.parse_args()
+
+    asyncio.run(main(config.u, config.t, config.i, config.s))
--- a/examples/dialin-chatbot/env.example
+++ b/examples/dialin-chatbot/env.example
@@ -0,0 +1,8 @@
+DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (optional: for joining the bot to the same room repeatedly for local dev)
+DAILY_API_KEY=
+DAILY_API_URL=api.daily.co/v1
+OPENAI_API_KEY=
+ELEVENLABS_API_KEY=
+ELEVENLABS_VOICE_ID=
+TWILIO_ACCOUNT_SID=
+TWILIO_AUTH_TOKEN=
--- a/examples/dialin-chatbot/fly.example.toml
+++ b/examples/dialin-chatbot/fly.example.toml
@@ -0,0 +1,19 @@
+# fly.toml app configuration file generated for pipecat-dialin-demo on 2024-06-03T15:57:57+02:00
+#
+# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
+#
+
+app = 'pipecat-dialin-demo'
+primary_region = 'sjc'
+
+[build]
+
+[http_service]
+  internal_port = 7860
+  force_https = true
+  auto_stop_machines = true
+  auto_start_machines = true
+  min_machines_running = 1
+
+[[vm]]
+  size = 'performance-1x'
--- a/examples/dialin-chatbot/image.png
+++ b/examples/dialin-chatbot/image.png
--- a/examples/dialin-chatbot/requirements.txt
+++ b/examples/dialin-chatbot/requirements.txt
@@ -0,0 +1,7 @@
+pipecat-ai[daily,openai,silero]
+fastapi
+uvicorn
+requests
+python-dotenv
+loguru
+twilio
--- a/examples/foundational/02-llm-say-one-thing.py
+++ b/examples/foundational/02-llm-say-one-thing.py
@@ -44,7 +44,7 @@ async def main(room_url):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")

        messages = [
            {
--- a/examples/foundational/05-sync-speech-and-image.py
+++ b/examples/foundational/05-sync-speech-and-image.py
@@ -23,11 +23,11 @@ from pipecat.frames.frames import (
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineTask
+from pipecat.pipeline.parallel_task import ParallelTask
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.processors.aggregators.gated import GatedAggregator
 from pipecat.processors.aggregators.llm_response import LLMFullResponseAggregator
 from pipecat.processors.aggregators.sentence import SentenceAggregator
-from pipecat.processors.aggregators.parallel_task import ParallelTask
 from pipecat.services.openai import OpenAILLMService
 from pipecat.services.elevenlabs import ElevenLabsTTSService
 from pipecat.services.fal import FalImageGenService
@@ -59,6 +59,8 @@ class MonthPrepender(FrameProcessor):
        self.prepend_to_next_text_frame = False

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, MonthFrame):
            self.most_recent_month = frame.month
        elif self.prepend_to_next_text_frame and isinstance(frame, TextFrame):
@@ -93,7 +95,7 @@ async def main(room_url):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")

        imagegen = FalImageGenService(
            params=FalImageGenService.InputParams(
--- a/examples/foundational/05a-local-sync-speech-and-image.py
+++ b/examples/foundational/05a-local-sync-speech-and-image.py
@@ -50,6 +50,8 @@ async def main():
                    self.text = ""

                async def process_frame(self, frame: Frame, direction: FrameDirection):
+                    await super().process_frame(frame, direction)
+
                    if isinstance(frame, TextFrame):
                        self.text = frame.text
                    await self.push_frame(frame, direction)
@@ -60,6 +62,8 @@ async def main():
                    self.audio = bytearray()

                async def process_frame(self, frame: Frame, direction: FrameDirection):
+                    await super().process_frame(frame, direction)
+
                    if isinstance(frame, AudioRawFrame):
                        self.audio.extend(frame.audio)
                        self.frame = AudioRawFrame(
@@ -71,12 +75,14 @@ async def main():
                    self.frame = None

                async def process_frame(self, frame: Frame, direction: FrameDirection):
+                    await super().process_frame(frame, direction)
+
                    if isinstance(frame, URLImageRawFrame):
                        self.frame = frame

            llm = OpenAILLMService(
                api_key=os.getenv("OPENAI_API_KEY"),
-                model="gpt-4-turbo-preview")
+                model="gpt-4o")

            tts = ElevenLabsTTSService(
                aiohttp_session=session,
@@ -156,7 +162,7 @@ async def main():
            await runner.stop_when_done()

        async def run_tk():
-            while True:
+            while not task.has_finished():
                tk_root.update()
                tk_root.update_idletasks()
                await asyncio.sleep(0.1)
--- a/examples/foundational/06a-image-sync.py
+++ b/examples/foundational/06a-image-sync.py
@@ -49,6 +49,8 @@ class ImageSyncAggregator(FrameProcessor):
        self._waiting_image_bytes = self._waiting_image.tobytes()

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if not isinstance(frame, SystemFrame):
            await self.push_frame(ImageRawFrame(image=self._speaking_image_bytes, size=(1024, 1024), format=self._speaking_image_format))
            await self.push_frame(frame)
@@ -81,7 +83,7 @@ async def main(room_url: str, token):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")

        messages = [
            {
--- a/examples/foundational/07-interruptible.py
+++ b/examples/foundational/07-interruptible.py
@@ -53,7 +53,7 @@ async def main(room_url: str, token):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")

        messages = [
            {
@@ -74,7 +74,11 @@ async def main(room_url: str, token):
            tma_out              # Assistant spoken responses
        ])

-        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+        task = PipelineTask(pipeline, PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            report_only_initial_ttfb=True,
+        ))

        @transport.event_handler("on_first_participant_joined")
        async def on_first_participant_joined(transport, participant):
--- a/examples/foundational/07b-interruptible-langchain.py
+++ b/examples/foundational/07b-interruptible-langchain.py
@@ -0,0 +1,125 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import os
+import sys
+
+import aiohttp
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator, LLMUserResponseAggregator)
+from pipecat.processors.frameworks.langchain import LangchainProcessor
+from pipecat.services.elevenlabs import ElevenLabsTTSService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
+from langchain_community.chat_message_histories import ChatMessageHistory
+from langchain_core.chat_history import BaseChatMessageHistory
+from langchain_core.runnables.history import RunnableWithMessageHistory
+from langchain_openai import ChatOpenAI
+
+from loguru import logger
+
+from runner import configure
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+message_store = {}
+
+
+def get_session_history(session_id: str) -> BaseChatMessageHistory:
+    if session_id not in message_store:
+        message_store[session_id] = ChatMessageHistory()
+    return message_store[session_id]
+
+
+async def main(room_url: str, token):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Respond bot",
+            DailyParams(
+                audio_out_enabled=True,
+                transcription_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(),
+            ),
+        )
+
+        tts = ElevenLabsTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("ELEVENLABS_API_KEY"),
+            voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
+        )
+
+        prompt = ChatPromptTemplate.from_messages(
+            [
+                ("system",
+                 "Be nice and helpful. Answer very briefly and without special characters like `#` or `*`. "
+                 "Your response will be synthesized to voice and those characters will create unnatural sounds.",
+                 ),
+                MessagesPlaceholder("chat_history"),
+                ("human", "{input}"),
+            ])
+        chain = prompt | ChatOpenAI(model="gpt-4o", temperature=0.7)
+        history_chain = RunnableWithMessageHistory(
+            chain,
+            get_session_history,
+            history_messages_key="chat_history",
+            input_messages_key="input")
+        lc = LangchainProcessor(history_chain)
+
+        tma_in = LLMUserResponseAggregator()
+        tma_out = LLMAssistantResponseAggregator()
+
+        pipeline = Pipeline(
+            [
+                transport.input(),      # Transport user input
+                tma_in,                 # User responses
+                lc,                     # Langchain
+                tts,                    # TTS
+                transport.output(),     # Transport bot output
+                tma_out,                # Assistant spoken responses
+            ]
+        )
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            lc.set_participant_id(participant["id"])
+            # Kick off the conversation.
+            # The `LLMMessagesFrame` will be picked up by the LangchainProcessor, which
+            # uses only the content of the last message and injects it into the prompt
+            # defined above, so no role is required here.
+            messages = [
+                {
+                    "content": "Please briefly introduce yourself to the user."
+                }
+            ]
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url, token))
--- a/examples/foundational/07c-interruptible-deepgram.py
+++ b/examples/foundational/07c-interruptible-deepgram.py
@@ -15,7 +15,7 @@ from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_response import (
    LLMAssistantResponseAggregator, LLMUserResponseAggregator)
-from pipecat.services.deepgram import DeepgramTTSService
+from pipecat.services.deepgram import DeepgramSTTService, DeepgramTTSService
 from pipecat.services.openai import OpenAILLMService
 from pipecat.transports.services.daily import DailyParams, DailyTransport
 from pipecat.vad.silero import SileroVADAnalyzer
@@ -39,12 +39,14 @@ async def main(room_url: str, token):
            "Respond bot",
            DailyParams(
                audio_out_enabled=True,
-                transcription_enabled=True,
                vad_enabled=True,
-                vad_analyzer=SileroVADAnalyzer()
+                vad_analyzer=SileroVADAnalyzer(),
+                vad_audio_passthrough=True
            )
        )

+        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
        tts = DeepgramTTSService(
            aiohttp_session=session,
            api_key=os.getenv("DEEPGRAM_API_KEY"),
@@ -53,7 +55,7 @@ async def main(room_url: str, token):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")

        messages = [
            {
@@ -67,6 +69,7 @@ async def main(room_url: str, token):

        pipeline = Pipeline([
            transport.input(),   # Transport user input
+            stt,                 # STT
            tma_in,              # User responses
            llm,                 # LLM
            tts,                 # TTS
--- a/examples/foundational/07d-interruptible-cartesia.py
+++ b/examples/foundational/07d-interruptible-cartesia.py
@@ -20,6 +20,7 @@ from pipecat.services.openai import OpenAILLMService
 from pipecat.transports.services.daily import DailyParams, DailyTransport
 from pipecat.vad.silero import SileroVADAnalyzer

+
 from runner import configure

 from loguru import logger
@@ -39,6 +40,7 @@ async def main(room_url: str, token):
            "Respond bot",
            DailyParams(
                audio_out_enabled=True,
+                audio_out_sample_rate=44100,
                transcription_enabled=True,
                vad_enabled=True,
                vad_analyzer=SileroVADAnalyzer()
@@ -47,7 +49,8 @@ async def main(room_url: str, token):

        tts = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
-            voice_name="Barbershop Man"
+            voice_name="British Lady",
+            output_format="pcm_44100"
        )

        llm = OpenAILLMService(
--- a/examples/foundational/07e-interruptible-playht.py
+++ b/examples/foundational/07e-interruptible-playht.py
@@ -0,0 +1,96 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import aiohttp
+import os
+import sys
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator, LLMUserResponseAggregator)
+from pipecat.services.playht import PlayHTTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+from pipecat.processors.logger import FrameLogger
+
+from runner import configure
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def main(room_url: str, token):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Respond bot",
+            DailyParams(
+                audio_out_enabled=True,
+                audio_out_sample_rate=16000,
+                transcription_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer()
+            )
+        )
+
+        tts = PlayHTTTSService(
+            user_id=os.getenv("PLAYHT_USER_ID"),
+            api_key=os.getenv("PLAYHT_API_KEY"),
+            voice_url="s3://voice-cloning-zero-shot/801a663f-efd0-4254-98d0-5c175514c3e8/jennifer/manifest.json",
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),   # Transport user input
+            tma_in,              # User responses
+            llm,                 # LLM
+            tts,                 # TTS
+            transport.output(),  # Transport bot output
+            tma_out              # Assistant spoken responses
+        ])
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            # Kick off the conversation.
+            messages.append(
+                {"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url, token))
--- a/examples/foundational/07f-interruptible-azure-tts.py
+++ b/examples/foundational/07f-interruptible-azure-tts.py
@@ -0,0 +1,95 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import aiohttp
+import os
+import sys
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator, LLMUserResponseAggregator)
+from pipecat.services.azure import AzureTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+
+
+from runner import configure
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def main(room_url: str, token):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Respond bot",
+            DailyParams(
+                audio_out_enabled=True,
+                audio_out_sample_rate=16000,
+                transcription_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer()
+            )
+        )
+
+        tts = AzureTTSService(
+            api_key=os.getenv("AZURE_SPEECH_API_KEY"),
+            region=os.getenv("AZURE_SPEECH_REGION"),
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),   # Transport user input
+            tma_in,              # User responses
+            llm,                 # LLM
+            tts,                 # TTS
+            transport.output(),  # Transport bot output
+            tma_out              # Assistant spoken responses
+        ])
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            # Kick off the conversation.
+            messages.append(
+                {"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url, token))
--- a/examples/foundational/07g-interruptible-openai-tts.py
+++ b/examples/foundational/07g-interruptible-openai-tts.py
@@ -0,0 +1,94 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import aiohttp
+import os
+import sys
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator, LLMUserResponseAggregator)
+from pipecat.services.openai import OpenAITTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from runner import configure
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def main(room_url: str, token):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Respond bot",
+            DailyParams(
+                audio_out_enabled=True,
+                audio_out_sample_rate=24000,
+                transcription_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer()
+            )
+        )
+
+        tts = OpenAITTSService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            voice="alloy"
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),   # Transport user input
+            tma_in,              # User responses
+            llm,                 # LLM
+            tts,                 # TTS
+            transport.output(),  # Transport bot output
+            tma_out              # Assistant spoken responses
+        ])
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            # Kick off the conversation.
+            messages.append(
+                {"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url, token))
--- a/examples/foundational/07h-interruptible-openpipe.py
+++ b/examples/foundational/07h-interruptible-openpipe.py
@@ -0,0 +1,102 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import aiohttp
+import os
+import sys
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator,
+    LLMUserResponseAggregator,
+)
+from pipecat.services.elevenlabs import ElevenLabsTTSService
+from pipecat.services.openpipe import OpenPipeLLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from runner import configure
+
+from loguru import logger
+import time
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def main(room_url: str, token):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Respond bot",
+            DailyParams(
+                audio_out_enabled=True,
+                transcription_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer()
+            )
+        )
+
+        tts = ElevenLabsTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("ELEVENLABS_API_KEY"),
+            voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
+        )
+
+        timestamp = int(time.time())
+        llm = OpenPipeLLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            openpipe_api_key=os.getenv("OPENPIPE_API_KEY"),
+            model="gpt-4o",
+            tags={
+                "conversation_id": f"pipecat-{timestamp}"
+            }
+        )
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),   # Transport user input
+            tma_in,              # User responses
+            llm,                 # LLM
+            tts,                 # TTS
+            transport.output(),  # Transport bot output
+            tma_out              # Assistant spoken responses
+        ])
+
+        task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            # Kick off the conversation.
+            messages.append(
+                {"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url, token))
--- a/examples/foundational/09-mirror.py
+++ b/examples/foundational/09-mirror.py
@@ -30,6 +30,7 @@ async def main(room_url, token):
            audio_in_enabled=True,
            audio_out_enabled=True,
            camera_out_enabled=True,
+            camera_out_is_live=True,
            camera_out_width=1280,
            camera_out_height=720
        )
--- a/examples/foundational/09a-local-mirror.py
+++ b/examples/foundational/09a-local-mirror.py
@@ -38,6 +38,7 @@ async def main(room_url, token):
        TransportParams(
            audio_out_enabled=True,
            camera_out_enabled=True,
+            camera_out_is_live=True,
            camera_out_width=1280,
            camera_out_height=720))

@@ -47,15 +48,15 @@ async def main(room_url, token):

    pipeline = Pipeline([daily_transport.input(), tk_transport.output()])

-    runner = PipelineRunner()
+    task = PipelineTask(pipeline)

    async def run_tk():
-        while runner.is_active():
+        while not task.has_finished():
            tk_root.update()
            tk_root.update_idletasks()
            await asyncio.sleep(0.1)

-    task = PipelineTask(pipeline)
+    runner = PipelineRunner()

    await asyncio.gather(runner.run(task), run_tk())

--- a/examples/foundational/11-sound-effects.py
+++ b/examples/foundational/11-sound-effects.py
@@ -60,6 +60,8 @@ for file in sound_files:
 class OutboundSoundEffectWrapper(FrameProcessor):

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, LLMFullResponseEndFrame):
            await self.push_frame(sounds["ding1.wav"])
            # In case anything else downstream needs it
@@ -71,6 +73,8 @@ class OutboundSoundEffectWrapper(FrameProcessor):
 class InboundSoundEffectWrapper(FrameProcessor):

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, LLMMessagesFrame):
            await self.push_frame(sounds["ding2.wav"])
            # In case anything else downstream needs it
@@ -95,7 +99,7 @@ async def main(room_url: str, token):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")

        tts = ElevenLabsTTSService(
            aiohttp_session=session,
--- a/examples/foundational/12-describe-video.py
+++ b/examples/foundational/12-describe-video.py
@@ -42,6 +42,8 @@ class UserImageRequester(FrameProcessor):
        self._participant_id = participant_id

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if self._participant_id and isinstance(frame, TextFrame):
            await self.push_frame(UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM)
        await self.push_frame(frame, direction)
--- a/examples/foundational/12a-describe-video-gemini-flash.py
+++ b/examples/foundational/12a-describe-video-gemini-flash.py
@@ -42,6 +42,8 @@ class UserImageRequester(FrameProcessor):
        self._participant_id = participant_id

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if self._participant_id and isinstance(frame, TextFrame):
            await self.push_frame(UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM)
        await self.push_frame(frame, direction)
--- a/examples/foundational/12b-describe-video-gpt-4o.py
+++ b/examples/foundational/12b-describe-video-gpt-4o.py
@@ -42,6 +42,8 @@ class UserImageRequester(FrameProcessor):
        self._participant_id = participant_id

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if self._participant_id and isinstance(frame, TextFrame):
            await self.push_frame(UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM)
        await self.push_frame(frame, direction)
--- a/examples/foundational/12c-describe-video-anthropic.py
+++ b/examples/foundational/12c-describe-video-anthropic.py
@@ -42,6 +42,8 @@ class UserImageRequester(FrameProcessor):
        self._participant_id = participant_id

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if self._participant_id and isinstance(frame, TextFrame):
            await self.push_frame(UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM)
        await self.push_frame(frame, direction)
--- a/examples/foundational/13-whisper-transcription.py
+++ b/examples/foundational/13-whisper-transcription.py
@@ -29,6 +29,8 @@ logger.add(sys.stderr, level="DEBUG")
 class TranscriptionLogger(FrameProcessor):

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")

--- a/examples/foundational/13a-whisper-local.py
+++ b/examples/foundational/13a-whisper-local.py
@@ -16,8 +16,6 @@ from pipecat.services.whisper import WhisperSTTService
 from pipecat.transports.base_transport import TransportParams
 from pipecat.transports.local.audio import LocalAudioTransport

-from runner import configure
-
 from loguru import logger

 from dotenv import load_dotenv
@@ -30,11 +28,13 @@ logger.add(sys.stderr, level="DEBUG")
 class TranscriptionLogger(FrameProcessor):

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TranscriptionFrame):
            print(f"Transcription: {frame.text}")


-async def main(room_url: str):
+async def main():
    transport = LocalAudioTransport(TransportParams(audio_in_enabled=True))

    stt = WhisperSTTService()
@@ -51,5 +51,4 @@ async def main(room_url: str):


 if __name__ == "__main__":
-    (url, token) = configure()
-    asyncio.run(main(url))
+    asyncio.run(main())
--- a/examples/foundational/13b-deepgram-transcription.py
+++ b/examples/foundational/13b-deepgram-transcription.py
@@ -0,0 +1,58 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import os
+import sys
+
+from pipecat.frames.frames import Frame, TranscriptionFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineTask
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.services.deepgram import DeepgramSTTService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+
+from runner import configure
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+class TranscriptionLogger(FrameProcessor):
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, TranscriptionFrame):
+            print(f"Transcription: {frame.text}")
+
+
+async def main(room_url: str):
+    transport = DailyTransport(room_url, None, "Transcription bot",
+                               DailyParams(audio_in_enabled=True))
+
+    stt = DeepgramSTTService(os.getenv("DEEPGRAM_API_KEY"))
+
+    tl = TranscriptionLogger()
+
+    pipeline = Pipeline([transport.input(), stt, tl])
+
+    task = PipelineTask(pipeline)
+
+    runner = PipelineRunner()
+
+    await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url))
--- a/examples/foundational/14-function-calling.py
+++ b/examples/foundational/14-function-calling.py
@@ -7,9 +7,9 @@
 import asyncio
 import aiohttp
 import os
-import json
 import sys

+from pipecat.frames.frames import TextFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineTask
@@ -17,18 +17,13 @@ from pipecat.processors.aggregators.llm_response import (
    LLMAssistantContextAggregator,
    LLMUserContextAggregator,
 )
-from pipecat.services.openai import OpenAILLMContext
 from pipecat.processors.logger import FrameLogger
 from pipecat.services.elevenlabs import ElevenLabsTTSService
-from pipecat.services.openai import OpenAILLMService
+from pipecat.services.openai import OpenAILLMContext, OpenAILLMService
 from pipecat.transports.services.daily import DailyParams, DailyTransport
 from pipecat.vad.silero import SileroVADAnalyzer
-from openai.types.chat import (
-    ChatCompletionToolParam,
-)
-from pipecat.frames.frames import (
-    TextFrame
-)
+
+from openai.types.chat import ChatCompletionToolParam

 from runner import configure

@@ -46,7 +41,7 @@ async def start_fetch_weather(llm):


 async def fetch_weather_from_api(llm, args):
-    return ({"conditions": "nice", "temperature": "75"})
+    return {"conditions": "nice", "temperature": "75"}


 async def main(room_url: str, token):
@@ -71,7 +66,7 @@ async def main(room_url: str, token):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")
        llm.register_function(
            "get_current_weather",
            fetch_weather_from_api,
--- a/examples/foundational/15-switch-voices.py
+++ b/examples/foundational/15-switch-voices.py
@@ -0,0 +1,159 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import aiohttp
+import os
+import sys
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.parallel_pipeline import ParallelPipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantContextAggregator,
+    LLMUserContextAggregator
+)
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.filters.function_filter import FunctionFilter
+from pipecat.services.cartesia import CartesiaTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from openai.types.chat import ChatCompletionToolParam
+
+from runner import configure
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+current_voice = "News Lady"
+
+
+async def switch_voice(llm, args):
+    global current_voice
+    current_voice = args["voice"]
+    return {"voice": f"You are now using your {current_voice} voice. Your responses should now be as if you were a {current_voice}."}
+
+
+async def news_lady_filter(frame) -> bool:
+    return current_voice == "News Lady"
+
+
+async def british_lady_filter(frame) -> bool:
+    return current_voice == "British Lady"
+
+
+async def barbershop_man_filter(frame) -> bool:
+    return current_voice == "Barbershop Man"
+
+
+async def main(room_url: str, token):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Pipecat",
+            DailyParams(
+                audio_out_enabled=True,
+                audio_out_sample_rate=44100,
+                transcription_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer()
+            )
+        )
+
+        news_lady = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_name="Newslady",
+            output_format="pcm_44100"
+        )
+
+        british_lady = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_name="British Lady",
+            output_format="pcm_44100"
+        )
+
+        barbershop_man = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_name="Barbershop Man",
+            output_format="pcm_44100"
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+        llm.register_function("switch_voice", switch_voice)
+
+        tools = [
+            ChatCompletionToolParam(
+                type="function",
+                function={
+                    "name": "switch_voice",
+                    "description": "Switch your voice only when the user asks you to",
+                    "parameters": {
+                        "type": "object",
+                        "properties": {
+                            "voice": {
+                                "type": "string",
+                                "description": "The voice the user wants you to use",
+                            },
+                        },
+                        "required": ["voice"],
+                    },
+                })]
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities. Respond to what the user said in a creative and helpful way. Your output should not include non-alphanumeric characters. You can do the following voices: 'News Lady', 'British Lady' and 'Barbershop Man'.",
+            },
+        ]
+
+        context = OpenAILLMContext(messages, tools)
+        tma_in = LLMUserContextAggregator(context)
+        tma_out = LLMAssistantContextAggregator(context)
+
+        pipeline = Pipeline([
+            transport.input(),   # Transport user input
+            tma_in,              # User responses
+            llm,                 # LLM
+            ParallelPipeline(    # TTS (one of the following voices)
+                [FunctionFilter(news_lady_filter), news_lady],            # News Lady voice
+                [FunctionFilter(british_lady_filter), british_lady],      # British Lady voice
+                [FunctionFilter(barbershop_man_filter), barbershop_man],  # Barbershop Man voice
+            ),
+            transport.output(),  # Transport bot output
+            tma_out              # Assistant spoken responses
+        ])
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            # Kick off the conversation.
+            messages.append(
+                {
+                    "role": "system",
+                    "content": f"Please introduce yourself to the user and let them know the voices you can do. Your initial responses should be as if you were a {current_voice}."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url, token))
--- a/examples/foundational/15a-switch-languages.py
+++ b/examples/foundational/15a-switch-languages.py
@@ -0,0 +1,153 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import aiohttp
+import os
+import sys
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.parallel_pipeline import ParallelPipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantContextAggregator,
+    LLMUserContextAggregator
+)
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.filters.function_filter import FunctionFilter
+from pipecat.services.elevenlabs import ElevenLabsTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.services.whisper import Model, WhisperSTTService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from openai.types.chat import ChatCompletionToolParam
+
+from runner import configure
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+current_language = "English"
+
+
+async def switch_language(llm, args):
+    global current_language
+    current_language = args["language"]
+    return {"voice": f"Your answers from now on should be in {current_language}."}
+
+
+async def english_filter(frame) -> bool:
+    return current_language == "English"
+
+
+async def spanish_filter(frame) -> bool:
+    return current_language == "Spanish"
+
+
+async def main(room_url: str, token):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Pipecat",
+            DailyParams(
+                audio_in_enabled=True,
+                audio_out_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(),
+                vad_audio_passthrough=True
+            )
+        )
+
+        stt = WhisperSTTService(model=Model.LARGE)
+
+        english_tts = ElevenLabsTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("ELEVENLABS_API_KEY"),
+            voice_id="pNInz6obpgDQGcFmaJgB",
+        )
+
+        spanish_tts = ElevenLabsTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("ELEVENLABS_API_KEY"),
+            model="eleven_multilingual_v2",
+            voice_id="9F4C8ztpNUmXkdDDbz3J",
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+        llm.register_function("switch_language", switch_language)
+
+        tools = [
+            ChatCompletionToolParam(
+                type="function",
+                function={
+                    "name": "switch_language",
+                    "description": "Switch to another language when the user asks you to",
+                    "parameters": {
+                        "type": "object",
+                        "properties": {
+                            "language": {
+                                "type": "string",
+                                "description": "The language the user wants you to speak",
+                            },
+                        },
+                        "required": ["language"],
+                    },
+                })]
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities. Respond to what the user said in a creative and helpful way. Your output should not include non-alphanumeric characters. You can speak the following languages: 'English' and 'Spanish'.",
+            },
+        ]
+
+        context = OpenAILLMContext(messages, tools)
+        tma_in = LLMUserContextAggregator(context)
+        tma_out = LLMAssistantContextAggregator(context)
+
+        pipeline = Pipeline([
+            transport.input(),   # Transport user input
+            stt,                 # STT
+            tma_in,              # User responses
+            llm,                 # LLM
+            ParallelPipeline(    # TTS (bot will speak the chosen language)
+                [FunctionFilter(english_filter), english_tts],  # English
+                [FunctionFilter(spanish_filter), spanish_tts],  # Spanish
+            ),
+            transport.output(),  # Transport bot output
+            tma_out              # Assistant spoken responses
+        ])
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+            # Kick off the conversation.
+            messages.append(
+                {
+                    "role": "system",
+                    "content": f"Please introduce yourself to the user and let them know the languages you speak. Your initial responses should be in {current_language}."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url, token))
--- a/examples/foundational/16-gpu-container-local-bot.py
+++ b/examples/foundational/16-gpu-container-local-bot.py
@@ -0,0 +1,130 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import aiohttp
+import os
+import sys
+import json
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator, LLMUserResponseAggregator)
+from pipecat.services.deepgram import DeepgramTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport, DailyTransportMessageFrame
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from runner import configure
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def main(room_url: str, token):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            room_url,
+            token,
+            "Respond bot",
+            DailyParams(
+                audio_out_enabled=True,
+                transcription_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer()
+            )
+        )
+
+        tts = DeepgramTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("DEEPGRAM_API_KEY"),
+            voice="aura-asteria-en",
+            base_url="http://0.0.0.0:8080/v1/speak"
+        )
+
+        llm = OpenAILLMService(
+            # To use OpenAI
+            # api_key=os.getenv("OPENAI_API_KEY"),
+            # model="gpt-4o"
+            # Or, to use a local vLLM (or similar) api server
+            model="meta-llama/Meta-Llama-3-8B-Instruct",
+            base_url="http://0.0.0.0:8000/v1"
+        )
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),   # Transport user input
+            tma_in,              # User responses
+            llm,                 # LLM
+            tts,                 # TTS
+            transport.output(),  # Transport bot output
+            tma_out              # Assistant spoken responses
+        ])
+
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True, enable_metrics=True))
+
+        # When a participant joins, start transcription for that participant so the
+        # bot can "hear" and respond to them.
+        @transport.event_handler("on_participant_joined")
+        async def on_participant_joined(transport, participant):
+            transport.capture_participant_transcription(participant["id"])
+
+        # When the first participant joins, the bot should introduce itself.
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            messages.append(
+                {"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        # Handle "latency-ping" messages. The client will send app messages that look like
+        # this:
+        #   { "latency-ping": { ts: <client-side timestamp> }}
+        #
+        # We want to send an immediate pong back to the client from this handler function.
+        # Also, we will push a frame into the pipeline so that the transport output
+        # sends a second pong once the frame has traveled through the pipeline.
+        @transport.event_handler("on_app_message")
+        async def on_app_message(transport, message, sender):
+            try:
+                if "latency-ping" in message:
+                    logger.debug(f"Received latency ping app message: {message}")
+                    ts = message["latency-ping"]["ts"]
+                    # Send immediately
+                    transport.output().send_message(DailyTransportMessageFrame(
+                        message={"latency-pong-msg-handler": {"ts": ts}},
+                        participant_id=sender))
+                    # And push to the pipeline for the Daily transport.output to send
+                    await tma_in.push_frame(
+                        DailyTransportMessageFrame(
+                            message={"latency-pong-pipeline-delivery": {"ts": ts}},
+                            participant_id=sender))
+            except Exception as e:
+                logger.debug(f"message handling error: {e} - {message}")
+
+        runner = PipelineRunner()
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    (url, token) = configure()
+    asyncio.run(main(url, token))
--- a/examples/foundational/websocket-server/frames.proto
+++ b/examples/foundational/websocket-server/frames.proto
@@ -1,25 +0,0 @@
-syntax = "proto3";
-
-package pipecat_proto;
-
-message TextFrame {
-    string text = 1;
-}
-
-message AudioFrame {
-    bytes audio = 1;
-}
-
-message TranscriptionFrame {
-    string text = 1;
-    string participant_id = 2;
-    string timestamp = 3;
-}
-
-message Frame {
-    oneof frame {
-        TextFrame text = 1;
-        AudioFrame audio = 2;
-        TranscriptionFrame transcription = 3;
-    }
-}
--- a/examples/foundational/websocket-server/index.html
+++ b/examples/foundational/websocket-server/index.html
@@ -1,134 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-
-<head>
-    <meta charset="UTF-8">
-    <meta name="viewport" content="width=device-width, initial-scale=1.0">
-    <script src="//cdn.jsdelivr.net/npm/protobufjs@7.X.X/dist/protobuf.min.js"></script>
-    <title>WebSocket Audio Stream</title>
-</head>
-
-<body>
-    <h1>WebSocket Audio Stream</h1>
-    <button id="startAudioBtn">Start Audio</button>
-    <button id="stopAudioBtn">Stop Audio</button>
-    <script>
-        const SAMPLE_RATE = 16000;
-        const BUFFER_SIZE = 8192;
-        const MIN_AUDIO_SIZE = 6400;
-
-        let audioContext;
-        let microphoneStream;
-        let scriptProcessor;
-        let source;
-        let frame;
-        let audioChunks = [];
-        let isPlaying = false;
-        let ws;
-
-        const proto = protobuf.load("frames.proto", (err, root) => {
-            if (err) throw err;
-            frame = root.lookupType("pipecat_proto.Frame");
-        });
-
-        function initWebSocket() {
-            ws = new WebSocket('ws://localhost:8765');
-
-            ws.addEventListener('open', () => console.log('WebSocket connection established.'));
-            ws.addEventListener('message', handleWebSocketMessage);
-            ws.addEventListener('close', (event) => console.log("WebSocket connection closed.", event.code, event.reason));
-            ws.addEventListener('error', (event) => console.error('WebSocket error:', event));
-        }
-
-        async function handleWebSocketMessage(event) {
-            const arrayBuffer = await event.data.arrayBuffer();
-            enqueueAudioFromProto(arrayBuffer);
-        }
-
-        function enqueueAudioFromProto(arrayBuffer) {
-            const parsedFrame = frame.decode(new Uint8Array(arrayBuffer));
-            if (!parsedFrame?.audio) return false;
-
-            const frameCount = parsedFrame.audio.data.length / 2;
-            const audioOutBuffer = audioContext.createBuffer(1, frameCount, SAMPLE_RATE);
-            const nowBuffering = audioOutBuffer.getChannelData(0);
-            const view = new Int16Array(parsedFrame.audio.data.buffer);
-
-            for (let i = 0; i < frameCount; i++) {
-                const word = view[i];
-                nowBuffering[i] = ((word + 32768) % 65536 - 32768) / 32768.0;
-            }
-
-            audioChunks.push(audioOutBuffer);
-            if (!isPlaying) playNextChunk();
-        }
-
-        function playNextChunk() {
-            if (audioChunks.length === 0) {
-                isPlaying = false;
-                return;
-            }
-
-            isPlaying = true;
-            const audioOutBuffer = audioChunks.shift();
-            const source = audioContext.createBufferSource();
-            source.buffer = audioOutBuffer;
-            source.connect(audioContext.destination);
-            source.onended = playNextChunk;
-            source.start();
-        }
-
-        function startAudio() {
-            if (!navigator.mediaDevices || !navigator.mediaDevices.getUserMedia) {
-                alert('getUserMedia is not supported in your browser.');
-                return;
-            }
-
-            navigator.mediaDevices.getUserMedia({ audio: true })
-                .then((stream) => {
-                    microphoneStream = stream;
-                    audioContext = new (window.AudioContext || window.webkitAudioContext)();
-                    scriptProcessor = audioContext.createScriptProcessor(BUFFER_SIZE, 1, 1);
-                    source = audioContext.createMediaStreamSource(stream);
-                    source.connect(scriptProcessor);
-                    scriptProcessor.connect(audioContext.destination);
-
-                    const audioBuffer = [];
-                    const skipRatio = Math.floor(audioContext.sampleRate / (SAMPLE_RATE * 2));
-
-                    scriptProcessor.onaudioprocess = (event) => {
-                        const rawLeftChannelData = event.inputBuffer.getChannelData(0);
-                        for (let i = 0; i < rawLeftChannelData.length; i += skipRatio) {
-                            const normalized = ((rawLeftChannelData[i] * 32768.0) + 32768) % 65536 - 32768;
-                            const swappedBytes = ((normalized & 0xff) << 8) | ((normalized >> 8) & 0xff);
-                            audioBuffer.push(swappedBytes);
-                        }
-
-                        if (audioBuffer.length >= MIN_AUDIO_SIZE) {
-                            const audioFrame = frame.create({ audio: { audio: audioBuffer.slice(0, MIN_AUDIO_SIZE) } });
-                            const encodedFrame = new Uint8Array(frame.encode(audioFrame).finish());
-                            ws.send(encodedFrame);
-                            audioBuffer.splice(0, MIN_AUDIO_SIZE);
-                        }
-                    };
-
-                    initWebSocket();
-                })
-                .catch((error) => console.error('Error accessing microphone:', error));
-        }
-
-        function stopAudio() {
-            if (ws) {
-                ws.close();
-                scriptProcessor.disconnect();
-                source.disconnect();
-                ws = undefined;
-            }
-        }
-
-        document.getElementById('startAudioBtn').addEventListener('click', startAudio);
-        document.getElementById('stopAudioBtn').addEventListener('click', stopAudio);
-    </script>
-</body>
-
-</html>
--- a/examples/foundational/websocket-server/sample.py
+++ b/examples/foundational/websocket-server/sample.py
@@ -1,50 +0,0 @@
-import asyncio
-import aiohttp
-import logging
-import os
-from pipecat.pipeline.frame_processor import FrameProcessor
-from pipecat.pipeline.frames import TextFrame, TranscriptionFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.services.elevenlabs_ai_services import ElevenLabsTTSService
-from pipecat.transports.websocket_transport import WebsocketTransport
-from pipecat.services.whisper_ai_services import WhisperSTTService
-
-logging.basicConfig(format="%(levelno)s %(asctime)s %(message)s")
-logger = logging.getLogger("pipecat")
-logger.setLevel(logging.DEBUG)
-
-
-class WhisperTranscriber(FrameProcessor):
-    async def process_frame(self, frame):
-        if isinstance(frame, TranscriptionFrame):
-            print(f"Transcribed: {frame.text}")
-        else:
-            yield frame
-
-
-async def main():
-    async with aiohttp.ClientSession() as session:
-        transport = WebsocketTransport(
-            mic_enabled=True,
-            speaker_enabled=True,
-        )
-        tts = ElevenLabsTTSService(
-            aiohttp_session=session,
-            api_key=os.getenv("ELEVENLABS_API_KEY"),
-            voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
-        )
-
-        pipeline = Pipeline([
-            WhisperSTTService(),
-            WhisperTranscriber(),
-            tts,
-        ])
-
-        @transport.on_connection
-        async def queue_frame():
-            await pipeline.queue_frames([TextFrame("Hello there!")])
-
-        await transport.run(pipeline)
-
-if __name__ == "__main__":
-    asyncio.run(main())
--- a/examples/moondream-chatbot/bot.py
+++ b/examples/moondream-chatbot/bot.py
@@ -74,6 +74,8 @@ class TalkingAnimation(FrameProcessor):
        self._is_talking = False

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, AudioRawFrame):
            if not self._is_talking:
                await self.push_frame(talking_frame)
@@ -93,6 +95,8 @@ class UserImageRequester(FrameProcessor):
        self.participant_id = participant_id

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if self.participant_id and isinstance(frame, TextFrame):
            if frame.text == user_request_answer:
                await self.push_frame(UserImageRequestFrame(self.participant_id), FrameDirection.UPSTREAM)
@@ -107,6 +111,8 @@ class TextFilterProcessor(FrameProcessor):
        self.text = text

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TextFrame):
            if frame.text != self.text:
                await self.push_frame(frame)
@@ -116,6 +122,8 @@ class TextFilterProcessor(FrameProcessor):

 class ImageFilterProcessor(FrameProcessor):
    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if not isinstance(frame, ImageRawFrame):
            await self.push_frame(frame)

@@ -145,7 +153,7 @@ async def main(room_url: str, token):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")

        ta = TalkingAnimation()

--- a/examples/patient-intake/bot.py
+++ b/examples/patient-intake/bot.py
@@ -1,11 +1,15 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
 import asyncio
 import aiohttp
-import copy
-import json
 import os
-import re
 import sys
 import wave
+
 from typing import List

 from openai._types import NotGiven, NOT_GIVEN
@@ -14,23 +18,18 @@ from openai.types.chat import (
    ChatCompletionToolParam,
 )

+from pipecat.frames.frames import AudioRawFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineTask
+from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.llm_response import LLMUserContextAggregator, LLMAssistantContextAggregator
 from pipecat.processors.logger import FrameLogger
-from pipecat.frames.frames import (
-    Frame,
-    LLMMessagesFrame,
-    AudioRawFrame,
-)
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.processors.frame_processor import FrameDirection
 from pipecat.services.elevenlabs import ElevenLabsTTSService
-from pipecat.services.openai import OpenAILLMService
+from pipecat.services.openai import OpenAILLMContext, OpenAILLMContextFrame, OpenAILLMService
 from pipecat.services.ai_services import AIService
-from pipecat.transports.services.daily import DailyParams, DailyTranscriptionSettings, DailyTransport
+from pipecat.transports.services.daily import DailyParams, DailyTransport
 from pipecat.vad.silero import SileroVADAnalyzer
-from pipecat.services.openai import OpenAILLMContext, OpenAILLMContextFrame

 from runner import configure

@@ -242,7 +241,6 @@ class IntakeProcessor:
        self._context.add_message(
            {"role": "system", "content": "Finally, ask the user the reason for their doctor visit today. Once they answer, call the list_visit_reasons function."})
        await llm.process_frame(OpenAILLMContextFrame(self._context), FrameDirection.DOWNSTREAM)
-        pass

    async def start_visit_reasons(self, llm):
        print("!!! doing start visit reasons")
@@ -251,7 +249,6 @@ class IntakeProcessor:
        self._context.add_message({"role": "system",
                                   "content": "Now, thank the user and end the conversation."})
        await llm.process_frame(OpenAILLMContextFrame(self._context), FrameDirection.DOWNSTREAM)
-        pass

    async def save_data(self, llm, args):
        logger.info(f"!!! Saving data: {args}")
@@ -305,12 +302,10 @@ async def main(room_url: str, token):
            model="gpt-4o")

        messages = []
-        context = OpenAILLMContext(
-            messages=messages,
-        )
+        context = OpenAILLMContext(messages=messages)
        user_context = LLMUserContextAggregator(context)
        assistant_context = LLMAssistantContextAggregator(context)
-        # checklist = ChecklistProcessor(context, llm)
+
        intake = IntakeProcessor(context, llm)
        llm.register_function("verify_birthday", intake.verify_birthday)
        llm.register_function(
@@ -329,19 +324,20 @@ async def main(room_url: str, token):
            "list_visit_reasons",
            intake.save_data,
            start_callback=intake.start_visit_reasons)
+
        fl = FrameLogger("LLM Output")

        pipeline = Pipeline([
-            transport.input(),
-            user_context,
-            llm,
-            fl,
-            tts,
-            transport.output(),
-            assistant_context,
+            transport.input(),   # Transport input
+            user_context,        # User responses
+            llm,                 # LLM
+            fl,                  # Frame logger
+            tts,                 # TTS
+            transport.output(),  # Transport output
+            assistant_context,   # Assistant responses
        ])

-        task = PipelineTask(pipeline, allow_interruptions=False)
+        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=False))

        @transport.event_handler("on_first_participant_joined")
        async def on_first_participant_joined(transport, participant):
--- a/examples/simple-chatbot/bot.py
+++ b/examples/simple-chatbot/bot.py
@@ -64,6 +64,8 @@ class TalkingAnimation(FrameProcessor):
        self._is_talking = False

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, AudioRawFrame):
            if not self._is_talking:
                await self.push_frame(talking_frame)
@@ -117,7 +119,7 @@ async def main(room_url: str, token):

        llm = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo-preview")
+            model="gpt-4o")

        messages = [
            {
--- a/examples/storytelling-chatbot/src/bot.py
+++ b/examples/storytelling-chatbot/src/bot.py
@@ -56,7 +56,7 @@ async def main(room_url, token=None):

        llm_service = OpenAILLMService(
            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4-turbo"
+            model="gpt-4o"
        )

        tts_service = ElevenLabsTTSService(
--- a/examples/storytelling-chatbot/src/processors.py
+++ b/examples/storytelling-chatbot/src/processors.py
@@ -52,6 +52,8 @@ class StoryImageProcessor(FrameProcessor):
        self._fal_service = fal_service

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, StoryImageFrame):
            try:
                async with timeout(7):
@@ -86,6 +88,8 @@ class StoryProcessor(FrameProcessor):
        self._story = story

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, UserStoppedSpeakingFrame):
            # Send an app message to the UI
            await self.push_frame(DailyTransportMessageFrame(CUE_ASSISTANT_TURN))
--- a/examples/translation-chatbot/bot.py
+++ b/examples/translation-chatbot/bot.py
@@ -40,6 +40,8 @@ class TranslationProcessor(FrameProcessor):
        self._language = language

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TextFrame):
            context = [
                {
@@ -65,6 +67,8 @@ class TranslationSubtitles(FrameProcessor):
    # subtitles.
    #
    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TextFrame):
            message = {
                "language": self._language,
@@ -97,7 +101,8 @@ async def main(room_url: str, token):
        )

        llm = OpenAILLMService(
-            api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4-turbo-preview"
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o"
        )

        sa = SentenceAggregator()
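Note on the recurring `await super().process_frame(frame, direction)` change: the example processors in this diff now run the base-class handler before their own logic. The diff does not show what `FrameProcessor.process_frame()` does internally, so the sketch below uses a toy stand-in base class (`ToyBaseProcessor`, `TextFilter` and `run_filter` are hypothetical names, not pipecat APIs). It only illustrates the pattern of delegating to the parent first and then applying the subclass filter.

```python
import asyncio


class ToyBaseProcessor:
    """Toy stand-in for a framework base class (NOT pipecat's FrameProcessor)."""

    def __init__(self):
        self.seen = 0      # bookkeeping owned by the base class
        self.pushed = []   # frames forwarded downstream

    async def process_frame(self, frame, direction):
        # Base-class bookkeeping that subclasses should not skip.
        self.seen += 1

    async def push_frame(self, frame):
        self.pushed.append(frame)


class TextFilter(ToyBaseProcessor):
    """Drops frames equal to `text` and forwards everything else."""

    def __init__(self, text):
        super().__init__()
        self.text = text

    async def process_frame(self, frame, direction):
        # Delegate to the base class first, as the processors in this diff do.
        await super().process_frame(frame, direction)

        if frame != self.text:
            await self.push_frame(frame)


async def run_filter(frames):
    """Feed a list of toy frames through the filter and return the processor."""
    processor = TextFilter("skip me")
    for frame in frames:
        await processor.process_frame(frame, "downstream")
    return processor
```

If the subclass skipped the `super()` call, `seen` would stay at zero while frames were still forwarded, which is the kind of silent base-class state drift this pattern avoids.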
--- a/examples/websocket-server/README.md
+++ b/examples/websocket-server/README.md
@@ -0,0 +1,27 @@
+# Websocket Server
+
+This is an example that shows how to use `WebsocketServerTransport` to communicate with a web client.
+
+## Get started
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
+```
+
+## Run the bot
+
+```bash
+python bot.py
+```
+
+## Run the HTTP server
+
+This will host the static web client:
+
+```bash
+python -m http.server
+```
+
+Then, visit `http://localhost:8000` in your browser to start a session.
--- a/examples/websocket-server/bot.py
+++ b/examples/websocket-server/bot.py
@@ -0,0 +1,93 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import aiohttp
+import asyncio
+import os
+import sys
+
+from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator,
+    LLMUserResponseAggregator
+)
+from pipecat.services.deepgram import DeepgramSTTService
+from pipecat.services.elevenlabs import ElevenLabsTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.network.websocket_server import WebsocketServerParams, WebsocketServerTransport
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def main():
+    async with aiohttp.ClientSession() as session:
+        transport = WebsocketServerTransport(
+            params=WebsocketServerParams(
+                audio_out_enabled=True,
+                add_wav_header=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(),
+                vad_audio_passthrough=True
+            )
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+
+        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+
+        tts = ElevenLabsTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("ELEVENLABS_API_KEY"),
+            voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
+        )
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),   # Websocket input from client
+            stt,                 # Speech-To-Text
+            tma_in,              # User responses
+            llm,                 # LLM
+            tts,                 # Text-To-Speech
+            transport.output(),  # Websocket output to client
+            tma_out              # LLM responses
+        ])
+
+        task = PipelineTask(pipeline)
+
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            # Kick off the conversation.
+            messages.append(
+                {"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+
+        await runner.run(task)
+
+if __name__ == "__main__":
+    asyncio.run(main())
--- a/examples/websocket-server/frames.proto
+++ b/examples/websocket-server/frames.proto
@@ -0,0 +1,43 @@
+//
+// Copyright (c) 2024, Daily
+//
+// SPDX-License-Identifier: BSD 2-Clause License
+//
+
+// Generate frames_pb2.py with:
+//
+//   python -m grpc_tools.protoc --proto_path=./ --python_out=./protobufs frames.proto
+
+syntax = "proto3";
+
+package pipecat;
+
+message TextFrame {
+  uint64 id = 1;
+  string name = 2;
+  string text = 3;
+}
+
+message AudioRawFrame {
+  uint64 id = 1;
+  string name = 2;
+  bytes audio = 3;
+  uint32 sample_rate = 4;
+  uint32 num_channels = 5;
+}
+
+message TranscriptionFrame {
+  uint64 id = 1;
+  string name = 2;
+  string text = 3;
+  string user_id = 4;
+  string timestamp = 5;
+}
+
+message Frame {
+  oneof frame {
+    TextFrame text = 1;
+    AudioRawFrame audio = 2;
+    TranscriptionFrame transcription = 3;
+  }
+}
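Note on `frames.proto`: to make the wire layout of the `Frame` wrapper concrete, here is a minimal hand-rolled encoder for a `Frame` carrying a `TextFrame`. It is a sketch of the standard protobuf wire format only, not the generated `frames_pb2` module and not pipecat's serializer; the helper names (`encode_varint`, `encode_uint`, `encode_bytes`, `encode_text_frame`) are made up for illustration. Unlike a real proto3 encoder, it always writes every field, even when the value is zero or empty.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_uint(field_number: int, value: int) -> bytes:
    """Field with wire type 0 (varint), e.g. `uint64 id = 1`."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_bytes(field_number: int, payload: bytes) -> bytes:
    """Field with wire type 2 (length-delimited): strings, bytes, sub-messages."""
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload


def encode_text_frame(frame_id: int, name: str, text: str) -> bytes:
    """Encode Frame { text: TextFrame { id, name, text } } per frames.proto."""
    inner = (
        encode_uint(1, frame_id)
        + encode_bytes(2, name.encode("utf-8"))
        + encode_bytes(3, text.encode("utf-8"))
    )
    # `TextFrame text = 1` inside the `oneof frame` of the wrapper message.
    return encode_bytes(1, inner)
```

For example, `encode_text_frame(1, "a", "hi")` yields the bytes `0a 09 08 01 12 01 61 1a 02 68 69`: a length-delimited field 1 wrapping the nine-byte `TextFrame`.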
--- a/examples/websocket-server/index.html
+++ b/examples/websocket-server/index.html
@@ -0,0 +1,205 @@
+<!DOCTYPE html>
+<html lang="en">
+
+  <head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <script src="https://cdn.jsdelivr.net/npm/protobufjs@7.X.X/dist/protobuf.min.js"></script>
+    <title>Pipecat WebSocket Client Example</title>
+  </head>
+
+  <body>
+    <h1>Pipecat WebSocket Client Example</h1>
+    <h3><div id="progressText">Loading, wait...</div></h3>
+    <button id="startAudioBtn">Start Audio</button>
+    <button id="stopAudioBtn">Stop Audio</button>
+    <script>
+      const SAMPLE_RATE = 16000;
+      const NUM_CHANNELS = 1;
+      const PLAY_TIME_RESET_THRESHOLD_MS = 1.0; // Compared against AudioContext.currentTime deltas, which are in seconds.
+
+      // The protobuf type. We will load it later.
+      let Frame = null;
+
+      // The websocket connection.
+      let ws = null;
+
+      // The audio context
+      let audioContext = null;
+
+      // The audio context media stream source
+      let source = null;
+
+      // The microphone stream from getUserMedia. Should be sampled to the
+      // proper sample rate.
+      let microphoneStream = null;
+
+      // Script processor to get data from microphone.
+      let scriptProcessor = null;
+
+      // AudioContext play time.
+      let playTime = 0;
+
+      // Last time we received a websocket message.
+      let lastMessageTime = 0;
+
+      // Whether we should be playing audio.
+      let isPlaying = false;
+
+      let startBtn = document.getElementById('startAudioBtn');
+      let stopBtn = document.getElementById('stopAudioBtn');
+
+      const proto = protobuf.load("frames.proto", (err, root) => {
+          if (err) {
+              throw err;
+          }
+          Frame = root.lookupType("pipecat.Frame");
+          const progressText = document.getElementById("progressText");
+          progressText.textContent = "We are ready! Make sure to run the server and then click `Start Audio`.";
+
+          startBtn.disabled = false;
+          stopBtn.disabled = true;
+      });
+
+      function initWebSocket() {
+          ws = new WebSocket('ws://localhost:8765');
+
+          ws.addEventListener('open', () => console.log('WebSocket connection established.'));
+          ws.addEventListener('message', handleWebSocketMessage);
+          ws.addEventListener('close', (event) => {
+              console.log("WebSocket connection closed.", event.code, event.reason);
+              stopAudio(false);
+          });
+          ws.addEventListener('error', (event) => console.error('WebSocket error:', event));
+      }
+
+      async function handleWebSocketMessage(event) {
+          const arrayBuffer = await event.data.arrayBuffer();
+          if (isPlaying) {
+              enqueueAudioFromProto(arrayBuffer);
+          }
+      }
+
+      function enqueueAudioFromProto(arrayBuffer) {
+          const parsedFrame = Frame.decode(new Uint8Array(arrayBuffer));
+          if (!parsedFrame?.audio) {
+              return false;
+          }
+
+          // Reset play time if we haven't played anything for a while.
+          const diffTime = audioContext.currentTime - lastMessageTime;
+          if ((playTime == 0) || (diffTime > PLAY_TIME_RESET_THRESHOLD_MS)) {
+              playTime = audioContext.currentTime;
+          }
+          lastMessageTime = audioContext.currentTime;
+
+          // We should be able to use parsedFrame.audio.audio.buffer but for
+          // some reason that contains all the bytes from the protobuf message.
+          const audioVector = Array.from(parsedFrame.audio.audio);
+          const audioArray = new Uint8Array(audioVector);
+
+          audioContext.decodeAudioData(audioArray.buffer, function(buffer) {
+              const source = new AudioBufferSourceNode(audioContext);
+              source.buffer = buffer;
+              source.start(playTime);
+              source.connect(audioContext.destination);
+              playTime = playTime + buffer.duration;
+          });
+      }
+
+      function convertFloat32ToS16PCM(float32Array) {
+          let int16Array = new Int16Array(float32Array.length);
+
+          for (let i = 0; i < float32Array.length; i++) {
+              let clampedValue = Math.max(-1, Math.min(1, float32Array[i]));
+              int16Array[i] = clampedValue < 0 ? clampedValue * 32768 : clampedValue * 32767;
+          }
+          return int16Array;
+      }
+
+      function startAudioBtnHandler() {
+          if (!navigator.mediaDevices || !navigator.mediaDevices.getUserMedia) {
+              alert('getUserMedia is not supported in your browser.');
+              return;
+          }
+
+          startBtn.disabled = true;
+          stopBtn.disabled = false;
+
+          audioContext = new (window.AudioContext || window.webkitAudioContext)({
+              latencyHint: "interactive",
+              sampleRate: SAMPLE_RATE
+          });
+
+          isPlaying = true;
+
+          initWebSocket();
+
+          navigator.mediaDevices.getUserMedia({
+              audio: {
+                  sampleRate: SAMPLE_RATE,
+                  channelCount: NUM_CHANNELS,
+                  autoGainControl: true,
+                  echoCancellation: true,
+                  noiseSuppression: true,
+              }
+          }).then((stream) => {
+              microphoneStream = stream;
+              // 512 samples is 32ms of audio at SAMPLE_RATE (16 kHz).
+              scriptProcessor = audioContext.createScriptProcessor(512, 1, 1);
+              source = audioContext.createMediaStreamSource(stream);
+              source.connect(scriptProcessor);
+              scriptProcessor.connect(audioContext.destination);
+
+              scriptProcessor.onaudioprocess = (event) => {
+                  if (!ws) {
+                      return;
+                  }
+
+                  const audioData = event.inputBuffer.getChannelData(0);
+                  const pcmS16Array = convertFloat32ToS16PCM(audioData);
+                  const pcmByteArray = new Uint8Array(pcmS16Array.buffer);
+                  const frame = Frame.create({
+                      audio: {
+                          audio: Array.from(pcmByteArray),
+                          sampleRate: SAMPLE_RATE,
+                          numChannels: NUM_CHANNELS
+                      }
+                  });
+                  const encodedFrame = new Uint8Array(Frame.encode(frame).finish());
+                  ws.send(encodedFrame);
+              };
+          }).catch((error) => console.error('Error accessing microphone:', error));
+      }
+
+      function stopAudio(closeWebsocket) {
+          playTime = 0;
+          isPlaying = false;
+          startBtn.disabled = false;
+          stopBtn.disabled = true;
+
+          if (ws && closeWebsocket) {
+              ws.close();
+              ws = null;
+          }
+
+          if (scriptProcessor) {
+              scriptProcessor.disconnect();
+          }
+          if (source) {
+              source.disconnect();
+          }
+      }
+
+      function stopAudioBtnHandler() {
+          stopAudio(true);
+      }
+
+      startBtn.addEventListener('click', startAudioBtnHandler);
+      stopBtn.addEventListener('click', stopAudioBtnHandler);
+      startBtn.disabled = true;
+      stopBtn.disabled = true;
+    </script>
+  </body>
+
+</html>
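Note on `convertFloat32ToS16PCM()` in the client above: the same conversion can be sketched in Python, which may help when testing the server side without a browser. This is an illustrative equivalent, not code from this PR; `float32_to_s16_pcm` is a hypothetical name, and little-endian byte order is an assumption that matches what `Int16Array` produces on common platforms.

```python
import struct


def float32_to_s16_pcm(samples):
    """Convert floats in [-1.0, 1.0] to signed 16-bit little-endian PCM bytes."""
    out = bytearray()
    for sample in samples:
        # Clamp out-of-range input so it cannot overflow 16 bits.
        clamped = max(-1.0, min(1.0, sample))
        # Asymmetric scaling: -1.0 maps to -32768, +1.0 maps to +32767.
        scaled = clamped * 32768 if clamped < 0 else clamped * 32767
        # int() truncates toward zero, like assignment into a JS Int16Array.
        out += struct.pack("<h", int(scaled))
    return bytes(out)
```

Each input sample becomes two bytes, so a 512-sample buffer from the client produces a 1024-byte payload for the `audio` field.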
--- a/examples/websocket-server/requirements.txt
+++ b/examples/websocket-server/requirements.txt
@@ -0,0 +1,2 @@
+python-dotenv
+pipecat-ai[openai,silero,websocket,whisper]
--- a/linux-py3.10-requirements.txt
+++ b/linux-py3.10-requirements.txt
@@ -4,24 +4,37 @@
 #
 #    pip-compile --all-extras pyproject.toml
 #
+aiofiles==23.2.1
+    # via deepgram-sdk
 aiohttp==3.9.5
-    # via pipecat-ai (pyproject.toml)
+    # via
+    #   cartesia
+    #   deepgram-sdk
+    #   langchain
+    #   langchain-community
+    #   pipecat-ai (pyproject.toml)
 aiosignal==1.3.1
    # via aiohttp
 annotated-types==0.7.0
    # via pydantic
 anthropic==0.25.9
-    # via pipecat-ai (pyproject.toml)
+    # via
+    #   openpipe
+    #   pipecat-ai (pyproject.toml)
 anyio==4.4.0
    # via
    #   anthropic
    #   httpx
    #   openai
 async-timeout==4.0.3
-    # via aiohttp
+    # via
+    #   aiohttp
+    #   langchain
 attrs==23.2.0
-    # via aiohttp
-av==12.0.0
+    # via
+    #   aiohttp
+    #   openpipe
+av==12.1.0
    # via faster-whisper
 azure-cognitiveservices-speech==1.37.0
    # via pipecat-ai (pyproject.toml)
@@ -29,20 +42,30 @@ blinker==1.8.2
    # via flask
 cachetools==5.3.3
    # via google-auth
-certifi==2024.2.2
+cartesia==0.1.1
+    # via pipecat-ai (pyproject.toml)
+certifi==2024.6.2
    # via
    #   httpcore
    #   httpx
    #   requests
+cffi==1.16.0
+    # via sounddevice
 charset-normalizer==3.3.2
    # via requests
 click==8.1.7
    # via flask
 coloredlogs==15.0.1
    # via onnxruntime
-ctranslate2==4.2.1
+ctranslate2==4.3.1
    # via faster-whisper
-daily-python==0.9.0
+daily-python==0.9.1
+    # via pipecat-ai (pyproject.toml)
+dataclasses-json==0.6.7
+    # via
+    #   deepgram-sdk
+    #   langchain-community
+deepgram-sdk==3.2.7
    # via pipecat-ai (pyproject.toml)
 distro==1.9.0
    # via
@@ -51,12 +74,14 @@ distro==1.9.0
 einops==0.8.0
    # via pipecat-ai (pyproject.toml)
 exceptiongroup==1.2.1
-    # via anyio
+    # via
+    #   anyio
+    #   pytest
 fal-client==0.4.0
    # via pipecat-ai (pyproject.toml)
 faster-whisper==1.0.2
    # via pipecat-ai (pyproject.toml)
-filelock==3.14.0
+filelock==3.15.1
    # via
    #   huggingface-hub
    #   pyht
@@ -75,7 +100,7 @@ frozenlist==1.4.1
    # via
    #   aiohttp
    #   aiosignal
-fsspec==2024.5.0
+fsspec==2024.6.0
    # via
    #   huggingface-hub
    #   torch
@@ -88,9 +113,9 @@ google-api-core[grpc]==2.19.0
    #   google-ai-generativelanguage
    #   google-api-python-client
    #   google-generativeai
-google-api-python-client==2.131.0
+google-api-python-client==2.133.0
    # via google-generativeai
-google-auth==2.29.0
+google-auth==2.30.0
    # via
    #   google-ai-generativelanguage
    #   google-api-core
@@ -101,11 +126,13 @@ google-auth-httplib2==0.2.0
    # via google-api-python-client
 google-generativeai==0.5.4
    # via pipecat-ai (pyproject.toml)
-googleapis-common-protos==1.63.0
+googleapis-common-protos==1.63.1
    # via
    #   google-api-core
    #   grpcio-status
-grpcio==1.64.0
+greenlet==3.0.3
+    # via sqlalchemy
+grpcio==1.64.1
    # via
    #   google-api-core
    #   grpcio-status
@@ -123,11 +150,14 @@ httplib2==0.22.0
 httpx==0.27.0
    # via
    #   anthropic
+    #   cartesia
+    #   deepgram-sdk
    #   fal-client
    #   openai
+    #   openpipe
 httpx-sse==0.4.0
    # via fal-client
-huggingface-hub==0.23.2
+huggingface-hub==0.23.3
    # via
    #   faster-whisper
    #   timm
@@ -141,29 +171,62 @@ idna==3.7
    #   httpx
    #   requests
    #   yarl
+iniconfig==2.0.0
+    # via pytest
 itsdangerous==2.2.0
    # via flask
 jinja2==3.1.4
    # via
    #   flask
    #   torch
+jsonpatch==1.33
+    # via langchain-core
+jsonpointer==3.0.0
+    # via jsonpatch
+langchain==0.2.3
+    # via
+    #   langchain-community
+    #   pipecat-ai (pyproject.toml)
+langchain-community==0.2.4
+    # via pipecat-ai (pyproject.toml)
+langchain-core==0.2.5
+    # via
+    #   langchain
+    #   langchain-community
+    #   langchain-openai
+    #   langchain-text-splitters
+langchain-openai==0.1.8
+    # via pipecat-ai (pyproject.toml)
+langchain-text-splitters==0.2.1
+    # via langchain
+langsmith==0.1.77
+    # via
+    #   langchain
+    #   langchain-community
+    #   langchain-core
 loguru==0.7.2
    # via pipecat-ai (pyproject.toml)
 markupsafe==2.1.5
    # via
    #   jinja2
    #   werkzeug
+marshmallow==3.21.3
+    # via dataclasses-json
 mpmath==1.3.0
    # via sympy
 multidict==6.0.5
    # via
    #   aiohttp
    #   yarl
+mypy-extensions==1.0.0
+    # via typing-inspect
 networkx==3.3
    # via torch
 numpy==1.26.4
    # via
    #   ctranslate2
+    #   langchain
+    #   langchain-community
    #   onnxruntime
    #   pipecat-ai (pyproject.toml)
    #   pyloudnorm
@@ -204,16 +267,28 @@ nvidia-nvtx-cu12==12.1.105
 onnxruntime==1.18.0
    # via faster-whisper
 openai==1.26.0
+    # via
+    #   langchain-openai
+    #   openpipe
+    #   pipecat-ai (pyproject.toml)
+openpipe==4.14.0
    # via pipecat-ai (pyproject.toml)
-packaging==24.0
+orjson==3.10.4
+    # via langsmith
+packaging==23.2
    # via
    #   huggingface-hub
+    #   langchain-core
+    #   marshmallow
    #   onnxruntime
+    #   pytest
    #   transformers
 pillow==10.3.0
    # via
    #   pipecat-ai (pyproject.toml)
    #   torchvision
+pluggy==1.5.0
+    # via pytest
 proto-plus==1.23.0
    # via
    #   google-ai-generativelanguage
@@ -226,6 +301,7 @@ protobuf==4.25.3
    #   googleapis-common-protos
    #   grpcio-status
    #   onnxruntime
+    #   pipecat-ai (pyproject.toml)
    #   proto-plus
    #   pyht
 pyasn1==0.6.0
@@ -236,12 +312,17 @@ pyasn1-modules==0.4.0
    # via google-auth
 pyaudio==0.2.14
    # via pipecat-ai (pyproject.toml)
-pydantic==2.7.2
+pycparser==2.22
+    # via cffi
+pydantic==2.7.4
    # via
    #   anthropic
    #   google-generativeai
+    #   langchain
+    #   langchain-core
+    #   langsmith
    #   openai
-pydantic-core==2.18.3
+pydantic-core==2.18.4
    # via pydantic
 pyht==0.0.28
    # via pipecat-ai (pyproject.toml)
@@ -249,21 +330,37 @@ pyloudnorm==0.1.1
    # via pipecat-ai (pyproject.toml)
 pyparsing==3.1.2
    # via httplib2
+pytest==8.2.2
+    # via pytest-asyncio
+pytest-asyncio==0.23.7
+    # via cartesia
+python-dateutil==2.9.0.post0
+    # via openpipe
 python-dotenv==1.0.1
    # via pipecat-ai (pyproject.toml)
 pyyaml==6.0.1
    # via
    #   ctranslate2
    #   huggingface-hub
+    #   langchain
+    #   langchain-community
+    #   langchain-core
    #   timm
    #   transformers
 regex==2024.5.15
-    # via transformers
-requests==2.32.2
    # via
+    #   tiktoken
+    #   transformers
+requests==2.32.3
+    # via
+    #   cartesia
    #   google-api-core
    #   huggingface-hub
+    #   langchain
+    #   langchain-community
+    #   langsmith
    #   pyht
+    #   tiktoken
    #   transformers
 rsa==4.9
    # via google-auth
@@ -273,16 +370,31 @@ safetensors==0.4.3
    #   transformers
 scipy==1.13.1
    # via pyloudnorm
+six==1.16.0
+    # via python-dateutil
 sniffio==1.3.1
    # via
    #   anthropic
    #   anyio
    #   httpx
    #   openai
-sympy==1.12
+sounddevice==0.4.7
+    # via pipecat-ai (pyproject.toml)
+sqlalchemy==2.0.30
+    # via
+    #   langchain
+    #   langchain-community
+sympy==1.12.1
    # via
    #   onnxruntime
    #   torch
+tenacity==8.3.0
+    # via
+    #   langchain
+    #   langchain-community
+    #   langchain-core
+tiktoken==0.7.0
+    # via langchain-openai
 timm==0.9.16
    # via pipecat-ai (pyproject.toml)
 tokenizers==0.19.1
@@ -290,15 +402,17 @@ tokenizers==0.19.1
    #   anthropic
    #   faster-whisper
    #   transformers
-torch==2.3.0
+tomli==2.0.1
+    # via pytest
+torch==2.3.1
    # via
    #   pipecat-ai (pyproject.toml)
    #   timm
    #   torchaudio
    #   torchvision
-torchaudio==2.3.0
+torchaudio==2.3.1
    # via pipecat-ai (pyproject.toml)
-torchvision==0.18.0
+torchvision==0.18.1
    # via timm
 tqdm==4.66.4
    # via
@@ -308,25 +422,35 @@ tqdm==4.66.4
    #   transformers
 transformers==4.40.2
    # via pipecat-ai (pyproject.toml)
-triton==2.3.0
+triton==2.3.1
    # via torch
-typing-extensions==4.11.0
+typing-extensions==4.12.2
    # via
    #   anthropic
    #   anyio
+    #   deepgram-sdk
    #   google-generativeai
    #   huggingface-hub
    #   openai
    #   pipecat-ai (pyproject.toml)
    #   pydantic
    #   pydantic-core
+    #   sqlalchemy
    #   torch
+    #   typing-inspect
+typing-inspect==0.9.0
+    # via dataclasses-json
 uritemplate==4.1.1
    # via google-api-python-client
 urllib3==2.2.1
    # via requests
+verboselogs==1.7
+    # via deepgram-sdk
 websockets==12.0
-    # via pipecat-ai (pyproject.toml)
+    # via
+    #   cartesia
+    #   deepgram-sdk
+    #   pipecat-ai (pyproject.toml)
 werkzeug==3.0.3
    # via flask
 yarl==1.9.4
--- a/macos-py3.10-requirements.txt
+++ b/macos-py3.10-requirements.txt
@@ -4,25 +4,36 @@
 #
 #    pip-compile --all-extras pyproject.toml
 #
+aiofiles==23.2.1
+    # via deepgram-sdk
 aiohttp==3.9.5
    # via
    #   cartesia
+    #   deepgram-sdk
+    #   langchain
+    #   langchain-community
    #   pipecat-ai (pyproject.toml)
 aiosignal==1.3.1
    # via aiohttp
 annotated-types==0.7.0
    # via pydantic
 anthropic==0.25.9
-    # via pipecat-ai (pyproject.toml)
+    # via
+    #   openpipe
+    #   pipecat-ai (pyproject.toml)
 anyio==4.4.0
    # via
    #   anthropic
    #   httpx
    #   openai
 async-timeout==4.0.3
-    # via aiohttp
+    # via
+    #   aiohttp
+    #   langchain
 attrs==23.2.0
-    # via aiohttp
+    # via
+    #   aiohttp
+    #   openpipe
 av==12.1.0
    # via faster-whisper
 azure-cognitiveservices-speech==1.37.0
@@ -31,9 +42,9 @@ blinker==1.8.2
    # via flask
 cachetools==5.3.3
    # via google-auth
-cartesia==0.1.0
+cartesia==0.1.1
    # via pipecat-ai (pyproject.toml)
-certifi==2024.2.2
+certifi==2024.6.2
    # via
    #   httpcore
    #   httpx
@@ -46,10 +57,16 @@ click==8.1.7
    # via flask
 coloredlogs==15.0.1
    # via onnxruntime
-ctranslate2==4.2.1
+ctranslate2==4.3.1
    # via faster-whisper
 daily-python==0.9.1
    # via pipecat-ai (pyproject.toml)
+dataclasses-json==0.6.7
+    # via
+    #   deepgram-sdk
+    #   langchain-community
+deepgram-sdk==3.2.7
+    # via pipecat-ai (pyproject.toml)
 distro==1.9.0
    # via
    #   anthropic
@@ -64,7 +81,7 @@ fal-client==0.4.0
    # via pipecat-ai (pyproject.toml)
 faster-whisper==1.0.2
    # via pipecat-ai (pyproject.toml)
-filelock==3.14.0
+filelock==3.15.1
    # via
    #   huggingface-hub
    #   pyht
@@ -82,7 +99,7 @@ frozenlist==1.4.1
    # via
    #   aiohttp
    #   aiosignal
-fsspec==2024.5.0
+fsspec==2024.6.0
    # via
    #   huggingface-hub
    #   torch
@@ -95,9 +112,9 @@ google-api-core[grpc]==2.19.0
    #   google-ai-generativelanguage
    #   google-api-python-client
    #   google-generativeai
-google-api-python-client==2.131.0
+google-api-python-client==2.133.0
    # via google-generativeai
-google-auth==2.29.0
+google-auth==2.30.0
    # via
    #   google-ai-generativelanguage
    #   google-api-core
@@ -108,11 +125,11 @@ google-auth-httplib2==0.2.0
    # via google-api-python-client
 google-generativeai==0.5.4
    # via pipecat-ai (pyproject.toml)
-googleapis-common-protos==1.63.0
+googleapis-common-protos==1.63.1
    # via
    #   google-api-core
    #   grpcio-status
-grpcio==1.64.0
+grpcio==1.64.1
    # via
    #   google-api-core
    #   grpcio-status
@@ -131,11 +148,13 @@ httpx==0.27.0
    # via
    #   anthropic
    #   cartesia
+    #   deepgram-sdk
    #   fal-client
    #   openai
+    #   openpipe
 httpx-sse==0.4.0
    # via fal-client
-huggingface-hub==0.23.2
+huggingface-hub==0.23.3
    # via
    #   faster-whisper
    #   timm
@@ -157,23 +176,54 @@ jinja2==3.1.4
    # via
    #   flask
    #   torch
+jsonpatch==1.33
+    # via langchain-core
+jsonpointer==3.0.0
+    # via jsonpatch
+langchain==0.2.3
+    # via
+    #   langchain-community
+    #   pipecat-ai (pyproject.toml)
+langchain-community==0.2.4
+    # via pipecat-ai (pyproject.toml)
+langchain-core==0.2.5
+    # via
+    #   langchain
+    #   langchain-community
+    #   langchain-openai
+    #   langchain-text-splitters
+langchain-openai==0.1.8
+    # via pipecat-ai (pyproject.toml)
+langchain-text-splitters==0.2.1
+    # via langchain
+langsmith==0.1.77
+    # via
+    #   langchain
+    #   langchain-community
+    #   langchain-core
 loguru==0.7.2
    # via pipecat-ai (pyproject.toml)
 markupsafe==2.1.5
    # via
    #   jinja2
    #   werkzeug
+marshmallow==3.21.3
+    # via dataclasses-json
 mpmath==1.3.0
    # via sympy
 multidict==6.0.5
    # via
    #   aiohttp
    #   yarl
+mypy-extensions==1.0.0
+    # via typing-inspect
 networkx==3.3
    # via torch
 numpy==1.26.4
    # via
    #   ctranslate2
+    #   langchain
+    #   langchain-community
    #   onnxruntime
    #   pipecat-ai (pyproject.toml)
    #   pyloudnorm
@@ -183,10 +233,19 @@ numpy==1.26.4
 onnxruntime==1.18.0
    # via faster-whisper
 openai==1.26.0
+    # via
+    #   langchain-openai
+    #   openpipe
+    #   pipecat-ai (pyproject.toml)
+openpipe==4.14.0
    # via pipecat-ai (pyproject.toml)
-packaging==24.0
+orjson==3.10.4
+    # via langsmith
+packaging==23.2
    # via
    #   huggingface-hub
+    #   langchain-core
+    #   marshmallow
    #   onnxruntime
    #   pytest
    #   transformers
@@ -208,6 +267,7 @@ protobuf==4.25.3
    #   googleapis-common-protos
    #   grpcio-status
    #   onnxruntime
+    #   pipecat-ai (pyproject.toml)
    #   proto-plus
    #   pyht
 pyasn1==0.6.0
@@ -220,12 +280,15 @@ pyaudio==0.2.14
    # via pipecat-ai (pyproject.toml)
 pycparser==2.22
    # via cffi
-pydantic==2.7.2
+pydantic==2.7.4
    # via
    #   anthropic
    #   google-generativeai
+    #   langchain
+    #   langchain-core
+    #   langsmith
    #   openai
-pydantic-core==2.18.3
+pydantic-core==2.18.4
    # via pydantic
 pyht==0.0.28
    # via pipecat-ai (pyproject.toml)
@@ -233,26 +296,37 @@ pyloudnorm==0.1.1
    # via pipecat-ai (pyproject.toml)
 pyparsing==3.1.2
    # via httplib2
-pytest==8.2.1
+pytest==8.2.2
    # via pytest-asyncio
 pytest-asyncio==0.23.7
    # via cartesia
+python-dateutil==2.9.0.post0
+    # via openpipe
 python-dotenv==1.0.1
    # via pipecat-ai (pyproject.toml)
 pyyaml==6.0.1
    # via
    #   ctranslate2
    #   huggingface-hub
+    #   langchain
+    #   langchain-community
+    #   langchain-core
    #   timm
    #   transformers
 regex==2024.5.15
-    # via transformers
+    # via
+    #   tiktoken
+    #   transformers
 requests==2.32.3
    # via
    #   cartesia
    #   google-api-core
    #   huggingface-hub
+    #   langchain
+    #   langchain-community
+    #   langsmith
    #   pyht
+    #   tiktoken
    #   transformers
 rsa==4.9
    # via google-auth
@@ -262,6 +336,8 @@ safetensors==0.4.3
    #   transformers
 scipy==1.13.1
    # via pyloudnorm
+six==1.16.0
+    # via python-dateutil
 sniffio==1.3.1
    # via
    #   anthropic
@@ -270,10 +346,21 @@ sniffio==1.3.1
    #   openai
 sounddevice==0.4.7
    # via pipecat-ai (pyproject.toml)
+sqlalchemy==2.0.30
+    # via
+    #   langchain
+    #   langchain-community
 sympy==1.12.1
    # via
    #   onnxruntime
    #   torch
+tenacity==8.3.0
+    # via
+    #   langchain
+    #   langchain-community
+    #   langchain-core
+tiktoken==0.7.0
+    # via langchain-openai
 timm==0.9.16
    # via pipecat-ai (pyproject.toml)
 tokenizers==0.19.1
@@ -283,15 +370,15 @@ tokenizers==0.19.1
    #   transformers
 tomli==2.0.1
    # via pytest
-torch==2.3.0
+torch==2.3.1
    # via
    #   pipecat-ai (pyproject.toml)
    #   timm
    #   torchaudio
    #   torchvision
-torchaudio==2.3.0
+torchaudio==2.3.1
    # via pipecat-ai (pyproject.toml)
-torchvision==0.18.0
+torchvision==0.18.1
    # via timm
 tqdm==4.66.4
    # via
@@ -301,24 +388,32 @@ tqdm==4.66.4
    #   transformers
 transformers==4.40.2
    # via pipecat-ai (pyproject.toml)
-typing-extensions==4.11.0
+typing-extensions==4.12.2
    # via
    #   anthropic
    #   anyio
+    #   deepgram-sdk
    #   google-generativeai
    #   huggingface-hub
    #   openai
    #   pipecat-ai (pyproject.toml)
    #   pydantic
    #   pydantic-core
+    #   sqlalchemy
    #   torch
+    #   typing-inspect
+typing-inspect==0.9.0
+    # via dataclasses-json
 uritemplate==4.1.1
    # via google-api-python-client
 urllib3==2.2.1
    # via requests
+verboselogs==1.7
+    # via deepgram-sdk
 websockets==12.0
    # via
    #   cartesia
+    #   deepgram-sdk
    #   pipecat-ai (pyproject.toml)
 werkzeug==3.0.3
    # via flask
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -24,8 +24,9 @@ dependencies = [
    "numpy~=1.26.4",
    "loguru~=0.7.0",
    "Pillow~=10.3.0",
+    "protobuf~=4.25.3",
    "pyloudnorm~=0.1.1",
-    "typing-extensions~=4.11.0",
+    "typing-extensions~=4.12.1",
 ]

 [project.urls]
@@ -37,13 +38,16 @@ anthropic = [ "anthropic~=0.25.7" ]
 azure = [ "azure-cognitiveservices-speech~=1.37.0" ]
 cartesia = [ "numpy~=1.26.0", "sounddevice", "cartesia" ]
 daily = [ "daily-python~=0.9.0" ]
+deepgram = [ "deepgram-sdk~=3.2.7" ]
 examples = [ "python-dotenv~=1.0.0", "flask~=3.0.3", "flask_cors~=4.0.1" ]
 fal = [ "fal-client~=0.4.0" ]
 google = [ "google-generativeai~=0.5.3" ]
 fireworks = [ "openai~=1.26.0" ]
+langchain = [ "langchain~=0.2.1", "langchain-community~=0.2.1", "langchain-openai~=0.1.8" ]
 local = [ "pyaudio~=0.2.0" ]
 moondream = [ "einops~=0.8.0", "timm~=0.9.16", "transformers~=4.40.2" ]
 openai = [ "openai~=1.26.0" ]
+openpipe = [ "openpipe~=4.14.0" ]
 playht = [ "pyht~=0.0.28" ]
 silero = [ "torch~=2.3.0", "torchaudio~=2.3.0" ]
 websocket = [ "websockets~=12.0" ]
--- a/src/pipecat/frames/frames.proto
+++ b/src/pipecat/frames/frames.proto
@@ -4,28 +4,40 @@
 // SPDX-License-Identifier: BSD 2-Clause License
 //

+// Generate frames_pb2.py with:
+//
+//   python -m grpc_tools.protoc --proto_path=./ --python_out=./protobufs frames.proto
+
 syntax = "proto3";

-package pipecat_proto;
+package pipecat;

 message TextFrame {
-    string text = 1;
+  uint64 id = 1;
+  string name = 2;
+  string text = 3;
 }

-message AudioFrame {
-    bytes data = 1;
+message AudioRawFrame {
+  uint64 id = 1;
+  string name = 2;
+  bytes audio = 3;
+  uint32 sample_rate = 4;
+  uint32 num_channels = 5;
 }

 message TranscriptionFrame {
-    string text = 1;
-    string participantId = 2;
-    string timestamp = 3;
+  uint64 id = 1;
+  string name = 2;
+  string text = 3;
+  string user_id = 4;
+  string timestamp = 5;
 }

 message Frame {
-    oneof frame {
-        TextFrame text = 1;
-        AudioFrame audio = 2;
-        TranscriptionFrame transcription = 3;
-    }
+  oneof frame {
+    TextFrame text = 1;
+    AudioRawFrame audio = 2;
+    TranscriptionFrame transcription = 3;
+  }
 }
--- a/src/pipecat/frames/frames.py
+++ b/src/pipecat/frames/frames.py
@@ -4,7 +4,7 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-from typing import Any, List, Tuple
+from typing import Any, List, Mapping, Tuple

 from dataclasses import dataclass, field

@@ -188,6 +188,8 @@ class SystemFrame(Frame):
 class StartFrame(SystemFrame):
    """This is the first frame that should be pushed down a pipeline."""
    allow_interruptions: bool = False
+    enable_metrics: bool = False
+    report_only_initial_ttfb: bool = False


@dataclass
@@ -238,6 +240,13 @@ class StopInterruptionFrame(SystemFrame):
    pass


+@dataclass
+class MetricsFrame(SystemFrame):
+    """Emitted by processors that can compute metrics such as latencies.
+    """
+    ttfb: Mapping[str, float]
+
+
 #
 # Control frames
 #
--- a/src/pipecat/frames/protobufs/frames_pb2.py
+++ b/src/pipecat/frames/protobufs/frames_pb2.py
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-
 # Generated by the protocol buffer compiler.  DO NOT EDIT!
 # source: frames.proto
-# Protobuf Python Version: 4.25.3
+# Protobuf Python Version: 4.25.1
 """Generated protocol buffer code."""
 from google.protobuf import descriptor as _descriptor
 from google.protobuf import descriptor_pool as _descriptor_pool
@@ -14,19 +14,19 @@ _sym_db = _symbol_database.Default()



-DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x0c\x66rames.proto\x12\rpipecat_proto\"\x19\n\tTextFrame\x12\x0c\n\x04text\x18\x01 \x01(\t\"\x1a\n\nAudioFrame\x12\x0c\n\x04\x64\x61ta\x18\x01 \x01(\x0c\"L\n\x12TranscriptionFrame\x12\x0c\n\x04text\x18\x01 \x01(\t\x12\x15\n\rparticipantId\x18\x02 \x01(\t\x12\x11\n\ttimestamp\x18\x03 \x01(\t\"\xa2\x01\n\x05\x46rame\x12(\n\x04text\x18\x01 \x01(\x0b\x32\x18.pipecat_proto.TextFrameH\x00\x12*\n\x05\x61udio\x18\x02 \x01(\x0b\x32\x19.pipecat_proto.AudioFrameH\x00\x12:\n\rtranscription\x18\x03 \x01(\x0b\x32!.pipecat_proto.TranscriptionFrameH\x00\x42\x07\n\x05\x66rameb\x06proto3')
+DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x0c\x66rames.proto\x12\x07pipecat\"3\n\tTextFrame\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x0c\n\x04name\x18\x02 \x01(\t\x12\x0c\n\x04text\x18\x03 \x01(\t\"c\n\rAudioRawFrame\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x0c\n\x04name\x18\x02 \x01(\t\x12\r\n\x05\x61udio\x18\x03 \x01(\x0c\x12\x13\n\x0bsample_rate\x18\x04 \x01(\r\x12\x14\n\x0cnum_channels\x18\x05 \x01(\r\"`\n\x12TranscriptionFrame\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x0c\n\x04name\x18\x02 \x01(\t\x12\x0c\n\x04text\x18\x03 \x01(\t\x12\x0f\n\x07user_id\x18\x04 \x01(\t\x12\x11\n\ttimestamp\x18\x05 \x01(\t\"\x93\x01\n\x05\x46rame\x12\"\n\x04text\x18\x01 \x01(\x0b\x32\x12.pipecat.TextFrameH\x00\x12\'\n\x05\x61udio\x18\x02 \x01(\x0b\x32\x16.pipecat.AudioRawFrameH\x00\x12\x34\n\rtranscription\x18\x03 \x01(\x0b\x32\x1b.pipecat.TranscriptionFrameH\x00\x42\x07\n\x05\x66rameb\x06proto3')

 _globals = globals()
 _builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, _globals)
 _builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'frames_pb2', _globals)
 if _descriptor._USE_C_DESCRIPTORS == False:
  DESCRIPTOR._options = None
-  _globals['_TEXTFRAME']._serialized_start=31
-  _globals['_TEXTFRAME']._serialized_end=56
-  _globals['_AUDIOFRAME']._serialized_start=58
-  _globals['_AUDIOFRAME']._serialized_end=84
-  _globals['_TRANSCRIPTIONFRAME']._serialized_start=86
-  _globals['_TRANSCRIPTIONFRAME']._serialized_end=162
-  _globals['_FRAME']._serialized_start=165
-  _globals['_FRAME']._serialized_end=327
+  _globals['_TEXTFRAME']._serialized_start=25
+  _globals['_TEXTFRAME']._serialized_end=76
+  _globals['_AUDIORAWFRAME']._serialized_start=78
+  _globals['_AUDIORAWFRAME']._serialized_end=177
+  _globals['_TRANSCRIPTIONFRAME']._serialized_start=179
+  _globals['_TRANSCRIPTIONFRAME']._serialized_end=275
+  _globals['_FRAME']._serialized_start=278
+  _globals['_FRAME']._serialized_end=425
 # @@protoc_insertion_point(module_scope)
--- a/src/pipecat/pipeline/base_pipeline.py
+++ b/src/pipecat/pipeline/base_pipeline.py
@@ -0,0 +1,21 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+from abc import abstractmethod
+
+from typing import List
+
+from pipecat.processors.frame_processor import FrameProcessor
+
+
+class BasePipeline(FrameProcessor):
+
+    def __init__(self):
+        super().__init__()
+
+    @abstractmethod
+    def processors_with_metrics(self) -> List[FrameProcessor]:
+        pass
--- a/src/pipecat/pipeline/parallel_pipeline.py
+++ b/src/pipecat/pipeline/parallel_pipeline.py
@@ -6,6 +6,10 @@

 import asyncio

+from itertools import chain
+from typing import List
+
+from pipecat.pipeline.base_pipeline import BasePipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.frames.frames import CancelFrame, EndFrame, Frame, StartFrame
@@ -20,6 +24,8 @@ class Source(FrameProcessor):
        self._up_queue = upstream_queue

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        match direction:
            case FrameDirection.UPSTREAM:
                await self._up_queue.put(frame)
@@ -34,6 +40,8 @@ class Sink(FrameProcessor):
        self._down_queue = downstream_queue

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        match direction:
            case FrameDirection.UPSTREAM:
                await self.push_frame(frame, direction)
@@ -41,7 +49,7 @@ class Sink(FrameProcessor):
                await self._down_queue.put(frame)


-class ParallelPipeline(FrameProcessor):
+class ParallelPipeline(BasePipeline):
    def __init__(self, *args):
        super().__init__()

@@ -77,6 +85,13 @@ class ParallelPipeline(FrameProcessor):

        logger.debug(f"Finished creating {self} pipelines")

+    #
+    # BasePipeline
+    #
+
+    def processors_with_metrics(self) -> List[FrameProcessor]:
+        return list(chain.from_iterable(p.processors_with_metrics() for p in self._pipelines))
+
    #
    # Frame processor
    #
@@ -90,6 +105,8 @@ class ParallelPipeline(FrameProcessor):
        self._down_task = loop.create_task(self._process_down_queue())

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, StartFrame):
            await self._start_tasks()

--- a/src/pipecat/processors/aggregators/parallel_task.py
+++ b/src/pipecat/processors/aggregators/parallel_task.py
@@ -6,8 +6,10 @@

 import asyncio

+from itertools import chain
 from typing import List

+from pipecat.pipeline.base_pipeline import BasePipeline
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.frames.frames import Frame
@@ -22,6 +24,8 @@ class Source(FrameProcessor):
        self._up_queue = upstream_queue

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        match direction:
            case FrameDirection.UPSTREAM:
                await self._up_queue.put(frame)
@@ -36,6 +40,8 @@ class Sink(FrameProcessor):
        self._down_queue = downstream_queue

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        match direction:
            case FrameDirection.UPSTREAM:
                await self.push_frame(frame, direction)
@@ -43,7 +49,7 @@ class Sink(FrameProcessor):
                await self._down_queue.put(frame)


-class ParallelTask(FrameProcessor):
+class ParallelTask(BasePipeline):
    def __init__(self, *args):
        super().__init__()

@@ -75,11 +81,20 @@ class ParallelTask(FrameProcessor):
            self._pipelines.append(pipeline)
        logger.debug(f"Finished creating {self} pipelines")

+    #
+    # BasePipeline
+    #
+
+    def processors_with_metrics(self) -> List[FrameProcessor]:
+        return list(chain.from_iterable(p.processors_with_metrics() for p in self._pipelines))
+
    #
    # Frame processor
    #

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if direction == FrameDirection.UPSTREAM:
            # If we get an upstream frame we process it in each sink.
            await asyncio.gather(*[s.process_frame(frame, direction) for s in self._sinks])
--- a/src/pipecat/pipeline/pipeline.py
+++ b/src/pipecat/pipeline/pipeline.py
@@ -4,11 +4,10 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-import asyncio
-
 from typing import Callable, Coroutine, List

 from pipecat.frames.frames import Frame
+from pipecat.pipeline.base_pipeline import BasePipeline
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


@@ -19,6 +18,8 @@ class PipelineSource(FrameProcessor):
        self._upstream_push_frame = upstream_push_frame

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        match direction:
            case FrameDirection.UPSTREAM:
                await self._upstream_push_frame(frame, direction)
@@ -33,6 +34,8 @@ class PipelineSink(FrameProcessor):
        self._downstream_push_frame = downstream_push_frame

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        match direction:
            case FrameDirection.UPSTREAM:
                await self.push_frame(frame, direction)
@@ -40,7 +43,7 @@ class PipelineSink(FrameProcessor):
                await self._downstream_push_frame(frame, direction)


-class Pipeline(FrameProcessor):
+class Pipeline(BasePipeline):

    def __init__(self, processors: List[FrameProcessor]):
        super().__init__()
@@ -53,6 +56,19 @@ class Pipeline(FrameProcessor):

        self._link_processors()

+    #
+    # BasePipeline
+    #
+
+    def processors_with_metrics(self):
+        services = []
+        for p in self._processors:
+            if isinstance(p, BasePipeline):
+                services += p.processors_with_metrics()
+            elif p.can_generate_metrics():
+                services.append(p)
+        return services
+
    #
    # Frame processor
    #
@@ -61,13 +77,16 @@ class Pipeline(FrameProcessor):
        await self._cleanup_processors()

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if direction == FrameDirection.DOWNSTREAM:
            await self._source.process_frame(frame, FrameDirection.DOWNSTREAM)
        elif direction == FrameDirection.UPSTREAM:
            await self._sink.process_frame(frame, FrameDirection.UPSTREAM)

    async def _cleanup_processors(self):
-        await asyncio.gather(*[p.cleanup() for p in self._processors])
+        for p in self._processors:
+            await p.cleanup()

    def _link_processors(self):
        prev = self._processors[0]
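The `processors_with_metrics()` walk added to `Pipeline` flattens nested pipelines into a single list of metric-capable processors. The following is a minimal standalone sketch of that recursion. `FakeProcessor` and `FakePipeline` are hypothetical stand-ins, not pipecat classes; they only mimic the `can_generate_metrics()` / `processors_with_metrics()` contract visible in this diff.

```python
from typing import List


class FakeProcessor:
    """Hypothetical stand-in for FrameProcessor (not the pipecat class)."""

    def __init__(self, name: str, generates_metrics: bool = False):
        self.name = name
        self._generates_metrics = generates_metrics

    def can_generate_metrics(self) -> bool:
        return self._generates_metrics


class FakePipeline(FakeProcessor):
    """Hypothetical stand-in for Pipeline; it only holds child processors."""

    def __init__(self, name: str, processors: List[FakeProcessor]):
        super().__init__(name)
        self._processors = processors

    def processors_with_metrics(self) -> List[FakeProcessor]:
        found: List[FakeProcessor] = []
        for p in self._processors:
            if isinstance(p, FakePipeline):
                # Nested pipeline: recurse and splice its results in order.
                found += p.processors_with_metrics()
            elif p.can_generate_metrics():
                found.append(p)
        return found


inner = FakePipeline("inner", [FakeProcessor("tts", True), FakeProcessor("logger")])
outer = FakePipeline("outer", [FakeProcessor("llm", True), inner])
names = [p.name for p in outer.processors_with_metrics()]
print(names)
```

Processors that report no metrics (here the hypothetical `logger`) are dropped, and nested results keep pipeline order.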
--- a/src/pipecat/pipeline/runner.py
+++ b/src/pipecat/pipeline/runner.py
@@ -20,18 +20,15 @@ class PipelineRunner:
        self.name: str = name or f"{self.__class__.__name__}#{obj_count(self)}"

        self._tasks = {}
-        self._running = True

        if handle_sigint:
            self._setup_sigint()

    async def run(self, task: PipelineTask):
        logger.debug(f"Runner {self} started running {task}")
-        self._running = True
        self._tasks[task.name] = task
        await task.run()
        del self._tasks[task.name]
-        self._running = False
        logger.debug(f"Runner {self} finished running {task}")

    async def stop_when_done(self):
@@ -42,18 +39,19 @@ class PipelineRunner:
        logger.debug(f"Canceling runner {self}")
        await asyncio.gather(*[t.cancel() for t in self._tasks.values()])

-    def is_active(self):
-        return self._running
-
    def _setup_sigint(self):
        loop = asyncio.get_running_loop()
        loop.add_signal_handler(
            signal.SIGINT,
-            lambda *args: asyncio.create_task(self._sigint_handler())
+            lambda *args: asyncio.create_task(self._sig_handler())
+        )
+        loop.add_signal_handler(
+            signal.SIGTERM,
+            lambda *args: asyncio.create_task(self._sig_handler())
        )

-    async def _sigint_handler(self):
-        logger.warning(f"Ctrl-C detected. Canceling runner {self}")
+    async def _sig_handler(self):
+        logger.warning(f"Interruption detected. Canceling runner {self}")
        await self.cancel()

    def __str__(self):
--- a/src/pipecat/pipeline/task.py
+++ b/src/pipecat/pipeline/task.py
@@ -10,7 +10,8 @@ from typing import AsyncIterable, Iterable

 from pydantic import BaseModel

-from pipecat.frames.frames import CancelFrame, EndFrame, ErrorFrame, Frame, StartFrame, StopTaskFrame
+from pipecat.frames.frames import CancelFrame, EndFrame, ErrorFrame, Frame, MetricsFrame, StartFrame, StopTaskFrame
+from pipecat.pipeline.base_pipeline import BasePipeline
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.utils.utils import obj_count, obj_id

@@ -19,6 +20,8 @@ from loguru import logger

 class PipelineParams(BaseModel):
    allow_interruptions: bool = False
+    enable_metrics: bool = False
+    report_only_initial_ttfb: bool = False


 class Source(FrameProcessor):
@@ -28,6 +31,8 @@ class Source(FrameProcessor):
        self._up_queue = up_queue

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        match direction:
            case FrameDirection.UPSTREAM:
                await self._up_queue.put(frame)
@@ -37,12 +42,13 @@ class Source(FrameProcessor):

 class PipelineTask:

-    def __init__(self, pipeline: FrameProcessor, params: PipelineParams = PipelineParams()):
+    def __init__(self, pipeline: BasePipeline, params: PipelineParams = PipelineParams()):
        self.id: int = obj_id()
        self.name: str = f"{self.__class__.__name__}#{obj_count(self)}"

        self._pipeline = pipeline
        self._params = params
+        self._finished = False

        self._down_queue = asyncio.Queue()
        self._up_queue = asyncio.Queue()
@@ -50,6 +56,9 @@ class PipelineTask:
        self._source = Source(self._up_queue)
        self._source.link(pipeline)

+    def has_finished(self):
+        return self._finished
+
    async def stop_when_done(self):
        logger.debug(f"Task {self} scheduled to stop when done")
        await self.queue_frame(EndFrame())
@@ -67,6 +76,7 @@ class PipelineTask:
        self._process_up_task = asyncio.create_task(self._process_up_queue())
        self._process_down_task = asyncio.create_task(self._process_down_queue())
        await asyncio.gather(self._process_up_task, self._process_down_task)
+        self._finished = True

    async def queue_frame(self, frame: Frame):
        await self._down_queue.put(frame)
@@ -81,9 +91,20 @@ class PipelineTask:
        else:
            raise Exception("Frames must be an iterable or async iterable")

+    def _initial_metrics_frame(self) -> MetricsFrame:
+        processors = self._pipeline.processors_with_metrics()
+        ttfb = {p.name: 0 for p in processors}
+        return MetricsFrame(ttfb=ttfb)
+
    async def _process_down_queue(self):
-        await self._source.process_frame(
-            StartFrame(allow_interruptions=self._params.allow_interruptions), FrameDirection.DOWNSTREAM)
+        start_frame = StartFrame(
+            allow_interruptions=self._params.allow_interruptions,
+            enable_metrics=self._params.enable_metrics,
+            report_only_initial_ttfb=self._params.report_only_initial_ttfb
+        )
+        await self._source.process_frame(start_frame, FrameDirection.DOWNSTREAM)
+        await self._source.process_frame(self._initial_metrics_frame(), FrameDirection.DOWNSTREAM)
+
        running = True
        should_cleanup = True
        while running:
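Right after the `StartFrame`, the task pushes one `MetricsFrame` whose `ttfb` mapping holds a zero entry for every metric-capable processor, so a consumer presumably learns the full set of names before any real measurement arrives. A small sketch of that seeding step, assuming only a list of processor names; `FakeMetricsFrame` and `initial_metrics_frame` are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeMetricsFrame:
    """Hypothetical stand-in for MetricsFrame (not the pipecat class)."""
    ttfb: Dict[str, float]


def initial_metrics_frame(processor_names: List[str]) -> FakeMetricsFrame:
    # One zeroed TTFB entry per processor, mirroring _initial_metrics_frame().
    return FakeMetricsFrame(ttfb=dict.fromkeys(processor_names, 0.0))


frame = initial_metrics_frame(["OpenAILLMService#0", "DeepgramSTTService#0"])
print(frame.ttfb)
```

The sketch uses `0.0` rather than `0` only to match the `Mapping[str, float]` annotation on `MetricsFrame.ttfb`; the two compare equal.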
--- a/src/pipecat/processors/aggregators/gated.py
+++ b/src/pipecat/processors/aggregators/gated.py
@@ -48,6 +48,8 @@ class GatedAggregator(FrameProcessor):
        self._accumulator: List[Frame] = []

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        # We must not block system frames.
        if isinstance(frame, SystemFrame):
            await self.push_frame(frame, direction)
--- a/src/pipecat/processors/aggregators/llm_response.py
+++ b/src/pipecat/processors/aggregators/llm_response.py
@@ -71,6 +71,8 @@ class LLMResponseAggregator(FrameProcessor):
    #    S I T E -> X
    #    S I E T -> X
    #  S I E I T -> X
+    #      S E T -> X
+    #    S E I T -> X
    #
    # The following case would not be supported:
    #
@@ -79,6 +81,8 @@ class LLMResponseAggregator(FrameProcessor):
    # and T2 would be dropped.

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        send_aggregation = False

        if isinstance(frame, self._start_frame):
@@ -87,6 +91,7 @@ class LLMResponseAggregator(FrameProcessor):
            self._seen_start_frame = True
            self._seen_end_frame = False
            self._seen_interim_results = False
+            await self.push_frame(frame, direction)
        elif isinstance(frame, self._end_frame):
            self._seen_end_frame = True
            self._seen_start_frame = False
@@ -94,11 +99,12 @@ class LLMResponseAggregator(FrameProcessor):
            # We might have received the end frame but we might still be
            # aggregating (i.e. we have seen interim results but not the final
            # text).
-            self._aggregating = self._seen_interim_results
+            self._aggregating = self._seen_interim_results or len(self._aggregation) == 0

            # Send the aggregation if we are not aggregating anymore (i.e. no
            # more interim results received).
            send_aggregation = not self._aggregating
+            await self.push_frame(frame, direction)
        elif isinstance(frame, self._accumulator_frame):
            if self._aggregating:
                self._aggregation += f" {frame.text}"
@@ -207,6 +213,8 @@ class LLMFullResponseAggregator(FrameProcessor):
        self._aggregation = ""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TextFrame):
            self._aggregation += frame.text
        elif isinstance(frame, LLMFullResponseEndFrame):
--- a/src/pipecat/processors/aggregators/sentence.py
+++ b/src/pipecat/processors/aggregators/sentence.py
@@ -33,6 +33,8 @@ class SentenceAggregator(FrameProcessor):
        self._aggregation = ""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        # We ignore interim description at this point.
        if isinstance(frame, InterimTranscriptionFrame):
            return
--- a/src/pipecat/processors/aggregators/user_response.py
+++ b/src/pipecat/processors/aggregators/user_response.py
@@ -74,6 +74,8 @@ class ResponseAggregator(FrameProcessor):
    #    S I T E -> X
    #    S I E T -> X
    #  S I E I T -> X
+    #      S E T -> X
+    #    S E I T -> X
    #
    # The following case would not be supported:
    #
@@ -82,6 +84,8 @@ class ResponseAggregator(FrameProcessor):
    # and T2 would be dropped.

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        send_aggregation = False

        if isinstance(frame, self._start_frame):
@@ -89,6 +93,7 @@ class ResponseAggregator(FrameProcessor):
            self._seen_start_frame = True
            self._seen_end_frame = False
            self._seen_interim_results = False
+            await self.push_frame(frame, direction)
        elif isinstance(frame, self._end_frame):
            self._seen_end_frame = True
            self._seen_start_frame = False
@@ -96,11 +101,12 @@ class ResponseAggregator(FrameProcessor):
            # We might have received the end frame but we might still be
            # aggregating (i.e. we have seen interim results but not the final
            # text).
-            self._aggregating = self._seen_interim_results
+            self._aggregating = self._seen_interim_results or len(self._aggregation) == 0

            # Send the aggregation if we are not aggregating anymore (i.e. no
            # more interim results received).
            send_aggregation = not self._aggregating
+            await self.push_frame(frame, direction)
        elif isinstance(frame, self._accumulator_frame):
            if self._aggregating:
                self._aggregation += f" {frame.text}"
--- a/src/pipecat/processors/aggregators/vision_image_frame.py
+++ b/src/pipecat/processors/aggregators/vision_image_frame.py
@@ -30,6 +30,8 @@ class VisionImageFrameAggregator(FrameProcessor):
        self._describe_text = None

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TextFrame):
            self._describe_text = frame.text
        elif isinstance(frame, ImageRawFrame):
--- a/src/pipecat/processors/filters/__init__.py
+++ b/src/pipecat/processors/filters/__init__.py
--- a/src/pipecat/processors/filters/frame_filter.py
+++ b/src/pipecat/processors/filters/frame_filter.py
@@ -30,5 +30,7 @@ class FrameFilter(FrameProcessor):
                or isinstance(frame, SystemFrame))

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if self._should_passthrough_frame(frame):
            await self.push_frame(frame, direction)
--- a/src/pipecat/processors/filters/function_filter.py
+++ b/src/pipecat/processors/filters/function_filter.py
@@ -0,0 +1,32 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+from typing import Awaitable, Callable
+
+from pipecat.frames.frames import Frame, SystemFrame
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+
+
+class FunctionFilter(FrameProcessor):
+
+    def __init__(self, filter: Callable[[Frame], Awaitable[bool]]):
+        super().__init__()
+        self._filter = filter
+
+    #
+    # Frame processor
+    #
+
+    def _should_passthrough_frame(self, frame):
+        return isinstance(frame, SystemFrame)
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        passthrough = self._should_passthrough_frame(frame)
+        allowed = await self._filter(frame)
+        if passthrough or allowed:
+            await self.push_frame(frame, direction)
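`FunctionFilter` forwards a frame when it is a `SystemFrame` or when the user-supplied async predicate returns true. Note that the predicate is awaited for every frame, system frames included, so it has to tolerate them. The sketch below reproduces that decision with hypothetical stand-in frame classes and a hypothetical `run_filter` helper; it does not import pipecat.

```python
import asyncio
from typing import Awaitable, Callable, List


class Frame:
    """Hypothetical stand-in for pipecat's Frame."""


class SystemFrame(Frame):
    """Hypothetical stand-in for pipecat's SystemFrame."""


class TextFrame(Frame):
    """Hypothetical stand-in carrying a piece of text."""

    def __init__(self, text: str):
        self.text = text


async def run_filter(
        frames: List[Frame],
        predicate: Callable[[Frame], Awaitable[bool]]) -> List[Frame]:
    passed: List[Frame] = []
    for frame in frames:
        # The predicate runs for every frame, as in FunctionFilter.
        allowed = await predicate(frame)
        # System frames pass regardless of the predicate's answer.
        if isinstance(frame, SystemFrame) or allowed:
            passed.append(frame)
    return passed


async def mentions_hello(frame: Frame) -> bool:
    return isinstance(frame, TextFrame) and "hello" in frame.text


frames = [SystemFrame(), TextFrame("hello there"), TextFrame("goodbye")]
kept = asyncio.run(run_filter(frames, mentions_hello))
print([type(f).__name__ for f in kept])
```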
--- a/src/pipecat/processors/filters/wake_check_filter.py
+++ b/src/pipecat/processors/filters/wake_check_filter.py
@@ -43,6 +43,8 @@ class WakeCheckFilter(FrameProcessor):
            self._wake_patterns.append(pattern)

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        try:
            if isinstance(frame, TranscriptionFrame):
                p = self._participant_states.get(frame.user_id)
--- a/src/pipecat/processors/frame_processor.py
+++ b/src/pipecat/processors/frame_processor.py
@@ -5,10 +5,11 @@
 #

 import asyncio
-from asyncio import AbstractEventLoop
+import time
+
 from enum import Enum

-from pipecat.frames.frames import ErrorFrame, Frame
+from pipecat.frames.frames import ErrorFrame, Frame, MetricsFrame, StartFrame, UserStoppedSpeakingFrame
 from pipecat.utils.utils import obj_count, obj_id

 from loguru import logger
@@ -21,12 +22,52 @@ class FrameDirection(Enum):

 class FrameProcessor:

-    def __init__(self):
+    def __init__(
+            self,
+            name: str | None = None,
+            loop: asyncio.AbstractEventLoop | None = None,
+            **kwargs):
        self.id: int = obj_id()
-        self.name = f"{self.__class__.__name__}#{obj_count(self)}"
+        self.name = name or f"{self.__class__.__name__}#{obj_count(self)}"
        self._prev: "FrameProcessor" | None = None
        self._next: "FrameProcessor" | None = None
-        self._loop: AbstractEventLoop = asyncio.get_running_loop()
+        self._loop: asyncio.AbstractEventLoop = loop or asyncio.get_running_loop()
+
+        # Properties
+        self._allow_interruptions = False
+        self._enable_metrics = False
+        self._report_only_initial_ttfb = False
+
+        # Metrics
+        self._start_ttfb_time = 0
+        self._should_report_ttfb = True
+
+    @property
+    def interruptions_allowed(self):
+        return self._allow_interruptions
+
+    @property
+    def metrics_enabled(self):
+        return self._enable_metrics
+
+    @property
+    def report_only_initial_ttfb(self):
+        return self._report_only_initial_ttfb
+
+    def can_generate_metrics(self) -> bool:
+        return False
+
+    async def start_ttfb_metrics(self):
+        if self.metrics_enabled and self._should_report_ttfb:
+            self._start_ttfb_time = time.time()
+            self._should_report_ttfb = not self._report_only_initial_ttfb
+
+    async def stop_ttfb_metrics(self):
+        if self.metrics_enabled and self._start_ttfb_time > 0:
+            ttfb = time.time() - self._start_ttfb_time
+            logger.debug(f"{self.name} TTFB: {ttfb}")
+            await self.push_frame(MetricsFrame(ttfb={self.name: ttfb}))
+            self._start_ttfb_time = 0

    async def cleanup(self):
        pass
@@ -36,11 +77,16 @@ class FrameProcessor:
        processor._prev = self
        logger.debug(f"Linking {self} -> {self._next}")

-    def get_event_loop(self) -> AbstractEventLoop:
+    def get_event_loop(self) -> asyncio.AbstractEventLoop:
        return self._loop

    async def process_frame(self, frame: Frame, direction: FrameDirection):
-        pass
+        if isinstance(frame, StartFrame):
+            self._allow_interruptions = frame.allow_interruptions
+            self._enable_metrics = frame.enable_metrics
+            self._report_only_initial_ttfb = frame.report_only_initial_ttfb
+        elif isinstance(frame, UserStoppedSpeakingFrame):
+            self._should_report_ttfb = True

    async def push_error(self, error: ErrorFrame):
        await self.push_frame(error, FrameDirection.UPSTREAM)
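The TTFB gating added to `FrameProcessor` above can be modelled on its own. The sketch below is a simplified stand-in, not the pipecat class: `TTFBTracker` and its `reported` list are hypothetical names, the `metrics_enabled` check is omitted, and measurements are collected in a list instead of being pushed as a `MetricsFrame`. It only illustrates how `report_only_initial_ttfb` suppresses repeated reports until the user stops speaking again.

```python
import time


class TTFBTracker:
    """Hypothetical stand-alone model of the TTFB gating in FrameProcessor."""

    def __init__(self, report_only_initial_ttfb: bool = False):
        self._report_only_initial_ttfb = report_only_initial_ttfb
        self._start_ttfb_time = 0.0
        self._should_report_ttfb = True
        self.reported: list[float] = []  # stands in for pushed MetricsFrame objects

    def start(self) -> None:
        if self._should_report_ttfb:
            self._start_ttfb_time = time.time()
            # If only the initial TTFB is wanted, disarm until the user speaks again.
            self._should_report_ttfb = not self._report_only_initial_ttfb

    def stop(self) -> None:
        if self._start_ttfb_time > 0:
            self.reported.append(time.time() - self._start_ttfb_time)
            self._start_ttfb_time = 0.0

    def user_stopped_speaking(self) -> None:
        # Mirrors the UserStoppedSpeakingFrame branch: re-arm reporting.
        self._should_report_ttfb = True


tracker = TTFBTracker(report_only_initial_ttfb=True)

# Three generations within one bot turn: only the first one is measured.
for _ in range(3):
    tracker.start()
    tracker.stop()
first_turn = len(tracker.reported)

# A new user turn re-arms the measurement.
tracker.user_stopped_speaking()
tracker.start()
tracker.stop()
second_turn = len(tracker.reported)
```

With `report_only_initial_ttfb=False` every `start()`/`stop()` pair would record a value, which matches the default behaviour in the diff.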
--- a/src/pipecat/processors/frameworks/__init__.py
+++ b/src/pipecat/processors/frameworks/__init__.py
--- a/src/pipecat/processors/frameworks/langchain.py
+++ b/src/pipecat/processors/frameworks/langchain.py
@@ -0,0 +1,79 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+from typing import Union
+
+from pipecat.frames.frames import (
+    Frame,
+    LLMFullResponseEndFrame,
+    LLMFullResponseStartFrame,
+    LLMMessagesFrame,
+    LLMResponseEndFrame,
+    LLMResponseStartFrame,
+    TextFrame)
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+
+from loguru import logger
+
+try:
+    from langchain_core.messages import AIMessageChunk
+    from langchain_core.runnables import Runnable
+except ModuleNotFoundError as e:
+    logger.exception(
+        "In order to use Langchain, you need to `pip install pipecat-ai[langchain]`. "
+    )
+    raise Exception(f"Missing module: {e}")
+
+
+class LangchainProcessor(FrameProcessor):
+    def __init__(self, chain: Runnable, transcript_key: str = "input"):
+        super().__init__()
+        self._chain = chain
+        self._transcript_key = transcript_key
+        self._participant_id: str | None = None
+
+    def set_participant_id(self, participant_id: str):
+        self._participant_id = participant_id
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, LLMMessagesFrame):
+            # Messages are accumulated by the `LLMUserResponseAggregator` in a list of messages.
+            # The last one by the human is the one we want to send to the LLM.
+            logger.debug(f"Got transcription frame {frame}")
+            text: str = frame.messages[-1]["content"]
+
+            await self._ainvoke(text.strip())
+        else:
+            await self.push_frame(frame, direction)
+
+    @staticmethod
+    def __get_token_value(text: Union[str, AIMessageChunk]) -> str:
+        match text:
+            case str():
+                return text
+            case AIMessageChunk():
+                return text.content
+            case _:
+                return ""
+
+    async def _ainvoke(self, text: str):
+        logger.debug(f"Invoking chain with {text}")
+        await self.push_frame(LLMFullResponseStartFrame())
+        try:
+            async for token in self._chain.astream(
+                {self._transcript_key: text},
+                config={"configurable": {"session_id": self._participant_id}},
+            ):
+                await self.push_frame(LLMResponseStartFrame())
+                await self.push_frame(TextFrame(self.__get_token_value(token)))
+                await self.push_frame(LLMResponseEndFrame())
+        except GeneratorExit:
+            logger.warning(f"{self} generator was closed prematurely")
+        except Exception as e:
+            logger.error(f"{self} an unknown error occurred: {e}")
+        await self.push_frame(LLMFullResponseEndFrame())
--- a/src/pipecat/processors/text_transformer.py
+++ b/src/pipecat/processors/text_transformer.py
@@ -27,6 +27,8 @@ class StatelessTextTransformer(FrameProcessor):
        self._transform_fn = transform_fn

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TextFrame):
            result = self._transform_fn(frame.text)
            if isinstance(result, Coroutine):
--- a/src/pipecat/serializers/__init__.py
+++ b/src/pipecat/serializers/__init__.py
--- a/src/pipecat/serializers/abstract_frame_serializer.py
+++ b/src/pipecat/serializers/abstract_frame_serializer.py
@@ -1,16 +0,0 @@
-from abc import abstractmethod
-
-from pipecat.pipeline.frames import Frame
-
-
-class FrameSerializer:
-    def __init__(self):
-        pass
-
-    @abstractmethod
-    def serialize(self, frame: Frame) -> bytes:
-        raise NotImplementedError
-
-    @abstractmethod
-    def deserialize(self, data: bytes) -> Frame:
-        raise NotImplementedError
--- a/src/pipecat/serializers/base_serializer.py
+++ b/src/pipecat/serializers/base_serializer.py
@@ -0,0 +1,20 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+from abc import ABC, abstractmethod
+
+from pipecat.frames.frames import Frame
+
+
+class FrameSerializer(ABC):
+
+    @abstractmethod
+    def serialize(self, frame: Frame) -> bytes:
+        pass
+
+    @abstractmethod
+    def deserialize(self, data: bytes) -> Frame | None:
+        pass
--- a/src/pipecat/serializers/protobuf_serializer.py
+++ b/src/pipecat/serializers/protobuf_serializer.py
@@ -1,14 +1,23 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
 import dataclasses
-from typing import Text
-from pipecat.pipeline.frames import AudioFrame, Frame, TextFrame, TranscriptionFrame
-import pipecat.pipeline.protobufs.frames_pb2 as frame_protos
-from pipecat.serializers.abstract_frame_serializer import FrameSerializer
+
+import pipecat.frames.protobufs.frames_pb2 as frame_protos
+
+from pipecat.frames.frames import AudioRawFrame, Frame, TextFrame, TranscriptionFrame
+from pipecat.serializers.base_serializer import FrameSerializer
+
+from loguru import logger


 class ProtobufFrameSerializer(FrameSerializer):
    SERIALIZABLE_TYPES = {
        TextFrame: "text",
-        AudioFrame: "audio",
+        AudioRawFrame: "audio",
        TranscriptionFrame: "transcription"
    }

@@ -29,9 +38,10 @@ class ProtobufFrameSerializer(FrameSerializer):
            setattr(getattr(proto_frame, proto_optional_name), field.name,
                    getattr(frame, field.name))

-        return proto_frame.SerializeToString()
+        result = proto_frame.SerializeToString()
+        return result

-    def deserialize(self, data: bytes) -> Frame:
+    def deserialize(self, data: bytes) -> Frame | None:
        """Returns a Frame object from a Frame protobuf. Used to convert frames
        passed over the wire as protobufs to Frame objects used in pipelines
        and frame processors.
@@ -53,12 +63,30 @@ class ProtobufFrameSerializer(FrameSerializer):
        proto = frame_protos.Frame.FromString(data)
        which = proto.WhichOneof("frame")
        if which not in self.SERIALIZABLE_FIELDS:
-            raise ValueError(
-                "Proto does not contain a valid frame. You may need to add a new case to ProtobufFrameSerializer.deserialize.")
+            logger.error("Unable to deserialize a valid frame")
+            return None

        class_name = self.SERIALIZABLE_FIELDS[which]
        args = getattr(proto, which)
        args_dict = {}
        for field in proto.DESCRIPTOR.fields_by_name[which].message_type.fields:
            args_dict[field.name] = getattr(args, field.name)
-        return class_name(**args_dict)
+
+        # Remove special fields if needed
+        id = getattr(args, "id")
+        name = getattr(args, "name")
+        if not id:
+            del args_dict["id"]
+        if not name:
+            del args_dict["name"]
+
+        # Create the instance
+        instance = class_name(**args_dict)
+
+        # Set special fields
+        if id:
+            setattr(instance, "id", getattr(args, "id"))
+        if name:
+            setattr(instance, "name", getattr(args, "name"))
+
+        return instance
--- a/src/pipecat/services/ai_services.py
+++ b/src/pipecat/services/ai_services.py
@@ -16,6 +16,7 @@ from pipecat.frames.frames import (
    EndFrame,
    ErrorFrame,
    Frame,
+    StartFrame,
    TTSStartedFrame,
    TTSStoppedFrame,
    TextFrame,
@@ -27,8 +28,27 @@ from pipecat.utils.utils import exp_smoothing


 class AIService(FrameProcessor):
-    def __init__(self):
-        super().__init__()
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+
+    async def start(self, frame: StartFrame):
+        pass
+
+    async def stop(self, frame: EndFrame):
+        pass
+
+    async def cancel(self, frame: CancelFrame):
+        pass
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, StartFrame):
+            await self.start(frame)
+        elif isinstance(frame, CancelFrame):
+            await self.cancel(frame)
+        elif isinstance(frame, EndFrame):
+            await self.stop(frame)

    async def process_generator(self, generator: AsyncGenerator[Frame, None]):
        async for f in generator:
@@ -41,13 +61,38 @@ class AIService(FrameProcessor):
 class LLMService(AIService):
    """This class is a no-op but serves as a base class for LLM services."""

-    def __init__(self):
-        super().__init__()
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+        self._callbacks = {}
+        self._start_callbacks = {}
+
+    # TODO-CB: callback function type
+    def register_function(self, function_name: str, callback, start_callback=None):
+        self._callbacks[function_name] = callback
+        if start_callback:
+            self._start_callbacks[function_name] = start_callback
+
+    def unregister_function(self, function_name: str):
+        del self._callbacks[function_name]
+        if function_name in self._start_callbacks:
+            del self._start_callbacks[function_name]
+
+    def has_function(self, function_name: str):
+        return function_name in self._callbacks.keys()
+
+    async def call_function(self, function_name: str, args):
+        if function_name in self._callbacks.keys():
+            return await self._callbacks[function_name](self, args)
+        return None
+
+    async def call_start_function(self, function_name: str):
+        if function_name in self._start_callbacks.keys():
+            await self._start_callbacks[function_name](self)


 class TTSService(AIService):
-    def __init__(self, aggregate_sentences: bool = True):
-        super().__init__()
+    def __init__(self, aggregate_sentences: bool = True, **kwargs):
+        super().__init__(**kwargs)
        self._aggregate_sentences: bool = aggregate_sentences
        self._current_sentence: str = ""

@@ -81,6 +126,8 @@ class TTSService(AIService):
        await self.push_frame(TextFrame(text))

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TextFrame):
            await self._process_text_frame(frame)
        elif isinstance(frame, EndFrame):
@@ -99,8 +146,9 @@ class STTService(AIService):
                 max_silence_secs: float = 0.3,
                 max_buffer_secs: float = 1.5,
                 sample_rate: int = 16000,
-                 num_channels: int = 1):
-        super().__init__()
+                 num_channels: int = 1,
+                 **kwargs):
+        super().__init__(**kwargs)
        self._min_volume = min_volume
        self._max_silence_secs = max_silence_secs
        self._max_buffer_secs = max_buffer_secs
@@ -109,8 +157,8 @@ class STTService(AIService):
        (self._content, self._wave) = self._new_wave()
        self._silence_num_frames = 0
        # Volume exponential smoothing
-        self._smoothing_factor = 0.4
-        self._prev_volume = 1 - self._smoothing_factor
+        self._smoothing_factor = 0.2
+        self._prev_volume = 0

    @abstractmethod
    async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
@@ -154,6 +202,8 @@ class STTService(AIService):

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        """Processes a frame of audio data, either buffering or transcribing it."""
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, CancelFrame) or isinstance(frame, EndFrame):
            self._wave.close()
            await self.push_frame(frame, direction)
@@ -167,15 +217,17 @@ class STTService(AIService):

 class ImageGenService(AIService):

-    def __init__(self):
-        super().__init__()
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)

    # Renders the image. Returns an Image object.
-    @ abstractmethod
+    @abstractmethod
    async def run_image_gen(self, prompt: str) -> AsyncGenerator[Frame, None]:
        pass

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, TextFrame):
            await self.push_frame(frame, direction)
            await self.process_generator(self.run_image_gen(frame.text))
@@ -186,15 +238,17 @@ class ImageGenService(AIService):
 class VisionService(AIService):
    """VisionService is a base class for vision services."""

-    def __init__(self):
-        super().__init__()
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
        self._describe_text = None

-    @ abstractmethod
+    @abstractmethod
    async def run_vision(self, frame: VisionImageRawFrame) -> AsyncGenerator[Frame, None]:
        pass

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        if isinstance(frame, VisionImageRawFrame):
            await self.process_generator(self.run_vision(frame))
        else:
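The function-callback registry added to `LLMService` can be exercised in isolation. This is a rough sketch, not the service itself: `FunctionRegistry` is a hypothetical class that copies only the registration and dispatch logic, and `get_weather` and `announce` are made-up callbacks. As in the diff, callbacks are awaited with the registry (in pipecat, the service) as first argument.

```python
import asyncio


class FunctionRegistry:
    """Hypothetical stand-in for the callback bookkeeping in LLMService."""

    def __init__(self):
        self._callbacks = {}
        self._start_callbacks = {}

    def register_function(self, function_name, callback, start_callback=None):
        self._callbacks[function_name] = callback
        if start_callback:
            self._start_callbacks[function_name] = start_callback

    def unregister_function(self, function_name):
        # pop() with a default tolerates functions without a start callback.
        self._callbacks.pop(function_name, None)
        self._start_callbacks.pop(function_name, None)

    def has_function(self, function_name):
        return function_name in self._callbacks

    async def call_function(self, function_name, args):
        if function_name in self._callbacks:
            return await self._callbacks[function_name](self, args)
        return None

    async def call_start_function(self, function_name):
        if function_name in self._start_callbacks:
            await self._start_callbacks[function_name](self)


started = []


async def announce(llm):
    # Made-up start callback: record that the call is about to run.
    started.append("get_weather")


async def get_weather(llm, args):
    # Made-up callback: returns canned data instead of calling a real API.
    return {"location": args["location"], "conditions": "sunny"}


async def main():
    registry = FunctionRegistry()
    registry.register_function("get_weather", get_weather, start_callback=announce)
    await registry.call_start_function("get_weather")
    result = await registry.call_function("get_weather", {"location": "Lisbon"})
    unknown = await registry.call_function("does_not_exist", {})
    registry.unregister_function("get_weather")
    return result, unknown, registry.has_function("get_weather")


result, unknown, still_registered = asyncio.run(main())
```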
--- a/src/pipecat/services/anthropic.py
+++ b/src/pipecat/services/anthropic.py
@@ -4,9 +4,6 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-import os
-import asyncio
-import time
 import base64

 from pipecat.frames.frames import (
@@ -52,6 +49,9 @@ class AnthropicLLMService(LLMService):
        self._model = model
        self._max_tokens = max_tokens

+    def can_generate_metrics(self) -> bool:
+        return True
+
    def _get_messages_from_openai_context(
            self, context: OpenAILLMContext):
        openai_messages = context.get_messages()
@@ -80,8 +80,20 @@ class AnthropicLLMService(LLMService):
                    }]
                })
            else:
-                # text frame
-                anthropic_messages.append({"role": role, "content": content})
+                # Text frame. Anthropic needs the roles to alternate. This will
+                # cause an issue with interruptions. So, if we detect we are the
+                # ones asking again it probably means we were interrupted.
+                if role == "user" and len(anthropic_messages) > 1:
+                    last_message = anthropic_messages[-1]
+                    if last_message["role"] == "user":
+                        anthropic_messages = anthropic_messages[:-1]
+                        content = last_message["content"]
+                        anthropic_messages.append(
+                            {"role": "user", "content": f"Sorry, I just asked you about [{content}] but now I would like to know [{text}]."})
+                    else:
+                        anthropic_messages.append({"role": role, "content": text})
+                else:
+                    anthropic_messages.append({"role": role, "content": text})

        return anthropic_messages

@@ -92,13 +104,16 @@ class AnthropicLLMService(LLMService):

            messages = self._get_messages_from_openai_context(context)

-            start_time = time.time()
+            await self.start_ttfb_metrics()
+
            response = await self._client.messages.create(
                messages=messages,
                model=self._model,
                max_tokens=self._max_tokens,
                stream=True)
-            logger.debug(f"Anthropic LLM TTFB: {time.time() - start_time}")
+
+            await self.stop_ttfb_metrics()
+
            async for event in response:
                # logger.debug(f"Anthropic LLM event: {event}")
                if (event.type == "content_block_delta"):
@@ -107,11 +122,13 @@ class AnthropicLLMService(LLMService):
                    await self.push_frame(LLMResponseEndFrame())

        except Exception as e:
-            logger.error(f"Exception: {e}")
+            logger.error(f"{self} exception: {e}")
        finally:
            await self.push_frame(LLMFullResponseEndFrame())

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        context = None

        if isinstance(frame, OpenAILLMContextFrame):
@@ -125,22 +142,3 @@ class AnthropicLLMService(LLMService):

        if context:
            await self._process_context(context)
-
-    async def x_process_frame(self, frame: Frame, direction: FrameDirection):
-        if isinstance(frame, LLMMessagesFrame):
-            stream = await self.client.messages.create(
-                max_tokens=self.max_tokens,
-                messages=[
-                    {
-                        "role": "user",
-                        "content": "Hello, Claude",
-                    }
-                ],
-                model=self.model,
-                stream=True,
-            )
-            async for event in stream:
-                if event.type == "content_block_delta":
-                    await self.push_frame(TextFrame(event.delta.text))
-        else:
-            await self.push_frame(frame, direction)
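The role-alternation workaround in `_get_messages_from_openai_context` can be read as a small list transformation. The sketch below is an approximation under stated assumptions: `merge_consecutive_user_messages` is a hypothetical helper, it handles plain text messages only, and it merges whenever the previous message is from the user, whereas the diff additionally requires `len(anthropic_messages) > 1`.

```python
def merge_consecutive_user_messages(messages):
    """Collapse back-to-back user messages so roles alternate (illustrative helper)."""
    merged = []
    for message in messages:
        role, text = message["role"], message["content"]
        if role == "user" and merged and merged[-1]["role"] == "user":
            # Two user turns in a row usually mean the bot was interrupted.
            previous = merged.pop()["content"]
            merged.append({
                "role": "user",
                "content": (
                    f"Sorry, I just asked you about [{previous}] "
                    f"but now I would like to know [{text}]."
                ),
            })
        else:
            merged.append({"role": role, "content": text})
    return merged


history = [
    {"role": "user", "content": "the weather"},
    {"role": "assistant", "content": "It is sunny."},
    {"role": "user", "content": "tomorrow's forecast"},
    {"role": "user", "content": "the time"},
]
merged = merge_consecutive_user_messages(history)
```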
--- a/src/pipecat/services/azure.py
+++ b/src/pipecat/services/azure.py
@@ -11,7 +11,6 @@ import io
 from PIL import Image
 from typing import AsyncGenerator

-from numpy import str_
 from openai import AsyncAzureOpenAI

 from pipecat.frames.frames import AudioRawFrame, ErrorFrame, Frame, URLImageRawFrame
@@ -45,9 +44,14 @@ class AzureTTSService(TTSService):
        )
        self._voice = voice

+    def can_generate_metrics(self) -> bool:
+        return True
+
    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
        logger.debug(f"Generating TTS: {text}")

+        await self.start_ttfb_metrics()
+
        ssml = (
            "<speak version='1.0' xml:lang='en-US' xmlns='http://www.w3.org/2001/10/synthesis' "
            "xmlns:mstts='http://www.w3.org/2001/mstts'>"
@@ -61,13 +65,14 @@ class AzureTTSService(TTSService):
        result = await asyncio.to_thread(self.speech_synthesizer.speak_ssml, (ssml))

        if result.reason == ResultReason.SynthesizingAudioCompleted:
+            await self.stop_ttfb_metrics()
            # Azure always sends a 44-byte header. Strip it off.
            yield AudioRawFrame(audio=result.audio_data[44:], sample_rate=16000, num_channels=1)
        elif result.reason == ResultReason.Canceled:
            cancellation_details = result.cancellation_details
            logger.warning(f"Speech synthesis canceled: {cancellation_details.reason}")
            if cancellation_details.reason == CancellationReason.Error:
-                logger.error(f"Error details: {cancellation_details.error_details}")
+                logger.error(f"{self} error: {cancellation_details.error_details}")


 class AzureLLMService(BaseOpenAILLMService):
@@ -138,7 +143,7 @@ class AzureImageGenServiceREST(ImageGenService):
            while status != "succeeded":
                attempts_left -= 1
                if attempts_left == 0:
-                    logger.error("Image generation timed out")
+                    logger.error(f"{self} error: image generation timed out")
                    yield ErrorFrame("Image generation timed out")
                    return

@@ -151,7 +156,7 @@ class AzureImageGenServiceREST(ImageGenService):

            image_url = json_response["result"]["data"][0]["url"] if json_response else None
            if not image_url:
-                logger.error("Image generation failed")
+                logger.error(f"{self} error: image generation failed")
                yield ErrorFrame("Image generation failed")
                return

--- a/src/pipecat/services/cartesia.py
+++ b/src/pipecat/services/cartesia.py
@@ -6,10 +6,9 @@

 from cartesia.tts import AsyncCartesiaTTS

-import time
 from typing import AsyncGenerator

-from pipecat.frames.frames import AudioRawFrame, ErrorFrame, Frame
+from pipecat.frames.frames import AudioRawFrame, Frame
 from pipecat.services.ai_services import TTSService

 from loguru import logger
@@ -22,35 +21,43 @@ class CartesiaTTSService(TTSService):
            *,
            api_key: str,
            voice_name: str,
+            model_id: str = "upbeat-moon",
+            output_format: str = "pcm_16000",
            **kwargs):
        super().__init__(**kwargs)

        self._api_key = api_key
        self._voice_name = voice_name
-
-        self._client = None
-
-    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
-        logger.debug(f"Transcribing text: [{text}]")
+        self._model_id = model_id
+        self._output_format = output_format

        try:
-            if self._client is None:
-                self._client = AsyncCartesiaTTS(api_key=self._api_key)
-                voices = self._client.get_voices()
-                self._voice_id = voices[self._voice_name]["id"]
-                self._voice = self._client.get_voice_embedding(voice_id=self._voice_id)
+            self._client = AsyncCartesiaTTS(api_key=self._api_key)
+            voices = self._client.get_voices()
+            voice_id = voices[self._voice_name]["id"]
+            self._voice = self._client.get_voice_embedding(voice_id=voice_id)
+        except Exception as e:
+            logger.error(f"{self} initialization error: {e}")
+
+    def can_generate_metrics(self) -> bool:
+        return True
+
+    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
+        logger.debug(f"Generating TTS: [{text}]")
+
+        try:
+            await self.start_ttfb_metrics()

            chunk_generator = await self._client.generate(
-                transcript=text, voice=self._voice, stream=True,
-                model_id="upbeat-moon", data_rtype='array', output_format='pcm_16000',
-                # a chunk_time of 0.1 seems to be the default. there are small audio pops/gaps which
-                # we need to debug
-                chunk_time=0.1
+                stream=True,
+                transcript=text,
+                voice=self._voice,
+                model_id=self._model_id,
+                output_format=self._output_format,
            )

            async for chunk in chunk_generator:
-                # print(f"")
-                frame = AudioRawFrame(chunk['audio'], 16000, 1)
-                yield frame
+                await self.stop_ttfb_metrics()
+                yield AudioRawFrame(chunk["audio"], chunk["sampling_rate"], 1)
        except Exception as e:
-            logger.error(f"Exception {e}")
+            logger.error(f"{self} exception: {e}")
--- a/src/pipecat/services/deepgram.py
+++ b/src/pipecat/services/deepgram.py
@@ -5,11 +5,30 @@
 #

 import aiohttp
+import asyncio
+import time

 from typing import AsyncGenerator

-from pipecat.frames.frames import AudioRawFrame, ErrorFrame, Frame
-from pipecat.services.ai_services import TTSService
+from pipecat.frames.frames import (
+    AudioRawFrame,
+    CancelFrame,
+    EndFrame,
+    ErrorFrame,
+    Frame,
+    InterimTranscriptionFrame,
+    StartFrame,
+    SystemFrame,
+    TranscriptionFrame)
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.services.ai_services import AIService, TTSService
+
+from deepgram import (
+    DeepgramClient,
+    DeepgramClientOptions,
+    LiveTranscriptionEvents,
+    LiveOptions,
+)

 from loguru import logger

@@ -22,31 +41,120 @@ class DeepgramTTSService(TTSService):
            aiohttp_session: aiohttp.ClientSession,
            api_key: str,
            voice: str = "aura-helios-en",
+            base_url: str = "https://api.deepgram.com/v1/speak",
            **kwargs):
        super().__init__(**kwargs)

        self._voice = voice
        self._api_key = api_key
        self._aiohttp_session = aiohttp_session
+        self._base_url = base_url
+
+    def can_generate_metrics(self) -> bool:
+        return True

    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
-        logger.info(f"Running Deepgram TTS for {text}")
-        base_url = "https://api.deepgram.com/v1/speak"
-        request_url = f"{base_url}?model = {
-            self._voice} & encoding = linear16 & container = none & sample_rate = 16000"
+        logger.debug(f"Generating TTS: [{text}]")
+
+        base_url = self._base_url
+        request_url = f"{base_url}?model={self._voice}&encoding=linear16&container=none&sample_rate=16000"
        headers = {"authorization": f"token {self._api_key}"}
        body = {"text": text}

        try:
+            await self.start_ttfb_metrics()
            async with self._aiohttp_session.post(request_url, headers=headers, json=body) as r:
                if r.status != 200:
-                    text = await r.text()
-                    logger.error(f"Error getting audio (status: {r.status}, error: {text})")
-                    yield ErrorFrame(f"Error getting audio (status: {r.status}, error: {text})")
+                    response_text = await r.text()
+                    # If we get a "Bad Request: Input is unutterable", just print out a debug log.
+                    # All other unsuccessful requests should emit an error frame. If not specifically
+                    # handled by the running PipelineTask, the ErrorFrame will cancel the task.
+                    if "unutterable" in response_text:
+                        logger.debug(f"Unutterable text: [{text}]")
+                        return
+
+                    logger.error(
+                        f"{self} error getting audio (status: {r.status}, error: {response_text})")
+                    yield ErrorFrame(f"Error getting audio (status: {r.status}, error: {response_text})")
                    return

                async for data in r.content:
+                    await self.stop_ttfb_metrics()
                    frame = AudioRawFrame(audio=data, sample_rate=16000, num_channels=1)
                    yield frame
        except Exception as e:
-            logger.error(f"Exception {e}")
+            logger.error(f"{self} exception: {e}")
+
+
+class DeepgramSTTService(AIService):
+    def __init__(self,
+                 api_key: str,
+                 live_options: LiveOptions = LiveOptions(
+                     encoding="linear16",
+                     language="en-US",
+                     model="nova-2-conversationalai",
+                     sample_rate=16000,
+                     channels=1,
+                     interim_results=True,
+                     smart_format=True,
+                 ),
+                 **kwargs):
+        super().__init__(**kwargs)
+
+        self._live_options = live_options
+
+        self._client = DeepgramClient(
+            api_key, config=DeepgramClientOptions(options={"keepalive": "true"}))
+        self._connection = self._client.listen.asynclive.v("1")
+        self._connection.on(LiveTranscriptionEvents.Transcript, self._on_message)
+
+        self._create_push_task()
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, SystemFrame):
+            await self.push_frame(frame, direction)
+        elif isinstance(frame, AudioRawFrame):
+            await self._connection.send(frame.audio)
+        else:
+            await self._push_queue.put((frame, direction))
+
+    async def start(self, frame: StartFrame):
+        if await self._connection.start(self._live_options):
+            logger.debug(f"{self}: Connected to Deepgram")
+        else:
+            logger.error(f"{self}: Unable to connect to Deepgram")
+
+    async def stop(self, frame: EndFrame):
+        await self._connection.finish()
+        await self._push_queue.put((frame, FrameDirection.DOWNSTREAM))
+        await self._push_frame_task
+
+    async def cancel(self, frame: CancelFrame):
+        await self._connection.finish()
+        self._push_frame_task.cancel()
+
+    def _create_push_task(self):
+        self._push_queue = asyncio.Queue()
+        self._push_frame_task = self.get_event_loop().create_task(self._push_frame_task_handler())
+
+    async def _push_frame_task_handler(self):
+        running = True
+        while running:
+            try:
+                (frame, direction) = await self._push_queue.get()
+                await self.push_frame(frame, direction)
+                running = not isinstance(frame, EndFrame)
+            except asyncio.CancelledError:
+                break
+
+    async def _on_message(self, *args, **kwargs):
+        result = kwargs["result"]
+        is_final = result.is_final
+        transcript = result.channel.alternatives[0].transcript
+        if len(transcript) > 0:
+            if is_final:
+                await self._push_queue.put((TranscriptionFrame(transcript, "", int(time.time_ns() / 1000000)), FrameDirection.DOWNSTREAM))
+            else:
+                await self._push_queue.put((InterimTranscriptionFrame(transcript, "", int(time.time_ns() / 1000000)), FrameDirection.DOWNSTREAM))
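`DeepgramSTTService` funnels both pipeline frames and websocket transcripts through a single queue so that output is pushed from one task. The pattern can be sketched without Deepgram or pipecat. Everything named here is hypothetical: `END` stands in for `EndFrame`, plain strings stand in for transcription frames, and `sink` stands in for `push_frame()`.

```python
import asyncio

END = object()  # sentinel standing in for EndFrame


async def push_worker(queue: asyncio.Queue, sink: list) -> None:
    """Drain the queue in order, stopping after the end sentinel is delivered."""
    while True:
        item = await queue.get()
        sink.append(item)  # stands in for `await self.push_frame(frame, direction)`
        if item is END:
            break


async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    sink: list = []
    worker = asyncio.create_task(push_worker(queue, sink))

    # A websocket callback would enqueue transcripts as they arrive...
    await queue.put("interim: hel")
    await queue.put("final: hello")
    # ...and stop() enqueues the end marker, then waits for the worker to finish.
    await queue.put(END)
    await worker
    return sink


delivered = asyncio.run(main())
```

Because every producer goes through the same queue, frames reach the sink in enqueue order, and the end marker is always the last item delivered.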
--- a/src/pipecat/services/elevenlabs.py
+++ b/src/pipecat/services/elevenlabs.py
@@ -31,6 +31,9 @@ class ElevenLabsTTSService(TTSService):
        self._aiohttp_session = aiohttp_session
        self._model = model

+    def can_generate_metrics(self) -> bool:
+        return True
+
    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
        logger.debug(f"Generating TTS: [{text}]")

@@ -47,14 +50,17 @@ class ElevenLabsTTSService(TTSService):
            "Content-Type": "application/json",
        }

+        await self.start_ttfb_metrics()
+
        async with self._aiohttp_session.post(url, json=payload, headers=headers, params=querystring) as r:
            if r.status != 200:
                text = await r.text()
-                logger.error(f"Error getting audio (status: {r.status}, error: {text})")
+                logger.error(f"{self} error getting audio (status: {r.status}, error: {text})")
                yield ErrorFrame(f"Error getting audio (status: {r.status}, error: {text})")
                return

            async for chunk in r.content:
                if len(chunk) > 0:
+                    await self.stop_ttfb_metrics()
                    frame = AudioRawFrame(chunk, 16000, 1)
                    yield frame
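Note on the hunk above: `start_ttfb_metrics()` is called just before the request and `stop_ttfb_metrics()` on every non-empty chunk. The sketch below shows one way such a time-to-first-byte measurement can work. It assumes that only the first stop after a start records a value, which the per-chunk calls in the diff suggest but do not confirm. `TTFBTimer`, `fake_tts` and `stream_with_ttfb` are hypothetical names, not pipecat APIs.

```python
import asyncio
import time
from typing import AsyncGenerator, List, Optional


class TTFBTimer:
    """Hypothetical helper: records the delay between start() and the first stop()."""

    def __init__(self) -> None:
        self._start: Optional[float] = None
        self.ttfb: Optional[float] = None

    def start(self) -> None:
        self._start = time.perf_counter()
        self.ttfb = None

    def stop(self) -> None:
        # Assumption: repeated calls are no-ops, so calling stop() per chunk is safe.
        if self._start is None:
            return
        self.ttfb = time.perf_counter() - self._start
        self._start = None


async def fake_tts(chunks: List[bytes], delay: float) -> AsyncGenerator[bytes, None]:
    # Hypothetical audio source: waits `delay` seconds, then yields the chunks.
    await asyncio.sleep(delay)
    for chunk in chunks:
        yield chunk


async def stream_with_ttfb(timer: TTFBTimer, source: AsyncGenerator[bytes, None]) -> List[bytes]:
    timer.start()
    received: List[bytes] = []
    async for chunk in source:
        if len(chunk) > 0:
            timer.stop()  # only the first non-empty chunk records a value
            received.append(chunk)
    return received


if __name__ == "__main__":
    timer = TTFBTimer()
    audio = asyncio.run(stream_with_ttfb(timer, fake_tts([b"", b"ab", b"cd"], 0.01)))
    print(audio, timer.ttfb)
```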
--- a/src/pipecat/services/fal.py
+++ b/src/pipecat/services/fal.py
@@ -62,7 +62,7 @@ class FalImageGenService(ImageGenService):
        image_url = response["images"][0]["url"] if response else None

        if not image_url:
-            logger.error("Image generation failed")
+            logger.error(f"{self} error: image generation failed")
            yield ErrorFrame("Image generation failed")
            return

--- a/src/pipecat/services/google.py
+++ b/src/pipecat/services/google.py
@@ -1,8 +1,10 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#

-import json
-import os
 import asyncio
-import time

 from typing import List

@@ -45,6 +47,9 @@ class GoogleLLMService(LLMService):
        gai.configure(api_key=api_key)
        self._client = gai.GenerativeModel(model)

+    def can_generate_metrics(self) -> bool:
+        return True
+
    def _get_messages_from_openai_context(
            self, context: OpenAILLMContext) -> List[glm.Content]:
        openai_messages = context.get_messages()
@@ -81,9 +86,11 @@ class GoogleLLMService(LLMService):

            messages = self._get_messages_from_openai_context(context)

-            start_time = time.time()
+            await self.start_ttfb_metrics()
+
            response = self._client.generate_content(messages, stream=True)
-            logger.debug(f"Google LLM TTFB: {time.time() - start_time}")
+
+            await self.stop_ttfb_metrics()

            async for chunk in self._async_generator_wrapper(response):
                try:
@@ -97,14 +104,16 @@ class GoogleLLMService(LLMService):
                        logger.debug(
                            f"LLM refused to generate content for safety reasons - {messages}.")
                    else:
-                        logger.error(f"Error {e}")
+                        logger.error(f"{self} error: {e}")

        except Exception as e:
-            logger.error(f"Exception: {e}")
+            logger.error(f"{self} exception: {e}")
        finally:
            await self.push_frame(LLMFullResponseEndFrame())

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        context = None

        if isinstance(frame, OpenAILLMContextFrame):
--- a/src/pipecat/services/moondream.py
+++ b/src/pipecat/services/moondream.py
@@ -71,7 +71,7 @@ class MoondreamService(VisionService):

    async def run_vision(self, frame: VisionImageRawFrame) -> AsyncGenerator[Frame, None]:
        if not self._model:
-            logger.error("Moondream model not available")
+            logger.error(f"{self} error: Moondream model not available")
            yield ErrorFrame("Moondream model not available")
            return

--- a/src/pipecat/services/openai.py
+++ b/src/pipecat/services/openai.py
@@ -4,17 +4,18 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-import io
-import json
-import time
 import aiohttp
 import base64
+import io
+import json

+from typing import Any, AsyncGenerator, List, Literal
+
+from loguru import logger
 from PIL import Image

-from typing import AsyncGenerator, List, Literal
-
 from pipecat.frames.frames import (
+    AudioRawFrame,
    ErrorFrame,
    Frame,
    LLMFullResponseEndFrame,
@@ -26,24 +27,24 @@ from pipecat.frames.frames import (
    URLImageRawFrame,
    VisionImageRawFrame
 )
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext, OpenAILLMContextFrame
-from pipecat.processors.frame_processor import FrameDirection
-from pipecat.services.ai_services import LLMService, ImageGenService
-from openai.types.chat import (
-    ChatCompletionSystemMessageParam,
-    ChatCompletionFunctionMessageParam,
-    ChatCompletionToolParam,
-    ChatCompletionUserMessageParam,
+from pipecat.processors.aggregators.openai_llm_context import (
+    OpenAILLMContext,
+    OpenAILLMContextFrame
+)
+from pipecat.processors.frame_processor import FrameDirection
+from pipecat.services.ai_services import (
+    ImageGenService,
+    LLMService,
+    TTSService
 )
-from loguru import logger

 try:
-    from openai import AsyncOpenAI, AsyncStream
-
+    from openai import AsyncOpenAI, AsyncStream, BadRequestError
    from openai.types.chat import (
-        ChatCompletion,
        ChatCompletionChunk,
+        ChatCompletionFunctionMessageParam,
        ChatCompletionMessageParam,
+        ChatCompletionToolParam
    )
 except ModuleNotFoundError as e:
    logger.error(f"Exception: {e}")
@@ -66,30 +67,32 @@ class BaseOpenAILLMService(LLMService):
    calls from the LLM.
    """

-    def __init__(self, model: str, api_key=None, base_url=None):
-        super().__init__()
+    def __init__(self, model: str, api_key=None, base_url=None, **kwargs):
+        super().__init__(**kwargs)
        self._model: str = model
-        self._client = self.create_client(api_key=api_key, base_url=base_url)
-        self._callbacks = {}
-        self._start_callbacks = {}
+        self._client = self.create_client(api_key=api_key, base_url=base_url, **kwargs)

-    def create_client(self, api_key=None, base_url=None):
+    def create_client(self, api_key=None, base_url=None, **kwargs):
        return AsyncOpenAI(api_key=api_key, base_url=base_url)

-    # TODO-CB: callback function type
-    def register_function(self, function_name, callback, start_callback=None):
-        self._callbacks[function_name] = callback
-        if start_callback:
-            self._start_callbacks[function_name] = start_callback
+    def can_generate_metrics(self) -> bool:
+        return True

-    def unregister_function(self, function_name):
-        del self._callbacks[function_name]
-        if self._start_callbacks[function_name]:
-            del self._start_callbacks[function_name]
+    async def get_chat_completions(
+            self,
+            context: OpenAILLMContext,
+            messages: List[ChatCompletionMessageParam]) -> AsyncStream[ChatCompletionChunk]:
+        chunks = await self._client.chat.completions.create(
+            model=self._model,
+            stream=True,
+            messages=messages,
+            tools=context.tools,
+            tool_choice=context.tool_choice,
+        )
+        return chunks

    async def _stream_chat_completions(
-        self, context: OpenAILLMContext
-    ) -> AsyncStream[ChatCompletionChunk]:
+            self, context: OpenAILLMContext) -> AsyncStream[ChatCompletionChunk]:
        logger.debug(f"Generating chat: {context.get_messages_json()}")

        messages: List[ChatCompletionMessageParam] = context.get_messages()
@@ -106,35 +109,20 @@ class BaseOpenAILLMService(LLMService):
                del message["data"]
                del message["mime_type"]

-        start_time = time.time()
-        chunks: AsyncStream[ChatCompletionChunk] = (
-            await self._client.chat.completions.create(
-                model=self._model,
-                stream=True,
-                messages=messages,
-                tools=context.tools,
-                tool_choice=context.tool_choice,
-            )
-        )
-
-        logger.debug(f"OpenAI LLM TTFB: {time.time() - start_time}")
+        try:
+            chunks = await self.get_chat_completions(context, messages)
+        except Exception as e:
+            logger.error(f"{self} exception: {e}")

        return chunks

-    async def _chat_completions(self, messages) -> str | None:
-        response: ChatCompletion = await self._client.chat.completions.create(
-            model=self._model, stream=False, messages=messages
-        )
-        if response and len(response.choices) > 0:
-            return response.choices[0].message.content
-        else:
-            return None
-
    async def _process_context(self, context: OpenAILLMContext):
        function_name = ""
        arguments = ""
        tool_call_id = ""

+        await self.start_ttfb_metrics()
+
        chunk_stream: AsyncStream[ChatCompletionChunk] = (
            await self._stream_chat_completions(context)
        )
@@ -143,6 +131,8 @@ class BaseOpenAILLMService(LLMService):
            if len(chunk.choices) == 0:
                continue

+            await self.stop_ttfb_metrics()
+
            if chunk.choices[0].delta.tool_calls:
                # We're streaming the LLM response to enable the fastest response times.
                # For text, we just yield each chunk as we receive it and count on consumers
@@ -159,10 +149,7 @@ class BaseOpenAILLMService(LLMService):
                if tool_call.function and tool_call.function.name:
                    function_name += tool_call.function.name
                    tool_call_id = tool_call.id
-                    # only send a function start frame if we're not handling the function call
-                    if function_name in self._callbacks.keys():
-                        if function_name in self._start_callbacks.keys():
-                            await self._start_callbacks[function_name](self)
+                    await self.call_start_function(function_name)
                if tool_call.function and tool_call.function.arguments:
                    # Keep iterating through the response to collect all the argument fragments
                    arguments += tool_call.function.arguments
@@ -176,9 +163,8 @@ class BaseOpenAILLMService(LLMService):
        # the context, and re-prompt to get a chat answer. If we don't have a registered
        # handler, raise an exception.
        if function_name and arguments:
-            if function_name in self._callbacks.keys():
+            if self.has_function(function_name):
                await self._handle_function_call(context, tool_call_id, function_name, arguments)
-
            else:
                raise OpenAIUnhandledFunctionException(
                    f"The LLM tried to call a function named '{function_name}', but there isn't a callback registered for that function.")
@@ -191,7 +177,7 @@ class BaseOpenAILLMService(LLMService):
            arguments
    ):
        arguments = json.loads(arguments)
-        result = await self._callbacks[function_name](self, arguments)
+        result = await self.call_function(function_name, arguments)
        arguments = json.dumps(arguments)
        if isinstance(result, (str, dict)):
            # Handle it in "full magic mode"
@@ -231,6 +217,8 @@ class BaseOpenAILLMService(LLMService):
            raise BaseException(f"Unknown return type from function callback: {type(result)}")

    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
        context = None
        if isinstance(frame, OpenAILLMContextFrame):
            context: OpenAILLMContext = frame.context
@@ -249,7 +237,7 @@ class BaseOpenAILLMService(LLMService):

 class OpenAILLMService(BaseOpenAILLMService):

-    def __init__(self, model="gpt-4", **kwargs):
+    def __init__(self, model="gpt-4o", **kwargs):
        super().__init__(model, **kwargs)


@@ -282,7 +270,7 @@ class OpenAIImageGenService(ImageGenService):
        image_url = image.data[0].url

        if not image_url:
-            logger.error(f"No image provided in response: {image}")
+            logger.error(f"{self} No image provided in response: {image}")
            yield ErrorFrame("Image generation failed")
            return

@@ -292,3 +280,58 @@ class OpenAIImageGenService(ImageGenService):
            image = Image.open(image_stream)
            frame = URLImageRawFrame(image_url, image.tobytes(), image.size, image.format)
            yield frame
+
+
+class OpenAITTSService(TTSService):
+    """This service uses the OpenAI TTS API to generate audio from text.
+    The returned audio is PCM encoded at 24kHz. When using the DailyTransport, set the sample rate in the DailyParams accordingly:
+    ```
+    DailyParams(
+        audio_out_enabled=True,
+        audio_out_sample_rate=24_000,
+    )
+    ```
+    """
+
+    def __init__(
+            self,
+            *,
+            api_key: str | None = None,
+            voice: Literal["alloy", "echo", "fable", "onyx", "nova", "shimmer"] = "alloy",
+            model: Literal["tts-1", "tts-1-hd"] = "tts-1",
+            **kwargs):
+        super().__init__(**kwargs)
+
+        self._voice = voice
+        self._model = model
+
+        self._client = AsyncOpenAI(api_key=api_key)
+
+    def can_generate_metrics(self) -> bool:
+        return True
+
+    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
+        logger.debug(f"Generating TTS: [{text}]")
+
+        try:
+            await self.start_ttfb_metrics()
+
+            async with self._client.audio.speech.with_streaming_response.create(
+                    input=text,
+                    model=self._model,
+                    voice=self._voice,
+                    response_format="pcm",
+            ) as r:
+                if r.status_code != 200:
+                    error = await r.text()
+                    logger.error(
+                        f"{self} error getting audio (status: {r.status_code}, error: {error})")
+                    yield ErrorFrame(f"Error getting audio (status: {r.status_code}, error: {error})")
+                    return
+                async for chunk in r.iter_bytes(8192):
+                    if len(chunk) > 0:
+                        await self.stop_ttfb_metrics()
+                        frame = AudioRawFrame(chunk, 24_000, 1)
+                        yield frame
+        except BadRequestError as e:
+            logger.error(f"{self} error generating TTS: {e}")
--- a/src/pipecat/services/openpipe.py
+++ b/src/pipecat/services/openpipe.py
@@ -0,0 +1,70 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+from typing import Dict, List
+
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.openai import BaseOpenAILLMService
+
+from loguru import logger
+
+try:
+    from openpipe import AsyncOpenAI as OpenPipeAI, AsyncStream
+    from openai.types.chat import (ChatCompletionMessageParam, ChatCompletionChunk)
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error(
+        "In order to use OpenPipe, you need to `pip install pipecat-ai[openpipe]`. Also, set `OPENPIPE_API_KEY` and `OPENAI_API_KEY` environment variables.")
+    raise Exception(f"Missing module: {e}")
+
+
+class OpenPipeLLMService(BaseOpenAILLMService):
+
+    def __init__(
+            self,
+            model: str = "gpt-4o",
+            api_key: str | None = None,
+            base_url: str | None = None,
+            openpipe_api_key: str | None = None,
+            openpipe_base_url: str = "https://app.openpipe.ai/api/v1",
+            tags: Dict[str, str] | None = None,
+            **kwargs):
+        super().__init__(
+            model,
+            api_key,
+            base_url,
+            openpipe_api_key=openpipe_api_key,
+            openpipe_base_url=openpipe_base_url,
+            **kwargs)
+        self._tags = tags
+
+    def create_client(self, api_key=None, base_url=None, **kwargs):
+        openpipe_api_key = kwargs.get("openpipe_api_key") or ""
+        openpipe_base_url = kwargs.get("openpipe_base_url") or ""
+        client = OpenPipeAI(
+            api_key=api_key,
+            base_url=base_url,
+            openpipe={
+                "api_key": openpipe_api_key,
+                "base_url": openpipe_base_url
+            }
+        )
+        return client
+
+    async def get_chat_completions(
+            self,
+            context: OpenAILLMContext,
+            messages: List[ChatCompletionMessageParam]) -> AsyncStream[ChatCompletionChunk]:
+        chunks = await self._client.chat.completions.create(
+            model=self._model,
+            stream=True,
+            messages=messages,
+            openpipe={
+                "tags": self._tags,
+                "log_request": True
+            }
+        )
+        return chunks
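Note on the file above: `OpenPipeLLMService` customises the client by overriding `create_client()`, which the base constructor calls with the extra keyword arguments. The sketch below shows that hook pattern with plain dictionaries in place of real SDK clients. `BaseService`, `LoggingService` and `log_key` are hypothetical names for illustration only.

```python
from typing import Any, Dict, Optional


class BaseService:
    """Hypothetical base class: builds its client through an overridable hook."""

    def __init__(self, model: str, api_key: Optional[str] = None,
                 base_url: Optional[str] = None, **kwargs: Any) -> None:
        self._model = model
        # The hook receives the extra keyword arguments so subclasses can use them.
        self._client = self.create_client(api_key=api_key, base_url=base_url, **kwargs)

    def create_client(self, api_key: Optional[str] = None,
                      base_url: Optional[str] = None, **kwargs: Any) -> Dict[str, Any]:
        # A dict stands in for a real SDK client object.
        return {"api_key": api_key, "base_url": base_url}


class LoggingService(BaseService):
    """Hypothetical subclass: forwards one extra option to the client it builds."""

    def __init__(self, model: str = "gpt-4o", api_key: Optional[str] = None,
                 base_url: Optional[str] = None, log_key: Optional[str] = None,
                 **kwargs: Any) -> None:
        super().__init__(model, api_key, base_url, log_key=log_key, **kwargs)

    def create_client(self, api_key: Optional[str] = None,
                      base_url: Optional[str] = None, **kwargs: Any) -> Dict[str, Any]:
        client = super().create_client(api_key=api_key, base_url=base_url)
        client["log_key"] = kwargs.get("log_key") or ""
        return client


if __name__ == "__main__":
    service = LoggingService(api_key="k", log_key="lk")
    print(service._client)
```

One consequence of this design is that the hook runs inside the base constructor, so attributes the subclass assigns after `super().__init__()` are not yet available when `create_client()` executes.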
--- a/src/pipecat/services/playht.py
+++ b/src/pipecat/services/playht.py
@@ -15,8 +15,8 @@ from pipecat.services.ai_services import TTSService
 from loguru import logger

 try:
-    from pyht import Client
    from pyht.client import TTSOptions
+    from pyht.async_client import AsyncClient
    from pyht.protos.api_pb2 import Format
 except ModuleNotFoundError as e:
    logger.error(f"Exception: {e}")
@@ -25,7 +25,7 @@ except ModuleNotFoundError as e:
    raise Exception(f"Missing module: {e}")


-class PlayHTAIService(TTSService):
+class PlayHTTTSService(TTSService):

    def __init__(self, *, api_key: str, user_id: str, voice_url: str, **kwargs):
        super().__init__(**kwargs)
@@ -33,7 +33,7 @@ class PlayHTAIService(TTSService):
        self._user_id = user_id
        self._speech_key = api_key

-        self._client = Client(
+        self._client = AsyncClient(
            user_id=self._user_id,
            api_key=self._speech_key,
        )
@@ -43,32 +43,41 @@ class PlayHTAIService(TTSService):
            quality="higher",
            format=Format.FORMAT_WAV)

-    def __del__(self):
-        self._client.close()
+    def can_generate_metrics(self) -> bool:
+        return True

    async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
-        b = bytearray()
-        in_header = True
-        for chunk in self._client.tts(text, self._options):
-            # skip the RIFF header.
-            if in_header:
-                b.extend(chunk)
-                if len(b) <= 36:
-                    continue
-                else:
-                    fh = io.BytesIO(b)
-                    fh.seek(36)
-                    (data, size) = struct.unpack('<4sI', fh.read(8))
-                    logger.debug(
-                        f"first attempt: data: {data}, size: {hex(size)}, position: {fh.tell()}")
-                    while data != b'data':
-                        fh.read(size)
+        logger.debug(f"Generating TTS: [{text}]")
+
+        try:
+            b = bytearray()
+            in_header = True
+
+            await self.start_ttfb_metrics()
+
+            playht_gen = self._client.tts(
+                text,
+                voice_engine="PlayHT2.0-turbo",
+                options=self._options)
+
+            async for chunk in playht_gen:
+                # skip the RIFF header.
+                if in_header:
+                    b.extend(chunk)
+                    if len(b) <= 36:
+                        continue
+                    else:
+                        fh = io.BytesIO(b)
+                        fh.seek(36)
                        (data, size) = struct.unpack('<4sI', fh.read(8))
-                        logger.debug(
-                            f"subsequent data: {data}, size: {hex(size)}, position: {fh.tell()}, data != data: {data != b'data'}")
-                    logger.debug("position: ", fh.tell())
-                    in_header = False
-            else:
-                if len(chunk):
-                    frame = AudioRawFrame(chunk, 16000, 1)
-                    yield frame
+                        while data != b'data':
+                            fh.read(size)
+                            (data, size) = struct.unpack('<4sI', fh.read(8))
+                        in_header = False
+                else:
+                    if len(chunk):
+                        await self.stop_ttfb_metrics()
+                        frame = AudioRawFrame(chunk, 16000, 1)
+                        yield frame
+        except Exception as e:
+            logger.error(f"{self} error generating TTS: {e}")
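Note on the hunk above: `run_tts` buffers the start of the WAV stream and skips RIFF chunks until it reaches `data`, so that only raw PCM is emitted. The sketch below is a standalone variant of that idea, not the code in the diff. It walks the chunk list from byte 12 instead of seeking to a fixed offset of 36, and it checks the buffer length before each unpack. `find_wav_data_offset` and `make_wav` are hypothetical helper names.

```python
import io
import struct
import wave
from typing import Optional


def find_wav_data_offset(buf: bytes) -> Optional[int]:
    """Return the offset of the first PCM byte, or None if it cannot be located yet."""
    if len(buf) < 12 or buf[0:4] != b"RIFF" or buf[8:12] != b"WAVE":
        return None
    pos = 12  # first chunk header follows the 12-byte RIFF/WAVE preamble
    while pos + 8 <= len(buf):
        chunk_id, size = struct.unpack_from("<4sI", buf, pos)
        if chunk_id == b"data":
            return pos + 8
        pos += 8 + size + (size & 1)  # RIFF chunks are padded to an even length
    return None  # header incomplete: buffer more bytes and try again


def make_wav(pcm: bytes, rate: int = 16000) -> bytes:
    """Build a mono 16-bit WAV in memory with the standard library."""
    fh = io.BytesIO()
    with wave.open(fh, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(pcm)
    return fh.getvalue()


if __name__ == "__main__":
    pcm = b"\x01\x00\x02\x00"
    wav = make_wav(pcm)
    offset = find_wav_data_offset(wav)
    print(offset, wav[offset:] == pcm)
```

Returning `None` for an incomplete header lets a streaming caller keep appending chunks to its buffer until the offset is known, which is the role `in_header` plays in the diff.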
--- a/src/pipecat/services/whisper.py
+++ b/src/pipecat/services/whisper.py
@@ -45,7 +45,7 @@ class WhisperSTTService(STTService):
                 model: Model = Model.DISTIL_MEDIUM_EN,
                 device: str = "auto",
                 compute_type: str = "default",
-                 no_speech_prob: float = 0.1,
+                 no_speech_prob: float = 0.4,
                 **kwargs):

        super().__init__(**kwargs)
@@ -56,6 +56,9 @@ class WhisperSTTService(STTService):
        self._model: WhisperModel | None = None
        self._load()

+    def can_generate_metrics(self) -> bool:
+        return True
+
    def _load(self):
        """Loads the Whisper model. Note that if this is the first time
        this model is being run, it will take time to download."""
@@ -69,10 +72,12 @@ class WhisperSTTService(STTService):
    async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
        """Transcribes given audio using Whisper"""
        if not self._model:
+            logger.error(f"{self} error: Whisper model not available")
            yield ErrorFrame("Whisper model not available")
-            logger.error("Whisper model not available")
            return

+        await self.start_ttfb_metrics()
+
        # Divide by 32768 because we have signed 16-bit data.
        audio_float = np.frombuffer(audio, dtype=np.int16).astype(np.float32) / 32768.0

@@ -83,4 +88,6 @@ class WhisperSTTService(STTService):
                text += f"{segment.text} "

        if text:
+            await self.stop_ttfb_metrics()
+            logger.debug(f"Transcription: [{text}]")
            yield TranscriptionFrame(text, "", int(time.time_ns() / 1000000))
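Note on the hunk above: `run_stt` converts signed 16-bit PCM to floats by dividing by 32768, which maps the sample range onto [-1.0, 1.0). The sketch below shows the same arithmetic with the standard library only, in place of the NumPy call used in the diff. It assumes little-endian samples, and `pcm16_to_float` is a hypothetical helper name.

```python
import struct
from typing import List


def pcm16_to_float(audio: bytes) -> List[float]:
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    count = len(audio) // 2  # ignore a trailing odd byte, if any
    samples = struct.unpack(f"<{count}h", audio[: count * 2])
    # int16 spans -32768..32767, so dividing by 32768 keeps every value in range.
    return [sample / 32768.0 for sample in samples]


if __name__ == "__main__":
    raw = struct.pack("<3h", 0, 16384, -32768)
    print(pcm16_to_float(raw))
```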
--- a/src/pipecat/storage/search.py
+++ b/src/pipecat/storage/search.py
@@ -1,9 +0,0 @@
-class SearchIndexer():
-    def __init__(self, story_id):
-        pass
-
-    def index_text(self, text):
-        pass
-
-    def index_image(self, text):
-        pass
--- a/src/pipecat/transports/init.py
+++ b/src/pipecat/transports/init.py
Author	SHA1	Message	Date
Aleix Conchillo Flaqué	4193a4f415	Merge pull request #237 from pipecat-ai/aleix/pipecat-0.0.30 update CHANGELOG for 0.0.30	2024-06-14 05:28:14 +08:00
Aleix Conchillo Flaqué	0226ec450a	update CHANGELOG for 0.0.30	2024-06-13 14:27:37 -07:00
Aleix Conchillo Flaqué	020b8ebb35	Merge pull request #236 from pipecat-ai/aleix/report-only-initial-ttfb report only initial ttfb	2024-06-14 05:24:52 +08:00
Aleix Conchillo Flaqué	1170b30c1b	aggregator(user_response): also handle small VADParams.stop_secs	2024-06-13 13:30:31 -07:00
Aleix Conchillo Flaqué	0004d4a906	vad: reduce smoothing factor and increase confidence	2024-06-13 13:30:11 -07:00
Aleix Conchillo Flaqué	cb27e86266	metrics: allow sending only initial TTFB metrics	2024-06-13 13:30:00 -07:00
Aleix Conchillo Flaqué	77a3b2ea5c	Merge pull request #235 from pipecat-ai/aleix/openpipe-refactoring openpipe refactoring	2024-06-14 01:28:50 +08:00
Aleix Conchillo Flaqué	099e65f3b6	report processor name in error logs	2024-06-13 10:20:45 -07:00
Aleix Conchillo Flaqué	befb8db120	update pyproject and requirements	2024-06-13 10:20:45 -07:00
Aleix Conchillo Flaqué	9992d826b1	examples: renamed 06b-listen... to 07h-inte...	2024-06-13 10:18:20 -07:00
Aleix Conchillo Flaqué	18604e1a39	re-add removed CHANGELOG lines	2024-06-13 10:18:20 -07:00
Aleix Conchillo Flaqué	312c569182	services(openpipe): refactored so it's based on BaseOpenAILLMService	2024-06-13 09:30:50 -07:00
Aleix Conchillo Flaqué	b43e0ed130	Merge pull request #233 from KwalAI/openpipe-integration OpenPipe Integration	2024-06-13 22:41:57 +08:00
Aleix Conchillo Flaqué	289debea34	Merge pull request #234 from pipecat-ai/aleix/fix-daily-room-properties-exp transports(helpers): fix DailyRoomProperties.exp	2024-06-13 22:38:41 +08:00
Aleix Conchillo Flaqué	ccd6af7016	transports(helpers): fix DailyRoomProperties.exp	2024-06-12 23:15:22 -07:00
Ankur Duggal	effc69e4e4	formatting	2024-06-12 15:01:19 -07:00
Ankur Duggal	c7a0d0db64	OpenPipe Integration	2024-06-12 14:23:56 -07:00
Aleix Conchillo Flaqué	50d69a1ca4	Merge pull request #231 from pipecat-ai/aleix/websocket-deserializer-none serializer: allow deserialize() to return None	2024-06-13 04:36:03 +08:00
Aleix Conchillo Flaqué	8a6b8fe70a	Merge pull request #232 from pipecat-ai/aleix/pyproject-deepgram pyproject: add deepgram-sdk	2024-06-13 03:53:08 +08:00
Aleix Conchillo Flaqué	c4e53aea71	update macos-py3.10-requirements with deepgram	2024-06-12 12:52:20 -07:00
Aleix Conchillo Flaqué	ad5125e93f	pyproject: add deepgram-sdk	2024-06-12 12:50:18 -07:00
Aleix Conchillo Flaqué	8d92cbac93	Merge pull request #230 from pipecat-ai/aleix/processor-names processor names	2024-06-13 03:16:07 +08:00
Aleix Conchillo Flaqué	0225443ec8	transports(base): always send MetricsFrame	2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué	71e1d0a334	pipeline: send initial TTFB initial metrics from PipelineTask	2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué	83f69e02fd	allow specifying frame processor names	2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué	e1b2da1ff0	serializer: allow deserialize() to return None	2024-06-12 12:11:36 -07:00
Kwindla Hultman Kramer	5eb1b90a4b	Merge pull request #229 from pipecat-ai/khk-deepgram-url-configurable Deepgram TTS service improvements	2024-06-12 14:52:04 -04:00
Kwindla Hultman Kramer	9c4ee74b91	bot to test for demo	2024-06-12 10:41:49 -07:00
Aleix Conchillo Flaqué	f65f566829	re-add transports/services/helpers/__init__.py	2024-06-12 10:37:28 -07:00
Aleix Conchillo Flaqué	c8ad3123b7	Merge pull request #207 from pipecat-ai/dialin-example New example: Dialin bot (call your Pipecat via phone)	2024-06-13 01:36:00 +08:00
Jon Taylor	8cefce28cf	added example fly toml	2024-06-12 10:35:03 -07:00
Jon Taylor	a834d26885	removed https from daily boy	2024-06-12 10:35:03 -07:00
Jon Taylor	810e3cd551	added fly.example.toml due to gitignore	2024-06-12 10:35:03 -07:00
Jon Taylor	f258fa96cd	added env to dockerignore	2024-06-12 10:35:03 -07:00
Jon Taylor	757ec61f14	added deepgram to readme	2024-06-12 10:35:03 -07:00
Jon Taylor	2c933f43d8	linting errors and removed unusued sip url	2024-06-12 10:35:03 -07:00
Jon Taylor	cc5bfa8af8	removed helps and fixed linting	2024-06-12 10:35:03 -07:00
Jon Taylor	de9f3e55f1	new example: dialin	2024-06-12 10:35:03 -07:00
Aleix Conchillo Flaqué	ed0c986218	Merge pull request #228 from pipecat-ai/aleix/websocket-fixes websocket fixes	2024-06-13 01:30:21 +08:00
Aleix Conchillo Flaqué	72c27215b6	transports(websocket): use push_audio_frame()	2024-06-12 10:29:39 -07:00
Aleix Conchillo Flaqué	c23b14f768	examples: use DeepgramSTTService in websocker-server	2024-06-12 10:29:22 -07:00
Aleix Conchillo Flaqué	81282f9c4d	services(deepgram): keep conenction alive	2024-06-12 10:29:22 -07:00
Aleix Conchillo Flaqué	2b324f6f81	Merge pull request #227 from pipecat-ai/aleix/daily-room-properties-extra transports(daily): DailyRoomProperties now allow extra unknown parame…	2024-06-13 00:25:07 +08:00
Kwindla Hultman Kramer	049f110344	PipelineTask should not exit when Deepgram TTS returns a Bad Request "unutterable"	2024-06-12 09:24:09 -07:00
Kwindla Hultman Kramer	448a0307a8	rebasing	2024-06-12 07:54:18 -07:00
Aleix Conchillo Flaqué	7390e42f5c	transports(daily): DailyRoomProperties now allow extra unknown parameters	2024-06-11 22:31:32 -07:00
Aleix Conchillo Flaqué	ee880d229f	Merge pull request #223 from pipecat-ai/aleix/fix-lower-vad-stop-secs processors: fix LLMResponseAggregator with lower VAD values	2024-06-12 13:30:34 +08:00
Aleix Conchillo Flaqué	9cd07d81f8	processors: fix LLMResponseAggregator with lower VAD values	2024-06-11 22:30:06 -07:00
Aleix Conchillo Flaqué	b453d089c3	Merge pull request #226 from pipecat-ai/aleix/chunk-audio-output transport: chunk longer audio frames	2024-06-12 13:28:28 +08:00
Aleix Conchillo Flaqué	7410fe1d1e	transport: chunk longer audio frames	2024-06-11 17:50:51 -07:00
Aleix Conchillo Flaqué	6323a77431	Merge pull request #224 from pipecat-ai/aleix/deepgram-stt-simple deepgram stt simple	2024-06-12 08:48:19 +08:00
Aleix Conchillo Flaqué	0aedaa8553	services(deepgram): abstract StartFrame/EndFrame/CancelFrame	2024-06-10 21:18:42 -07:00
Aleix Conchillo Flaqué	6554479d39	transports: don't queue system frames	2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué	ce2ebd3198	examples: updated 07c-interruptible-deepgram to usee DeepgramSTTService	2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué	13ea1efc96	examples: add new 13b-deepgram-transcription	2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué	ef380321cf	services: added new DeepgramSTTService	2024-06-10 21:00:01 -07:00
Kwindla Hultman Kramer	294b037730	configurable deepgram base url	2024-06-08 09:38:48 -04:00
Aleix Conchillo Flaqué	7603996612	Merge pull request #220 from pipecat-ai/aleix/pipecat-0.0.29 update CHANGELOG for 0.0.29	2024-06-08 04:43:52 +08:00
Aleix Conchillo Flaqué	3048d2b0b1	update CHANGELOG for 0.0.29	2024-06-07 13:43:00 -07:00
Aleix Conchillo Flaqué	0bb47a09d2	Merge pull request #218 from pipecat-ai/aleix/send-inital-metrics-mapping send inital metrics mapping	2024-06-08 04:41:59 +08:00
Aleix Conchillo Flaqué	1afe6901d9	processors: add processors_with_metrics() and can_generate_metrics()	2024-06-07 13:38:21 -07:00
Aleix Conchillo Flaqué	3e019fb512	services(openai): remove unused _chat_completions	2024-06-07 13:18:11 -07:00
Aleix Conchillo Flaqué	e069aa9608	updated CHANGELOG with BasePipeline	2024-06-07 13:18:09 -07:00
Aleix Conchillo Flaqué	0b32e42d25	transports(daily): fix extra super().process_frame()	2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué	8d18be5069	services(anthropic): fix metrics	2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué	e715d99d0c	pipeline: send initial ttfb metrics mapping	2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué	dc28590247	moved ParallelTask to pipecat.pipeline.parallel_task	2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué	139f158ea1	Merge pull request #219 from pipecat-ai/aleix/switch-voices switch voices and languages	2024-06-08 04:13:25 +08:00
Aleix Conchillo Flaqué	4b2a18837f	services(whisper): add text logging	2024-06-07 13:12:51 -07:00
Aleix Conchillo Flaqué	b4340d0185	services(whisper): increase no speech probability to 0.4	2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué	90d11398e6	examples: add 15a-switch-languages	2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué	bf8c73b25b	examples: add 15-switch-voices	2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué	21cd21de1b	processors(filters): add FunctionFilter	2024-06-07 13:12:18 -07:00
Aleix Conchillo Flaqué	c25f6e56e7	Merge pull request #217 from pipecat-ai/khk-tts-timings Added TTFB timings for all TTS services	2024-06-07 05:42:52 +08:00
Aleix Conchillo Flaqué	a1f1d1995c	transports: allow sending metrics	2024-06-06 14:35:34 -07:00
Aleix Conchillo Flaqué	390582d7f3	services: use start/stop_ttfb_metrics to report TTFB metrics	2024-06-06 14:00:10 -07:00
Aleix Conchillo Flaqué	e765a29ca2	processors: implement base process_frame(). all subsclassed should call it	2024-06-06 10:54:21 -07:00
Kwindla Hultman Kramer	cf5c244487	Merge branch 'main' into khk-tts-timings	2024-06-06 13:05:42 -04:00
Kwindla Hultman Kramer	a5eb30a93d	changelog	2024-06-06 11:49:05 -04:00
Kwindla Hultman Kramer	ac7bc35944	azure tts ttfb	2024-06-06 11:45:48 -04:00
Kwindla Hultman Kramer	ddfd721f6e	openai tts ttfb	2024-06-06 11:32:47 -04:00
Kwindla Hultman Kramer	aee3916cd1	cartesia async fixed	2024-06-06 11:24:26 -04:00
Kwindla Hultman Kramer	3eff1e559b	pipecat async working, but maybe needs a threaded implementation	2024-06-06 11:11:06 -04:00
Kwindla Hultman Kramer	1a542c91fa	temp commit, woring on playht	2024-06-06 10:48:22 -04:00
Aleix Conchillo Flaqué	cd60a84f8a	Merge pull request #215 from pipecat-ai/aleix/silero-vad-memory-fix vad(silero): fix memory issue	2024-06-06 05:50:47 +08:00
Aleix Conchillo Flaqué	3dd4bac6e6	vad(silero): fix memory issue	2024-06-05 14:50:28 -07:00
Kwindla Hultman Kramer	06ff9cfede	added timing logs for cartesia, deepgram, elevenlabs	2024-06-05 16:12:10 -04:00
Aleix Conchillo Flaqué	2d1ed9a304	Merge pull request #214 from pipecat-ai/aleix/pipecat-0.0.27 transports(daily): added participants() and participant_counts()	2024-06-06 03:15:34 +08:00
Aleix Conchillo Flaqué	50b51c05f6	transports(daily): added participants() and participant_counts()	2024-06-05 12:14:00 -07:00
Aleix Conchillo Flaqué	5ce4b8dd5b	update CHANGELOG with OpenAITTSService	2024-06-05 11:44:24 -07:00
Aleix Conchillo Flaqué	2f4467b5a5	Merge pull request #213 from pipecat-ai/aleix/pipecat-0.0.26 update CHANGELOG for 0.0.26	2024-06-06 01:10:01 +08:00
Aleix Conchillo Flaqué	e91ab54a69	update CHANGELOG for 0.0.26	2024-06-05 10:07:45 -07:00
Aleix Conchillo Flaqué	6a33432c82	Merge pull request #212 from pipecat-ai/aleix/make-pinlesscallupdate-public transports(daily): move pinlessCallUpdate to public api	2024-06-05 23:14:14 +08:00
Aleix Conchillo Flaqué	135654a080	transports(daily): move pinlessCallUpdate to public api	2024-06-05 08:08:56 -07:00
Aleix Conchillo Flaqué	7b708a2bee	Merge pull request #211 from pipecat-ai/aleix/base-transport-async various fixes and improvements	2024-06-05 22:57:35 +08:00
Aleix Conchillo Flaqué	b515c28417	services(cartesia): allow output_format and model_id	2024-06-04 19:24:33 -07:00
Aleix Conchillo Flaqué	854ffb0323	update CHANGELOG for DailyRESTHelper	2024-06-04 15:45:17 -07:00
Aleix Conchillo Flaqué	891b7b22ea	transports: push EndFrame/CancelFrame before stopping push task	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	c8d37a7227	pipeline(runner): add support for SIGTERM	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	489060881d	update macos-py3.10-requirements	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	d56a4cce1b	update CHANGELOG with latest changes	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	7eb9dfde38	pyproject: include langchain-community and langchain-openai	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	571e10f83e	services(anthropic): fix interruptions with anthropic	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	af202d4fe5	pipeline(task): introduce has_finished()	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	4057fbbcfd	transports(tk): fix pyaudio output stream cleanup	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	5cdb8a79a1	examples: use camera_out_is_live for live video	2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué	a674b43243	transport: remove redundant camera thread and switch audio pull for push	2024-06-04 15:43:54 -07:00
Jon Taylor	ac41f13b7c	Merge pull request #205 from pipecat-ai/daily_rest_helpers Created REST helpers for Daily covering commonly used methods for running / deployment	2024-06-04 22:26:39 +02:00
Jon Taylor	003b9887b1	made sip and sipuri optional and None	2024-06-04 19:03:58 +02:00
Jon Taylor	ba45c2ab5b	addressed review (urllib import and linting)	2024-06-04 18:39:35 +02:00
Aleix Conchillo Flaqué	9d36a48a80	Merge pull request #208 from pipecat-ai/aleix/cartesia-voice-load-startup services(cartesia): load voices on startup	2024-06-04 22:54:25 +08:00
Aleix Conchillo Flaqué	20a525635e	Merge pull request #201 from TomTom101/TomTom101/openai_tts Added OpenAI TTS (#196)	2024-06-04 22:53:56 +08:00
Aleix Conchillo Flaqué	659eceea95	services(cartesia): load voices on startup	2024-06-03 14:08:04 -07:00
TomTom101	d462c03d00	chore: Review comments	2024-06-03 20:13:15 +02:00
Jon Taylor	6591e07eb4	removed hardcoded 'https' from API url	2024-06-03 19:32:14 +02:00
Aleix Conchillo Flaqué	fe71825954	Merge pull request #206 from pipecat-ai/aleix/fix-deepgram-tts services(deepgram): fixed DeepgramTTSService	2024-06-04 00:28:53 +08:00
Aleix Conchillo Flaqué	43516f84fe	services(deepgram): fixed DeepgramTTSService	2024-06-03 07:53:46 -07:00
Jon Taylor	0849edb00b	added Daily REST helpers file for common methods used in Pipecat bots	2024-06-03 16:38:13 +02:00
Aleix Conchillo Flaqué	dd3b4083eb	Merge pull request #204 from TomTom101/TomTom101/langchain fix: Fixed imports, support new PipelineParams	2024-06-03 03:16:30 +08:00
TomTom101	89673a4040	test(langchain): Use new PipelineParams in test	2024-06-02 20:19:55 +02:00
TomTom101	410dbd3dfc	fix: Fixed imports, support new PipelineParams	2024-06-02 20:16:11 +02:00
TomTom101	7085b1ea3f	doc(openai): Added hint re the 24kHz sample rate	2024-06-01 20:35:46 +02:00
TomTom101	8683cae719	feat: OpenAITTS	2024-06-01 10:13:28 +02:00
Aleix Conchillo Flaqué	0197efa524	Merge pull request #200 from pipecat-ai/aleix/changelog-0.0.25 update CHANGELOG.md for version 0.0.25	2024-06-01 07:48:42 +08:00
Aleix Conchillo Flaqué	16e76caa33	update CHANGELOG.md for version 0.0.25	2024-05-31 16:48:03 -07:00
Aleix Conchillo Flaqué	1f5240694d	Merge pull request #199 from pipecat-ai/aleix/langchain-changelog move LangchainProcessor to processors/frameworks and update CHANGELOG	2024-06-01 07:46:51 +08:00
Aleix Conchillo Flaqué	f087151db7	move LangchainProcessor to processors/frameworks and update CHANGELOG	2024-05-31 16:45:39 -07:00
Aleix Conchillo Flaqué	0b691ff597	Merge pull request #198 from pipecat-ai/aleix/websocket-transport websocket transport support	2024-06-01 04:40:39 +08:00
TomTom101	ae049961b7	wip: untested	2024-05-31 22:30:52 +02:00
Aleix Conchillo Flaqué	0d6eee705f	Merge pull request #190 from TomTom101/TomTom101/langchain Langchain service	2024-06-01 04:21:12 +08:00
Aleix Conchillo Flaqué	58d20ec9dc	transport(websocket-server): add on_client_disconnected	2024-05-31 12:52:43 -07:00
Aleix Conchillo Flaqué	38befe1dc1	examples(websocket): rename server.py to bot.py	2024-05-31 12:09:54 -07:00
Aleix Conchillo Flaqué	2f335100a5	remove storage folder	2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué	3fef818843	examples(websocket-server): use VAD analyzer from transport	2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué	428c8af77e	transports(websocket): base class from BaseInputTransport	2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué	54fccd2e25	pipeline: cleanup processors one by one	2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué	66c6a5dc0f	transports(websocket): base class from BaseOutputTransport	2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué	92561ae19d	some event loop parameter updates	2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué	b85e93410b	transports(daily): fix event handlers callback	2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué	593993ba97	transports(base_input): remove unnecessary task	2024-05-31 11:37:41 -07:00
Aleix Conchillo Flaqué	7b8b606278	update CHANGELOG and create websocket-server instructions	2024-05-31 11:37:19 -07:00
Aleix Conchillo Flaqué	7116ad0607	examples: fix websocket-client audio playback	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	c507044277	examples: use gpt-4o model by default	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	5f45a9d90f	examples: websocket-server updates	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	e31e87aabd	transport(websocket): update audio_frame_size	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	2957416d90	serializers(protobuf): support id and name fields	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	b9b761b67a	added sample_rate and num_channels to protobuf AudioRawFrame	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	a7539e9317	transports: simplify and fix async and nested decorators	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	75575c0c68	use get_event_loop() and move event handlers to BaseTransport	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	77b3e08214	examples: add and update websocket examples	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	956b783c1a	transports: added new WebsocketServerTransport	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	e90c080470	serializers: added BaseSerializer	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	37aabaa03a	frames: generate protobuf pb2 file for pipecat package	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	3e289a7bef	pyproject: add protobuf dependency	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	6dd5e3fdf5	dev-requirements: add grpcio-tools	2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué	e60df3c7c0	Merge pull request #195 from pipecat-ai/aleix/function-calling-move-to-llmservice function calling move to LLMService	2024-06-01 02:36:29 +08:00
Aleix Conchillo Flaqué	42f772beed	examples: some function calling examples cleanup	2024-05-31 11:36:04 -07:00
Aleix Conchillo Flaqué	3655c4a0fc	services: move function calling registration to LLMService	2024-05-31 11:36:04 -07:00
Aleix Conchillo Flaqué	012dbffd94	update CHANGELOG.md for function calling	2024-05-31 11:36:03 -07:00
TomTom101	4b39efeee3	fix(langchain): try/catch langchain import in service; Only `langchain` is installed with the [langchain] extra (#190)	2024-05-31 10:19:27 +02:00
TomTom101	b19243ab75	fix: corrected hint to install Langchain libs	2024-05-30 10:53:42 +02:00
TomTom101	2bf094b950	test(langchain): Rewrite to unittest, make it meaningful	2024-05-30 10:43:33 +02:00
TomTom101	143033d7db	fix: install langchain-community with the langchain extra	2024-05-30 03:15:14 +02:00
TomTom101	335990c145	wip: hint to install langchain_community	2024-05-30 03:15:14 +02:00
TomTom101	6d24e836b0	wip: Example using LC message history	2024-05-30 03:15:14 +02:00
TomTom101	278a2fed56	wip: First stab at langchain support. Is this a service or processor? How to deal with conversation history? LC has sophisticated means for this, but they might get in the way of `LLMResponseAggregator`	2024-05-30 03:15:14 +02:00