removed space from event handler

added pause to start and new intro prompt
removed header comment from bot runner
2024-06-26 18:30:56 +01:00 · 2024-06-26 18:24:14 +01:00 · 2024-06-24 17:35:26 +01:00 · 2024-06-24 17:34:25 +01:00 · 2024-06-24 17:28:10 +01:00 · 2024-06-24 16:25:36 +01:00
32 changed files with 1440 additions and 413 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,14 +5,34 @@ All notable changes to **pipecat** will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [Unreleased]
+## [0.0.32] - 2024-06-22

 ### Added

+- Allow specifying a `DeepgramSTTService` url which allows using on-prem
+  Deepgram.
+
+- Added new `FastAPIWebsocketTransport`. This is a new websocket transport that
+  can be integrated with FastAPI websockets.
+
+- Added new `TwilioFrameSerializer`. This is a new serializer that knows how to
+  serialize and deserialize audio frames from Twilio.
+
+- Added Daily transport event: `on_dialout_answered`.  See
+  https://reference-python.daily.co/api_reference.html#daily.EventHandler
+
 - Added new `AzureSTTService`. This allows you to use Azure Speech-To-Text.

+### Performance
+
+- Convert `BaseOutputTransport` and `BaseOutputTransport` to fully use asyncio
+  and remove the use of threads.
+
 ### Other

+- Added `twilio-chatbot`. This is an example that shows how to integrate Twilio
+  phone numbers with a Pipecat bot.
+
 - Updated `07f-interruptible-azure.py` to use `AzureLLMService`,
  `AzureSTTService` and `AzureTTSService`.

--- a/examples/README.md
+++ b/examples/README.md
@@ -32,14 +32,15 @@ Next, follow the steps in the README for each demo.

 ## Projects:

-| Project                                      | Description                                                                                                                                | Services                                       |
-| -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------- |
-| [Simple Chatbot](simple-chatbot)             | Basic voice-driven conversational bot. A good starting point for learning the flow of the framework.                                       | Deepgram, OpenAI, Daily, Daily Prebuilt UI            |
-| [Storytelling Chatbot](storytelling-chatbot) | Stitches together multiple third-party services to create a collaborative storytime experience.                                            | Deepgram, ElevenLabs, Open AI, Fal, Daily, Custom UI  |
-| [Translation Chatbot](translation-chatbot)   | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI     |
-| [Moondream Chatbot](moondream-chatbot)       | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU**                                                       | Deepgram, OpenAI, Moondream, Daily, Daily Prebuilt UI |
-| Function-calling Chatbot (TBC)               | A chatbot that can call functions in response to user input.                                                                                | Deepgram, OpenAI, Fireworks, Daily, Daily Prebuilt UI |
-| [Dialin Chatbot](dialin-chatbot)             | A chatbot that connects to an incoming phone call from Daily or Twilio.                                                                                | Deepgram, OpenAI, ElevenLabs, Daily, Twilio |
+| Project                                      | Description                                                                                                                                | Services                                                          |
+|----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------|
+| [Simple Chatbot](simple-chatbot)             | Basic voice-driven conversational bot. A good starting point for learning the flow of the framework.                                       | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI            |
+| [Storytelling Chatbot](storytelling-chatbot) | Stitches together multiple third-party services to create a collaborative storytime experience.                                            | Deepgram, ElevenLabs, OpenAI, Fal, Daily, Custom UI               |
+| [Translation Chatbot](translation-chatbot)   | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI                 |
+| [Moondream Chatbot](moondream-chatbot)       | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU**                                                       | Deepgram, ElevenLabs, OpenAI, Moondream, Daily, Daily Prebuilt UI |
+| [Patient intake](patient-intake)             | A chatbot that can call functions in response to user input.                                                                               | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI            |
+| [Dialin Chatbot](dialin-chatbot)             | A chatbot that connects to an incoming phone call from Daily or Twilio.                                                                    | Deepgram, ElevenLabs, OpenAI, Daily, Twilio                       |
+| [Twilio Chatbot](twilio-chatbot)             | A chatbot that connects to an incoming phone call from Twilio.                                                                             | Deepgram, ElevenLabs, OpenAI, Daily, Twilio                       |

 > [!IMPORTANT]
 > These example projects use Daily as a WebRTC transport and can be joined using their hosted Prebuilt UI.
--- a/examples/fast-chatbot/.gitignore
+++ b/examples/fast-chatbot/.gitignore
@@ -0,0 +1,165 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+#   in version control.
+#   https://pdm.fming.dev/#use-with-ide
+.pdm.toml
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+#  and can be added to the global gitignore or merged into this file.  For a more nuclear
+#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/
+runpod.toml
+
+# custom script to recursively upgrade items in requirements.py
+upgrade_requirements.py
+.DS_Store
--- a/examples/fast-chatbot/README.md
+++ b/examples/fast-chatbot/README.md
--- a/examples/fast-chatbot/bot.py
+++ b/examples/fast-chatbot/bot.py
@@ -0,0 +1,164 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+from loguru import logger
+import argparse
+import asyncio
+import aiohttp
+import os
+import sys
+import time
+from typing import Optional
+
+from pydantic import BaseModel, ValidationError
+
+from pipecat.vad.vad_analyzer import VADParams
+from pipecat.vad.silero import SileroVADAnalyzer
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+from pipecat.services.openai import OpenAILLMService
+from pipecat.services.deepgram import DeepgramSTTService
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.frames.frames import LLMMessagesFrame, EndFrame
+
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator, LLMUserResponseAggregator
+)
+
+from helpers import (
+    ClearableDeepgramTTSService,
+    AudioVolumeTimer,
+    TranscriptionTimingLogger
+)
+
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level=os.getenv("LOG_LEVEL", "DEBUG"))
+
+
+class BotSettings(BaseModel):
+    room_url: str
+    room_token: str
+    bot_name: str = "Pipecat"
+    prompt: Optional[str] = "You are a helpful assistant."
+    deepgram_api_key: Optional[str] = os.getenv("DEEPGRAM_API_KEY", None)
+    deepgram_voice: Optional[str] = os.getenv("DEEPGRAM_VOICE", "aura-asteria-en")
+    deepgram_tts_base_url: Optional[str] = os.getenv(
+        "DEEPGRAM_TTS_BASE_URL", "https://api.deepgram.com/v1/speak")
+    deepgram_stt_base_url: Optional[str] = os.getenv(
+        "DEEPGRAM_STT_BASE_URL", "https://api.deepgram.com/v1/speak")
+    openai_api_key: Optional[str] = os.getenv("OPENAI_API_KEY", None),
+    openai_model: Optional[str] = os.getenv("OPENAI_MODEL", None),
+    openai_base_url: Optional[str] = os.getenv("OPENAI_BASE_URL", None)
+    vad_stop_secs: Optional[float] = os.getenv("VAD_STOP_SECS", 0.200)
+
+
+async def main(settings: BotSettings):
+    async with aiohttp.ClientSession() as session:
+        transport = DailyTransport(
+            settings.room_url,
+            settings.room_token,
+            settings.bot_name,
+            DailyParams(
+                audio_out_enabled=True,
+                transcription_enabled=False,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(params=VADParams(
+                    stop_secs=settings.vad_stop_secs
+                )),
+                vad_audio_passthrough=True
+            )
+        )
+
+        stt = DeepgramSTTService(
+            name="STT",
+            api_key=settings.deepgram_api_key,
+            url=settings.deepgram_stt_base_url
+        )
+
+        tts = ClearableDeepgramTTSService(
+            name="Voice",
+            aiohttp_session=session,
+            api_key=settings.deepgram_api_key,
+            voice=settings.deepgram_voice,
+            **({'base_url': url} if (url := settings.deepgram_tts_base_url) else {})
+        )
+
+        llm = OpenAILLMService(
+            name="LLM",
+            api_key=settings.openai_api_key,
+            model=settings.openai_model,
+            base_url=settings.openai_base_url,
+        )
+
+        messages = [
+            {
+                "role": "system",
+                "content": settings.prompt,
+            },
+        ]
+
+        avt = AudioVolumeTimer()
+        tl = TranscriptionTimingLogger(avt)
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),   # Transport user input
+            avt,                 # Audio volume timer
+            stt,                 # Speech-to-text
+            tl,                  # Transcription timing logger
+            tma_in,              # User responses
+            llm,                 # LLM
+            tts,                 # TTS
+            transport.output(),  # Transport bot output
+            tma_out,             # Assistant spoken responses
+        ])
+
+        task = PipelineTask(
+            pipeline,
+            PipelineParams(
+                allow_interruptions=True,
+                enable_metrics=True,
+                report_only_initial_ttfb=True
+            ))
+
+        # When the participant leaves, we exit the bot.
+        @transport.event_handler("on_participant_left")
+        async def on_participant_left(transport, participant, reason):
+            await task.queue_frame(EndFrame())
+
+        # When the first participant joins, the bot should introduce itself.
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            # Provide some air whilst tracks subscribe
+            time.sleep(2)
+            messages.append(
+                {
+                    "role": "system",
+                    "content": "Briefly introduce yourself by saying 'hello, I'm FastBot, how can I help you today?'"})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        runner = PipelineRunner()
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="Pipecat Bot")
+    parser.add_argument("-s", "--settings", type=str, required=True, help="Pipecat bot settings")
+
+    args, unknown = parser.parse_known_args()
+
+    try:
+        settings = BotSettings.model_validate_json(args.settings)
+        asyncio.run(main(settings))
+    except ValidationError as e:
+        print(e)
--- a/examples/fast-chatbot/bot_runner.py
+++ b/examples/fast-chatbot/bot_runner.py
@@ -0,0 +1,164 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import os
+import argparse
+import subprocess
+
+from pydantic import BaseModel, ValidationError
+from typing import Optional
+
+from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomObject, DailyRoomProperties, DailyRoomParams
+
+from fastapi import FastAPI, Request, HTTPException
+from fastapi.middleware.cors import CORSMiddleware
+from fastapi.responses import JSONResponse
+
+from bot import BotSettings
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+
+# ------------ Configuration ------------ #
+
+MAX_SESSION_TIME = 5 * 60  # 5 minutes
+REQUIRED_ENV_VARS = ['DAILY_API_URL', 'DAILY_API_KEY', 'DEEPGRAM_API_KEY']
+
+daily_rest_helper = DailyRESTHelper(
+    os.getenv("DAILY_API_KEY", ""),
+    os.getenv("DAILY_API_URL", 'https://api.daily.co/v1'))
+
+
+class RunnerSettings(BaseModel):
+    prompt: Optional[
+        str] = "You are a fast, low-latency chatbot. Your goal is to demonstrate voice-driven AI capabilities at human-like speeds. When introducing yourself briefly mention your goal is to showcase speed and conversational flow. The technology powering you is Daily for transport, Cerebrium for GPU hosting, Llama 3 (8-B version) LLM, and Deepgram for speech-to-text and text-to-speech. You are hosted on the east coast of the United States. Respond to what the user said in a creative and helpful way, but keep responses short and legible. Ensure responses contain only words. Check again that you have not included special characters other than '?' or '!'."
+    deepgram_voice: Optional[str] = os.getenv("DEEPGRAM_VOICE")
+    openai_model: Optional[str] = os.getenv("OPENAI_MODEL", "gpt-4o")
+    openai_api_key: Optional[str] = os.getenv("OPENAI_API_KEY")
+    test: Optional[bool] = None
+
+# ----------------- API ----------------- #
+
+
+app = FastAPI()
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"]
+)
+
+# ----------------- Main ----------------- #
+
+
+@app.post("/start_bot")
+async def start_bot(request: Request) -> JSONResponse:
+    runner_settings = RunnerSettings()
+    try:
+        request_body = await request.body()
+        if len(request_body) > 0:
+            runner_settings = RunnerSettings.model_validate_json(request_body)
+    except ValidationError as e:
+        raise HTTPException(
+            status_code=400,
+            detail=f"Invalid request: {e}")
+    except Exception as e:
+        # If no data in request, pass
+        pass
+
+    # Is this a webhook creation request?
+    if runner_settings.test is not None:
+        return JSONResponse({"test": True})
+
+    # Use specified room URL, or create a new one if not specified
+    room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", "")
+
+    if not room_url:
+        params = DailyRoomParams(
+            properties=DailyRoomProperties()
+        )
+        try:
+            room: DailyRoomObject = daily_rest_helper.create_room(params=params)
+        except Exception as e:
+            raise HTTPException(
+                status_code=500,
+                detail=f"Unable to provision room {e}")
+    else:
+        # Check passed room URL exists, we should assume that it already has a sip set up
+        try:
+            room: DailyRoomObject = daily_rest_helper.get_room_from_url(room_url)
+        except Exception:
+            raise HTTPException(
+                status_code=500, detail=f"Room not found: {room_url}")
+
+    # Give the agent a token to join the session
+    token = daily_rest_helper.get_token(room.url, MAX_SESSION_TIME)
+
+    if not room or not token:
+        raise HTTPException(
+            status_code=500, detail=f"Failed to get token for room: {room_url}")
+
+    # Spawn a new agent, and join the user session
+    try:
+        bot_settings = BotSettings(
+            room_url=room.url,
+            room_token=token,
+            prompt=runner_settings.prompt,
+            deepgram_voice=runner_settings.deepgram_voice,
+            openai_model=runner_settings.openai_model,
+            openai_api_key=runner_settings.openai_api_key,
+        )
+        bot_settings_str = bot_settings.model_dump_json(exclude_none=True)
+
+        subprocess.Popen(
+            [f"python3 -m bot -s '{bot_settings_str}'"],
+            shell=True,
+            bufsize=1,
+            cwd=os.path.dirname(os.path.abspath(__file__)))
+    except Exception as e:
+        raise HTTPException(
+            status_code=500, detail=f"Failed to start subprocess: {e}")
+
+    # Grab a token for the user to join with
+    user_token = daily_rest_helper.get_token(room.url, MAX_SESSION_TIME)
+
+    return JSONResponse({
+        "room_url": room.url,
+        "token": user_token,
+    })
+
+
+if __name__ == "__main__":
+    # Check environment variables
+    for env_var in REQUIRED_ENV_VARS:
+        if env_var not in os.environ:
+            raise Exception(f"Missing environment variable: {env_var}.")
+
+    parser = argparse.ArgumentParser(description="Pipecat Bot Runner")
+    parser.add_argument("--host", type=str,
+                        default=os.getenv("HOST", "0.0.0.0"), help="Host address")
+    parser.add_argument("--port", type=int,
+                        default=os.getenv("PORT", 7860), help="Port number")
+    parser.add_argument("--reload", action="store_true",
+                        default=True, help="Reload code on change")
+
+    config = parser.parse_args()
+
+    try:
+        import uvicorn
+
+        uvicorn.run(
+            "bot_runner:app",
+            host=config.host,
+            port=config.port,
+            reload=config.reload
+        )
+
+    except KeyboardInterrupt:
+        print("Pipecat runner shutting down...")
--- a/examples/fast-chatbot/env.example
+++ b/examples/fast-chatbot/env.example
@@ -0,0 +1,12 @@
+DAILY_SAMPLE_ROOM_URL= #optional: use the same room each time, or create a new one if unset
+DAILY_API_KEY=
+DAILY_API_URL=
+
+DEEPGRAM_API_KEY=
+DEEPGRAM_VOICE=
+DEEPGRAM_STT_URL=
+DEEPGRAM_TTS_BASE_URL=
+
+OPENAI_API_KEY=
+OPENAI_MODEL=
+OPENAI_BASE_URL=
--- a/examples/fast-chatbot/helpers.py
+++ b/examples/fast-chatbot/helpers.py
@@ -1,50 +1,28 @@
-#
-# Copyright (c) 2024, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
+from loguru import logger
 import asyncio
-import aiohttp
-import os
-import sys
-import json
+import math
+import struct
+import time
 from dataclasses import dataclass, field
 from typing import List

+
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.frames.frames import (
    Frame,
-    TextFrame,
-    LLMMessagesFrame,
-    TranscriptionFrame,
-    InterimTranscriptionFrame,
    AudioRawFrame,
+    InterimTranscriptionFrame,
+    TranscriptionFrame,
+    TextFrame,
    StartInterruptionFrame,
-    StopInterruptionFrame,
    LLMFullResponseStartFrame,
-    TTSStoppedFrame
+    TTSStoppedFrame,
+    MetricsFrame
 )
-from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
-from pipecat.processors.logger import FrameLogger
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
+
+from pipecat.vad.vad_analyzer import VADAnalyzer, VADState
 from pipecat.services.deepgram import DeepgramTTSService
-from pipecat.services.openai import OpenAILLMService, OpenAILLMContext, OpenAILLMContextFrame
-from pipecat.transports.services.daily import DailyParams, DailyTransport, DailyTransportMessageFrame
-from pipecat.vad.silero import SileroVADAnalyzer
-from pipecat.vad.vad_analyzer import VADAnalyzer, VADParams, VADState
-
-
-from runner import configure
-
-from loguru import logger
-
-from dotenv import load_dotenv
-load_dotenv(override=True)
-
-logger.remove(0)
-logger.add(sys.stderr, level="DEBUG")
+from pipecat.services.openai import OpenAILLMContext, OpenAILLMContextFrame


 class GreedyLLMAggregator(FrameProcessor):
@@ -140,6 +118,9 @@ class VADGate(FrameProcessor):
                    logger.debug(f"expected a text frame, but received {frame}")
                    await self.push_frame(frame, direction)
                return
+            else:
+                if isinstance(frame, TextFrame):
+                    logger.error(f"XXXXXXXXXXXXXXXXXXX received a text frame, wasn't expecting it.")

            if isinstance(frame, AudioRawFrame):
                # if our buffer is empty or has a "finished" sentence at the end,
@@ -211,11 +192,8 @@ class VADGate(FrameProcessor):
                await asyncio.sleep(duration - 20 / 1000)
                if self.context:
                    logger.debug(f"Appending assistant message to context: [{s.text_frame.text}]")
-                    if self.context.messages and self.context.messages[-1]["role"] == "assistant":
-                        self.context.messages[-1]["content"] += " " + s.text_frame.text
-                    else:
-                        self.context.messages.append(
-                            {"role": "assistant", "content": s.text_frame.text}
+                    self.context.messages.append(
+                        {"role": "assistant", "content": s.text_frame.text}
                    )
                await self.push_frame(s.text_frame)

@@ -223,102 +201,67 @@ class VADGate(FrameProcessor):
            logger.debug(f"Exception {e}")


-async def main(room_url: str, token):
-    async with aiohttp.ClientSession() as session:
-        transport = DailyTransport(
-            room_url,
-            token,
-            "Respond bot",
-            DailyParams(
-                audio_out_enabled=True,
-                transcription_enabled=True,
-                vad_enabled=True,
-                vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.5))
-            )
-        )
+class TranscriptionTimingLogger(FrameProcessor):
+    def __init__(self, avt):
+        super().__init__()
+        self.name = "Transcription"
+        self._avt = avt

-        tts = ClearableDeepgramTTSService(
-            aiohttp_session=session,
-            api_key=os.getenv("DEEPGRAM_API_KEY"),
-            voice="aura-asteria-en",
-            # base_url="http://0.0.0.0:8080/v1/speak"
-        )
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        try:
+            await super().process_frame(frame, direction)
+            if isinstance(frame, TranscriptionFrame):
+                elapsed = time.time() - self._avt.last_transition_ts
+                logger.debug(f"Transcription TTF: {elapsed}")
+                await self.push_frame(MetricsFrame(ttfb={self.name: elapsed}))

-        llm = OpenAILLMService(
-            # To use OpenAI
-            api_key=os.getenv("OPENAI_API_KEY"),
-            model="gpt-4o"
-            # Or, to use a local vLLM (or similar) api server
-            # model="meta-llama/Meta-Llama-3-8B-Instruct",
-            # model="neuralmagic/Meta-Llama-3-70B-Instruct-FP8",
-            # base_url="http://0.0.0.0:8000/v1"
-        )
-
-        messages = [
-            {
-                "role": "system",
-                "content": "You are a helpful LLM communicating via audio. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
-            },
-        ]
-
-        ctx = OpenAILLMContext()
-        greedy = GreedyLLMAggregator(name="greedy", context=ctx)
-        gate = VADGate(name="gate", vad_analyzer=transport.input().vad_analyzer(), context=ctx)
-
-        pipeline = Pipeline([
-            transport.input(),   # Transport user input
-            greedy,
-            llm,                 # LLM
-            tts,                 # TTS
-            gate,
-            transport.output(),  # Transport bot output
-            # FrameLogger()
-        ])
-
-        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True, enable_metrics=True))
-
-        # When a participant joins, start transcription for that participant so the
-        # bot can "hear" and respond to them.
-        @ transport.event_handler("on_participant_joined")
-        async def on_participant_joined(transport, participant):
-            transport.capture_participant_transcription(participant["id"])
-
-        # When the first participant joins, the bot should introduce itself.
-        @ transport.event_handler("on_first_participant_joined")
-        async def on_first_participant_joined(transport, participant):
-            messages.append(
-                {"role": "system", "content": "Please introduce yourself to the user."})
-            await task.queue_frames([LLMMessagesFrame(messages)])
-
-        # Handle "latency-ping" messages. The client will send app messages that look like
-        # this:
-        #   { "latency-ping": { ts: <client-side timestamp> }}
-        #
-        # We want to send an immediate pong back to the client from this handler function.
-        # Also, we will push a frame into the top of the pipeline and send it after the
-        #
-        @ transport.event_handler("on_app_message")
-        async def on_app_message(transport, message, sender):
-            try:
-                if "latency-ping" in message:
-                    logger.debug(f"Received latency ping app message: {message}")
-                    ts = message["latency-ping"]["ts"]
-                    # Send immediately
-                    transport.output().send_message(DailyTransportMessageFrame(
-                        message={"latency-pong-msg-handler": {"ts": ts}},
-                        participant_id=sender))
-                    # And push to the pipeline for the Daily transport.output to send
-                    await tma_in.push_frame(
-                        DailyTransportMessageFrame(
-                            message={"latency-pong-pipeline-delivery": {"ts": ts}},
-                            participant_id=sender))
-            except Exception as e:
-                logger.debug(f"message handling error: {e} - {message}")
-
-        runner = PipelineRunner()
-        await runner.run(task)
+            await self.push_frame(frame, direction)
+        except Exception as e:
+            logger.debug(f"Exception {e}")


-if __name__ == "__main__":
-    (url, token) = configure()
-    asyncio.run(main(url, token))
+class AudioVolumeTimer(FrameProcessor):
+    def __init__(self):
+        super().__init__()
+        self.last_transition_ts = 0
+        self._prev_volume = -80
+        self._speech_volume_threshold = -50
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+
+        if isinstance(frame, AudioRawFrame):
+            volume = self.calculate_volume(frame)
+            # print(f"Audio volume: {volume:.2f} dB")
+            if (volume >= self._speech_volume_threshold and
+                    self._prev_volume < self._speech_volume_threshold):
+                # logger.debug("transition above speech volume threshold")
+                self.last_transition_ts = time.time()
+            elif (volume < self._speech_volume_threshold and
+                    self._prev_volume >= self._speech_volume_threshold):
+                # logger.debug("transition below non-speech volume threshold")
+                self.last_transition_ts = time.time()
+            self._prev_volume = volume
+
+        await self.push_frame(frame, direction)
+
+    def calculate_volume(self, frame: AudioRawFrame) -> float:
+        if frame.num_channels != 1:
+            raise ValueError(f"Expected 1 channel, got {frame.num_channels}")
+
+        # Unpack audio data into 16-bit integers
+        fmt = f"{len(frame.audio) // 2}h"
+        audio_samples = struct.unpack(fmt, frame.audio)
+
+        # Calculate RMS
+        sum_squares = sum(sample**2 for sample in audio_samples)
+        rms = math.sqrt(sum_squares / len(audio_samples))
+
+        # Convert RMS to decibels (dB)
+        # Reference: maximum value for 16-bit audio is 32767
+        if rms > 0:
+            db = 20 * math.log10(rms / 32767)
+        else:
+            db = -96  # Minimum value (almost silent)
+
+        return db
--- a/examples/fast-chatbot/requirements.txt
+++ b/examples/fast-chatbot/requirements.txt
@@ -0,0 +1,6 @@
+pipecat-ai[daily,openai,silero,deepgram]
+fastapi
+uvicorn
+requests
+python-dotenv
+loguru
--- a/examples/foundational/tmp-khk-sonnet-ttft.py
+++ b/examples/foundational/tmp-khk-sonnet-ttft.py
@@ -1,114 +0,0 @@
-#
-# Copyright (c) 2024, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import asyncio
-import aiohttp
-import os
-import sys
-
-from pipecat.frames.frames import LLMMessagesFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.llm_response import (
-    LLMAssistantResponseAggregator, LLMUserResponseAggregator)
-from pipecat.services.cartesia import CartesiaTTSService
-from pipecat.services.anthropic import AnthropicLLMService
-from pipecat.transports.services.daily import DailyParams, DailyTransport
-from pipecat.vad.silero import SileroVADAnalyzer
-
-from runner import configure
-
-from loguru import logger
-
-from dotenv import load_dotenv
-load_dotenv(override=True)
-
-logger.remove(0)
-logger.add(sys.stderr, level="DEBUG")
-
-
-async def main(room_url: str, token):
-    async with aiohttp.ClientSession() as session:
-        transport = DailyTransport(
-            room_url,
-            token,
-            "Respond bot",
-            DailyParams(
-                audio_out_enabled=True,
-                transcription_enabled=True,
-                vad_enabled=True,
-                vad_analyzer=SileroVADAnalyzer()
-            )
-        )
-
-        tts = CartesiaTTSService(
-            api_key=os.getenv("CARTESIA_API_KEY"),
-            voice_name=sys.argv[1] if len(sys.argv) > 1 else "British Lady"
-        )
-
-        llm = AnthropicLLMService(
-            api_key=os.getenv("ANTHROPIC_API_KEY"),
-            model="claude-3-5-sonnet-20240620",
-            temperature=1.0
-        )
-
-        # todo: think more about how to handle system prompts in a more general way. OpenAI,
-        # Google, and Anthropic all have slightly different approaches to providing a system
-        # prompt.
-        messages = [
-            {
-                "role": "system",
-                "content": (
-                    "You are participating in a friendly competition to invent creative "
-                    "new ice cream flavors. Say the craziest flavor you can think of "
-                    "then wait for your opponent to come up with a different crazy flavor. "
-                    "then respond with another flavor idea. Repeat forever. Say only the "
-                    "ice cream flavors and nothing else. End each ice cream flavor statement "
-                    "with an exclamation mark! Go ..."
-                )
-            },
-        ]
-
-        tma_in = LLMUserResponseAggregator(messages)
-        tma_out = LLMAssistantResponseAggregator(messages)
-
-        pipeline = Pipeline([
-            transport.input(),   # Transport user input
-            tma_in,              # User responses
-            llm,                 # LLM
-            tts,                 # TTS
-            transport.output(),  # Transport bot output
-            tma_out,             # Assistant spoken responses
-        ])
-
-        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True, enable_metrics=True))
-
-        # When a participant joins, start transcription for that participant so the
-        # bot can "hear" and respond to them.
-        @ transport.event_handler("on_participant_joined")
-        async def on_participant_joined(transport, participant):
-            transport.capture_participant_transcription(participant["id"])
-
-        # When the first participant joins, the bot should introduce itself.
-        @ transport.event_handler("on_first_participant_joined")
-        async def on_first_participant_joined(transport, participant):
-            await task.queue_frames([LLMMessagesFrame(messages)])
-
-        runner = PipelineRunner()
-
-        await runner.run(task)
-
-
-if __name__ == "__main__":
-    (url, token) = configure()
-    asyncio.run(main(url, token))
-
-# '{"action":"app-message","data":{"metrics":{"ttfb":[{"name":"AnthropicLLMService#0","time":0.5975627899169922}]},"type":"pipecat-metrics"},"fromId":"592d3489-90ba-401d-a760-c1a863d64a4a","callFrameId":"17189290998160.035120590426112264"}'
-# [Durian and Limburger Cheese Charcoal Activated Tar Twist!]
-# [Fermented Fish Sauce and Ghost Pepper Bubblegum Cotton Candy Nightmare!]
-# [Spoiled Yogurt and Ghost Pepper Gummy Bear Blizzard!]
-# [Matcha Green Tea and Sour Gummy Worm Fusion!]
--- a/examples/twilio-chatbot/.gitignore
+++ b/examples/twilio-chatbot/.gitignore
@@ -0,0 +1,161 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+#   in version control.
+#   https://pdm.fming.dev/#use-with-ide
+.pdm.toml
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+#  and can be added to the global gitignore or merged into this file.  For a more nuclear
+#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/
+runpod.toml
--- a/examples/twilio-chatbot/Dockerfile
+++ b/examples/twilio-chatbot/Dockerfile
@@ -0,0 +1,20 @@
+# Use an official Python runtime as a parent image
+FROM python:3.10-bullseye
+
+# Set the working directory in the container
+WORKDIR /twilio-chatbot
+
+# Copy the requirements file into the container
+COPY requirements.txt .
+
+# Install any needed packages specified in requirements.txt
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy the current directory contents into the container
+COPY . .
+
+# Expose the desired port
+EXPOSE 8765
+
+# Run the application
+CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8765"]
--- a/examples/twilio-chatbot/README.md
+++ b/examples/twilio-chatbot/README.md
@@ -0,0 +1,82 @@
+# Twilio Chatbot
+
+This project is a FastAPI-based chatbot that integrates with Twilio to handle WebSocket connections and provide real-time communication. The project includes endpoints for starting a call and handling WebSocket connections.
+
+## Table of Contents
+
+- [Features](#features)
+- [Requirements](#requirements)
+- [Installation](#installation)
+- [Configure Twilio URLs](#configure-twilio-urls)
+- [Running the Application](#running-the-application)
+- [Usage](#usage)
+
+## Features
+
+- **FastAPI**: A modern, fast (high-performance), web framework for building APIs with Python 3.6+.
+- **WebSocket Support**: Real-time communication using WebSockets.
+- **CORS Middleware**: Allowing cross-origin requests for testing.
+- **Dockerized**: Easily deployable using Docker.
+
+## Requirements
+
+- Python 3.10
+- Docker (for containerized deployment)
+- ngrok (for tunneling)
+- Twilio Account
+
+## Installation
+
+1. **Set up a virtual environment** (optional but recommended):
+    ```sh
+    python -m venv venv
+    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
+    ```
+
+2. **Install dependencies**:
+    ```sh
+    pip install -r requirements.txt
+    ```
+
+3. **Create .env**:
+    create .env based on env.example
+
+4. **Install ngrok**:
+    Follow the instructions on the [ngrok website](https://ngrok.com/download) to download and install ngrok.
+
+## Configure Twilio URLs
+
+1. **Update the Twilio Webhook**:
+    Copy the ngrok URL and update your Twilio phone number webhook URL to `http://<ngrok_url>/start_call`.
+
+2. **Update the streams.xml**:
+    Copy the ngrok URL and update templates/streams.xml with `wss://<ngrok_url>/ws`.
+
+## Running the Application
+
+### Using Python
+
+1. **Run the FastAPI application**:
+    ```sh
+    python server.py
+    ```
+
+2. **Start ngrok**:
+    In a new terminal, start ngrok to tunnel the local server:
+    ```sh
+    ngrok http 8765
+    ```
+### Using Docker
+
+1. **Build the Docker image**:
+    ```sh
+    docker build -t twilio-chatbot .
+    ```
+
+2. **Run the Docker container**:
+    ```sh
+    docker run -it --rm -p 8765:8765 twilio-chatbot
+    ```
+## Usage
+
+To start a call, simply make a call to your Twilio phone number. The webhook URL will direct the call to your FastAPI application, which will handle it accordingly.
--- a/examples/twilio-chatbot/bot.py
+++ b/examples/twilio-chatbot/bot.py
@@ -0,0 +1,88 @@
+import aiohttp
+import os
+import sys
+
+from pipecat.frames.frames import EndFrame, LLMMessagesFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import (
+    LLMAssistantResponseAggregator,
+    LLMUserResponseAggregator
+)
+from pipecat.services.openai import OpenAILLMService
+from pipecat.services.deepgram import DeepgramSTTService
+from pipecat.services.elevenlabs import ElevenLabsTTSService
+from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketTransport, FastAPIWebsocketParams
+from pipecat.vad.silero import SileroVADAnalyzer
+
+from loguru import logger
+
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+async def run_bot(websocket_client):
+    async with aiohttp.ClientSession() as session:
+        transport = FastAPIWebsocketTransport(
+            websocket=websocket_client,
+            params=FastAPIWebsocketParams(
+                audio_out_enabled=True,
+                add_wav_header=False,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(),
+                vad_audio_passthrough=True
+            )
+        )
+
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o")
+
+        stt = DeepgramSTTService(api_key=os.getenv('DEEPGRAM_API_KEY'))
+
+        tts = ElevenLabsTTSService(
+            aiohttp_session=session,
+            api_key=os.getenv("ELEVENLABS_API_KEY"),
+            voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
+        )
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in an audio call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        tma_in = LLMUserResponseAggregator(messages)
+        tma_out = LLMAssistantResponseAggregator(messages)
+
+        pipeline = Pipeline([
+            transport.input(),   # Websocket input from client
+            stt,                 # Speech-To-Text
+            tma_in,              # User responses
+            llm,                 # LLM
+            tts,                 # Text-To-Speech
+            transport.output(),  # Websocket output to client
+            tma_out              # LLM responses
+        ])
+
+        task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
+
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            # Kick off the conversation.
+            messages.append(
+                {"role": "system", "content": "Please introduce yourself to the user."})
+            await task.queue_frames([LLMMessagesFrame(messages)])
+
+        @transport.event_handler("on_client_disconnected")
+        async def on_client_disconnected(transport, client):
+            await task.queue_frames([EndFrame()])
+
+        runner = PipelineRunner(handle_sigint=False)
+
+        await runner.run(task)
--- a/examples/twilio-chatbot/env.example
+++ b/examples/twilio-chatbot/env.example
@@ -0,0 +1,4 @@
+OPENAI_API_KEY=
+DEEPGRAM_API_KEY=
+ELEVENLABS_API_KEY=
+ELEVENLABS_VOICE_ID=
--- a/examples/twilio-chatbot/requirements.txt
+++ b/examples/twilio-chatbot/requirements.txt
@@ -0,0 +1,5 @@
+pipecat-ai[daily,openai,silero,deepgram]
+fastapi
+uvicorn
+python-dotenv
+loguru
--- a/examples/twilio-chatbot/server.py
+++ b/examples/twilio-chatbot/server.py
@@ -0,0 +1,34 @@
+import uvicorn
+
+from fastapi import FastAPI, WebSocket
+from fastapi.middleware.cors import CORSMiddleware
+from starlette.responses import HTMLResponse
+
+from bot import run_bot
+
+app = FastAPI()
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],  # Allow all origins for testing
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+
+@app.post('/start_call')
+async def start_call():
+    print("POST TwiML")
+    return HTMLResponse(content=open("templates/streams.xml").read(), media_type="application/xml")
+
+
+@app.websocket("/ws")
+async def websocket_endpoint(websocket: WebSocket):
+    await websocket.accept()
+    print("WebSocket connection accepted")
+    await run_bot(websocket)
+
+
+if __name__ == "__main__":
+    uvicorn.run(app, host="0.0.0.0", port=8765)
--- a/examples/twilio-chatbot/templates/streams.xml
+++ b/examples/twilio-chatbot/templates/streams.xml
@@ -0,0 +1,7 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<Response>
+  <Connect>
+    <Stream url="wss://<your server url>/ws"></Stream>
+  </Connect>
+  <Pause length="40"/>
+</Response>
--- a/linux-py3.10-requirements.txt
+++ b/linux-py3.10-requirements.txt
@@ -26,6 +26,8 @@ anyio==4.4.0
    #   anthropic
    #   httpx
    #   openai
+    #   starlette
+    #   watchfiles
 async-timeout==4.0.3
    # via
    #   aiohttp
@@ -54,12 +56,15 @@ cffi==1.16.0
 charset-normalizer==3.3.2
    # via requests
 click==8.1.7
-    # via flask
+    # via
+    #   flask
+    #   typer
+    #   uvicorn
 coloredlogs==15.0.1
    # via onnxruntime
 ctranslate2==4.3.1
    # via faster-whisper
-daily-python==0.9.1
+daily-python==0.10.0
    # via pipecat-ai (pyproject.toml)
 dataclasses-json==0.6.7
    # via
@@ -71,17 +76,25 @@ distro==1.9.0
    # via
    #   anthropic
    #   openai
+dnspython==2.6.1
+    # via email-validator
 einops==0.8.0
    # via pipecat-ai (pyproject.toml)
+email-validator==2.2.0
+    # via fastapi
 exceptiongroup==1.2.1
    # via
    #   anyio
    #   pytest
 fal-client==0.4.0
    # via pipecat-ai (pyproject.toml)
+fastapi==0.111.0
+    # via pipecat-ai (pyproject.toml)
+fastapi-cli==0.0.4
+    # via fastapi
 faster-whisper==1.0.2
    # via pipecat-ai (pyproject.toml)
-filelock==3.15.1
+filelock==3.15.3
    # via
    #   huggingface-hub
    #   pyht
@@ -113,7 +126,7 @@ google-api-core[grpc]==2.19.0
    #   google-ai-generativelanguage
    #   google-api-python-client
    #   google-generativeai
-google-api-python-client==2.133.0
+google-api-python-client==2.134.0
    # via google-generativeai
 google-auth==2.30.0
    # via
@@ -140,24 +153,29 @@ grpcio==1.64.1
 grpcio-status==1.62.2
    # via google-api-core
 h11==0.14.0
-    # via httpcore
+    # via
+    #   httpcore
+    #   uvicorn
 httpcore==1.0.5
    # via httpx
 httplib2==0.22.0
    # via
    #   google-api-python-client
    #   google-auth-httplib2
+httptools==0.6.1
+    # via uvicorn
 httpx==0.27.0
    # via
    #   anthropic
    #   cartesia
    #   deepgram-sdk
    #   fal-client
+    #   fastapi
    #   openai
    #   openpipe
 httpx-sse==0.4.0
    # via fal-client
-huggingface-hub==0.23.3
+huggingface-hub==0.23.4
    # via
    #   faster-whisper
    #   timm
@@ -168,6 +186,7 @@ humanfriendly==10.0
 idna==3.7
    # via
    #   anyio
+    #   email-validator
    #   httpx
    #   requests
    #   yarl
@@ -177,41 +196,46 @@ itsdangerous==2.2.0
    # via flask
 jinja2==3.1.4
    # via
+    #   fastapi
    #   flask
    #   torch
 jsonpatch==1.33
    # via langchain-core
 jsonpointer==3.0.0
    # via jsonpatch
-langchain==0.2.3
+langchain==0.2.5
    # via
    #   langchain-community
    #   pipecat-ai (pyproject.toml)
-langchain-community==0.2.4
+langchain-community==0.2.5
    # via pipecat-ai (pyproject.toml)
-langchain-core==0.2.5
+langchain-core==0.2.9
    # via
    #   langchain
    #   langchain-community
    #   langchain-openai
    #   langchain-text-splitters
-langchain-openai==0.1.8
+langchain-openai==0.1.9
    # via pipecat-ai (pyproject.toml)
 langchain-text-splitters==0.2.1
    # via langchain
-langsmith==0.1.77
+langsmith==0.1.81
    # via
    #   langchain
    #   langchain-community
    #   langchain-core
 loguru==0.7.2
    # via pipecat-ai (pyproject.toml)
+markdown-it-py==3.0.0
+    # via rich
 markupsafe==2.1.5
    # via
    #   jinja2
    #   werkzeug
 marshmallow==3.21.3
    # via dataclasses-json
+mdurl==0.1.2
+    # via markdown-it-py
 mpmath==1.3.0
    # via sympy
 multidict==6.0.5
@@ -273,9 +297,11 @@ openai==1.26.0
    #   pipecat-ai (pyproject.toml)
 openpipe==4.14.0
    # via pipecat-ai (pyproject.toml)
-orjson==3.10.4
-    # via langsmith
-packaging==23.2
+orjson==3.10.5
+    # via
+    #   fastapi
+    #   langsmith
+packaging==24.1
    # via
    #   huggingface-hub
    #   langchain-core
@@ -289,7 +315,7 @@ pillow==10.3.0
    #   torchvision
 pluggy==1.5.0
    # via pytest
-proto-plus==1.23.0
+proto-plus==1.24.0
    # via
    #   google-ai-generativelanguage
    #   google-api-core
@@ -317,6 +343,7 @@ pycparser==2.22
 pydantic==2.7.4
    # via
    #   anthropic
+    #   fastapi
    #   google-generativeai
    #   langchain
    #   langchain-core
@@ -324,6 +351,8 @@ pydantic==2.7.4
    #   openai
 pydantic-core==2.18.4
    # via pydantic
+pygments==2.18.0
+    # via rich
 pyht==0.0.28
    # via pipecat-ai (pyproject.toml)
 pyloudnorm==0.1.1
@@ -337,7 +366,11 @@ pytest-asyncio==0.23.7
 python-dateutil==2.9.0.post0
    # via openpipe
 python-dotenv==1.0.1
-    # via pipecat-ai (pyproject.toml)
+    # via
+    #   pipecat-ai (pyproject.toml)
+    #   uvicorn
+python-multipart==0.0.9
+    # via fastapi
 pyyaml==6.0.1
    # via
    #   ctranslate2
@@ -347,6 +380,7 @@ pyyaml==6.0.1
    #   langchain-core
    #   timm
    #   transformers
+    #   uvicorn
 regex==2024.5.15
    # via
    #   tiktoken
@@ -362,6 +396,8 @@ requests==2.32.3
    #   pyht
    #   tiktoken
    #   transformers
+rich==13.7.1
+    # via typer
 rsa==4.9
    # via google-auth
 safetensors==0.4.3
@@ -370,6 +406,8 @@ safetensors==0.4.3
    #   transformers
 scipy==1.13.1
    # via pyloudnorm
+shellingham==1.5.4
+    # via typer
 six==1.16.0
    # via python-dateutil
 sniffio==1.3.1
@@ -380,15 +418,17 @@ sniffio==1.3.1
    #   openai
 sounddevice==0.4.7
    # via pipecat-ai (pyproject.toml)
-sqlalchemy==2.0.30
+sqlalchemy==2.0.31
    # via
    #   langchain
    #   langchain-community
+starlette==0.37.2
+    # via fastapi
 sympy==1.12.1
    # via
    #   onnxruntime
    #   torch
-tenacity==8.3.0
+tenacity==8.4.1
    # via
    #   langchain
    #   langchain-community
@@ -424,11 +464,14 @@ transformers==4.40.2
    # via pipecat-ai (pyproject.toml)
 triton==2.3.1
    # via torch
+typer==0.12.3
+    # via fastapi-cli
 typing-extensions==4.12.2
    # via
    #   anthropic
    #   anyio
    #   deepgram-sdk
+    #   fastapi
    #   google-generativeai
    #   huggingface-hub
    #   openai
@@ -437,20 +480,31 @@ typing-extensions==4.12.2
    #   pydantic-core
    #   sqlalchemy
    #   torch
+    #   typer
    #   typing-inspect
+    #   uvicorn
 typing-inspect==0.9.0
    # via dataclasses-json
+ujson==5.10.0
+    # via fastapi
 uritemplate==4.1.1
    # via google-api-python-client
-urllib3==2.2.1
+urllib3==2.2.2
    # via requests
+uvicorn[standard]==0.30.1
+    # via fastapi
+uvloop==0.19.0
+    # via uvicorn
 verboselogs==1.7
    # via deepgram-sdk
+watchfiles==0.22.0
+    # via uvicorn
 websockets==12.0
    # via
    #   cartesia
    #   deepgram-sdk
    #   pipecat-ai (pyproject.toml)
+    #   uvicorn
 werkzeug==3.0.3
    # via flask
 yarl==1.9.4
--- a/macos-py3.10-requirements.txt
+++ b/macos-py3.10-requirements.txt
@@ -1,5 +1,5 @@
 #
-# This file is autogenerated by pip-compile with Python 3.10
+# This file is autogenerated by pip-compile with Python 3.12
 # by the following command:
 #
 #    pip-compile --all-extras pyproject.toml
@@ -26,10 +26,8 @@ anyio==4.4.0
    #   anthropic
    #   httpx
    #   openai
-async-timeout==4.0.3
-    # via
-    #   aiohttp
-    #   langchain
+    #   starlette
+    #   watchfiles
 attrs==23.2.0
    # via
    #   aiohttp
@@ -54,12 +52,15 @@ cffi==1.16.0
 charset-normalizer==3.3.2
    # via requests
 click==8.1.7
-    # via flask
+    # via
+    #   flask
+    #   typer
+    #   uvicorn
 coloredlogs==15.0.1
    # via onnxruntime
 ctranslate2==4.3.1
    # via faster-whisper
-daily-python==0.9.1
+daily-python==0.10.0
    # via pipecat-ai (pyproject.toml)
 dataclasses-json==0.6.7
    # via
@@ -71,17 +72,21 @@ distro==1.9.0
    # via
    #   anthropic
    #   openai
+dnspython==2.6.1
+    # via email-validator
 einops==0.8.0
    # via pipecat-ai (pyproject.toml)
-exceptiongroup==1.2.1
-    # via
-    #   anyio
-    #   pytest
+email-validator==2.2.0
+    # via fastapi
 fal-client==0.4.0
    # via pipecat-ai (pyproject.toml)
+fastapi==0.111.0
+    # via pipecat-ai (pyproject.toml)
+fastapi-cli==0.0.4
+    # via fastapi
 faster-whisper==1.0.2
    # via pipecat-ai (pyproject.toml)
-filelock==3.15.1
+filelock==3.15.3
    # via
    #   huggingface-hub
    #   pyht
@@ -112,7 +117,7 @@ google-api-core[grpc]==2.19.0
    #   google-ai-generativelanguage
    #   google-api-python-client
    #   google-generativeai
-google-api-python-client==2.133.0
+google-api-python-client==2.134.0
    # via google-generativeai
 google-auth==2.30.0
    # via
@@ -137,24 +142,29 @@ grpcio==1.64.1
 grpcio-status==1.62.2
    # via google-api-core
 h11==0.14.0
-    # via httpcore
+    # via
+    #   httpcore
+    #   uvicorn
 httpcore==1.0.5
    # via httpx
 httplib2==0.22.0
    # via
    #   google-api-python-client
    #   google-auth-httplib2
+httptools==0.6.1
+    # via uvicorn
 httpx==0.27.0
    # via
    #   anthropic
    #   cartesia
    #   deepgram-sdk
    #   fal-client
+    #   fastapi
    #   openai
    #   openpipe
 httpx-sse==0.4.0
    # via fal-client
-huggingface-hub==0.23.3
+huggingface-hub==0.23.4
    # via
    #   faster-whisper
    #   timm
@@ -165,6 +175,7 @@ humanfriendly==10.0
 idna==3.7
    # via
    #   anyio
+    #   email-validator
    #   httpx
    #   requests
    #   yarl
@@ -174,41 +185,46 @@ itsdangerous==2.2.0
    # via flask
 jinja2==3.1.4
    # via
+    #   fastapi
    #   flask
    #   torch
 jsonpatch==1.33
    # via langchain-core
 jsonpointer==3.0.0
    # via jsonpatch
-langchain==0.2.3
+langchain==0.2.5
    # via
    #   langchain-community
    #   pipecat-ai (pyproject.toml)
-langchain-community==0.2.4
+langchain-community==0.2.5
    # via pipecat-ai (pyproject.toml)
-langchain-core==0.2.5
+langchain-core==0.2.9
    # via
    #   langchain
    #   langchain-community
    #   langchain-openai
    #   langchain-text-splitters
-langchain-openai==0.1.8
+langchain-openai==0.1.9
    # via pipecat-ai (pyproject.toml)
 langchain-text-splitters==0.2.1
    # via langchain
-langsmith==0.1.77
+langsmith==0.1.81
    # via
    #   langchain
    #   langchain-community
    #   langchain-core
 loguru==0.7.2
    # via pipecat-ai (pyproject.toml)
+markdown-it-py==3.0.0
+    # via rich
 markupsafe==2.1.5
    # via
    #   jinja2
    #   werkzeug
 marshmallow==3.21.3
    # via dataclasses-json
+mdurl==0.1.2
+    # via markdown-it-py
 mpmath==1.3.0
    # via sympy
 multidict==6.0.5
@@ -239,9 +255,11 @@ openai==1.26.0
    #   pipecat-ai (pyproject.toml)
 openpipe==4.14.0
    # via pipecat-ai (pyproject.toml)
-orjson==3.10.4
-    # via langsmith
-packaging==23.2
+orjson==3.10.5
+    # via
+    #   fastapi
+    #   langsmith
+packaging==24.1
    # via
    #   huggingface-hub
    #   langchain-core
@@ -255,7 +273,7 @@ pillow==10.3.0
    #   torchvision
 pluggy==1.5.0
    # via pytest
-proto-plus==1.23.0
+proto-plus==1.24.0
    # via
    #   google-ai-generativelanguage
    #   google-api-core
@@ -283,6 +301,7 @@ pycparser==2.22
 pydantic==2.7.4
    # via
    #   anthropic
+    #   fastapi
    #   google-generativeai
    #   langchain
    #   langchain-core
@@ -290,6 +309,8 @@ pydantic==2.7.4
    #   openai
 pydantic-core==2.18.4
    # via pydantic
+pygments==2.18.0
+    # via rich
 pyht==0.0.28
    # via pipecat-ai (pyproject.toml)
 pyloudnorm==0.1.1
@@ -303,7 +324,11 @@ pytest-asyncio==0.23.7
 python-dateutil==2.9.0.post0
    # via openpipe
 python-dotenv==1.0.1
-    # via pipecat-ai (pyproject.toml)
+    # via
+    #   pipecat-ai (pyproject.toml)
+    #   uvicorn
+python-multipart==0.0.9
+    # via fastapi
 pyyaml==6.0.1
    # via
    #   ctranslate2
@@ -313,6 +338,7 @@ pyyaml==6.0.1
    #   langchain-core
    #   timm
    #   transformers
+    #   uvicorn
 regex==2024.5.15
    # via
    #   tiktoken
@@ -328,6 +354,8 @@ requests==2.32.3
    #   pyht
    #   tiktoken
    #   transformers
+rich==13.7.1
+    # via typer
 rsa==4.9
    # via google-auth
 safetensors==0.4.3
@@ -336,6 +364,8 @@ safetensors==0.4.3
    #   transformers
 scipy==1.13.1
    # via pyloudnorm
+shellingham==1.5.4
+    # via typer
 six==1.16.0
    # via python-dateutil
 sniffio==1.3.1
@@ -346,15 +376,17 @@ sniffio==1.3.1
    #   openai
 sounddevice==0.4.7
    # via pipecat-ai (pyproject.toml)
-sqlalchemy==2.0.30
+sqlalchemy==2.0.31
    # via
    #   langchain
    #   langchain-community
+starlette==0.37.2
+    # via fastapi
 sympy==1.12.1
    # via
    #   onnxruntime
    #   torch
-tenacity==8.3.0
+tenacity==8.4.1
    # via
    #   langchain
    #   langchain-community
@@ -368,8 +400,6 @@ tokenizers==0.19.1
    #   anthropic
    #   faster-whisper
    #   transformers
-tomli==2.0.1
-    # via pytest
 torch==2.3.1
    # via
    #   pipecat-ai (pyproject.toml)
@@ -388,11 +418,13 @@ tqdm==4.66.4
    #   transformers
 transformers==4.40.2
    # via pipecat-ai (pyproject.toml)
+typer==0.12.3
+    # via fastapi-cli
 typing-extensions==4.12.2
    # via
    #   anthropic
-    #   anyio
    #   deepgram-sdk
+    #   fastapi
    #   google-generativeai
    #   huggingface-hub
    #   openai
@@ -401,20 +433,30 @@ typing-extensions==4.12.2
    #   pydantic-core
    #   sqlalchemy
    #   torch
+    #   typer
    #   typing-inspect
 typing-inspect==0.9.0
    # via dataclasses-json
+ujson==5.10.0
+    # via fastapi
 uritemplate==4.1.1
    # via google-api-python-client
-urllib3==2.2.1
+urllib3==2.2.2
    # via requests
+uvicorn[standard]==0.30.1
+    # via fastapi
+uvloop==0.19.0
+    # via uvicorn
 verboselogs==1.7
    # via deepgram-sdk
+watchfiles==0.22.0
+    # via uvicorn
 websockets==12.0
    # via
    #   cartesia
    #   deepgram-sdk
    #   pipecat-ai (pyproject.toml)
+    #   uvicorn
 werkzeug==3.0.3
    # via flask
 yarl==1.9.4
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -37,7 +37,7 @@ Website = "https://pipecat.ai"
 anthropic = [ "anthropic~=0.25.7" ]
 azure = [ "azure-cognitiveservices-speech~=1.37.0" ]
 cartesia = [ "numpy~=1.26.0", "sounddevice", "cartesia" ]
-daily = [ "daily-python~=0.9.0" ]
+daily = [ "daily-python~=0.10.0" ]
 deepgram = [ "deepgram-sdk~=3.2.7" ]
 examples = [ "python-dotenv~=1.0.0", "flask~=3.0.3", "flask_cors~=4.0.1" ]
 fal = [ "fal-client~=0.4.0" ]
@@ -50,7 +50,7 @@ openai = [ "openai~=1.26.0" ]
 openpipe = [ "openpipe~=4.14.0" ]
 playht = [ "pyht~=0.0.28" ]
 silero = [ "torch~=2.3.0", "torchaudio~=2.3.0" ]
-websocket = [ "websockets~=12.0" ]
+websocket = [ "websockets~=12.0", "fastapi~=0.111.0" ]
 whisper = [ "faster-whisper~=1.0.2" ]

 [tool.setuptools.packages.find]
--- a/src/pipecat/serializers/base_serializer.py
+++ b/src/pipecat/serializers/base_serializer.py
@@ -12,9 +12,9 @@ from pipecat.frames.frames import Frame
 class FrameSerializer(ABC):

    @abstractmethod
-    def serialize(self, frame: Frame) -> bytes:
+    def serialize(self, frame: Frame) -> str | bytes | None:
        pass

    @abstractmethod
-    def deserialize(self, data: bytes) -> Frame | None:
+    def deserialize(self, data: str | bytes) -> Frame | None:
        pass
--- a/src/pipecat/serializers/protobuf.py
+++ b/src/pipecat/serializers/protobuf.py
@@ -26,7 +26,7 @@ class ProtobufFrameSerializer(FrameSerializer):
    def __init__(self):
        pass

-    def serialize(self, frame: Frame) -> bytes:
+    def serialize(self, frame: Frame) -> str | bytes | None:
        proto_frame = frame_protos.Frame()
        if type(frame) not in self.SERIALIZABLE_TYPES:
            raise ValueError(
@@ -41,7 +41,7 @@ class ProtobufFrameSerializer(FrameSerializer):
        result = proto_frame.SerializeToString()
        return result

-    def deserialize(self, data: bytes) -> Frame | None:
+    def deserialize(self, data: str | bytes) -> Frame | None:
        """Returns a Frame object from a Frame protobuf. Used to convert frames
        passed over the wire as protobufs to Frame objects used in pipelines
        and frame processors.
--- a/src/pipecat/serializers/twilio.py
+++ b/src/pipecat/serializers/twilio.py
@@ -0,0 +1,55 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import base64
+import json
+
+from pipecat.frames.frames import AudioRawFrame, Frame
+from pipecat.serializers.base_serializer import FrameSerializer
+from pipecat.utils.audio import ulaw_8000_to_pcm_16000, pcm_16000_to_ulaw_8000
+
+
+class TwilioFrameSerializer(FrameSerializer):
+    SERIALIZABLE_TYPES = {
+        AudioRawFrame: "audio",
+    }
+
+    def __init__(self):
+        self._sid = None
+
+    def serialize(self, frame: Frame) -> str | bytes | None:
+        if not isinstance(frame, AudioRawFrame):
+            return None
+
+        data = frame.audio
+
+        serialized_data = pcm_16000_to_ulaw_8000(data)
+        payload = base64.b64encode(serialized_data).decode("utf-8")
+        answer = {
+            "event": "media",
+            "streamSid": self._sid,
+            "media": {
+                "payload": payload
+            }
+        }
+
+        return json.dumps(answer)
+
+    def deserialize(self, data: str | bytes) -> Frame | None:
+        message = json.loads(data)
+
+        if not self._sid:
+            self._sid = message["streamSid"] if "streamSid" in message else None
+
+        if message["event"] != "media":
+            return None
+        else:
+            payload_base64 = message["media"]["payload"]
+            payload = base64.b64decode(payload_base64)
+
+            deserialized_data = ulaw_8000_to_pcm_16000(payload)
+            audio_frame = AudioRawFrame(audio=deserialized_data, num_channels=1, sample_rate=16000)
+            return audio_frame
--- a/src/pipecat/services/ai_services.py
+++ b/src/pipecat/services/ai_services.py
@@ -21,6 +21,7 @@ from pipecat.frames.frames import (
    TTSStoppedFrame,
    TextFrame,
    VisionImageRawFrame,
+    LLMFullResponseEndFrame,
 )
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.utils.audio import calculate_audio_volume
@@ -110,7 +111,9 @@ class TTSService(AIService):
            text = frame.text
        else:
            self._current_sentence += frame.text
-            if self._current_sentence.strip().endswith((".", "?", "!")):
+            if self._current_sentence.strip().endswith(
+                    (".", "?", "!")) and not self._current_sentence.strip().endswith(
+                    ("Mr,", "Mrs.", "Ms.", "Dr.")):
                text = self._current_sentence.strip()
                self._current_sentence = ""

@@ -134,6 +137,11 @@ class TTSService(AIService):
            if self._current_sentence:
                await self._push_tts_frames(self._current_sentence)
            await self.push_frame(frame)
+        elif isinstance(frame, LLMFullResponseEndFrame):
+            if self._current_sentence:
+                await self._push_tts_frames(self._current_sentence.strip())
+                self._current_sentence = ""
+            await self.push_frame(frame)
        else:
            await self.push_frame(frame, direction)

--- a/src/pipecat/services/anthropic.py
+++ b/src/pipecat/services/anthropic.py
@@ -40,17 +40,14 @@ class AnthropicLLMService(LLMService):
    """

    def __init__(
-        self,
-        api_key: str,
-        model: str = "claude-3-opus-20240229",
-        max_tokens: int = 1024,
-        temperature: float = 0.0
-    ):
+            self,
+            api_key: str,
+            model: str = "claude-3-opus-20240229",
+            max_tokens: int = 1024):
        super().__init__()
        self._client = AsyncAnthropic(api_key=api_key)
        self._model = model
        self._max_tokens = max_tokens
-        self._temperature = temperature

    def can_generate_metrics(self) -> bool:
        return True
@@ -113,7 +110,6 @@ class AnthropicLLMService(LLMService):
                messages=messages,
                model=self._model,
                max_tokens=self._max_tokens,
-                temperature=self._temperature,
                stream=True)

            await self.stop_ttfb_metrics()
--- a/src/pipecat/services/deepgram.py
+++ b/src/pipecat/services/deepgram.py
@@ -89,6 +89,7 @@ class DeepgramTTSService(TTSService):
 class DeepgramSTTService(AIService):
    def __init__(self,
                 api_key: str,
+                 url: str = "",
                 live_options: LiveOptions = LiveOptions(
                     encoding="linear16",
                     language="en-US",
@@ -104,7 +105,7 @@ class DeepgramSTTService(AIService):
        self._live_options = live_options

        self._client = DeepgramClient(
-            api_key, config=DeepgramClientOptions(options={"keepalive": "true"}))
+            api_key, config=DeepgramClientOptions(url=url, options={"keepalive": "true"}))
        self._connection = self._client.listen.asynclive.v("1")
        self._connection.on(LiveTranscriptionEvents.Transcript, self._on_message)

--- a/src/pipecat/transports/network/fastapi_websocket.py
+++ b/src/pipecat/transports/network/fastapi_websocket.py
@@ -0,0 +1,160 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import asyncio
+import io
+import wave
+
+from typing import Awaitable, Callable
+from pydantic.main import BaseModel
+
+from pipecat.serializers.twilio import TwilioFrameSerializer
+from pipecat.frames.frames import AudioRawFrame, StartFrame
+from pipecat.processors.frame_processor import FrameProcessor
+from pipecat.serializers.base_serializer import FrameSerializer
+from pipecat.transports.base_input import BaseInputTransport
+from pipecat.transports.base_output import BaseOutputTransport
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+
+from loguru import logger
+
+try:
+    from fastapi import WebSocket
+    from starlette.websockets import WebSocketState
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error(
+        "In order to use FastAPI websockets, you need to `pip install pipecat-ai[websocket]`.")
+    raise Exception(f"Missing module: {e}")
+
+
+class FastAPIWebsocketParams(TransportParams):
+    add_wav_header: bool = False
+    audio_frame_size: int = 6400  # 200ms
+    serializer: FrameSerializer = TwilioFrameSerializer()
+
+
+class FastAPIWebsocketCallbacks(BaseModel):
+    on_client_connected: Callable[[WebSocket], Awaitable[None]]
+    on_client_disconnected: Callable[[WebSocket], Awaitable[None]]
+
+
+class FastAPIWebsocketInputTransport(BaseInputTransport):
+
+    def __init__(
+            self,
+            websocket: WebSocket,
+            params: FastAPIWebsocketParams,
+            callbacks: FastAPIWebsocketCallbacks,
+            **kwargs):
+        super().__init__(params, **kwargs)
+
+        self._websocket = websocket
+        self._params = params
+        self._callbacks = callbacks
+
+    async def start(self, frame: StartFrame):
+        await self._callbacks.on_client_connected(self._websocket)
+        await super().start(frame)
+        self._receive_task = self.get_event_loop().create_task(self._receive_messages())
+
+    async def stop(self):
+        if self._websocket.client_state != WebSocketState.DISCONNECTED:
+            await self._websocket.close()
+        await super().stop()
+
+    async def _receive_messages(self):
+        async for message in self._websocket.iter_text():
+            frame = self._params.serializer.deserialize(message)
+
+            if not frame:
+                continue
+
+            if isinstance(frame, AudioRawFrame):
+                await self.push_audio_frame(frame)
+
+        await self._callbacks.on_client_disconnected(self._websocket)
+
+
+class FastAPIWebsocketOutputTransport(BaseOutputTransport):
+
+    def __init__(self, websocket: WebSocket, params: FastAPIWebsocketParams, **kwargs):
+        super().__init__(params, **kwargs)
+
+        self._websocket = websocket
+        self._params = params
+        self._audio_buffer = bytes()
+
+    async def write_raw_audio_frames(self, frames: bytes):
+        self._audio_buffer += frames
+        while len(self._audio_buffer) >= self._params.audio_frame_size:
+            frame = AudioRawFrame(
+                audio=self._audio_buffer[:self._params.audio_frame_size],
+                sample_rate=self._params.audio_out_sample_rate,
+                num_channels=self._params.audio_out_channels
+            )
+
+            if self._params.add_wav_header:
+                content = io.BytesIO()
+                ww = wave.open(content, "wb")
+                ww.setsampwidth(2)
+                ww.setnchannels(frame.num_channels)
+                ww.setframerate(frame.sample_rate)
+                ww.writeframes(frame.audio)
+                ww.close()
+                content.seek(0)
+                wav_frame = AudioRawFrame(
+                    content.read(),
+                    sample_rate=frame.sample_rate,
+                    num_channels=frame.num_channels)
+                frame = wav_frame
+
+            payload = self._params.serializer.serialize(frame)
+            if payload:
+                await self._websocket.send_text(payload)
+
+            self._audio_buffer = self._audio_buffer[self._params.audio_frame_size:]
+
+
+class FastAPIWebsocketTransport(BaseTransport):
+
+    def __init__(
+            self,
+            websocket: WebSocket,
+            params: FastAPIWebsocketParams = FastAPIWebsocketParams(),
+            input_name: str | None = None,
+            output_name: str | None = None,
+            loop: asyncio.AbstractEventLoop | None = None):
+        super().__init__(input_name=input_name, output_name=output_name, loop=loop)
+        self._params = params
+
+        self._callbacks = FastAPIWebsocketCallbacks(
+            on_client_connected=self._on_client_connected,
+            on_client_disconnected=self._on_client_disconnected
+        )
+
+        self._input = FastAPIWebsocketInputTransport(
+            websocket, self._params, self._callbacks, name=self._input_name)
+        self._output = FastAPIWebsocketOutputTransport(
+            websocket, self._params, name=self._output_name)
+
+        # Register supported handlers. The user will only be able to register
+        # these handlers.
+        self._register_event_handler("on_client_connected")
+        self._register_event_handler("on_client_disconnected")
+
+    def input(self) -> FrameProcessor:
+        return self._input
+
+    def output(self) -> FrameProcessor:
+        return self._output
+
+    async def _on_client_connected(self, websocket):
+        await self._call_event_handler("on_client_connected", websocket)
+
+    async def _on_client_disconnected(self, websocket):
+        await self._call_event_handler("on_client_disconnected", websocket)
--- a/src/pipecat/transports/network/websocket_server.py
+++ b/src/pipecat/transports/network/websocket_server.py
@@ -4,11 +4,9 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-
 import asyncio
 import io
 import wave
-import websockets

 from typing import Awaitable, Callable
 from pydantic.main import BaseModel
@@ -23,6 +21,13 @@ from pipecat.transports.base_transport import BaseTransport, TransportParams

 from loguru import logger

+try:
+    import websockets
+except ModuleNotFoundError as e:
+    logger.error(f"Exception: {e}")
+    logger.error("In order to use websockets, you need to `pip install pipecat-ai[websocket]`.")
+    raise Exception(f"Missing module: {e}")
+

 class WebsocketServerParams(TransportParams):
    add_wav_header: bool = False
--- a/src/pipecat/transports/services/daily.py
+++ b/src/pipecat/transports/services/daily.py
@@ -113,6 +113,7 @@ class DailyCallbacks(BaseModel):
    on_app_message: Callable[[Any, str], Awaitable[None]]
    on_call_state_updated: Callable[[str], Awaitable[None]]
    on_dialin_ready: Callable[[str], Awaitable[None]]
+    on_dialout_answered: Callable[[Any], Awaitable[None]]
    on_dialout_connected: Callable[[Any], Awaitable[None]]
    on_dialout_stopped: Callable[[Any], Awaitable[None]]
    on_dialout_error: Callable[[Any], Awaitable[None]]
@@ -432,6 +433,9 @@ class DailyTransportClient(EventHandler):
    def on_dialin_ready(self, sip_endpoint: str):
        self._call_async_callback(self._callbacks.on_dialin_ready, sip_endpoint)

+    def on_dialout_answered(self, data: Any):
+        self._call_async_callback(self._callbacks.on_dialout_answered, data)
+
    def on_dialout_connected(self, data: Any):
        self._call_async_callback(self._callbacks.on_dialout_connected, data)

@@ -689,6 +693,7 @@ class DailyTransport(BaseTransport):
            on_app_message=self._on_app_message,
            on_call_state_updated=self._on_call_state_updated,
            on_dialin_ready=self._on_dialin_ready,
+            on_dialout_answered=self._on_dialout_answered,
            on_dialout_connected=self._on_dialout_connected,
            on_dialout_stopped=self._on_dialout_stopped,
            on_dialout_error=self._on_dialout_error,
@@ -711,6 +716,7 @@ class DailyTransport(BaseTransport):
        self._register_event_handler("on_app_message")
        self._register_event_handler("on_call_state_updated")
        self._register_event_handler("on_dialin_ready")
+        self._register_event_handler("on_dialout_answered")
        self._register_event_handler("on_dialout_connected")
        self._register_event_handler("on_dialout_stopped")
        self._register_event_handler("on_dialout_error")
@@ -838,6 +844,9 @@ class DailyTransport(BaseTransport):
            await self._handle_dialin_ready(sip_endpoint)
        await self._call_event_handler("on_dialin_ready", sip_endpoint)

+    async def _on_dialout_answered(self, data):
+        await self._call_event_handler("on_dialout_answered", data)
+
    async def _on_dialout_connected(self, data):
        await self._call_event_handler("on_dialout_connected", data)

--- a/src/pipecat/utils/audio.py
+++ b/src/pipecat/utils/audio.py
@@ -4,6 +4,7 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+import audioop
 import numpy as np
 import pyloudnorm as pyln

@@ -31,3 +32,23 @@ def calculate_audio_volume(audio: bytes, sample_rate: int) -> float:

 def exp_smoothing(value: float, prev_value: float, factor: float) -> float:
    return prev_value + factor * (value - prev_value)
+
+
+def ulaw_8000_to_pcm_16000(ulaw_8000_bytes):
+    # Convert μ-law to PCM
+    pcm_8000_bytes = audioop.ulaw2lin(ulaw_8000_bytes, 2)
+
+    # Resample from 8000 Hz to 16000 Hz
+    pcm_16000_bytes = audioop.ratecv(pcm_8000_bytes, 2, 1, 8000, 16000, None)[0]
+
+    return pcm_16000_bytes
+
+
+def pcm_16000_to_ulaw_8000(pcm_16000_bytes):
+    # Resample from 16000 Hz to 8000 Hz
+    pcm_8000_bytes = audioop.ratecv(pcm_16000_bytes, 2, 1, 16000, 8000, None)[0]
+
+    # Convert PCM to μ-law
+    ulaw_8000_bytes = audioop.lin2ulaw(pcm_8000_bytes, 2)
+
+    return ulaw_8000_bytes
--- a/tests/vllm-inference-test.py
+++ b/tests/vllm-inference-test.py
@@ -1,86 +0,0 @@
-import asyncio
-import time
-
-from vllm import LLM, SamplingParams
-from vllm.engine.arg_utils import AsyncEngineArgs
-from vllm.engine.async_llm_engine import AsyncLLMEngine
-from vllm.utils import random_uuid
-
-sampling_params = SamplingParams(
-    temperature=0.8,
-    top_p=0.95,
-    max_tokens=4096
-)
-
-prompt = "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.<|eot_id|><|start_header_id|>system<|end_header_id|>\n\nPlease introduce yourself to the user.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
-
-
-async def main():
-    print("🥶 cold starting inference")
-    start = time.monotonic_ns()
-
-    engine_args = AsyncEngineArgs(
-        model="meta-llama/Meta-Llama-3-8B-Instruct",
-        enable_prefix_caching=True,
-        gpu_memory_utilization=0.90,
-        enforce_eager=False,        # False means slower starts but faster inference
-        disable_log_stats=True,     # disable logging so we can stream tokens
-        disable_log_requests=True,
-    )
-
-    engine = AsyncLLMEngine.from_engine_args(engine_args)
-    duration_s = (time.monotonic_ns() - start) / 1e9
-    print(f"🏎️ engine started in {duration_s:.0f}s")
-
-    request_id = random_uuid()
-    result_generator = engine.generate(
-        prompt,
-        sampling_params,
-        request_id,
-    )
-    index, num_tokens = 0, 0
-    start = time.monotonic_ns()
-    async for output in result_generator:
-        if (
-            output.outputs[0].text
-            and "\ufffd" == output.outputs[0].text[-1]
-        ):
-            continue
-        text_delta = output.outputs[0].text[index:]
-        index = len(output.outputs[0].text)
-        num_tokens = len(output.outputs[0].token_ids)
-
-        print(text_delta)
-    duration_s = (time.monotonic_ns() - start) / 1e9
-
-    print(
-        f"\n\tGenerated {num_tokens} tokens in {duration_s:.1f}s,"
-        f" throughput = {num_tokens / duration_s:.0f} tokens/second.\n"
-    )
-
-    return
-
-
-async def xmain():
-    llm = LLM(
-        model="meta-llama/Meta-Llama-3-8B-Instruct",
-        enable_prefix_caching=True
-    )
-
-    outputs = llm.generate(prompt, sampling_params)
-
-    for output in outputs:
-        prompt = output.prompt
-        generated_text = output.outputs[0].text
-        print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
-
-    outputs = llm.generate(prompt, sampling_params)
-
-    for output in outputs:
-        prompt = output.prompt
-        generated_text = output.outputs[0].text
-        print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
-
-
-if __name__ == "__main__":
-    asyncio.run(main())
Author	SHA1	Message	Date
Jon Taylor	5bd5d22270	removed space from event handler	2024-06-26 18:30:56 +01:00
Jon Taylor	6ee7932337	added pause to start and new intro prompt	2024-06-26 18:24:14 +01:00
Jon Taylor	c407445dd1	removed header comment from bot runner	2024-06-24 17:35:26 +01:00
Jon Taylor	447f37167e	added VAD stop seconds env	2024-06-24 17:34:25 +01:00
Jon Taylor	354c21500e	prompt tweaks	2024-06-24 17:28:10 +01:00
Jon Taylor	5728e25b5a	added fastbot example	2024-06-24 16:25:36 +01:00
Kwindla Hultman Kramer	0b6a19802f	Merge pull request #250 from pipecat-ai/lewis/flush-tts-on-llm-response-end Flush output from TTSService on LLMFullResponseEndFrame	2024-06-22 20:37:45 -04:00
Lewis Wolfgang	c4a2d2197c	Flush output from TTSService on LLMFullResponseEndFrame To cover cases when the LLM response does not end in punctuation.	2024-06-22 14:57:44 -04:00
Aleix Conchillo Flaqué	269d06aa15	Merge pull request #249 from pipecat-ai/aleix/pipecat-0.0.32 update CHANGELOG.md for 0.0.32	2024-06-22 09:21:21 -07:00
Aleix Conchillo Flaqué	dfef1f2c54	update CHANGELOG.md for 0.0.32	2024-06-22 09:19:22 -07:00
Aleix Conchillo Flaqué	b62beaba0b	Merge pull request #248 from pipecat-ai/aleix/deepgramstt-url services(deepgram): add url to DeepgramSTTService	2024-06-21 22:26:23 -07:00
Aleix Conchillo Flaqué	adf414e40f	services(deepgram): add url to DeepgramSTTService	2024-06-21 16:52:28 -07:00
Aleix Conchillo Flaqué	dc64e57f63	Merge pull request #241 from pipecat-ai/aleix/transports-async transports: fully use asyncio in all read/write operations	2024-06-21 16:00:08 -07:00
Aleix Conchillo Flaqué	d3e410b2ac	transports: fully use asyncio in all read/write operations	2024-06-21 15:55:15 -07:00
Aleix Conchillo Flaqué	c544b2474b	update linux-py3.10-requirements with fastapi and new daily-python	2024-06-21 15:44:01 -07:00
Aleix Conchillo Flaqué	18243de358	add fastapi and update macos-py3.10-requirements.txt	2024-06-21 13:16:47 -07:00
Aleix Conchillo Flaqué	6625895d1f	update macos-py3.10-requirements.txt	2024-06-21 13:13:02 -07:00
Aleix Conchillo Flaqué	f9ecce739e	Merge pull request #247 from pipecat-ai/aleix/twilio-updates some twilio updates	2024-06-21 10:14:40 -07:00
Aleix Conchillo Flaqué	0075dd8386	update linux/macos-py3.10-requirements.txt	2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué	eef1cde816	updated CHANGELOG.md with fastapi and twilio updates	2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué	8d867c30c6	transports(websocket): verify websockets module	2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué	42c668b7ae	examples(twilio-chatbot): update instructions and renames	2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué	b62227b4ae	serializers(twilio): formatting and allow str \| bytes \| None	2024-06-21 09:47:17 -07:00
Aleix Conchillo Flaqué	25ef0cb87b	serializers: allow str \| bytes \| None	2024-06-21 09:42:43 -07:00
Aleix Conchillo Flaqué	e195941aa5	Merge pull request #246 from pipecat-ai/aleix/daily-dialout-answered transports(daily): added dialout_answered event	2024-06-20 18:37:24 -07:00
Aleix Conchillo Flaqué	e09eef1dd7	Merge pull request #243 from Viking5274/main Add twilio_websocket_service with example	2024-06-20 14:09:48 -07:00
Aleix Conchillo Flaqué	7c13663a4e	transports(daily): added dialout_answered event	2024-06-20 13:01:25 -07:00
daniil5701133	5753869e5e	add twilio-chatbot example with README.md info how to start app created twilio_websocket_service.py, TwilioFrameSerializer.py moved pcm_16000_to_ulaw_8000 and ulaw_8000_to_pcm_16000 to src/pipecat/utils/audio.py fixed callback on disconnect	2024-06-20 23:00:01 +03:00
chadbailey59	ba878a19f4	fixed "Dr." interruption (#245 )	2024-06-19 20:53:04 -05:00