logging

Switch questions
Better recreation
2024-11-27 19:38:37 +08:00 · 2024-11-27 15:10:50 +08:00 · 2024-11-27 14:08:01 +08:00 · 2024-11-27 12:21:45 +08:00 · 2024-11-27 11:50:28 +08:00 · 2024-11-27 11:36:28 +08:00
26 changed files with 1514 additions and 252 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,18 +5,41 @@ All notable changes to **Pipecat** will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## Unreleased
+## [Unreleased]

 ### Added

+- Added a new RTVI message called `disconnect-bot` which, when handled, pushes
+  an `EndFrame` to stop the pipeline.
+
+### Changed
+
+- Expanded the `transcriptions.language` module to support a superset of
+  languages.
+
+- Updated STT and TTS services with language options that match the supported
+  languages for each service.
+
+## [0.0.49] - 2024-11-17
+
+### Added
+
+- Added RTVI `on_bot_started` event, which is useful in single-turn
+  interactions.
+
+- Added `DailyTransport` events `dialin-connected`, `dialin-stopped`,
+  `dialin-error` and `dialin-warning`. Needs daily-python >= 0.13.0.
+
 - Added `RimeHttpTTSService` and the `07q-interruptible-rime.py` foundational
  example.
+
 - Added `STTMuteFilter`, a general-purpose processor that combines STT
  muting and interruption control. When active, it prevents both transcription
  and interruptions during bot speech. The processor supports multiple
  strategies: `FIRST_SPEECH` (mute only during bot's first
  speech), `ALWAYS` (mute during all bot speech), or `CUSTOM` (using provided
  callback).
+
 - Added `STTMuteFrame`, a control frame that enables/disables speech
  transcription in STT services.
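Review note (not part of the patch): the three `STTMuteFilter` strategies listed in the changelog entry can be read as a small decision function. The sketch below is a self-contained illustration under stated assumptions, not Pipecat's implementation; `MuteStrategy`, `MuteState`, and `should_mute` are hypothetical names, and the choice to consult the `CUSTOM` callback only while the bot is speaking is an assumption.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Optional


class MuteStrategy(Enum):
    """Hypothetical stand-in for the strategies named in the changelog."""

    FIRST_SPEECH = auto()
    ALWAYS = auto()
    CUSTOM = auto()


@dataclass
class MuteState:
    """Minimal state the decision needs (hypothetical, for illustration)."""

    bot_is_speaking: bool = False
    first_speech_done: bool = False


def should_mute(
    strategy: MuteStrategy,
    state: MuteState,
    custom: Optional[Callable[[MuteState], bool]] = None,
) -> bool:
    """Return True when transcription and interruptions should be suppressed."""
    if not state.bot_is_speaking:
        return False
    if strategy is MuteStrategy.ALWAYS:
        return True
    if strategy is MuteStrategy.FIRST_SPEECH:
        return not state.first_speech_done
    # CUSTOM: defer to the caller-provided callback; never mute without one.
    return bool(custom and custom(state))
```

With this shape, `FIRST_SPEECH` stops muting as soon as `first_speech_done` is set, while `ALWAYS` mutes on every bot turn.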

--- a/README.md
+++ b/README.md
@@ -13,6 +13,7 @@ Pipecat is an open source Python framework for building voice and multimodal con
 - **Multimodal Apps**: Combine voice, video, images, and text
 - **Creative Tools**: [Story-telling experiences](https://storytelling-chatbot.fly.dev/) and social companions
 - **Business Solutions**: [Customer intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0) and support bots
+- **Complex conversational flows**: [Refer to Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) to learn more

 ## See it in action

@@ -32,6 +33,8 @@ Pipecat is an open source Python framework for building voice and multimodal con
 - **Real-time Processing**: Frame-based pipeline architecture for fluid interactions
 - **Production Ready**: Enterprise-grade WebRTC and Websocket support

+💡 Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
+
 ## Getting started

 You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you’re ready. You can also add a 📞 telephone number, 🖼️ image output, 📺 video input, use different LLMs, and more.
--- a/examples/foundational/07-interruptible.py
+++ b/examples/foundational/07-interruptible.py
@@ -10,11 +10,12 @@ import os
 import sys

 from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import LLMMessagesFrame
+from pipecat.frames.frames import BotSpeakingFrame, Frame, InputAudioRawFrame, LLMMessagesFrame, TTSAudioRawFrame, TextFrame, UserStoppedSpeakingFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.services.cartesia import CartesiaTTSService
 from pipecat.services.openai import OpenAILLMService
 from pipecat.transports.services.daily import DailyParams, DailyTransport
@@ -30,6 +31,22 @@ load_dotenv(override=True)
 logger.remove(0)
 logger.add(sys.stderr, level="DEBUG")

+class DebugProcessor(FrameProcessor):
+    def __init__(self, name, **kwargs):
+        self._name = name
+        super().__init__(**kwargs)
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+        if not (
+            isinstance(frame, InputAudioRawFrame)
+            or isinstance(frame, BotSpeakingFrame)
+            or isinstance(frame, TTSAudioRawFrame)
+            or isinstance(frame, TextFrame)
+        ):
+            logger.debug(f"--- {self._name}: {frame} {direction}")
+        await self.push_frame(frame, direction)
+

 async def main():
    async with aiohttp.ClientSession() as session:
@@ -63,11 +80,14 @@ async def main():

        context = OpenAILLMContext(messages)
        context_aggregator = llm.create_context_aggregator(context)
+
+        dp = DebugProcessor("dp")

        pipeline = Pipeline(
            [
                transport.input(),  # Transport user input
                context_aggregator.user(),  # User responses
+                dp,
                llm,  # LLM
                tts,  # TTS
                transport.output(),  # Transport bot output
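Review note (not part of the patch): the `DebugProcessor` added in this file logs every frame except a handful of high-volume types. The filtering step can be sketched on its own with stand-in classes; the frame classes below are hypothetical placeholders rather than imports from `pipecat.frames.frames`, and `PartialTextFrame` exists only to show how subclasses behave.

```python
class Frame:
    """Stand-in base class (hypothetical, not pipecat's Frame)."""


class InputAudioRawFrame(Frame): ...
class BotSpeakingFrame(Frame): ...
class TTSAudioRawFrame(Frame): ...
class TextFrame(Frame): ...
class EndFrame(Frame): ...


class PartialTextFrame(TextFrame):
    """Hypothetical subclass, used to show that subclasses are filtered too."""


# isinstance accepts a tuple of types, which replaces a chain of `or` checks.
NOISY_FRAMES = (InputAudioRawFrame, BotSpeakingFrame, TTSAudioRawFrame, TextFrame)


def should_log(frame: Frame) -> bool:
    """Return True for frames that are not in the high-volume set."""
    return not isinstance(frame, NOISY_FRAMES)
```

Because `isinstance` matches subclasses, any frame type derived from a filtered class is silenced as well, which is worth keeping in mind when a frame seems to be missing from the debug output.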
--- a/examples/foundational/07p-interruptible-google-audio-in.py
+++ b/examples/foundational/07p-interruptible-google-audio-in.py
@@ -217,7 +217,11 @@ async def main():
            voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22",  # British Lady
        )

-        llm = GoogleLLMService(model="gemini-1.5-flash-latest", api_key=os.getenv("GOOGLE_API_KEY"))
+        llm = GoogleLLMService(
+            model="gemini-1.5-flash-latest",
+            # model="gemini-exp-1114",
+            api_key=os.getenv("GOOGLE_API_KEY"),
+        )

        messages = [
            {
--- a/examples/foundational/14e-function-calling-gemini.py
+++ b/examples/foundational/14e-function-calling-gemini.py
@@ -64,7 +64,11 @@ async def main():
            voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22",  # British Lady
        )

-        llm = GoogleLLMService(model="gemini-1.5-flash-latest", api_key=os.getenv("GOOGLE_API_KEY"))
+        llm = GoogleLLMService(
+            model="gemini-1.5-flash-latest",
+            # model="gemini-exp-1114",
+            api_key=os.getenv("GOOGLE_API_KEY"),
+        )
        llm.register_function("get_weather", get_weather)
        llm.register_function("get_image", get_image)

@@ -151,7 +155,6 @@ indicate you should use the get_image tool are:
                allow_interruptions=True,
                enable_metrics=True,
                enable_usage_metrics=True,
-                report_only_initial_ttfb=True,
            ),
        )

--- a/examples/foundational/22c-natural-conversation-mixed-llms.py
+++ b/examples/foundational/22c-natural-conversation-mixed-llms.py
@@ -4,50 +4,49 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

-import aiohttp
 import asyncio
 import os
 import sys
 import time

+import aiohttp
+from dotenv import load_dotenv
+from loguru import logger
+from runner import configure
+
 from pipecat.audio.vad.silero import SileroVADAnalyzer
-from pipecat.frames.frames import LLMMessagesFrame, TextFrame
-from pipecat.pipeline.pipeline import Pipeline
-from pipecat.pipeline.parallel_pipeline import ParallelPipeline
-from pipecat.pipeline.runner import PipelineRunner
-from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.processors.aggregators.openai_llm_context import (
-    OpenAILLMContext,
-)
-from pipecat.services.cartesia import CartesiaTTSService
-from pipecat.services.deepgram import DeepgramSTTService
-from pipecat.services.anthropic import AnthropicLLMService
-from pipecat.sync.event_notifier import EventNotifier
-from pipecat.transports.services.daily import DailyParams, DailyTransport
-from pipecat.processors.frame_processor import FrameProcessor, FrameDirection
 from pipecat.frames.frames import (
    CancelFrame,
    EndFrame,
    Frame,
+    LLMMessagesFrame,
    StartFrame,
    StartInterruptionFrame,
    StopInterruptionFrame,
    SystemFrame,
+    TextFrame,
    TranscriptionFrame,
    UserStartedSpeakingFrame,
    UserStoppedSpeakingFrame,
 )
-from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContextFrame
-from pipecat.sync.base_notifier import BaseNotifier
+from pipecat.pipeline.parallel_pipeline import ParallelPipeline
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import (
+    OpenAILLMContext,
+    OpenAILLMContextFrame,
+)
 from pipecat.processors.filters.function_filter import FunctionFilter
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.processors.user_idle_processor import UserIdleProcessor
-
-
-from runner import configure
-
-from loguru import logger
-
-from dotenv import load_dotenv
+from pipecat.services.anthropic import AnthropicLLMService
+from pipecat.services.cartesia import CartesiaTTSService
+from pipecat.services.deepgram import DeepgramSTTService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.sync.base_notifier import BaseNotifier
+from pipecat.sync.event_notifier import EventNotifier
+from pipecat.transports.services.daily import DailyParams, DailyTransport

 load_dotenv(override=True)

@@ -55,86 +54,206 @@ logger.remove(0)
 logger.add(sys.stderr, level="DEBUG")


-classifier_statement = """Determine if the user's statement ends with a complete thought and you should respond.
+classifier_statement = """CRITICAL INSTRUCTION:
+You are a BINARY CLASSIFIER that must ONLY output "YES" or "NO".
+DO NOT engage with the content.
+DO NOT respond to questions.
+DO NOT provide assistance.
+Your ONLY job is to output YES or NO.

-The user text is transcribed speech. You are trying to determine if:
+EXAMPLES OF INVALID RESPONSES:
+- "I can help you with that"
+- "Let me explain"
+- "To answer your question"
+- Any response other than YES or NO

-1. the user has finished talking and expects a response from you, or
-2. this statement is incomplete and the user will continue talking
+VALID RESPONSES:
+YES
+NO

-A previous assistant response is provided for additional context. But you are only evaluating the user text. 
+If you output anything else, you are failing at your task.
+You are NOT an assistant.
+You are NOT a chatbot.
+You are a binary classifier.

-The user text may contain multiple fragments concatentated together. There may be repeated words or mistakes in the transcription. There may be grammatical errors. There may be extra punctuation. Ignore all of that. Interpret the transcribed text as text that would have been spoken. Then consider only whether the user has finished speaking and is expecting a response.
+ROLE:
+You are a real-time speech completeness classifier. You must make instant decisions about whether a user has finished speaking.
+You must output ONLY 'YES' or 'NO' with no other text.

-Categorize the last user statement as either complete with the user now expecting a response, or incomplete.
+INPUT FORMAT:
+You receive two pieces of information:
+1. The assistant's last message (if available)
+2. The user's current speech input

-Return 'YES' if text is likely complete and the user is expecting a response. Return 'NO' if the text seems to be a partial expression or unfinished thought.
+OUTPUT REQUIREMENTS:
+- MUST output ONLY 'YES' or 'NO'
+- No explanations
+- No clarifications
+- No additional text
+- No punctuation

-If you are not sure, respond with your best guess. If the user is expecting a response, respond with YES. If the user is not expecting a response, respond with NO. Always output either YES or NO and no other text.
+HIGH PRIORITY SIGNALS:

-Respond only YES or NO
+1. Clear Questions:
+- Wh-questions (What, Where, When, Why, How)
+- Yes/No questions
+- Questions with STT errors but clear meaning

 Examples:
+# Complete Wh-question
+[{"role": "assistant", "content": "I can help you learn."}, 
+ {"role": "user", "content": "What's the fastest way to learn Spanish"}]
+Output: YES

-User: What's the capital of
-Assistant: NO
+# Complete Yes/No question despite STT error
+[{"role": "assistant", "content": "I know about planets."}, 
+ {"role": "user", "content": "Is is Jupiter the biggest planet"}]
+Output: YES

-User: What's the captial of France?
-Assistant: YES
+2. Complete Commands:
+- Direct instructions
+- Clear requests
+- Action demands
+- Complete statements needing response

-User: Tell me a story about
-Assistant: NO
+Examples:
+# Direct instruction
+[{"role": "assistant", "content": "I can explain many topics."}, 
+ {"role": "user", "content": "Tell me about black holes"}]
+Output: YES

-User: Tell me a story about a dragon
-Assistant YES
+# Action demand
+[{"role": "assistant", "content": "I can help with math."}, 
+ {"role": "user", "content": "Solve this equation x plus 5 equals 12"}]
+Output: YES

-User: Is there a
-Assistant: NO
+3. Direct Responses:
+- Answers to specific questions
+- Option selections
+- Clear acknowledgments with completion

-User: Is there a large
-Assistant: NO
+Examples:
+# Specific answer
+[{"role": "assistant", "content": "What's your favorite color?"}, 
+ {"role": "user", "content": "I really like blue"}]
+Output: YES

-User: Is there a large lake near Chicago?
-Assistant: YES
+# Option selection
+[{"role": "assistant", "content": "Would you prefer morning or evening?"}, 
+ {"role": "user", "content": "Morning"}]
+Output: YES

-User: When is the longest day of the year?
-Assistant: YES
+MEDIUM PRIORITY SIGNALS:

-User: When when is the longest day of the year
-Assistant: YES
+1. Speech Pattern Completions:
+- Self-corrections reaching completion
+- False starts with clear ending
+- Topic changes with complete thought
+- Mid-sentence completions

-User: When when is the
-ASSISTANT: NO
+Examples:
+# Self-correction reaching completion
+[{"role": "assistant", "content": "What would you like to know?"}, 
+ {"role": "user", "content": "Tell me about... no wait, explain how rainbows form"}]
+Output: YES

-User: What is the um I u
-Assistant: NO
+# Topic change with complete thought
+[{"role": "assistant", "content": "The weather is nice today."}, 
+ {"role": "user", "content": "Actually can you tell me who invented the telephone"}]
+Output: YES

-User: What is the um i u largest city in the world
-Assistant: YES
+# Mid-sentence completion
+[{"role": "assistant", "content": "Hello I'm ready."}, 
+ {"role": "user", "content": "What's the capital of? France"}]
+Output: YES

-User: How much does a how much does an adult elephant weigh?
-Assistant: YES
+2. Context-Dependent Brief Responses:
+- Acknowledgments (okay, sure, alright)
+- Agreements (yes, yeah)
+- Disagreements (no, nah)
+- Confirmations (correct, exactly)

-User: How much does a how much does
-Assistant: NO
+Examples:
+# Acknowledgment
+[{"role": "assistant", "content": "Should we talk about history?"}, 
+ {"role": "user", "content": "Sure"}]
+Output: YES

-User: What can you tell me All the
-Assistant: NO
+# Disagreement with completion
+[{"role": "assistant", "content": "Is that what you meant?"}, 
+ {"role": "user", "content": "No not really"}]
+Output: YES

-User: What can you tell me All the prime numbers less than 100
-Assistant: YES
+LOW PRIORITY SIGNALS:

-User: What's the what's the length of the Amazon River?
-Assistant: YES
+1. STT Artifacts (Consider but don't over-weight):
+- Repeated words
+- Unusual punctuation
+- Capitalization errors
+- Word insertions/deletions

-User: What's what's the length of the Amazon River?
-Assistant: YES
+Examples:
+# Word repetition but complete
+[{"role": "assistant", "content": "I can help with that."}, 
+ {"role": "user", "content": "What what is the time right now"}]
+Output: YES

-User: What's what's the length of the Amazon River
-Assistant: YES
+# Missing punctuation but complete
+[{"role": "assistant", "content": "I can explain that."}, 
+ {"role": "user", "content": "Please tell me how computers work"}]
+Output: YES

-User: What's what's the best way to get a coffee stain out of a white shirt
-Assistant: YES
+2. Speech Features:
+- Filler words (um, uh, like)
+- Thinking pauses
+- Word repetitions
+- Brief hesitations
+
+Examples:
+# Filler words but complete
+[{"role": "assistant", "content": "What would you like to know?"}, 
+ {"role": "user", "content": "Um uh how do airplanes fly"}]
+Output: YES
+
+# Thinking pause but incomplete
+[{"role": "assistant", "content": "I can explain anything."}, 
+ {"role": "user", "content": "Well um I want to know about the"}]
+Output: NO
+
+DECISION RULES:
+
+1. Return YES if:
+- ANY high priority signal shows clear completion
+- Medium priority signals combine to show completion
+- Meaning is clear despite low priority artifacts
+
+2. Return NO if:
+- No high priority signals present
+- Thought clearly trails off
+- Multiple incomplete indicators
+- User appears mid-formulation
+
+3. When uncertain:
+- If you can understand the intent → YES
+- If meaning is unclear → NO
+- Always make a binary decision
+- Never request clarification
+
+Examples:
+# Incomplete despite corrections
+[{"role": "assistant", "content": "What would you like to know about?"}, 
+ {"role": "user", "content": "Can you tell me about"}]
+Output: NO
+
+# Complete despite multiple artifacts
+[{"role": "assistant", "content": "I can help you learn."}, 
+ {"role": "user", "content": "How do you I mean what's the best way to learn programming"}]
+Output: YES
+
+# Trailing off incomplete
+[{"role": "assistant", "content": "I can explain anything."}, 
+ {"role": "user", "content": "I was wondering if you could tell me why"}]
+Output: NO
 """

 conversational_system_message = """You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.
@@ -297,15 +416,14 @@ async def main():
        # statement. This doesn't really need to be an LLM, we could use NLP
        # libraries for that, but we have the machinery to use an LLM, so we might as well!
        statement_llm = AnthropicLLMService(
-            api_key=os.getenv("ANTHROPIC_API_KEY"), model="claude-3-5-haiku-20241022", name="Haiku"
+            api_key=os.getenv("ANTHROPIC_API_KEY"),
+            model="claude-3-5-sonnet-20241022",
        )

        # This is the regular LLM.
-        llm = AnthropicLLMService(
-            api_key=os.getenv("ANTHROPIC_API_KEY"),
-            model="claude-3-5-sonnet-20241022",
-            name="Sonnet",
-            params=AnthropicLLMService.InputParams(enable_prompt_caching_beta=True),
+        llm = OpenAILLMService(
+            api_key=os.getenv("OPENAI_API_KEY"),
+            model="gpt-4o",
        )

        messages = [
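Review note (not part of the patch): the rewritten `classifier_statement` insists on a bare YES or NO, so whatever consumes the classifier output can be strict about it. A minimal sketch of such a parser follows; `parse_classifier_output` is a hypothetical helper, and tolerating stray quotes or a trailing period is an assumption about how forgiving the caller wants to be.

```python
from typing import Optional


def parse_classifier_output(text: str) -> Optional[bool]:
    """Map classifier text to True (YES), False (NO), or None (invalid)."""
    # Drop surrounding whitespace and a little stray punctuation, then compare.
    token = text.strip().strip(".!\"'").upper()
    if token == "YES":
        return True
    if token == "NO":
        return False
    # Anything else is one of the "invalid responses" the prompt warns about.
    return None
```

Returning `None` for invalid output leaves the policy decision (retry, treat as NO, treat as YES) to the caller instead of hiding it in the parser.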
--- a/examples/foundational/race_bot.py
+++ b/examples/foundational/race_bot.py
@@ -0,0 +1,191 @@
+#
+# Copyright (c) 2024, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import asyncio
+import os
+import sys
+import time
+
+import aiohttp
+from loguru import logger
+from runner import configure
+
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import (
+    BotSpeakingFrame,
+    EndFrame,
+    Frame,
+    InputAudioRawFrame,
+    StartInterruptionFrame,
+    StopInterruptionFrame,
+    TextFrame,
+    TranscriptionFrame,
+    TTSAudioRawFrame,
+    UserStartedSpeakingFrame,
+    UserStoppedSpeakingFrame,
+)
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
+from pipecat.services.cartesia import CartesiaTTSService
+from pipecat.services.openai import OpenAILLMService
+from pipecat.transports.services.daily import DailyParams, DailyTransport
+
+logger.remove(0)
+logger.add(sys.stderr, level="DEBUG")
+
+
+class DebugProcessor(FrameProcessor):
+    def __init__(self, name, **kwargs):
+        self._name = name
+        super().__init__(**kwargs)
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        await super().process_frame(frame, direction)
+        if not (
+            isinstance(frame, InputAudioRawFrame)
+            or isinstance(frame, BotSpeakingFrame)
+            or isinstance(frame, UserStoppedSpeakingFrame)
+            or isinstance(frame, TTSAudioRawFrame)
+            or isinstance(frame, TextFrame)
+        ):
+            logger.debug(f"--- {self._name}: {frame} {direction}")
+        await self.push_frame(frame, direction)
+
+
+async def main():
+    async with aiohttp.ClientSession() as session:
+        (room_url, _) = await configure(session)
+
+        transport = DailyTransport(
+            room_url,
+            None,
+            "AI Bot",
+            DailyParams(
+                audio_out_enabled=True,
+                transcription_enabled=True,
+                vad_enabled=True,
+                vad_analyzer=SileroVADAnalyzer(),
+            ),
+        )
+        tts = CartesiaTTSService(
+            api_key=os.getenv("CARTESIA_API_KEY"),
+            voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22",  # British Lady
+        )
+
+        llm = OpenAILLMService(api_key=os.environ["OPENAI_API_KEY"], model="gpt-4o")
+
+        messages = [
+            {
+                "role": "system",
+                "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+            },
+        ]
+
+        dp = DebugProcessor("dp")
+
+        context = OpenAILLMContext(messages)
+        context_aggregator = llm.create_context_aggregator(context)
+
+        runner = PipelineRunner()
+
+        task = PipelineTask(
+            Pipeline(
+                [
+                    # transport.input(),
+                    context_aggregator.user(),
+                    llm,
+                    dp,
+                    tts,
+                    transport.output(),
+                    context_aggregator.assistant(),
+                ]
+            ),
+            PipelineParams(
+                allow_interruptions=True,
+            ),
+        )
+
+        # Register an event handler that starts queueing scripted frames when
+        # the first participant joins.
+        @transport.event_handler("on_first_participant_joined")
+        async def on_first_participant_joined(transport, participant):
+            participant_id = participant.get("info", {}).get("participantId", "")
+
+            # Create frames for 300 seconds
+            start_time = time.time()
+            while time.time() - start_time < 300:
+                elapsed_time = round(time.time() - start_time)
+                logger.info(f"Running for {elapsed_time} seconds")
+                await task.queue_frame(
+                    StartInterruptionFrame(),
+                )
+                await asyncio.sleep(1)
+
+                await task.queue_frame(
+                    UserStartedSpeakingFrame(),
+                )
+
+                await asyncio.sleep(1)
+
+                await task.queue_frame(
+                    TranscriptionFrame("Tell me more about your company.", participant_id, time.time()),
+                )
+
+                await asyncio.sleep(1)
+
+                await task.queue_frame(
+                    StopInterruptionFrame(),
+                )
+
+                await asyncio.sleep(1)
+
+                await task.queue_frame(
+                    UserStoppedSpeakingFrame(),
+                )
+
+                await asyncio.sleep(5)
+
+                await task.queue_frame(StartInterruptionFrame())
+                await asyncio.sleep(1)
+
+                await task.queue_frame(
+                    UserStartedSpeakingFrame(),
+                )
+
+                await asyncio.sleep(1)
+
+                await task.queue_frame(
+                    TranscriptionFrame("Give me a list of appointment dates.", participant_id, time.time()),
+                )
+
+                await asyncio.sleep(1)
+
+                await task.queue_frame(
+                    StopInterruptionFrame(),
+                )
+
+                await asyncio.sleep(1)
+                await task.queue_frame(
+                    UserStoppedSpeakingFrame(),
+                )
+                await asyncio.sleep(5)
+            await task.queue_frame(EndFrame())
+
+        # @transport.event_handler("on_first_participant_joined")
+        # async def on_first_participant_joined(transport, participant):
+        #     await transport.capture_participant_transcription(participant["id"])
+        #     # Kick off the conversation.
+        #     messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+        #     await task.queue_frames([LLMMessagesFrame(messages)])
+
+        await runner.run(task)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
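Review note (not part of the patch): `race_bot.py` drives the pipeline with a fixed script of frames, pausing after each one. The same idea can be expressed as data plus a small loop, which makes the timing easy to shrink in tests. This is a sketch under stated assumptions: frames are represented by plain strings, an `asyncio.Queue` stands in for the pipeline task, and `SCRIPT`, `run_cycle`, and `demo` are hypothetical names.

```python
import asyncio
from typing import List, Sequence, Tuple

# (frame name, seconds to wait afterwards), mirroring one user turn.
SCRIPT: Sequence[Tuple[str, float]] = (
    ("StartInterruptionFrame", 1),
    ("UserStartedSpeakingFrame", 1),
    ("TranscriptionFrame", 1),
    ("StopInterruptionFrame", 1),
    ("UserStoppedSpeakingFrame", 5),
)


async def run_cycle(
    queue: asyncio.Queue,
    script: Sequence[Tuple[str, float]] = SCRIPT,
    time_scale: float = 1.0,
) -> None:
    """Queue each scripted frame in order, sleeping between frames."""
    for name, delay in script:
        await queue.put(name)
        await asyncio.sleep(delay * time_scale)


async def demo() -> List[str]:
    """Run one cycle with no real waiting and return what was queued."""
    queue: asyncio.Queue = asyncio.Queue()
    await run_cycle(queue, time_scale=0.0)
    return [queue.get_nowait() for _ in range(queue.qsize())]
```

Keeping the script as data also makes it simple to add a second turn with a different transcription without duplicating the queue-and-sleep calls.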
--- a/examples/simple-chatbot/bot.py
+++ b/examples/simple-chatbot/bot.py
@@ -5,36 +5,33 @@
 #

 import asyncio
-import aiohttp
 import os
 import sys

+import aiohttp
+from dotenv import load_dotenv
+from loguru import logger
 from PIL import Image
+from runner import configure

 from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import (
+    BotStartedSpeakingFrame,
+    BotStoppedSpeakingFrame,
+    Frame,
+    LLMMessagesFrame,
+    OutputImageRawFrame,
+    SpriteFrame,
+)
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
-from pipecat.frames.frames import (
-    OutputImageRawFrame,
-    SpriteFrame,
-    Frame,
-    LLMMessagesFrame,
-    TTSAudioRawFrame,
-    TTSStoppedFrame,
-)
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.services.elevenlabs import ElevenLabsTTSService
 from pipecat.services.openai import OpenAILLMService
 from pipecat.transports.services.daily import DailyParams, DailyTransport

-from runner import configure
-
-from loguru import logger
-
-from dotenv import load_dotenv
-
 load_dotenv(override=True)

 logger.remove(0)
@@ -73,15 +70,15 @@ class TalkingAnimation(FrameProcessor):
    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

-        if isinstance(frame, TTSAudioRawFrame):
+        if isinstance(frame, BotStartedSpeakingFrame):
            if not self._is_talking:
                await self.push_frame(talking_frame)
                self._is_talking = True
-        elif isinstance(frame, TTSStoppedFrame):
+        elif isinstance(frame, BotStoppedSpeakingFrame):
            await self.push_frame(quiet_frame)
            self._is_talking = False

-        await self.push_frame(frame)
+        await self.push_frame(frame, direction)


 async def main():
@@ -162,7 +159,7 @@ async def main():

        @transport.event_handler("on_first_participant_joined")
        async def on_first_participant_joined(transport, participant):
-            transport.capture_participant_transcription(participant["id"])
+            await transport.capture_participant_transcription(participant["id"])
            await task.queue_frames([LLMMessagesFrame(messages)])

        runner = PipelineRunner()
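Review note (not part of the patch): after this change `TalkingAnimation` switches sprites on bot speaking start and stop events instead of on individual TTS audio frames. The state logic can be sketched in isolation; the class below is a hypothetical stand-in that uses strings for events and sprite names, and does not depend on Pipecat.

```python
from typing import Optional


class TalkingAnimationState:
    """Tracks whether the talking sprite is currently shown."""

    def __init__(self) -> None:
        self.is_talking = False

    def on_event(self, event: str) -> Optional[str]:
        """Return the sprite to push for this event, or None for no change."""
        if event == "bot_started_speaking":
            if not self.is_talking:
                self.is_talking = True
                return "talking"
        elif event == "bot_stopped_speaking":
            # Mirrors the patched processor: always fall back to the quiet sprite.
            self.is_talking = False
            return "quiet"
        return None
```

A repeated start event while already talking produces no sprite change, which is the guard the `_is_talking` flag provides in the original processor.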
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -42,7 +42,7 @@ aws = [ "boto3~=1.35.27" ]
 azure = [ "azure-cognitiveservices-speech~=1.40.0" ]
 canonical = [ "aiofiles~=24.1.0" ]
 cartesia = [ "cartesia~=1.0.13", "websockets~=13.1" ]
-daily = [ "daily-python~=0.12.0" ]
+daily = [ "daily-python~=0.13.0" ]
 deepgram = [ "deepgram-sdk~=3.7.3" ]
 elevenlabs = [ "websockets~=13.1" ]
 examples = [ "python-dotenv~=1.0.1", "flask~=3.0.3", "flask_cors~=4.0.1" ]
@@ -51,7 +51,7 @@ gladia = [ "websockets~=13.1" ]
 google = [ "google-generativeai~=0.8.3", "google-cloud-texttospeech~=2.17.2" ]
 gstreamer = [ "pygobject~=3.48.2" ]
 fireworks = [ "openai~=1.37.2" ]
-krisp = [ "pipecat-ai-krisp~=0.2.0" ]
+krisp = [ "pipecat-ai-krisp~=0.3.0" ]
 langchain = [ "langchain~=0.2.14", "langchain-community~=0.2.12", "langchain-openai~=0.1.20" ]
 livekit = [ "livekit~=0.17.5", "livekit-api~=0.7.1", "tenacity~=8.5.0" ]
 lmnt = [ "lmnt~=1.1.4" ]
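Review note (not part of the patch): the `~=` operator used for `daily-python` and `pipecat-ai-krisp` is the compatible-release operator, so `~=0.13.0` accepts `0.13.x` at or above `0.13.0` and rejects `0.14.0`. The sketch below illustrates that rule for plain numeric versions only; `satisfies_compatible_release` is a hypothetical helper, and real tooling should rely on a proper version parser rather than this simplification.

```python
from typing import Tuple


def _parse(version: str) -> Tuple[int, ...]:
    """Parse a plain dotted numeric version such as '0.13.0'."""
    return tuple(int(part) for part in version.split("."))


def _pad(parts: Tuple[int, ...], width: int) -> Tuple[int, ...]:
    """Right-pad with zeros so '0.13' compares equal to '0.13.0'."""
    return parts + (0,) * (width - len(parts))


def satisfies_compatible_release(installed: str, spec: str) -> bool:
    """Check `installed ~= spec` for numeric versions (no pre/post releases)."""
    spec_parts = _parse(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two release segments")
    installed_parts = _parse(installed)
    width = max(len(installed_parts), len(spec_parts))
    installed_padded = _pad(installed_parts, width)
    spec_padded = _pad(spec_parts, width)
    # All segments except the last one in the spec must match exactly.
    prefix = spec_parts[:-1]
    return installed_padded >= spec_padded and installed_padded[: len(prefix)] == prefix
```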
--- a/src/pipecat/processors/audio/__init__.py
+++ b/src/pipecat/processors/audio/__init__.py
--- a/src/pipecat/processors/frame_processor.py
+++ b/src/pipecat/processors/frame_processor.py
@@ -246,6 +246,8 @@ class FrameProcessor:
                await self._prev.queue_frame(frame, direction)
        except Exception as e:
            logger.exception(f"Uncaught exception in {self}: {e}")
+            await self.push_error(ErrorFrame(str(e)))
+            raise

    def __create_input_task(self):
        self.__input_queue = asyncio.Queue()
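Review note (not part of the patch): the two added lines change the handler from "log and swallow" to "log, report, and re-raise". A minimal sketch of that pattern follows; `SketchProcessor` is a hypothetical class that records errors in a list instead of pushing an `ErrorFrame`, and it is not Pipecat's `FrameProcessor`.

```python
import asyncio
from typing import List, Optional


class SketchProcessor:
    """Hypothetical processor showing report-then-re-raise error handling."""

    def __init__(self) -> None:
        self.errors: List[str] = []

    async def push_error(self, message: str) -> None:
        # Stand-in for sending an error frame to the rest of the pipeline.
        self.errors.append(message)

    async def _handle(self, frame: object) -> None:
        raise ValueError(f"cannot handle {frame!r}")

    async def push_frame(self, frame: object) -> None:
        try:
            await self._handle(frame)
        except Exception as e:
            # Report the failure first, then let the caller see the exception.
            await self.push_error(str(e))
            raise


async def demo() -> Optional[List[str]]:
    processor = SketchProcessor()
    try:
        await processor.push_frame("bad")
    except ValueError:
        return processor.errors
    return None
```

The bare `raise` preserves the original traceback, so callers that previously never saw these exceptions now need to be prepared for them.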
--- a/src/pipecat/processors/frameworks/rtvi.py
+++ b/src/pipecat/processors/frameworks/rtvi.py
@@ -591,6 +591,7 @@ class RTVIProcessor(FrameProcessor):
        self._message_queue = asyncio.Queue()
        self._message_task = self.get_event_loop().create_task(self._message_task_handler())

+        self._register_event_handler("on_bot_started")
        self._register_event_handler("on_client_ready")

    def register_action(self, action: RTVIAction):
@@ -679,7 +680,7 @@ class RTVIProcessor(FrameProcessor):
            await self._pipeline.cleanup()

    async def _start(self, frame: StartFrame):
-        pass
+        await self._call_event_handler("on_bot_started")

    async def _stop(self, frame: EndFrame):
        await self._cancel_tasks()
@@ -742,6 +743,8 @@ class RTVIProcessor(FrameProcessor):
                case "update-config":
                    update_config = RTVIUpdateConfig.model_validate(message.data)
                    await self._handle_update_config(message.id, update_config)
+                case "disconnect-bot":
+                    await self.push_frame(EndFrame())
                case "action":
                    action = RTVIActionRun.model_validate(message.data)
                    action_frame = RTVIActionFrame(message_id=message.id, rtvi_action_run=action)
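Review note (not part of the patch): the two RTVI changes are an `on_bot_started` event fired from `_start` and a `disconnect-bot` message that pushes an `EndFrame`. The sketch below shows both behaviors in a self-contained form; `SketchRTVI` is a hypothetical class, frames are represented by strings, and the handler registry is an assumption about how event handlers are stored.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Handler = Callable[["SketchRTVI"], Awaitable[None]]


class SketchRTVI:
    """Hypothetical miniature of the event and message handling in the diff."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {
            "on_bot_started": [],
            "on_client_ready": [],
        }
        self.pushed: List[str] = []

    def event_handler(self, name: str) -> Callable[[Handler], Handler]:
        if name not in self._handlers:
            raise KeyError(f"event {name!r} is not registered")

        def decorator(fn: Handler) -> Handler:
            self._handlers[name].append(fn)
            return fn

        return decorator

    async def start(self) -> None:
        # Counterpart of `_start`: notify listeners that the bot has started.
        for handler in self._handlers["on_bot_started"]:
            await handler(self)

    async def handle_message(self, message_type: str) -> None:
        if message_type == "disconnect-bot":
            # Counterpart of pushing an EndFrame to stop the pipeline.
            self.pushed.append("EndFrame")
        else:
            self.pushed.append(f"unhandled:{message_type}")


async def demo() -> List[str]:
    rtvi = SketchRTVI()
    log: List[str] = []

    @rtvi.event_handler("on_bot_started")
    async def on_bot_started(processor: SketchRTVI) -> None:
        log.append("bot started")

    await rtvi.start()
    await rtvi.handle_message("disconnect-bot")
    return log + rtvi.pushed
```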
--- a/src/pipecat/processors/metrics/__init__.py
+++ b/src/pipecat/processors/metrics/__init__.py
--- a/src/pipecat/services/aws.py
+++ b/src/pipecat/services/aws.py
@@ -34,35 +34,77 @@ except ModuleNotFoundError as e:

 def language_to_aws_language(language: Language) -> str | None:
    language_map = {
+        # Arabic
+        Language.AR: "arb",
+        Language.AR_AE: "ar-AE",
+        # Catalan
        Language.CA: "ca-ES",
-        Language.ZH: "cmn-CN",
+        # Chinese
+        Language.ZH: "cmn-CN",  # Mandarin
+        Language.YUE: "yue-CN",  # Cantonese
+        Language.YUE_CN: "yue-CN",
+        # Czech
+        Language.CS: "cs-CZ",
+        # Danish
        Language.DA: "da-DK",
+        # Dutch
        Language.NL: "nl-NL",
        Language.NL_BE: "nl-BE",
-        Language.EN: "en-US",
-        Language.EN_US: "en-US",
+        # English
+        Language.EN: "en-US",  # Default to US English
        Language.EN_AU: "en-AU",
        Language.EN_GB: "en-GB",
-        Language.EN_NZ: "en-NZ",
        Language.EN_IN: "en-IN",
+        Language.EN_NZ: "en-NZ",
+        Language.EN_US: "en-US",
+        Language.EN_ZA: "en-ZA",
+        # Finnish
        Language.FI: "fi-FI",
+        # French
        Language.FR: "fr-FR",
+        Language.FR_BE: "fr-BE",
        Language.FR_CA: "fr-CA",
+        # German
        Language.DE: "de-DE",
+        Language.DE_AT: "de-AT",
+        Language.DE_CH: "de-CH",
+        # Hindi
        Language.HI: "hi-IN",
+        # Icelandic
+        Language.IS: "is-IS",
+        # Italian
        Language.IT: "it-IT",
+        # Japanese
        Language.JA: "ja-JP",
+        # Korean
        Language.KO: "ko-KR",
+        # Norwegian
        Language.NO: "nb-NO",
+        Language.NB: "nb-NO",
+        Language.NB_NO: "nb-NO",
+        # Polish
        Language.PL: "pl-PL",
+        # Portuguese
        Language.PT: "pt-PT",
        Language.PT_BR: "pt-BR",
+        Language.PT_PT: "pt-PT",
+        # Romanian
        Language.RO: "ro-RO",
+        # Russian
        Language.RU: "ru-RU",
+        # Spanish
        Language.ES: "es-ES",
+        Language.ES_MX: "es-MX",
+        Language.ES_US: "es-US",
+        # Swedish
        Language.SV: "sv-SE",
+        # Turkish
        Language.TR: "tr-TR",
+        # Welsh
+        Language.CY: "cy-GB",
+        Language.CY_GB: "cy-GB",
    }
+
    return language_map.get(language)


--- a/src/pipecat/services/azure.py
+++ b/src/pipecat/services/azure.py
@@ -63,49 +63,325 @@ except ModuleNotFoundError as e:

 def language_to_azure_language(language: Language) -> str | None:
    language_map = {
+        # Afrikaans
+        Language.AF: "af-ZA",
+        Language.AF_ZA: "af-ZA",
+        # Amharic
+        Language.AM: "am-ET",
+        Language.AM_ET: "am-ET",
+        # Arabic
+        Language.AR: "ar-AE",  # Default to UAE Arabic
+        Language.AR_AE: "ar-AE",
+        Language.AR_BH: "ar-BH",
+        Language.AR_DZ: "ar-DZ",
+        Language.AR_EG: "ar-EG",
+        Language.AR_IQ: "ar-IQ",
+        Language.AR_JO: "ar-JO",
+        Language.AR_KW: "ar-KW",
+        Language.AR_LB: "ar-LB",
+        Language.AR_LY: "ar-LY",
+        Language.AR_MA: "ar-MA",
+        Language.AR_OM: "ar-OM",
+        Language.AR_QA: "ar-QA",
+        Language.AR_SA: "ar-SA",
+        Language.AR_SY: "ar-SY",
+        Language.AR_TN: "ar-TN",
+        Language.AR_YE: "ar-YE",
+        # Assamese
+        Language.AS: "as-IN",
+        Language.AS_IN: "as-IN",
+        # Azerbaijani
+        Language.AZ: "az-AZ",
+        Language.AZ_AZ: "az-AZ",
+        # Bulgarian
        Language.BG: "bg-BG",
+        Language.BG_BG: "bg-BG",
+        # Bengali
+        Language.BN: "bn-IN",  # Default to Indian Bengali
+        Language.BN_BD: "bn-BD",
+        Language.BN_IN: "bn-IN",
+        # Bosnian
+        Language.BS: "bs-BA",
+        Language.BS_BA: "bs-BA",
+        # Catalan
        Language.CA: "ca-ES",
-        Language.ZH: "zh-CN",
-        Language.ZH_TW: "zh-TW",
+        Language.CA_ES: "ca-ES",
+        # Czech
        Language.CS: "cs-CZ",
+        Language.CS_CZ: "cs-CZ",
+        # Welsh
+        Language.CY: "cy-GB",
+        Language.CY_GB: "cy-GB",
+        # Danish
        Language.DA: "da-DK",
-        Language.NL: "nl-NL",
-        Language.EN: "en-US",
-        Language.EN_US: "en-US",
-        Language.EN_AU: "en-AU",
-        Language.EN_GB: "en-GB",
-        Language.EN_NZ: "en-NZ",
-        Language.EN_IN: "en-IN",
-        Language.ET: "et-EE",
-        Language.FI: "fi-FI",
-        Language.NL_BE: "nl-BE",
-        Language.FR: "fr-FR",
-        Language.FR_CA: "fr-CA",
+        Language.DA_DK: "da-DK",
+        # German
        Language.DE: "de-DE",
+        Language.DE_AT: "de-AT",
        Language.DE_CH: "de-CH",
+        Language.DE_DE: "de-DE",
+        # Greek
        Language.EL: "el-GR",
+        Language.EL_GR: "el-GR",
+        # English
+        Language.EN: "en-US",  # Default to US English
+        Language.EN_AU: "en-AU",
+        Language.EN_CA: "en-CA",
+        Language.EN_GB: "en-GB",
+        Language.EN_HK: "en-HK",
+        Language.EN_IE: "en-IE",
+        Language.EN_IN: "en-IN",
+        Language.EN_KE: "en-KE",
+        Language.EN_NG: "en-NG",
+        Language.EN_NZ: "en-NZ",
+        Language.EN_PH: "en-PH",
+        Language.EN_SG: "en-SG",
+        Language.EN_TZ: "en-TZ",
+        Language.EN_US: "en-US",
+        Language.EN_ZA: "en-ZA",
+        # Spanish
+        Language.ES: "es-ES",  # Default to Spain Spanish
+        Language.ES_AR: "es-AR",
+        Language.ES_BO: "es-BO",
+        Language.ES_CL: "es-CL",
+        Language.ES_CO: "es-CO",
+        Language.ES_CR: "es-CR",
+        Language.ES_CU: "es-CU",
+        Language.ES_DO: "es-DO",
+        Language.ES_EC: "es-EC",
+        Language.ES_ES: "es-ES",
+        Language.ES_GQ: "es-GQ",
+        Language.ES_GT: "es-GT",
+        Language.ES_HN: "es-HN",
+        Language.ES_MX: "es-MX",
+        Language.ES_NI: "es-NI",
+        Language.ES_PA: "es-PA",
+        Language.ES_PE: "es-PE",
+        Language.ES_PR: "es-PR",
+        Language.ES_PY: "es-PY",
+        Language.ES_SV: "es-SV",
+        Language.ES_US: "es-US",
+        Language.ES_UY: "es-UY",
+        Language.ES_VE: "es-VE",
+        # Estonian
+        Language.ET: "et-EE",
+        Language.ET_EE: "et-EE",
+        # Basque
+        Language.EU: "eu-ES",
+        Language.EU_ES: "eu-ES",
+        # Persian
+        Language.FA: "fa-IR",
+        Language.FA_IR: "fa-IR",
+        # Finnish
+        Language.FI: "fi-FI",
+        Language.FI_FI: "fi-FI",
+        # Filipino
+        Language.FIL: "fil-PH",
+        Language.FIL_PH: "fil-PH",
+        # French
+        Language.FR: "fr-FR",
+        Language.FR_BE: "fr-BE",
+        Language.FR_CA: "fr-CA",
+        Language.FR_CH: "fr-CH",
+        Language.FR_FR: "fr-FR",
+        # Irish
+        Language.GA: "ga-IE",
+        Language.GA_IE: "ga-IE",
+        # Galician
+        Language.GL: "gl-ES",
+        Language.GL_ES: "gl-ES",
+        # Gujarati
+        Language.GU: "gu-IN",
+        Language.GU_IN: "gu-IN",
+        # Hebrew
+        Language.HE: "he-IL",
+        Language.HE_IL: "he-IL",
+        # Hindi
        Language.HI: "hi-IN",
+        Language.HI_IN: "hi-IN",
+        # Croatian
+        Language.HR: "hr-HR",
+        Language.HR_HR: "hr-HR",
+        # Hungarian
        Language.HU: "hu-HU",
+        Language.HU_HU: "hu-HU",
+        # Armenian
+        Language.HY: "hy-AM",
+        Language.HY_AM: "hy-AM",
+        # Indonesian
        Language.ID: "id-ID",
+        Language.ID_ID: "id-ID",
+        # Icelandic
+        Language.IS: "is-IS",
+        Language.IS_IS: "is-IS",
+        # Italian
        Language.IT: "it-IT",
+        Language.IT_IT: "it-IT",
+        # Inuktitut
+        Language.IU_CANS_CA: "iu-Cans-CA",
+        Language.IU_LATN_CA: "iu-Latn-CA",
+        # Japanese
        Language.JA: "ja-JP",
+        Language.JA_JP: "ja-JP",
+        # Javanese
+        Language.JV: "jv-ID",
+        Language.JV_ID: "jv-ID",
+        # Georgian
+        Language.KA: "ka-GE",
+        Language.KA_GE: "ka-GE",
+        # Kazakh
+        Language.KK: "kk-KZ",
+        Language.KK_KZ: "kk-KZ",
+        # Khmer
+        Language.KM: "km-KH",
+        Language.KM_KH: "km-KH",
+        # Kannada
+        Language.KN: "kn-IN",
+        Language.KN_IN: "kn-IN",
+        # Korean
        Language.KO: "ko-KR",
-        Language.LV: "lv-LV",
+        Language.KO_KR: "ko-KR",
+        # Lao
+        Language.LO: "lo-LA",
+        Language.LO_LA: "lo-LA",
+        # Lithuanian
        Language.LT: "lt-LT",
+        Language.LT_LT: "lt-LT",
+        # Latvian
+        Language.LV: "lv-LV",
+        Language.LV_LV: "lv-LV",
+        # Macedonian
+        Language.MK: "mk-MK",
+        Language.MK_MK: "mk-MK",
+        # Malayalam
+        Language.ML: "ml-IN",
+        Language.ML_IN: "ml-IN",
+        # Mongolian
+        Language.MN: "mn-MN",
+        Language.MN_MN: "mn-MN",
+        # Marathi
+        Language.MR: "mr-IN",
+        Language.MR_IN: "mr-IN",
+        # Malay
        Language.MS: "ms-MY",
+        Language.MS_MY: "ms-MY",
+        # Maltese
+        Language.MT: "mt-MT",
+        Language.MT_MT: "mt-MT",
+        # Burmese
+        Language.MY: "my-MM",
+        Language.MY_MM: "my-MM",
+        # Norwegian
+        Language.NB: "nb-NO",
+        Language.NB_NO: "nb-NO",
        Language.NO: "nb-NO",
+        # Nepali
+        Language.NE: "ne-NP",
+        Language.NE_NP: "ne-NP",
+        # Dutch
+        Language.NL: "nl-NL",
+        Language.NL_BE: "nl-BE",
+        Language.NL_NL: "nl-NL",
+        # Odia
+        Language.OR: "or-IN",
+        Language.OR_IN: "or-IN",
+        # Punjabi
+        Language.PA: "pa-IN",
+        Language.PA_IN: "pa-IN",
+        # Polish
        Language.PL: "pl-PL",
+        Language.PL_PL: "pl-PL",
+        # Pashto
+        Language.PS: "ps-AF",
+        Language.PS_AF: "ps-AF",
+        # Portuguese
        Language.PT: "pt-PT",
        Language.PT_BR: "pt-BR",
+        Language.PT_PT: "pt-PT",
+        # Romanian
        Language.RO: "ro-RO",
+        Language.RO_RO: "ro-RO",
+        # Russian
        Language.RU: "ru-RU",
+        Language.RU_RU: "ru-RU",
+        # Sinhala
+        Language.SI: "si-LK",
+        Language.SI_LK: "si-LK",
+        # Slovak
        Language.SK: "sk-SK",
-        Language.ES: "es-ES",
+        Language.SK_SK: "sk-SK",
+        # Slovenian
+        Language.SL: "sl-SI",
+        Language.SL_SI: "sl-SI",
+        # Somali
+        Language.SO: "so-SO",
+        Language.SO_SO: "so-SO",
+        # Albanian
+        Language.SQ: "sq-AL",
+        Language.SQ_AL: "sq-AL",
+        # Serbian
+        Language.SR: "sr-RS",
+        Language.SR_RS: "sr-RS",
+        Language.SR_LATN: "sr-Latn-RS",
+        Language.SR_LATN_RS: "sr-Latn-RS",
+        # Sundanese
+        Language.SU: "su-ID",
+        Language.SU_ID: "su-ID",
+        # Swedish
        Language.SV: "sv-SE",
+        Language.SV_SE: "sv-SE",
+        # Swahili
+        Language.SW: "sw-KE",  # Default to Kenyan Swahili
+        Language.SW_KE: "sw-KE",
+        Language.SW_TZ: "sw-TZ",
+        # Tamil
+        Language.TA: "ta-IN",
+        Language.TA_IN: "ta-IN",
+        Language.TA_LK: "ta-LK",
+        Language.TA_MY: "ta-MY",
+        Language.TA_SG: "ta-SG",
+        # Telugu
+        Language.TE: "te-IN",
+        Language.TE_IN: "te-IN",
+        # Thai
        Language.TH: "th-TH",
+        Language.TH_TH: "th-TH",
+        # Turkish
        Language.TR: "tr-TR",
+        Language.TR_TR: "tr-TR",
+        # Ukrainian
        Language.UK: "uk-UA",
+        Language.UK_UA: "uk-UA",
+        # Urdu
+        Language.UR: "ur-IN",  # Default to Indian Urdu
+        Language.UR_IN: "ur-IN",
+        Language.UR_PK: "ur-PK",
+        # Uzbek
+        Language.UZ: "uz-UZ",
+        Language.UZ_UZ: "uz-UZ",
+        # Vietnamese
        Language.VI: "vi-VN",
+        Language.VI_VN: "vi-VN",
+        # Wu Chinese
+        Language.WUU: "wuu-CN",
+        Language.WUU_CN: "wuu-CN",
+        # Yue Chinese
+        Language.YUE: "yue-CN",
+        Language.YUE_CN: "yue-CN",
+        # Chinese
+        Language.ZH: "zh-CN",
+        Language.ZH_CN: "zh-CN",
+        Language.ZH_CN_GUANGXI: "zh-CN-guangxi",
+        Language.ZH_CN_HENAN: "zh-CN-henan",
+        Language.ZH_CN_LIAONING: "zh-CN-liaoning",
+        Language.ZH_CN_SHAANXI: "zh-CN-shaanxi",
+        Language.ZH_CN_SHANDONG: "zh-CN-shandong",
+        Language.ZH_CN_SICHUAN: "zh-CN-sichuan",
+        Language.ZH_HK: "zh-HK",
+        Language.ZH_TW: "zh-TW",
+        # Zulu
+        Language.ZU: "zu-ZA",
+        Language.ZU_ZA: "zu-ZA",
    }
    return language_map.get(language)

--- a/src/pipecat/services/cartesia.py
+++ b/src/pipecat/services/cartesia.py
@@ -7,6 +7,7 @@
 import asyncio
 import base64
 import json
+import random
 import uuid
 from typing import AsyncGenerator, List, Optional, Union

@@ -44,24 +45,27 @@ except ModuleNotFoundError as e:


 def language_to_cartesia_language(language: Language) -> str | None:
-    language_map = {
+    BASE_LANGUAGES = {
        Language.DE: "de",
        Language.EN: "en",
-        Language.EN_US: "en",
-        Language.EN_GB: "en",
-        Language.EN_AU: "en",
-        Language.EN_NZ: "en",
-        Language.EN_IN: "en",
        Language.ES: "es",
        Language.FR: "fr",
-        Language.FR_CA: "fr",
        Language.JA: "ja",
        Language.PT: "pt",
-        Language.PT_BR: "pt",
        Language.ZH: "zh",
-        Language.ZH_TW: "zh",
    }
-    return language_map.get(language)
+
+    result = BASE_LANGUAGES.get(language)
+
+    # If not found in base languages, try to find the base language from a variant
+    if not result:
+        # Convert enum value to string and get the base language part (e.g. es-ES -> es)
+        lang_str = str(language.value)
+        base_code = lang_str.split("-")[0].lower()
+        # Look up the base code in our supported languages
+        result = base_code if base_code in BASE_LANGUAGES.values() else None
+
+    return result


 class CartesiaTTSService(WordTTSService):
@@ -219,17 +223,22 @@ class CartesiaTTSService(WordTTSService):
    async def _receive_task_handler(self):
        try:
            async for message in self._get_websocket():
+                # Randomly cancel this asyncio task on about 1% of received messages
+                if random.random() < 0.01:
+                    logger.info(f"Cancelling task for {self} due to random chance")
+                    asyncio.current_task().cancel()
                msg = json.loads(message)
                if not msg or msg["context_id"] != self._context_id:
                    continue
                if msg["type"] == "done":
-                    await self.push_frame(TTSStoppedFrame())
                    await self.stop_ttfb_metrics()
                    # Unset _context_id but not the _context_id_start_timestamp
                    # because we are likely still playing out audio and need the
                    # timestamp to set send context frames.
                    self._context_id = None
-                    await self.add_word_timestamps([("LLMFullResponseEndFrame", 0), ("Reset", 0)])
+                    await self.add_word_timestamps(
+                        [("TTSStoppedFrame", 0), ("LLMFullResponseEndFrame", 0), ("Reset", 0)]
+                    )
                elif msg["type"] == "timestamps":
                    await self.add_word_timestamps(
                        list(zip(msg["word_timestamps"]["words"], msg["word_timestamps"]["start"]))
@@ -252,6 +261,7 @@ class CartesiaTTSService(WordTTSService):
                    logger.error(f"Cartesia error, unknown message type: {msg}")
        except asyncio.CancelledError:
            pass
+            # await self.push_error(ErrorFrame(f"{self} cancelled", True))
        except Exception as e:
            logger.error(f"{self} exception: {e}")

--- a/src/pipecat/services/elevenlabs.py
+++ b/src/pipecat/services/elevenlabs.py
@@ -43,24 +43,16 @@ ElevenLabsOutputFormat = Literal["pcm_16000", "pcm_22050", "pcm_24000", "pcm_441


 def language_to_elevenlabs_language(language: Language) -> str | None:
-    language_map = {
+    BASE_LANGUAGES = {
        Language.BG: "bg",
-        Language.ZH: "zh",
        Language.CS: "cs",
        Language.DA: "da",
-        Language.NL: "nl",
+        Language.DE: "de",
+        Language.EL: "el",
        Language.EN: "en",
-        Language.EN_US: "en",
-        Language.EN_AU: "en",
-        Language.EN_GB: "en",
-        Language.EN_NZ: "en",
-        Language.EN_IN: "en",
+        Language.ES: "es",
        Language.FI: "fi",
        Language.FR: "fr",
-        Language.FR_CA: "fr",
-        Language.DE: "de",
-        Language.DE_CH: "de",
-        Language.EL: "el",
        Language.HI: "hi",
        Language.HU: "hu",
        Language.ID: "id",
@@ -68,20 +60,31 @@ def language_to_elevenlabs_language(language: Language) -> str | None:
        Language.JA: "ja",
        Language.KO: "ko",
        Language.MS: "ms",
+        Language.NL: "nl",
        Language.NO: "no",
        Language.PL: "pl",
-        Language.PT: "pt-PT",
-        Language.PT_BR: "pt-BR",
+        Language.PT: "pt",
        Language.RO: "ro",
        Language.RU: "ru",
        Language.SK: "sk",
-        Language.ES: "es",
        Language.SV: "sv",
        Language.TR: "tr",
        Language.UK: "uk",
        Language.VI: "vi",
+        Language.ZH: "zh",
    }
-    return language_map.get(language)
+
+    result = BASE_LANGUAGES.get(language)
+
+    # If not found in base languages, try to find the base language from a variant
+    if not result:
+        # Convert enum value to string and get the base language part (e.g. es-ES -> es)
+        lang_str = str(language.value)
+        base_code = lang_str.split("-")[0].lower()
+        # Look up the base code in our supported languages
+        result = base_code if base_code in BASE_LANGUAGES.values() else None
+
+    return result


 def sample_rate_from_output_format(output_format: str) -> int:
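The variant-to-base fallback introduced in `language_to_elevenlabs_language` (and repeated for Cartesia, Gladia, and LMNT) can be tried in isolation with the sketch below. `Lang` is a small hypothetical stand-in for Pipecat's `Language` enum, and `to_service_language` is an illustrative name, not a function from the codebase.

```python
from enum import Enum
from typing import Dict, Optional


class Lang(str, Enum):
    """Tiny stand-in for Pipecat's Language enum (hypothetical subset)."""

    EN = "en"
    EN_GB = "en-GB"
    ES = "es"
    ES_MX = "es-MX"
    NO = "no"
    NB_NO = "nb-NO"
    ZH = "zh"
    ZH_TW = "zh-TW"


# Only base languages are listed; regional variants are resolved by the fallback.
BASE_LANGUAGES: Dict[Lang, str] = {
    Lang.EN: "en",
    Lang.ES: "es",
    Lang.NO: "no",
    Lang.ZH: "zh",
}


def to_service_language(language: Lang) -> Optional[str]:
    """Return the service code for a language, or None if unsupported."""
    result = BASE_LANGUAGES.get(language)
    if not result:
        # Strip the region suffix, e.g. "es-MX" -> "es".
        base_code = str(language.value).split("-")[0].lower()
        # Accept the base code only if some supported language maps to it.
        result = base_code if base_code in BASE_LANGUAGES.values() else None
    return result


en_gb = to_service_language(Lang.EN_GB)
es_mx = to_service_language(Lang.ES_MX)
nb_no = to_service_language(Lang.NB_NO)
```

With this stand-in table, `en-GB` and `es-MX` resolve to their base codes, while `nb-NO` yields `None` because its prefix `nb` is not among the table's values. The fallback matches on the string prefix only, so a variant whose prefix differs from the service's code is treated as unsupported.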
--- a/src/pipecat/services/gladia.py
+++ b/src/pipecat/services/gladia.py
@@ -35,50 +35,98 @@ except ModuleNotFoundError as e:


 def language_to_gladia_language(language: Language) -> str | None:
-    language_map = {
+    BASE_LANGUAGES = {
+        Language.AF: "af",
+        Language.AM: "am",
+        Language.AR: "ar",
+        Language.AS: "as",
+        Language.AZ: "az",
        Language.BG: "bg",
+        Language.BN: "bn",
+        Language.BS: "bs",
        Language.CA: "ca",
-        Language.ZH: "zh",
        Language.CS: "cs",
+        Language.CY: "cy",
        Language.DA: "da",
-        Language.NL: "nl",
+        Language.DE: "de",
+        Language.EL: "el",
        Language.EN: "en",
-        Language.EN_US: "en",
-        Language.EN_AU: "en",
-        Language.EN_GB: "en",
-        Language.EN_NZ: "en",
-        Language.EN_IN: "en",
+        Language.ES: "es",
        Language.ET: "et",
+        Language.EU: "eu",
+        Language.FA: "fa",
        Language.FI: "fi",
        Language.FR: "fr",
-        Language.FR_CA: "fr",
-        Language.DE: "de",
-        Language.DE_CH: "de",
-        Language.EL: "el",
+        Language.GA: "ga",
+        Language.GL: "gl",
+        Language.GU: "gu",
+        Language.HE: "he",
        Language.HI: "hi",
+        Language.HR: "hr",
        Language.HU: "hu",
+        Language.HY: "hy",
        Language.ID: "id",
+        Language.IS: "is",
        Language.IT: "it",
        Language.JA: "ja",
+        Language.JV: "jv",
+        Language.KA: "ka",
+        Language.KK: "kk",
+        Language.KM: "km",
+        Language.KN: "kn",
        Language.KO: "ko",
-        Language.LV: "lv",
+        Language.LO: "lo",
        Language.LT: "lt",
+        Language.LV: "lv",
+        Language.MK: "mk",
+        Language.ML: "ml",
+        Language.MN: "mn",
+        Language.MR: "mr",
        Language.MS: "ms",
+        Language.MT: "mt",
+        Language.MY: "my",
+        Language.NE: "ne",
+        Language.NL: "nl",
        Language.NO: "no",
+        Language.OR: "or",
+        Language.PA: "pa",
        Language.PL: "pl",
+        Language.PS: "ps",
        Language.PT: "pt",
-        Language.PT_BR: "pt",
        Language.RO: "ro",
        Language.RU: "ru",
+        Language.SI: "si",
        Language.SK: "sk",
-        Language.ES: "es",
+        Language.SL: "sl",
+        Language.SO: "so",
+        Language.SQ: "sq",
+        Language.SR: "sr",
+        Language.SU: "su",
        Language.SV: "sv",
+        Language.SW: "sw",
+        Language.TA: "ta",
+        Language.TE: "te",
        Language.TH: "th",
        Language.TR: "tr",
        Language.UK: "uk",
+        Language.UR: "ur",
+        Language.UZ: "uz",
        Language.VI: "vi",
+        Language.ZH: "zh",
+        Language.ZU: "zu",
    }
-    return language_map.get(language)
+
+    result = BASE_LANGUAGES.get(language)
+
+    # If not found in base languages, try to find the base language from a variant
+    if not result:
+        # Convert enum value to string and get the base language part (e.g. es-ES -> es)
+        lang_str = str(language.value)
+        base_code = lang_str.split("-")[0].lower()
+        # Look up the base code in our supported languages
+        result = base_code if base_code in BASE_LANGUAGES.values() else None
+
+    return result


 class GladiaSTTService(STTService):
--- a/src/pipecat/services/google.py
+++ b/src/pipecat/services/google.py
@@ -58,48 +58,161 @@ except ModuleNotFoundError as e:

 def language_to_google_language(language: Language) -> str | None:
    language_map = {
+        # Afrikaans
+        Language.AF: "af-ZA",
+        Language.AF_ZA: "af-ZA",
+        # Arabic
+        Language.AR: "ar-XA",
+        # Bengali
+        Language.BN: "bn-IN",
+        Language.BN_IN: "bn-IN",
+        # Bulgarian
        Language.BG: "bg-BG",
+        Language.BG_BG: "bg-BG",
+        # Catalan
        Language.CA: "ca-ES",
+        Language.CA_ES: "ca-ES",
+        # Chinese (Mandarin and Cantonese)
        Language.ZH: "cmn-CN",
+        Language.ZH_CN: "cmn-CN",
        Language.ZH_TW: "cmn-TW",
+        Language.ZH_HK: "yue-HK",
+        # Czech
        Language.CS: "cs-CZ",
+        Language.CS_CZ: "cs-CZ",
+        # Danish
        Language.DA: "da-DK",
+        Language.DA_DK: "da-DK",
+        # Dutch
        Language.NL: "nl-NL",
+        Language.NL_BE: "nl-BE",
+        Language.NL_NL: "nl-NL",
+        # English
        Language.EN: "en-US",
        Language.EN_US: "en-US",
        Language.EN_AU: "en-AU",
        Language.EN_GB: "en-GB",
        Language.EN_IN: "en-IN",
+        # Estonian
        Language.ET: "et-EE",
+        Language.ET_EE: "et-EE",
+        # Filipino
+        Language.FIL: "fil-PH",
+        Language.FIL_PH: "fil-PH",
+        # Finnish
        Language.FI: "fi-FI",
-        Language.NL_BE: "nl-BE",
+        Language.FI_FI: "fi-FI",
+        # French
        Language.FR: "fr-FR",
        Language.FR_CA: "fr-CA",
+        Language.FR_FR: "fr-FR",
+        # Galician
+        Language.GL: "gl-ES",
+        Language.GL_ES: "gl-ES",
+        # German
        Language.DE: "de-DE",
+        Language.DE_DE: "de-DE",
+        # Greek
        Language.EL: "el-GR",
+        Language.EL_GR: "el-GR",
+        # Gujarati
+        Language.GU: "gu-IN",
+        Language.GU_IN: "gu-IN",
+        # Hebrew
+        Language.HE: "he-IL",
+        Language.HE_IL: "he-IL",
+        # Hindi
        Language.HI: "hi-IN",
+        Language.HI_IN: "hi-IN",
+        # Hungarian
        Language.HU: "hu-HU",
+        Language.HU_HU: "hu-HU",
+        # Icelandic
+        Language.IS: "is-IS",
+        Language.IS_IS: "is-IS",
+        # Indonesian
        Language.ID: "id-ID",
+        Language.ID_ID: "id-ID",
+        # Italian
        Language.IT: "it-IT",
+        Language.IT_IT: "it-IT",
+        # Japanese
        Language.JA: "ja-JP",
+        Language.JA_JP: "ja-JP",
+        # Kannada
+        Language.KN: "kn-IN",
+        Language.KN_IN: "kn-IN",
+        # Korean
        Language.KO: "ko-KR",
+        Language.KO_KR: "ko-KR",
+        # Latvian
        Language.LV: "lv-LV",
+        Language.LV_LV: "lv-LV",
+        # Lithuanian
        Language.LT: "lt-LT",
+        Language.LT_LT: "lt-LT",
+        # Malay
        Language.MS: "ms-MY",
+        Language.MS_MY: "ms-MY",
+        # Malayalam
+        Language.ML: "ml-IN",
+        Language.ML_IN: "ml-IN",
+        # Marathi
+        Language.MR: "mr-IN",
+        Language.MR_IN: "mr-IN",
+        # Norwegian
        Language.NO: "nb-NO",
+        Language.NB: "nb-NO",
+        Language.NB_NO: "nb-NO",
+        # Polish
        Language.PL: "pl-PL",
+        Language.PL_PL: "pl-PL",
+        # Portuguese
        Language.PT: "pt-PT",
        Language.PT_BR: "pt-BR",
+        Language.PT_PT: "pt-PT",
+        # Punjabi
+        Language.PA: "pa-IN",
+        Language.PA_IN: "pa-IN",
+        # Romanian
        Language.RO: "ro-RO",
+        Language.RO_RO: "ro-RO",
+        # Russian
        Language.RU: "ru-RU",
+        Language.RU_RU: "ru-RU",
+        # Serbian
+        Language.SR: "sr-RS",
+        Language.SR_RS: "sr-RS",
+        # Slovak
        Language.SK: "sk-SK",
+        Language.SK_SK: "sk-SK",
+        # Spanish
        Language.ES: "es-ES",
+        Language.ES_ES: "es-ES",
+        Language.ES_US: "es-US",
+        # Swedish
        Language.SV: "sv-SE",
+        Language.SV_SE: "sv-SE",
+        # Tamil
+        Language.TA: "ta-IN",
+        Language.TA_IN: "ta-IN",
+        # Telugu
+        Language.TE: "te-IN",
+        Language.TE_IN: "te-IN",
+        # Thai
        Language.TH: "th-TH",
+        Language.TH_TH: "th-TH",
+        # Turkish
        Language.TR: "tr-TR",
+        Language.TR_TR: "tr-TR",
+        # Ukrainian
        Language.UK: "uk-UA",
+        Language.UK_UA: "uk-UA",
+        # Vietnamese
        Language.VI: "vi-VN",
+        Language.VI_VN: "vi-VN",
    }
+
    return language_map.get(language)


@@ -168,9 +281,10 @@ class GoogleAssistantContextAggregator(OpenAIAssistantContextAggregator):
                    )
                    run_llm = not bool(self._function_calls_in_progress)
            else:
-                self._context.add_message(
-                    glm.Content(role="model", parts=[glm.Part(text=aggregation)])
-                )
+                if aggregation.strip():
+                    self._context.add_message(
+                        glm.Content(role="model", parts=[glm.Part(text=aggregation)])
+                    )

            if self._pending_image_frame_message:
                frame = self._pending_image_frame_message
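The `aggregation.strip()` guard added in this hunk keeps empty or whitespace-only model output out of the context. A minimal sketch of that behavior, using plain dictionaries in place of `glm.Content` and `glm.Part` (the `add_model_message` helper and the list-based context are hypothetical):

```python
from typing import List


def add_model_message(context: List[dict], aggregation: str) -> None:
    """Append a model message only when the text has non-whitespace content."""
    if aggregation.strip():
        # The original text is stored unchanged; strip() is used only as a test.
        context.append({"role": "model", "parts": [{"text": aggregation}]})


context: List[dict] = []
add_model_message(context, "   \n")  # skipped: whitespace only
add_model_message(context, "Hello there.")  # stored
```

Only the second call adds an entry, so the context ends up with one model message.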
--- a/src/pipecat/services/lmnt.py
+++ b/src/pipecat/services/lmnt.py
@@ -36,24 +36,27 @@ except ModuleNotFoundError as e:


 def language_to_lmnt_language(language: Language) -> str | None:
-    language_map = {
+    BASE_LANGUAGES = {
        Language.DE: "de",
        Language.EN: "en",
-        Language.EN_US: "en",
-        Language.EN_AU: "en",
-        Language.EN_GB: "en",
-        Language.EN_NZ: "en",
-        Language.EN_IN: "en",
        Language.ES: "es",
        Language.FR: "fr",
-        Language.FR_CA: "fr",
-        Language.PT: "pt",
-        Language.PT_BR: "pt",
-        Language.ZH: "zh",
-        Language.ZH_TW: "zh",
        Language.KO: "ko",
+        Language.PT: "pt",
+        Language.ZH: "zh",
    }
-    return language_map.get(language)
+
+    result = BASE_LANGUAGES.get(language)
+
+    # If not found in base languages, try to find the base language from a variant
+    if not result:
+        # Convert enum value to string and get the base language part (e.g. es-ES -> es)
+        lang_str = str(language.value)
+        base_code = lang_str.split("-")[0].lower()
+        # Look up the base code in our supported languages
+        result = base_code if base_code in BASE_LANGUAGES.values() else None
+
+    return result


 class LmntTTSService(TTSService):
--- a/src/pipecat/services/xtts.py
+++ b/src/pipecat/services/xtts.py
@@ -7,6 +7,7 @@
 from typing import Any, AsyncGenerator, Dict

 import aiohttp
+from loguru import logger

 from pipecat.audio.utils import resample_audio
 from pipecat.frames.frames import (
@@ -20,9 +21,6 @@ from pipecat.frames.frames import (
 from pipecat.services.ai_services import TTSService
 from pipecat.transcriptions.language import Language

-from loguru import logger
-
-
 # The server below can connect to XTTS through a local running docker
 #
 # Docker command: $ docker run --gpus=all -e COQUI_TOS_AGREED=1 --rm -p 8000:80 ghcr.io/coqui-ai/xtts-streaming-server:latest-cuda121
@@ -32,15 +30,10 @@ from loguru import logger


 def language_to_xtts_language(language: Language) -> str | None:
-    language_map = {
+    BASE_LANGUAGES = {
        Language.CS: "cs",
        Language.DE: "de",
        Language.EN: "en",
-        Language.EN_US: "en",
-        Language.EN_AU: "en",
-        Language.EN_GB: "en",
-        Language.EN_NZ: "en",
-        Language.EN_IN: "en",
        Language.ES: "es",
        Language.FR: "fr",
        Language.HI: "hi",
@@ -51,12 +44,28 @@ def language_to_xtts_language(language: Language) -> str | None:
        Language.NL: "nl",
        Language.PL: "pl",
        Language.PT: "pt",
-        Language.PT_BR: "pt",
        Language.RU: "ru",
        Language.TR: "tr",
+        # Special case for Chinese base language
        Language.ZH: "zh-cn",
    }
-    return language_map.get(language)
+
+    result = BASE_LANGUAGES.get(language)
+
+    # If not found in base languages, try to find the base language from a variant
+    if not result:
+        # Convert enum value to string and get the base language part (e.g. es-ES -> es)
+        lang_str = str(language.value)
+        base_code = lang_str.split("-")[0].lower()
+
+        # Special handling for Chinese variants
+        if base_code == "zh":
+            result = "zh-cn"
+        else:
+            # Look up the base code in our supported languages
+            result = base_code if base_code in BASE_LANGUAGES.values() else None
+
+    return result


 class XTTSService(TTSService):
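The explicit Chinese branch in `language_to_xtts_language` is needed because the table maps `Language.ZH` to `zh-cn`, so the prefix `zh` never appears among the table's values and the generic fallback alone would return `None` for Chinese variants. The sketch below reproduces that logic with a hypothetical stand-in enum (`Lang`) and an illustrative function name (`to_xtts_language`).

```python
from enum import Enum
from typing import Optional


class Lang(str, Enum):
    """Tiny stand-in for Pipecat's Language enum (hypothetical subset)."""

    EN = "en"
    EN_AU = "en-AU"
    ZH = "zh"
    ZH_TW = "zh-TW"


# The Chinese entry uses "zh-cn", not the bare prefix "zh".
BASE_LANGUAGES = {Lang.EN: "en", Lang.ZH: "zh-cn"}


def to_xtts_language(language: Lang) -> Optional[str]:
    """Resolve a language to the code used in this sketch's table."""
    result = BASE_LANGUAGES.get(language)
    if not result:
        base_code = str(language.value).split("-")[0].lower()
        if base_code == "zh":
            # "zh" is not a value in BASE_LANGUAGES, so handle it explicitly.
            result = "zh-cn"
        else:
            result = base_code if base_code in BASE_LANGUAGES.values() else None
    return result


zh_tw = to_xtts_language(Lang.ZH_TW)
en_au = to_xtts_language(Lang.EN_AU)
```

Here `zh-TW` resolves to `zh-cn` through the special case, and `en-AU` resolves to `en` through the generic prefix lookup.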
--- a/src/pipecat/transcriptions/language.py
+++ b/src/pipecat/transcriptions/language.py
@@ -5,7 +5,6 @@
 #

 import sys
-
 from enum import Enum

 if sys.version_info < (3, 11):
@@ -20,46 +19,411 @@ else:


 class Language(StrEnum):
-    BG = "bg"  # Bulgarian
-    CA = "ca"  # Catalan
-    ZH = "zh"  # Chinese simplified
-    ZH_TW = "zh-TW"  # Chinese traditional
-    CS = "cs"  # Czech
-    DA = "da"  # Danish
-    NL = "nl"  # Dutch
-    EN = "en"  # English
-    EN_US = "en-US"  # English (USA)
-    EN_AU = "en-AU"  # English (Australia)
-    EN_GB = "en-GB"  # English (Great Britain)
-    EN_NZ = "en-NZ"  # English (New Zealand)
-    EN_IN = "en-IN"  # English (India)
-    ET = "et"  # Estonian
-    FI = "fi"  # Finnish
-    NL_BE = "nl-BE"  # Flemmish
-    FR = "fr"  # French
-    FR_CA = "fr-CA"  # French (Canada)
-    DE = "de"  # German
-    DE_CH = "de-CH"  # German (Switzerland)
-    EL = "el"  # Greek
-    HI = "hi"  # Hindi
-    HU = "hu"  # Hungarian
-    ID = "id"  # Indonesian
-    IT = "it"  # Italian
-    JA = "ja"  # Japanese
-    KO = "ko"  # Korean
-    LV = "lv"  # Latvian
-    LT = "lt"  # Lithuanian
-    MS = "ms"  # Malay
-    NO = "no"  # Norwegian
-    PL = "pl"  # Polish
-    PT = "pt"  # Portuguese
-    PT_BR = "pt-BR"  # Portuguese (Brazil)
-    RO = "ro"  # Romanian
-    RU = "ru"  # Russian
-    SK = "sk"  # Slovak
-    ES = "es"  # Spanish
-    SV = "sv"  # Swedish
-    TH = "th"  # Thai
-    TR = "tr"  # Turkish
-    UK = "uk"  # Ukrainian
-    VI = "vi"  # Vietnamese
+    # Afrikaans
+    AF = "af"
+    AF_ZA = "af-ZA"
+
+    # Amharic
+    AM = "am"
+    AM_ET = "am-ET"
+
+    # Arabic
+    AR = "ar"
+    AR_AE = "ar-AE"
+    AR_BH = "ar-BH"
+    AR_DZ = "ar-DZ"
+    AR_EG = "ar-EG"
+    AR_IQ = "ar-IQ"
+    AR_JO = "ar-JO"
+    AR_KW = "ar-KW"
+    AR_LB = "ar-LB"
+    AR_LY = "ar-LY"
+    AR_MA = "ar-MA"
+    AR_OM = "ar-OM"
+    AR_QA = "ar-QA"
+    AR_SA = "ar-SA"
+    AR_SY = "ar-SY"
+    AR_TN = "ar-TN"
+    AR_YE = "ar-YE"
+
+    # Assamese
+    AS = "as"
+    AS_IN = "as-IN"
+
+    # Azerbaijani
+    AZ = "az"
+    AZ_AZ = "az-AZ"
+
+    # Bulgarian
+    BG = "bg"
+    BG_BG = "bg-BG"
+
+    # Bengali
+    BN = "bn"
+    BN_BD = "bn-BD"
+    BN_IN = "bn-IN"
+
+    # Bosnian
+    BS = "bs"
+    BS_BA = "bs-BA"
+
+    # Catalan
+    CA = "ca"
+    CA_ES = "ca-ES"
+
+    # Czech
+    CS = "cs"
+    CS_CZ = "cs-CZ"
+
+    # Welsh
+    CY = "cy"
+    CY_GB = "cy-GB"
+
+    # Danish
+    DA = "da"
+    DA_DK = "da-DK"
+
+    # German
+    DE = "de"
+    DE_AT = "de-AT"
+    DE_CH = "de-CH"
+    DE_DE = "de-DE"
+
+    # Greek
+    EL = "el"
+    EL_GR = "el-GR"
+
+    # English
+    EN = "en"
+    EN_AU = "en-AU"
+    EN_CA = "en-CA"
+    EN_GB = "en-GB"
+    EN_HK = "en-HK"
+    EN_IE = "en-IE"
+    EN_IN = "en-IN"
+    EN_KE = "en-KE"
+    EN_NG = "en-NG"
+    EN_NZ = "en-NZ"
+    EN_PH = "en-PH"
+    EN_SG = "en-SG"
+    EN_TZ = "en-TZ"
+    EN_US = "en-US"
+    EN_ZA = "en-ZA"
+
+    # Spanish
+    ES = "es"
+    ES_AR = "es-AR"
+    ES_BO = "es-BO"
+    ES_CL = "es-CL"
+    ES_CO = "es-CO"
+    ES_CR = "es-CR"
+    ES_CU = "es-CU"
+    ES_DO = "es-DO"
+    ES_EC = "es-EC"
+    ES_ES = "es-ES"
+    ES_GQ = "es-GQ"
+    ES_GT = "es-GT"
+    ES_HN = "es-HN"
+    ES_MX = "es-MX"
+    ES_NI = "es-NI"
+    ES_PA = "es-PA"
+    ES_PE = "es-PE"
+    ES_PR = "es-PR"
+    ES_PY = "es-PY"
+    ES_SV = "es-SV"
+    ES_US = "es-US"
+    ES_UY = "es-UY"
+    ES_VE = "es-VE"
+
+    # Estonian
+    ET = "et"
+    ET_EE = "et-EE"
+
+    # Basque
+    EU = "eu"
+    EU_ES = "eu-ES"
+
+    # Persian
+    FA = "fa"
+    FA_IR = "fa-IR"
+
+    # Finnish
+    FI = "fi"
+    FI_FI = "fi-FI"
+
+    # Filipino
+    FIL = "fil"
+    FIL_PH = "fil-PH"
+
+    # French
+    FR = "fr"
+    FR_BE = "fr-BE"
+    FR_CA = "fr-CA"
+    FR_CH = "fr-CH"
+    FR_FR = "fr-FR"
+
+    # Irish
+    GA = "ga"
+    GA_IE = "ga-IE"
+
+    # Galician
+    GL = "gl"
+    GL_ES = "gl-ES"
+
+    # Gujarati
+    GU = "gu"
+    GU_IN = "gu-IN"
+
+    # Hebrew
+    HE = "he"
+    HE_IL = "he-IL"
+
+    # Hindi
+    HI = "hi"
+    HI_IN = "hi-IN"
+
+    # Croatian
+    HR = "hr"
+    HR_HR = "hr-HR"
+
+    # Hungarian
+    HU = "hu"
+    HU_HU = "hu-HU"
+
+    # Armenian
+    HY = "hy"
+    HY_AM = "hy-AM"
+
+    # Indonesian
+    ID = "id"
+    ID_ID = "id-ID"
+
+    # Icelandic
+    IS = "is"
+    IS_IS = "is-IS"
+
+    # Italian
+    IT = "it"
+    IT_IT = "it-IT"
+
+    # Inuktitut
+    IU_CANS = "iu-Cans"
+    IU_CANS_CA = "iu-Cans-CA"
+    IU_LATN = "iu-Latn"
+    IU_LATN_CA = "iu-Latn-CA"
+
+    # Japanese
+    JA = "ja"
+    JA_JP = "ja-JP"
+
+    # Javanese
+    JV = "jv"
+    JV_ID = "jv-ID"
+
+    # Georgian
+    KA = "ka"
+    KA_GE = "ka-GE"
+
+    # Kazakh
+    KK = "kk"
+    KK_KZ = "kk-KZ"
+
+    # Khmer
+    KM = "km"
+    KM_KH = "km-KH"
+
+    # Kannada
+    KN = "kn"
+    KN_IN = "kn-IN"
+
+    # Korean
+    KO = "ko"
+    KO_KR = "ko-KR"
+
+    # Lao
+    LO = "lo"
+    LO_LA = "lo-LA"
+
+    # Lithuanian
+    LT = "lt"
+    LT_LT = "lt-LT"
+
+    # Latvian
+    LV = "lv"
+    LV_LV = "lv-LV"
+
+    # Macedonian
+    MK = "mk"
+    MK_MK = "mk-MK"
+
+    # Malayalam
+    ML = "ml"
+    ML_IN = "ml-IN"
+
+    # Mongolian
+    MN = "mn"
+    MN_MN = "mn-MN"
+
+    # Marathi
+    MR = "mr"
+    MR_IN = "mr-IN"
+
+    # Malay
+    MS = "ms"
+    MS_MY = "ms-MY"
+
+    # Maltese
+    MT = "mt"
+    MT_MT = "mt-MT"
+
+    # Burmese
+    MY = "my"
+    MY_MM = "my-MM"
+
+    # Norwegian
+    NB = "nb"
+    NB_NO = "nb-NO"
+    NO = "no"
+
+    # Nepali
+    NE = "ne"
+    NE_NP = "ne-NP"
+
+    # Dutch
+    NL = "nl"
+    NL_BE = "nl-BE"
+    NL_NL = "nl-NL"
+
+    # Odia
+    OR = "or"
+    OR_IN = "or-IN"
+
+    # Punjabi
+    PA = "pa"
+    PA_IN = "pa-IN"
+
+    # Polish
+    PL = "pl"
+    PL_PL = "pl-PL"
+
+    # Pashto
+    PS = "ps"
+    PS_AF = "ps-AF"
+
+    # Portuguese
+    PT = "pt"
+    PT_BR = "pt-BR"
+    PT_PT = "pt-PT"
+
+    # Romanian
+    RO = "ro"
+    RO_RO = "ro-RO"
+
+    # Russian
+    RU = "ru"
+    RU_RU = "ru-RU"
+
+    # Sinhala
+    SI = "si"
+    SI_LK = "si-LK"
+
+    # Slovak
+    SK = "sk"
+    SK_SK = "sk-SK"
+
+    # Slovenian
+    SL = "sl"
+    SL_SI = "sl-SI"
+
+    # Somali
+    SO = "so"
+    SO_SO = "so-SO"
+
+    # Albanian
+    SQ = "sq"
+    SQ_AL = "sq-AL"
+
+    # Serbian
+    SR = "sr"
+    SR_RS = "sr-RS"
+    SR_LATN = "sr-Latn"
+    SR_LATN_RS = "sr-Latn-RS"
+
+    # Sundanese
+    SU = "su"
+    SU_ID = "su-ID"
+
+    # Swedish
+    SV = "sv"
+    SV_SE = "sv-SE"
+
+    # Swahili
+    SW = "sw"
+    SW_KE = "sw-KE"
+    SW_TZ = "sw-TZ"
+
+    # Tagalog
+    TL = "tl"
+
+    # Tamil
+    TA = "ta"
+    TA_IN = "ta-IN"
+    TA_LK = "ta-LK"
+    TA_MY = "ta-MY"
+    TA_SG = "ta-SG"
+
+    # Telugu
+    TE = "te"
+    TE_IN = "te-IN"
+
+    # Thai
+    TH = "th"
+    TH_TH = "th-TH"
+
+    # Turkish
+    TR = "tr"
+    TR_TR = "tr-TR"
+
+    # Ukrainian
+    UK = "uk"
+    UK_UA = "uk-UA"
+
+    # Urdu
+    UR = "ur"
+    UR_IN = "ur-IN"
+    UR_PK = "ur-PK"
+
+    # Uzbek
+    UZ = "uz"
+    UZ_UZ = "uz-UZ"
+
+    # Vietnamese
+    VI = "vi"
+    VI_VN = "vi-VN"
+
+    # Wu Chinese
+    WUU = "wuu"
+    WUU_CN = "wuu-CN"
+
+    # Yue Chinese
+    YUE = "yue"
+    YUE_CN = "yue-CN"
+
+    # Chinese
+    ZH = "zh"
+    ZH_CN = "zh-CN"
+    ZH_CN_GUANGXI = "zh-CN-guangxi"
+    ZH_CN_HENAN = "zh-CN-henan"
+    ZH_CN_LIAONING = "zh-CN-liaoning"
+    ZH_CN_SHAANXI = "zh-CN-shaanxi"
+    ZH_CN_SHANDONG = "zh-CN-shandong"
+    ZH_CN_SICHUAN = "zh-CN-sichuan"
+    ZH_HK = "zh-HK"
+    ZH_TW = "zh-TW"
+
+    # Xhosa
+    XH = "xh"
+
+    # Zulu
+    ZU = "zu"
+    ZU_ZA = "zu-ZA"
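
Note on the hunk above: the enum now carries both base codes (`en`, `pt`) and regional codes (`en-US`, `pt-BR`), so a service that only supports a base language needs some way to map a regional member onto it. The following is a minimal sketch of one such fallback, not the mapping Pipecat itself uses. The `resolve_language` helper and the `supported` table are hypothetical, and the enum is a small subset of the members listed in the diff.

```python
from enum import Enum
from typing import Dict, Optional


class Language(str, Enum):
    # Small subset of the members added in the diff above.
    EN = "en"
    EN_US = "en-US"
    PT = "pt"
    PT_BR = "pt-BR"
    SR = "sr"
    SR_LATN_RS = "sr-Latn-RS"


def resolve_language(language: Language, supported: Dict[Language, str]) -> Optional[str]:
    """Hypothetical helper: try an exact match, then fall back to the base code."""
    if language in supported:
        return supported[language]
    # "en-US" -> "en", "sr-Latn-RS" -> "sr"
    base_code = language.value.split("-")[0]
    try:
        base = Language(base_code)
    except ValueError:
        # The base code is not an enum member, so there is nothing to fall back to.
        return None
    return supported.get(base)


# Hypothetical service table: service-specific names keyed by Language.
supported = {Language.EN: "english", Language.PT_BR: "brazilian-portuguese"}
```

With this table, `resolve_language(Language.EN_US, supported)` falls back to the `en` entry, while `resolve_language(Language.PT, supported)` returns `None`, because the fallback only goes from regional to base and never the other way.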
--- a/src/pipecat/transports/base_input.py
+++ b/src/pipecat/transports/base_input.py
@@ -71,6 +71,7 @@ class BaseInputTransport(FrameProcessor):
        return self._params.vad_analyzer

    async def push_audio_frame(self, frame: InputAudioRawFrame):
+        logger.info(f"Pushing audio qsize: {self._audio_in_queue.qsize()}")
        if self._params.audio_in_enabled or self._params.vad_enabled:
            await self._audio_in_queue.put(frame)

@@ -167,6 +168,7 @@ class BaseInputTransport(FrameProcessor):
        return vad_state

    async def _audio_task_handler(self):
+        logger.info("_audio_task_handler started")
        vad_state: VADState = VADState.QUIET
        while True:
            try:
--- a/src/pipecat/transports/network/fastapi_websocket.py
+++ b/src/pipecat/transports/network/fastapi_websocket.py
@@ -70,16 +70,6 @@ class FastAPIWebsocketInputTransport(BaseInputTransport):
        await self._callbacks.on_client_connected(self._websocket)
        self._receive_task = self.get_event_loop().create_task(self._receive_messages())

-    async def stop(self, frame: EndFrame):
-        await super().stop(frame)
-        if self._websocket.client_state != WebSocketState.DISCONNECTED:
-            await self._websocket.close()
-
-    async def cancel(self, frame: CancelFrame):
-        await super().cancel(frame)
-        if self._websocket.client_state != WebSocketState.DISCONNECTED:
-            await self._websocket.close()
-
    async def _receive_messages(self):
        async for message in self._websocket.iter_text():
            frame = self._params.serializer.deserialize(message)
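
Note on the hunk above: the removed `stop` and `cancel` overrides closed a websocket that the transport did not create. The commit message in the history below gives the reason: the socket is passed in from outside, so the transport should not close it. The following is a minimal sketch of that ownership rule, assuming nothing about FastAPI itself. `FakeWebSocket`, `BorrowingTransport` and `handle_connection` are hypothetical stand-ins.

```python
import asyncio


class FakeWebSocket:
    """Hypothetical stand-in for a socket created by the web framework."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class BorrowingTransport:
    """Hypothetical transport that borrows a socket it did not create."""

    def __init__(self, websocket):
        self._websocket = websocket
        self.running = True

    async def stop(self):
        # Release only what the transport itself owns; leave the socket alone.
        self.running = False


async def handle_connection():
    websocket = FakeWebSocket()  # created by the caller, not by the transport
    transport = BorrowingTransport(websocket)
    await transport.stop()
    closed_by_transport = websocket.closed
    await websocket.close()  # the owner closes what it opened
    return closed_by_transport, websocket.closed


result = asyncio.run(handle_connection())
```

Here `result` records that the socket was still open after `transport.stop()` and closed only once its owner closed it.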
--- a/src/pipecat/transports/network/websocket_server.py
+++ b/src/pipecat/transports/network/websocket_server.py
@@ -106,6 +106,7 @@ class WebsocketServerInputTransport(BaseInputTransport):
                continue

            if isinstance(frame, AudioRawFrame):
+                logger.info("websocket_server")
                await self.push_audio_frame(
                    InputAudioRawFrame(
                        audio=frame.audio,
--- a/src/pipecat/transports/services/daily.py
+++ b/src/pipecat/transports/services/daily.py
@@ -128,7 +128,11 @@ class DailyCallbacks(BaseModel):
    on_error: Callable[[str], Awaitable[None]]
    on_app_message: Callable[[Any, str], Awaitable[None]]
    on_call_state_updated: Callable[[str], Awaitable[None]]
+    on_dialin_connected: Callable[[Any], Awaitable[None]]
    on_dialin_ready: Callable[[str], Awaitable[None]]
+    on_dialin_stopped: Callable[[Any], Awaitable[None]]
+    on_dialin_error: Callable[[Any], Awaitable[None]]
+    on_dialin_warning: Callable[[Any], Awaitable[None]]
    on_dialout_answered: Callable[[Any], Awaitable[None]]
    on_dialout_connected: Callable[[Any], Awaitable[None]]
    on_dialout_stopped: Callable[[Any], Awaitable[None]]
@@ -536,9 +540,21 @@ class DailyTransportClient(EventHandler):
    def on_call_state_updated(self, state: str):
        self._call_async_callback(self._callbacks.on_call_state_updated, state)

+    def on_dialin_connected(self, data: Any):
+        self._call_async_callback(self._callbacks.on_dialin_connected, data)
+
    def on_dialin_ready(self, sip_endpoint: str):
        self._call_async_callback(self._callbacks.on_dialin_ready, sip_endpoint)

+    def on_dialin_stopped(self, data: Any):
+        self._call_async_callback(self._callbacks.on_dialin_stopped, data)
+
+    def on_dialin_error(self, data: Any):
+        self._call_async_callback(self._callbacks.on_dialin_error, data)
+
+    def on_dialin_warning(self, data: Any):
+        self._call_async_callback(self._callbacks.on_dialin_warning, data)
+
    def on_dialout_answered(self, data: Any):
        self._call_async_callback(self._callbacks.on_dialout_answered, data)

@@ -822,7 +838,11 @@ class DailyTransport(BaseTransport):
            on_error=self._on_error,
            on_app_message=self._on_app_message,
            on_call_state_updated=self._on_call_state_updated,
+            on_dialin_connected=self._on_dialin_connected,
            on_dialin_ready=self._on_dialin_ready,
+            on_dialin_stopped=self._on_dialin_stopped,
+            on_dialin_error=self._on_dialin_error,
+            on_dialin_warning=self._on_dialin_warning,
            on_dialout_answered=self._on_dialout_answered,
            on_dialout_connected=self._on_dialout_connected,
            on_dialout_stopped=self._on_dialout_stopped,
@@ -851,7 +871,11 @@ class DailyTransport(BaseTransport):
        self._register_event_handler("on_left")
        self._register_event_handler("on_app_message")
        self._register_event_handler("on_call_state_updated")
+        self._register_event_handler("on_dialin_connected")
        self._register_event_handler("on_dialin_ready")
+        self._register_event_handler("on_dialin_stopped")
+        self._register_event_handler("on_dialin_error")
+        self._register_event_handler("on_dialin_warning")
        self._register_event_handler("on_dialout_answered")
        self._register_event_handler("on_dialout_connected")
        self._register_event_handler("on_dialout_stopped")
@@ -987,11 +1011,23 @@ class DailyTransport(BaseTransport):
            except Exception as e:
                logger.exception(f"Error handling dialin-ready event ({url}): {e}")

+    async def _on_dialin_connected(self, data):
+        await self._call_event_handler("on_dialin_connected", data)
+
    async def _on_dialin_ready(self, sip_endpoint):
        if self._params.dialin_settings:
            await self._handle_dialin_ready(sip_endpoint)
        await self._call_event_handler("on_dialin_ready", sip_endpoint)

+    async def _on_dialin_stopped(self, data):
+        await self._call_event_handler("on_dialin_stopped", data)
+
+    async def _on_dialin_error(self, data):
+        await self._call_event_handler("on_dialin_error", data)
+
+    async def _on_dialin_warning(self, data):
+        await self._call_event_handler("on_dialin_warning", data)
+
    async def _on_dialout_answered(self, data):
        await self._call_event_handler("on_dialout_answered", data)
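
Note on the `daily.py` hunks above: each new dial-in event follows the same three steps. The event name is registered, a callback is wired in, and the callback forwards its payload through `_call_event_handler`. The following is a minimal, self-contained sketch of that register-and-dispatch pattern. `MiniTransport` and `add_event_handler` are hypothetical simplifications, not the real `DailyTransport` API. The payload shape is also an assumption, since the diff types it as `Any`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Handler = Callable[..., Awaitable[None]]


class MiniTransport:
    """Hypothetical, simplified stand-in; not the real DailyTransport."""

    def __init__(self):
        self._event_handlers: Dict[str, List[Handler]] = {}
        # Event names taken from the diff above.
        for name in (
            "on_dialin_connected",
            "on_dialin_stopped",
            "on_dialin_error",
            "on_dialin_warning",
        ):
            self._register_event_handler(name)

    def _register_event_handler(self, event_name: str) -> None:
        if event_name in self._event_handlers:
            raise ValueError(f"Event {event_name} is already registered")
        self._event_handlers[event_name] = []

    def add_event_handler(self, event_name: str, handler: Handler) -> None:
        # Only events registered in __init__ can receive handlers.
        if event_name not in self._event_handlers:
            raise KeyError(f"Event {event_name} is not registered")
        self._event_handlers[event_name].append(handler)

    async def _call_event_handler(self, event_name: str, *args: Any) -> None:
        for handler in self._event_handlers[event_name]:
            await handler(self, *args)

    async def _on_dialin_error(self, data: Any) -> None:
        # Same shape as the forwarding methods added in the diff.
        await self._call_event_handler("on_dialin_error", data)


async def main():
    transport = MiniTransport()
    seen = []

    async def on_error(transport, data):
        seen.append(("error", data))

    transport.add_event_handler("on_dialin_error", on_error)
    await transport._on_dialin_error({"msg": "sip failure"})  # assumed payload
    return seen


events = asyncio.run(main())
```

Registering names up front means a typo in an event name fails when the handler is added, rather than being silently ignored at dispatch time.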
Author	SHA1	Message	Date
James Hush	1884ff3f09	logging	2024-11-27 19:38:37 +08:00
James Hush	f34e6bce94	Switch questions	2024-11-27 15:10:50 +08:00
James Hush	909bb30517	Better recreation	2024-11-27 14:08:01 +08:00
James Hush	632bae7eee	Interrupted?	2024-11-27 12:21:45 +08:00
James Hush	cedccdcbc0	Add interruptions	2024-11-27 11:50:28 +08:00
James Hush	1893784b89	Save race bot	2024-11-27 11:36:28 +08:00
James Hush	e2384e2484	fix: add logging and error handling for issue #721	2024-11-26 11:22:58 +08:00
Mark Backman	98c0a6e047	Merge pull request #749 from pipecat-ai/mb/pipecat-flows-standalone Make Pipecat Flows an independent package	2024-11-25 17:09:11 -05:00
Mark Backman	f599e160de	Make Pipecat Flows an independent package	2024-11-25 13:42:08 -05:00
Mark Backman	11c5d822f9	Merge pull request #746 from pipecat-ai/mb/update-flows Bumping pipecat-ai-flows version	2024-11-22 11:25:03 -05:00
Mark Backman	c3e22f0931	Bumping pipecat-ai-flows version	2024-11-22 11:21:40 -05:00
Kwindla Hultman Kramer	9409546f90	Merge pull request #743 from pipecat-ai/khk/gemini-exp Empty text content bug fix for Gemini	2024-11-21 14:04:28 -08:00
Kwindla Hultman Kramer	8ddac0ccd8	Testing with gemini-exp-1114. Bug fix	2024-11-21 10:33:12 -08:00
Mark Backman	f938960d50	Merge pull request #736 from pipecat-ai/mb/language-support Make language support more robust	2024-11-20 13:03:47 -05:00
Mark Backman	2981d87bc1	Update changelog	2024-11-20 12:56:35 -05:00
Mark Backman	106042bbb2	Make language support more robust	2024-11-20 12:56:11 -05:00
Filipi da Silva Fuchter	d25ddeb962	Merge pull request #739 from pipecat-ai/krisp_v7 bumping krisp to support v7	2024-11-20 11:39:39 -03:00
Filipi Fuchter	c441baa692	bumping krisp to support v7	2024-11-20 11:37:45 -03:00
Mark Backman	676ff14913	Merge pull request #735 from pipecat-ai/vp-internal-push-frame-fix internal push frame fix	2024-11-20 06:34:40 -05:00
Vanessa Pyne	14893ade92	Update src/pipecat/processors/frame_processor.py Co-authored-by: Mark Backman <mark@daily.co>	2024-11-19 22:37:58 -06:00
Mark Backman	2a39ff69d6	Merge pull request #720 from pipecat-ai/mb/conversation-flow	2024-11-19 21:46:20 -05:00
Mark Backman	e79289454a	Merge pull request #734 from pipecat-ai/mb/fix-cartesia	2024-11-19 21:27:52 -05:00
Mark Backman	25d02da1b2	Merge pull request #738 from pipecat-ai/mb/natural-conversation-demo	2024-11-19 21:27:38 -05:00
Mark Backman	a36fc370fa	Improve the 22c foundational example	2024-11-19 15:49:40 -05:00
Mark Backman	e4c2f6d4c2	Update changelog	2024-11-18 21:32:53 -05:00
Mark Backman	97659ca3f0	Use the new pipecat-ai-flows module	2024-11-18 21:29:35 -05:00
vipyne	e00c75ce3f	fix: raise exception in internal_push_frame	2024-11-18 16:01:04 -06:00
Mark Backman	cf62167f54	Revert: services(cartesia): generated TTSStoppedFrame after no more audio	2024-11-18 12:25:04 -05:00
Mark Backman	b3dfeb61c4	Add CHANGELOG entry	2024-11-18 12:18:20 -05:00
Mark Backman	bd020320cd	Support a list of messages	2024-11-18 12:18:20 -05:00
Mark Backman	7a55d2d7db	Add end session handler and update example	2024-11-18 12:18:20 -05:00
Mark Backman	b7308dca5d	Fix issue where actions would execute on terminating nodes	2024-11-18 12:18:20 -05:00
Mark Backman	5301f44b3b	Add pre- and post-actions	2024-11-18 12:18:20 -05:00
Mark Backman	686165b95a	Add ability to register actions	2024-11-18 12:18:20 -05:00
Mark Backman	4e0ecdd673	Class name updates and remove FrameProcessor base class	2024-11-18 12:18:20 -05:00
Mark Backman	1b74560f9d	Move function registration into the ConversationFlowProcessor class	2024-11-18 12:18:20 -05:00
Mark Backman	0c1070433f	Clean up and commenting	2024-11-18 12:18:20 -05:00
Mark Backman	ece2c08cde	debugging	2024-11-18 12:18:20 -05:00
Mark Backman	0b9742da9e	Add a conversation flow processor	2024-11-18 12:18:20 -05:00
Aleix Conchillo Flaqué	635aa6eb5b	Merge pull request #729 from pipecat-ai/aleix/fastapi-websocket-dont-close transports(fastapi): don't try to close socket	2024-11-18 16:01:41 +01:00
Mark Backman	1ff17cc2b6	Merge pull request #733 from pipecat-ai/aleix/add-missing-init-files processors: add missing __init__.py	2024-11-18 09:44:56 -05:00
Mark Backman	41ce9e9087	Merge pull request #697 from pipecat-ai/cst/leave-message add handler for disconnect-bot message	2024-11-18 09:38:11 -05:00
Mark Backman	4803c54ecf	Update CHANGELOG	2024-11-18 09:36:19 -05:00
Christian Stuff	5d7b3f2b38	add handler for disconnect-bot message	2024-11-18 09:33:30 -05:00
Aleix Conchillo Flaqué	23e5b1ec4d	processors: add missing __init__.py	2024-11-18 11:32:20 +01:00
Aleix Conchillo Flaqué	7f5a8928b8	transports(fastapi): don't try to close socket The websocket is passed from outside (in the transport constructor) so we should not be trying to close it. FastAPI does actually close it later. We didn't see any issue because these functions were not implemented properly. The value to check was `application_state` instead of `client_state`. But in any case, Pipecat should not be responsible for closing things passed from outside.	2024-11-18 01:15:19 +01:00
Aleix Conchillo Flaqué	53f675f5cf	Merge pull request #727 from pipecat-ai/aleix/pipecat-0.0.49 update CHANGELOG for 0.0.49	2024-11-18 06:27:12 +08:00
Aleix Conchillo Flaqué	8173e4ce55	update CHANGELOG for 0.0.49	2024-11-17 23:26:09 +01:00
Aleix Conchillo Flaqué	5445bb0363	rtvi: add on_bot_started event	2024-11-17 22:40:00 +01:00
Mark Backman	a2a94724e5	Merge pull request #725 from pipecat-ai/mb/fix-simple-chatbot Fix simple-chatbot example	2024-11-16 12:10:05 -05:00
Aleix Conchillo Flaqué	a8f9b0635a	Merge pull request #722 from pipecat-ai/aleix/more-dailin-events transports(daily): add more dial-in events	2024-11-17 01:09:01 +08:00
Mark Backman	4273a31fd5	Fix simple-chatbot example	2024-11-16 07:48:42 -05:00
Aleix Conchillo Flaqué	67f975a2c8	transports(daily): add more dial-in events	2024-11-16 01:22:50 +01:00