Compare commits

..

2 Commits

Author SHA1 Message Date
Mark Backman
6fca53c31d Add changelog for #4525 2026-05-19 18:32:02 -04:00
Mark Backman
e1f3b4fdbe raise ImportError instead of Exception for missing optional deps
Across 84 files, the optional-dependency guard at module load did
`raise Exception(f"Missing module: {e}")`, which is too generic and
drops the original `ModuleNotFoundError` traceback. Switch to
`raise ImportError(...) from e` so callers can `except ImportError:`
cleanly and the original cause is preserved.

Two files (audio/turn/krisp_viva_turn.py, turns/user_start/krisp_viva_ip_user_turn_start_strategy.py)
already used the correct pattern and were left untouched.
2026-05-19 18:31:25 -04:00
132 changed files with 257 additions and 5045 deletions

View File

@@ -1,91 +0,0 @@
---
name: squash-commits
description: Reorganize messy branch commits into a small set of logical, meaningful commits without changing any content. Drops merge-from-main commits. Safe: creates a backup branch first.
---
Reorganize the commits on the current branch into a small number of logical commits. Do NOT change any file content — only the commit structure changes.
## Instructions
### 1. Safety check
```bash
git status --short
```
If there are uncommitted changes, stop and tell the user to commit or stash them first.
### 2. Inspect the branch
```bash
git log main..HEAD --oneline
git diff main..HEAD --name-only
```
List every file changed vs `main` and every commit on the branch (excluding merge commits from main).
### 3. Create a backup branch
```bash
git branch backup/<current-branch-name>
```
Tell the user the backup exists so they can recover if needed.
### 4. Soft-reset to main and unstage everything
```bash
git reset --soft main
git restore --staged .
```
All branch changes are now in the working tree, unstaged. No content has changed.
### 5. Plan the logical groups
Read the changed files and the original commit messages to understand what the work covers. Group related files into logical commits. Typical groups:
- Core feature or fix (new source files + modified core files)
- Secondary features or fixes (each as its own commit if distinct)
- Refactoring or renames
- Tests
- Changelogs / docs
Use the changelog files (if any) as a strong hint — each changelog entry often maps to one commit.
Present the proposed grouping to the user and ask for confirmation before committing.
### 6. Commit in logical groups
For each group, stage only the relevant files and commit with a clear message following the project's conventions:
```bash
git add <file1> <file2> ...
git commit -m "..."
```
Use conventional commit prefixes if the project uses them (`feat:`, `fix:`, `refactor:`, `test:`, `chore:`).
### 7. Verify
```bash
git log main..HEAD --oneline
git diff main..HEAD --name-only
git status --short
```
Confirm:
- Commit count is small and each message is meaningful
- The set of changed files vs `main` is identical to before
- Working tree is clean
### 8. Remind about force-push
The branch history has been rewritten. Tell the user they will need to `git push --force-with-lease` when they are ready to update the remote. Do NOT push automatically.
## Rules
- Never change file contents. If you find yourself editing a file, stop.
- Never skip the backup branch step.
- Never force-push without explicit user instruction.
- If any step fails or the result looks wrong, tell the user and suggest restoring from the backup: `git reset --hard backup/<branch-name>`.

View File

@@ -92,7 +92,7 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
| Category | Services |
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/api-reference/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/api-reference/server/services/stt/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/api-reference/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/api-reference/server/services/stt/gladia), [Google](https://docs.pipecat.ai/api-reference/server/services/stt/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/stt/mistral), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/stt/nvidia), [OpenAI (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/api-reference/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/api-reference/server/services/stt/whisper), [xAI](https://docs.pipecat.ai/api-reference/server/services/stt/xai) |
| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Inception](https://docs.pipecat.ai/api-reference/server/services/llm/inception), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together) |
| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/api-reference/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/api-reference/server/services/tts/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/api-reference/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/api-reference/server/services/tts/fish), [Google](https://docs.pipecat.ai/api-reference/server/services/tts/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/api-reference/server/services/tts/groq), [Hume](https://docs.pipecat.ai/api-reference/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/api-reference/server/services/tts/inworld), [Kokoro](https://docs.pipecat.ai/api-reference/server/services/tts/kokoro), [LMNT](https://docs.pipecat.ai/api-reference/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/api-reference/server/services/tts/minimax), [Mistral](https://docs.pipecat.ai/api-reference/server/services/tts/mistral), [Neuphonic](https://docs.pipecat.ai/api-reference/server/services/tts/neuphonic), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/tts/nvidia), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/tts/openai), [Piper](https://docs.pipecat.ai/api-reference/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/api-reference/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/api-reference/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/api-reference/server/services/tts/smallest), [Soniox](https://docs.pipecat.ai/api-reference/server/services/tts/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/api-reference/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/api-reference/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/api-reference/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/api-reference/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/api-reference/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/api-reference/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/api-reference/server/services/s2s/ultravox), |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/api-reference/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/api-reference/server/services/transport/small-webrtc), [Vonage (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/vonage), [WebSocket Server](https://docs.pipecat.ai/api-reference/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/api-reference/server/services/transport/whatsapp), Local |

View File

@@ -1 +0,0 @@
- Fixed Azure TTS last word being missed by observers and RTVI UI. The completion signal was racing with word timestamp processing, causing the final word's `TTSTextFrame` to arrive after `TTSStoppedFrame`. Completion is now routed through the word boundary queue to ensure all words are processed before signaling stream end.

View File

@@ -1 +0,0 @@
- Fixed `BaseOutputTransport` reordering frames that share the same presentation timestamp. Frames with equal PTS values are now emitted in insertion order, preventing subtle audio/text sequencing bugs when multiple frames arrive at the same time.

View File

@@ -1 +0,0 @@
- Fixed Cartesia word timestamps leaking SSML tag text (e.g. `<spell>`, `<emotion>`, `<break>`) into word entries. Tags are now stripped before processing, so word-to-text attribution remains accurate when SSML markup is present in the TTS input.

View File

@@ -1 +0,0 @@
- Fixed `TTSTextFrame` entries losing their original text structure when word timestamps are enabled. Each `TTSTextFrame` now carries a `raw_text` field containing the corresponding span of the original LLM-produced text (including pattern delimiters such as `<card>4111 1111 1111 1111</card>`), so the assistant context receives properly-tagged content rather than the cleaned words returned by the TTS provider. Also handles words that straddle two sentence boundaries by splitting them and attributing each part to its correct source frame.

View File

@@ -1 +0,0 @@
- Fixed skipped TTS frames (e.g. code blocks filtered via `skip_aggregator_types`) being emitted to the assistant context immediately instead of waiting for preceding spoken frames to finish. They now hold their position in the frame sequence and are flushed only after all earlier spoken sentences are complete, keeping context ordering correct.

View File

@@ -1 +0,0 @@
- Added `InceptionLLMService` for Inception's Mercury 2 diffusion reasoning model, with support for `reasoning_effort` and `realtime` settings.

View File

@@ -1 +0,0 @@
- Fixed websocket STT connection setup failures so services clear stale websocket state and emit non-fatal error frames, allowing `ServiceSwitcher` failover to keep agents running.

View File

@@ -1 +0,0 @@
- Added `max_endpoint_delay_ms` to `SonioxSTTService.Settings`, controlling the maximum delay (500-3000 ms) before endpoint detection finalizes a turn.

View File

@@ -1 +0,0 @@
- `SonioxSTTService` now applies settings updates (e.g. via `STTUpdateSettingsFrame`) using a graceful reconnect instead of a hard disconnect/reconnect, preserving the service's reconnect retry behavior.

View File

@@ -1 +0,0 @@
- Removed the unsupported Georgian (`Language.KA`) language mapping from `SonioxSTTService`.

View File

@@ -1 +0,0 @@
- Updated the default p99 TTFS latency values for Smallest AI, Mistral, and XAI STT so turn stop timing uses measured values instead of the conservative fallback.

View File

@@ -1 +0,0 @@
- Updated the development runner startup banner to show the prebuilt client URL once and list enabled or disabled transports with install hints.

View File

@@ -1 +0,0 @@
- Fixed the development runner so missing optional transport dependencies disable only their related routes instead of failing startup in all-transport mode.

View File

@@ -0,0 +1 @@
- Services and transports with missing optional dependencies now raise `ImportError` instead of a bare `Exception` when their module is imported without the required extra installed. The original `ModuleNotFoundError` is preserved as `__cause__`, so code that wraps these imports can now use `except ImportError:` cleanly instead of `except Exception:`.

View File

@@ -1 +0,0 @@
- Fixed a race in `ElevenLabsTTSService` where the periodic keepalive could be sent for a new turn's context before that context's `voice_settings` initialization message, causing ElevenLabs to close the WebSocket with a 1008 policy violation (`voice_settings field must be provided in the first message ...`). The keepalive now only targets a context once its context-init has been sent.

View File

@@ -1 +0,0 @@
- Bumped `pipecat-ai-prebuilt` to 1.0.1 in the `runner` extra, updating the prebuilt client UI served by the development runner.

View File

@@ -91,9 +91,6 @@ HEYGEN_LIVE_AVATAR_API_KEY=...
HUME_API_KEY=...
HUME_VOICE_ID=...
# Inception
INCEPTION_API_KEY=...
# Inworld
INWORLD_API_KEY=...

View File

@@ -1,177 +0,0 @@
#
# Copyright (c) 2024-2026, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMRunFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
LLMContextAggregatorPair,
LLMUserAggregatorParams,
)
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.inception.llm import InceptionLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
async def fetch_weather_from_api(params: FunctionCallParams):
await params.result_callback({"conditions": "nice", "temperature": "75"})
async def fetch_restaurant_recommendation(params: FunctionCallParams):
await params.result_callback({"name": "The Golden Dragon"})
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.environ["DEEPGRAM_API_KEY"])
tts = CartesiaTTSService(
api_key=os.environ["CARTESIA_API_KEY"],
settings=CartesiaTTSService.Settings(
voice="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
),
)
llm = InceptionLLMService(
api_key=os.environ["INCEPTION_API_KEY"],
settings=InceptionLLMService.Settings(
reasoning_effort="instant",
system_instruction="You are a helpful assistant in a voice conversation. Your responses will be spoken aloud, so avoid emojis, bullet points, or other formatting that can't be spoken. Respond to what the user said in a creative, helpful, and brief way.",
),
)
# You can also register a function_name of None to get all functions
# sent to the same callback with an additional function_name parameter.
llm.register_function("get_current_weather", fetch_weather_from_api)
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
weather_function = FunctionSchema(
name="get_current_weather",
description="Get the current weather",
properties={
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
"format": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The temperature unit to use. Infer this from the user's location.",
},
},
required=["location", "format"],
)
restaurant_function = FunctionSchema(
name="get_restaurant_recommendation",
description="Get a restaurant recommendation",
properties={
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
},
required=["location"],
)
tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
context = LLMContext(tools=tools)
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
context,
user_params=LLMUserAggregatorParams(vad_analyzer=SileroVADAnalyzer()),
)
pipeline = Pipeline(
[
transport.input(),
stt,
user_aggregator,
llm,
tts,
transport.output(),
assistant_aggregator,
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
context.add_message(
{"role": "developer", "content": "Please introduce yourself to the user."}
)
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -22,9 +22,9 @@ from pipecat.processors.aggregators.llm_response_universal import (
)
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.soniox.stt import SonioxSTTService
from pipecat.services.soniox.tts import SonioxTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
@@ -53,7 +53,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt = SonioxSTTService(api_key=os.environ["SONIOX_API_KEY"])
tts = SonioxTTSService(api_key=os.environ["SONIOX_API_KEY"])
tts = CartesiaTTSService(
api_key=os.environ["CARTESIA_API_KEY"],
settings=CartesiaTTSService.Settings(
voice="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
),
)
llm = OpenAILLMService(
api_key=os.environ["OPENAI_API_KEY"],
@@ -98,9 +103,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
await task.queue_frames([LLMRunFrame()])
await asyncio.sleep(10)
logger.info("Updating Soniox STT settings: language_hints=[es]")
logger.info("Updating Soniox STT settings: language=es")
await task.queue_frame(
STTUpdateSettingsFrame(delta=SonioxSTTService.Settings(language_hints=[Language.ES]))
STTUpdateSettingsFrame(delta=SonioxSTTService.Settings(language=Language.ES))
)
@transport.event_handler("on_client_disconnected")

View File

@@ -77,7 +77,6 @@ groq = [ "groq>=0.23.0,<2" ]
gstreamer = [ "pygobject~=3.50.0" ]
heygen = [ "livekit>=1.0.13,<2", "pipecat-ai[websockets-base]" ]
hume = [ "hume>=0.11.2,<1" ]
inception = []
inworld = [ "pipecat-ai[websockets-base]" ]
koala = [ "pvkoala~=2.0.3" ]
kokoro = [ "kokoro-onnx>=0.5.0,<1", "requests>=2.32.5,<3" ]
@@ -104,7 +103,7 @@ piper = [ "piper-tts>=1.3.0,<2", "requests>=2.32.5,<3" ]
qwen = []
resembleai = [ "pipecat-ai[websockets-base]" ]
rime = [ "pipecat-ai[websockets-base]" ]
runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<1", "pipecat-ai-prebuilt>=1.0.1"]
runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<1", "pipecat-ai-prebuilt>=1.0.0"]
sagemaker = ["aws_sdk_sagemaker_runtime_http2; python_version>='3.12'"]
sambanova = []
sarvam = [ "sarvamai==0.1.28", "pipecat-ai[websockets-base]" ]

View File

@@ -198,7 +198,6 @@ TESTS_FUNCTION_CALLING = [
("function-calling/function-calling-sarvam.py", EVAL_WEATHER),
("function-calling/function-calling-novita.py", EVAL_WEATHER),
("function-calling/function-calling-deepseek.py", EVAL_WEATHER),
("function-calling/function-calling-inception.py", EVAL_WEATHER),
# Video
("function-calling/function-calling-anthropic-video.py", EVAL_VISION_CAMERA),
("function-calling/function-calling-aws-video.py", EVAL_VISION_CAMERA),

View File

@@ -28,7 +28,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class GeminiLLMInvocationParams(TypedDict):

View File

@@ -23,7 +23,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use the Koala filter, you need to `pip install pipecat-ai[koala]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class KoalaFilter(BaseAudioFilter):

View File

@@ -27,7 +27,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use KrispVivaFilter, you need to install krisp_audio.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class KrispVivaFilter(BaseAudioFilter):

View File

@@ -28,7 +28,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use the soundfile mixer, you need to `pip install pipecat-ai[soundfile]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class SoundfileMixer(BaseAudioMixer):

View File

@@ -27,7 +27,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use the LocalSmartTurnAnalyzer, you need to `pip install pipecat-ai[local-smart-turn]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class LocalCoreMLSmartTurnAnalyzer(BaseSmartTurn):

View File

@@ -33,7 +33,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use LocalSmartTurnAnalyzerV2, you need to `pip install pipecat-ai[local-smart-turn]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class LocalSmartTurnAnalyzerV2(BaseSmartTurn):

View File

@@ -28,7 +28,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use KrispVivaVADAnalyzer, you need to install krisp_audio.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class KrispVivaVadAnalyzer(VADAnalyzer):

View File

@@ -27,7 +27,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Silero VAD, you need to `pip install pipecat-ai`.")
raise Exception(f"Missing module(s): {e}")
raise ImportError(f"Missing module(s): {e}") from e
class SileroOnnxModel:

View File

@@ -383,14 +383,10 @@ class AggregatedTextFrame(TextFrame):
Parameters:
aggregated_by: Method used to aggregate the text frames.
context_id: Unique identifier for the TTS context that generated this text.
raw_text: The full matched text including start/end pattern delimiters, set when
this frame was produced from a PatternMatch (e.g. a ``<code>...</code>`` block).
None for ordinary sentence aggregations.
"""
aggregated_by: AggregationType | str
context_id: str | None = None
raw_text: str | None = None
@dataclass

View File

@@ -25,7 +25,6 @@ from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.audio.vad.vad_analyzer import VADAnalyzer
from pipecat.audio.vad.vad_controller import VADController
from pipecat.frames.frames import (
AggregatedTextFrame,
AssistantImageRawFrame,
BotStartedSpeakingFrame,
BotStoppedSpeakingFrame,
@@ -1497,14 +1496,9 @@ class LLMAssistantAggregator(LLMContextAggregator):
if len(frame.text) == 0:
return
text = (
frame.raw_text
if isinstance(frame, AggregatedTextFrame) and frame.raw_text
else frame.text
)
self._aggregation.append(
TextPartForConcatenation(
text, includes_inter_part_spaces=frame.includes_inter_frame_spaces
frame.text, includes_inter_part_spaces=frame.includes_inter_frame_spaces
)
)

View File

@@ -23,7 +23,6 @@ from pipecat.frames.frames import (
)
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.utils.text.base_text_aggregator import BaseTextAggregator
from pipecat.utils.text.pattern_pair_aggregator import PatternMatch
from pipecat.utils.text.simple_text_aggregator import SimpleTextAggregator
@@ -86,11 +85,7 @@ class LLMTextProcessor(FrameProcessor):
out_frame = AggregatedTextFrame(
text=aggregation.text,
aggregated_by=aggregation.type,
raw_text=aggregation.full_match
if isinstance(aggregation, PatternMatch)
else aggregation.text,
)
out_frame.append_to_context = True
out_frame.skip_tts = in_frame.skip_tts
await self.push_frame(out_frame)
@@ -101,9 +96,6 @@ class LLMTextProcessor(FrameProcessor):
out_frame = AggregatedTextFrame(
text=remaining.text,
aggregated_by=remaining.type,
raw_text=remaining.full_match
if isinstance(remaining, PatternMatch)
else remaining.text,
)
out_frame.skip_tts = skip_tts
await self.push_frame(out_frame)

View File

@@ -22,7 +22,7 @@ try:
from langchain_core.runnables import Runnable
except ModuleNotFoundError as e:
logger.error("In order to use Langchain, you need to `pip install pipecat-ai[langchain]`. ")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class LangchainProcessor(FrameProcessor):

View File

@@ -528,9 +528,6 @@ class RTVIObserver(BaseObserver):
text = await transform(text, agg_type)
isTTS = isinstance(frame, TTSTextFrame)
if agg_type is not AggregationType.WORD:
logger.trace(f"{self} Aggregated LLM text: {text}, {agg_type} spoken:{isTTS}")
if self._params.bot_output_enabled:
message = RTVI.BotOutputMessage(
data=RTVI.BotOutputMessageData(text=text, spoken=isTTS, aggregated_by=agg_type)

View File

@@ -21,7 +21,7 @@ try:
from strands.multiagent.graph import Graph
except ModuleNotFoundError as e:
logger.error("In order to use Strands Agents, you need to `pip install strands-agents`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class StrandsAgentsProcessor(FrameProcessor):

View File

@@ -33,7 +33,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use GStreamer, you need to `pip install pipecat-ai[gstreamer]`. Also, you need to install GStreamer in your system."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class GStreamerPipelineSource(FrameProcessor):

View File

@@ -17,7 +17,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Sentry, you need to `pip install pipecat-ai[sentry]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
from pipecat.processors.metrics.frame_processor_metrics import FrameProcessorMetrics

View File

@@ -90,7 +90,6 @@ To run locally:
import argparse
import asyncio
import importlib.util
import mimetypes
import os
import sys
@@ -132,18 +131,6 @@ load_dotenv(override=True)
os.environ["ENV"] = "local"
TELEPHONY_TRANSPORTS = ["twilio", "telnyx", "plivo", "exotel"]
TRANSPORT_ROUTE_DEPENDENCIES = {
"daily": ("daily",),
"webrtc": ("aiortc",),
"telephony": ("fastapi", "websockets"),
"websocket": ("fastapi", "websockets"),
}
TRANSPORT_INSTALL_HINTS = {
"daily": "install pipecat-ai[daily]",
"webrtc": "install pipecat-ai[webrtc]",
"telephony": "install pipecat-ai[websocket]",
"websocket": "install pipecat-ai[websocket]",
}
# Mirror Pipecat Cloud's 4-hour max session limit so dev rooms get cleaned up.
PIPECAT_ROOM_EXP_HOURS = 4.0
@@ -169,120 +156,6 @@ Import this to add custom routes from other packages before calling
"""
def _is_module_available(module: str) -> bool:
"""Check whether a module can be imported without importing it.
Args:
module: Fully-qualified module name to check.
Returns:
``True`` if Python can resolve the module, ``False`` otherwise.
"""
try:
return importlib.util.find_spec(module) is not None
except (ImportError, ModuleNotFoundError, ValueError):
return False
def _transport_route_dependencies(transport: str) -> tuple[str, ...]:
"""Return module dependencies required for a transport route.
Args:
transport: Transport name from the runner request or CLI.
Returns:
Module names required to enable the transport route.
"""
if transport in TELEPHONY_TRANSPORTS:
return TRANSPORT_ROUTE_DEPENDENCIES["telephony"]
return TRANSPORT_ROUTE_DEPENDENCIES.get(transport, ())
def _transport_routes_enabled(transport: str) -> bool:
"""Return whether a transport route can run in this environment.
Args:
transport: Transport name from the runner request or CLI.
Returns:
``True`` if the requested transport is enabled.
"""
return all(_is_module_available(module) for module in _transport_route_dependencies(transport))
def _runner_url(args: argparse.Namespace) -> str:
"""Return the browser URL for the runner prebuilt client."""
return f"http://{args.host}:{args.port}"
def _transport_status_lists() -> tuple[list[str], list[str]]:
"""Return enabled and disabled transport labels for the startup banner."""
transports = ["daily", "webrtc", "telephony", "websocket"]
enabled = []
disabled = []
for label in transports:
if _transport_routes_enabled(label):
enabled.append(label)
else:
disabled.append(f"{label} ({TRANSPORT_INSTALL_HINTS[label]})")
return enabled, disabled
def _format_transport_status(labels: list[str]) -> str:
"""Format a startup banner transport status list."""
return ", ".join(labels) if labels else "none"
def _print_startup_message(args: argparse.Namespace):
"""Print connection information for the development runner."""
print()
if args.transport is None:
enabled, disabled = _transport_status_lists()
print("🚀 Bot ready!")
print(f" → Open: {_runner_url(args)}")
print(f" → Enabled transports: {_format_transport_status(enabled)}")
if disabled:
print(f" → Disabled transports: {_format_transport_status(disabled)}")
elif args.transport == "webrtc":
if args.esp32:
print("🚀 Bot ready! (ESP32 mode)")
elif args.whatsapp:
print("🚀 Bot ready! (WhatsApp)")
else:
print("🚀 Bot ready! (WebRTC)")
if _transport_routes_enabled("webrtc"):
print(f" → Open: {_runner_url(args)}")
else:
print(f" → WebRTC disabled ({TRANSPORT_INSTALL_HINTS['webrtc']})")
elif args.transport == "daily":
print("🚀 Bot ready! (Daily)")
if not _transport_routes_enabled("daily"):
print(f" → Daily disabled ({TRANSPORT_INSTALL_HINTS['daily']})")
else:
print(f" → Open: {_runner_url(args)}")
if args.dialin:
print(
f" → Daily dial-in webhook: "
f"http://{args.host}:{args.port}/daily-dialin-webhook"
)
print(" → Configure this URL in your Daily phone number settings")
elif args.transport in TELEPHONY_TRANSPORTS:
print(f"🚀 Bot ready! ({args.transport.capitalize()})")
if not _transport_routes_enabled(args.transport):
print(f" → Telephony disabled ({TRANSPORT_INSTALL_HINTS['telephony']})")
else:
print(f" → Open: {_runner_url(args)}")
if args.proxy:
print(f" → XML webhook: http://{args.host}:{args.port}/")
print(f" → WebSocket: ws://{args.host}:{args.port}/ws")
elif args.transport == "vonage":
print()
print("🚀 Bot ready!")
print()
def _get_bot_module():
"""Get the bot module from the calling script."""
import importlib.util
@@ -354,8 +227,6 @@ async def _run_websocket_bot(websocket: WebSocket, args: argparse.Namespace):
def _setup_websocket_routes(app: FastAPI, args: argparse.Namespace):
"""Set up the plain WebSocket route at ``/ws-client``."""
if not _transport_routes_enabled("websocket"):
return
@app.websocket("/ws-client")
async def websocket_client_endpoint(websocket: WebSocket):
@@ -467,15 +338,6 @@ def _setup_unified_start_route(
),
)
if not _transport_routes_enabled(transport):
raise HTTPException(
status_code=400,
detail=(
f"Transport '{transport}' is disabled in this runner environment. "
"Check the startup banner for enabled transports."
),
)
if transport == "webrtc":
# WebRTC: register the session; the bot starts when the WebRTC offer arrives.
session_id = str(uuid.uuid4())
@@ -609,9 +471,6 @@ def _setup_webrtc_routes(
app: FastAPI, args: argparse.Namespace, active_sessions: dict[str, dict[str, Any]]
):
"""Set up WebRTC-specific routes."""
if not _transport_routes_enabled("webrtc"):
return
try:
from pipecat.transports.smallwebrtc.connection import SmallWebRTCConnection
from pipecat.transports.smallwebrtc.request_handler import (
@@ -621,7 +480,7 @@ def _setup_webrtc_routes(
SmallWebRTCRequestHandler,
)
except ImportError as e:
logger.warning(f"WebRTC routes disabled after dependency check passed: {e}")
logger.error(f"WebRTC transport dependencies not installed: {e}")
return
@app.get("/files/{filename:path}")
@@ -906,8 +765,6 @@ def _setup_whatsapp_routes(app: FastAPI, args: argparse.Namespace):
def _setup_daily_routes(app: FastAPI, args: argparse.Namespace):
"""Set up Daily-specific routes."""
if not _transport_routes_enabled("daily"):
return
@app.get("/daily")
async def create_room_and_start_agent():
@@ -1051,9 +908,6 @@ def _setup_telephony_routes(app: FastAPI, args: argparse.Namespace):
specific telephony transport is chosen via ``-t`` because the XML template
is provider-specific and requires a proxy hostname (``--proxy``).
"""
if not _transport_routes_enabled("telephony"):
return
if args.transport in TELEPHONY_TRANSPORTS:
# XML response templates (Exotel doesn't use XML webhooks)
XML_TEMPLATES = {
@@ -1306,11 +1160,43 @@ def main(parser: argparse.ArgumentParser | None = None):
return
# Print startup message
_print_startup_message(args)
if args.transport == "vonage":
print()
if args.transport is None:
print("🚀 Bot ready!")
print(f" → WebRTC: http://{args.host}:{args.port}/client")
print(f" → Daily: http://{args.host}:{args.port}/daily")
print(f" → Telephony: ws://{args.host}:{args.port}/ws")
elif args.transport == "webrtc":
if args.esp32:
print("🚀 Bot ready! (ESP32 mode)")
elif args.whatsapp:
print("🚀 Bot ready! (WhatsApp)")
else:
print("🚀 Bot ready! (WebRTC)")
print(f" → Open http://{args.host}:{args.port}/client in your browser")
elif args.transport == "daily":
print("🚀 Bot ready! (Daily)")
if args.dialin:
print(
f" → Daily dial-in webhook: http://{args.host}:{args.port}/daily-dialin-webhook"
)
print(f" → Configure this URL in your Daily phone number settings")
else:
print(
f" → Open http://{args.host}:{args.port}/daily in your browser to start a session"
)
elif args.transport in TELEPHONY_TRANSPORTS:
print(f"🚀 Bot ready! ({args.transport.capitalize()})")
if args.proxy:
print(f" → XML webhook: http://{args.host}:{args.port}/")
print(f" → WebSocket: ws://{args.host}:{args.port}/ws")
elif args.transport == "vonage":
print()
print(f"🚀 Bot ready!")
asyncio.run(_run_vonage())
print()
return
print()
RUNNER_DOWNLOADS_FOLDER = args.folder
RUNNER_HOST = args.host

View File

@@ -48,7 +48,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Anthropic, you need to `pip install pipecat-ai[anthropic]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class AnthropicThinkingConfig(BaseModel):

View File

@@ -55,7 +55,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error('In order to use AssemblyAI, you need to `pip install "pipecat-ai[assemblyai]"`.')
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def map_language_from_assemblyai(language_code: str) -> Language:
@@ -586,9 +586,9 @@ class AssemblyAISTTService(WebsocketSTTService):
await self._call_event_handler("on_connected")
logger.debug(f"{self} Connected to AssemblyAI WebSocket")
except Exception as e:
self._websocket = None
self._connected = False
await self.push_error(error_msg=f"Unable to connect to AssemblyAI: {e}", exception=e)
raise
async def _disconnect_websocket(self):
"""Close the websocket connection to AssemblyAI."""

View File

@@ -39,7 +39,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Async, you need to `pip install pipecat-ai[asyncai]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_async_language(language: Language) -> str:

View File

@@ -49,7 +49,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use AWS services, you need to `pip install pipecat-ai[aws]`. Also, remember to set `AWS_SECRET_ACCESS_KEY`, `AWS_ACCESS_KEY_ID`, and `AWS_REGION` environment variable."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -81,7 +81,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use AWS services, you need to `pip install pipecat-ai[aws-nova-sonic]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class AWSNovaSonicUnhandledFunctionException(Exception):

View File

@@ -32,7 +32,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use SageMaker BiDi client, you need to `pip install pipecat-ai[sagemaker]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class SageMakerBidiClient:

View File

@@ -48,7 +48,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use AWS services, you need to `pip install pipecat-ai[aws]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass
@@ -339,10 +339,10 @@ class AWSTranscribeSTTService(WebsocketSTTService):
await self._call_event_handler("on_connected")
logger.info(f"{self} Successfully connected to AWS Transcribe")
except Exception as e:
self._websocket = None
await self.push_error(
error_msg=f"Unable to connect to AWS Transcribe: {e}", exception=e
)
raise
async def _disconnect_websocket(self):
"""Close the websocket connection to AWS Transcribe."""

View File

@@ -34,7 +34,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use AWS services, you need to `pip install pipecat-ai[aws]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_aws_language(language: Language) -> str:

View File

@@ -17,7 +17,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Azure Realtime, you need to `pip install pipecat-ai[openai]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -49,7 +49,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Azure, you need to `pip install pipecat-ai[azure]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -42,7 +42,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Azure, you need to `pip install pipecat-ai[azure]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def sample_rate_to_output_format(sample_rate: int) -> SpeechSynthesisOutputFormat:
@@ -540,25 +540,14 @@ class AzureTTSService(TTSService, AzureBaseTTSService):
self._last_timestamp = timestamp
async def _word_processor_task_handler(self):
"""Process word timestamps from the queue and call add_word_timestamps.
Also handles a None sentinel from _handle_completed: once all pending
words have been drained, it signals audio stream completion via
_audio_queue so that run_tts exits only after the last word has been
processed.
"""
"""Process word timestamps from the queue and call add_word_timestamps."""
while True:
try:
item = await self._word_boundary_queue.get()
if item is None:
# All words drained — now signal audio completion.
self._audio_queue.put_nowait(None)
else:
word, timestamp_seconds = item
if self._current_context_id:
await self.add_word_timestamps(
[(word, timestamp_seconds)], self._current_context_id
)
word, timestamp_seconds = await self._word_boundary_queue.get()
if self._current_context_id:
await self.add_word_timestamps(
[(word, timestamp_seconds)], self._current_context_id
)
self._word_boundary_queue.task_done()
except asyncio.CancelledError:
break
@@ -580,21 +569,17 @@ class AzureTTSService(TTSService, AzureBaseTTSService):
Args:
evt: Completion event from Azure Speech SDK.
"""
# Store duration for cumulative offset calculation
if evt.result and evt.result.audio_duration:
self._current_sentence_duration = evt.result.audio_duration.total_seconds()
# Flush any pending word before completing
if self._last_word is not None:
self._word_boundary_queue.put_nowait((self._last_word, self._last_timestamp))
self._last_word = None
self._last_timestamp = None
# Route completion through the word boundary queue so the word processor
# task drains all pending words before signaling audio stream completion.
# Without this, the last word's TTSTextFrame may arrive after
# TTSStoppedFrame, causing it to be missed by observers and the UI.
self._word_boundary_queue.put_nowait(None)
# Store duration for cumulative offset calculation
if evt.result and evt.result.audio_duration:
self._current_sentence_duration = evt.result.audio_duration.total_seconds()
self._audio_queue.put_nowait(None) # Signal completion
def _handle_canceled(self, evt):
"""Handle synthesis cancellation.

View File

@@ -42,7 +42,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Cartesia, you need to `pip install pipecat-ai[cartesia]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass
@@ -354,8 +354,7 @@ class CartesiaSTTService(WebsocketSTTService):
self._websocket = await websocket_connect(ws_url, additional_headers=headers)
await self._call_event_handler("on_connected")
except Exception as e:
self._websocket = None
await self.push_error(error_msg=f"Unable to connect to Cartesia: {e}", exception=e)
await self.push_error(error_msg=f"Unknown error occurred: {e}", exception=e)
async def _disconnect_websocket(self):
ws = self._websocket

View File

@@ -8,7 +8,6 @@
import base64
import json
import re
from collections.abc import AsyncGenerator
from dataclasses import dataclass, field
from enum import StrEnum
@@ -40,7 +39,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Cartesia, you need to `pip install pipecat-ai[cartesia]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class GenerationConfig(BaseModel):
@@ -432,20 +431,10 @@ class CartesiaTTSService(WebsocketTTSService):
base_lang = language.split("-")[0].lower()
return base_lang in {"zh", "ja"}
_CARTESIA_TAG_RE = re.compile(r"</?(?:spell|emotion|break|volume|speed)\b[^>]*>", re.IGNORECASE)
def _strip_cartesia_tags(self, text: str) -> str:
text = self._CARTESIA_TAG_RE.sub(" ", text)
text = re.sub(r"\s+", " ", text)
return text.strip()
def _normalize_word_timestamps(
def _process_word_timestamps_for_language(
self, words: list[str], starts: list[float]
) -> list[tuple[str, float]]:
"""Normalize raw word timestamps from Cartesia before further processing.
Strips Cartesia SSML tags (spell, emotion, break, volume, speed) from each word
and drops entries that become empty after stripping.
"""Process word timestamps based on the current language.
For Chinese and Japanese, Cartesia groups related characters in the same timestamp
message.
@@ -469,18 +458,14 @@ class CartesiaTTSService(WebsocketTTSService):
# For Chinese/Japanese, combine all characters in this message into one word
# using the first character's start time.
if words and starts:
combined_word = "".join(self._strip_cartesia_tags(w) for w in words)
combined_word = "".join(words)
first_start = starts[0]
return [(combined_word, first_start)] if combined_word else []
return [(combined_word, first_start)]
else:
return []
else:
result = []
for word, start in zip(words, starts):
cleaned = self._strip_cartesia_tags(word)
if cleaned:
result.append((cleaned, start))
return result
# For non-CJK languages, use as-is
return list(zip(words, starts))
def _word_timestamps_include_inter_frame_spaces(self) -> bool:
"""Whether timestamp text should be treated as carrying its own spacing."""
@@ -677,7 +662,7 @@ class CartesiaTTSService(WebsocketTTSService):
await self.remove_audio_context(ctx_id)
elif msg["type"] == "timestamps":
# Process the timestamps based on language before adding them
processed_timestamps = self._normalize_word_timestamps(
processed_timestamps = self._process_word_timestamps_for_language(
msg["word_timestamps"]["words"], msg["word_timestamps"]["start"]
)
await self.add_word_timestamps(

View File

@@ -31,7 +31,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Deepgram Flux, you need to `pip install pipecat-ai[deepgram]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
# Re-export for backward compatibility
__all__ = [

View File

@@ -48,7 +48,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Deepgram, you need to `pip install pipecat-ai[deepgram]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class LiveOptions:

View File

@@ -39,7 +39,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use DeepgramWebsocketTTSService, you need to `pip install pipecat-ai[deepgram]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -52,7 +52,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use ElevenLabs Realtime STT, you need to `pip install pipecat-ai[elevenlabs]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_elevenlabs_language(language: Language) -> str:
@@ -823,7 +823,6 @@ class ElevenLabsRealtimeSTTService(WebsocketSTTService):
await self._call_event_handler("on_connected")
logger.debug("Connected to ElevenLabs Realtime STT")
except Exception as e:
self._websocket = None
await self.push_error(
error_msg=f"Unable to connect to ElevenLabs Realtime STT: {e}", exception=e
)

View File

@@ -56,7 +56,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use ElevenLabs, you need to `pip install pipecat-ai[elevenlabs]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
# Models that support language codes
# The following models are excluded as they don't support language codes:
@@ -594,10 +594,6 @@ class ElevenLabsTTSService(WebsocketTTSService):
self._partial_word_start_time = 0.0
self._alignment_started_context_ids: set[str | None] = set()
# Context IDs whose context-init has been sent, so the keepalive knows
# which contexts are safe to target.
self._context_init_sent: set[str] = set()
# Context management for v1 multi API
self._receive_task = None
self._keepalive_task = None
@@ -796,7 +792,6 @@ class ElevenLabsTTSService(WebsocketTTSService):
finally:
await self.remove_active_audio_context()
self._websocket = None
self._context_init_sent.clear()
await self._call_event_handler("on_disconnected")
def _get_websocket(self):
@@ -827,7 +822,6 @@ class ElevenLabsTTSService(WebsocketTTSService):
self._partial_word = ""
self._partial_word_start_time = 0.0
self._alignment_started_context_ids.discard(context_id)
self._context_init_sent.discard(context_id)
async def on_audio_context_interrupted(self, context_id: str):
"""Close the ElevenLabs context when the bot is interrupted."""
@@ -920,35 +914,26 @@ class ElevenLabsTTSService(WebsocketTTSService):
while True:
await asyncio.sleep(KEEPALIVE_SLEEP)
try:
await self._send_keepalive()
if self._websocket and self._websocket.state is State.OPEN:
context_id = self.get_active_audio_context_id()
if context_id:
# Send keepalive with context ID to keep the connection alive
keepalive_message = {
"text": "",
"context_id": context_id,
}
logger.trace(f"Sending keepalive for context {context_id}")
else:
# It's possible to have a user interruption which clears the context
# without generating a new TTS response. In this case, we'll just send
# an empty message to keep the connection alive.
keepalive_message = {"text": ""}
logger.trace("Sending keepalive without context")
await self._websocket.send(json.dumps(keepalive_message))
except websockets.ConnectionClosed as e:
logger.warning(f"{self} keepalive error: {e}")
break
async def _send_keepalive(self):
"""Send a single keepalive message to keep the WebSocket connection alive.
Only stamps a ``context_id`` once its context-init (carrying
``voice_settings``) has been sent. Otherwise the keepalive would be the
context's first message, with no ``voice_settings``, and ElevenLabs would
reject the later context-init with a 1008 policy violation. A context-less
keepalive is sufficient until the context-init is sent.
"""
if not self._websocket or self._websocket.state is not State.OPEN:
return
context_id = self.get_active_audio_context_id()
if context_id and context_id in self._context_init_sent:
# The context's voice_settings context-init has been sent, so it's
# safe to keep that context alive.
keepalive_message = {"text": "", "context_id": context_id}
else:
# No active context, or the active context's context-init hasn't been
# sent yet. A context-less keepalive keeps the connection alive without
# opening the context prematurely.
keepalive_message = {"text": ""}
await self._websocket.send(json.dumps(keepalive_message))
async def _send_text(self, text: str, context_id: str):
"""Send text to the WebSocket for synthesis."""
if self._websocket and context_id:
@@ -995,9 +980,6 @@ class ElevenLabsTTSService(WebsocketTTSService):
locator.model_dump()
for locator in self._pronunciation_dictionary_locators
]
# Mark the context-init as sent so the keepalive may now
# target this context_id.
self._context_init_sent.add(context_id)
await self._websocket.send(json.dumps(msg))
logger.trace(f"Created new context {context_id}")

View File

@@ -38,7 +38,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Fish Audio, you need to `pip install pipecat-ai[fish]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
# FishAudio supports various output formats
FishAudioOutputFormat = Literal["opus", "mp3", "pcm", "wav"]

View File

@@ -53,7 +53,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Gladia, you need to `pip install pipecat-ai[gladia]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_gladia_language(language: Language) -> str:
@@ -558,9 +558,8 @@ class GladiaSTTService(WebsocketSTTService):
logger.debug(f"{self} Connected to Gladia WebSocket")
except Exception as e:
self._websocket = None
self._connection_active = False
await self.push_error(error_msg=f"Unable to connect to Gladia: {e}", exception=e)
raise
async def _disconnect_websocket(self):
"""Close the websocket connection to Gladia."""

View File

@@ -105,7 +105,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
# Connection management constants

View File

@@ -36,7 +36,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Google Vertex AI, you need to `pip install pipecat-ai[google]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -35,7 +35,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -65,7 +65,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class GoogleThinkingConfig(BaseModel):

View File

@@ -57,7 +57,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use Google AI, you need to `pip install pipecat-ai[google]`. Also, set `GOOGLE_APPLICATION_CREDENTIALS` environment variable."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_google_stt_language(language: Language) -> str:

View File

@@ -57,7 +57,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use Google AI, you need to `pip install pipecat-ai[google]`. Also, set `GOOGLE_APPLICATION_CREDENTIALS` environment variable."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_google_tts_language(language: Language) -> str:

View File

@@ -35,7 +35,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use Google AI, you need to `pip install pipecat-ai[google]`. Also, set `GOOGLE_APPLICATION_CREDENTIALS` environment variable."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -44,7 +44,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error('In order to use Gradium, you need to `pip install "pipecat-ai[gradium]"`.')
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
# Seconds to wait after a "flushed" message for trailing text tokens to arrive
# before finalizing the transcription.
@@ -423,8 +423,8 @@ class GradiumSTTService(WebsocketSTTService):
logger.debug("Connected to Gradium STT")
except Exception as e:
self._websocket = None
await self.push_error(error_msg=f"Unable to connect to Gradium: {e}", exception=e)
await self.push_error(error_msg=f"Unknown error occurred: {e}", exception=e)
raise
async def _disconnect(self):
await super()._disconnect()

View File

@@ -33,7 +33,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Gradium, you need to `pip install pipecat-ai[gradium]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
SAMPLE_RATE = 48000

View File

@@ -30,7 +30,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Groq, you need to `pip install pipecat-ai[groq]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
# Hint set for `output_format`. The values mirror the Literal that
# `groq.resources.audio.speech.AsyncSpeech.create` accepts on its

View File

@@ -46,7 +46,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use HeyGen, you need to `pip install pipecat-ai[heygen]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
HEY_GEN_SAMPLE_RATE = 24000

View File

@@ -38,7 +38,7 @@ try:
except ModuleNotFoundError as e: # pragma: no cover - import-time guidance
logger.error(f"Exception: {e}")
logger.error("In order to use Hume, you need to `pip install pipecat-ai[hume]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
HUME_SAMPLE_RATE = 48_000 # Hume TTS streams at 48 kHz

View File

@@ -1,124 +0,0 @@
#
# Copyright (c) 2024-2026, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
"""Inception LLM service implementation using OpenAI-compatible interface."""
from dataclasses import dataclass, field
from typing import Literal
from loguru import logger
from pipecat.adapters.services.open_ai_adapter import OpenAILLMInvocationParams
from pipecat.services.openai.base_llm import BaseOpenAILLMService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.settings import NOT_GIVEN as _NOT_GIVEN
from pipecat.services.settings import _NotGiven, is_given
@dataclass
class InceptionLLMSettings(BaseOpenAILLMService.Settings):
"""Settings for InceptionLLMService.
Parameters:
reasoning_effort: Controls how much reasoning the model applies.
One of "instant", "low", "medium", or "high". When unset, the
parameter is omitted and Inception's server-side default applies.
realtime: When True, reduces time to first diffusion block (TTFT).
"""
reasoning_effort: Literal["instant", "low", "medium", "high"] | None | _NotGiven = field(
default_factory=lambda: _NOT_GIVEN
)
realtime: bool | None | _NotGiven = field(default_factory=lambda: _NOT_GIVEN)
class InceptionLLMService(OpenAILLMService):
"""A service for interacting with Inception's API using the OpenAI-compatible interface.
This service extends OpenAILLMService to connect to Inception's API endpoint while
maintaining full compatibility with OpenAI's interface and functionality.
Supports Mercury-2, Inception's diffusion-based reasoning model.
"""
# Inception doesn't support the "developer" message role.
supports_developer_role = False
Settings = InceptionLLMSettings
_settings: Settings
def __init__(
self,
*,
api_key: str,
base_url: str = "https://api.inceptionlabs.ai/v1",
settings: Settings | None = None,
**kwargs,
):
"""Initialize the Inception LLM service.
Args:
api_key: The API key for accessing Inception's API.
base_url: The base URL for Inception API. Defaults to "https://api.inceptionlabs.ai/v1".
settings: Runtime-updatable settings.
**kwargs: Additional keyword arguments passed to OpenAILLMService.
"""
default_settings = self.Settings(
model="mercury-2",
reasoning_effort=None,
realtime=None,
)
if settings is not None:
default_settings.apply_update(settings)
super().__init__(api_key=api_key, base_url=base_url, settings=default_settings, **kwargs)
def create_client(self, api_key=None, base_url=None, **kwargs):
"""Create OpenAI-compatible client for Inception API endpoint.
Args:
api_key: The API key for authentication. If None, uses instance default.
base_url: The base URL for the API. If None, uses instance default.
**kwargs: Additional keyword arguments for client configuration.
Returns:
An OpenAI-compatible client configured for Inception's API.
"""
logger.debug(f"Creating Inception client with api {base_url}")
return super().create_client(api_key, base_url, **kwargs)
def build_chat_completion_params(self, params_from_context: OpenAILLMInvocationParams) -> dict:
"""Build parameters for Inception chat completion request.
Extends the base OpenAI parameters with Inception-specific options
such as reasoning_effort and realtime.
Args:
params_from_context: Parameters, derived from the LLM context, to
use for the chat completion. Contains messages, tools, and tool
choice.
Returns:
Dictionary of parameters for the chat completion request.
"""
params = super().build_chat_completion_params(params_from_context)
if (
is_given(self._settings.reasoning_effort)
and self._settings.reasoning_effort is not None
):
params["reasoning_effort"] = self._settings.reasoning_effort
# realtime is Inception-specific and unknown to the OpenAI SDK,
# so it must be passed via extra_body to avoid validation errors.
extra_body = {}
if is_given(self._settings.realtime) and self._settings.realtime is not None:
extra_body["realtime"] = self._settings.realtime
if extra_body:
params["extra_body"] = extra_body
return params

View File

@@ -68,7 +68,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Inworld Realtime, you need to `pip install pipecat-ai[inworld]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -43,7 +43,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Inworld WebSocket TTS, you need to `pip install websockets`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
from pipecat.frames.frames import (
AggregationType,

View File

@@ -32,7 +32,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Kokoro, you need to `pip install pipecat-ai[kokoro]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
KOKORO_CACHE_DIR = Path(os.path.expanduser("~/.cache/kokoro-onnx"))
KOKORO_MODEL_URL = "https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx"

View File

@@ -34,7 +34,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use LMNT, you need to `pip install pipecat-ai[lmnt]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_lmnt_language(language: Language) -> str:

View File

@@ -29,7 +29,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use an MCP client, you need to `pip install pipecat-ai[mcp]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
ServerParameters: TypeAlias = StdioServerParameters | SseServerParameters | StreamableHttpParameters

View File

@@ -28,7 +28,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use Mem0, you need to `pip install mem0ai`. Also, set the environment variable MEM0_API_KEY."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class Mem0MemoryService(FrameProcessor):

View File

@@ -48,7 +48,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Mistral STT, you need to `pip install pipecat-ai[mistral]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -31,7 +31,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Mistral TTS, you need to `pip install pipecat-ai[mistral]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -34,7 +34,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Moondream, you need to `pip install pipecat-ai[moondream]`.")
raise Exception(f"Missing module(s): {e}")
raise ImportError(f"Missing module(s): {e}") from e
def detect_device():

View File

@@ -41,7 +41,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Neuphonic, you need to `pip install pipecat-ai[neuphonic]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_neuphonic_lang_code(language: Language) -> str:

View File

@@ -46,7 +46,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use NVIDIA Nemotron Speech STT, you need to `pip install pipecat-ai[nvidia]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_nvidia_nemotron_speech_language(language: Language) -> str:

View File

@@ -55,7 +55,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use NVIDIA Nemotron Speech TTS, you need to `pip install pipecat-ai[nvidia]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -71,7 +71,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use OpenAI, you need to `pip install pipecat-ai[openai]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -59,7 +59,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use OpenAI, you need to `pip install pipecat-ai[openai]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
# ---------------------------------------------------------------------------

View File

@@ -30,7 +30,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Piper, you need to `pip install pipecat-ai[piper]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -33,7 +33,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Resemble AI, you need to `pip install pipecat-ai[resembleai]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -47,7 +47,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Rime, you need to `pip install pipecat-ai[rime]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_rime_language(language: Language) -> str:

View File

@@ -53,7 +53,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Sarvam, you need to `pip install pipecat-ai[sarvam]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_sarvam_language(language: Language) -> str:

View File

@@ -70,7 +70,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Sarvam, you need to `pip install pipecat-ai[sarvam]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class SarvamTTSModel(StrEnum):

View File

@@ -35,7 +35,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Simli, you need to `pip install pipecat-ai[simli]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

View File

@@ -48,7 +48,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Smallest, you need to `pip install pipecat-ai[smallest]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
def language_to_smallest_stt_language(language: Language) -> str:

View File

@@ -41,7 +41,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Smallest, you need to `pip install pipecat-ai[smallest]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
class SmallestTTSModel(StrEnum):

View File

@@ -39,7 +39,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Soniox, you need to `pip install pipecat-ai[soniox]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
KEEPALIVE_MESSAGE = '{"type": "keepalive"}'
@@ -155,6 +155,7 @@ def language_to_soniox_language(language: Language) -> str:
Language.ID: "id",
Language.IT: "it",
Language.JA: "ja",
Language.KA: "ka",
Language.KK: "kk",
Language.KN: "kn",
Language.KO: "ko",
@@ -231,7 +232,6 @@ class SonioxSTTSettings(STTSettings):
context_version 2.
enable_speaker_diarization: Whether to enable speaker diarization.
enable_language_identification: Whether to enable language identification.
max_endpoint_delay_ms: Max ms before endpoint detection finalizes the turn (500-3000).
client_reference_id: Client reference ID to use for transcription.
"""
@@ -242,7 +242,6 @@ class SonioxSTTSettings(STTSettings):
enable_language_identification: bool | None | _NotGiven = field(
default_factory=lambda: NOT_GIVEN
)
max_endpoint_delay_ms: int | None | _NotGiven = field(default_factory=lambda: NOT_GIVEN)
client_reference_id: str | None | _NotGiven = field(default_factory=lambda: NOT_GIVEN)
@@ -310,7 +309,6 @@ class SonioxSTTService(WebsocketSTTService):
context=None,
enable_speaker_diarization=False,
enable_language_identification=False,
max_endpoint_delay_ms=None,
client_reference_id=None,
)
@@ -392,7 +390,8 @@ class SonioxSTTService(WebsocketSTTService):
changed = await super()._update_settings(delta)
if changed:
await self._request_reconnect()
await self._disconnect()
await self._connect()
return changed
@@ -523,7 +522,6 @@ class SonioxSTTService(WebsocketSTTService):
"audio_format": self._audio_format,
"num_channels": self._num_channels,
"enable_endpoint_detection": enable_endpoint_detection,
"max_endpoint_delay_ms": s.max_endpoint_delay_ms,
"sample_rate": self.sample_rate,
"language_hints": _prepare_language_hints(assert_given(s.language_hints)),
"language_hints_strict": s.language_hints_strict,
@@ -539,8 +537,8 @@ class SonioxSTTService(WebsocketSTTService):
await self._call_event_handler("on_connected")
logger.debug("Connected to Soniox STT")
except Exception as e:
self._websocket = None
await self.push_error(error_msg=f"Unable to connect to Soniox: {e}", exception=e)
raise
async def _disconnect_websocket(self):
"""Close the websocket connection to Soniox."""

View File

@@ -44,7 +44,7 @@ try:
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Soniox, you need to `pip install pipecat-ai[soniox]`.")
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
# Soniox idle timeout is 20-30s; keepalive cadence must stay well inside it.

View File

@@ -61,7 +61,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use Speechmatics, you need to `pip install pipecat-ai[speechmatics]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
load_dotenv()

View File

@@ -32,7 +32,7 @@ except ModuleNotFoundError as e:
logger.error(
"In order to use Speechmatics, you need to `pip install pipecat-ai[speechmatics]`."
)
raise Exception(f"Missing module: {e}")
raise ImportError(f"Missing module: {e}") from e
@dataclass

Some files were not shown because too many files have changed in this diff Show More