Compare commits
2 Commits
main
...
mark/missi
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
6fca53c31d | ||
|
|
e1f3b4fdbe |
@@ -1,91 +0,0 @@
|
||||
---
|
||||
name: squash-commits
|
||||
description: Reorganize messy branch commits into a small set of logical, meaningful commits without changing any content. Drops merge-from-main commits. Safe: creates a backup branch first.
|
||||
---
|
||||
|
||||
Reorganize the commits on the current branch into a small number of logical commits. Do NOT change any file content — only the commit structure changes.
|
||||
|
||||
## Instructions
|
||||
|
||||
### 1. Safety check
|
||||
|
||||
```bash
|
||||
git status --short
|
||||
```
|
||||
|
||||
If there are uncommitted changes, stop and tell the user to commit or stash them first.
|
||||
|
||||
### 2. Inspect the branch
|
||||
|
||||
```bash
|
||||
git log main..HEAD --oneline
|
||||
git diff main..HEAD --name-only
|
||||
```
|
||||
|
||||
List every file changed vs `main` and every commit on the branch (excluding merge commits from main).
|
||||
|
||||
### 3. Create a backup branch
|
||||
|
||||
```bash
|
||||
git branch backup/<current-branch-name>
|
||||
```
|
||||
|
||||
Tell the user the backup exists so they can recover if needed.
|
||||
|
||||
### 4. Soft-reset to main and unstage everything
|
||||
|
||||
```bash
|
||||
git reset --soft main
|
||||
git restore --staged .
|
||||
```
|
||||
|
||||
All branch changes are now in the working tree, unstaged. No content has changed.
|
||||
|
||||
### 5. Plan the logical groups
|
||||
|
||||
Read the changed files and the original commit messages to understand what the work covers. Group related files into logical commits. Typical groups:
|
||||
|
||||
- Core feature or fix (new source files + modified core files)
|
||||
- Secondary features or fixes (each as its own commit if distinct)
|
||||
- Refactoring or renames
|
||||
- Tests
|
||||
- Changelogs / docs
|
||||
|
||||
Use the changelog files (if any) as a strong hint — each changelog entry often maps to one commit.
|
||||
|
||||
Present the proposed grouping to the user and ask for confirmation before committing.
|
||||
|
||||
### 6. Commit in logical groups
|
||||
|
||||
For each group, stage only the relevant files and commit with a clear message following the project's conventions:
|
||||
|
||||
```bash
|
||||
git add <file1> <file2> ...
|
||||
git commit -m "..."
|
||||
```
|
||||
|
||||
Use conventional commit prefixes if the project uses them (`feat:`, `fix:`, `refactor:`, `test:`, `chore:`).
|
||||
|
||||
### 7. Verify
|
||||
|
||||
```bash
|
||||
git log main..HEAD --oneline
|
||||
git diff main..HEAD --name-only
|
||||
git status --short
|
||||
```
|
||||
|
||||
Confirm:
|
||||
- Commit count is small and each message is meaningful
|
||||
- The set of changed files vs `main` is identical to before
|
||||
- Working tree is clean
|
||||
|
||||
### 8. Remind about force-push
|
||||
|
||||
The branch history has been rewritten. Tell the user they will need to `git push --force-with-lease` when they are ready to update the remote. Do NOT push automatically.
|
||||
|
||||
## Rules
|
||||
|
||||
- Never change file contents. If you find yourself editing a file, stop.
|
||||
- Never skip the backup branch step.
|
||||
- Never force-push without explicit user instruction.
|
||||
- If any step fails or the result looks wrong, tell the user and suggest restoring from the backup: `git reset --hard backup/<branch-name>`.
|
||||
@@ -92,7 +92,7 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
|
||||
| Category | Services |
|
||||
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/api-reference/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/api-reference/server/services/stt/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/api-reference/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/api-reference/server/services/stt/gladia), [Google](https://docs.pipecat.ai/api-reference/server/services/stt/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/stt/mistral), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/stt/nvidia), [OpenAI (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/api-reference/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/api-reference/server/services/stt/whisper), [xAI](https://docs.pipecat.ai/api-reference/server/services/stt/xai) |
|
||||
| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Inception](https://docs.pipecat.ai/api-reference/server/services/llm/inception), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together) |
|
||||
| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together) |
|
||||
| Text-to-Speech | [Async](https://docs.pipecat.ai/api-reference/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/api-reference/server/services/tts/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/api-reference/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/api-reference/server/services/tts/fish), [Google](https://docs.pipecat.ai/api-reference/server/services/tts/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/api-reference/server/services/tts/groq), [Hume](https://docs.pipecat.ai/api-reference/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/api-reference/server/services/tts/inworld), [Kokoro](https://docs.pipecat.ai/api-reference/server/services/tts/kokoro), [LMNT](https://docs.pipecat.ai/api-reference/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/api-reference/server/services/tts/minimax), [Mistral](https://docs.pipecat.ai/api-reference/server/services/tts/mistral), [Neuphonic](https://docs.pipecat.ai/api-reference/server/services/tts/neuphonic), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/tts/nvidia), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/tts/openai), [Piper](https://docs.pipecat.ai/api-reference/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/api-reference/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/api-reference/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/api-reference/server/services/tts/smallest), [Soniox](https://docs.pipecat.ai/api-reference/server/services/tts/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/api-reference/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/api-reference/server/services/tts/xtts) |
|
||||
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/api-reference/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/api-reference/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/api-reference/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/api-reference/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/api-reference/server/services/s2s/ultravox), |
|
||||
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/api-reference/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/api-reference/server/services/transport/small-webrtc), [Vonage (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/vonage), [WebSocket Server](https://docs.pipecat.ai/api-reference/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/api-reference/server/services/transport/whatsapp), Local |
|
||||
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed Azure TTS last word being missed by observers and RTVI UI. The completion signal was racing with word timestamp processing, causing the final word's `TTSTextFrame` to arrive after `TTSStoppedFrame`. Completion is now routed through the word boundary queue to ensure all words are processed before signaling stream end.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed `BaseOutputTransport` reordering frames that share the same presentation timestamp. Frames with equal PTS values are now emitted in insertion order, preventing subtle audio/text sequencing bugs when multiple frames arrive at the same time.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed Cartesia word timestamps leaking SSML tag text (e.g. `<spell>`, `<emotion>`, `<break>`) into word entries. Tags are now stripped before processing, so word-to-text attribution remains accurate when SSML markup is present in the TTS input.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed `TTSTextFrame` entries losing their original text structure when word timestamps are enabled. Each `TTSTextFrame` now carries a `raw_text` field containing the corresponding span of the original LLM-produced text (including pattern delimiters such as `<card>4111 1111 1111 1111</card>`), so the assistant context receives properly-tagged content rather than the cleaned words returned by the TTS provider. Also handles words that straddle two sentence boundaries by splitting them and attributing each part to its correct source frame.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed skipped TTS frames (e.g. code blocks filtered via `skip_aggregator_types`) being emitted to the assistant context immediately instead of waiting for preceding spoken frames to finish. They now hold their position in the frame sequence and are flushed only after all earlier spoken sentences are complete, keeping context ordering correct.
|
||||
@@ -1 +0,0 @@
|
||||
- Added `InceptionLLMService` for Inception's Mercury 2 diffusion reasoning model, with support for `reasoning_effort` and `realtime` settings.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed websocket STT connection setup failures so services clear stale websocket state and emit non-fatal error frames, allowing `ServiceSwitcher` failover to keep agents running.
|
||||
@@ -1 +0,0 @@
|
||||
- Added `max_endpoint_delay_ms` to `SonioxSTTService.Settings`, controlling the maximum delay (500-3000 ms) before endpoint detection finalizes a turn.
|
||||
@@ -1 +0,0 @@
|
||||
- `SonioxSTTService` now applies settings updates (e.g. via `STTUpdateSettingsFrame`) using a graceful reconnect instead of a hard disconnect/reconnect, preserving the service's reconnect retry behavior.
|
||||
@@ -1 +0,0 @@
|
||||
- Removed the unsupported Georgian (`Language.KA`) language mapping from `SonioxSTTService`.
|
||||
@@ -1 +0,0 @@
|
||||
- Updated the default p99 TTFS latency values for Smallest AI, Mistral, and XAI STT so turn stop timing uses measured values instead of the conservative fallback.
|
||||
@@ -1 +0,0 @@
|
||||
- Updated the development runner startup banner to show the prebuilt client URL once and list enabled or disabled transports with install hints.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed the development runner so missing optional transport dependencies disable only their related routes instead of failing startup in all-transport mode.
|
||||
1
changelog/4525.changed.md
Normal file
1
changelog/4525.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Services and transports with missing optional dependencies now raise `ImportError` instead of a bare `Exception` when their module is imported without the required extra installed. The original `ModuleNotFoundError` is preserved as `__cause__`, so code that wraps these imports can now use `except ImportError:` cleanly instead of `except Exception:`.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed a race in `ElevenLabsTTSService` where the periodic keepalive could be sent for a new turn's context before that context's `voice_settings` initialization message, causing ElevenLabs to close the WebSocket with a 1008 policy violation (`voice_settings field must be provided in the first message ...`). The keepalive now only targets a context once its context-init has been sent.
|
||||
@@ -1 +0,0 @@
|
||||
- Bumped `pipecat-ai-prebuilt` to 1.0.1 in the `runner` extra, updating the prebuilt client UI served by the development runner.
|
||||
@@ -91,9 +91,6 @@ HEYGEN_LIVE_AVATAR_API_KEY=...
|
||||
HUME_API_KEY=...
|
||||
HUME_VOICE_ID=...
|
||||
|
||||
# Inception
|
||||
INCEPTION_API_KEY=...
|
||||
|
||||
# Inworld
|
||||
INWORLD_API_KEY=...
|
||||
|
||||
|
||||
@@ -1,177 +0,0 @@
|
||||
#
|
||||
# Copyright (c) 2024-2026, Daily
|
||||
#
|
||||
# SPDX-License-Identifier: BSD 2-Clause License
|
||||
#
|
||||
|
||||
|
||||
import os
|
||||
|
||||
from dotenv import load_dotenv
|
||||
from loguru import logger
|
||||
|
||||
from pipecat.adapters.schemas.function_schema import FunctionSchema
|
||||
from pipecat.adapters.schemas.tools_schema import ToolsSchema
|
||||
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
||||
from pipecat.frames.frames import LLMRunFrame, TTSSpeakFrame
|
||||
from pipecat.pipeline.pipeline import Pipeline
|
||||
from pipecat.pipeline.runner import PipelineRunner
|
||||
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
||||
from pipecat.processors.aggregators.llm_context import LLMContext
|
||||
from pipecat.processors.aggregators.llm_response_universal import (
|
||||
LLMContextAggregatorPair,
|
||||
LLMUserAggregatorParams,
|
||||
)
|
||||
from pipecat.runner.types import RunnerArguments
|
||||
from pipecat.runner.utils import create_transport
|
||||
from pipecat.services.cartesia.tts import CartesiaTTSService
|
||||
from pipecat.services.deepgram.stt import DeepgramSTTService
|
||||
from pipecat.services.inception.llm import InceptionLLMService
|
||||
from pipecat.services.llm_service import FunctionCallParams
|
||||
from pipecat.transports.base_transport import BaseTransport, TransportParams
|
||||
from pipecat.transports.daily.transport import DailyParams
|
||||
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
|
||||
|
||||
load_dotenv(override=True)
|
||||
|
||||
|
||||
async def fetch_weather_from_api(params: FunctionCallParams):
|
||||
await params.result_callback({"conditions": "nice", "temperature": "75"})
|
||||
|
||||
|
||||
async def fetch_restaurant_recommendation(params: FunctionCallParams):
|
||||
await params.result_callback({"name": "The Golden Dragon"})
|
||||
|
||||
|
||||
# We use lambdas to defer transport parameter creation until the transport
|
||||
# type is selected at runtime.
|
||||
transport_params = {
|
||||
"daily": lambda: DailyParams(
|
||||
audio_in_enabled=True,
|
||||
audio_out_enabled=True,
|
||||
),
|
||||
"twilio": lambda: FastAPIWebsocketParams(
|
||||
audio_in_enabled=True,
|
||||
audio_out_enabled=True,
|
||||
),
|
||||
"webrtc": lambda: TransportParams(
|
||||
audio_in_enabled=True,
|
||||
audio_out_enabled=True,
|
||||
),
|
||||
}
|
||||
|
||||
|
||||
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
|
||||
logger.info(f"Starting bot")
|
||||
|
||||
stt = DeepgramSTTService(api_key=os.environ["DEEPGRAM_API_KEY"])
|
||||
|
||||
tts = CartesiaTTSService(
|
||||
api_key=os.environ["CARTESIA_API_KEY"],
|
||||
settings=CartesiaTTSService.Settings(
|
||||
voice="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
|
||||
),
|
||||
)
|
||||
|
||||
llm = InceptionLLMService(
|
||||
api_key=os.environ["INCEPTION_API_KEY"],
|
||||
settings=InceptionLLMService.Settings(
|
||||
reasoning_effort="instant",
|
||||
system_instruction="You are a helpful assistant in a voice conversation. Your responses will be spoken aloud, so avoid emojis, bullet points, or other formatting that can't be spoken. Respond to what the user said in a creative, helpful, and brief way.",
|
||||
),
|
||||
)
|
||||
# You can also register a function_name of None to get all functions
|
||||
# sent to the same callback with an additional function_name parameter.
|
||||
llm.register_function("get_current_weather", fetch_weather_from_api)
|
||||
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
|
||||
|
||||
@llm.event_handler("on_function_calls_started")
|
||||
async def on_function_calls_started(service, function_calls):
|
||||
await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
|
||||
|
||||
weather_function = FunctionSchema(
|
||||
name="get_current_weather",
|
||||
description="Get the current weather",
|
||||
properties={
|
||||
"location": {
|
||||
"type": "string",
|
||||
"description": "The city and state, e.g. San Francisco, CA",
|
||||
},
|
||||
"format": {
|
||||
"type": "string",
|
||||
"enum": ["celsius", "fahrenheit"],
|
||||
"description": "The temperature unit to use. Infer this from the user's location.",
|
||||
},
|
||||
},
|
||||
required=["location", "format"],
|
||||
)
|
||||
|
||||
restaurant_function = FunctionSchema(
|
||||
name="get_restaurant_recommendation",
|
||||
description="Get a restaurant recommendation",
|
||||
properties={
|
||||
"location": {
|
||||
"type": "string",
|
||||
"description": "The city and state, e.g. San Francisco, CA",
|
||||
},
|
||||
},
|
||||
required=["location"],
|
||||
)
|
||||
tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
|
||||
|
||||
context = LLMContext(tools=tools)
|
||||
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
|
||||
context,
|
||||
user_params=LLMUserAggregatorParams(vad_analyzer=SileroVADAnalyzer()),
|
||||
)
|
||||
|
||||
pipeline = Pipeline(
|
||||
[
|
||||
transport.input(),
|
||||
stt,
|
||||
user_aggregator,
|
||||
llm,
|
||||
tts,
|
||||
transport.output(),
|
||||
assistant_aggregator,
|
||||
]
|
||||
)
|
||||
|
||||
task = PipelineTask(
|
||||
pipeline,
|
||||
params=PipelineParams(
|
||||
enable_metrics=True,
|
||||
enable_usage_metrics=True,
|
||||
),
|
||||
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
|
||||
)
|
||||
|
||||
@transport.event_handler("on_client_connected")
|
||||
async def on_client_connected(transport, client):
|
||||
logger.info(f"Client connected")
|
||||
# Kick off the conversation.
|
||||
context.add_message(
|
||||
{"role": "developer", "content": "Please introduce yourself to the user."}
|
||||
)
|
||||
await task.queue_frames([LLMRunFrame()])
|
||||
|
||||
@transport.event_handler("on_client_disconnected")
|
||||
async def on_client_disconnected(transport, client):
|
||||
logger.info(f"Client disconnected")
|
||||
await task.cancel()
|
||||
|
||||
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
|
||||
|
||||
await runner.run(task)
|
||||
|
||||
|
||||
async def bot(runner_args: RunnerArguments):
|
||||
"""Main bot entry point compatible with Pipecat Cloud."""
|
||||
transport = await create_transport(runner_args, transport_params)
|
||||
await run_bot(transport, runner_args)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
from pipecat.runner.run import main
|
||||
|
||||
main()
|
||||
@@ -22,9 +22,9 @@ from pipecat.processors.aggregators.llm_response_universal import (
|
||||
)
|
||||
from pipecat.runner.types import RunnerArguments
|
||||
from pipecat.runner.utils import create_transport
|
||||
from pipecat.services.cartesia.tts import CartesiaTTSService
|
||||
from pipecat.services.openai.llm import OpenAILLMService
|
||||
from pipecat.services.soniox.stt import SonioxSTTService
|
||||
from pipecat.services.soniox.tts import SonioxTTSService
|
||||
from pipecat.transcriptions.language import Language
|
||||
from pipecat.transports.base_transport import BaseTransport, TransportParams
|
||||
from pipecat.transports.daily.transport import DailyParams
|
||||
@@ -53,7 +53,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
|
||||
|
||||
stt = SonioxSTTService(api_key=os.environ["SONIOX_API_KEY"])
|
||||
|
||||
tts = SonioxTTSService(api_key=os.environ["SONIOX_API_KEY"])
|
||||
tts = CartesiaTTSService(
|
||||
api_key=os.environ["CARTESIA_API_KEY"],
|
||||
settings=CartesiaTTSService.Settings(
|
||||
voice="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
|
||||
),
|
||||
)
|
||||
|
||||
llm = OpenAILLMService(
|
||||
api_key=os.environ["OPENAI_API_KEY"],
|
||||
@@ -98,9 +103,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
|
||||
await task.queue_frames([LLMRunFrame()])
|
||||
|
||||
await asyncio.sleep(10)
|
||||
logger.info("Updating Soniox STT settings: language_hints=[es]")
|
||||
logger.info("Updating Soniox STT settings: language=es")
|
||||
await task.queue_frame(
|
||||
STTUpdateSettingsFrame(delta=SonioxSTTService.Settings(language_hints=[Language.ES]))
|
||||
STTUpdateSettingsFrame(delta=SonioxSTTService.Settings(language=Language.ES))
|
||||
)
|
||||
|
||||
@transport.event_handler("on_client_disconnected")
|
||||
|
||||
@@ -77,7 +77,6 @@ groq = [ "groq>=0.23.0,<2" ]
|
||||
gstreamer = [ "pygobject~=3.50.0" ]
|
||||
heygen = [ "livekit>=1.0.13,<2", "pipecat-ai[websockets-base]" ]
|
||||
hume = [ "hume>=0.11.2,<1" ]
|
||||
inception = []
|
||||
inworld = [ "pipecat-ai[websockets-base]" ]
|
||||
koala = [ "pvkoala~=2.0.3" ]
|
||||
kokoro = [ "kokoro-onnx>=0.5.0,<1", "requests>=2.32.5,<3" ]
|
||||
@@ -104,7 +103,7 @@ piper = [ "piper-tts>=1.3.0,<2", "requests>=2.32.5,<3" ]
|
||||
qwen = []
|
||||
resembleai = [ "pipecat-ai[websockets-base]" ]
|
||||
rime = [ "pipecat-ai[websockets-base]" ]
|
||||
runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<1", "pipecat-ai-prebuilt>=1.0.1"]
|
||||
runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<1", "pipecat-ai-prebuilt>=1.0.0"]
|
||||
sagemaker = ["aws_sdk_sagemaker_runtime_http2; python_version>='3.12'"]
|
||||
sambanova = []
|
||||
sarvam = [ "sarvamai==0.1.28", "pipecat-ai[websockets-base]" ]
|
||||
|
||||
@@ -198,7 +198,6 @@ TESTS_FUNCTION_CALLING = [
|
||||
("function-calling/function-calling-sarvam.py", EVAL_WEATHER),
|
||||
("function-calling/function-calling-novita.py", EVAL_WEATHER),
|
||||
("function-calling/function-calling-deepseek.py", EVAL_WEATHER),
|
||||
("function-calling/function-calling-inception.py", EVAL_WEATHER),
|
||||
# Video
|
||||
("function-calling/function-calling-anthropic-video.py", EVAL_VISION_CAMERA),
|
||||
("function-calling/function-calling-aws-video.py", EVAL_VISION_CAMERA),
|
||||
|
||||
@@ -28,7 +28,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class GeminiLLMInvocationParams(TypedDict):
|
||||
|
||||
@@ -23,7 +23,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use the Koala filter, you need to `pip install pipecat-ai[koala]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class KoalaFilter(BaseAudioFilter):
|
||||
|
||||
@@ -27,7 +27,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use KrispVivaFilter, you need to install krisp_audio.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class KrispVivaFilter(BaseAudioFilter):
|
||||
|
||||
@@ -28,7 +28,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use the soundfile mixer, you need to `pip install pipecat-ai[soundfile]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class SoundfileMixer(BaseAudioMixer):
|
||||
|
||||
@@ -27,7 +27,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use the LocalSmartTurnAnalyzer, you need to `pip install pipecat-ai[local-smart-turn]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class LocalCoreMLSmartTurnAnalyzer(BaseSmartTurn):
|
||||
|
||||
@@ -33,7 +33,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use LocalSmartTurnAnalyzerV2, you need to `pip install pipecat-ai[local-smart-turn]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class LocalSmartTurnAnalyzerV2(BaseSmartTurn):
|
||||
|
||||
@@ -28,7 +28,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use KrispVivaVADAnalyzer, you need to install krisp_audio.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class KrispVivaVadAnalyzer(VADAnalyzer):
|
||||
|
||||
@@ -27,7 +27,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Silero VAD, you need to `pip install pipecat-ai`.")
|
||||
raise Exception(f"Missing module(s): {e}")
|
||||
raise ImportError(f"Missing module(s): {e}") from e
|
||||
|
||||
|
||||
class SileroOnnxModel:
|
||||
|
||||
@@ -383,14 +383,10 @@ class AggregatedTextFrame(TextFrame):
|
||||
Parameters:
|
||||
aggregated_by: Method used to aggregate the text frames.
|
||||
context_id: Unique identifier for the TTS context that generated this text.
|
||||
raw_text: The full matched text including start/end pattern delimiters, set when
|
||||
this frame was produced from a PatternMatch (e.g. a ``<code>...</code>`` block).
|
||||
None for ordinary sentence aggregations.
|
||||
"""
|
||||
|
||||
aggregated_by: AggregationType | str
|
||||
context_id: str | None = None
|
||||
raw_text: str | None = None
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -25,7 +25,6 @@ from pipecat.adapters.schemas.tools_schema import ToolsSchema
|
||||
from pipecat.audio.vad.vad_analyzer import VADAnalyzer
|
||||
from pipecat.audio.vad.vad_controller import VADController
|
||||
from pipecat.frames.frames import (
|
||||
AggregatedTextFrame,
|
||||
AssistantImageRawFrame,
|
||||
BotStartedSpeakingFrame,
|
||||
BotStoppedSpeakingFrame,
|
||||
@@ -1497,14 +1496,9 @@ class LLMAssistantAggregator(LLMContextAggregator):
|
||||
if len(frame.text) == 0:
|
||||
return
|
||||
|
||||
text = (
|
||||
frame.raw_text
|
||||
if isinstance(frame, AggregatedTextFrame) and frame.raw_text
|
||||
else frame.text
|
||||
)
|
||||
self._aggregation.append(
|
||||
TextPartForConcatenation(
|
||||
text, includes_inter_part_spaces=frame.includes_inter_frame_spaces
|
||||
frame.text, includes_inter_part_spaces=frame.includes_inter_frame_spaces
|
||||
)
|
||||
)
|
||||
|
||||
|
||||
@@ -23,7 +23,6 @@ from pipecat.frames.frames import (
|
||||
)
|
||||
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
|
||||
from pipecat.utils.text.base_text_aggregator import BaseTextAggregator
|
||||
from pipecat.utils.text.pattern_pair_aggregator import PatternMatch
|
||||
from pipecat.utils.text.simple_text_aggregator import SimpleTextAggregator
|
||||
|
||||
|
||||
@@ -86,11 +85,7 @@ class LLMTextProcessor(FrameProcessor):
|
||||
out_frame = AggregatedTextFrame(
|
||||
text=aggregation.text,
|
||||
aggregated_by=aggregation.type,
|
||||
raw_text=aggregation.full_match
|
||||
if isinstance(aggregation, PatternMatch)
|
||||
else aggregation.text,
|
||||
)
|
||||
out_frame.append_to_context = True
|
||||
out_frame.skip_tts = in_frame.skip_tts
|
||||
await self.push_frame(out_frame)
|
||||
|
||||
@@ -101,9 +96,6 @@ class LLMTextProcessor(FrameProcessor):
|
||||
out_frame = AggregatedTextFrame(
|
||||
text=remaining.text,
|
||||
aggregated_by=remaining.type,
|
||||
raw_text=remaining.full_match
|
||||
if isinstance(remaining, PatternMatch)
|
||||
else remaining.text,
|
||||
)
|
||||
out_frame.skip_tts = skip_tts
|
||||
await self.push_frame(out_frame)
|
||||
|
||||
@@ -22,7 +22,7 @@ try:
|
||||
from langchain_core.runnables import Runnable
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error("In order to use Langchain, you need to `pip install pipecat-ai[langchain]`. ")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class LangchainProcessor(FrameProcessor):
|
||||
|
||||
@@ -528,9 +528,6 @@ class RTVIObserver(BaseObserver):
|
||||
text = await transform(text, agg_type)
|
||||
|
||||
isTTS = isinstance(frame, TTSTextFrame)
|
||||
if agg_type is not AggregationType.WORD:
|
||||
logger.trace(f"{self} Aggregated LLM text: {text}, {agg_type} spoken:{isTTS}")
|
||||
|
||||
if self._params.bot_output_enabled:
|
||||
message = RTVI.BotOutputMessage(
|
||||
data=RTVI.BotOutputMessageData(text=text, spoken=isTTS, aggregated_by=agg_type)
|
||||
|
||||
@@ -21,7 +21,7 @@ try:
|
||||
from strands.multiagent.graph import Graph
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error("In order to use Strands Agents, you need to `pip install strands-agents`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class StrandsAgentsProcessor(FrameProcessor):
|
||||
|
||||
@@ -33,7 +33,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use GStreamer, you need to `pip install pipecat-ai[gstreamer]`. Also, you need to install GStreamer in your system."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class GStreamerPipelineSource(FrameProcessor):
|
||||
|
||||
@@ -17,7 +17,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Sentry, you need to `pip install pipecat-ai[sentry]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
from pipecat.processors.metrics.frame_processor_metrics import FrameProcessorMetrics
|
||||
|
||||
|
||||
@@ -90,7 +90,6 @@ To run locally:
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import importlib.util
|
||||
import mimetypes
|
||||
import os
|
||||
import sys
|
||||
@@ -132,18 +131,6 @@ load_dotenv(override=True)
|
||||
os.environ["ENV"] = "local"
|
||||
|
||||
TELEPHONY_TRANSPORTS = ["twilio", "telnyx", "plivo", "exotel"]
|
||||
TRANSPORT_ROUTE_DEPENDENCIES = {
|
||||
"daily": ("daily",),
|
||||
"webrtc": ("aiortc",),
|
||||
"telephony": ("fastapi", "websockets"),
|
||||
"websocket": ("fastapi", "websockets"),
|
||||
}
|
||||
TRANSPORT_INSTALL_HINTS = {
|
||||
"daily": "install pipecat-ai[daily]",
|
||||
"webrtc": "install pipecat-ai[webrtc]",
|
||||
"telephony": "install pipecat-ai[websocket]",
|
||||
"websocket": "install pipecat-ai[websocket]",
|
||||
}
|
||||
|
||||
# Mirror Pipecat Cloud's 4-hour max session limit so dev rooms get cleaned up.
|
||||
PIPECAT_ROOM_EXP_HOURS = 4.0
|
||||
@@ -169,120 +156,6 @@ Import this to add custom routes from other packages before calling
|
||||
"""
|
||||
|
||||
|
||||
def _is_module_available(module: str) -> bool:
|
||||
"""Check whether a module can be imported without importing it.
|
||||
|
||||
Args:
|
||||
module: Fully-qualified module name to check.
|
||||
|
||||
Returns:
|
||||
``True`` if Python can resolve the module, ``False`` otherwise.
|
||||
"""
|
||||
try:
|
||||
return importlib.util.find_spec(module) is not None
|
||||
except (ImportError, ModuleNotFoundError, ValueError):
|
||||
return False
|
||||
|
||||
|
||||
def _transport_route_dependencies(transport: str) -> tuple[str, ...]:
|
||||
"""Return module dependencies required for a transport route.
|
||||
|
||||
Args:
|
||||
transport: Transport name from the runner request or CLI.
|
||||
|
||||
Returns:
|
||||
Module names required to enable the transport route.
|
||||
"""
|
||||
if transport in TELEPHONY_TRANSPORTS:
|
||||
return TRANSPORT_ROUTE_DEPENDENCIES["telephony"]
|
||||
return TRANSPORT_ROUTE_DEPENDENCIES.get(transport, ())
|
||||
|
||||
|
||||
def _transport_routes_enabled(transport: str) -> bool:
|
||||
"""Return whether a transport route can run in this environment.
|
||||
|
||||
Args:
|
||||
transport: Transport name from the runner request or CLI.
|
||||
|
||||
Returns:
|
||||
``True`` if the requested transport is enabled.
|
||||
"""
|
||||
return all(_is_module_available(module) for module in _transport_route_dependencies(transport))
|
||||
|
||||
|
||||
def _runner_url(args: argparse.Namespace) -> str:
|
||||
"""Return the browser URL for the runner prebuilt client."""
|
||||
return f"http://{args.host}:{args.port}"
|
||||
|
||||
|
||||
def _transport_status_lists() -> tuple[list[str], list[str]]:
|
||||
"""Return enabled and disabled transport labels for the startup banner."""
|
||||
transports = ["daily", "webrtc", "telephony", "websocket"]
|
||||
enabled = []
|
||||
disabled = []
|
||||
|
||||
for label in transports:
|
||||
if _transport_routes_enabled(label):
|
||||
enabled.append(label)
|
||||
else:
|
||||
disabled.append(f"{label} ({TRANSPORT_INSTALL_HINTS[label]})")
|
||||
|
||||
return enabled, disabled
|
||||
|
||||
|
||||
def _format_transport_status(labels: list[str]) -> str:
|
||||
"""Format a startup banner transport status list."""
|
||||
return ", ".join(labels) if labels else "none"
|
||||
|
||||
|
||||
def _print_startup_message(args: argparse.Namespace):
|
||||
"""Print connection information for the development runner."""
|
||||
print()
|
||||
if args.transport is None:
|
||||
enabled, disabled = _transport_status_lists()
|
||||
print("🚀 Bot ready!")
|
||||
print(f" → Open: {_runner_url(args)}")
|
||||
print(f" → Enabled transports: {_format_transport_status(enabled)}")
|
||||
if disabled:
|
||||
print(f" → Disabled transports: {_format_transport_status(disabled)}")
|
||||
elif args.transport == "webrtc":
|
||||
if args.esp32:
|
||||
print("🚀 Bot ready! (ESP32 mode)")
|
||||
elif args.whatsapp:
|
||||
print("🚀 Bot ready! (WhatsApp)")
|
||||
else:
|
||||
print("🚀 Bot ready! (WebRTC)")
|
||||
if _transport_routes_enabled("webrtc"):
|
||||
print(f" → Open: {_runner_url(args)}")
|
||||
else:
|
||||
print(f" → WebRTC disabled ({TRANSPORT_INSTALL_HINTS['webrtc']})")
|
||||
elif args.transport == "daily":
|
||||
print("🚀 Bot ready! (Daily)")
|
||||
if not _transport_routes_enabled("daily"):
|
||||
print(f" → Daily disabled ({TRANSPORT_INSTALL_HINTS['daily']})")
|
||||
else:
|
||||
print(f" → Open: {_runner_url(args)}")
|
||||
if args.dialin:
|
||||
print(
|
||||
f" → Daily dial-in webhook: "
|
||||
f"http://{args.host}:{args.port}/daily-dialin-webhook"
|
||||
)
|
||||
print(" → Configure this URL in your Daily phone number settings")
|
||||
elif args.transport in TELEPHONY_TRANSPORTS:
|
||||
print(f"🚀 Bot ready! ({args.transport.capitalize()})")
|
||||
if not _transport_routes_enabled(args.transport):
|
||||
print(f" → Telephony disabled ({TRANSPORT_INSTALL_HINTS['telephony']})")
|
||||
else:
|
||||
print(f" → Open: {_runner_url(args)}")
|
||||
if args.proxy:
|
||||
print(f" → XML webhook: http://{args.host}:{args.port}/")
|
||||
print(f" → WebSocket: ws://{args.host}:{args.port}/ws")
|
||||
elif args.transport == "vonage":
|
||||
print()
|
||||
print("🚀 Bot ready!")
|
||||
print()
|
||||
|
||||
|
||||
def _get_bot_module():
|
||||
"""Get the bot module from the calling script."""
|
||||
import importlib.util
|
||||
@@ -354,8 +227,6 @@ async def _run_websocket_bot(websocket: WebSocket, args: argparse.Namespace):
|
||||
|
||||
def _setup_websocket_routes(app: FastAPI, args: argparse.Namespace):
|
||||
"""Set up the plain WebSocket route at ``/ws-client``."""
|
||||
if not _transport_routes_enabled("websocket"):
|
||||
return
|
||||
|
||||
@app.websocket("/ws-client")
|
||||
async def websocket_client_endpoint(websocket: WebSocket):
|
||||
@@ -467,15 +338,6 @@ def _setup_unified_start_route(
|
||||
),
|
||||
)
|
||||
|
||||
if not _transport_routes_enabled(transport):
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=(
|
||||
f"Transport '{transport}' is disabled in this runner environment. "
|
||||
"Check the startup banner for enabled transports."
|
||||
),
|
||||
)
|
||||
|
||||
if transport == "webrtc":
|
||||
# WebRTC: register the session; the bot starts when the WebRTC offer arrives.
|
||||
session_id = str(uuid.uuid4())
|
||||
@@ -609,9 +471,6 @@ def _setup_webrtc_routes(
|
||||
app: FastAPI, args: argparse.Namespace, active_sessions: dict[str, dict[str, Any]]
|
||||
):
|
||||
"""Set up WebRTC-specific routes."""
|
||||
if not _transport_routes_enabled("webrtc"):
|
||||
return
|
||||
|
||||
try:
|
||||
from pipecat.transports.smallwebrtc.connection import SmallWebRTCConnection
|
||||
from pipecat.transports.smallwebrtc.request_handler import (
|
||||
@@ -621,7 +480,7 @@ def _setup_webrtc_routes(
|
||||
SmallWebRTCRequestHandler,
|
||||
)
|
||||
except ImportError as e:
|
||||
logger.warning(f"WebRTC routes disabled after dependency check passed: {e}")
|
||||
logger.error(f"WebRTC transport dependencies not installed: {e}")
|
||||
return
|
||||
|
||||
@app.get("/files/{filename:path}")
|
||||
@@ -906,8 +765,6 @@ def _setup_whatsapp_routes(app: FastAPI, args: argparse.Namespace):
|
||||
|
||||
def _setup_daily_routes(app: FastAPI, args: argparse.Namespace):
|
||||
"""Set up Daily-specific routes."""
|
||||
if not _transport_routes_enabled("daily"):
|
||||
return
|
||||
|
||||
@app.get("/daily")
|
||||
async def create_room_and_start_agent():
|
||||
@@ -1051,9 +908,6 @@ def _setup_telephony_routes(app: FastAPI, args: argparse.Namespace):
|
||||
specific telephony transport is chosen via ``-t`` because the XML template
|
||||
is provider-specific and requires a proxy hostname (``--proxy``).
|
||||
"""
|
||||
if not _transport_routes_enabled("telephony"):
|
||||
return
|
||||
|
||||
if args.transport in TELEPHONY_TRANSPORTS:
|
||||
# XML response templates (Exotel doesn't use XML webhooks)
|
||||
XML_TEMPLATES = {
|
||||
@@ -1306,11 +1160,43 @@ def main(parser: argparse.ArgumentParser | None = None):
|
||||
return
|
||||
|
||||
# Print startup message
|
||||
_print_startup_message(args)
|
||||
if args.transport == "vonage":
|
||||
print()
|
||||
if args.transport is None:
|
||||
print("🚀 Bot ready!")
|
||||
print(f" → WebRTC: http://{args.host}:{args.port}/client")
|
||||
print(f" → Daily: http://{args.host}:{args.port}/daily")
|
||||
print(f" → Telephony: ws://{args.host}:{args.port}/ws")
|
||||
elif args.transport == "webrtc":
|
||||
if args.esp32:
|
||||
print("🚀 Bot ready! (ESP32 mode)")
|
||||
elif args.whatsapp:
|
||||
print("🚀 Bot ready! (WhatsApp)")
|
||||
else:
|
||||
print("🚀 Bot ready! (WebRTC)")
|
||||
print(f" → Open http://{args.host}:{args.port}/client in your browser")
|
||||
elif args.transport == "daily":
|
||||
print("🚀 Bot ready! (Daily)")
|
||||
if args.dialin:
|
||||
print(
|
||||
f" → Daily dial-in webhook: http://{args.host}:{args.port}/daily-dialin-webhook"
|
||||
)
|
||||
print(f" → Configure this URL in your Daily phone number settings")
|
||||
else:
|
||||
print(
|
||||
f" → Open http://{args.host}:{args.port}/daily in your browser to start a session"
|
||||
)
|
||||
elif args.transport in TELEPHONY_TRANSPORTS:
|
||||
print(f"🚀 Bot ready! ({args.transport.capitalize()})")
|
||||
if args.proxy:
|
||||
print(f" → XML webhook: http://{args.host}:{args.port}/")
|
||||
print(f" → WebSocket: ws://{args.host}:{args.port}/ws")
|
||||
elif args.transport == "vonage":
|
||||
print()
|
||||
print(f"🚀 Bot ready!")
|
||||
asyncio.run(_run_vonage())
|
||||
print()
|
||||
return
|
||||
print()
|
||||
|
||||
RUNNER_DOWNLOADS_FOLDER = args.folder
|
||||
RUNNER_HOST = args.host
|
||||
|
||||
@@ -48,7 +48,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Anthropic, you need to `pip install pipecat-ai[anthropic]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class AnthropicThinkingConfig(BaseModel):
|
||||
|
||||
@@ -55,7 +55,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error('In order to use AssemblyAI, you need to `pip install "pipecat-ai[assemblyai]"`.')
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def map_language_from_assemblyai(language_code: str) -> Language:
|
||||
@@ -586,9 +586,9 @@ class AssemblyAISTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.debug(f"{self} Connected to AssemblyAI WebSocket")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
self._connected = False
|
||||
await self.push_error(error_msg=f"Unable to connect to AssemblyAI: {e}", exception=e)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the websocket connection to AssemblyAI."""
|
||||
|
||||
@@ -39,7 +39,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Async, you need to `pip install pipecat-ai[asyncai]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_async_language(language: Language) -> str:
|
||||
|
||||
@@ -49,7 +49,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use AWS services, you need to `pip install pipecat-ai[aws]`. Also, remember to set `AWS_SECRET_ACCESS_KEY`, `AWS_ACCESS_KEY_ID`, and `AWS_REGION` environment variable."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -81,7 +81,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use AWS services, you need to `pip install pipecat-ai[aws-nova-sonic]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class AWSNovaSonicUnhandledFunctionException(Exception):
|
||||
|
||||
@@ -32,7 +32,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use SageMaker BiDi client, you need to `pip install pipecat-ai[sagemaker]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class SageMakerBidiClient:
|
||||
|
||||
@@ -48,7 +48,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use AWS services, you need to `pip install pipecat-ai[aws]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -339,10 +339,10 @@ class AWSTranscribeSTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.info(f"{self} Successfully connected to AWS Transcribe")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
await self.push_error(
|
||||
error_msg=f"Unable to connect to AWS Transcribe: {e}", exception=e
|
||||
)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the websocket connection to AWS Transcribe."""
|
||||
|
||||
@@ -34,7 +34,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use AWS services, you need to `pip install pipecat-ai[aws]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_aws_language(language: Language) -> str:
|
||||
|
||||
@@ -17,7 +17,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Azure Realtime, you need to `pip install pipecat-ai[openai]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -49,7 +49,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Azure, you need to `pip install pipecat-ai[azure]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -42,7 +42,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Azure, you need to `pip install pipecat-ai[azure]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def sample_rate_to_output_format(sample_rate: int) -> SpeechSynthesisOutputFormat:
|
||||
@@ -540,25 +540,14 @@ class AzureTTSService(TTSService, AzureBaseTTSService):
|
||||
self._last_timestamp = timestamp
|
||||
|
||||
async def _word_processor_task_handler(self):
|
||||
"""Process word timestamps from the queue and call add_word_timestamps.
|
||||
|
||||
Also handles a None sentinel from _handle_completed: once all pending
|
||||
words have been drained, it signals audio stream completion via
|
||||
_audio_queue so that run_tts exits only after the last word has been
|
||||
processed.
|
||||
"""
|
||||
"""Process word timestamps from the queue and call add_word_timestamps."""
|
||||
while True:
|
||||
try:
|
||||
item = await self._word_boundary_queue.get()
|
||||
if item is None:
|
||||
# All words drained — now signal audio completion.
|
||||
self._audio_queue.put_nowait(None)
|
||||
else:
|
||||
word, timestamp_seconds = item
|
||||
if self._current_context_id:
|
||||
await self.add_word_timestamps(
|
||||
[(word, timestamp_seconds)], self._current_context_id
|
||||
)
|
||||
word, timestamp_seconds = await self._word_boundary_queue.get()
|
||||
if self._current_context_id:
|
||||
await self.add_word_timestamps(
|
||||
[(word, timestamp_seconds)], self._current_context_id
|
||||
)
|
||||
self._word_boundary_queue.task_done()
|
||||
except asyncio.CancelledError:
|
||||
break
|
||||
@@ -580,21 +569,17 @@ class AzureTTSService(TTSService, AzureBaseTTSService):
|
||||
Args:
|
||||
evt: Completion event from Azure Speech SDK.
|
||||
"""
|
||||
# Store duration for cumulative offset calculation
|
||||
if evt.result and evt.result.audio_duration:
|
||||
self._current_sentence_duration = evt.result.audio_duration.total_seconds()
|
||||
|
||||
# Flush any pending word before completing
|
||||
if self._last_word is not None:
|
||||
self._word_boundary_queue.put_nowait((self._last_word, self._last_timestamp))
|
||||
self._last_word = None
|
||||
self._last_timestamp = None
|
||||
|
||||
# Route completion through the word boundary queue so the word processor
|
||||
# task drains all pending words before signaling audio stream completion.
|
||||
# Without this, the last word's TTSTextFrame may arrive after
|
||||
# TTSStoppedFrame, causing it to be missed by observers and the UI.
|
||||
self._word_boundary_queue.put_nowait(None)
|
||||
# Store duration for cumulative offset calculation
|
||||
if evt.result and evt.result.audio_duration:
|
||||
self._current_sentence_duration = evt.result.audio_duration.total_seconds()
|
||||
|
||||
self._audio_queue.put_nowait(None) # Signal completion
|
||||
|
||||
def _handle_canceled(self, evt):
|
||||
"""Handle synthesis cancellation.
|
||||
|
||||
@@ -42,7 +42,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Cartesia, you need to `pip install pipecat-ai[cartesia]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -354,8 +354,7 @@ class CartesiaSTTService(WebsocketSTTService):
|
||||
self._websocket = await websocket_connect(ws_url, additional_headers=headers)
|
||||
await self._call_event_handler("on_connected")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
await self.push_error(error_msg=f"Unable to connect to Cartesia: {e}", exception=e)
|
||||
await self.push_error(error_msg=f"Unknown error occurred: {e}", exception=e)
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
ws = self._websocket
|
||||
|
||||
@@ -8,7 +8,6 @@
|
||||
|
||||
import base64
|
||||
import json
|
||||
import re
|
||||
from collections.abc import AsyncGenerator
|
||||
from dataclasses import dataclass, field
|
||||
from enum import StrEnum
|
||||
@@ -40,7 +39,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Cartesia, you need to `pip install pipecat-ai[cartesia]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class GenerationConfig(BaseModel):
|
||||
@@ -432,20 +431,10 @@ class CartesiaTTSService(WebsocketTTSService):
|
||||
base_lang = language.split("-")[0].lower()
|
||||
return base_lang in {"zh", "ja"}
|
||||
|
||||
_CARTESIA_TAG_RE = re.compile(r"</?(?:spell|emotion|break|volume|speed)\b[^>]*>", re.IGNORECASE)
|
||||
|
||||
def _strip_cartesia_tags(self, text: str) -> str:
|
||||
text = self._CARTESIA_TAG_RE.sub(" ", text)
|
||||
text = re.sub(r"\s+", " ", text)
|
||||
return text.strip()
|
||||
|
||||
def _normalize_word_timestamps(
|
||||
def _process_word_timestamps_for_language(
|
||||
self, words: list[str], starts: list[float]
|
||||
) -> list[tuple[str, float]]:
|
||||
"""Normalize raw word timestamps from Cartesia before further processing.
|
||||
|
||||
Strips Cartesia SSML tags (spell, emotion, break, volume, speed) from each word
|
||||
and drops entries that become empty after stripping.
|
||||
"""Process word timestamps based on the current language.
|
||||
|
||||
For Chinese and Japanese, Cartesia groups related characters in the same timestamp
|
||||
message.
|
||||
@@ -469,18 +458,14 @@ class CartesiaTTSService(WebsocketTTSService):
|
||||
# For Chinese/Japanese, combine all characters in this message into one word
|
||||
# using the first character's start time.
|
||||
if words and starts:
|
||||
combined_word = "".join(self._strip_cartesia_tags(w) for w in words)
|
||||
combined_word = "".join(words)
|
||||
first_start = starts[0]
|
||||
return [(combined_word, first_start)] if combined_word else []
|
||||
return [(combined_word, first_start)]
|
||||
else:
|
||||
return []
|
||||
else:
|
||||
result = []
|
||||
for word, start in zip(words, starts):
|
||||
cleaned = self._strip_cartesia_tags(word)
|
||||
if cleaned:
|
||||
result.append((cleaned, start))
|
||||
return result
|
||||
# For non-CJK languages, use as-is
|
||||
return list(zip(words, starts))
|
||||
|
||||
def _word_timestamps_include_inter_frame_spaces(self) -> bool:
|
||||
"""Whether timestamp text should be treated as carrying its own spacing."""
|
||||
@@ -677,7 +662,7 @@ class CartesiaTTSService(WebsocketTTSService):
|
||||
await self.remove_audio_context(ctx_id)
|
||||
elif msg["type"] == "timestamps":
|
||||
# Process the timestamps based on language before adding them
|
||||
processed_timestamps = self._normalize_word_timestamps(
|
||||
processed_timestamps = self._process_word_timestamps_for_language(
|
||||
msg["word_timestamps"]["words"], msg["word_timestamps"]["start"]
|
||||
)
|
||||
await self.add_word_timestamps(
|
||||
|
||||
@@ -31,7 +31,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Deepgram Flux, you need to `pip install pipecat-ai[deepgram]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
# Re-export for backward compatibility
|
||||
__all__ = [
|
||||
|
||||
@@ -48,7 +48,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Deepgram, you need to `pip install pipecat-ai[deepgram]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class LiveOptions:
|
||||
|
||||
@@ -39,7 +39,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use DeepgramWebsocketTTSService, you need to `pip install pipecat-ai[deepgram]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -52,7 +52,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use ElevenLabs Realtime STT, you need to `pip install pipecat-ai[elevenlabs]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_elevenlabs_language(language: Language) -> str:
|
||||
@@ -823,7 +823,6 @@ class ElevenLabsRealtimeSTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.debug("Connected to ElevenLabs Realtime STT")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
await self.push_error(
|
||||
error_msg=f"Unable to connect to ElevenLabs Realtime STT: {e}", exception=e
|
||||
)
|
||||
|
||||
@@ -56,7 +56,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use ElevenLabs, you need to `pip install pipecat-ai[elevenlabs]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
# Models that support language codes
|
||||
# The following models are excluded as they don't support language codes:
|
||||
@@ -594,10 +594,6 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
self._partial_word_start_time = 0.0
|
||||
self._alignment_started_context_ids: set[str | None] = set()
|
||||
|
||||
# Context IDs whose context-init has been sent, so the keepalive knows
|
||||
# which contexts are safe to target.
|
||||
self._context_init_sent: set[str] = set()
|
||||
|
||||
# Context management for v1 multi API
|
||||
self._receive_task = None
|
||||
self._keepalive_task = None
|
||||
@@ -796,7 +792,6 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
finally:
|
||||
await self.remove_active_audio_context()
|
||||
self._websocket = None
|
||||
self._context_init_sent.clear()
|
||||
await self._call_event_handler("on_disconnected")
|
||||
|
||||
def _get_websocket(self):
|
||||
@@ -827,7 +822,6 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
self._partial_word = ""
|
||||
self._partial_word_start_time = 0.0
|
||||
self._alignment_started_context_ids.discard(context_id)
|
||||
self._context_init_sent.discard(context_id)
|
||||
|
||||
async def on_audio_context_interrupted(self, context_id: str):
|
||||
"""Close the ElevenLabs context when the bot is interrupted."""
|
||||
@@ -920,35 +914,26 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
while True:
|
||||
await asyncio.sleep(KEEPALIVE_SLEEP)
|
||||
try:
|
||||
await self._send_keepalive()
|
||||
if self._websocket and self._websocket.state is State.OPEN:
|
||||
context_id = self.get_active_audio_context_id()
|
||||
if context_id:
|
||||
# Send keepalive with context ID to keep the connection alive
|
||||
keepalive_message = {
|
||||
"text": "",
|
||||
"context_id": context_id,
|
||||
}
|
||||
logger.trace(f"Sending keepalive for context {context_id}")
|
||||
else:
|
||||
# It's possible to have a user interruption which clears the context
|
||||
# without generating a new TTS response. In this case, we'll just send
|
||||
# an empty message to keep the connection alive.
|
||||
keepalive_message = {"text": ""}
|
||||
logger.trace("Sending keepalive without context")
|
||||
await self._websocket.send(json.dumps(keepalive_message))
|
||||
except websockets.ConnectionClosed as e:
|
||||
logger.warning(f"{self} keepalive error: {e}")
|
||||
break
|
||||
|
||||
async def _send_keepalive(self):
|
||||
"""Send a single keepalive message to keep the WebSocket connection alive.
|
||||
|
||||
Only stamps a ``context_id`` once its context-init (carrying
|
||||
``voice_settings``) has been sent. Otherwise the keepalive would be the
|
||||
context's first message, with no ``voice_settings``, and ElevenLabs would
|
||||
reject the later context-init with a 1008 policy violation. A context-less
|
||||
keepalive is sufficient until the context-init is sent.
|
||||
"""
|
||||
if not self._websocket or self._websocket.state is not State.OPEN:
|
||||
return
|
||||
|
||||
context_id = self.get_active_audio_context_id()
|
||||
if context_id and context_id in self._context_init_sent:
|
||||
# The context's voice_settings context-init has been sent, so it's
|
||||
# safe to keep that context alive.
|
||||
keepalive_message = {"text": "", "context_id": context_id}
|
||||
else:
|
||||
# No active context, or the active context's context-init hasn't been
|
||||
# sent yet. A context-less keepalive keeps the connection alive without
|
||||
# opening the context prematurely.
|
||||
keepalive_message = {"text": ""}
|
||||
await self._websocket.send(json.dumps(keepalive_message))
|
||||
|
||||
async def _send_text(self, text: str, context_id: str):
|
||||
"""Send text to the WebSocket for synthesis."""
|
||||
if self._websocket and context_id:
|
||||
@@ -995,9 +980,6 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
locator.model_dump()
|
||||
for locator in self._pronunciation_dictionary_locators
|
||||
]
|
||||
# Mark the context-init as sent so the keepalive may now
|
||||
# target this context_id.
|
||||
self._context_init_sent.add(context_id)
|
||||
await self._websocket.send(json.dumps(msg))
|
||||
logger.trace(f"Created new context {context_id}")
|
||||
|
||||
|
||||
@@ -38,7 +38,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Fish Audio, you need to `pip install pipecat-ai[fish]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
# FishAudio supports various output formats
|
||||
FishAudioOutputFormat = Literal["opus", "mp3", "pcm", "wav"]
|
||||
|
||||
@@ -53,7 +53,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Gladia, you need to `pip install pipecat-ai[gladia]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_gladia_language(language: Language) -> str:
|
||||
@@ -558,9 +558,8 @@ class GladiaSTTService(WebsocketSTTService):
|
||||
|
||||
logger.debug(f"{self} Connected to Gladia WebSocket")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
self._connection_active = False
|
||||
await self.push_error(error_msg=f"Unable to connect to Gladia: {e}", exception=e)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the websocket connection to Gladia."""
|
||||
|
||||
@@ -105,7 +105,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
# Connection management constants
|
||||
|
||||
@@ -36,7 +36,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Google Vertex AI, you need to `pip install pipecat-ai[google]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -35,7 +35,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -65,7 +65,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class GoogleThinkingConfig(BaseModel):
|
||||
|
||||
@@ -57,7 +57,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use Google AI, you need to `pip install pipecat-ai[google]`. Also, set `GOOGLE_APPLICATION_CREDENTIALS` environment variable."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_google_stt_language(language: Language) -> str:
|
||||
|
||||
@@ -57,7 +57,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use Google AI, you need to `pip install pipecat-ai[google]`. Also, set `GOOGLE_APPLICATION_CREDENTIALS` environment variable."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_google_tts_language(language: Language) -> str:
|
||||
|
||||
@@ -35,7 +35,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use Google AI, you need to `pip install pipecat-ai[google]`. Also, set `GOOGLE_APPLICATION_CREDENTIALS` environment variable."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -44,7 +44,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error('In order to use Gradium, you need to `pip install "pipecat-ai[gradium]"`.')
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
# Seconds to wait after a "flushed" message for trailing text tokens to arrive
|
||||
# before finalizing the transcription.
|
||||
@@ -423,8 +423,8 @@ class GradiumSTTService(WebsocketSTTService):
|
||||
logger.debug("Connected to Gradium STT")
|
||||
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
await self.push_error(error_msg=f"Unable to connect to Gradium: {e}", exception=e)
|
||||
await self.push_error(error_msg=f"Unknown error occurred: {e}", exception=e)
|
||||
raise
|
||||
|
||||
async def _disconnect(self):
|
||||
await super()._disconnect()
|
||||
|
||||
@@ -33,7 +33,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Gradium, you need to `pip install pipecat-ai[gradium]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
SAMPLE_RATE = 48000
|
||||
|
||||
|
||||
@@ -30,7 +30,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Groq, you need to `pip install pipecat-ai[groq]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
# Hint set for `output_format`. The values mirror the Literal that
|
||||
# `groq.resources.audio.speech.AsyncSpeech.create` accepts on its
|
||||
|
||||
@@ -46,7 +46,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use HeyGen, you need to `pip install pipecat-ai[heygen]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
HEY_GEN_SAMPLE_RATE = 24000
|
||||
|
||||
|
||||
@@ -38,7 +38,7 @@ try:
|
||||
except ModuleNotFoundError as e: # pragma: no cover - import-time guidance
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Hume, you need to `pip install pipecat-ai[hume]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
HUME_SAMPLE_RATE = 48_000 # Hume TTS streams at 48 kHz
|
||||
|
||||
@@ -1,124 +0,0 @@
|
||||
#
|
||||
# Copyright (c) 2024-2026, Daily
|
||||
#
|
||||
# SPDX-License-Identifier: BSD 2-Clause License
|
||||
#
|
||||
|
||||
"""Inception LLM service implementation using OpenAI-compatible interface."""
|
||||
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Literal
|
||||
|
||||
from loguru import logger
|
||||
|
||||
from pipecat.adapters.services.open_ai_adapter import OpenAILLMInvocationParams
|
||||
from pipecat.services.openai.base_llm import BaseOpenAILLMService
|
||||
from pipecat.services.openai.llm import OpenAILLMService
|
||||
from pipecat.services.settings import NOT_GIVEN as _NOT_GIVEN
|
||||
from pipecat.services.settings import _NotGiven, is_given
|
||||
|
||||
|
||||
@dataclass
|
||||
class InceptionLLMSettings(BaseOpenAILLMService.Settings):
|
||||
"""Settings for InceptionLLMService.
|
||||
|
||||
Parameters:
|
||||
reasoning_effort: Controls how much reasoning the model applies.
|
||||
One of "instant", "low", "medium", or "high". When unset, the
|
||||
parameter is omitted and Inception's server-side default applies.
|
||||
realtime: When True, reduces time to first diffusion block (TTFT).
|
||||
"""
|
||||
|
||||
reasoning_effort: Literal["instant", "low", "medium", "high"] | None | _NotGiven = field(
|
||||
default_factory=lambda: _NOT_GIVEN
|
||||
)
|
||||
realtime: bool | None | _NotGiven = field(default_factory=lambda: _NOT_GIVEN)
|
||||
|
||||
|
||||
class InceptionLLMService(OpenAILLMService):
|
||||
"""A service for interacting with Inception's API using the OpenAI-compatible interface.
|
||||
|
||||
This service extends OpenAILLMService to connect to Inception's API endpoint while
|
||||
maintaining full compatibility with OpenAI's interface and functionality.
|
||||
Supports Mercury-2, Inception's diffusion-based reasoning model.
|
||||
"""
|
||||
|
||||
# Inception doesn't support the "developer" message role.
|
||||
supports_developer_role = False
|
||||
|
||||
Settings = InceptionLLMSettings
|
||||
_settings: Settings
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
api_key: str,
|
||||
base_url: str = "https://api.inceptionlabs.ai/v1",
|
||||
settings: Settings | None = None,
|
||||
**kwargs,
|
||||
):
|
||||
"""Initialize the Inception LLM service.
|
||||
|
||||
Args:
|
||||
api_key: The API key for accessing Inception's API.
|
||||
base_url: The base URL for Inception API. Defaults to "https://api.inceptionlabs.ai/v1".
|
||||
settings: Runtime-updatable settings.
|
||||
**kwargs: Additional keyword arguments passed to OpenAILLMService.
|
||||
"""
|
||||
default_settings = self.Settings(
|
||||
model="mercury-2",
|
||||
reasoning_effort=None,
|
||||
realtime=None,
|
||||
)
|
||||
|
||||
if settings is not None:
|
||||
default_settings.apply_update(settings)
|
||||
|
||||
super().__init__(api_key=api_key, base_url=base_url, settings=default_settings, **kwargs)
|
||||
|
||||
def create_client(self, api_key=None, base_url=None, **kwargs):
|
||||
"""Create OpenAI-compatible client for Inception API endpoint.
|
||||
|
||||
Args:
|
||||
api_key: The API key for authentication. If None, uses instance default.
|
||||
base_url: The base URL for the API. If None, uses instance default.
|
||||
**kwargs: Additional keyword arguments for client configuration.
|
||||
|
||||
Returns:
|
||||
An OpenAI-compatible client configured for Inception's API.
|
||||
"""
|
||||
logger.debug(f"Creating Inception client with api {base_url}")
|
||||
return super().create_client(api_key, base_url, **kwargs)
|
||||
|
||||
def build_chat_completion_params(self, params_from_context: OpenAILLMInvocationParams) -> dict:
|
||||
"""Build parameters for Inception chat completion request.
|
||||
|
||||
Extends the base OpenAI parameters with Inception-specific options
|
||||
such as reasoning_effort and realtime.
|
||||
|
||||
Args:
|
||||
params_from_context: Parameters, derived from the LLM context, to
|
||||
use for the chat completion. Contains messages, tools, and tool
|
||||
choice.
|
||||
|
||||
Returns:
|
||||
Dictionary of parameters for the chat completion request.
|
||||
"""
|
||||
params = super().build_chat_completion_params(params_from_context)
|
||||
|
||||
if (
|
||||
is_given(self._settings.reasoning_effort)
|
||||
and self._settings.reasoning_effort is not None
|
||||
):
|
||||
params["reasoning_effort"] = self._settings.reasoning_effort
|
||||
|
||||
# realtime is Inception-specific and unknown to the OpenAI SDK,
|
||||
# so it must be passed via extra_body to avoid validation errors.
|
||||
extra_body = {}
|
||||
if is_given(self._settings.realtime) and self._settings.realtime is not None:
|
||||
extra_body["realtime"] = self._settings.realtime
|
||||
|
||||
if extra_body:
|
||||
params["extra_body"] = extra_body
|
||||
|
||||
return params
|
||||
@@ -68,7 +68,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Inworld Realtime, you need to `pip install pipecat-ai[inworld]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -43,7 +43,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Inworld WebSocket TTS, you need to `pip install websockets`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
from pipecat.frames.frames import (
|
||||
AggregationType,
|
||||
|
||||
@@ -32,7 +32,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Kokoro, you need to `pip install pipecat-ai[kokoro]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
KOKORO_CACHE_DIR = Path(os.path.expanduser("~/.cache/kokoro-onnx"))
|
||||
KOKORO_MODEL_URL = "https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx"
|
||||
|
||||
@@ -34,7 +34,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use LMNT, you need to `pip install pipecat-ai[lmnt]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_lmnt_language(language: Language) -> str:
|
||||
|
||||
@@ -29,7 +29,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use an MCP client, you need to `pip install pipecat-ai[mcp]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
ServerParameters: TypeAlias = StdioServerParameters | SseServerParameters | StreamableHttpParameters
|
||||
|
||||
|
||||
@@ -28,7 +28,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use Mem0, you need to `pip install mem0ai`. Also, set the environment variable MEM0_API_KEY."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class Mem0MemoryService(FrameProcessor):
|
||||
|
||||
@@ -48,7 +48,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Mistral STT, you need to `pip install pipecat-ai[mistral]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -31,7 +31,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Mistral TTS, you need to `pip install pipecat-ai[mistral]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -34,7 +34,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Moondream, you need to `pip install pipecat-ai[moondream]`.")
|
||||
raise Exception(f"Missing module(s): {e}")
|
||||
raise ImportError(f"Missing module(s): {e}") from e
|
||||
|
||||
|
||||
def detect_device():
|
||||
|
||||
@@ -41,7 +41,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Neuphonic, you need to `pip install pipecat-ai[neuphonic]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_neuphonic_lang_code(language: Language) -> str:
|
||||
|
||||
@@ -46,7 +46,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use NVIDIA Nemotron Speech STT, you need to `pip install pipecat-ai[nvidia]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_nvidia_nemotron_speech_language(language: Language) -> str:
|
||||
|
||||
@@ -55,7 +55,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use NVIDIA Nemotron Speech TTS, you need to `pip install pipecat-ai[nvidia]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -71,7 +71,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use OpenAI, you need to `pip install pipecat-ai[openai]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -59,7 +59,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use OpenAI, you need to `pip install pipecat-ai[openai]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@@ -30,7 +30,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Piper, you need to `pip install pipecat-ai[piper]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -33,7 +33,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Resemble AI, you need to `pip install pipecat-ai[resembleai]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -47,7 +47,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Rime, you need to `pip install pipecat-ai[rime]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_rime_language(language: Language) -> str:
|
||||
|
||||
@@ -53,7 +53,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Sarvam, you need to `pip install pipecat-ai[sarvam]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_sarvam_language(language: Language) -> str:
|
||||
|
||||
@@ -70,7 +70,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Sarvam, you need to `pip install pipecat-ai[sarvam]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class SarvamTTSModel(StrEnum):
|
||||
|
||||
@@ -35,7 +35,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Simli, you need to `pip install pipecat-ai[simli]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -48,7 +48,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Smallest, you need to `pip install pipecat-ai[smallest]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
def language_to_smallest_stt_language(language: Language) -> str:
|
||||
|
||||
@@ -41,7 +41,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Smallest, you need to `pip install pipecat-ai[smallest]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
class SmallestTTSModel(StrEnum):
|
||||
|
||||
@@ -39,7 +39,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Soniox, you need to `pip install pipecat-ai[soniox]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
KEEPALIVE_MESSAGE = '{"type": "keepalive"}'
|
||||
@@ -155,6 +155,7 @@ def language_to_soniox_language(language: Language) -> str:
|
||||
Language.ID: "id",
|
||||
Language.IT: "it",
|
||||
Language.JA: "ja",
|
||||
Language.KA: "ka",
|
||||
Language.KK: "kk",
|
||||
Language.KN: "kn",
|
||||
Language.KO: "ko",
|
||||
@@ -231,7 +232,6 @@ class SonioxSTTSettings(STTSettings):
|
||||
context_version 2.
|
||||
enable_speaker_diarization: Whether to enable speaker diarization.
|
||||
enable_language_identification: Whether to enable language identification.
|
||||
max_endpoint_delay_ms: Max ms before endpoint detection finalizes the turn (500-3000).
|
||||
client_reference_id: Client reference ID to use for transcription.
|
||||
"""
|
||||
|
||||
@@ -242,7 +242,6 @@ class SonioxSTTSettings(STTSettings):
|
||||
enable_language_identification: bool | None | _NotGiven = field(
|
||||
default_factory=lambda: NOT_GIVEN
|
||||
)
|
||||
max_endpoint_delay_ms: int | None | _NotGiven = field(default_factory=lambda: NOT_GIVEN)
|
||||
client_reference_id: str | None | _NotGiven = field(default_factory=lambda: NOT_GIVEN)
|
||||
|
||||
|
||||
@@ -310,7 +309,6 @@ class SonioxSTTService(WebsocketSTTService):
|
||||
context=None,
|
||||
enable_speaker_diarization=False,
|
||||
enable_language_identification=False,
|
||||
max_endpoint_delay_ms=None,
|
||||
client_reference_id=None,
|
||||
)
|
||||
|
||||
@@ -392,7 +390,8 @@ class SonioxSTTService(WebsocketSTTService):
|
||||
changed = await super()._update_settings(delta)
|
||||
|
||||
if changed:
|
||||
await self._request_reconnect()
|
||||
await self._disconnect()
|
||||
await self._connect()
|
||||
|
||||
return changed
|
||||
|
||||
@@ -523,7 +522,6 @@ class SonioxSTTService(WebsocketSTTService):
|
||||
"audio_format": self._audio_format,
|
||||
"num_channels": self._num_channels,
|
||||
"enable_endpoint_detection": enable_endpoint_detection,
|
||||
"max_endpoint_delay_ms": s.max_endpoint_delay_ms,
|
||||
"sample_rate": self.sample_rate,
|
||||
"language_hints": _prepare_language_hints(assert_given(s.language_hints)),
|
||||
"language_hints_strict": s.language_hints_strict,
|
||||
@@ -539,8 +537,8 @@ class SonioxSTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.debug("Connected to Soniox STT")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
await self.push_error(error_msg=f"Unable to connect to Soniox: {e}", exception=e)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the websocket connection to Soniox."""
|
||||
|
||||
@@ -44,7 +44,7 @@ try:
|
||||
except ModuleNotFoundError as e:
|
||||
logger.error(f"Exception: {e}")
|
||||
logger.error("In order to use Soniox, you need to `pip install pipecat-ai[soniox]`.")
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
# Soniox idle timeout is 20-30s; keepalive cadence must stay well inside it.
|
||||
|
||||
@@ -61,7 +61,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use Speechmatics, you need to `pip install pipecat-ai[speechmatics]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
load_dotenv()
|
||||
|
||||
@@ -32,7 +32,7 @@ except ModuleNotFoundError as e:
|
||||
logger.error(
|
||||
"In order to use Speechmatics, you need to `pip install pipecat-ai[speechmatics]`."
|
||||
)
|
||||
raise Exception(f"Missing module: {e}")
|
||||
raise ImportError(f"Missing module: {e}") from e
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user