Compare commits
29 Commits
filipi/tav
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
780c004168 | ||
|
|
28f9203401 | ||
|
|
77cc314a08 | ||
|
|
4a8d1d0b5e | ||
|
|
87f5d60693 | ||
|
|
c699b31daa | ||
|
|
ee674ffb01 | ||
|
|
86a5710801 | ||
|
|
4a96b2a9e6 | ||
|
|
105d6f27da | ||
|
|
e0e3cd336a | ||
|
|
9586db5b50 | ||
|
|
a890ab7b21 | ||
|
|
c1bf7dbb4a | ||
|
|
709a0ce839 | ||
|
|
be93350eae | ||
|
|
4a96ab7073 | ||
|
|
c321f50e76 | ||
|
|
70773bce0a | ||
|
|
a5e6886b80 | ||
|
|
d11a4ba0cd | ||
|
|
38407e091d | ||
|
|
33e5d1f89b | ||
|
|
861dd23873 | ||
|
|
b825dd779e | ||
|
|
1487da53a9 | ||
|
|
aff84a5d9e | ||
|
|
e298491068 | ||
|
|
97b00042df |
91
.claude/skills/squash-commits/SKILL.md
Normal file
91
.claude/skills/squash-commits/SKILL.md
Normal file
@@ -0,0 +1,91 @@
|
||||
---
|
||||
name: squash-commits
|
||||
description: Reorganize messy branch commits into a small set of logical, meaningful commits without changing any content. Drops merge-from-main commits. Safe: creates a backup branch first.
|
||||
---
|
||||
|
||||
Reorganize the commits on the current branch into a small number of logical commits. Do NOT change any file content — only the commit structure changes.
|
||||
|
||||
## Instructions
|
||||
|
||||
### 1. Safety check
|
||||
|
||||
```bash
|
||||
git status --short
|
||||
```
|
||||
|
||||
If there are uncommitted changes, stop and tell the user to commit or stash them first.
|
||||
|
||||
### 2. Inspect the branch
|
||||
|
||||
```bash
|
||||
git log main..HEAD --oneline
|
||||
git diff main..HEAD --name-only
|
||||
```
|
||||
|
||||
List every file changed vs `main` and every commit on the branch (excluding merge commits from main).
|
||||
|
||||
### 3. Create a backup branch
|
||||
|
||||
```bash
|
||||
git branch backup/<current-branch-name>
|
||||
```
|
||||
|
||||
Tell the user the backup exists so they can recover if needed.
|
||||
|
||||
### 4. Soft-reset to main and unstage everything
|
||||
|
||||
```bash
|
||||
git reset --soft main
|
||||
git restore --staged .
|
||||
```
|
||||
|
||||
All branch changes are now in the working tree, unstaged. No content has changed.
|
||||
|
||||
### 5. Plan the logical groups
|
||||
|
||||
Read the changed files and the original commit messages to understand what the work covers. Group related files into logical commits. Typical groups:
|
||||
|
||||
- Core feature or fix (new source files + modified core files)
|
||||
- Secondary features or fixes (each as its own commit if distinct)
|
||||
- Refactoring or renames
|
||||
- Tests
|
||||
- Changelogs / docs
|
||||
|
||||
Use the changelog files (if any) as a strong hint — each changelog entry often maps to one commit.
|
||||
|
||||
Present the proposed grouping to the user and ask for confirmation before committing.
|
||||
|
||||
### 6. Commit in logical groups
|
||||
|
||||
For each group, stage only the relevant files and commit with a clear message following the project's conventions:
|
||||
|
||||
```bash
|
||||
git add <file1> <file2> ...
|
||||
git commit -m "..."
|
||||
```
|
||||
|
||||
Use conventional commit prefixes if the project uses them (`feat:`, `fix:`, `refactor:`, `test:`, `chore:`).
|
||||
|
||||
### 7. Verify
|
||||
|
||||
```bash
|
||||
git log main..HEAD --oneline
|
||||
git diff main..HEAD --name-only
|
||||
git status --short
|
||||
```
|
||||
|
||||
Confirm:
|
||||
- Commit count is small and each message is meaningful
|
||||
- The set of changed files vs `main` is identical to before
|
||||
- Working tree is clean
|
||||
|
||||
### 8. Remind about force-push
|
||||
|
||||
The branch history has been rewritten. Tell the user they will need to `git push --force-with-lease` when they are ready to update the remote. Do NOT push automatically.
|
||||
|
||||
## Rules
|
||||
|
||||
- Never change file contents. If you find yourself editing a file, stop.
|
||||
- Never skip the backup branch step.
|
||||
- Never force-push without explicit user instruction.
|
||||
- If any step fails or the result looks wrong, tell the user and suggest restoring from the backup: `git reset --hard backup/<branch-name>`.
|
||||
@@ -92,7 +92,7 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
|
||||
| Category | Services |
|
||||
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/api-reference/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/api-reference/server/services/stt/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/api-reference/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/api-reference/server/services/stt/gladia), [Google](https://docs.pipecat.ai/api-reference/server/services/stt/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/stt/mistral), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/stt/nvidia), [OpenAI (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/api-reference/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/api-reference/server/services/stt/whisper), [xAI](https://docs.pipecat.ai/api-reference/server/services/stt/xai) |
|
||||
| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together) |
|
||||
| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Inception](https://docs.pipecat.ai/api-reference/server/services/llm/inception), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together) |
|
||||
| Text-to-Speech | [Async](https://docs.pipecat.ai/api-reference/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/api-reference/server/services/tts/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/api-reference/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/api-reference/server/services/tts/fish), [Google](https://docs.pipecat.ai/api-reference/server/services/tts/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/api-reference/server/services/tts/groq), [Hume](https://docs.pipecat.ai/api-reference/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/api-reference/server/services/tts/inworld), [Kokoro](https://docs.pipecat.ai/api-reference/server/services/tts/kokoro), [LMNT](https://docs.pipecat.ai/api-reference/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/api-reference/server/services/tts/minimax), [Mistral](https://docs.pipecat.ai/api-reference/server/services/tts/mistral), [Neuphonic](https://docs.pipecat.ai/api-reference/server/services/tts/neuphonic), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/tts/nvidia), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/tts/openai), [Piper](https://docs.pipecat.ai/api-reference/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/api-reference/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/api-reference/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/api-reference/server/services/tts/smallest), [Soniox](https://docs.pipecat.ai/api-reference/server/services/tts/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/api-reference/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/api-reference/server/services/tts/xtts) |
|
||||
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/api-reference/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/api-reference/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/api-reference/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/api-reference/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/api-reference/server/services/s2s/ultravox), |
|
||||
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/api-reference/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/api-reference/server/services/transport/small-webrtc), [Vonage (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/vonage), [WebSocket Server](https://docs.pipecat.ai/api-reference/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/api-reference/server/services/transport/whatsapp), Local |
|
||||
|
||||
1
changelog/4423.added.md
Normal file
1
changelog/4423.added.md
Normal file
@@ -0,0 +1 @@
|
||||
- Added `InceptionLLMService` for Inception's Mercury 2 diffusion reasoning model, with support for `reasoning_effort` and `realtime` settings.
|
||||
1
changelog/4514.fixed.md
Normal file
1
changelog/4514.fixed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Fixed websocket STT connection setup failures so services clear stale websocket state and emit non-fatal error frames, allowing `ServiceSwitcher` failover to keep agents running.
|
||||
1
changelog/4521.added.md
Normal file
1
changelog/4521.added.md
Normal file
@@ -0,0 +1 @@
|
||||
- Added `max_endpoint_delay_ms` to `SonioxSTTService.Settings`, controlling the maximum delay (500-3000 ms) before endpoint detection finalizes a turn.
|
||||
1
changelog/4521.changed.md
Normal file
1
changelog/4521.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- `SonioxSTTService` now applies settings updates (e.g. via `STTUpdateSettingsFrame`) using a graceful reconnect instead of a hard disconnect/reconnect, preserving the service's reconnect retry behavior.
|
||||
1
changelog/4521.removed.md
Normal file
1
changelog/4521.removed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Removed the unsupported Georgian (`Language.KA`) language mapping from `SonioxSTTService`.
|
||||
1
changelog/4522.changed.md
Normal file
1
changelog/4522.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Updated the default p99 TTFS latency values for Smallest AI, Mistral, and XAI STT so turn stop timing uses measured values instead of the conservative fallback.
|
||||
1
changelog/4524.changed.md
Normal file
1
changelog/4524.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Updated the development runner startup banner to show the prebuilt client URL once and list enabled or disabled transports with install hints.
|
||||
1
changelog/4524.fixed.md
Normal file
1
changelog/4524.fixed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Fixed the development runner so missing optional transport dependencies disable only their related routes instead of failing startup in all-transport mode.
|
||||
1
changelog/4527.fixed.md
Normal file
1
changelog/4527.fixed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Fixed a race in `ElevenLabsTTSService` where the periodic keepalive could be sent for a new turn's context before that context's `voice_settings` initialization message, causing ElevenLabs to close the WebSocket with a 1008 policy violation (`voice_settings field must be provided in the first message ...`). The keepalive now only targets a context once its context-init has been sent.
|
||||
1
changelog/4531.changed.md
Normal file
1
changelog/4531.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Bumped `pipecat-ai-prebuilt` to 1.0.1 in the `runner` extra, updating the prebuilt client UI served by the development runner.
|
||||
@@ -91,6 +91,9 @@ HEYGEN_LIVE_AVATAR_API_KEY=...
|
||||
HUME_API_KEY=...
|
||||
HUME_VOICE_ID=...
|
||||
|
||||
# Inception
|
||||
INCEPTION_API_KEY=...
|
||||
|
||||
# Inworld
|
||||
INWORLD_API_KEY=...
|
||||
|
||||
|
||||
177
examples/function-calling/function-calling-inception.py
Normal file
177
examples/function-calling/function-calling-inception.py
Normal file
@@ -0,0 +1,177 @@
|
||||
#
|
||||
# Copyright (c) 2024-2026, Daily
|
||||
#
|
||||
# SPDX-License-Identifier: BSD 2-Clause License
|
||||
#
|
||||
|
||||
|
||||
import os
|
||||
|
||||
from dotenv import load_dotenv
|
||||
from loguru import logger
|
||||
|
||||
from pipecat.adapters.schemas.function_schema import FunctionSchema
|
||||
from pipecat.adapters.schemas.tools_schema import ToolsSchema
|
||||
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
||||
from pipecat.frames.frames import LLMRunFrame, TTSSpeakFrame
|
||||
from pipecat.pipeline.pipeline import Pipeline
|
||||
from pipecat.pipeline.runner import PipelineRunner
|
||||
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
||||
from pipecat.processors.aggregators.llm_context import LLMContext
|
||||
from pipecat.processors.aggregators.llm_response_universal import (
|
||||
LLMContextAggregatorPair,
|
||||
LLMUserAggregatorParams,
|
||||
)
|
||||
from pipecat.runner.types import RunnerArguments
|
||||
from pipecat.runner.utils import create_transport
|
||||
from pipecat.services.cartesia.tts import CartesiaTTSService
|
||||
from pipecat.services.deepgram.stt import DeepgramSTTService
|
||||
from pipecat.services.inception.llm import InceptionLLMService
|
||||
from pipecat.services.llm_service import FunctionCallParams
|
||||
from pipecat.transports.base_transport import BaseTransport, TransportParams
|
||||
from pipecat.transports.daily.transport import DailyParams
|
||||
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
|
||||
|
||||
load_dotenv(override=True)
|
||||
|
||||
|
||||
async def fetch_weather_from_api(params: FunctionCallParams):
|
||||
await params.result_callback({"conditions": "nice", "temperature": "75"})
|
||||
|
||||
|
||||
async def fetch_restaurant_recommendation(params: FunctionCallParams):
|
||||
await params.result_callback({"name": "The Golden Dragon"})
|
||||
|
||||
|
||||
# We use lambdas to defer transport parameter creation until the transport
|
||||
# type is selected at runtime.
|
||||
transport_params = {
|
||||
"daily": lambda: DailyParams(
|
||||
audio_in_enabled=True,
|
||||
audio_out_enabled=True,
|
||||
),
|
||||
"twilio": lambda: FastAPIWebsocketParams(
|
||||
audio_in_enabled=True,
|
||||
audio_out_enabled=True,
|
||||
),
|
||||
"webrtc": lambda: TransportParams(
|
||||
audio_in_enabled=True,
|
||||
audio_out_enabled=True,
|
||||
),
|
||||
}
|
||||
|
||||
|
||||
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
|
||||
logger.info(f"Starting bot")
|
||||
|
||||
stt = DeepgramSTTService(api_key=os.environ["DEEPGRAM_API_KEY"])
|
||||
|
||||
tts = CartesiaTTSService(
|
||||
api_key=os.environ["CARTESIA_API_KEY"],
|
||||
settings=CartesiaTTSService.Settings(
|
||||
voice="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
|
||||
),
|
||||
)
|
||||
|
||||
llm = InceptionLLMService(
|
||||
api_key=os.environ["INCEPTION_API_KEY"],
|
||||
settings=InceptionLLMService.Settings(
|
||||
reasoning_effort="instant",
|
||||
system_instruction="You are a helpful assistant in a voice conversation. Your responses will be spoken aloud, so avoid emojis, bullet points, or other formatting that can't be spoken. Respond to what the user said in a creative, helpful, and brief way.",
|
||||
),
|
||||
)
|
||||
# You can also register a function_name of None to get all functions
|
||||
# sent to the same callback with an additional function_name parameter.
|
||||
llm.register_function("get_current_weather", fetch_weather_from_api)
|
||||
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
|
||||
|
||||
@llm.event_handler("on_function_calls_started")
|
||||
async def on_function_calls_started(service, function_calls):
|
||||
await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
|
||||
|
||||
weather_function = FunctionSchema(
|
||||
name="get_current_weather",
|
||||
description="Get the current weather",
|
||||
properties={
|
||||
"location": {
|
||||
"type": "string",
|
||||
"description": "The city and state, e.g. San Francisco, CA",
|
||||
},
|
||||
"format": {
|
||||
"type": "string",
|
||||
"enum": ["celsius", "fahrenheit"],
|
||||
"description": "The temperature unit to use. Infer this from the user's location.",
|
||||
},
|
||||
},
|
||||
required=["location", "format"],
|
||||
)
|
||||
|
||||
restaurant_function = FunctionSchema(
|
||||
name="get_restaurant_recommendation",
|
||||
description="Get a restaurant recommendation",
|
||||
properties={
|
||||
"location": {
|
||||
"type": "string",
|
||||
"description": "The city and state, e.g. San Francisco, CA",
|
||||
},
|
||||
},
|
||||
required=["location"],
|
||||
)
|
||||
tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
|
||||
|
||||
context = LLMContext(tools=tools)
|
||||
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
|
||||
context,
|
||||
user_params=LLMUserAggregatorParams(vad_analyzer=SileroVADAnalyzer()),
|
||||
)
|
||||
|
||||
pipeline = Pipeline(
|
||||
[
|
||||
transport.input(),
|
||||
stt,
|
||||
user_aggregator,
|
||||
llm,
|
||||
tts,
|
||||
transport.output(),
|
||||
assistant_aggregator,
|
||||
]
|
||||
)
|
||||
|
||||
task = PipelineTask(
|
||||
pipeline,
|
||||
params=PipelineParams(
|
||||
enable_metrics=True,
|
||||
enable_usage_metrics=True,
|
||||
),
|
||||
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
|
||||
)
|
||||
|
||||
@transport.event_handler("on_client_connected")
|
||||
async def on_client_connected(transport, client):
|
||||
logger.info(f"Client connected")
|
||||
# Kick off the conversation.
|
||||
context.add_message(
|
||||
{"role": "developer", "content": "Please introduce yourself to the user."}
|
||||
)
|
||||
await task.queue_frames([LLMRunFrame()])
|
||||
|
||||
@transport.event_handler("on_client_disconnected")
|
||||
async def on_client_disconnected(transport, client):
|
||||
logger.info(f"Client disconnected")
|
||||
await task.cancel()
|
||||
|
||||
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
|
||||
|
||||
await runner.run(task)
|
||||
|
||||
|
||||
async def bot(runner_args: RunnerArguments):
|
||||
"""Main bot entry point compatible with Pipecat Cloud."""
|
||||
transport = await create_transport(runner_args, transport_params)
|
||||
await run_bot(transport, runner_args)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
from pipecat.runner.run import main
|
||||
|
||||
main()
|
||||
@@ -22,9 +22,9 @@ from pipecat.processors.aggregators.llm_response_universal import (
|
||||
)
|
||||
from pipecat.runner.types import RunnerArguments
|
||||
from pipecat.runner.utils import create_transport
|
||||
from pipecat.services.cartesia.tts import CartesiaTTSService
|
||||
from pipecat.services.openai.llm import OpenAILLMService
|
||||
from pipecat.services.soniox.stt import SonioxSTTService
|
||||
from pipecat.services.soniox.tts import SonioxTTSService
|
||||
from pipecat.transcriptions.language import Language
|
||||
from pipecat.transports.base_transport import BaseTransport, TransportParams
|
||||
from pipecat.transports.daily.transport import DailyParams
|
||||
@@ -53,12 +53,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
|
||||
|
||||
stt = SonioxSTTService(api_key=os.environ["SONIOX_API_KEY"])
|
||||
|
||||
tts = CartesiaTTSService(
|
||||
api_key=os.environ["CARTESIA_API_KEY"],
|
||||
settings=CartesiaTTSService.Settings(
|
||||
voice="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
|
||||
),
|
||||
)
|
||||
tts = SonioxTTSService(api_key=os.environ["SONIOX_API_KEY"])
|
||||
|
||||
llm = OpenAILLMService(
|
||||
api_key=os.environ["OPENAI_API_KEY"],
|
||||
@@ -103,9 +98,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
|
||||
await task.queue_frames([LLMRunFrame()])
|
||||
|
||||
await asyncio.sleep(10)
|
||||
logger.info("Updating Soniox STT settings: language=es")
|
||||
logger.info("Updating Soniox STT settings: language_hints=[es]")
|
||||
await task.queue_frame(
|
||||
STTUpdateSettingsFrame(delta=SonioxSTTService.Settings(language=Language.ES))
|
||||
STTUpdateSettingsFrame(delta=SonioxSTTService.Settings(language_hints=[Language.ES]))
|
||||
)
|
||||
|
||||
@transport.event_handler("on_client_disconnected")
|
||||
|
||||
@@ -77,6 +77,7 @@ groq = [ "groq>=0.23.0,<2" ]
|
||||
gstreamer = [ "pygobject~=3.50.0" ]
|
||||
heygen = [ "livekit>=1.0.13,<2", "pipecat-ai[websockets-base]" ]
|
||||
hume = [ "hume>=0.11.2,<1" ]
|
||||
inception = []
|
||||
inworld = [ "pipecat-ai[websockets-base]" ]
|
||||
koala = [ "pvkoala~=2.0.3" ]
|
||||
kokoro = [ "kokoro-onnx>=0.5.0,<1", "requests>=2.32.5,<3" ]
|
||||
@@ -103,7 +104,7 @@ piper = [ "piper-tts>=1.3.0,<2", "requests>=2.32.5,<3" ]
|
||||
qwen = []
|
||||
resembleai = [ "pipecat-ai[websockets-base]" ]
|
||||
rime = [ "pipecat-ai[websockets-base]" ]
|
||||
runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<1", "pipecat-ai-prebuilt>=1.0.0"]
|
||||
runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<1", "pipecat-ai-prebuilt>=1.0.1"]
|
||||
sagemaker = ["aws_sdk_sagemaker_runtime_http2; python_version>='3.12'"]
|
||||
sambanova = []
|
||||
sarvam = [ "sarvamai==0.1.28", "pipecat-ai[websockets-base]" ]
|
||||
|
||||
@@ -198,6 +198,7 @@ TESTS_FUNCTION_CALLING = [
|
||||
("function-calling/function-calling-sarvam.py", EVAL_WEATHER),
|
||||
("function-calling/function-calling-novita.py", EVAL_WEATHER),
|
||||
("function-calling/function-calling-deepseek.py", EVAL_WEATHER),
|
||||
("function-calling/function-calling-inception.py", EVAL_WEATHER),
|
||||
# Video
|
||||
("function-calling/function-calling-anthropic-video.py", EVAL_VISION_CAMERA),
|
||||
("function-calling/function-calling-aws-video.py", EVAL_VISION_CAMERA),
|
||||
|
||||
@@ -529,7 +529,7 @@ class RTVIObserver(BaseObserver):
|
||||
|
||||
isTTS = isinstance(frame, TTSTextFrame)
|
||||
if agg_type is not AggregationType.WORD:
|
||||
logger.debug(f"{self} Aggregated LLM text: {text}, {agg_type} spoken:{isTTS}")
|
||||
logger.trace(f"{self} Aggregated LLM text: {text}, {agg_type} spoken:{isTTS}")
|
||||
|
||||
if self._params.bot_output_enabled:
|
||||
message = RTVI.BotOutputMessage(
|
||||
|
||||
@@ -90,6 +90,7 @@ To run locally:
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import importlib.util
|
||||
import mimetypes
|
||||
import os
|
||||
import sys
|
||||
@@ -131,6 +132,18 @@ load_dotenv(override=True)
|
||||
os.environ["ENV"] = "local"
|
||||
|
||||
TELEPHONY_TRANSPORTS = ["twilio", "telnyx", "plivo", "exotel"]
|
||||
TRANSPORT_ROUTE_DEPENDENCIES = {
|
||||
"daily": ("daily",),
|
||||
"webrtc": ("aiortc",),
|
||||
"telephony": ("fastapi", "websockets"),
|
||||
"websocket": ("fastapi", "websockets"),
|
||||
}
|
||||
TRANSPORT_INSTALL_HINTS = {
|
||||
"daily": "install pipecat-ai[daily]",
|
||||
"webrtc": "install pipecat-ai[webrtc]",
|
||||
"telephony": "install pipecat-ai[websocket]",
|
||||
"websocket": "install pipecat-ai[websocket]",
|
||||
}
|
||||
|
||||
# Mirror Pipecat Cloud's 4-hour max session limit so dev rooms get cleaned up.
|
||||
PIPECAT_ROOM_EXP_HOURS = 4.0
|
||||
@@ -156,6 +169,120 @@ Import this to add custom routes from other packages before calling
|
||||
"""
|
||||
|
||||
|
||||
def _is_module_available(module: str) -> bool:
|
||||
"""Check whether a module can be imported without importing it.
|
||||
|
||||
Args:
|
||||
module: Fully-qualified module name to check.
|
||||
|
||||
Returns:
|
||||
``True`` if Python can resolve the module, ``False`` otherwise.
|
||||
"""
|
||||
try:
|
||||
return importlib.util.find_spec(module) is not None
|
||||
except (ImportError, ModuleNotFoundError, ValueError):
|
||||
return False
|
||||
|
||||
|
||||
def _transport_route_dependencies(transport: str) -> tuple[str, ...]:
|
||||
"""Return module dependencies required for a transport route.
|
||||
|
||||
Args:
|
||||
transport: Transport name from the runner request or CLI.
|
||||
|
||||
Returns:
|
||||
Module names required to enable the transport route.
|
||||
"""
|
||||
if transport in TELEPHONY_TRANSPORTS:
|
||||
return TRANSPORT_ROUTE_DEPENDENCIES["telephony"]
|
||||
return TRANSPORT_ROUTE_DEPENDENCIES.get(transport, ())
|
||||
|
||||
|
||||
def _transport_routes_enabled(transport: str) -> bool:
|
||||
"""Return whether a transport route can run in this environment.
|
||||
|
||||
Args:
|
||||
transport: Transport name from the runner request or CLI.
|
||||
|
||||
Returns:
|
||||
``True`` if the requested transport is enabled.
|
||||
"""
|
||||
return all(_is_module_available(module) for module in _transport_route_dependencies(transport))
|
||||
|
||||
|
||||
def _runner_url(args: argparse.Namespace) -> str:
|
||||
"""Return the browser URL for the runner prebuilt client."""
|
||||
return f"http://{args.host}:{args.port}"
|
||||
|
||||
|
||||
def _transport_status_lists() -> tuple[list[str], list[str]]:
|
||||
"""Return enabled and disabled transport labels for the startup banner."""
|
||||
transports = ["daily", "webrtc", "telephony", "websocket"]
|
||||
enabled = []
|
||||
disabled = []
|
||||
|
||||
for label in transports:
|
||||
if _transport_routes_enabled(label):
|
||||
enabled.append(label)
|
||||
else:
|
||||
disabled.append(f"{label} ({TRANSPORT_INSTALL_HINTS[label]})")
|
||||
|
||||
return enabled, disabled
|
||||
|
||||
|
||||
def _format_transport_status(labels: list[str]) -> str:
|
||||
"""Format a startup banner transport status list."""
|
||||
return ", ".join(labels) if labels else "none"
|
||||
|
||||
|
||||
def _print_startup_message(args: argparse.Namespace):
|
||||
"""Print connection information for the development runner."""
|
||||
print()
|
||||
if args.transport is None:
|
||||
enabled, disabled = _transport_status_lists()
|
||||
print("🚀 Bot ready!")
|
||||
print(f" → Open: {_runner_url(args)}")
|
||||
print(f" → Enabled transports: {_format_transport_status(enabled)}")
|
||||
if disabled:
|
||||
print(f" → Disabled transports: {_format_transport_status(disabled)}")
|
||||
elif args.transport == "webrtc":
|
||||
if args.esp32:
|
||||
print("🚀 Bot ready! (ESP32 mode)")
|
||||
elif args.whatsapp:
|
||||
print("🚀 Bot ready! (WhatsApp)")
|
||||
else:
|
||||
print("🚀 Bot ready! (WebRTC)")
|
||||
if _transport_routes_enabled("webrtc"):
|
||||
print(f" → Open: {_runner_url(args)}")
|
||||
else:
|
||||
print(f" → WebRTC disabled ({TRANSPORT_INSTALL_HINTS['webrtc']})")
|
||||
elif args.transport == "daily":
|
||||
print("🚀 Bot ready! (Daily)")
|
||||
if not _transport_routes_enabled("daily"):
|
||||
print(f" → Daily disabled ({TRANSPORT_INSTALL_HINTS['daily']})")
|
||||
else:
|
||||
print(f" → Open: {_runner_url(args)}")
|
||||
if args.dialin:
|
||||
print(
|
||||
f" → Daily dial-in webhook: "
|
||||
f"http://{args.host}:{args.port}/daily-dialin-webhook"
|
||||
)
|
||||
print(" → Configure this URL in your Daily phone number settings")
|
||||
elif args.transport in TELEPHONY_TRANSPORTS:
|
||||
print(f"🚀 Bot ready! ({args.transport.capitalize()})")
|
||||
if not _transport_routes_enabled(args.transport):
|
||||
print(f" → Telephony disabled ({TRANSPORT_INSTALL_HINTS['telephony']})")
|
||||
else:
|
||||
print(f" → Open: {_runner_url(args)}")
|
||||
if args.proxy:
|
||||
print(f" → XML webhook: http://{args.host}:{args.port}/")
|
||||
print(f" → WebSocket: ws://{args.host}:{args.port}/ws")
|
||||
elif args.transport == "vonage":
|
||||
print()
|
||||
print("🚀 Bot ready!")
|
||||
print()
|
||||
|
||||
|
||||
def _get_bot_module():
|
||||
"""Get the bot module from the calling script."""
|
||||
import importlib.util
|
||||
@@ -227,6 +354,8 @@ async def _run_websocket_bot(websocket: WebSocket, args: argparse.Namespace):
|
||||
|
||||
def _setup_websocket_routes(app: FastAPI, args: argparse.Namespace):
|
||||
"""Set up the plain WebSocket route at ``/ws-client``."""
|
||||
if not _transport_routes_enabled("websocket"):
|
||||
return
|
||||
|
||||
@app.websocket("/ws-client")
|
||||
async def websocket_client_endpoint(websocket: WebSocket):
|
||||
@@ -338,6 +467,15 @@ def _setup_unified_start_route(
|
||||
),
|
||||
)
|
||||
|
||||
if not _transport_routes_enabled(transport):
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=(
|
||||
f"Transport '{transport}' is disabled in this runner environment. "
|
||||
"Check the startup banner for enabled transports."
|
||||
),
|
||||
)
|
||||
|
||||
if transport == "webrtc":
|
||||
# WebRTC: register the session; the bot starts when the WebRTC offer arrives.
|
||||
session_id = str(uuid.uuid4())
|
||||
@@ -471,6 +609,9 @@ def _setup_webrtc_routes(
|
||||
app: FastAPI, args: argparse.Namespace, active_sessions: dict[str, dict[str, Any]]
|
||||
):
|
||||
"""Set up WebRTC-specific routes."""
|
||||
if not _transport_routes_enabled("webrtc"):
|
||||
return
|
||||
|
||||
try:
|
||||
from pipecat.transports.smallwebrtc.connection import SmallWebRTCConnection
|
||||
from pipecat.transports.smallwebrtc.request_handler import (
|
||||
@@ -480,7 +621,7 @@ def _setup_webrtc_routes(
|
||||
SmallWebRTCRequestHandler,
|
||||
)
|
||||
except ImportError as e:
|
||||
logger.error(f"WebRTC transport dependencies not installed: {e}")
|
||||
logger.warning(f"WebRTC routes disabled after dependency check passed: {e}")
|
||||
return
|
||||
|
||||
@app.get("/files/{filename:path}")
|
||||
@@ -765,6 +906,8 @@ def _setup_whatsapp_routes(app: FastAPI, args: argparse.Namespace):
|
||||
|
||||
def _setup_daily_routes(app: FastAPI, args: argparse.Namespace):
|
||||
"""Set up Daily-specific routes."""
|
||||
if not _transport_routes_enabled("daily"):
|
||||
return
|
||||
|
||||
@app.get("/daily")
|
||||
async def create_room_and_start_agent():
|
||||
@@ -908,6 +1051,9 @@ def _setup_telephony_routes(app: FastAPI, args: argparse.Namespace):
|
||||
specific telephony transport is chosen via ``-t`` because the XML template
|
||||
is provider-specific and requires a proxy hostname (``--proxy``).
|
||||
"""
|
||||
if not _transport_routes_enabled("telephony"):
|
||||
return
|
||||
|
||||
if args.transport in TELEPHONY_TRANSPORTS:
|
||||
# XML response templates (Exotel doesn't use XML webhooks)
|
||||
XML_TEMPLATES = {
|
||||
@@ -1160,43 +1306,11 @@ def main(parser: argparse.ArgumentParser | None = None):
|
||||
return
|
||||
|
||||
# Print startup message
|
||||
print()
|
||||
if args.transport is None:
|
||||
print("🚀 Bot ready!")
|
||||
print(f" → WebRTC: http://{args.host}:{args.port}/client")
|
||||
print(f" → Daily: http://{args.host}:{args.port}/daily")
|
||||
print(f" → Telephony: ws://{args.host}:{args.port}/ws")
|
||||
elif args.transport == "webrtc":
|
||||
if args.esp32:
|
||||
print("🚀 Bot ready! (ESP32 mode)")
|
||||
elif args.whatsapp:
|
||||
print("🚀 Bot ready! (WhatsApp)")
|
||||
else:
|
||||
print("🚀 Bot ready! (WebRTC)")
|
||||
print(f" → Open http://{args.host}:{args.port}/client in your browser")
|
||||
elif args.transport == "daily":
|
||||
print("🚀 Bot ready! (Daily)")
|
||||
if args.dialin:
|
||||
print(
|
||||
f" → Daily dial-in webhook: http://{args.host}:{args.port}/daily-dialin-webhook"
|
||||
)
|
||||
print(f" → Configure this URL in your Daily phone number settings")
|
||||
else:
|
||||
print(
|
||||
f" → Open http://{args.host}:{args.port}/daily in your browser to start a session"
|
||||
)
|
||||
elif args.transport in TELEPHONY_TRANSPORTS:
|
||||
print(f"🚀 Bot ready! ({args.transport.capitalize()})")
|
||||
if args.proxy:
|
||||
print(f" → XML webhook: http://{args.host}:{args.port}/")
|
||||
print(f" → WebSocket: ws://{args.host}:{args.port}/ws")
|
||||
elif args.transport == "vonage":
|
||||
print()
|
||||
print(f"🚀 Bot ready!")
|
||||
_print_startup_message(args)
|
||||
if args.transport == "vonage":
|
||||
asyncio.run(_run_vonage())
|
||||
print()
|
||||
return
|
||||
print()
|
||||
|
||||
RUNNER_DOWNLOADS_FOLDER = args.folder
|
||||
RUNNER_HOST = args.host
|
||||
|
||||
@@ -586,9 +586,9 @@ class AssemblyAISTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.debug(f"{self} Connected to AssemblyAI WebSocket")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
self._connected = False
|
||||
await self.push_error(error_msg=f"Unable to connect to AssemblyAI: {e}", exception=e)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the websocket connection to AssemblyAI."""
|
||||
|
||||
@@ -339,10 +339,10 @@ class AWSTranscribeSTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.info(f"{self} Successfully connected to AWS Transcribe")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
await self.push_error(
|
||||
error_msg=f"Unable to connect to AWS Transcribe: {e}", exception=e
|
||||
)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the websocket connection to AWS Transcribe."""
|
||||
|
||||
@@ -354,7 +354,8 @@ class CartesiaSTTService(WebsocketSTTService):
|
||||
self._websocket = await websocket_connect(ws_url, additional_headers=headers)
|
||||
await self._call_event_handler("on_connected")
|
||||
except Exception as e:
|
||||
await self.push_error(error_msg=f"Unknown error occurred: {e}", exception=e)
|
||||
self._websocket = None
|
||||
await self.push_error(error_msg=f"Unable to connect to Cartesia: {e}", exception=e)
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
ws = self._websocket
|
||||
|
||||
@@ -823,6 +823,7 @@ class ElevenLabsRealtimeSTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.debug("Connected to ElevenLabs Realtime STT")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
await self.push_error(
|
||||
error_msg=f"Unable to connect to ElevenLabs Realtime STT: {e}", exception=e
|
||||
)
|
||||
|
||||
@@ -594,6 +594,10 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
self._partial_word_start_time = 0.0
|
||||
self._alignment_started_context_ids: set[str | None] = set()
|
||||
|
||||
# Context IDs whose context-init has been sent, so the keepalive knows
|
||||
# which contexts are safe to target.
|
||||
self._context_init_sent: set[str] = set()
|
||||
|
||||
# Context management for v1 multi API
|
||||
self._receive_task = None
|
||||
self._keepalive_task = None
|
||||
@@ -792,6 +796,7 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
finally:
|
||||
await self.remove_active_audio_context()
|
||||
self._websocket = None
|
||||
self._context_init_sent.clear()
|
||||
await self._call_event_handler("on_disconnected")
|
||||
|
||||
def _get_websocket(self):
|
||||
@@ -822,6 +827,7 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
self._partial_word = ""
|
||||
self._partial_word_start_time = 0.0
|
||||
self._alignment_started_context_ids.discard(context_id)
|
||||
self._context_init_sent.discard(context_id)
|
||||
|
||||
async def on_audio_context_interrupted(self, context_id: str):
|
||||
"""Close the ElevenLabs context when the bot is interrupted."""
|
||||
@@ -914,26 +920,35 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
while True:
|
||||
await asyncio.sleep(KEEPALIVE_SLEEP)
|
||||
try:
|
||||
if self._websocket and self._websocket.state is State.OPEN:
|
||||
context_id = self.get_active_audio_context_id()
|
||||
if context_id:
|
||||
# Send keepalive with context ID to keep the connection alive
|
||||
keepalive_message = {
|
||||
"text": "",
|
||||
"context_id": context_id,
|
||||
}
|
||||
logger.trace(f"Sending keepalive for context {context_id}")
|
||||
else:
|
||||
# It's possible to have a user interruption which clears the context
|
||||
# without generating a new TTS response. In this case, we'll just send
|
||||
# an empty message to keep the connection alive.
|
||||
keepalive_message = {"text": ""}
|
||||
logger.trace("Sending keepalive without context")
|
||||
await self._websocket.send(json.dumps(keepalive_message))
|
||||
await self._send_keepalive()
|
||||
except websockets.ConnectionClosed as e:
|
||||
logger.warning(f"{self} keepalive error: {e}")
|
||||
break
|
||||
|
||||
async def _send_keepalive(self):
|
||||
"""Send a single keepalive message to keep the WebSocket connection alive.
|
||||
|
||||
Only stamps a ``context_id`` once its context-init (carrying
|
||||
``voice_settings``) has been sent. Otherwise the keepalive would be the
|
||||
context's first message, with no ``voice_settings``, and ElevenLabs would
|
||||
reject the later context-init with a 1008 policy violation. A context-less
|
||||
keepalive is sufficient until the context-init is sent.
|
||||
"""
|
||||
if not self._websocket or self._websocket.state is not State.OPEN:
|
||||
return
|
||||
|
||||
context_id = self.get_active_audio_context_id()
|
||||
if context_id and context_id in self._context_init_sent:
|
||||
# The context's voice_settings context-init has been sent, so it's
|
||||
# safe to keep that context alive.
|
||||
keepalive_message = {"text": "", "context_id": context_id}
|
||||
else:
|
||||
# No active context, or the active context's context-init hasn't been
|
||||
# sent yet. A context-less keepalive keeps the connection alive without
|
||||
# opening the context prematurely.
|
||||
keepalive_message = {"text": ""}
|
||||
await self._websocket.send(json.dumps(keepalive_message))
|
||||
|
||||
async def _send_text(self, text: str, context_id: str):
|
||||
"""Send text to the WebSocket for synthesis."""
|
||||
if self._websocket and context_id:
|
||||
@@ -980,6 +995,9 @@ class ElevenLabsTTSService(WebsocketTTSService):
|
||||
locator.model_dump()
|
||||
for locator in self._pronunciation_dictionary_locators
|
||||
]
|
||||
# Mark the context-init as sent so the keepalive may now
|
||||
# target this context_id.
|
||||
self._context_init_sent.add(context_id)
|
||||
await self._websocket.send(json.dumps(msg))
|
||||
logger.trace(f"Created new context {context_id}")
|
||||
|
||||
|
||||
@@ -558,8 +558,9 @@ class GladiaSTTService(WebsocketSTTService):
|
||||
|
||||
logger.debug(f"{self} Connected to Gladia WebSocket")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
self._connection_active = False
|
||||
await self.push_error(error_msg=f"Unable to connect to Gladia: {e}", exception=e)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the websocket connection to Gladia."""
|
||||
|
||||
@@ -423,8 +423,8 @@ class GradiumSTTService(WebsocketSTTService):
|
||||
logger.debug("Connected to Gradium STT")
|
||||
|
||||
except Exception as e:
|
||||
await self.push_error(error_msg=f"Unknown error occurred: {e}", exception=e)
|
||||
raise
|
||||
self._websocket = None
|
||||
await self.push_error(error_msg=f"Unable to connect to Gradium: {e}", exception=e)
|
||||
|
||||
async def _disconnect(self):
|
||||
await super()._disconnect()
|
||||
|
||||
0
src/pipecat/services/inception/__init__.py
Normal file
0
src/pipecat/services/inception/__init__.py
Normal file
124
src/pipecat/services/inception/llm.py
Normal file
124
src/pipecat/services/inception/llm.py
Normal file
@@ -0,0 +1,124 @@
|
||||
#
|
||||
# Copyright (c) 2024-2026, Daily
|
||||
#
|
||||
# SPDX-License-Identifier: BSD 2-Clause License
|
||||
#
|
||||
|
||||
"""Inception LLM service implementation using OpenAI-compatible interface."""
|
||||
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Literal
|
||||
|
||||
from loguru import logger
|
||||
|
||||
from pipecat.adapters.services.open_ai_adapter import OpenAILLMInvocationParams
|
||||
from pipecat.services.openai.base_llm import BaseOpenAILLMService
|
||||
from pipecat.services.openai.llm import OpenAILLMService
|
||||
from pipecat.services.settings import NOT_GIVEN as _NOT_GIVEN
|
||||
from pipecat.services.settings import _NotGiven, is_given
|
||||
|
||||
|
||||
@dataclass
|
||||
class InceptionLLMSettings(BaseOpenAILLMService.Settings):
|
||||
"""Settings for InceptionLLMService.
|
||||
|
||||
Parameters:
|
||||
reasoning_effort: Controls how much reasoning the model applies.
|
||||
One of "instant", "low", "medium", or "high". When unset, the
|
||||
parameter is omitted and Inception's server-side default applies.
|
||||
realtime: When True, reduces time to first diffusion block (TTFT).
|
||||
"""
|
||||
|
||||
reasoning_effort: Literal["instant", "low", "medium", "high"] | None | _NotGiven = field(
|
||||
default_factory=lambda: _NOT_GIVEN
|
||||
)
|
||||
realtime: bool | None | _NotGiven = field(default_factory=lambda: _NOT_GIVEN)
|
||||
|
||||
|
||||
class InceptionLLMService(OpenAILLMService):
|
||||
"""A service for interacting with Inception's API using the OpenAI-compatible interface.
|
||||
|
||||
This service extends OpenAILLMService to connect to Inception's API endpoint while
|
||||
maintaining full compatibility with OpenAI's interface and functionality.
|
||||
Supports Mercury-2, Inception's diffusion-based reasoning model.
|
||||
"""
|
||||
|
||||
# Inception doesn't support the "developer" message role.
|
||||
supports_developer_role = False
|
||||
|
||||
Settings = InceptionLLMSettings
|
||||
_settings: Settings
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
api_key: str,
|
||||
base_url: str = "https://api.inceptionlabs.ai/v1",
|
||||
settings: Settings | None = None,
|
||||
**kwargs,
|
||||
):
|
||||
"""Initialize the Inception LLM service.
|
||||
|
||||
Args:
|
||||
api_key: The API key for accessing Inception's API.
|
||||
base_url: The base URL for Inception API. Defaults to "https://api.inceptionlabs.ai/v1".
|
||||
settings: Runtime-updatable settings.
|
||||
**kwargs: Additional keyword arguments passed to OpenAILLMService.
|
||||
"""
|
||||
default_settings = self.Settings(
|
||||
model="mercury-2",
|
||||
reasoning_effort=None,
|
||||
realtime=None,
|
||||
)
|
||||
|
||||
if settings is not None:
|
||||
default_settings.apply_update(settings)
|
||||
|
||||
super().__init__(api_key=api_key, base_url=base_url, settings=default_settings, **kwargs)
|
||||
|
||||
def create_client(self, api_key=None, base_url=None, **kwargs):
|
||||
"""Create OpenAI-compatible client for Inception API endpoint.
|
||||
|
||||
Args:
|
||||
api_key: The API key for authentication. If None, uses instance default.
|
||||
base_url: The base URL for the API. If None, uses instance default.
|
||||
**kwargs: Additional keyword arguments for client configuration.
|
||||
|
||||
Returns:
|
||||
An OpenAI-compatible client configured for Inception's API.
|
||||
"""
|
||||
logger.debug(f"Creating Inception client with api {base_url}")
|
||||
return super().create_client(api_key, base_url, **kwargs)
|
||||
|
||||
def build_chat_completion_params(self, params_from_context: OpenAILLMInvocationParams) -> dict:
|
||||
"""Build parameters for Inception chat completion request.
|
||||
|
||||
Extends the base OpenAI parameters with Inception-specific options
|
||||
such as reasoning_effort and realtime.
|
||||
|
||||
Args:
|
||||
params_from_context: Parameters, derived from the LLM context, to
|
||||
use for the chat completion. Contains messages, tools, and tool
|
||||
choice.
|
||||
|
||||
Returns:
|
||||
Dictionary of parameters for the chat completion request.
|
||||
"""
|
||||
params = super().build_chat_completion_params(params_from_context)
|
||||
|
||||
if (
|
||||
is_given(self._settings.reasoning_effort)
|
||||
and self._settings.reasoning_effort is not None
|
||||
):
|
||||
params["reasoning_effort"] = self._settings.reasoning_effort
|
||||
|
||||
# realtime is Inception-specific and unknown to the OpenAI SDK,
|
||||
# so it must be passed via extra_body to avoid validation errors.
|
||||
extra_body = {}
|
||||
if is_given(self._settings.realtime) and self._settings.realtime is not None:
|
||||
extra_body["realtime"] = self._settings.realtime
|
||||
|
||||
if extra_body:
|
||||
params["extra_body"] = extra_body
|
||||
|
||||
return params
|
||||
@@ -155,7 +155,6 @@ def language_to_soniox_language(language: Language) -> str:
|
||||
Language.ID: "id",
|
||||
Language.IT: "it",
|
||||
Language.JA: "ja",
|
||||
Language.KA: "ka",
|
||||
Language.KK: "kk",
|
||||
Language.KN: "kn",
|
||||
Language.KO: "ko",
|
||||
@@ -232,6 +231,7 @@ class SonioxSTTSettings(STTSettings):
|
||||
context_version 2.
|
||||
enable_speaker_diarization: Whether to enable speaker diarization.
|
||||
enable_language_identification: Whether to enable language identification.
|
||||
max_endpoint_delay_ms: Max ms before endpoint detection finalizes the turn (500-3000).
|
||||
client_reference_id: Client reference ID to use for transcription.
|
||||
"""
|
||||
|
||||
@@ -242,6 +242,7 @@ class SonioxSTTSettings(STTSettings):
|
||||
enable_language_identification: bool | None | _NotGiven = field(
|
||||
default_factory=lambda: NOT_GIVEN
|
||||
)
|
||||
max_endpoint_delay_ms: int | None | _NotGiven = field(default_factory=lambda: NOT_GIVEN)
|
||||
client_reference_id: str | None | _NotGiven = field(default_factory=lambda: NOT_GIVEN)
|
||||
|
||||
|
||||
@@ -309,6 +310,7 @@ class SonioxSTTService(WebsocketSTTService):
|
||||
context=None,
|
||||
enable_speaker_diarization=False,
|
||||
enable_language_identification=False,
|
||||
max_endpoint_delay_ms=None,
|
||||
client_reference_id=None,
|
||||
)
|
||||
|
||||
@@ -390,8 +392,7 @@ class SonioxSTTService(WebsocketSTTService):
|
||||
changed = await super()._update_settings(delta)
|
||||
|
||||
if changed:
|
||||
await self._disconnect()
|
||||
await self._connect()
|
||||
await self._request_reconnect()
|
||||
|
||||
return changed
|
||||
|
||||
@@ -522,6 +523,7 @@ class SonioxSTTService(WebsocketSTTService):
|
||||
"audio_format": self._audio_format,
|
||||
"num_channels": self._num_channels,
|
||||
"enable_endpoint_detection": enable_endpoint_detection,
|
||||
"max_endpoint_delay_ms": s.max_endpoint_delay_ms,
|
||||
"sample_rate": self.sample_rate,
|
||||
"language_hints": _prepare_language_hints(assert_given(s.language_hints)),
|
||||
"language_hints_strict": s.language_hints_strict,
|
||||
@@ -537,8 +539,8 @@ class SonioxSTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.debug("Connected to Soniox STT")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
await self.push_error(error_msg=f"Unable to connect to Soniox: {e}", exception=e)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the websocket connection to Soniox."""
|
||||
|
||||
@@ -44,17 +44,15 @@ GLADIA_TTFS_P99: float = 1.49
|
||||
GOOGLE_TTFS_P99: float = 1.57
|
||||
GRADIUM_TTFS_P99: float = 1.61
|
||||
GROQ_TTFS_P99: float = 1.54
|
||||
MISTRAL_TTFS_P99: float = 1.89
|
||||
OPENAI_TTFS_P99: float = 2.01
|
||||
OPENAI_REALTIME_TTFS_P99: float = 1.66
|
||||
SARVAM_TTFS_P99: float = 1.17
|
||||
SMALLEST_TTFS_P99: float = 1.59
|
||||
SONIOX_TTFS_P99: float = 0.35
|
||||
SPEECHMATICS_TTFS_P99: float = 0.74
|
||||
XAI_TTFS_P99: float = 2.14
|
||||
|
||||
# These services run locally and should be replaced with measured values
|
||||
NVIDIA_TTFS_P99: float = DEFAULT_TTFS_P99
|
||||
WHISPER_TTFS_P99: float = DEFAULT_TTFS_P99
|
||||
|
||||
# No benchmark available yet; using conservative default
|
||||
MISTRAL_TTFS_P99: float = DEFAULT_TTFS_P99
|
||||
SMALLEST_TTFS_P99: float = DEFAULT_TTFS_P99
|
||||
XAI_TTFS_P99: float = DEFAULT_TTFS_P99
|
||||
|
||||
@@ -76,7 +76,9 @@ class WebsocketService(ABC):
|
||||
logger.warning(f"{self} reconnecting (attempt: {attempt_number})")
|
||||
await self._disconnect_websocket()
|
||||
await self._connect_websocket()
|
||||
return await self._verify_connection()
|
||||
if not await self._verify_connection():
|
||||
raise ConnectionError(f"{self} websocket reconnection failed verification")
|
||||
return True
|
||||
|
||||
async def _try_reconnect(
|
||||
self,
|
||||
|
||||
@@ -293,8 +293,9 @@ class XAISTTService(WebsocketSTTService):
|
||||
await self._call_event_handler("on_connected")
|
||||
logger.debug(f"{self} connected to xAI STT WebSocket")
|
||||
except Exception as e:
|
||||
self._websocket = None
|
||||
self._session_ready.clear()
|
||||
await self.push_error(error_msg=f"Unable to connect to xAI STT: {e}", exception=e)
|
||||
raise
|
||||
|
||||
async def _disconnect_websocket(self):
|
||||
"""Close the WebSocket connection."""
|
||||
|
||||
@@ -86,7 +86,6 @@ class WordCompletionTracker:
|
||||
self._overflow_word: str | None = None
|
||||
self._llm_consumed: str | None = None
|
||||
self._frame_word: str | None = None
|
||||
logger.debug(f"WordCompletionTracker: {self._tts_normalized}")
|
||||
|
||||
@staticmethod
|
||||
def _normalize(text: str) -> str:
|
||||
|
||||
45
tests/test_cartesia_stt.py
Normal file
45
tests/test_cartesia_stt.py
Normal file
@@ -0,0 +1,45 @@
|
||||
#
|
||||
# Copyright (c) 2024-2026, Daily
|
||||
#
|
||||
# SPDX-License-Identifier: BSD 2-Clause License
|
||||
#
|
||||
|
||||
from unittest.mock import AsyncMock
|
||||
|
||||
import pytest
|
||||
from websockets.protocol import State
|
||||
|
||||
from pipecat.services.cartesia.stt import CartesiaSTTService
|
||||
|
||||
|
||||
class _FakeWebsocket:
|
||||
def __init__(self, *, state=State.OPEN, send_side_effect=None):
|
||||
self.state = state
|
||||
self.send = AsyncMock(side_effect=send_side_effect)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cartesia_connect_failure_clears_stale_websocket(monkeypatch):
|
||||
async def fake_websocket_connect(*args, **kwargs):
|
||||
raise RuntimeError("connection failed")
|
||||
|
||||
monkeypatch.setattr("pipecat.services.cartesia.stt.websocket_connect", fake_websocket_connect)
|
||||
|
||||
service = CartesiaSTTService(api_key="test-key", sample_rate=16000)
|
||||
service._websocket = _FakeWebsocket(state=State.CLOSED)
|
||||
|
||||
await service._connect_websocket()
|
||||
|
||||
assert service._websocket is None
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cartesia_run_stt_logs_send_failure_without_clearing_websocket():
|
||||
service = CartesiaSTTService(api_key="test-key", sample_rate=16000)
|
||||
websocket = _FakeWebsocket(send_side_effect=RuntimeError("websocket closed"))
|
||||
service._websocket = websocket
|
||||
|
||||
async for _ in service.run_stt(b"\x00" * 160):
|
||||
pass
|
||||
|
||||
assert service._websocket is websocket
|
||||
@@ -6,9 +6,14 @@
|
||||
|
||||
"""Tests for ElevenLabs TTS alignment handling."""
|
||||
|
||||
import json
|
||||
from typing import Any
|
||||
|
||||
import pytest
|
||||
from websockets.protocol import State
|
||||
|
||||
from pipecat.services.elevenlabs.tts import (
|
||||
ElevenLabsTTSService,
|
||||
_select_alignment,
|
||||
_strip_utterance_leading_spaces,
|
||||
calculate_word_times,
|
||||
@@ -200,3 +205,87 @@ def test_select_alignment_works_with_http_field_names():
|
||||
)
|
||||
assert selected is not None
|
||||
assert selected["characters"] == list(" Hi")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Keepalive vs context-init race
|
||||
#
|
||||
# The keepalive must only stamp a context_id once its context-init (carrying
|
||||
# voice_settings) has been sent. Stamping it earlier makes the keepalive the
|
||||
# context's first message, with no voice_settings, and ElevenLabs rejects the
|
||||
# later context-init with a 1008 policy violation.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class _FakeWebSocket:
|
||||
"""Minimal stand-in for the ElevenLabs websocket that records sends."""
|
||||
|
||||
def __init__(self):
|
||||
self.state = State.OPEN
|
||||
self.sent: list[dict] = []
|
||||
|
||||
async def send(self, data: str):
|
||||
self.sent.append(json.loads(data))
|
||||
|
||||
|
||||
def _make_service() -> ElevenLabsTTSService:
|
||||
return ElevenLabsTTSService(
|
||||
api_key="test-key",
|
||||
settings=ElevenLabsTTSService.Settings(
|
||||
voice="test-voice",
|
||||
stability=0.55,
|
||||
similarity_boost=0.85,
|
||||
use_speaker_boost=True,
|
||||
speed=0.81,
|
||||
),
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_keepalive_does_not_stamp_context_before_init():
|
||||
"""During the pre-init window the keepalive must not stamp the new context_id."""
|
||||
service = _make_service()
|
||||
ws = _FakeWebSocket()
|
||||
service._websocket = ws
|
||||
|
||||
# Simulate the start of an LLM turn: TTSService sets the turn context id on
|
||||
# LLMFullResponseStartFrame, before run_tts sends the voice_settings init.
|
||||
service._turn_context_id = "ctx-1"
|
||||
service._playing_context_id = None
|
||||
assert "ctx-1" not in service._context_init_sent
|
||||
|
||||
await service._send_keepalive()
|
||||
|
||||
# Context-less keepalive: the real context-init stays the context's first
|
||||
# message, so ElevenLabs won't reject it with 1008.
|
||||
assert ws.sent == [{"text": ""}]
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_keepalive_stamps_context_after_init():
|
||||
"""Once the context-init has been sent, the keepalive targets that context."""
|
||||
service = _make_service()
|
||||
ws = _FakeWebSocket()
|
||||
service._websocket = ws
|
||||
service._turn_context_id = "ctx-1"
|
||||
service._playing_context_id = None
|
||||
# run_tts records the context once its voice_settings init has gone out.
|
||||
service._context_init_sent.add("ctx-1")
|
||||
|
||||
await service._send_keepalive()
|
||||
|
||||
assert ws.sent == [{"text": "", "context_id": "ctx-1"}]
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_keepalive_without_active_context_sends_empty():
|
||||
"""With no active context, the keepalive sends a plain empty message."""
|
||||
service = _make_service()
|
||||
ws = _FakeWebSocket()
|
||||
service._websocket = ws
|
||||
service._turn_context_id = None
|
||||
service._playing_context_id = None
|
||||
|
||||
await service._send_keepalive()
|
||||
|
||||
assert ws.sent == [{"text": ""}]
|
||||
|
||||
324
tests/test_runner_run.py
Normal file
324
tests/test_runner_run.py
Normal file
@@ -0,0 +1,324 @@
|
||||
#
|
||||
# Copyright (c) 2024-2026, Daily
|
||||
#
|
||||
# SPDX-License-Identifier: BSD 2-Clause License
|
||||
#
|
||||
|
||||
import argparse
|
||||
import io
|
||||
import sys
|
||||
import types
|
||||
import unittest
|
||||
from contextlib import redirect_stdout
|
||||
from unittest.mock import MagicMock, patch
|
||||
|
||||
from fastapi import FastAPI
|
||||
from fastapi.testclient import TestClient
|
||||
from pydantic import BaseModel
|
||||
|
||||
from pipecat.runner.run import (
|
||||
_print_startup_message,
|
||||
_setup_daily_routes,
|
||||
_setup_telephony_routes,
|
||||
_setup_unified_start_route,
|
||||
_setup_webrtc_routes,
|
||||
_setup_websocket_routes,
|
||||
_transport_route_dependencies,
|
||||
_transport_routes_enabled,
|
||||
)
|
||||
|
||||
|
||||
class TestRunnerRun(unittest.TestCase):
|
||||
def _capture_startup_message(self, args: argparse.Namespace) -> str:
|
||||
buffer = io.StringIO()
|
||||
with redirect_stdout(buffer):
|
||||
_print_startup_message(args)
|
||||
return buffer.getvalue()
|
||||
|
||||
def test_transport_route_dependencies_maps_transports_to_modules(self):
|
||||
self.assertEqual(_transport_route_dependencies("daily"), ("daily",))
|
||||
self.assertEqual(_transport_route_dependencies("webrtc"), ("aiortc",))
|
||||
self.assertEqual(_transport_route_dependencies("websocket"), ("fastapi", "websockets"))
|
||||
self.assertEqual(_transport_route_dependencies("telephony"), ("fastapi", "websockets"))
|
||||
self.assertEqual(_transport_route_dependencies("twilio"), ("fastapi", "websockets"))
|
||||
self.assertEqual(_transport_route_dependencies("telnyx"), ("fastapi", "websockets"))
|
||||
self.assertEqual(_transport_route_dependencies("plivo"), ("fastapi", "websockets"))
|
||||
self.assertEqual(_transport_route_dependencies("exotel"), ("fastapi", "websockets"))
|
||||
self.assertEqual(_transport_route_dependencies("vonage"), ())
|
||||
|
||||
def test_transport_routes_enabled_maps_transports_to_dependency_checks(self):
|
||||
def module_available(module: str) -> bool:
|
||||
return module in {"fastapi", "websockets"}
|
||||
|
||||
with patch("pipecat.runner.run._is_module_available", side_effect=module_available):
|
||||
self.assertFalse(_transport_routes_enabled("daily"))
|
||||
self.assertFalse(_transport_routes_enabled("webrtc"))
|
||||
self.assertTrue(_transport_routes_enabled("websocket"))
|
||||
self.assertTrue(_transport_routes_enabled("telephony"))
|
||||
self.assertTrue(_transport_routes_enabled("twilio"))
|
||||
self.assertTrue(_transport_routes_enabled("vonage"))
|
||||
|
||||
def test_setup_webrtc_routes_skips_when_aiortc_is_missing(self):
|
||||
"""WebRTC routes should be optional when the webrtc extra is not installed."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(folder=None, esp32=False, host="localhost")
|
||||
|
||||
with (
|
||||
patch("pipecat.runner.run._transport_routes_enabled", return_value=False),
|
||||
patch("pipecat.runner.run.logger") as logger,
|
||||
):
|
||||
_setup_webrtc_routes(app, args, {})
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertNotIn("/api/offer", paths)
|
||||
logger.info.assert_not_called()
|
||||
|
||||
def test_setup_webrtc_routes_registers_routes_when_webrtc_is_available(self):
|
||||
"""WebRTC routes should be registered when dependencies are available."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(folder=None, esp32=False, host="localhost")
|
||||
|
||||
connection_module = types.ModuleType("pipecat.transports.smallwebrtc.connection")
|
||||
connection_module.SmallWebRTCConnection = MagicMock()
|
||||
|
||||
request_handler_module = types.ModuleType("pipecat.transports.smallwebrtc.request_handler")
|
||||
|
||||
class IceCandidate(BaseModel):
|
||||
candidate: str
|
||||
sdp_mid: str
|
||||
sdp_mline_index: int
|
||||
|
||||
class SmallWebRTCPatchRequest(BaseModel):
|
||||
pc_id: str
|
||||
candidates: list[IceCandidate] = []
|
||||
|
||||
class SmallWebRTCRequest(BaseModel):
|
||||
sdp: str
|
||||
type: str
|
||||
pc_id: str | None = None
|
||||
restart_pc: bool | None = None
|
||||
request_data: dict | None = None
|
||||
|
||||
request_handler_module.IceCandidate = IceCandidate
|
||||
request_handler_module.SmallWebRTCPatchRequest = SmallWebRTCPatchRequest
|
||||
request_handler_module.SmallWebRTCRequest = SmallWebRTCRequest
|
||||
|
||||
class MockSmallWebRTCRequestHandler:
|
||||
def __init__(self, *args, **kwargs):
|
||||
pass
|
||||
|
||||
async def close(self):
|
||||
pass
|
||||
|
||||
request_handler_module.SmallWebRTCRequestHandler = MockSmallWebRTCRequestHandler
|
||||
|
||||
with (
|
||||
patch("pipecat.runner.run._transport_routes_enabled", return_value=True),
|
||||
patch.dict(
|
||||
sys.modules,
|
||||
{
|
||||
"pipecat.transports.smallwebrtc.connection": connection_module,
|
||||
"pipecat.transports.smallwebrtc.request_handler": request_handler_module,
|
||||
},
|
||||
),
|
||||
):
|
||||
_setup_webrtc_routes(app, args, {})
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertIn("/api/offer", paths)
|
||||
self.assertIn("/files/{filename:path}", paths)
|
||||
|
||||
def test_setup_websocket_routes_skips_when_websocket_is_missing(self):
|
||||
"""Plain WebSocket routes should be optional."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace()
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=False):
|
||||
_setup_websocket_routes(app, args)
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertNotIn("/ws-client", paths)
|
||||
|
||||
def test_setup_websocket_routes_registers_when_websocket_is_available(self):
|
||||
"""Plain WebSocket route should be registered when dependencies are available."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace()
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
_setup_websocket_routes(app, args)
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertIn("/ws-client", paths)
|
||||
|
||||
def test_setup_telephony_routes_skips_when_websocket_is_missing(self):
|
||||
"""Telephony WebSocket routes should be optional."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(transport=None)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=False):
|
||||
_setup_telephony_routes(app, args)
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertNotIn("/ws", paths)
|
||||
|
||||
def test_setup_telephony_routes_registers_when_websocket_is_available(self):
|
||||
"""Telephony WebSocket route should be registered when dependencies are available."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(transport=None)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
_setup_telephony_routes(app, args)
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertIn("/ws", paths)
|
||||
|
||||
def test_setup_telephony_routes_registers_provider_webhook_for_selected_transport(self):
|
||||
"""Provider webhook route should be registered for selected telephony transports."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(transport="twilio", proxy="example.ngrok.io")
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
_setup_telephony_routes(app, args)
|
||||
|
||||
post_root_routes = [
|
||||
route for route in app.routes if route.path == "/" and "POST" in route.methods
|
||||
]
|
||||
self.assertEqual(len(post_root_routes), 1)
|
||||
|
||||
def test_setup_daily_routes_skips_when_daily_is_missing(self):
|
||||
"""Daily routes should be optional."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(dialin=False)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=False):
|
||||
_setup_daily_routes(app, args)
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertNotIn("/daily", paths)
|
||||
|
||||
def test_setup_daily_routes_registers_when_daily_is_available(self):
|
||||
"""Daily route should be registered when dependencies are available."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(dialin=False)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
_setup_daily_routes(app, args)
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertIn("/daily", paths)
|
||||
|
||||
def test_setup_daily_routes_registers_dialin_route_when_enabled(self):
|
||||
"""Daily dial-in route should be registered when requested and available."""
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(dialin=True)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
_setup_daily_routes(app, args)
|
||||
|
||||
paths = {route.path for route in app.routes}
|
||||
self.assertIn("/daily", paths)
|
||||
self.assertIn("/daily-dialin-webhook", paths)
|
||||
|
||||
def test_websocket_routes_require_fastapi_and_websockets(self):
|
||||
with patch(
|
||||
"pipecat.runner.run._is_module_available",
|
||||
side_effect=lambda module: module == "fastapi",
|
||||
) as is_module_available:
|
||||
self.assertFalse(_transport_routes_enabled("websocket"))
|
||||
|
||||
self.assertEqual(
|
||||
[call.args[0] for call in is_module_available.call_args_list],
|
||||
["fastapi", "websockets"],
|
||||
)
|
||||
|
||||
def test_start_rejects_disabled_transport_before_running_bot(self):
|
||||
app = FastAPI()
|
||||
args = argparse.Namespace(transport=None)
|
||||
_setup_unified_start_route(app, args, {})
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=False):
|
||||
response = TestClient(app).post("/start", json={"transport": "daily"})
|
||||
|
||||
self.assertEqual(response.status_code, 400)
|
||||
self.assertEqual(
|
||||
response.json()["detail"],
|
||||
(
|
||||
"Transport 'daily' is disabled in this runner environment. "
|
||||
"Check the startup banner for enabled transports."
|
||||
),
|
||||
)
|
||||
|
||||
def test_startup_message_all_transports_shows_open_url_and_transport_status(self):
|
||||
args = argparse.Namespace(transport=None, host="localhost", port=7860)
|
||||
|
||||
def routes_enabled(transport: str) -> bool:
|
||||
return transport in {"telephony", "websocket"}
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", side_effect=routes_enabled):
|
||||
output = self._capture_startup_message(args)
|
||||
|
||||
self.assertEqual(
|
||||
output,
|
||||
(
|
||||
"\n"
|
||||
"🚀 Bot ready!\n"
|
||||
" → Open: http://localhost:7860\n"
|
||||
" → Enabled transports: telephony, websocket\n"
|
||||
" → Disabled transports: daily (install pipecat-ai[daily]), "
|
||||
"webrtc (install pipecat-ai[webrtc])\n"
|
||||
"\n"
|
||||
),
|
||||
)
|
||||
|
||||
def test_startup_message_all_transports_omits_disabled_status_when_all_enabled(self):
|
||||
args = argparse.Namespace(transport=None, host="localhost", port=7860)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
output = self._capture_startup_message(args)
|
||||
|
||||
self.assertEqual(
|
||||
output,
|
||||
(
|
||||
"\n"
|
||||
"🚀 Bot ready!\n"
|
||||
" → Open: http://localhost:7860\n"
|
||||
" → Enabled transports: daily, webrtc, telephony, websocket\n"
|
||||
"\n"
|
||||
),
|
||||
)
|
||||
|
||||
def test_startup_message_webrtc_uses_root_open_url(self):
|
||||
args = argparse.Namespace(
|
||||
transport="webrtc", host="localhost", port=7860, esp32=False, whatsapp=False
|
||||
)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
output = self._capture_startup_message(args)
|
||||
|
||||
self.assertIn(" → Open: http://localhost:7860\n", output)
|
||||
self.assertNotIn("/client", output)
|
||||
|
||||
def test_startup_message_daily_uses_root_open_url(self):
|
||||
args = argparse.Namespace(transport="daily", host="localhost", port=7860, dialin=False)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
output = self._capture_startup_message(args)
|
||||
|
||||
self.assertIn(" → Open: http://localhost:7860\n", output)
|
||||
self.assertNotIn("/daily in your browser", output)
|
||||
|
||||
def test_startup_message_telephony_keeps_provider_endpoint_details(self):
|
||||
args = argparse.Namespace(
|
||||
transport="twilio", host="localhost", port=7860, proxy="example.ngrok.io"
|
||||
)
|
||||
|
||||
with patch("pipecat.runner.run._transport_routes_enabled", return_value=True):
|
||||
output = self._capture_startup_message(args)
|
||||
|
||||
self.assertIn(" → Open: http://localhost:7860\n", output)
|
||||
self.assertIn(" → XML webhook: http://localhost:7860/\n", output)
|
||||
self.assertIn(" → WebSocket: ws://localhost:7860/ws\n", output)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
unittest.main()
|
||||
@@ -5,8 +5,10 @@
|
||||
#
|
||||
|
||||
import json
|
||||
from unittest.mock import AsyncMock
|
||||
|
||||
import pytest
|
||||
from websockets.protocol import State
|
||||
|
||||
from pipecat.frames.frames import TranscriptionFrame
|
||||
from pipecat.services.soniox.stt import END_TOKEN, SonioxSTTService, _language_from_tokens
|
||||
@@ -14,8 +16,10 @@ from pipecat.transcriptions.language import Language
|
||||
|
||||
|
||||
class _FakeWebsocket:
|
||||
def __init__(self, messages):
|
||||
def __init__(self, messages, *, state=State.OPEN, send_side_effect=None):
|
||||
self._messages = messages
|
||||
self.state = state
|
||||
self.send = AsyncMock(side_effect=send_side_effect)
|
||||
|
||||
def __aiter__(self):
|
||||
return self._iter_messages()
|
||||
@@ -25,6 +29,21 @@ class _FakeWebsocket:
|
||||
yield message
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_connect_failure_clears_stale_websocket_without_raising(monkeypatch):
|
||||
async def fake_websocket_connect(*args, **kwargs):
|
||||
raise RuntimeError("connection failed")
|
||||
|
||||
monkeypatch.setattr("pipecat.services.soniox.stt.websocket_connect", fake_websocket_connect)
|
||||
|
||||
service = SonioxSTTService(api_key="test-key")
|
||||
service._websocket = _FakeWebsocket([], state=State.CLOSED)
|
||||
|
||||
await service._connect_websocket()
|
||||
|
||||
assert service._websocket is None
|
||||
|
||||
|
||||
def test_language_from_tokens_uses_single_recognized_language():
|
||||
tokens = [
|
||||
{"text": "Hello", "language": "en"},
|
||||
|
||||
@@ -165,6 +165,19 @@ async def test_reconnect_exhausted_emits_non_fatal_error(service, report_error):
|
||||
assert "Connection refused" in final_error.error
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_reconnect_exhausted_when_connect_does_not_raise(service, report_error):
|
||||
"""A non-raising failed connect is treated as a failed reconnect attempt."""
|
||||
result = await service._try_reconnect(report_error=report_error)
|
||||
|
||||
assert result is False
|
||||
assert report_error.call_count == 4
|
||||
final_error = report_error.call_args_list[-1][0][0]
|
||||
assert isinstance(final_error, ErrorFrame)
|
||||
assert final_error.fatal is False
|
||||
assert "websocket reconnection failed verification" in final_error.error
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Quick failure detection — accept then immediately close
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
10
uv.lock
generated
10
uv.lock
generated
@@ -4539,7 +4539,7 @@ requires-dist = [
|
||||
{ name = "pipecat-ai", extras = ["websockets-base"], marker = "extra == 'ultravox'" },
|
||||
{ name = "pipecat-ai", extras = ["websockets-base"], marker = "extra == 'websocket'" },
|
||||
{ name = "pipecat-ai", extras = ["websockets-base"], marker = "extra == 'xai'" },
|
||||
{ name = "pipecat-ai-prebuilt", marker = "extra == 'runner'", specifier = ">=1.0.0" },
|
||||
{ name = "pipecat-ai-prebuilt", marker = "extra == 'runner'", specifier = ">=1.0.1" },
|
||||
{ name = "piper-tts", marker = "extra == 'piper'", specifier = ">=1.3.0,<2" },
|
||||
{ name = "protobuf", specifier = ">=5.29.6,<7" },
|
||||
{ name = "protobuf", marker = "extra == 'nvidia'", specifier = ">=6.31.1,<7" },
|
||||
@@ -4574,7 +4574,7 @@ requires-dist = [
|
||||
{ name = "wait-for2", marker = "python_full_version < '3.12'", specifier = ">=0.4.1,<1" },
|
||||
{ name = "websockets", marker = "extra == 'websockets-base'", specifier = ">=13.1,<16.0" },
|
||||
]
|
||||
provides-extras = ["aic", "anthropic", "assemblyai", "asyncai", "aws", "aws-nova-sonic", "azure", "cartesia", "camb", "cerebras", "daily", "deepgram", "deepseek", "elevenlabs", "fal", "fireworks", "fish", "gladia", "google", "gradium", "grok", "groq", "gstreamer", "heygen", "hume", "inworld", "koala", "kokoro", "langchain", "lemonslice", "livekit", "lmnt", "local", "local-smart-turn", "mcp", "mem0", "mistral", "mlx-whisper", "moondream", "nebius", "neuphonic", "novita", "nvidia", "openai", "rnnoise", "openrouter", "perplexity", "piper", "qwen", "resembleai", "rime", "runner", "sagemaker", "sambanova", "sarvam", "sentry", "silero", "simli", "smallest", "soniox", "soundfile", "speechmatics", "strands", "tavus", "together", "tracing", "ultravox", "vonage-video-connector", "webrtc", "websocket", "websockets-base", "whisper", "xai"]
|
||||
provides-extras = ["aic", "anthropic", "assemblyai", "asyncai", "aws", "aws-nova-sonic", "azure", "cartesia", "camb", "cerebras", "daily", "deepgram", "deepseek", "elevenlabs", "fal", "fireworks", "fish", "gladia", "google", "gradium", "grok", "groq", "gstreamer", "heygen", "hume", "inception", "inworld", "koala", "kokoro", "langchain", "lemonslice", "livekit", "lmnt", "local", "local-smart-turn", "mcp", "mem0", "mistral", "mlx-whisper", "moondream", "nebius", "neuphonic", "novita", "nvidia", "openai", "rnnoise", "openrouter", "perplexity", "piper", "qwen", "resembleai", "rime", "runner", "sagemaker", "sambanova", "sarvam", "sentry", "silero", "simli", "smallest", "soniox", "soundfile", "speechmatics", "strands", "tavus", "together", "tracing", "ultravox", "vonage-video-connector", "webrtc", "websocket", "websockets-base", "whisper", "xai"]
|
||||
|
||||
[package.metadata.requires-dev]
|
||||
dev = [
|
||||
@@ -4603,14 +4603,14 @@ docs = [
|
||||
|
||||
[[package]]
|
||||
name = "pipecat-ai-prebuilt"
|
||||
version = "1.0.0"
|
||||
version = "1.0.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "fastapi", extra = ["all"] },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/d0/86/7527474a324e3da787468133a1dba877e06576edc502e1bc7dd84ba7c9f7/pipecat_ai_prebuilt-1.0.0.tar.gz", hash = "sha256:dc66df541f17620eef5dedb2fd44737eb97232899779afb66dcca5aaa9317512", size = 601709, upload-time = "2026-05-14T21:15:26.575Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/fa/27/91857cd93661922687e51f4141583dbeb71f9a6c8d0d6379bae1aa467522/pipecat_ai_prebuilt-1.0.1.tar.gz", hash = "sha256:9453136fcb994802f9b650b5175f3ce1d0476849a9e609fefe52ecc1c3299680", size = 601771, upload-time = "2026-05-20T16:08:14.485Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/89/b1/648122d5e418d3e0c8f797028bc53a22229ffc07a2406712b13b76735f38/pipecat_ai_prebuilt-1.0.0-py3-none-any.whl", hash = "sha256:6b7057920d3d00e5687adb26e032634ba1f6d924eb9079b1804d031620a1e854", size = 601949, upload-time = "2026-05-14T21:15:24.666Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ec/4f/a636e47967c3aa885ae912502d73a46d1e824a67992e405ea1e94b78bd94/pipecat_ai_prebuilt-1.0.1-py3-none-any.whl", hash = "sha256:45d78d3fd2ac8193626a5dabb5f45d0ff2d35bfc92098b4bcea308ae612196aa", size = 601994, upload-time = "2026-05-20T16:08:12.4Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
|
||||
Reference in New Issue
Block a user