Compare commits

...

110 Commits

Author SHA1 Message Date
Mark Backman
5a6cc4d35c Replace assert-based type narrowing with local variables and guards
Use local variable narrowing and if-guards instead of assert statements
for type safety, since asserts are stripped with python -O.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 16:46:45 -05:00
Mark Backman
28be775740 Reduce type: ignore comments by fixing avoidable type mismatches
Replace ~20 type: ignore comments with proper type fixes:
- Widen set_tools() to accept List[dict] | ToolsSchema | NotGiven
- Widen create_task() to accept Coroutine | Awaitable
- Fix _turn_params to use BaseTurnParams instead of SmartTurnParams
- Make _thought_llm Optional[str] with assertion guard
- Add mixer assertion, websocket narrowing, ice_servers cast
- Use dict.get() in protobuf serializer
- Make remote_participants Optional in Daily transport

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:30:35 -05:00
Mark Backman
bc730e4069 Enable pyright basic type checking for core framework
Add pyright configuration (basic mode, Python 3.10) to pyproject.toml
and fix all 276 type errors in the core framework (everything except
services/ and adapters/). This establishes a CI-ready type checking
baseline as Pipecat approaches 1.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:30:35 -05:00
Mark Backman
104d06551a Merge pull request #3679 from pipecat-ai/mb/remove-to-be-updated
Remove SequentialMergePipeline
2026-02-08 15:28:38 -05:00
Mark Backman
90ad2a4e81 Remove SequentialMergePipeline 2026-02-08 14:44:48 -05:00
Mark Backman
570f2d7fc0 Merge pull request #3667 from ianbbqzy/ian/fix-auto-mode-space
[inworld] aggregate_sentence mode needs trailing space
2026-02-07 18:22:32 -05:00
Ian Lee
f3d99adf8f [inworld] aggregate_sentence mode needs trailing space 2026-02-07 15:18:24 -08:00
Mark Backman
d34f416281 Merge pull request #3598 from dhruvladia-sarvam/sarvam-v3-update
ASR and TTS v3 update
2026-02-07 10:51:35 -05:00
Mark Backman
5a1deb7cb4 Merge pull request #3659 from pipecat-ai/mb/change-vad-defaults
Set VADParams stop_secs to 0.2 by default
2026-02-06 23:51:50 -05:00
Mark Backman
a5fc2b1650 Set VADParams stop_secs to 0.2 by default 2026-02-06 23:49:08 -05:00
Aleix Conchillo Flaqué
5cb8d91431 added changelog file for #3616 2026-02-06 16:45:23 -08:00
Aleix Conchillo Flaqué
ce690848c0 Merge pull request #3616 from omChauhanDev/fix/function-call-timeout-task-cleanup
fix: ensure function call timeout task is always cancelled
2026-02-06 16:40:56 -08:00
Aleix Conchillo Flaqué
30f51edfcd Merge pull request #3668 from pipecat-ai/aleix/parallel-pipeline-buffering
Buffer internal frames during ParallelPipeline lifecycle sync
2026-02-06 15:25:32 -08:00
Aleix Conchillo Flaqué
cd03d449cb Update changelog skill with skip rules and allowed types 2026-02-06 15:23:14 -08:00
Aleix Conchillo Flaqué
57df03aade Update CLAUDE.md with PR workflow instructions 2026-02-06 15:23:14 -08:00
Aleix Conchillo Flaqué
4945cfbd8f Buffer internal frames during ParallelPipeline lifecycle synchronization
Processors inside parallel sub-pipelines can push frames during
StartFrame/EndFrame/CancelFrame processing. Previously these frames
could escape the ParallelPipeline before all branches finished
processing the lifecycle frame. Now they are buffered and flushed
after synchronization completes.
2026-02-06 15:15:46 -08:00
Mark Backman
8d37d3bae7 Merge pull request #3666 from pipecat-ai/mb/deepgram-stt-smart-format
DeepgramSTTService: disable smart_format by default
2026-02-06 14:04:37 -05:00
Mark Backman
d7b1624d3c Merge pull request #3663 from lukepayyapilli/fix/stream-close-sambanova-google
fix: close stream on cancellation for SambaNova and Google OpenAI services
2026-02-06 14:02:31 -05:00
Mark Backman
7f65204c3b DeepgramSTTService: disable smart_format by default 2026-02-06 13:45:10 -05:00
Aleix Conchillo Flaqué
97eff414c3 Merge pull request #3660 from pipecat-ai/aleix/interruption-frame-completion-event
Attach asyncio.Event to InterruptionFrame for completion signaling
2026-02-06 10:14:26 -08:00
Aleix Conchillo Flaqué
5b67e76de7 Add changelog for PR #3660 2026-02-06 10:11:00 -08:00
Aleix Conchillo Flaqué
b9e79bd06a CLAUDE.md: explain about InterruptionFrame.complete() 2026-02-06 10:11:00 -08:00
Aleix Conchillo Flaqué
d5105a78e6 STTMuteFilter should call frame.complete() when InterruptionFrame is blocked 2026-02-06 10:11:00 -08:00
Aleix Conchillo Flaqué
a352b2d7a0 Add tests for InterruptionFrame completion event
Add tests for the event-based interruption completion: complete() sets
the event, complete() is safe without an event, the event fires at
the pipeline sink, and a warning is logged when the frame is blocked.

Also remove the unconditional await after the timeout so the function
returns instead of hanging when complete() is never called.
2026-02-06 09:57:24 -08:00
Aleix Conchillo Flaqué
2345090b10 Attach asyncio.Event to InterruptionFrame for completion signaling
Move the interruption wait event from per-processor instance state to
the frame itself. The event is created in
push_interruption_task_frame_and_wait(), threaded through
InterruptionTaskFrame → InterruptionFrame, and set when the frame
reaches the pipeline sink. This scopes the event to each interruption
flow rather than sharing mutable state on the processor.

Also adds a 2s timeout warning to help diagnose cases where
InterruptionFrame.complete() is never called.
2026-02-06 09:57:24 -08:00
Mark Backman
af562bf9a8 Merge pull request #3664 from pipecat-ai/mb/elevenlabs-scribe-v2
Update ElevenLabsSTTService to scribe_v2
2026-02-06 12:31:44 -05:00
Mark Backman
d4993f0dcf Update ElevenLabsSTTService to scribe_v2 2026-02-06 11:37:23 -05:00
Luke Payyapilli
1790a84bfd add changelog 2026-02-06 10:05:02 -05:00
Luke Payyapilli
29c53b99a4 fix: close stream on cancellation for SambaNova and Google OpenAI services 2026-02-06 10:02:40 -05:00
Mark Backman
aa5a855eab Merge pull request #3656 from pipecat-ai/mb/openai-realtime-stt
Add OpenAIRealtimeSTTService
2026-02-06 09:15:58 -05:00
Mark Backman
e66d6f8ffe Merge pull request #3658 from pipecat-ai/mb/bump-protobuf-5.29.6
Upgrade protobuf to >=5.29.6
2026-02-05 19:09:30 -05:00
Mark Backman
b8ac2ba713 Merge pull request #3593 from ianbbqzy/ian/inworld-auto-mode
Add auto_mode support for inworld plugin
2026-02-05 18:16:38 -05:00
Ian Lee
6eea40858e fix lint and changelog 2026-02-05 15:10:36 -08:00
Mark Backman
90700d10aa Upgrade protobuf to >=5.29.6 2026-02-05 18:08:52 -05:00
Mark Backman
fa85f7bbc7 Merge pull request #3640 from lukepayyapilli/fix/openai-stream-close
fix: close stream on cancellation to prevent socket leaks
2026-02-05 18:00:06 -05:00
Mark Backman
669f013970 Merge pull request #3657 from pipecat-ai/filipi/changing_no_audio_log_to_debug
Changing the ‘no audio received’ log from warning to debug.
2026-02-05 17:35:24 -05:00
filipi87
76f63e54e2 Changing the ‘no audio received’ log from warning to debug. 2026-02-05 18:07:14 -03:00
Filipi da Silva Fuchter
cce5a13444 Merge pull request #3650 from pipecat-ai/filipi/twilio_issues
Ignoring RTVI messages inside the Serializers by default.
2026-02-05 15:52:59 -05:00
Mark Backman
d11e1cd631 Update 13k to use ElevenLabsRealtimeSTTService 2026-02-05 15:48:00 -05:00
Mark Backman
8b9da632d1 Add OpenAIRealtimeSTTService 2026-02-05 15:48:00 -05:00
Mark Backman
b36f7892a4 Merge pull request #3654 from pipecat-ai/aleix/more-claude-update
CLAUDE.md: add RTVI and serializers
2026-02-05 15:23:35 -05:00
Mark Backman
9b43cde128 Merge pull request #3355 from itsderek23/user-bot-latency
Add `user_bot_latency_seconds` to OpenTelemetry turn spans
2026-02-05 15:23:15 -05:00
filipi87
6af4d872a8 Refactoring the serializers to ignore the RTVI messages by default. 2026-02-05 16:52:53 -03:00
Ian Lee
22398e1410 add changelog back 2026-02-05 11:39:39 -08:00
Ian Lee
d10467e043 update timestamps reset handling 2026-02-05 11:39:39 -08:00
Ian Lee
cbe131636d add changelog 2026-02-05 11:39:39 -08:00
Ian Lee
fef9e3ea32 Add auto_mode support for inworld plugin 2026-02-05 11:39:39 -08:00
Mark Backman
56d8ef2bf4 Deprecate UserBotLatencyLogObserver, update 29 example 2026-02-05 14:29:45 -05:00
Derek Haynes
8791559351 Add changelog entry for PR #3355 2026-02-05 14:29:45 -05:00
Derek Haynes
f6c919354f Add test for user bot latency 2026-02-05 14:29:45 -05:00
Derek Haynes
93138466d6 Feat: Add user-bot latency to OTel turn spans
This adds user-to-bot response latency tracking to OpenTelemetry spans:

- Created UserBotLatencyObserver as a reusable component for tracking
user-to-bot response latency
- Records the value as an attribute on turn spans (turn.user_bot_latency_seconds)
- Updated TurnTraceObserver to use UserBotLatencyObserver, following the same pattern as TurnTrackingObserver
- Updated PipelineTask to automatically create and wire UserBotLatencyObserver
when tracing is enabled (same as TurnTrackingObserver)
2026-02-05 14:29:42 -05:00
Mark Backman
5a5a98b497 Merge pull request #3649 from itsderek23/fix/tracing-orphan-spans
Fix orphan otel spans during flow initialization and transitions
2026-02-05 14:23:52 -05:00
Aleix Conchillo Flaqué
2b4f507d37 CLAUDE.md: add RTVI and serializers 2026-02-05 11:06:00 -08:00
Mark Backman
d6f3a90662 Merge pull request #3652 from pipecat-ai/mb/upgrade-small-webrtc-prebuilt-2.1.0
Upgrade pipecat-ai-small-webrtc-prebuilt to 2.1.0
2026-02-05 13:48:54 -05:00
Derek Haynes
8fb0e37965 Update changelog for #3649 2026-02-05 11:35:22 -07:00
Derek Haynes
0d45b48f7b Fix import placement 2026-02-05 11:26:58 -07:00
Mark Backman
6af4520b1f Merge pull request #3635 from pipecat-ai/filipi/fix_websocket
Fixed an error in the WebSocket transport that occurred when an InputTransportMessageFrame was received and broadcast.
2026-02-05 12:22:59 -05:00
filipi87
ba469e5645 Add changelog entry
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-05 12:19:51 -05:00
Mark Backman
bd12b60b5c Merge pull request #3614 from okue/fix/websocket-broadcast-frame-misuse
fix: pass frame class instead of instance to broadcast_frame in websocket transports
2026-02-05 12:19:03 -05:00
Mark Backman
54db37ea47 Upgrade pipecat-ai-small-webrtc-prebuilt to 2.1.0 2026-02-05 12:09:51 -05:00
filipi87
752e16f553 Ignoring RTVI messages inside TwilioSerializer by default. 2026-02-05 10:51:03 -03:00
Derek Haynes
7c7408a048 Fix orphan spans in tracing during flow initialization and transitions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 06:06:13 -07:00
Mark Backman
8f42343927 Merge pull request #3630 from pipecat-ai/mb/add-function-call-messages-rtvi
Add native RTVI function call lifecycle messages
2026-02-04 16:20:42 -05:00
Mark Backman
46da6cd91b Update changelogs 2026-02-04 11:19:30 -05:00
Mark Backman
ecb02d9049 Bump RTVI_PROTOCOL_VERSION to 1.2.0 2026-02-04 11:17:38 -05:00
Mark Backman
cc68e00125 Deprecate llm-function-call message 2026-02-04 11:17:23 -05:00
Mark Backman
e0e3b5250b Add RTVIObserverParams to control what information is included in function call events 2026-02-04 11:05:05 -05:00
Luke Payyapilli
55a3b10e70 fix(openai): close stream on cancellation to prevent socket leaks 2026-02-04 09:59:10 -05:00
dhruvladia-sarvam
e6b06414b3 change default speaker for bulbul:v3-beta to shubh 2026-02-04 16:46:35 +05:30
Aleix Conchillo Flaqué
6bcfb40d12 Merge pull request #3636 from pipecat-ai/aleix/initial-claude-md
initial CLAUDE.md
2026-02-03 19:31:16 -08:00
Aleix Conchillo Flaqué
65b1a8ce36 initial CLAUDE.md 2026-02-03 18:04:54 -08:00
Mark Backman
2db3d94d06 Merge pull request #3628 from pipecat-ai/mb/broadcast-speech-control-params-frame
Fix: Broadcast SpeechControlParamsFrame from VADController
2026-02-03 18:44:15 -05:00
Mark Backman
2a26b9f7a3 Fix: Broadcast SpeechControlParamsFrame from VADController 2026-02-03 18:40:39 -05:00
Aleix Conchillo Flaqué
4f77c532fb Merge pull request #3623 from pipecat-ai/aleix/pipeline-task-rtvi-always-set-bot-ready
PipelineTask: also call set_bot_ready() for external RTVI processors
2026-02-03 14:21:03 -08:00
Aleix Conchillo Flaqué
c3a4da4a29 PipelineTask: also call set_bot_ready() for external RTVI processors 2026-02-03 14:16:08 -08:00
Mark Backman
84ca0b6d58 Merge pull request #3629 from pipecat-ai/fix/telephony-websocket-stopasynciteration
Fix StopAsyncIteration in parse_telephony_websocket
2026-02-03 12:10:07 -05:00
Mark Backman
c1857d255d Avoid nesting try/excepts 2026-02-03 12:00:04 -05:00
Mark Backman
d50ec33079 Merge pull request #3542 from lukepayyapilli/fix/terminal-frames-uninterruptible
fix: make EndFrame and StopFrame uninterruptible to prevent pipeline freeze
2026-02-03 10:08:17 -05:00
Mark Backman
40c84faff5 Remove handle_function_call_start 2026-02-03 10:00:59 -05:00
Mark Backman
84cd9346f9 Add native RTVI function call lifecycle messages 2026-02-03 10:00:59 -05:00
Luke Payyapilli
5d5b19e1d2 Add changelog entry 2026-02-03 09:12:59 -05:00
Luke Payyapilli
8d3e10f054 Make EndFrame and StopFrame uninterruptible to prevent pipeline freeze 2026-02-03 09:12:59 -05:00
dhruvladia-sarvam
1665ce181a refactor(sarvam): centralize model configuration with dataclasses 2026-02-03 14:33:41 +05:30
James Hush
803a20cc00 Fix formatting: remove extra blank line
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 16:46:44 +08:00
James Hush
90bead06ab Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-03 16:42:13 +08:00
James Hush
b427d534ae Add tests for parse_telephony_websocket StopAsyncIteration handling
Tests cover:
- No messages received (raises ValueError)
- One message received (logs warning, continues)
- Two messages received (normal operation)
- All telephony providers (Twilio, Telnyx, Plivo, Exotel)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 16:33:36 +08:00
James Hush
b030f1178d Add changelog and improve docstring for parse_telephony_websocket
- Added changelog entry for bug fix
- Enhanced docstring with Args and Raises sections

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 16:26:09 +08:00
James Hush
a627597bca Fix StopAsyncIteration in parse_telephony_websocket
Handle WebSocket disconnections gracefully when telephony providers send
fewer messages than expected. Adds explicit StopAsyncIteration handling
for both first and second message retrieval.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 16:25:07 +08:00
Aleix Conchillo Flaqué
4c10ddb7bb upgrade uv.lock 2026-02-02 16:25:06 -08:00
Mark Backman
a4e499dc80 Merge pull request #3617 from pipecat-ai/fix/cjk-sentence-splitting
Fix sentence splitting for CJK and other non-Latin languages
2026-02-02 18:16:51 -05:00
Mark Backman
ca49acfaa6 Merge pull request #3619 from pipecat-ai/mb/resemble-readme
Resemble cleanup
2026-02-02 09:20:11 -05:00
Mark Backman
86147f15f3 Renumber the Resemble foundational example 2026-02-02 09:07:05 -05:00
Mark Backman
5cda72d138 Add Resemble TTS to README 2026-02-02 09:05:03 -05:00
Mark Backman
54e62a8177 Merge pull request #3134 from pipecat-ai/mb/resemble-tts-draft
Add ResembleAITTSService
2026-02-02 08:59:27 -05:00
Mark Backman
a592b7fdf0 Update per PR 1789, align with ErrorFrame norms 2026-02-02 08:55:29 -05:00
Mark Backman
ba2b7c05d6 Add ResembleAITTSService 2026-02-02 08:55:27 -05:00
James Hush
774041e9a1 Add changelog for PR #3617
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 14:47:22 +08:00
James Hush
763002f2bc Fix sentence splitting for CJK and other non-Latin languages in TTS pipeline
NLTK's sent_tokenize() only supports ~15 European languages and defaults to
English. For Japanese, Chinese, Korean, Hindi, Arabic, and other non-Latin
languages, NLTK fails to recognize sentence boundaries like 。?! causing
text to accumulate until flush instead of being emitted sentence-by-sentence.

Add a fallback in match_endofsentence() that scans for unambiguous non-Latin
sentence-ending punctuation when NLTK fails to split the text. Latin
punctuation (. ! ? ; …) is excluded from the fallback since NLTK handles
those correctly and they can be ambiguous (abbreviations, decimals, etc.).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 14:27:49 +08:00
Om Chauhan
50dedf350d fix: ensure function call timeout task is always cancelled 2026-02-02 08:38:54 +05:30
okue
d3ecbb11c1 fix: pass frame class instead of instance to broadcast_frame in websocket transports
broadcast_frame() expects a frame class and kwargs, but the three
websocket input transports (fastapi, client, server) were incorrectly
passing a frame instance. This would cause a TypeError at runtime when
an InputTransportMessageFrame was received.
2026-02-01 20:38:34 +09:00
Aleix Conchillo Flaqué
f453227ba3 Merge pull request #3612 from pipecat-ai/aleix/use-kokoro-onnx
KokoroTTSService: use kokoro-onnx instead of kokoro
2026-01-31 21:03:55 -08:00
Aleix Conchillo Flaqué
52cc64019a Merge pull request #3611 from pipecat-ai/aleix/aicoustics-example-update
examples: update 07zd to use vad_analyzer in LLMUserAggregator
2026-01-31 21:02:50 -08:00
Aleix Conchillo Flaqué
95689cc81c KokoroTTSService: use kokoro-onnx instead of kokoro 2026-01-31 17:20:27 -08:00
Aleix Conchillo Flaqué
675c7c43e3 examples: update 07zd to use vad_analyzer in LLMUserAggregator 2026-01-31 15:31:15 -08:00
Aleix Conchillo Flaqué
bfd19e867c Merge pull request #3610 from pipecat-ai/aleix/dont-add-rtvi-observer-if-already-there
PipelineTask: don't add RTVIObserver if already there
2026-01-31 14:57:52 -08:00
Aleix Conchillo Flaqué
acc9923c0a PipelineTask: don't add RTVIObserver if already there 2026-01-31 14:54:29 -08:00
Mark Backman
bdc9e7e2e4 Merge pull request #3608 from pipecat-ai/mb/quickstart-0.0.101
Update quickstart for 0.0.101
2026-01-31 10:39:17 -05:00
Mark Backman
a587e1b99a Update quickstart for 0.0.101 2026-01-31 09:52:24 -05:00
dhruvladia-sarvam
57821cf709 fix 2026-01-30 16:07:52 +05:30
dhruvladia-sarvam
18045582a9 ASR and TTS v3 update 2026-01-30 15:53:06 +05:30
276 changed files with 4929 additions and 1968 deletions

View File

@@ -7,23 +7,30 @@ Create changelog files for the important commits in this PR. The PR number is pr
## Instructions
1. First, check what commits are on the current branch compared to main:
1. Skip changelog for: documentation-only, internal refactoring, test-only, CI changes.
2. First, check what commits are on the current branch compared to main:
```
git log main..HEAD --oneline
```
2. For each significant change, create a changelog file in the `changelog/` folder using the format:
3. For each significant change, create a changelog file in the `changelog/` folder using the format:
Allowed types: `added`, `changed`, `deprecated`, `removed`, `fixed`, `security`, `performance`, `other`
- `{PR_NUMBER}.added.md` - for new features
- `{PR_NUMBER}.added.2.md`, `{PR_NUMBER}.added.3.md` - for additional new features
- `{PR_NUMBER}.added.2.md`, `{PR_NUMBER}.added.3.md` - for additional entries of the same type
- `{PR_NUMBER}.changed.md` - for changes to existing functionality
- `{PR_NUMBER}.fixed.md` - for bug fixes
- `{PR_NUMBER}.deprecated.md` - for deprecations
- `{PR_NUMBER}.removed.md` - for removed features
- `{PR_NUMBER}.security.md` - for security fixes
- `{PR_NUMBER}.performance.md` - for performance improvements
- `{PR_NUMBER}.other.md` - for other changes
3. Each changelog file should at least contain a main single line starting with `- ` followed by a clear description of the change.
4. Each changelog file should at least contain a main single line starting with `- ` followed by a clear description of the change.
4. If the change is complicated, changelog files can have indented lines after the main line with additional details or code samples.
5. If the change is complicated, changelog files can have indented lines after the main line with additional details or code samples.
5. Use ⚠️ emoji prefix for breaking changes.
6. Use ⚠️ emoji prefix for breaking changes.
## Example

143
CLAUDE.md Normal file
View File

@@ -0,0 +1,143 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Pipecat is an open-source Python framework for building real-time voice and multimodal conversational AI agents. It orchestrates audio/video, AI services, transports, and conversation pipelines using a frame-based architecture.
## Common Commands
```bash
# Setup development environment
uv sync --group dev --all-extras --no-extra gstreamer --no-extra krisp
# Install pre-commit hooks
uv run pre-commit install
# Run all tests
uv run pytest
# Run a single test file
uv run pytest tests/test_name.py
# Run a specific test
uv run pytest tests/test_name.py::test_function_name
# Preview changelog
towncrier build --draft --version Unreleased
# Lint and format check
uv run ruff check
uv run ruff format --check
# Update dependencies (after editing pyproject.toml)
uv lock && uv sync
```
## Architecture
### Frame-Based Pipeline Processing
All data flows as **Frame** objects through a pipeline of **FrameProcessors**:
```
Transport Input → Pipeline Source → [Processor1] → [Processor2] → ... → Pipeline Sink → Transport Output
```
**Key components:**
- **Frames** (`src/pipecat/frames/frames.py`): Data units (audio, text, video) and control signals. Flow DOWNSTREAM (input→output) or UPSTREAM (acknowledgments/errors).
- **FrameProcessor** (`src/pipecat/processors/frame_processor.py`): Base processing unit. Each processor receives frames, processes them, and pushes results downstream.
- **Pipeline** (`src/pipecat/pipeline/pipeline.py`): Chains processors together.
- **ParallelPipeline** (`src/pipecat/pipeline/parallel_pipeline.py`): Runs multiple pipelines in parallel.
- **Transports** (`src/pipecat/transports/`): External I/O layer (Daily WebRTC, LiveKit WebRTC, WebSocket, Local). Abstract interface via `BaseTransport`.
- **Services** (`src/pipecat/services/`): 60+ AI provider integrations (STT, TTS, LLM, etc.). Extend base classes: `AIService`, `LLMService`, `STTService`, `TTSService`, `VisionService`.
- **Serializers** (`src/pipecat/serializers/`): Convert frames to/from wire formats for WebSocket transports. `FrameSerializer` base class defines `serialize()` and `deserialize()`. Telephony serializers (Twilio, Plivo, Vonage, Telnyx, Exotel, Genesys) handle provider-specific protocols and audio encoding (e.g., μ-law).
- **RTVI** (`src/pipecat/processors/frameworks/rtvi.py`): Real-Time Voice Interface protocol bridging clients and the pipeline. `RTVIProcessor` handles incoming client messages (text input, audio, function call results). `RTVIObserver` converts pipeline frames to outgoing messages: user/bot speaking events, transcriptions, LLM/TTS lifecycle, function calls, metrics, and audio levels.
### Important Patterns
- **Context Aggregation**: `LLMContext` accumulates messages for LLM calls; `UserResponse` aggregates user input
- **Turn Management**: Turn management is done through `LLMUserAggregator` and
`LLMAssistantAggregator`, created with `LLMContextAggregatorPair`
- **User turn strategies**: Detection of when the user starts and stops speaking is done via user turn start/stop strategies. They push `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` respectively.
- **Interruptions**: Interruptions are usually triggered by a user turn start strategy (e.g. `VADUserTurnStartStrategy`) but they can be triggered by other processors as well, in which case the user turn start strategies don't need to. An `InterruptionFrame` carries an optional `asyncio.Event` that is set when the frame reaches the pipeline sink. If a processor stops an `InterruptionFrame` from propagating downstream (i.e., doesn't push it), it **must** call `frame.complete()` to avoid stalling `push_interruption_task_frame_and_wait()` callers.
- **Uninterruptible Frames**: These are frames that will not be removed from internal queues even if there's an interruption. For example, `EndFrame` and `StopFrame`.
- **Events**: Most classes in Pipecat have `BaseObject` as the very base class. `BaseObject` has support for events. Events can run in the background in an async task (default) or synchronously (`sync=True`) if we want immediate action. Synchronous event handlers need to exectue fast.
### Key Directories
| Directory | Purpose |
|---------------------------|----------------------------------------------------|
| `src/pipecat/frames/` | Frame definitions (100+ types) |
| `src/pipecat/processors/` | FrameProcessor base + aggregators, filters, audio |
| `src/pipecat/pipeline/` | Pipeline orchestration |
| `src/pipecat/services/` | AI service integrations (60+ providers) |
| `src/pipecat/transports/` | Transport layer (Daily, LiveKit, WebSocket, Local) |
| `src/pipecat/serializers/`| Frame serialization for WebSocket protocols |
| `src/pipecat/audio/` | VAD, filters, mixers, turn detection, DTMF |
| `src/pipecat/turns/` | User turn management |
## Code Style
- **Docstrings**: Google-style. Classes describe purpose; `__init__` has `Args:` section; dataclasses use `Parameters:` section.
- **Linting**: Ruff (line length 100). Pre-commit hooks enforce formatting.
- **Type hints**: Required for complex async code.
### Docstring Example
```python
class MyService(LLMService):
"""Description of what the service does.
More detailed description.
Event handlers available:
- on_connected: Called when we are connected
Example::
@service.event_handler("on_connected")
async def on_connected(service, frame):
...
"""
def __init__(self, param1: str, **kwargs):
"""Initialize the service.
Args:
param1: Description of param1.
**kwargs: Additional arguments passed to parent.
"""
super().__init__(**kwargs)
```
## Service Implementation
When adding a new service:
1. Extend the appropriate base class (`STTService`, `TTSService`, `LLMService`, etc.)
2. Implement required abstract methods
3. Handle necessary frames
4. By default, all frames should be pushed in the direction they came
5. Push `ErrorFrame` on failures
6. Add metrics tracking via `MetricsData` if relevant
7. Follow the pattern of existing services in `src/pipecat/services/`
## Pull Requests
After creating a PR, use `/changelog <pr_number>` to generate the changelog file and `/pr-description <pr_number>` to update the PR description.

View File

@@ -75,7 +75,7 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Gradium](https://docs.pipecat.ai/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [Hathora](https://docs.pipecat.ai/server/services/stt/hathora), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hathora](https://docs.pipecat.ai/server/services/tts/hathora), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hathora](https://docs.pipecat.ai/server/services/tts/hathora), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Resemble](https://docs.pipecat.ai/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/server/services/s2s/ultravox), |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Exotel](https://docs.pipecat.ai/server/utilities/serializers/exotel), [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx), [Vonage](https://docs.pipecat.ai/server/utilities/serializers/vonage) |

1
changelog/3134.added.md Normal file
View File

@@ -0,0 +1 @@
- Added `ResembleAITTSService` for text-to-speech using Resemble AI's streaming WebSocket API with word-level timestamps and jitter buffering for smooth audio playback.

1
changelog/3355.added.md Normal file
View File

@@ -0,0 +1 @@
- Added `UserBotLatencyObserver` for tracking user-to-bot response latency. When tracing is enabled, latency measurements are automatically recorded as `turn.user_bot_latency_seconds` attributes on OpenTelemetry turn spans.

View File

@@ -0,0 +1 @@
- Deprecated `UserBotLatencyLogObserver`. Use `UserBotLatencyObserver` directly with its `on_latency_measured` event handler instead.

1
changelog/3542.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed pipeline freeze when `InterruptionFrame` discards `EndFrame` or `StopFrame` by making terminal frames uninterruptible.

1
changelog/3589.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed OpenAI LLM stream not being closed on cancellation/exception, which could leak sockets.

1
changelog/3593.added.md Normal file
View File

@@ -0,0 +1 @@
- Added support for Inworld TTS Websocket Auto Mode for improved latency

View File

@@ -0,0 +1 @@
- Updated timestamps to be cumulative within an agent turn, using flushCompleted message as an indication of when timestamps from the server are reset to 0

1
changelog/3610.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed `PipelineTask` adding duplicate `RTVIProcessor` and `RTVIObserver` when they were already provided in the pipeline or observers list. They are now detected and skipped, with appropriate warnings and errors logged for mismatched configurations.

View File

@@ -0,0 +1 @@
- Changed `KokoroTTSService` to use `kokoro-onnx` instead of `kokoro` as the underlying TTS engine.

1
changelog/3616.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed function call timeout task not being cancelled when the handler completes without calling `result_callback` or is cancelled externally, which caused `RuntimeWarning: coroutine was never awaited`.

5
changelog/3617.fixed.md Normal file
View File

@@ -0,0 +1,5 @@
- Fixed sentence splitting for Japanese, Chinese, Korean, and other non-Latin
languages in TTS pipeline. NLTK's sentence tokenizer does not support CJK
languages, causing text to accumulate until flush instead of being split at
sentence boundaries. Added fallback detection for unambiguous non-Latin
sentence-ending punctuation (e.g., `。`, ``, ``).

1
changelog/3623.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed `PipelineTask` to also call `set_bot_ready()` when an external `RTVIProcessor` is provided.

1
changelog/3628.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed `VADController` not broadcasting `SpeechControlParamsFrame` on startup, which prevented STT services from receiving VAD params needed for TTFB measurement.

1
changelog/3629.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed `StopAsyncIteration` exceptions in `parse_telephony_websocket()` when WebSocket connections close before sending expected messages.

1
changelog/3630.added.md Normal file
View File

@@ -0,0 +1 @@
- Added RTVI function call lifecycle events (`llm-function-call-started`, `llm-function-call-in-progress`, `llm-function-call-stopped`) with configurable security levels via `RTVIObserverParams.function_call_report_level`. Supports per-function control over what information is exposed (`DISABLED`, `NONE`, `NAME`, or `FULL`).

View File

@@ -0,0 +1 @@
- Deprecated `RTVILLMFunctionCallMessage`, `RTVILLMFunctionCallMessageData`, and `RTVIProcessor.handle_function_call()`. Use the new `llm-function-call-in-progress` event sent automatically by `RTVIObserver` instead.

1
changelog/3635.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed WebSocket transport error when broadcasting `InputTransportMessageFrame` by correctly instantiating the frame with its message parameter.

1
changelog/3649.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed orphan OpenTelemetry spans during flow initialization and transitions in tracing.

View File

@@ -0,0 +1 @@
- Upgraded the `pipecat-ai-small-webrtc-prebuilt` package to v2.1.0.

1
changelog/3656.added.md Normal file
View File

@@ -0,0 +1 @@
- Added `OpenAIRealtimeSTTService` for real-time streaming speech-to-text using OpenAI's Realtime API WebSocket transcription sessions. Supports local VAD and server-side VAD modes, noise reduction, and automatic reconnection.

10
changelog/3659.changed.md Normal file
View File

@@ -0,0 +1,10 @@
- ⚠️ The default `VADParams` `stop_secs` default is changing from `0.8` seconds
to `0.2` seconds. This change both simplifies the developer experience and
improves the performance of STT services. With a shorter `stop_secs` value,
STT services using a local VAD can finalize sooner, resulting in faster
transcription.
- `SpeechTimeoutUserTurnStopStrategy`: control how long to wait for
additional user speech using `user_speech_timeout` (default: 0.6 sec).
- `TurnAnalyzerUserTurnStopStrategy`: the turn analyzer automatically adjusts
the user wait time based on the audio input.

View File

@@ -0,0 +1 @@
- Moved interruption wait event from per-processor instance state to `InterruptionFrame` itself. Added `InterruptionFrame.complete()` to signal when the interruption has fully traversed the pipeline. Custom processors that block or consume an `InterruptionFrame` before it reaches the pipeline sink must call `frame.complete()` to avoid stalling `push_interruption_task_frame_and_wait()`. A warning is logged if completion does not happen within 2 seconds.

1
changelog/3663.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed `SambaNovaLLMService` and `GoogleLLMOpenAIBetaService` streams not being closed on cancellation/exception, which could leak sockets.

View File

@@ -0,0 +1 @@
- Update the default model to `scribe_v2` for `ElevenLabsSTTService`.

View File

@@ -0,0 +1 @@
- Changed the `DeepgramSTTService` default setting for `smart_format` to `False`, as agents don't need smart formatting. Disabling this setting provides a small performance improvement, as well.

1
changelog/3667.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed an issue in `InworldTTSService` where punctuation was pronounced. Now, the `InworldTTSService` ensures proper spacing between sentences, resolving pronunciation issues.

1
changelog/3668.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed `ParallelPipeline` allowing frames pushed by internal processors to escape during lifecycle frame (`StartFrame`/`EndFrame`/`CancelFrame`) synchronization. These frames are now buffered and flushed after all branches complete.

1
changelog/3678.added.md Normal file
View File

@@ -0,0 +1 @@
- Added pyright basic type checking configuration for the core framework.

View File

@@ -156,6 +156,10 @@ PLIVO_AUTH_TOKEN=...
# Qwen
QWEN_API_KEY=...
# Resemble AI
RESEMBLE_API_KEY=
RESEMBLE_VOICE_UUID=
# Rime
RIME_API_KEY=...
RIME_VOICE_ID=...

View File

@@ -19,7 +19,6 @@ from pipecat_ai_small_webrtc_prebuilt.frontend import SmallWebRTCPrebuiltUI
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -90,7 +89,7 @@ async def run_example(webrtc_connection: SmallWebRTCConnection):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -14,7 +14,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -75,7 +74,7 @@ async def main():
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -14,7 +14,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
InterruptionFrame,
TranscriptionFrame,
@@ -83,7 +82,7 @@ async def main():
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import Frame, LLMRunFrame, MetricsFrame
from pipecat.metrics.metrics import (
LLMUsageMetricsData,
@@ -108,7 +107,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from PIL import Image
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
BotStartedSpeakingFrame,
BotStoppedSpeakingFrame,
@@ -123,7 +122,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -79,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -78,7 +77,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -121,7 +120,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -17,7 +17,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMMessagesUpdateFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -105,7 +104,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -85,7 +84,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -88,7 +87,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -77,7 +76,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -89,7 +88,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -80,7 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -80,7 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -82,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -86,7 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -86,7 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -0,0 +1,135 @@
#
# Copyright (c) 2024-2026, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
LLMContextAggregatorPair,
LLMUserAggregatorParams,
)
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.openai.stt import OpenAISTTService
from pipecat.services.openai.tts import OpenAITTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.turns.user_stop import TurnAnalyzerUserTurnStopStrategy
from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = OpenAISTTService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4o-transcribe",
prompt="Expect words related to dogs, such as breed names.",
)
tts = OpenAITTSService(api_key=os.getenv("OPENAI_API_KEY"), voice="ballad")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are very knowledgable about dogs. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
context,
user_params=LLMUserAggregatorParams(
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
user_aggregator, # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
assistant_aggregator, # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_out_sample_rate=24000,
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -25,8 +24,9 @@ from pipecat.processors.aggregators.llm_response_universal import (
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.openai.stt import OpenAISTTService
from pipecat.services.openai.stt import OpenAIRealtimeSTTService
from pipecat.services.openai.tts import OpenAITTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -56,10 +56,15 @@ transport_params = {
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = OpenAISTTService(
stt = OpenAIRealtimeSTTService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4o-transcribe",
prompt="Expect words related to dogs, such as breed names.",
language=Language.EN,
# Uses local VAD by default.
# To enable server-side VAD, set turn_detection=None or
# a dict with server_vad settings.
# turn_detection={"type": "server_vad", "threshold": 0.5},
)
tts = OpenAITTSService(api_key=os.getenv("OPENAI_API_KEY"), voice="ballad")
@@ -80,7 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -85,7 +84,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -85,7 +84,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -86,7 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
context,
user_params=LLMUserAggregatorParams(
user_turn_strategies=ExternalUserTurnStrategies(),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -89,7 +88,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -76,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -78,7 +77,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -10,7 +10,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMMessagesAppendFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -123,7 +122,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -10,7 +10,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -82,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -27,7 +27,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -104,7 +103,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -109,7 +108,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -92,7 +91,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -92,7 +91,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -82,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -30,7 +30,6 @@ from loguru import logger
from pipecat.audio.filters.krisp_viva_filter import KrispVivaFilter
from pipecat.audio.turn.krisp_viva_turn import KrispVivaTurn
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -97,7 +96,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=KrispVivaTurn())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.filters.krisp_filter import KrispFilter
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -80,7 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -87,7 +86,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -79,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -78,7 +77,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -14,7 +14,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
Frame,
InputAudioRawFrame,
@@ -251,7 +250,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)
audio_collector = UserAudioCollector(context, user_aggregator)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -80,7 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -86,7 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -79,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -82,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -67,7 +66,7 @@ async def main():
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -88,7 +87,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -90,7 +89,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -82,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -79,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame, TTSTextFrame
from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
from pipecat.pipeline.pipeline import Pipeline
@@ -86,7 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame, TTSTextFrame
from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
from pipecat.pipeline.pipeline import Pipeline
@@ -81,7 +80,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -86,7 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -80,7 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -38,13 +38,6 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# Create audio buffer processor so we can hear the audio fitler results.
audiobuffer = AudioBufferProcessor(
num_channels=2, # 1 for mono, 2 for stereo (user left, bot right)
enable_turn_audio=False, # Enable per-turn audio recording
)
def _create_aic_filter() -> AICFilter:
license_key = os.getenv("AICOUSTICS_LICENSE_KEY", "")
@@ -54,39 +47,29 @@ def _create_aic_filter() -> AICFilter:
)
aic_filter = _create_aic_filter()
aic_vad_analyzer = aic_filter.create_vad_analyzer(
speech_hold_duration=0.05, minimum_speech_duration=0.0, sensitivity=6.0
)
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: (
lambda aic: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=aic.create_vad_analyzer(
speech_hold_duration=0.05, minimum_speech_duration=0.0, sensitivity=6.0
),
audio_in_filter=aic,
)
)(_create_aic_filter()),
"twilio": lambda: (
lambda aic: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=aic.create_vad_analyzer(
speech_hold_duration=0.05, minimum_speech_duration=0.0, sensitivity=6.0
),
audio_in_filter=aic,
)
)(_create_aic_filter()),
"webrtc": lambda: (
lambda aic: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=aic.create_vad_analyzer(
speech_hold_duration=0.05, minimum_speech_duration=0.0, sensitivity=6.0
),
audio_in_filter=aic,
)
)(_create_aic_filter()),
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
audio_in_filter=aic_filter,
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
audio_in_filter=aic_filter,
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
audio_in_filter=aic_filter,
),
}
@@ -113,12 +96,19 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
context,
user_params=LLMUserAggregatorParams(
vad_analyzer=aic_vad_analyzer,
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
),
)
# Create audio buffer processor so we can hear the audio fitler results.
audiobuffer = AudioBufferProcessor(
num_channels=2, # 1 for mono, 2 for stereo (user left, bot right)
enable_turn_audio=False, # Enable per-turn audio recording
)
pipeline = Pipeline(
[
transport.input(), # Transport user input

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame, TTSTextFrame
from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
from pipecat.pipeline.pipeline import Pipeline
@@ -82,7 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -86,7 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -81,7 +80,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -84,7 +83,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -76,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -76,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -0,0 +1,134 @@
#
# Copyright (c) 2024-2026, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
LLMContextAggregatorPair,
LLMUserAggregatorParams,
)
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.resembleai.tts import ResembleAITTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.turns.user_stop import TurnAnalyzerUserTurnStopStrategy
from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = ResembleAITTSService(
api_key=os.getenv("RESEMBLE_API_KEY"),
voice_id=os.getenv("RESEMBLE_VOICE_UUID"),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be spoken aloud, so avoid special characters that can't easily be spoken, such as emojis or bullet points. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
context,
user_params=LLMUserAggregatorParams(
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
),
)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
user_aggregator, # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
assistant_aggregator, # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
Frame,
LLMRunFrame,
@@ -118,7 +117,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -81,7 +80,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
@@ -129,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)
out_sound = OutboundSoundEffectWrapper()

View File

@@ -13,7 +13,6 @@ from PIL import Image
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -76,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from PIL import Image
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -76,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from PIL import Image
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -83,7 +82,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from PIL import Image
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -76,7 +75,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -18,7 +18,7 @@ from pipecat.pipeline.task import PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.elevenlabs.stt import ElevenLabsSTTService
from pipecat.services.elevenlabs.stt import ElevenLabsRealtimeSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
@@ -51,29 +51,25 @@ transport_params = {
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
async with aiohttp.ClientSession() as session:
stt = ElevenLabsSTTService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
aiohttp_session=session,
)
stt = ElevenLabsRealtimeSTTService(api_key=os.getenv("ELEVENLABS_API_KEY"))
tl = TranscriptionLogger()
tl = TranscriptionLogger()
pipeline = Pipeline([transport.input(), stt, tl])
pipeline = Pipeline([transport.input(), stt, tl])
task = PipelineTask(
pipeline,
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
task = PipelineTask(
pipeline,
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
await runner.run(task)
async def bot(runner_args: RunnerArguments):

Some files were not shown because too many files have changed in this diff Show More