Compare commits

149 Commits

Author SHA1 Message Date
Mark Backman
5a6cc4d35c Replace assert-based type narrowing with local variables and guards
Use local variable narrowing and if-guards instead of assert statements
for type safety, since asserts are stripped with python -O.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 16:46:45 -05:00
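The narrowing pattern this commit describes can be sketched as follows. This is a minimal illustration, not code from the commit: `Session`, `_websocket`, and both method names are hypothetical.

```python
from typing import Optional


class Session:
    """Hypothetical class, used only to illustrate the pattern."""

    def __init__(self) -> None:
        self._websocket: Optional[str] = None

    def send_with_assert(self, data: str) -> str:
        # The assert narrows Optional[str] to str for the type checker,
        # but the check disappears when Python runs with -O.
        assert self._websocket is not None
        return f"{self._websocket}:{data}"

    def send_with_guard(self, data: str) -> str:
        # Copy the attribute into a local so the narrowing holds for the
        # rest of the function, then guard explicitly.
        websocket = self._websocket
        if websocket is None:
            raise RuntimeError("websocket is not connected")
        return f"{websocket}:{data}"
```

The guard version keeps its runtime check under `python -O`, whereas the assert version would proceed with `None` there.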
Mark Backman
28be775740 Reduce type: ignore comments by fixing avoidable type mismatches
Replace ~20 type: ignore comments with proper type fixes:
- Widen set_tools() to accept List[dict] | ToolsSchema | NotGiven
- Widen create_task() to accept Coroutine | Awaitable
- Fix _turn_params to use BaseTurnParams instead of SmartTurnParams
- Make _thought_llm Optional[str] with assertion guard
- Add mixer assertion, websocket narrowing, ice_servers cast
- Use dict.get() in protobuf serializer
- Make remote_participants Optional in Daily transport

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:30:35 -05:00
Mark Backman
bc730e4069 Enable pyright basic type checking for core framework
Add pyright configuration (basic mode, Python 3.10) to pyproject.toml
and fix all 276 type errors in the core framework (everything except
services/ and adapters/). This establishes a CI-ready type checking
baseline as Pipecat approaches 1.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:30:35 -05:00
Mark Backman
104d06551a Merge pull request #3679 from pipecat-ai/mb/remove-to-be-updated
Remove SequentialMergePipeline
2026-02-08 15:28:38 -05:00
Mark Backman
90ad2a4e81 Remove SequentialMergePipeline 2026-02-08 14:44:48 -05:00
Mark Backman
570f2d7fc0 Merge pull request #3667 from ianbbqzy/ian/fix-auto-mode-space
[inworld] aggregate_sentence mode needs trailing space
2026-02-07 18:22:32 -05:00
Ian Lee
f3d99adf8f [inworld] aggregate_sentence mode needs trailing space 2026-02-07 15:18:24 -08:00
Mark Backman
d34f416281 Merge pull request #3598 from dhruvladia-sarvam/sarvam-v3-update
ASR and TTS v3 update
2026-02-07 10:51:35 -05:00
Mark Backman
5a1deb7cb4 Merge pull request #3659 from pipecat-ai/mb/change-vad-defaults
Set VADParams stop_secs to 0.2 by default
2026-02-06 23:51:50 -05:00
Mark Backman
a5fc2b1650 Set VADParams stop_secs to 0.2 by default 2026-02-06 23:49:08 -05:00
Aleix Conchillo Flaqué
5cb8d91431 added changelog file for #3616 2026-02-06 16:45:23 -08:00
Aleix Conchillo Flaqué
ce690848c0 Merge pull request #3616 from omChauhanDev/fix/function-call-timeout-task-cleanup
fix: ensure function call timeout task is always cancelled
2026-02-06 16:40:56 -08:00
Aleix Conchillo Flaqué
30f51edfcd Merge pull request #3668 from pipecat-ai/aleix/parallel-pipeline-buffering
Buffer internal frames during ParallelPipeline lifecycle sync
2026-02-06 15:25:32 -08:00
Aleix Conchillo Flaqué
cd03d449cb Update changelog skill with skip rules and allowed types 2026-02-06 15:23:14 -08:00
Aleix Conchillo Flaqué
57df03aade Update CLAUDE.md with PR workflow instructions 2026-02-06 15:23:14 -08:00
Aleix Conchillo Flaqué
4945cfbd8f Buffer internal frames during ParallelPipeline lifecycle synchronization
Processors inside parallel sub-pipelines can push frames during
StartFrame/EndFrame/CancelFrame processing. Previously these frames
could escape the ParallelPipeline before all branches finished
processing the lifecycle frame. Now they are buffered and flushed
after synchronization completes.
2026-02-06 15:15:46 -08:00
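A rough sketch of the buffering idea in this commit message, under stated assumptions: `BufferingParallel` and its methods are hypothetical stand-ins, frames are plain strings, and where the lifecycle frame itself is emitted relative to the flush is deliberately left out.

```python
import asyncio
from typing import Awaitable, Callable, List


class BufferingParallel:
    """Hypothetical stand-in for a parallel pipeline; not Pipecat's class."""

    def __init__(self) -> None:
        self.output: List[str] = []   # frames that have left the pipeline
        self._buffer: List[str] = []  # frames held back during synchronization
        self._syncing = False

    def push(self, frame: str) -> None:
        # While branches are still handling a lifecycle frame, hold frames back.
        if self._syncing:
            self._buffer.append(frame)
        else:
            self.output.append(frame)

    async def process_lifecycle(
        self, frame: str, branches: List[Callable[[str], Awaitable[None]]]
    ) -> None:
        self._syncing = True
        try:
            # Wait until every branch has processed the lifecycle frame.
            await asyncio.gather(*(branch(frame) for branch in branches))
        finally:
            self._syncing = False
        # Synchronization is done: release everything that was buffered.
        self.output.extend(self._buffer)
        self._buffer.clear()
```

Frames pushed by a branch while `process_lifecycle()` is running only appear in `output` once all branches have finished.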
Mark Backman
8d37d3bae7 Merge pull request #3666 from pipecat-ai/mb/deepgram-stt-smart-format
DeepgramSTTService: disable smart_format by default
2026-02-06 14:04:37 -05:00
Mark Backman
d7b1624d3c Merge pull request #3663 from lukepayyapilli/fix/stream-close-sambanova-google
fix: close stream on cancellation for SambaNova and Google OpenAI services
2026-02-06 14:02:31 -05:00
Mark Backman
7f65204c3b DeepgramSTTService: disable smart_format by default 2026-02-06 13:45:10 -05:00
Aleix Conchillo Flaqué
97eff414c3 Merge pull request #3660 from pipecat-ai/aleix/interruption-frame-completion-event
Attach asyncio.Event to InterruptionFrame for completion signaling
2026-02-06 10:14:26 -08:00
Aleix Conchillo Flaqué
5b67e76de7 Add changelog for PR #3660 2026-02-06 10:11:00 -08:00
Aleix Conchillo Flaqué
b9e79bd06a CLAUDE.md: explain about InterruptionFrame.complete() 2026-02-06 10:11:00 -08:00
Aleix Conchillo Flaqué
d5105a78e6 STTMuteFilter should call frame.complete() when InterruptionFrame is blocked 2026-02-06 10:11:00 -08:00
Aleix Conchillo Flaqué
a352b2d7a0 Add tests for InterruptionFrame completion event
Add tests for the event-based interruption completion: complete() sets
the event, complete() is safe without an event, the event fires at
the pipeline sink, and a warning is logged when the frame is blocked.

Also remove the unconditional await after the timeout so the function
returns instead of hanging when complete() is never called.
2026-02-06 09:57:24 -08:00
Aleix Conchillo Flaqué
2345090b10 Attach asyncio.Event to InterruptionFrame for completion signaling
Move the interruption wait event from per-processor instance state to
the frame itself. The event is created in
push_interruption_task_frame_and_wait(), threaded through
InterruptionTaskFrame → InterruptionFrame, and set when the frame
reaches the pipeline sink. This scopes the event to each interruption
flow rather than sharing mutable state on the processor.

Also adds a 2s timeout warning to help diagnose cases where
InterruptionFrame.complete() is never called.
2026-02-06 09:57:24 -08:00
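The event-on-the-frame approach described here might look roughly like the sketch below. `InterruptionSketch` and `push_and_wait` are hypothetical names and the frame is reduced to a single field; only the per-flow `asyncio.Event`, the `complete()` call, and the 2-second timeout warning come from the commit messages.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional

logger = logging.getLogger(__name__)


@dataclass
class InterruptionSketch:
    """Hypothetical frame that carries its own completion event."""

    event: Optional[asyncio.Event] = None

    def complete(self) -> None:
        # Safe to call when nobody is waiting (event is None).
        if self.event is not None:
            self.event.set()


async def push_and_wait(
    push: Callable[[InterruptionSketch], Awaitable[None]], timeout: float = 2.0
) -> bool:
    """Create a per-flow event, push the frame, and wait for completion."""
    event = asyncio.Event()
    await push(InterruptionSketch(event=event))
    try:
        await asyncio.wait_for(event.wait(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        logger.warning("interruption frame was never completed")
        return False  # return instead of waiting forever
```

Because the event lives on the frame, two overlapping interruption flows never share state, and a processor that blocks the frame is expected to call `complete()` itself.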
Mark Backman
af562bf9a8 Merge pull request #3664 from pipecat-ai/mb/elevenlabs-scribe-v2
Update ElevenLabsSTTService to scribe_v2
2026-02-06 12:31:44 -05:00
Mark Backman
d4993f0dcf Update ElevenLabsSTTService to scribe_v2 2026-02-06 11:37:23 -05:00
Luke Payyapilli
1790a84bfd add changelog 2026-02-06 10:05:02 -05:00
Luke Payyapilli
29c53b99a4 fix: close stream on cancellation for SambaNova and Google OpenAI services 2026-02-06 10:02:40 -05:00
Mark Backman
aa5a855eab Merge pull request #3656 from pipecat-ai/mb/openai-realtime-stt
Add OpenAIRealtimeSTTService
2026-02-06 09:15:58 -05:00
Mark Backman
e66d6f8ffe Merge pull request #3658 from pipecat-ai/mb/bump-protobuf-5.29.6
Upgrade protobuf to >=5.29.6
2026-02-05 19:09:30 -05:00
Mark Backman
b8ac2ba713 Merge pull request #3593 from ianbbqzy/ian/inworld-auto-mode
Add auto_mode support for inworld plugin
2026-02-05 18:16:38 -05:00
Ian Lee
6eea40858e fix lint and changelog 2026-02-05 15:10:36 -08:00
Mark Backman
90700d10aa Upgrade protobuf to >=5.29.6 2026-02-05 18:08:52 -05:00
Mark Backman
fa85f7bbc7 Merge pull request #3640 from lukepayyapilli/fix/openai-stream-close
fix: close stream on cancellation to prevent socket leaks
2026-02-05 18:00:06 -05:00
Mark Backman
669f013970 Merge pull request #3657 from pipecat-ai/filipi/changing_no_audio_log_to_debug
Changing the ‘no audio received’ log from warning to debug.
2026-02-05 17:35:24 -05:00
filipi87
76f63e54e2 Changing the ‘no audio received’ log from warning to debug. 2026-02-05 18:07:14 -03:00
Filipi da Silva Fuchter
cce5a13444 Merge pull request #3650 from pipecat-ai/filipi/twilio_issues
Ignoring RTVI messages inside the Serializers by default.
2026-02-05 15:52:59 -05:00
Mark Backman
d11e1cd631 Update 13k to use ElevenLabsRealtimeSTTService 2026-02-05 15:48:00 -05:00
Mark Backman
8b9da632d1 Add OpenAIRealtimeSTTService 2026-02-05 15:48:00 -05:00
Mark Backman
b36f7892a4 Merge pull request #3654 from pipecat-ai/aleix/more-claude-update
CLAUDE.md: add RTVI and serializers
2026-02-05 15:23:35 -05:00
Mark Backman
9b43cde128 Merge pull request #3355 from itsderek23/user-bot-latency
Add `user_bot_latency_seconds` to OpenTelemetry turn spans
2026-02-05 15:23:15 -05:00
filipi87
6af4d872a8 Refactoring the serializers to ignore the RTVI messages by default. 2026-02-05 16:52:53 -03:00
Ian Lee
22398e1410 add changelog back 2026-02-05 11:39:39 -08:00
Ian Lee
d10467e043 update timestamps reset handling 2026-02-05 11:39:39 -08:00
Ian Lee
cbe131636d add changelog 2026-02-05 11:39:39 -08:00
Ian Lee
fef9e3ea32 Add auto_mode support for inworld plugin 2026-02-05 11:39:39 -08:00
Mark Backman
56d8ef2bf4 Deprecate UserBotLatencyLogObserver, update 29 example 2026-02-05 14:29:45 -05:00
Derek Haynes
8791559351 Add changelog entry for PR #3355 2026-02-05 14:29:45 -05:00
Derek Haynes
f6c919354f Add test for user bot latency 2026-02-05 14:29:45 -05:00
Derek Haynes
93138466d6 Feat: Add user-bot latency to OTel turn spans
This adds user-to-bot response latency tracking to OpenTelemetry spans:

- Created UserBotLatencyObserver as a reusable component for tracking
user-to-bot response latency
- Records the value as an attribute on turn spans (turn.user_bot_latency_seconds)
- Updated TurnTraceObserver to use UserBotLatencyObserver, following the same pattern as TurnTrackingObserver
- Updated PipelineTask to automatically create and wire UserBotLatencyObserver
when tracing is enabled (same as TurnTrackingObserver)
2026-02-05 14:29:42 -05:00
Mark Backman
5a5a98b497 Merge pull request #3649 from itsderek23/fix/tracing-orphan-spans
Fix orphan otel spans during flow initialization and transitions
2026-02-05 14:23:52 -05:00
Aleix Conchillo Flaqué
2b4f507d37 CLAUDE.md: add RTVI and serializers 2026-02-05 11:06:00 -08:00
Mark Backman
d6f3a90662 Merge pull request #3652 from pipecat-ai/mb/upgrade-small-webrtc-prebuilt-2.1.0
Upgrade pipecat-ai-small-webrtc-prebuilt to 2.1.0
2026-02-05 13:48:54 -05:00
Derek Haynes
8fb0e37965 Update changelog for #3649 2026-02-05 11:35:22 -07:00
Derek Haynes
0d45b48f7b Fix import placement 2026-02-05 11:26:58 -07:00
Mark Backman
6af4520b1f Merge pull request #3635 from pipecat-ai/filipi/fix_websocket
Fixed an error in the WebSocket transport that occurred when an InputTransportMessageFrame was received and broadcast.
2026-02-05 12:22:59 -05:00
filipi87
ba469e5645 Add changelog entry
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-05 12:19:51 -05:00
Mark Backman
bd12b60b5c Merge pull request #3614 from okue/fix/websocket-broadcast-frame-misuse
fix: pass frame class instead of instance to broadcast_frame in websocket transports
2026-02-05 12:19:03 -05:00
Mark Backman
54db37ea47 Upgrade pipecat-ai-small-webrtc-prebuilt to 2.1.0 2026-02-05 12:09:51 -05:00
filipi87
752e16f553 Ignoring RTVI messages inside TwilioSerializer by default. 2026-02-05 10:51:03 -03:00
Derek Haynes
7c7408a048 Fix orphan spans in tracing during flow initialization and transitions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 06:06:13 -07:00
Mark Backman
8f42343927 Merge pull request #3630 from pipecat-ai/mb/add-function-call-messages-rtvi
Add native RTVI function call lifecycle messages
2026-02-04 16:20:42 -05:00
Mark Backman
46da6cd91b Update changelogs 2026-02-04 11:19:30 -05:00
Mark Backman
ecb02d9049 Bump RTVI_PROTOCOL_VERSION to 1.2.0 2026-02-04 11:17:38 -05:00
Mark Backman
cc68e00125 Deprecate llm-function-call message 2026-02-04 11:17:23 -05:00
Mark Backman
e0e3b5250b Add RTVIObserverParams to control what information is included in function call events 2026-02-04 11:05:05 -05:00
Luke Payyapilli
55a3b10e70 fix(openai): close stream on cancellation to prevent socket leaks 2026-02-04 09:59:10 -05:00
dhruvladia-sarvam
e6b06414b3 change default speaker for bulbul:v3-beta to shubh 2026-02-04 16:46:35 +05:30
Aleix Conchillo Flaqué
6bcfb40d12 Merge pull request #3636 from pipecat-ai/aleix/initial-claude-md
initial CLAUDE.md
2026-02-03 19:31:16 -08:00
Aleix Conchillo Flaqué
65b1a8ce36 initial CLAUDE.md 2026-02-03 18:04:54 -08:00
Mark Backman
2db3d94d06 Merge pull request #3628 from pipecat-ai/mb/broadcast-speech-control-params-frame
Fix: Broadcast SpeechControlParamsFrame from VADController
2026-02-03 18:44:15 -05:00
Mark Backman
2a26b9f7a3 Fix: Broadcast SpeechControlParamsFrame from VADController 2026-02-03 18:40:39 -05:00
Aleix Conchillo Flaqué
4f77c532fb Merge pull request #3623 from pipecat-ai/aleix/pipeline-task-rtvi-always-set-bot-ready
PipelineTask: also call set_bot_ready() for external RTVI processors
2026-02-03 14:21:03 -08:00
Aleix Conchillo Flaqué
c3a4da4a29 PipelineTask: also call set_bot_ready() for external RTVI processors 2026-02-03 14:16:08 -08:00
Mark Backman
84ca0b6d58 Merge pull request #3629 from pipecat-ai/fix/telephony-websocket-stopasynciteration
Fix StopAsyncIteration in parse_telephony_websocket
2026-02-03 12:10:07 -05:00
Mark Backman
c1857d255d Avoid nesting try/excepts 2026-02-03 12:00:04 -05:00
Mark Backman
d50ec33079 Merge pull request #3542 from lukepayyapilli/fix/terminal-frames-uninterruptible
fix: make EndFrame and StopFrame uninterruptible to prevent pipeline freeze
2026-02-03 10:08:17 -05:00
Mark Backman
40c84faff5 Remove handle_function_call_start 2026-02-03 10:00:59 -05:00
Mark Backman
84cd9346f9 Add native RTVI function call lifecycle messages 2026-02-03 10:00:59 -05:00
Luke Payyapilli
5d5b19e1d2 Add changelog entry 2026-02-03 09:12:59 -05:00
Luke Payyapilli
8d3e10f054 Make EndFrame and StopFrame uninterruptible to prevent pipeline freeze 2026-02-03 09:12:59 -05:00
dhruvladia-sarvam
1665ce181a refactor(sarvam): centralize model configuration with dataclasses 2026-02-03 14:33:41 +05:30
James Hush
803a20cc00 Fix formatting: remove extra blank line
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 16:46:44 +08:00
James Hush
90bead06ab Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-03 16:42:13 +08:00
James Hush
b427d534ae Add tests for parse_telephony_websocket StopAsyncIteration handling
Tests cover:
- No messages received (raises ValueError)
- One message received (logs warning, continues)
- Two messages received (normal operation)
- All telephony providers (Twilio, Telnyx, Plivo, Exotel)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 16:33:36 +08:00
James Hush
b030f1178d Add changelog and improve docstring for parse_telephony_websocket
- Added changelog entry for bug fix
- Enhanced docstring with Args and Raises sections

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 16:26:09 +08:00
James Hush
a627597bca Fix StopAsyncIteration in parse_telephony_websocket
Handle WebSocket disconnections gracefully when telephony providers send
fewer messages than expected. Adds explicit StopAsyncIteration handling
for both first and second message retrieval.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 16:25:07 +08:00
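One way to handle the two reads without letting `StopAsyncIteration` escape is sketched below. The function and helper names are hypothetical; the behaviour (no messages raises `ValueError`, one message logs a warning and continues) follows the test cases listed in the accompanying test commit.

```python
import asyncio
import logging
from typing import AsyncIterator, List, Optional, Tuple

logger = logging.getLogger(__name__)


async def read_first_two(messages: AsyncIterator[str]) -> Tuple[str, Optional[str]]:
    """Read up to two handshake messages from an async iterator."""
    try:
        first = await messages.__anext__()
    except StopAsyncIteration:
        raise ValueError("connection closed before any message arrived") from None
    try:
        second: Optional[str] = await messages.__anext__()
    except StopAsyncIteration:
        logger.warning("only one message received; continuing with it")
        second = None
    return first, second


async def stream(items: List[str]) -> AsyncIterator[str]:
    # Stand-in for a WebSocket's text iterator.
    for item in items:
        yield item
```

The two `try` blocks are sequential rather than nested, which keeps each failure mode separate.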
Aleix Conchillo Flaqué
4c10ddb7bb upgrade uv.lock 2026-02-02 16:25:06 -08:00
Mark Backman
a4e499dc80 Merge pull request #3617 from pipecat-ai/fix/cjk-sentence-splitting
Fix sentence splitting for CJK and other non-Latin languages
2026-02-02 18:16:51 -05:00
Mark Backman
ca49acfaa6 Merge pull request #3619 from pipecat-ai/mb/resemble-readme
Resemble cleanup
2026-02-02 09:20:11 -05:00
Mark Backman
86147f15f3 Renumber the Resemble foundational example 2026-02-02 09:07:05 -05:00
Mark Backman
5cda72d138 Add Resemble TTS to README 2026-02-02 09:05:03 -05:00
Mark Backman
54e62a8177 Merge pull request #3134 from pipecat-ai/mb/resemble-tts-draft
Add ResembleAITTSService
2026-02-02 08:59:27 -05:00
Mark Backman
a592b7fdf0 Update per PR 1789, align with ErrorFrame norms 2026-02-02 08:55:29 -05:00
Mark Backman
ba2b7c05d6 Add ResembleAITTSService 2026-02-02 08:55:27 -05:00
James Hush
774041e9a1 Add changelog for PR #3617
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 14:47:22 +08:00
James Hush
763002f2bc Fix sentence splitting for CJK and other non-Latin languages in TTS pipeline
NLTK's sent_tokenize() only supports ~15 European languages and defaults to
English. For Japanese, Chinese, Korean, Hindi, Arabic, and other non-Latin
languages, NLTK fails to recognize sentence boundaries like 。?! causing
text to accumulate until flush instead of being emitted sentence-by-sentence.

Add a fallback in match_endofsentence() that scans for unambiguous non-Latin
sentence-ending punctuation when NLTK fails to split the text. Latin
punctuation (. ! ? ; …) is excluded from the fallback since NLTK handles
those correctly and they can be ambiguous (abbreviations, decimals, etc.).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 14:27:49 +08:00
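The fallback scan can be illustrated with a short sketch. The punctuation set below is an assumption chosen for illustration, not the list used in the commit, and `fallback_end_of_sentence` is a hypothetical name; the NLTK step that runs first is omitted.

```python
from typing import Optional

# Illustrative set of sentence endings that are unambiguous outside Latin scripts:
# ideographic full stop, fullwidth question mark, fullwidth exclamation mark,
# Devanagari danda, Arabic question mark.
NON_LATIN_ENDINGS = frozenset("\u3002\uff1f\uff01\u0964\u061f")


def fallback_end_of_sentence(text: str) -> Optional[int]:
    """Return the index just past the first non-Latin sentence ending, if any."""
    for index, char in enumerate(text):
        if char in NON_LATIN_ENDINGS:
            return index + 1
    return None
```

Latin marks such as `.` are left out on purpose, so abbreviations and decimals in English text never trigger the fallback.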
Om Chauhan
50dedf350d fix: ensure function call timeout task is always cancelled 2026-02-02 08:38:54 +05:30
okue
d3ecbb11c1 fix: pass frame class instead of instance to broadcast_frame in websocket transports
broadcast_frame() expects a frame class and kwargs, but the three
websocket input transports (fastapi, client, server) were incorrectly
passing a frame instance. This would cause a TypeError at runtime when
an InputTransportMessageFrame was received.
2026-02-01 20:38:34 +09:00
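The class-versus-instance distinction behind this fix can be sketched with simplified stand-ins. These functions are not Pipecat's implementations: the signatures, the direction labels, and `MessageFrame` are assumptions, and the instance-based helper only mirrors the idea of the `broadcast_frame_instance()` method mentioned in the 0.0.101 changelog.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Tuple, Type


@dataclass
class MessageFrame:
    """Hypothetical frame type used only for this sketch."""

    message: str


def broadcast_frame(frame_cls: Type[Any], **kwargs: Any) -> List[Tuple[str, Any]]:
    # A fresh instance per direction, so the two copies never share state.
    return [
        ("downstream", frame_cls(**kwargs)),
        ("upstream", frame_cls(**kwargs)),
    ]


def broadcast_frame_instance(frame: Any) -> List[Tuple[str, Any]]:
    # Rebuild the constructor arguments from the instance's dataclass fields.
    kwargs = {f.name: getattr(frame, f.name) for f in fields(frame) if f.init}
    return broadcast_frame(type(frame), **kwargs)
```

Passing an instance to the class-based function fails in this sketch because the instance is not callable, which is the kind of `TypeError` the commit message describes.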
Aleix Conchillo Flaqué
f453227ba3 Merge pull request #3612 from pipecat-ai/aleix/use-kokoro-onnx
KokoroTTSService: use kokoro-onnx instead of kokoro
2026-01-31 21:03:55 -08:00
Aleix Conchillo Flaqué
52cc64019a Merge pull request #3611 from pipecat-ai/aleix/aicoustics-example-update
examples: update 07zd to use vad_analyzer in LLMUserAggregator
2026-01-31 21:02:50 -08:00
Aleix Conchillo Flaqué
95689cc81c KokoroTTSService: use kokoro-onnx instead of kokoro 2026-01-31 17:20:27 -08:00
Aleix Conchillo Flaqué
675c7c43e3 examples: update 07zd to use vad_analyzer in LLMUserAggregator 2026-01-31 15:31:15 -08:00
Aleix Conchillo Flaqué
bfd19e867c Merge pull request #3610 from pipecat-ai/aleix/dont-add-rtvi-observer-if-already-there
PipelineTask: don't add RTVIObserver if already there
2026-01-31 14:57:52 -08:00
Aleix Conchillo Flaqué
acc9923c0a PipelineTask: don't add RTVIObserver if already there 2026-01-31 14:54:29 -08:00
Mark Backman
bdc9e7e2e4 Merge pull request #3608 from pipecat-ai/mb/quickstart-0.0.101
Update quickstart for 0.0.101
2026-01-31 10:39:17 -05:00
Mark Backman
a587e1b99a Update quickstart for 0.0.101 2026-01-31 09:52:24 -05:00
Aleix Conchillo Flaqué
7853e5ca93 Merge pull request #3606 from pipecat-ai/changelog-0.0.101
Release 0.0.101 - Changelog Update
2026-01-30 22:58:22 -08:00
aconchillo
614b8e1a62 Update changelog for version 0.0.101 2026-01-30 22:54:31 -08:00
Aleix Conchillo Flaqué
ef51c2a5c6 changelog: fix 3582 changed file 2026-01-30 22:48:26 -08:00
Aleix Conchillo Flaqué
f42dc0d38e Merge pull request #3605 from pipecat-ai/aleix/gemini-live-schedule-transcription-timeout-handler
GeminiLiveLLMService: let the transcription timeout handler be scheduled
2026-01-30 22:44:05 -08:00
Aleix Conchillo Flaqué
d87f3543c7 GeminiLiveLLMService: let the transcription timeout handler be scheduled 2026-01-30 22:41:10 -08:00
Aleix Conchillo Flaqué
fee633cb92 scripts(evals): disable kokoro for now 2026-01-30 21:23:42 -08:00
Aleix Conchillo Flaqué
607af91153 Merge pull request #3604 from pipecat-ai/mb/fix-ivr-navigator-aggregation
Fix IVRNavigator to push AggregatedTextFrame when switching to conver…
2026-01-30 21:22:20 -08:00
Mark Backman
e779233918 Fix IVRNavigator to push AggregatedTextFrame when switching to conversation mode 2026-01-30 21:07:49 -05:00
Aleix Conchillo Flaqué
604d5d0b14 examples: update 07zi and 07zj to use vad_analyzer from LLMUserAggregator 2026-01-30 16:14:02 -08:00
Mark Backman
342ae7af41 Merge pull request #3601 from pipecat-ai/mb/add-22-release-evals
Add 22 foundational to release evals
2026-01-30 15:31:54 -05:00
Mark Backman
c92ec1552e Add 22 foundational to release evals 2026-01-30 15:12:52 -05:00
Aleix Conchillo Flaqué
93160f1455 scripts(evals): remove vad_analyzer from transport 2026-01-30 12:08:12 -08:00
Aleix Conchillo Flaqué
e3158e1131 Merge pull request #3600 from pipecat-ai/aleix/llm-server-timeout-task-never-waited
LLMService: make sure function call timeout handler is started
2026-01-30 12:01:18 -08:00
Mark Backman
63a23246d5 Add UserTurnCompletionLLMServiceMixin (#3518)
* Added UserTurnCompletionLLMServiceMixin class

* Added 22-filter-incomplete-turns.py foundational example

* Removed old 22 natural conversation foundational examples

* Added test_user_turn_completion_mixin.py
2026-01-30 14:57:15 -05:00
Aleix Conchillo Flaqué
569ea9849a Merge pull request #3599 from pipecat-ai/aleix/release-evals-disable-rtvi
scripts(evals): disable RTVI
2026-01-30 11:44:46 -08:00
Aleix Conchillo Flaqué
a98ca9b65b LLMService: make sure function call timeout handler is started 2026-01-30 11:38:26 -08:00
Aleix Conchillo Flaqué
c9310789dc scripts(evals): use new vad_analyzer from LLMUserAggregator 2026-01-30 10:57:17 -08:00
Aleix Conchillo Flaqué
b93e12d701 scripts(evals): disable RTVI 2026-01-30 10:52:38 -08:00
Aleix Conchillo Flaqué
3f77da627d Merge pull request #3583 from pipecat-ai/aleix/move-vad-analyzer-to-llm-user-aggregator
VAD analyzer is now passed to LLMUserAggregator
2026-01-30 10:46:10 -08:00
Aleix Conchillo Flaqué
35d265770d LLMUserAggregator: don't process certain self-queued frames 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
9632efec8c VADProcessor: broadcast frames 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
27dbfa1eda NvidiaTTSService: return AsyncIterator instead of AsyncIterable 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
183c0aa4ef LLMUserAggregator: queue frames internally so strategies and controllers can process them 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
a69a037ffa changelog: add updates for #3583 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
c46e7f5da0 TurnAnalyzerUserTurnStopStrategy: only update vad params if frame contains vad 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
307aeaeda0 examples: update with LLMUserAggregatorParams vad_analyzer and VADProcessor 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
305ab44132 tests: add unittest.main() call 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
b486f35c70 audio: add new VADProcessor 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
c92080b0d2 LLMUserAggregator: add vad_analyzer and use VADController 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
ddfedaf478 audio(vad): add new VADController 2026-01-30 10:07:34 -08:00
Aleix Conchillo Flaqué
b1ad4d5ab0 BaseInputTransport: deprecate vad_analyzer 2026-01-30 10:07:33 -08:00
Aleix Conchillo Flaqué
0857aa87be Merge pull request #3595 from pipecat-ai/aleix/add-kokoro-tts-support
services(tss): add new KokoroTTSService
2026-01-30 09:49:05 -08:00
Aleix Conchillo Flaqué
fd3c5f69b7 upgrade uv.lock 2026-01-30 09:41:33 -08:00
Aleix Conchillo Flaqué
72ab329513 services(tss): add new KokoroTTSService 2026-01-30 09:39:01 -08:00
Filipi da Silva Fuchter
7999d08b7e Merge pull request #3052 from Navigate-AI/fork/main
Include pts in video and audio frames in SmallWebRTCClient
2026-01-30 09:03:29 -05:00
dhruvladia-sarvam
57821cf709 fix 2026-01-30 16:07:52 +05:30
dhruvladia-sarvam
18045582a9 ASR and TTS v3 update 2026-01-30 15:53:06 +05:30
Mark Backman
7be2b8cc34 Merge pull request #3587 from pipecat-ai/mb/gradium-improvements
GradiumSTTService now flushes pending transcripts on VAD stopped dete…
2026-01-29 18:11:25 -05:00
Mark Backman
31c7fbc5ba Add delay_in_frames and language support 2026-01-29 10:59:04 -05:00
Mark Backman
6ab12626d6 GradiumSTTService now flushes pending transcripts on VAD stopped detection 2026-01-29 10:26:17 -05:00
Martin Liu
8dfc59be13 Include pts in incoming video and audio frames 2025-11-12 18:36:56 -05:00
411 changed files with 8835 additions and 5002 deletions

@@ -7,23 +7,30 @@ Create changelog files for the important commits in this PR. The PR number is pr
 ## Instructions
-1. First, check what commits are on the current branch compared to main:
+1. Skip changelog for: documentation-only, internal refactoring, test-only, CI changes.
+2. First, check what commits are on the current branch compared to main:
 ```
 git log main..HEAD --oneline
 ```
-2. For each significant change, create a changelog file in the `changelog/` folder using the format:
+3. For each significant change, create a changelog file in the `changelog/` folder using the format:
+Allowed types: `added`, `changed`, `deprecated`, `removed`, `fixed`, `security`, `performance`, `other`
 - `{PR_NUMBER}.added.md` - for new features
-- `{PR_NUMBER}.added.2.md`, `{PR_NUMBER}.added.3.md` - for additional new features
+- `{PR_NUMBER}.added.2.md`, `{PR_NUMBER}.added.3.md` - for additional entries of the same type
 - `{PR_NUMBER}.changed.md` - for changes to existing functionality
 - `{PR_NUMBER}.fixed.md` - for bug fixes
 - `{PR_NUMBER}.deprecated.md` - for deprecations
 - `{PR_NUMBER}.removed.md` - for removed features
+- `{PR_NUMBER}.security.md` - for security fixes
+- `{PR_NUMBER}.performance.md` - for performance improvements
+- `{PR_NUMBER}.other.md` - for other changes
-3. Each changelog file should at least contain a main single line starting with `- ` followed by a clear description of the change.
+4. Each changelog file should at least contain a main single line starting with `- ` followed by a clear description of the change.
-4. If the change is complicated, changelog files can have indented lines after the main line with additional details or code samples.
+5. If the change is complicated, changelog files can have indented lines after the main line with additional details or code samples.
-5. Use ⚠️ emoji prefix for breaking changes.
+6. Use ⚠️ emoji prefix for breaking changes.
 ## Example
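As an illustration that is not part of the skill file, the naming scheme in the instructions above could be checked with a small validator. The regular expression and the function name are assumptions built from the listed allowed types and the `{PR_NUMBER}.{type}.md` / `{PR_NUMBER}.{type}.{N}.md` patterns.

```python
import re

ALLOWED_TYPES = (
    "added", "changed", "deprecated", "removed",
    "fixed", "security", "performance", "other",
)

# {PR_NUMBER}.{type}.md, or {PR_NUMBER}.{type}.{N}.md for extra entries of a type.
CHANGELOG_NAME = re.compile(
    r"^(?P<pr>\d+)\.(?P<type>" + "|".join(ALLOWED_TYPES) + r")(?:\.(?P<n>\d+))?\.md$"
)


def is_valid_changelog_name(name: str) -> bool:
    """Return True if the file name follows the changelog naming scheme."""
    return CHANGELOG_NAME.match(name) is not None
```

For example, `3660.added.md` and `3660.added.2.md` pass, while `3660.docs.md` is rejected because `docs` is not an allowed type.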

@@ -7,6 +7,258 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
<!-- towncrier release notes start -->
## [0.0.101] - 2026-01-30
### Added
- Additions for `AICFilter` and `AICVADAnalyzer`:
  - Added model downloading support to `AICFilter` with `model_id` and
    `model_download_dir` parameters.
  - Added `model_path` parameter to `AICFilter` for loading local `.aicmodel`
    files.
  - Added unit tests for `AICFilter` and `AICVADAnalyzer`.
(PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))
- Added handling for `server_content.interrupted` signal in the Gemini Live
service for faster interruption response in the case where there isn't
already turn tracking in the pipeline, e.g. local VAD + context aggregators.
When there is already turn tracking in the pipeline, the additional
interruption does no harm.
(PR [#3429](https://github.com/pipecat-ai/pipecat/pull/3429))
- Added new `GenesysFrameSerializer` for the Genesys AudioHook WebSocket
protocol, enabling bidirectional audio streaming between Pipecat pipelines
and Genesys Cloud contact center.
(PR [#3500](https://github.com/pipecat-ai/pipecat/pull/3500))
- Added `reached_upstream_types` and `reached_downstream_types` read-only
properties to `PipelineTask` for inspecting current frame filters.
(PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))
- Added `add_reached_upstream_filter()` and `add_reached_downstream_filter()`
methods to `PipelineTask` for appending frame types.
(PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))
- Added `UserTurnCompletionLLMServiceMixin` for LLM services to detect and
filter incomplete user turns. When enabled via `filter_incomplete_user_turns`
in `LLMUserAggregatorParams`, the LLM outputs a turn completion marker at the
start of each response: ✓ (complete), ○ (incomplete short), or ◐ (incomplete
long). Incomplete turns are suppressed, and configurable timeouts
automatically re-prompt the user.
(PR [#3518](https://github.com/pipecat-ai/pipecat/pull/3518))
- Added `FrameProcessor.broadcast_frame_instance(frame)` method to broadcast a
frame instance by extracting its fields and creating new instances for each
direction.
(PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
- `PipelineTask` now automatically adds `RTVIProcessor` and registers
`RTVIObserver` when `enable_rtvi=True` (default), simplifying pipeline setup.
(PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
- Added `RTVIProcessor.create_rtvi_observer()` factory method for creating RTVI
observers.
(PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
- Added `video_out_codec` parameter to `TransportParams` allowing configuration
of the preferred video codec (e.g., `"VP8"`, `"H264"`, `"H265"`) for video
output in `DailyTransport`.
(PR [#3520](https://github.com/pipecat-ai/pipecat/pull/3520))
- Added `location` parameter to Google TTS services (`GoogleHttpTTSService`,
`GoogleTTSService`, `GeminiTTSService`) for regional endpoint support.
(PR [#3523](https://github.com/pipecat-ai/pipecat/pull/3523))
- Added new `PIPECAT_SMART_TURN_LOG_DATA` environment variable, which causes
Smart Turn input data to be saved to disk.
(PR [#3525](https://github.com/pipecat-ai/pipecat/pull/3525))
- Added `result_callback` parameter to `UserImageRequestFrame` to support
deferred function call results.
(PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))
- Added `function_call_timeout_secs` parameter to `LLMService` to configure
timeout for deferred function calls (defaults to 10.0 seconds).
(PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))
- Added `vad_analyzer` parameter to `LLMUserAggregatorParams`. VAD analysis is
now handled inside the `LLMUserAggregator` rather than in the transport,
keeping voice activity detection closer to where it is consumed. The
`vad_analyzer` on `BaseInputTransport` is now deprecated.
```python
context_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        vad_analyzer=SileroVADAnalyzer(),
    ),
)
```
(PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))
- Added `VADProcessor` for detecting speech in audio streams within a pipeline.
Pushes `VADUserStartedSpeakingFrame`, `VADUserStoppedSpeakingFrame`, and
`UserSpeakingFrame` downstream based on VAD state changes.
(PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))
- Added `VADController` for managing voice activity detection state and
emitting speech events independently of transport or pipeline processors.
(PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))
- Added local `PiperTTSService` for offline text-to-speech using Piper voice
models. The existing HTTP-based service has been renamed to
`PiperHttpTTSService`.
(PR [#3585](https://github.com/pipecat-ai/pipecat/pull/3585))
- `main()` in `pipecat.runner.run` now accepts an optional
`argparse.ArgumentParser`, allowing bots to define custom CLI arguments
accessible via `runner_args.cli_args`.
(PR [#3590](https://github.com/pipecat-ai/pipecat/pull/3590))
- Added `KokoroTTSService` for local text-to-speech synthesis using the
Kokoro-82M model.
(PR [#3595](https://github.com/pipecat-ai/pipecat/pull/3595))
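The turn completion markers described in the `UserTurnCompletionLLMServiceMixin` entry above could be separated from the response text along these lines. This is a sketch only: the function name, the status labels, and the handling of a response without a marker are assumptions, while the three marker characters come from the changelog entry.

```python
from typing import Tuple

# Markers as listed in the changelog entry.
MARKERS = {"✓": "complete", "○": "incomplete_short", "◐": "incomplete_long"}


def split_turn_marker(response: str) -> Tuple[str, str]:
    """Return (status, text) for a response that may start with a marker."""
    stripped = response.lstrip()
    if stripped and stripped[0] in MARKERS:
        return MARKERS[stripped[0]], stripped[1:].lstrip()
    # No marker found: report that and leave the text untouched.
    return "unknown", response
```

A caller would then suppress output for the two incomplete statuses and start whatever re-prompt timeout applies.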
### Changed
- Updated `AICFilter` and `AICVADAnalyzer` to use aic-sdk ~= 2.0.1.
(PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))
- Improved the STT TTFB (Time To First Byte) measurement, reporting the delay
between when the user stops speaking and when the final transcription is
received. Note: Unlike traditional TTFB which measures from a discrete
request, STT services receive continuous audio input—so we measure from
speech end to final transcript, which captures the latency that matters for
voice AI applications. In support of this change, added `finalized` field to
`TranscriptionFrame` to indicate when a transcript is the final result for an
utterance.
(PR [#3495](https://github.com/pipecat-ai/pipecat/pull/3495))
- `SarvamSTTService` now defaults `vad_signals` and `high_vad_sensitivity` to
`None` (omitted from connection parameters), improving latency by ~300ms
compared to the previous defaults.
(PR [#3495](https://github.com/pipecat-ai/pipecat/pull/3495))
- Changed frame filter storage from tuples to sets in `PipelineTask`.
(PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))
- Changed default Inworld TTS model from `inworld-tts-1` to
`inworld-tts-1.5-max`.
(PR [#3531](https://github.com/pipecat-ai/pipecat/pull/3531))
- `FrameSerializer` now subclasses from `BaseObject` to enable event support.
(PR [#3560](https://github.com/pipecat-ai/pipecat/pull/3560))
- Added support for TTFS in `SpeechmaticsSTTService` and set the default mode
to `EXTERNAL` to support Pipecat-controlled VAD.
  - Changed dependency to `speechmatics-voice[smart]>=0.2.8`
(PR [#3562](https://github.com/pipecat-ai/pipecat/pull/3562))
- ⚠️ Changed function call handling to use timeout-based completion instead of
immediate callback execution.
- Function calls that defer their results (e.g., `UserImageRequestFrame`)
now use a timeout mechanism
- The `result_callback` is invoked automatically when the deferred
operation completes or after timeout
- This change affects examples using `UserImageRequestFrame` - the
`result_callback` should now be passed to the frame instead of being called
immediately
(PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))
- Pipecat runner now uses `DAILY_ROOM_URL` instead of `DAILY_SAMPLE_ROOM_URL`.
(PR [#3582](https://github.com/pipecat-ai/pipecat/pull/3582))
- Updates to `GradiumSTTService`:
- Now flushes pending transcriptions when VAD detects the user stopped
speaking, improving response latency.
- `GradiumSTTService` now supports `InputParams` for configuring `language`
and `delay_in_frames` settings.
(PR [#3587](https://github.com/pipecat-ai/pipecat/pull/3587))
### Deprecated
- ⚠️ Deprecated `vad_analyzer` parameter on `BaseInputTransport`. Pass
`vad_analyzer` to `LLMUserAggregatorParams` instead or use `VADProcessor` in
the pipeline.
(PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))
### Removed
- Removed deprecated `AICFilter` parameters: `enhancement_level`, `voice_gain`,
`noise_gate_enable`.
(PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))
### Fixed
- Fixed an issue where if you were using `OpenRouterLLMService` with a Gemini
model, it wouldn't handle multiple `"system"` messages as expected (and as we
do in `GoogleLLMService`), which is to convert subsequent ones into `"user"`
messages. Instead, the latest `"system"` message would overwrite the previous
ones.
(PR [#3406](https://github.com/pipecat-ai/pipecat/pull/3406))
- Transports now properly broadcast `InputTransportMessageFrame` frames both
upstream and downstream instead of only pushing downstream.
(PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
- Fixed `FrameProcessor.broadcast_frame()` to deep copy kwargs, preventing
shared mutable references between the downstream and upstream frame
instances.
(PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
- Fixed OpenAI LLM services to emit `ErrorFrame` on completion timeout,
enabling proper error handling and LLMSwitcher failover.
(PR [#3529](https://github.com/pipecat-ai/pipecat/pull/3529))
- Fixed a logging issue where non-ASCII characters (e.g., Japanese or Chinese)
were unnecessarily escaped to Unicode sequences when a function call
occurred.
(PR [#3536](https://github.com/pipecat-ai/pipecat/pull/3536))
- Fixed how audio tracks are synchronized inside the `AudioBufferProcessor` to
fix timing issues where silence and audio were misaligned between user and
bot buffers.
(PR [#3541](https://github.com/pipecat-ai/pipecat/pull/3541))
- Fixed race condition in `OpenAIRealtimeBetaLLMService` that could cause an
error when truncating the conversation.
(PR [#3567](https://github.com/pipecat-ai/pipecat/pull/3567))
- Fixed an infinite loop in `WebsocketService` that blocked the event loop when
a remote server closed the connection gracefully.
(PR [#3574](https://github.com/pipecat-ai/pipecat/pull/3574))
- Fixed `LLMUserAggregator` and `LLMAssistantAggregator` not emitting pending
transcripts via `on_user_turn_stopped` and `on_assistant_turn_stopped` events
when the conversation ends (`EndFrame`) or is cancelled (`CancelFrame`).
(PR [#3575](https://github.com/pipecat-ai/pipecat/pull/3575))
- Added missing `LiveKitRunnerArguments` and `LiveKitTransport` support in
runner utilities to enable LiveKit transport configuration.
(PR [#3580](https://github.com/pipecat-ai/pipecat/pull/3580))
- Fixed race condition in `OpenAIRealtimeLLMService` that could cause an error
when truncating the conversation.
(PR [#3581](https://github.com/pipecat-ai/pipecat/pull/3581))
- Fixed `PiperHttpTTSService` (formerly `PiperTTSService`) to resample audio output
based on the model's sample rate parsed from the WAV header.
(PR [#3585](https://github.com/pipecat-ai/pipecat/pull/3585))
- Fixed `UserTurnController` to reset user turn timeout when interim
transcriptions are received.
(PR [#3594](https://github.com/pipecat-ai/pipecat/pull/3594))
- Fixed an issue in the `IVRNavigator` where the `TextFrame`s pushed had
incorrect spacing. Now, the internal `IVRProcessor` pushes
`AggregatedTextFrame`s when in conversation mode. This allows for controlling
spacing of the outputted, aggregated text.
(PR [#3604](https://github.com/pipecat-ai/pipecat/pull/3604))
- Fixed `GeminiLiveLLMService` transcription timeout handler not being
scheduled by yielding to the event loop after task creation.
(PR [#3605](https://github.com/pipecat-ai/pipecat/pull/3605))
## [0.0.100] - 2026-01-20
### Added

143
CLAUDE.md Normal file
@@ -0,0 +1,143 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Pipecat is an open-source Python framework for building real-time voice and multimodal conversational AI agents. It orchestrates audio/video, AI services, transports, and conversation pipelines using a frame-based architecture.
## Common Commands
```bash
# Setup development environment
uv sync --group dev --all-extras --no-extra gstreamer --no-extra krisp
# Install pre-commit hooks
uv run pre-commit install
# Run all tests
uv run pytest
# Run a single test file
uv run pytest tests/test_name.py
# Run a specific test
uv run pytest tests/test_name.py::test_function_name
# Preview changelog
towncrier build --draft --version Unreleased
# Lint and format check
uv run ruff check
uv run ruff format --check
# Update dependencies (after editing pyproject.toml)
uv lock && uv sync
```
## Architecture
### Frame-Based Pipeline Processing
All data flows as **Frame** objects through a pipeline of **FrameProcessors**:
```
Transport Input → Pipeline Source → [Processor1] → [Processor2] → ... → Pipeline Sink → Transport Output
```
**Key components:**
- **Frames** (`src/pipecat/frames/frames.py`): Data units (audio, text, video) and control signals. Flow DOWNSTREAM (input→output) or UPSTREAM (acknowledgments/errors).
- **FrameProcessor** (`src/pipecat/processors/frame_processor.py`): Base processing unit. Each processor receives frames, processes them, and pushes results downstream.
- **Pipeline** (`src/pipecat/pipeline/pipeline.py`): Chains processors together.
- **ParallelPipeline** (`src/pipecat/pipeline/parallel_pipeline.py`): Runs multiple pipelines in parallel.
- **Transports** (`src/pipecat/transports/`): External I/O layer (Daily WebRTC, LiveKit WebRTC, WebSocket, Local). Abstract interface via `BaseTransport`.
- **Services** (`src/pipecat/services/`): 60+ AI provider integrations (STT, TTS, LLM, etc.). Extend base classes: `AIService`, `LLMService`, `STTService`, `TTSService`, `VisionService`.
- **Serializers** (`src/pipecat/serializers/`): Convert frames to/from wire formats for WebSocket transports. `FrameSerializer` base class defines `serialize()` and `deserialize()`. Telephony serializers (Twilio, Plivo, Vonage, Telnyx, Exotel, Genesys) handle provider-specific protocols and audio encoding (e.g., μ-law).
- **RTVI** (`src/pipecat/processors/frameworks/rtvi.py`): Real-Time Voice Interface protocol bridging clients and the pipeline. `RTVIProcessor` handles incoming client messages (text input, audio, function call results). `RTVIObserver` converts pipeline frames to outgoing messages: user/bot speaking events, transcriptions, LLM/TTS lifecycle, function calls, metrics, and audio levels.
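
The flow described above can be modeled with a small, self-contained sketch. It does not use Pipecat's classes: `ToyFrame`, `ToyProcessor`, `Direction`, and `link` are hypothetical names chosen for illustration, and the real `FrameProcessor` is asynchronous and considerably richer.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Direction(Enum):
    DOWNSTREAM = auto()
    UPSTREAM = auto()


@dataclass
class ToyFrame:
    """Hypothetical stand-in for a frame: a unit of data or a control signal."""

    payload: str


class ToyProcessor:
    """Hypothetical stand-in for a processor linked to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None
        self.seen = []  # (payload, direction) pairs, kept for inspection

    def process(self, frame, direction):
        # Record the frame, then pass it on in the direction it was travelling.
        self.seen.append((frame.payload, direction))
        self.push(frame, direction)

    def push(self, frame, direction):
        target = self.next if direction is Direction.DOWNSTREAM else self.prev
        if target is not None:
            target.process(frame, direction)


def link(processors):
    """Chain processors so each one knows its previous and next neighbour."""
    for left, right in zip(processors, processors[1:]):
        left.next = right
        right.prev = left
    return processors


source, middle, sink = link(
    [ToyProcessor("source"), ToyProcessor("middle"), ToyProcessor("sink")]
)
source.process(ToyFrame("audio"), Direction.DOWNSTREAM)  # input to output
sink.process(ToyFrame("error"), Direction.UPSTREAM)  # travels back to the source
```

After these two calls every processor has seen both frames, with the `"audio"` frame travelling downstream and the `"error"` frame travelling upstream.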
### Important Patterns
- **Context Aggregation**: `LLMContext` accumulates messages for LLM calls; `UserResponse` aggregates user input
- **Turn Management**: Turn management is done through `LLMUserAggregator` and
`LLMAssistantAggregator`, created with `LLMContextAggregatorPair`
- **User turn strategies**: Detection of when the user starts and stops speaking is done via user turn start/stop strategies. They push `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` respectively.
- **Interruptions**: Interruptions are usually triggered by a user turn start strategy (e.g. `VADUserTurnStartStrategy`) but they can be triggered by other processors as well, in which case the user turn start strategies don't need to. An `InterruptionFrame` carries an optional `asyncio.Event` that is set when the frame reaches the pipeline sink. If a processor stops an `InterruptionFrame` from propagating downstream (i.e., doesn't push it), it **must** call `frame.complete()` to avoid stalling `push_interruption_task_frame_and_wait()` callers.
- **Uninterruptible Frames**: These are frames that will not be removed from internal queues even if there's an interruption. For example, `EndFrame` and `StopFrame`.
- **Events**: Most classes in Pipecat have `BaseObject` as the very base class. `BaseObject` has support for events. Events can run in the background in an async task (default) or synchronously (`sync=True`) if we want immediate action. Synchronous event handlers need to execute fast.
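
The interruption rule above (a processor that stops an `InterruptionFrame` must call `frame.complete()`) can be illustrated with plain `asyncio`. This is a toy model under stated assumptions, not Pipecat's implementation: `ToyInterruptionFrame`, `push_and_wait`, and both processors are hypothetical, and the timeout values are arbitrary.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyInterruptionFrame:
    """Hypothetical stand-in for a frame carrying an optional completion event."""

    event: Optional[asyncio.Event] = None

    def complete(self):
        # Signal any waiter that the interruption has been fully handled.
        if self.event is not None:
            self.event.set()


async def swallowing_processor(frame):
    """Does not forward the frame, so it completes it explicitly."""
    frame.complete()


async def forgetful_processor(frame):
    """Drops the frame without completing it; the waiter stalls until timeout."""


async def push_and_wait(processor, timeout=1.0):
    """Hand an interruption to a processor and wait for completion or timeout."""
    frame = ToyInterruptionFrame(event=asyncio.Event())
    await processor(frame)
    try:
        await asyncio.wait_for(frame.event.wait(), timeout)
        return True
    except asyncio.TimeoutError:
        return False


completed = asyncio.run(push_and_wait(swallowing_processor))
stalled = asyncio.run(push_and_wait(forgetful_processor, timeout=0.05))
```

Here `completed` is `True` because the processor signalled completion, while `stalled` is `False` because nothing ever set the event and the wait ran into its timeout.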
### Key Directories
| Directory | Purpose |
|---------------------------|----------------------------------------------------|
| `src/pipecat/frames/` | Frame definitions (100+ types) |
| `src/pipecat/processors/` | FrameProcessor base + aggregators, filters, audio |
| `src/pipecat/pipeline/` | Pipeline orchestration |
| `src/pipecat/services/` | AI service integrations (60+ providers) |
| `src/pipecat/transports/` | Transport layer (Daily, LiveKit, WebSocket, Local) |
| `src/pipecat/serializers/`| Frame serialization for WebSocket protocols |
| `src/pipecat/audio/` | VAD, filters, mixers, turn detection, DTMF |
| `src/pipecat/turns/` | User turn management |
## Code Style
- **Docstrings**: Google-style. Classes describe purpose; `__init__` has `Args:` section; dataclasses use `Parameters:` section.
- **Linting**: Ruff (line length 100). Pre-commit hooks enforce formatting.
- **Type hints**: Required for complex async code.
### Docstring Example
```python
class MyService(LLMService):
    """Description of what the service does.

    More detailed description.

    Event handlers available:

    - on_connected: Called when we are connected

    Example::

        @service.event_handler("on_connected")
        async def on_connected(service, frame):
            ...
    """

    def __init__(self, param1: str, **kwargs):
        """Initialize the service.

        Args:
            param1: Description of param1.
            **kwargs: Additional arguments passed to parent.
        """
        super().__init__(**kwargs)
```
## Service Implementation
When adding a new service:
1. Extend the appropriate base class (`STTService`, `TTSService`, `LLMService`, etc.)
2. Implement required abstract methods
3. Handle necessary frames
4. By default, all frames should be pushed in the direction they came
5. Push `ErrorFrame` on failures
6. Add metrics tracking via `MetricsData` if relevant
7. Follow the pattern of existing services in `src/pipecat/services/`
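
Steps 3 to 5 can be sketched without importing Pipecat. Everything below is hypothetical (`ToyTextFrame`, `ToyOtherFrame`, `ToyErrorFrame`, `ToyUppercaseService`); in particular, sending the error frame upstream is an assumption made for this sketch, so check an existing service for the actual convention.

```python
from dataclasses import dataclass


@dataclass
class ToyTextFrame:
    text: str


@dataclass
class ToyOtherFrame:
    name: str


@dataclass
class ToyErrorFrame:
    error: str


class ToyUppercaseService:
    """Hypothetical service: handles text frames, forwards everything else."""

    def __init__(self):
        self.pushed = []  # (frame, direction) pairs, recorded instead of sent

    def push_frame(self, frame, direction):
        self.pushed.append((frame, direction))

    def run_provider(self, text):
        # Stand-in for a call to an external AI provider.
        if not text:
            raise ValueError("empty input")
        return text.upper()

    def process_frame(self, frame, direction):
        if isinstance(frame, ToyTextFrame):
            try:
                result = self.run_provider(frame.text)
                self.push_frame(ToyTextFrame(result), direction)
            except ValueError as exc:
                # Step 5: report the failure as an error frame.
                # Assumption: errors travel upstream in this sketch.
                self.push_frame(ToyErrorFrame(str(exc)), "upstream")
        else:
            # Step 4: unhandled frames continue in the direction they came.
            self.push_frame(frame, direction)


service = ToyUppercaseService()
service.process_frame(ToyTextFrame("hi"), "downstream")
service.process_frame(ToyOtherFrame("meta"), "upstream")
service.process_frame(ToyTextFrame(""), "downstream")
```

The recorded `service.pushed` list then holds the transformed text frame, the untouched pass-through frame, and one error frame for the failed call.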
## Pull Requests
After creating a PR, use `/changelog <pr_number>` to generate the changelog file and `/pr-description <pr_number>` to update the PR description.

@@ -75,7 +75,7 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Gradium](https://docs.pipecat.ai/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [Hathora](https://docs.pipecat.ai/server/services/stt/hathora), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hathora](https://docs.pipecat.ai/server/services/tts/hathora), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hathora](https://docs.pipecat.ai/server/services/tts/hathora), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Resemble](https://docs.pipecat.ai/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/server/services/s2s/ultravox) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Exotel](https://docs.pipecat.ai/server/utilities/serializers/exotel), [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx), [Vonage](https://docs.pipecat.ai/server/utilities/serializers/vonage) |

1
changelog/3134.added.md Normal file
@@ -0,0 +1 @@
- Added `ResembleAITTSService` for text-to-speech using Resemble AI's streaming WebSocket API with word-level timestamps and jitter buffering for smooth audio playback.

1
changelog/3355.added.md Normal file
@@ -0,0 +1 @@
- Added `UserBotLatencyObserver` for tracking user-to-bot response latency. When tracing is enabled, latency measurements are automatically recorded as `turn.user_bot_latency_seconds` attributes on OpenTelemetry turn spans.

@@ -0,0 +1 @@
- Deprecated `UserBotLatencyLogObserver`. Use `UserBotLatencyObserver` directly with its `on_latency_measured` event handler instead.

@@ -1 +0,0 @@
- Fixed an issue where if you were using `OpenRouterLLMService` with a Gemini model, it wouldn't handle multiple `"system"` messages as expected (and as we do in `GoogleLLMService`), which is to convert subsequent ones into `"user"` messages. Instead, the latest `"system"` message would overwrite the previous ones.

@@ -1,4 +0,0 @@
- Additions for `AICFilter` and `AICVADAnalyzer`:
- Added model downloading support to `AICFilter` with `model_id` and `model_download_dir` parameters.
- Added `model_path` parameter to `AICFilter` for loading local `.aicmodel` files.
- Added unit tests for `AICFilter` and `AICVADAnalyzer`.

@@ -1 +0,0 @@
- Updated `AICFilter` and `AICVADAnalyzer` to use aic-sdk ~= 2.0.1.

@@ -1 +0,0 @@
- Removed deprecated `AICFilter` parameters: `enhancement_level`, `voice_gain`, `noise_gate_enable`.

@@ -1 +0,0 @@
- Added handling for `server_content.interrupted` signal in the Gemini Live service for faster interruption response in the case where there isn't already turn tracking in the pipeline, e.g. local VAD + context aggregators. When there is already turn tracking in the pipeline, the additional interruption does no harm.

@@ -1 +0,0 @@
- `SarvamSTTService` now defaults `vad_signals` and `high_vad_sensitivity` to `None` (omitted from connection parameters), improving latency by ~300ms compared to the previous defaults.

@@ -1 +0,0 @@
- Improved the STT TTFB (Time To First Byte) measurement, reporting the delay between when the user stops speaking and when the final transcription is received. Note: Unlike traditional TTFB which measures from a discrete request, STT services receive continuous audio input—so we measure from speech end to final transcript, which captures the latency that matters for voice AI applications. In support of this change, added `finalized` field to `TranscriptionFrame` to indicate when a transcript is the final result for an utterance.

@@ -1 +0,0 @@
- Added new `GenesysFrameSerializer` for the Genesys AudioHook WebSocket protocol, enabling bidirectional audio streaming between Pipecat pipelines and Genesys Cloud contact center.

@@ -1 +0,0 @@
- Added `add_reached_upstream_filter()` and `add_reached_downstream_filter()` methods to `PipelineTask` for appending frame types.

@@ -1 +0,0 @@
- Added `reached_upstream_types` and `reached_downstream_types` read-only properties to `PipelineTask` for inspecting current frame filters.

@@ -1 +0,0 @@
- Changed frame filter storage from tuples to sets in `PipelineTask`.

@@ -1 +0,0 @@
- Added `RTVIProcessor.create_rtvi_observer()` factory method for creating RTVI observers.

@@ -1 +0,0 @@
- Added `FrameProcessor.broadcast_frame_instance(frame)` method to broadcast a frame instance by extracting its fields and creating new instances for each direction.

@@ -1 +0,0 @@
- `PipelineTask` now automatically adds `RTVIProcessor` and registers `RTVIObserver` when `enable_rtvi=True` (default), simplifying pipeline setup.

@@ -1 +0,0 @@
- Fixed `FrameProcessor.broadcast_frame()` to deep copy kwargs, preventing shared mutable references between the downstream and upstream frame instances.

@@ -1 +0,0 @@
- Transports now properly broadcast `InputTransportMessageFrame` frames both upstream and downstream instead of only pushing downstream.

@@ -1 +0,0 @@
- Added `video_out_codec` parameter to `TransportParams` allowing configuration of the preferred video codec (e.g., `"VP8"`, `"H264"`, `"H265"`) for video output in `DailyTransport`.

@@ -1 +0,0 @@
- Added `location` parameter to Google TTS services (`GoogleHttpTTSService`, `GoogleTTSService`, `GeminiTTSService`) for regional endpoint support.

@@ -1 +0,0 @@
- Added new `PIPECAT_SMART_TURN_LOG_DATA` environment variable, which causes Smart Turn input data to be saved to disk

@@ -1 +0,0 @@
- Fixed OpenAI LLM services to emit `ErrorFrame` on completion timeout, enabling proper error handling and LLMSwitcher failover.

@@ -1,2 +0,0 @@
- Changed default Inworld TTS model from `inworld-tts-1` to
`inworld-tts-1.5-max`.

@@ -1 +0,0 @@
- Fixed a logging issue where non-ASCII characters (e.g., Japanese or Chinese) were unnecessarily escaped to Unicode sequences when a function call occurred.

@@ -1 +0,0 @@
- Fixed how audio tracks are synchronized inside the `AudioBufferProcessor` to fix timing issues where silence and audio were misaligned between user and bot buffers.

1
changelog/3542.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed pipeline freeze when `InterruptionFrame` discards `EndFrame` or `StopFrame` by making terminal frames uninterruptible.

@@ -1 +0,0 @@
- `FrameSerializer` now subclasses from `BaseObject` to enable event support.

@@ -1,2 +0,0 @@
- Added support for TTFS in `SpeechmaticsSTTService` and set the default mode to `EXTERNAL` to support Pipecat-controlled VAD.
- Changed dependency to `speechmatics-voice[smart]>=0.2.8`

@@ -1 +0,0 @@
- Fixed race condition in `OpenAIRealtimeBetaLLMService` that could cause an error when truncating the conversation.

@@ -1 +0,0 @@
- Added `function_call_timeout_secs` parameter to `LLMService` to configure timeout for deferred function calls (defaults to 10.0 seconds).

@@ -1 +0,0 @@
- Added `result_callback` parameter to `UserImageRequestFrame` to support deferred function call results.

@@ -1,4 +0,0 @@
- ⚠️ Changed function call handling to use timeout-based completion instead of immediate callback execution.
- Function calls that defer their results (e.g., `UserImageRequestFrame`) now use a timeout mechanism
- The `result_callback` is invoked automatically when the deferred operation completes or after timeout
- This change affects examples using `UserImageRequestFrame` - the `result_callback` should now be passed to the frame instead of being called immediately
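
The timeout-based completion described in this fragment can be sketched with `asyncio.wait_for`. This is a minimal illustration, not Pipecat's implementation: `run_deferred_call` and the two operations are hypothetical, and reporting a timeout as `None` is an assumption.

```python
import asyncio


async def run_deferred_call(start_operation, result_callback, timeout_secs):
    """Invoke result_callback when the deferred operation finishes or times out."""
    try:
        result = await asyncio.wait_for(start_operation(), timeout_secs)
    except asyncio.TimeoutError:
        result = None  # Assumption: a timeout is reported as an empty result.
    await result_callback(result)


async def demo():
    received = []

    async def result_callback(result):
        received.append(result)

    async def fast_operation():
        await asyncio.sleep(0.01)
        return "image description"

    async def slow_operation():
        await asyncio.sleep(1.0)
        return "too late"

    # The callback fires exactly once per call, whether or not the work finished.
    await run_deferred_call(fast_operation, result_callback, timeout_secs=0.5)
    await run_deferred_call(slow_operation, result_callback, timeout_secs=0.05)
    return received


received = asyncio.run(demo())
```

The caller never invokes `result_callback` itself; it hands the callback over and lets the wrapper decide when to call it, which mirrors the shift away from immediate callback execution.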

@@ -1 +0,0 @@
- Fixed an infinite loop in `WebsocketService` that blocked the event loop when a remote server closed the connection gracefully.

@@ -1 +0,0 @@
- Fixed `LLMUserAggregator` and `LLMAssistantAggregator` not emitting pending transcripts via `on_user_turn_stopped` and `on_assistant_turn_stopped` events when the conversation ends (`EndFrame`) or is cancelled (`CancelFrame`).

@@ -1 +0,0 @@
- Added missing `LiveKitRunnerArguments` and `LiveKitTransport` support in runner utilities to enable LiveKit transport configuration.

@@ -1 +0,0 @@
- Fixed race condition in `OpenAIRealtimeLLMService` that could cause an error when truncating the conversation.

@@ -1 +0,0 @@
- Pipecat runner now uses `DAILY_ROOM_URL` instead of `DAILY_SAMPLE_ROOM_URL`.

@@ -1 +0,0 @@
- Added local `PiperTTSService` for offline text-to-speech using Piper voice models. The existing HTTP-based service has been renamed to `PiperHttpTTSService`.

@@ -1 +0,0 @@
- Fixed `PiperHttpTTSService` (formerly `PiperTTSService`) to resample audio output based on the model's sample rate parsed from the WAV header.

1
changelog/3589.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed OpenAI LLM stream not being closed on cancellation/exception, which could leak sockets.

@@ -1 +0,0 @@
- `main()` in `pipecat.runner.run` now accepts an optional `argparse.ArgumentParser`, allowing bots to define custom CLI arguments accessible via `runner_args.cli_args`.

1
changelog/3593.added.md Normal file
@@ -0,0 +1 @@
- Added support for Inworld TTS Websocket Auto Mode for improved latency

@@ -0,0 +1 @@
- Updated timestamps to be cumulative within an agent turn, using the `flushCompleted` message to detect when the server resets its timestamps to 0.

@@ -1 +0,0 @@
- Fixed `UserTurnController` to reset user turn timeout when interim transcriptions are received.

1
changelog/3610.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed `PipelineTask` adding duplicate `RTVIProcessor` and `RTVIObserver` when they were already provided in the pipeline or observers list. They are now detected and skipped, with appropriate warnings and errors logged for mismatched configurations.

@@ -0,0 +1 @@
- Changed `KokoroTTSService` to use `kokoro-onnx` instead of `kokoro` as the underlying TTS engine.

1
changelog/3616.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed function call timeout task not being cancelled when the handler completes without calling `result_callback` or is cancelled externally, which caused `RuntimeWarning: coroutine was never awaited`.

5
changelog/3617.fixed.md Normal file
@@ -0,0 +1,5 @@
- Fixed sentence splitting for Japanese, Chinese, Korean, and other non-Latin
languages in TTS pipeline. NLTK's sentence tokenizer does not support CJK
languages, causing text to accumulate until flush instead of being split at
sentence boundaries. Added fallback detection for unambiguous non-Latin
sentence-ending punctuation (e.g., `。`).
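
The fallback idea can be sketched in a few lines of plain Python. This is not the code from the fix: `SENTENCE_ENDINGS` and `split_sentences` are hypothetical, and the set of marks is an assumption chosen for illustration.

```python
# Assumption: a small, illustrative set of sentence-ending marks; the set used
# by the actual fix may differ.
SENTENCE_ENDINGS = frozenset("。！？")


def split_sentences(text):
    """Split text after each sentence-ending mark, keeping the mark.

    Returns the complete sentences plus any trailing text that has not yet
    reached a sentence boundary.
    """
    sentences = []
    current = []
    for char in text:
        current.append(char)
        if char in SENTENCE_ENDINGS:
            sentences.append("".join(current))
            current = []
    remainder = "".join(current)
    return sentences, remainder


sentences, remainder = split_sentences("こんにちは。元気ですか？まだ")
```

Complete sentences can be handed to TTS immediately, while `remainder` keeps accumulating until the next boundary or a flush.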

1
changelog/3623.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed `PipelineTask` to also call `set_bot_ready()` when an external `RTVIProcessor` is provided.

1
changelog/3628.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed `VADController` not broadcasting `SpeechControlParamsFrame` on startup, which prevented STT services from receiving VAD params needed for TTFB measurement.

1
changelog/3629.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed `StopAsyncIteration` exceptions in `parse_telephony_websocket()` when WebSocket connections close before sending expected messages.

1
changelog/3630.added.md Normal file
@@ -0,0 +1 @@
- Added RTVI function call lifecycle events (`llm-function-call-started`, `llm-function-call-in-progress`, `llm-function-call-stopped`) with configurable security levels via `RTVIObserverParams.function_call_report_level`. Supports per-function control over what information is exposed (`DISABLED`, `NONE`, `NAME`, or `FULL`).

@@ -0,0 +1 @@
- Deprecated `RTVILLMFunctionCallMessage`, `RTVILLMFunctionCallMessageData`, and `RTVIProcessor.handle_function_call()`. Use the new `llm-function-call-in-progress` event sent automatically by `RTVIObserver` instead.

1
changelog/3635.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed WebSocket transport error when broadcasting `InputTransportMessageFrame` by correctly instantiating the frame with its message parameter.

1
changelog/3649.fixed.md Normal file
@@ -0,0 +1 @@
- Fixed orphan OpenTelemetry spans during flow initialization and transitions in tracing.

@@ -0,0 +1 @@
- Upgraded the `pipecat-ai-small-webrtc-prebuilt` package to v2.1.0.

1
changelog/3656.added.md Normal file
@@ -0,0 +1 @@
- Added `OpenAIRealtimeSTTService` for real-time streaming speech-to-text using OpenAI's Realtime API WebSocket transcription sessions. Supports local VAD and server-side VAD modes, noise reduction, and automatic reconnection.

10
changelog/3659.changed.md Normal file
@@ -0,0 +1,10 @@
- ⚠️ The `VADParams` `stop_secs` default is changing from `0.8` seconds
to `0.2` seconds. This change both simplifies the developer experience and
improves the performance of STT services. With a shorter `stop_secs` value,
STT services using a local VAD can finalize sooner, resulting in faster
transcription.
- `SpeechTimeoutUserTurnStopStrategy`: control how long to wait for
additional user speech using `user_speech_timeout` (default: 0.6 sec).
- `TurnAnalyzerUserTurnStopStrategy`: the turn analyzer automatically adjusts
the user wait time based on the audio input.

View File

@@ -0,0 +1 @@
- Moved interruption wait event from per-processor instance state to `InterruptionFrame` itself. Added `InterruptionFrame.complete()` to signal when the interruption has fully traversed the pipeline. Custom processors that block or consume an `InterruptionFrame` before it reaches the pipeline sink must call `frame.complete()` to avoid stalling `push_interruption_task_frame_and_wait()`. A warning is logged if completion does not happen within 2 seconds.
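The contract described in this entry can be sketched with plain `asyncio`. This is not the Pipecat implementation: `SketchInterruptionFrame`, its `wait()` helper, and `consuming_processor` are hypothetical stand-ins, and the sketch simply stops waiting when the timeout elapses, which may differ from what the framework does after logging its warning.

```python
import asyncio
import logging
from typing import Optional, Tuple

logger = logging.getLogger("interruption-sketch")


class SketchInterruptionFrame:
    """Hypothetical frame that carries its own completion event."""

    def __init__(self) -> None:
        self._done = asyncio.Event()

    def complete(self) -> None:
        # Signal that the interruption has finished traversing the pipeline.
        self._done.set()

    async def wait(self, warn_after: float) -> bool:
        # Return True on completion, False (after a warning) on timeout.
        try:
            await asyncio.wait_for(self._done.wait(), timeout=warn_after)
            return True
        except asyncio.TimeoutError:
            logger.warning("interruption not completed within %ss", warn_after)
            return False


async def consuming_processor(frame: SketchInterruptionFrame) -> None:
    # This processor swallows the frame instead of forwarding it, so it
    # must mark the frame complete itself or the waiter would stall.
    frame.complete()


async def demo() -> Tuple[bool, bool]:
    handled = SketchInterruptionFrame()
    task: Optional[asyncio.Task] = asyncio.create_task(consuming_processor(handled))
    completed = await handled.wait(warn_after=0.05)
    await task

    # Nobody calls complete() on this frame, so the wait times out.
    forgotten = SketchInterruptionFrame()
    timed_out = not await forgotten.wait(warn_after=0.05)
    return completed, timed_out


completed, timed_out = asyncio.run(demo())
print(completed, timed_out)
```

Keeping the event on the frame rather than on each processor means whichever component ends the frame's journey can release the waiter, which is why a processor that blocks or consumes the frame has to call `complete()` itself.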

changelog/3663.fixed.md Normal file

@@ -0,0 +1 @@
- Fixed `SambaNovaLLMService` and `GoogleLLMOpenAIBetaService` streams not being closed on cancellation/exception, which could leak sockets.
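The general pattern behind this kind of fix is to close the stream in a `finally` block so that cancellation and exceptions take the same cleanup path as normal completion. The sketch below illustrates that pattern only; `FakeStream` and `consume` are hypothetical names, and it makes no claim about how the two services are actually implemented.

```python
import asyncio


class FakeStream:
    """Hypothetical endless response stream that records whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    def __aiter__(self) -> "FakeStream":
        return self

    async def __anext__(self) -> str:
        await asyncio.sleep(0.01)  # simulate waiting on the network
        return "chunk"

    async def close(self) -> None:
        self.closed = True


async def consume(stream: FakeStream) -> None:
    try:
        async for _chunk in stream:
            pass  # process each chunk here
    finally:
        # Runs on normal exit, on exceptions, and on task cancellation.
        await stream.close()


async def demo() -> bool:
    stream = FakeStream()
    task = asyncio.create_task(consume(stream))
    await asyncio.sleep(0.03)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return stream.closed


closed_after_cancel = asyncio.run(demo())
print(closed_after_cancel)
```

Without the `finally` block, cancelling the task mid-iteration would leave the stream open, which is how a socket can leak.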

View File

@@ -0,0 +1 @@
- Updated the default model to `scribe_v2` for `ElevenLabsSTTService`.

View File

@@ -0,0 +1 @@
- Changed the `DeepgramSTTService` default for `smart_format` to `False`, since agents don't need smart formatting. Disabling it also provides a small performance improvement.

changelog/3667.fixed.md Normal file

@@ -0,0 +1 @@
- Fixed an issue in `InworldTTSService` where punctuation was pronounced aloud. The service now ensures proper spacing between sentences, which resolves the pronunciation issue.

changelog/3668.fixed.md Normal file

@@ -0,0 +1 @@
- Fixed `ParallelPipeline` allowing frames pushed by internal processors to escape during lifecycle frame (`StartFrame`/`EndFrame`/`CancelFrame`) synchronization. These frames are now buffered and flushed after all branches complete.

changelog/3678.added.md Normal file

@@ -0,0 +1 @@
- Added pyright basic type checking configuration for the core framework.

View File

@@ -156,6 +156,10 @@ PLIVO_AUTH_TOKEN=...
# Qwen
QWEN_API_KEY=...
# Resemble AI
RESEMBLE_API_KEY=
RESEMBLE_VOICE_UUID=
# Rime
RIME_API_KEY=...
RIME_VOICE_ID=...

View File

@@ -24,9 +24,8 @@ from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),
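The deferral that the updated comment describes can be shown without any transport library. In this sketch `SketchParams`, `create_params`, and the `created` log are hypothetical names used only to demonstrate that nothing is constructed until a transport type is chosen; the real examples build `DailyParams`, `FastAPIWebsocketParams`, and similar objects instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Records which parameter objects have actually been constructed.
created: List[str] = []


@dataclass
class SketchParams:
    """Hypothetical stand-in for a transport parameter object."""

    name: str
    audio_out_enabled: bool = True

    def __post_init__(self) -> None:
        created.append(self.name)


# Each value is a zero-argument factory, so defining the dict builds nothing.
transport_params: Dict[str, Callable[[], SketchParams]] = {
    "daily": lambda: SketchParams("daily"),
    "twilio": lambda: SketchParams("twilio"),
}


def create_params(kind: str) -> SketchParams:
    # Only the factory for the selected transport is called.
    return transport_params[kind]()


created_before = list(created)
params = create_params("twilio")
created_after = list(created)
print(created_before, created_after)
```

Only the selected entry is ever instantiated, so parameters for transports that are not in use never run their constructors.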

View File

@@ -23,9 +23,8 @@ from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),

View File

@@ -23,9 +23,8 @@ from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),

View File

@@ -23,9 +23,8 @@ from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),

View File

@@ -25,9 +25,8 @@ from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),

View File

@@ -23,9 +23,8 @@ from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
video_out_enabled=True,

View File

@@ -22,9 +22,8 @@ from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
video_out_enabled=True,

View File

@@ -19,7 +19,6 @@ from pipecat_ai_small_webrtc_prebuilt.frontend import SmallWebRTCPrebuiltUI
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -64,7 +63,6 @@ async def run_example(webrtc_connection: SmallWebRTCConnection):
params=TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
)
@@ -91,6 +89,7 @@ async def run_example(webrtc_connection: SmallWebRTCConnection):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -14,7 +14,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -49,7 +48,6 @@ async def main():
audio_in_enabled=True,
audio_out_enabled=True,
transcription_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
)
@@ -76,6 +74,7 @@ async def main():
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -14,7 +14,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
InterruptionFrame,
TranscriptionFrame,
@@ -54,7 +53,6 @@ async def main():
params=LiveKitParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
)
@@ -84,6 +82,7 @@ async def main():
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -65,9 +65,8 @@ class MonthPrepender(FrameProcessor):
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_out_enabled=True,

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import Frame, LLMRunFrame, MetricsFrame
from pipecat.metrics.metrics import (
LLMUsageMetricsData,
@@ -62,24 +61,20 @@ class MetricsLogger(FrameProcessor):
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -112,6 +107,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from PIL import Image
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
BotStartedSpeakingFrame,
BotStoppedSpeakingFrame,
@@ -77,9 +76,8 @@ class ImageSyncAggregator(FrameProcessor):
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
@@ -87,7 +85,6 @@ transport_params = {
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
@@ -95,7 +92,6 @@ transport_params = {
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -126,6 +122,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -35,24 +34,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -83,6 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -11,7 +11,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -34,24 +33,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -82,6 +77,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -33,9 +33,8 @@ from pipecat.turns.user_turn_strategies import ExternalUserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -37,24 +36,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -125,6 +120,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -17,7 +17,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMMessagesUpdateFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -50,24 +49,20 @@ def get_session_history(session_id: str) -> BaseChatMessageHistory:
return message_store[session_id]
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -109,6 +104,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -32,9 +32,8 @@ from pipecat.turns.user_turn_strategies import ExternalUserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -37,24 +36,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -89,6 +84,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -36,24 +35,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -92,6 +87,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -33,9 +33,8 @@ from pipecat.turns.user_turn_strategies import ExternalUserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -36,24 +35,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -81,6 +76,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -13,7 +13,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -37,24 +36,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -93,6 +88,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())
]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -36,24 +35,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -84,6 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -35,24 +34,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -84,6 +79,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -36,24 +35,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -86,6 +81,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -35,24 +34,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -90,6 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

View File

@@ -12,7 +12,6 @@ from loguru import logger
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
@@ -35,24 +34,20 @@ from pipecat.turns.user_turn_strategies import UserTurnStrategies
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
# We use lambdas to defer transport parameter creation until the transport
# type is selected at runtime.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
),
}
@@ -90,6 +85,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
user_turn_strategies=UserTurnStrategies(
stop=[TurnAnalyzerUserTurnStopStrategy(turn_analyzer=LocalSmartTurnAnalyzerV3())]
),
vad_analyzer=SileroVADAnalyzer(),
),
)

Some files were not shown because too many files have changed in this diff.