Compare commits

167 Commits

Author SHA1 Message Date
James Hush
f039ece2c0 feat: nova-3 example 2025-02-18 11:24:02 +08:00
Aleix Conchillo Flaqué
b45f7fee6f Merge pull request #1225 from pipecat-ai/aleix/prepare-0.0.57
update CHANGELOG for 0.0.57
2025-02-14 18:50:08 -08:00
Aleix Conchillo Flaqué
01c06c5cac update CHANGELOG for 0.0.57 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
329e89c1d9 TTSService: push BotStoppedSpeakingFrame 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
883410d8ac FrameProcessor: no need to create an input event every time 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
1f5b790dd0 TTSService: reset processing text during interruptions 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
a107b1cb4b examples(06a): use CartesiaTTSService 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
63950912f0 LLMAssistantContextAggregator: add missing variable initialization 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
2ce9402571 LLMAssistantResponseAggregator: initialize messages 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
f6912c0f9a utils: don't consider colon an end of sentence 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
633a4d4c58 FalImageGenService: load image async to not block the event loop 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
67da745bb3 tts: make frame pausing/resuming optional 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
5126d4de92 tts: handle incoming frames pausing/resuming from base TTSService class 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
426d7ac213 transports: some local audio and tk updates 2025-02-14 18:47:33 -08:00
Mark Backman
9115692c72 Merge pull request #1227 from pipecat-ai/mb/fix-25-error
fix: ensure proper Google message format conversion in transcription …
2025-02-14 21:01:05 -05:00
Mark Backman
c26fe3f277 fix: ensure proper Google message format conversion in transcription filter 2025-02-14 20:28:26 -05:00
Mark Backman
47b059d387 Merge pull request #1226 from pipecat-ai/mb/add-transcript-processor-tests
tests: add tests for TranscriptProcessor
2025-02-14 19:50:38 -05:00
Mark Backman
a49d81e519 tests: add tests for TranscriptProcessor 2025-02-14 17:10:40 -05:00
Aleix Conchillo Flaqué
b3a575c7c7 Merge pull request #1212 from Vaibhav159/vl_fix_incorrect_has_regular_messages_check
fixing google llm service error
2025-02-14 13:16:37 -08:00
Aleix Conchillo Flaqué
790d0c1256 Merge pull request #1224 from M1ngXU/patch-1
Update openai.py
2025-02-14 13:13:00 -08:00
Aleix Conchillo Flaqué
ee7e0dc3f7 Merge pull request #1223 from pipecat-ai/aleix/audio-context-tts-service
audio context tts service and cartesia fixes
2025-02-14 12:12:42 -08:00
Aleix Conchillo Flaqué
f53ee79ddb RimeTTSService: use AudioContextWordTTSService 2025-02-14 11:55:54 -08:00
Aleix Conchillo Flaqué
aeadb40c3f CartesiaTTSService: use AudioContextWordTTSService
By supporting multiple audio requests we fix an issue that was causing audio
overlapping.
2025-02-14 11:55:54 -08:00
Aleix Conchillo Flaqué
cacb07f4c2 introduce AudioContextWordTTSService 2025-02-14 11:55:54 -08:00
M1ngXU
0b91d821fb Update openai.py
d
2025-02-14 20:27:08 +01:00
Aleix Conchillo Flaqué
af66a43056 Merge pull request #1222 from pipecat-ai/aleix/websocket-service-handle-clean-disconnection
WebsocketService: handle clean server disconnection
2025-02-14 10:33:54 -08:00
Aleix Conchillo Flaqué
e006dcf172 WebsocketService: handle clean server disconnection
The websocket async iterator doesn't raise an exception when the server
disconnects cleanly. We should handle that and raise an exception so we can
reconnect.
2025-02-14 10:11:56 -08:00
Filipi da Silva Fuchter
8588f8b0d8 Merge pull request #1220 from pipecat-ai/instant_voice_demo_example
Instant voice example.
2025-02-14 14:24:13 -03:00
Filipi Fuchter
bff54547b0 Instant voice example. 2025-02-14 14:19:17 -03:00
Mark Backman
b2754bf208 Merge pull request #1219 from pipecat-ai/mb/markdown-text-filter-tests
Add MarkdownTextFilter tests
2025-02-13 21:10:52 -05:00
Mark Backman
9a4942b0d0 Merge pull request #1218 from pipecat-ai/mb/user-idle-tests
Add UserIdleProcessor tests
2025-02-13 18:53:22 -05:00
Mark Backman
ed6201910b Add MarkdownTextFilter tests 2025-02-13 18:51:46 -05:00
Mark Backman
ac5ebc587e Add tests for UserIdleProcessor 2025-02-13 18:47:29 -05:00
Aleix Conchillo Flaqué
dff4c54e57 Merge pull request #1209 from pipecat-ai/aleix/reimplement-llm-response-aggregators
reimplement LLM response aggregators
2025-02-13 15:30:40 -08:00
Aleix Conchillo Flaqué
c744409651 SegmentedSTTService: fix process_audio_frame() arguments 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
7578fbeaef update google requirements 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
5909dff423 LLMContextResponseAggregator: add VAD emulation support 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
a6502df72c services: forgot to pass context instead of user aggregator 2025-02-13 13:50:33 -08:00
Aleix Conchillo Flaqué
e0d24d7fc0 update CHANGELOG 2025-02-13 13:21:32 -08:00
Aleix Conchillo Flaqué
99779046a8 services: use push_context_frame() 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
67cdc0063a BaseTransportOutput: allow pushing frames upstream 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
b28f752afa tests: add anthropic and google aggregator tests 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
463078e375 initialize assistant aggregators with context and push upstream instead 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
84510fd521 LLMUserContextAggregator: add space between transcriptions 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
9f6a1c093a LLMUserContextAggregator: reset user speaking time after bot interruption 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
b602e78625 tests: add OpenAI context aggregator tests 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
7c815121ea LLMContextResponseAggregator: add missing reset() implementation 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
16a107948b services: missing kwargs in anthropic/openai user context aggregator 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
839aa7d935 llm_response: add some initial docstrings to LLM aggregators 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
4cbcfe2b0b LLMUserContextAggregator: interrupt the bot if VAD happened a while back 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
91a628d1ba UserResponseAggregator: implement on top of LLMUserResponseAggregator 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
50288eeaaa tests: add LLM response aggregators tests 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
e1f2bbceb3 reimplement LLM response aggregators 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
8bdd7ed0ed tests: implement langchain tests with run_test() 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
1b7dfe8126 tests: add a new SleepFrame
The new SleepFrame allows us to control when system frames are pushed to the
pipeline.
2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
d1ee851a65 tests: rename some variables to make things clearer 2025-02-13 13:20:37 -08:00
Filipi da Silva Fuchter
0358673b46 Merge pull request #1215 from pipecat-ai/instant_voice_demo
Instant voice demo improvements - part 02
2025-02-13 18:14:15 -03:00
Filipi Fuchter
16fe1b10e9 - Added support for the RTVIProcessor to handle buffered audio in base64 format, converting it into InputAudioRawFrame for transport.
- Added support for the `RTVIProcessor` to trigger `start_audio_in_streaming` only after the `client-ready` message.
2025-02-13 18:08:55 -03:00
Filipi Fuchter
f001819df8 - Added a new audio_in_stream_on_start field to TransportParams.
- Added a new method `start_audio_in_streaming` in the `BaseInputTransport`.
- Updated `DailyTransport` to respect the `audio_in_stream_on_start` field, ensuring it only starts receiving the audio input if it is enabled.
2025-02-13 18:08:36 -03:00
Filipi Fuchter
dceec60186 Updated FastAPIWebsocketOutputTransport to send TransportMessageFrame and TransportMessageUrgentFrame to the serializer. 2025-02-13 18:07:33 -03:00
Filipi Fuchter
b96979a4ed Update WebsocketServer to not wrap the message inside a text frame. 2025-02-13 18:07:04 -03:00
Mark Backman
745c40def4 Merge pull request #1214 from pipecat-ai/mb/stt-mute-tests
Improve STTMuteFilter, add tests
2025-02-13 09:50:43 -05:00
Mark Backman
42ab62716d Merge pull request #1198 from pipecat-ai/mb/more-whisper-params
Add prompt and temperature args to OpenAI and Groq hosted Whisper STT…
2025-02-13 09:16:38 -05:00
Mark Backman
16ba2010aa Refactor process_frame to be more consistent 2025-02-13 09:15:29 -05:00
Mark Backman
ec0ca46617 Fix temperature docstrings to reference optional 2025-02-13 09:04:20 -05:00
Mark Backman
6ff1f526ff Merge pull request #1216 from pipecat-ai/mb/google-cloud-speech
Add the google-cloud-speech package to the google dependency
2025-02-13 07:04:34 -05:00
Mark Backman
84143cc80c self._muted now returns from STT process_audio_frames 2025-02-13 07:00:44 -05:00
Mark Backman
229dccedc6 Add the google-cloud-speech package to the google dependency 2025-02-12 23:19:17 -05:00
Aleix Conchillo Flaqué
68aaa1f8f4 Merge pull request #1213 from pipecat-ai/aleix/base-transport-output-bot-vad-stop-secs
BaseOutputTransport: use specific VAD stop secs for the bot
2025-02-12 19:01:56 -08:00
Aleix Conchillo Flaqué
f110a45c85 BaseOutputTransport: use specific VAD stop secs for the bot 2025-02-12 19:01:39 -08:00
Mark Backman
1e8a86de63 Handle starting muted, add tests 2025-02-12 19:01:49 -05:00
Mark Backman
ee93e2a2b1 Reorder frame pushing for STTMuteFilter, update STTMuteFrame to SystemFrame 2025-02-12 15:51:18 -05:00
Mark Backman
2e87a019a8 Merge pull request #1208 from pipecat-ai/mb/stt-mute-first-bot-speech
Add new STTMuteStrategy: MUTE_UNTIL_FIRST_BOT_COMPLETE
2025-02-12 12:21:02 -05:00
Vaibhav159
687b3d9d4c fixing google llm service error 2025-02-12 22:22:04 +05:30
Mark Backman
397768d872 Add new STTMuteStrategy: MUTE_UNTIL_FIRST_BOT_COMPLETE 2025-02-12 10:59:28 -05:00
Mark Backman
24cdcd74e6 Merge pull request #1197 from pipecat-ai/mb/google-stt
Add GoogleSTTService
2025-02-12 10:16:18 -05:00
Mark Backman
5d6370690c Add _reconnect_if_needed to simplify reconnect logic 2025-02-12 10:11:18 -05:00
Mark Backman
9f728aa623 Add reconnect logic to handle Google's 5 min time limit 2025-02-12 10:11:18 -05:00
Mark Backman
32d8f6153f Update InputParams to languages: support str or List of Languages 2025-02-12 10:11:18 -05:00
Mark Backman
8c2071f248 Add ClientOptions for region selection 2025-02-12 10:11:18 -05:00
Mark Backman
a9c2197dc6 Add ability to update options 2025-02-12 10:11:18 -05:00
Mark Backman
ce0358804b Docstrings and cleanup 2025-02-12 10:11:18 -05:00
Mark Backman
66a6a6a295 Enable interim transcriptions, add VAD events option 2025-02-12 10:11:18 -05:00
Mark Backman
9f1732c390 Update CHANGELOG and README 2025-02-12 10:11:17 -05:00
Mark Backman
b44ddf2456 07n uses all Google services 2025-02-12 10:09:36 -05:00
Mark Backman
17420f4d0c Update language support 2025-02-12 10:09:36 -05:00
Mark Backman
6cb55ec2cb Add GoogleSTTService 2025-02-12 10:09:36 -05:00
Filipi da Silva Fuchter
e2b4554a54 Merge pull request #1129 from pipecat-ai/instant_voice_demo
Pipecat improvements for the instant voice demo
2025-02-12 11:53:40 -03:00
Mark Backman
fd68b82e48 Merge pull request #1163 from pipecat-ai/mb/rime-websocket
Add RimeTTSService
2025-02-12 09:51:56 -05:00
Filipi Fuchter
cc90f5ab9f Sending the RTVI messages to the websocket 2025-02-12 11:46:49 -03:00
Filipi Fuchter
08f40d9179 Adding support to DailyTransport receive raw-audio through appMessage 2025-02-12 11:46:37 -03:00
Aleix Conchillo Flaqué
80e1325621 include codecov.yml 2025-02-11 23:46:19 -08:00
Aleix Conchillo Flaqué
ed76a5bfa5 Merge pull request #1202 from pipecat-ai/aleix/fix-simli-audiolayout-error
simli: fix audio layout error
2025-02-11 22:24:22 -08:00
Mark Backman
69b0d9035f Mark end_time as unused 2025-02-11 17:44:52 -05:00
Mark Backman
dcc63dd648 Use the vendor default for temperature 2025-02-11 14:29:33 -05:00
Aleix Conchillo Flaqué
2d08f42870 Merge pull request #1204 from pipecat-ai/aleix/add-coverage-support
github: add coverage support
2025-02-11 11:09:25 -08:00
Mark Backman
0814c0bc82 Merge pull request #1203 from pipecat-ai/expose-update-remote-participants-on-daily-transport
Expose `update_remote_participants()` from `DailyTransport`
2025-02-11 13:57:08 -05:00
Paul Kompfner
28e233b195 Update CHANGELOG to reflect the addition of update_remote_participants() 2025-02-11 13:23:47 -05:00
Aleix Conchillo Flaqué
6e4d2d6ade examples: fix more dependabot warnings 2025-02-11 10:09:33 -08:00
Aleix Conchillo Flaqué
266135ec54 examples: fix dependabot warnings 2025-02-11 10:07:05 -08:00
Aleix Conchillo Flaqué
d81aa48262 test-requirements: update transformers to 4.48.0 2025-02-11 10:04:21 -08:00
Aleix Conchillo Flaqué
8c7752fbc2 github: add coverage support 2025-02-11 09:58:21 -08:00
Julien Le Bourg
77fb63372a fix: incorrectly changed the base type in my last pull request for L… (#1184)
* fix: incorrectly changed the base type in my last pull request for  LocalAudioTransport

* update examples to use the new LocalTransportParams

* add local device select example
2025-02-11 08:35:57 -08:00
Paul Kompfner
5a8279d3c2 Expose update_remote_participants() from DailyTransport 2025-02-11 11:28:03 -05:00
Aleix Conchillo Flaqué
4db620198a simli: fix audio layout error
Fixes #1201
2025-02-11 07:05:35 -08:00
Mark Backman
d35f4c6b99 Add prompt and temperature args to OpenAI and Groq hosted Whisper STT services 2025-02-10 21:06:37 -05:00
Aleix Conchillo Flaqué
0a990b2aaa Merge pull request #1196 from pipecat-ai/aleix/audio-buffer-processor-continuous-intermittent-stream
AudioBufferProcessor: handle continuous and intermittent user audio
2025-02-10 16:07:12 -08:00
Mark Backman
97586b132d Simplify _calculate_word_times 2025-02-10 18:45:49 -05:00
Mark Backman
8020db350e Update RimeHttpTTSService to use mistv2 model by default 2025-02-10 18:45:48 -05:00
Mark Backman
54f64b8dad Code review feedback 2025-02-10 18:45:08 -05:00
Mark Backman
8f8a3ae7f9 Add RimeTTSService 2025-02-10 18:45:06 -05:00
Mark Backman
344aff5681 Merge pull request #1191 from pipecat-ai/mb/azure-tts-error-handling
Improve AzureTTSService error handling
2025-02-10 18:01:39 -05:00
Mark Backman
0d2e90cff1 Merge pull request #1190 from pipecat-ai/mb/languages-hosted-whisper
Add language support to OpenAI and Groq hosted Whisper
2025-02-10 17:49:38 -05:00
Mark Backman
1a8dd6b713 Improve AzureTTSService error handling 2025-02-10 17:48:55 -05:00
Mark Backman
2dc585aee0 Merge pull request #1185 from pipecat-ai/mb/update-readme-hacking
Add missing pip install -e . step to the README, and clarify steps
2025-02-10 17:45:58 -05:00
Mark Backman
a64fa44811 Merge pull request #1186 from pipecat-ai/mb/whisper-multilingual
Add language support to WhisperSTTService
2025-02-10 17:26:10 -05:00
Aleix Conchillo Flaqué
baeb83484d Merge pull request #1194 from Vaibhav159/vl_fix_elevenlabs_disconnect_issue
fixing disconnect issue
2025-02-10 13:41:59 -08:00
Vaibhav159
b0c3f80963 resolve merge conf 2025-02-11 03:03:32 +05:30
Aleix Conchillo Flaqué
eb3c9b1e75 AudioBufferProcessor: handle continuous and intermittent user audio
Fixes #1172
2025-02-10 11:26:31 -08:00
Mark Backman
ad4cbdb1ec Merge pull request #1159 from Canonical-AI-Inc/gemini-rag
Gemini 2.0 Flash Lite RAG example
2025-02-10 13:42:11 -05:00
Aleix Conchillo Flaqué
32baee924b RTVI: fix premature bot-tts-text messages (#1193) 2025-02-10 10:37:54 -08:00
Adrian Cowham
9cc53509d1 PR feedback: renamed file, added docstring, changed file read logic 2025-02-10 09:39:01 -08:00
Vaibhav159
2c62d3bf32 break once ConnectionClosed error 2025-02-10 23:04:05 +05:30
Vaibhav159
b06b16adb7 fixing disconnect issue 2025-02-10 22:55:20 +05:30
Mark Backman
cd52d73027 Add language support to OpenAI and Groq hosted Whisper 2025-02-10 10:18:00 -05:00
Mark Backman
c9d8c572c7 Add language support to WhisperSTTService 2025-02-09 10:51:23 -05:00
Mark Backman
d9439fd398 Add missing pip install -e . step to the README, and clarify steps 2025-02-09 09:15:10 -05:00
Mark Backman
081abcedb3 Merge pull request #1176 from pipecat-ai/mb/stt-mute-deprecate-stt-service
Deprecate stt_service parameter in STTMuteFilter
2025-02-09 08:35:22 -05:00
Mark Backman
1455e24ad1 Add keyword args, collocated warnings import with the deprecation 2025-02-09 08:29:20 -05:00
Mark Backman
4613cf4790 Merge pull request #1181 from pipecat-ai/mb/daily-docstrings
Add docstrings to daily.py
2025-02-09 08:05:59 -05:00
Mark Backman
7aa2e1209d Merge pull request #1177 from pipecat-ai/mb/perplexity
Add PerplexityLLMService
2025-02-09 08:05:46 -05:00
Mark Backman
76daaab6ca Add PerplexityLLMService 2025-02-09 08:00:31 -05:00
Mark Backman
37cfe870cc Merge pull request #1183 from pipecat-ai/mb/add-groq-stt
Add GroqSTTService, BaseWhisperSTTService, and refactor OpenAISTTService
2025-02-09 07:56:35 -05:00
Mark Backman
160167758b Add docstrings to daily.py 2025-02-09 07:53:51 -05:00
Mark Backman
4b634713a5 Merge pull request #1182 from pipecat-ai/mb/28c-optional-db
Update 28c option to output to log line only by default
2025-02-09 07:52:21 -05:00
Mark Backman
72954d5f15 Remove to base_whisper.py 2025-02-09 07:51:30 -05:00
Mark Backman
f2b07271c1 Update GroqLLMService to use llama-3.3-70b-versatile as the default model 2025-02-09 07:51:30 -05:00
Mark Backman
32b9de5f51 Add GroqSTTService, BaseWhisperSTTService, and refactor OpenAISTTService 2025-02-09 07:51:28 -05:00
Mark Backman
71ce8f9bcf Merge pull request #1179 from pipecat-ai/mb/remove-command-dash-badge
Remove CommandDash badge from README
2025-02-09 07:47:32 -05:00
Mark Backman
7d05728e2f Update 28c option to output to log line only by default 2025-02-08 10:00:45 -05:00
Mark Backman
dee5448b57 Merge pull request #1123 from pipecat-ai/cb/sqlite
Add SQLite storage to the Gemini persistent storage example
2025-02-08 09:07:52 -05:00
Mark Backman
d67861925a Merge pull request #1128 from golbin/whisper-api
Add Whisper STT service using OpenAI API
2025-02-08 08:35:26 -05:00
Mark Backman
0180619d44 Merge pull request #1173 from TheCodingLand/local-pyaudio-device-ids
adds configurable device ids for local audio transport
2025-02-08 08:04:00 -05:00
Mark Backman
f07e498612 Remove CommandDash badge from README 2025-02-08 07:59:39 -05:00
TheCodingLand
57964cb929 fix LocalAudioTransport param type 2025-02-08 12:32:20 +01:00
TheCodingLand
6840c77684 apply ruff formatting 2025-02-08 12:03:23 +01:00
Mark Backman
a1b58115ce Deprecate stt_service parameter in STTMuteFilter 2025-02-07 19:24:03 -05:00
chadbailey59
23eb6e3d46 storybot fixes (#1175)
* storybot fixes

* readme cleanup
2025-02-07 13:58:02 -06:00
Mark Backman
74a2c38c6c Merge pull request #1174 from pipecat-ai/mb/bump-google-genai-version
Bump google-genai version to 1.0.0
2025-02-07 14:53:44 -05:00
Mark Backman
90b217fda8 Bump google-genai version to 1.0.0 2025-02-07 14:32:37 -05:00
Aleix Conchillo Flaqué
6855bc0ada Merge pull request #1166 from pipecat-ai/aleix/google-rtvi-observer
rtvi: separate specific google RTVI into a GoogleRTVIObserver
2025-02-08 03:19:02 +08:00
TheCodingLand
a359434307 remove Doc and Annotated imports 2025-02-07 19:42:34 +01:00
TheCodingLand
856c8959c3 enhance doc 2025-02-07 19:38:26 +01:00
TheCodingLand
8da7a42137 adds configurable input and output device ids for local audio 2025-02-07 19:23:18 +01:00
Aleix Conchillo Flaqué
510a0f5ef5 rtvi: deprecate RTVI.observer() 2025-02-07 09:19:43 -08:00
Aleix Conchillo Flaqué
03ac744bcf rtvi: deprecate frame processors 2025-02-07 09:17:29 -08:00
Aleix Conchillo Flaqué
b058461a7d GoogleRTVIObserver: add explicit constructor 2025-02-07 09:15:32 -08:00
Mark Backman
abd9f16b90 Export .rtvi, update new-chatbot example, rename and update foundational 32 2025-02-07 09:15:32 -08:00
Aleix Conchillo Flaqué
d07732f2e8 rtvi: separate specific google RTVI into a GoogleRTVIObserver 2025-02-07 09:15:32 -08:00
Aleix Conchillo Flaqué
4d25582e16 dev-requirements: update pyright and ruff 2025-02-06 21:51:57 -08:00
Adrian Cowham
d9f6b7b93c added an example using Gemini's large context window for RAG 2025-02-06 12:49:29 -08:00
Jin Kim
5989e1ed16 Merge branch 'main' into whisper-api 2025-02-06 13:14:36 +09:00
Jin Kim
ef1e4277d3 Add an example for Whisper using OpenAI API 2025-02-04 10:32:55 +09:00
Jin Kim
823b763b25 Change OpenAI example file name 2025-02-04 10:28:06 +09:00
Jin Kim
3cb189eb1f Add whisper STT service using OpenAI API 2025-02-04 10:27:28 +09:00
Chad Bailey
d236973c0f moved sqlite code back to a single example 2025-01-31 23:18:06 +00:00
Chad Bailey
bc98c2e36c added sqlite storage example 2025-01-29 19:12:15 +00:00
115 changed files with 11335 additions and 1533 deletions

54
.github/workflows/coverage.yaml vendored Normal file
@@ -0,0 +1,54 @@
name: coverage
on:
  workflow_dispatch:
  push:
    branches:
      - main
  pull_request:
    branches:
      - "**"
    paths-ignore:
      - "docs/**"
jobs:
  coverage:
    name: "Coverage"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
      - name: Set up Python
        id: setup_python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Cache virtual environment
        uses: actions/cache@v3
        with:
          # We are hashing dev-requirements.txt and test-requirements.txt which
          # contain all dependencies needed to run the tests.
          key: venv-${{ runner.os }}-${{ steps.setup_python.outputs.python-version}}-${{ hashFiles('dev-requirements.txt') }}-${{ hashFiles('test-requirements.txt') }}
          path: .venv
      - name: Install system packages
        id: install_system_packages
        run: |
          sudo apt-get install -y portaudio19-dev
      - name: Setup virtual environment
        run: |
          python -m venv .venv
      - name: Install basic Python dependencies
        run: |
          source .venv/bin/activate
          python -m pip install --upgrade pip
          pip install -r dev-requirements.txt -r test-requirements.txt
      - name: Run tests with coverage
        run: |
          source .venv/bin/activate
          coverage run
          coverage xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v5
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          slug: pipecat-ai/pipecat

@@ -5,10 +5,158 @@ All notable changes to **Pipecat** will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.0.57] - 2025-02-14
### Added
- Added new `AudioContextWordTTSService`. This is a TTS base class for TTS
services that handle multiple separate audio requests.
- Added new frames `EmulateUserStartedSpeakingFrame` and
`EmulateUserStoppedSpeakingFrame` which can be used to emulate VAD behavior
when VAD is not present or has not been triggered.
- Added a new `audio_in_stream_on_start` field to `TransportParams`.
- Added a new method `start_audio_in_streaming` in the `BaseInputTransport`.
- This method should be used to start receiving the input audio in case the
field `audio_in_stream_on_start` is set to `false`.
- Added support for the `RTVIProcessor` to handle buffered audio in `base64`
format, converting it into InputAudioRawFrame for transport.
- Added support for the `RTVIProcessor` to trigger `start_audio_in_streaming`
only after the `client-ready` message.
- Added new `MUTE_UNTIL_FIRST_BOT_COMPLETE` strategy to `STTMuteStrategy`. This
strategy starts muted and remains muted until the first bot speech completes,
ensuring the bot's first response cannot be interrupted. This complements the
existing `FIRST_SPEECH` strategy which only mutes during the first detected
bot speech.
- Added support for Google Cloud Speech-to-Text V2 through `GoogleSTTService`.
- Added `RimeTTSService`, a new `WordTTSService`. Updated the foundational
example `07q-interruptible-rime.py` to use `RimeTTSService`.
- Added support for Groq's Whisper API through the new `GroqSTTService` and
OpenAI's Whisper API through the new `OpenAISTTService`. Introduced a new
base class `BaseWhisperSTTService` to handle common Whisper API
functionality.
- Added `PerplexityLLMService` for Perplexity API integration, with an
OpenAI-compatible interface. Also, added foundational example
`14n-function-calling-perplexity.py`.
- Added `DailyTransport.update_remote_participants()`. This allows you to update
remote participant's settings, like their permissions or which of their
devices are enabled. Requires that the local participant have participant
admin permission.
### Changed
- We don't consider a colon `:` an end of sentence any more.
- Updated `DailyTransport` to respect the `audio_in_stream_on_start` field,
ensuring it only starts receiving the audio input if it is enabled.
- Updated `FastAPIWebsocketOutputTransport` to send `TransportMessageFrame` and
`TransportMessageUrgentFrame` to the serializer.
- Updated `WebsocketServerOutputTransport` to send `TransportMessageFrame` and
`TransportMessageUrgentFrame` to the serializer.
- Enhanced `STTMuteConfig` to validate strategy combinations, preventing
`MUTE_UNTIL_FIRST_BOT_COMPLETE` and `FIRST_SPEECH` from being used together
as they handle first bot speech differently.
- Updated foundational example `07n-interruptible-google.py` to use all Google
services.
- `RimeHttpTTSService` now uses the `mistv2` model by default.
- Improved error handling in `AzureTTSService` to properly detect and log
synthesis cancellation errors.
- Enhanced `WhisperSTTService` with full language support and improved model
documentation.
- Updated foundational example `14f-function-calling-groq.py` to use
`GroqSTTService` for transcription.
- Updated `GroqLLMService` to use `llama-3.3-70b-versatile` as the default
model.
- `RTVIObserver` doesn't handle `LLMSearchResponseFrame` frames anymore. For
now, to handle those frames you need to create a `GoogleRTVIObserver`
instead.
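The `STTMuteConfig` validation described above can be pictured with a minimal sketch. The `MuteStrategy` and `MuteConfig` names below are hypothetical stand-ins, not Pipecat's classes, and the `ALWAYS` member exists only for illustration; the sketch assumes the check amounts to rejecting a config that contains both conflicting strategies.

```python
from dataclasses import dataclass, field
from enum import Enum


class MuteStrategy(Enum):
    """Hypothetical stand-in for STTMuteStrategy (not the real enum)."""

    FIRST_SPEECH = "first_speech"
    MUTE_UNTIL_FIRST_BOT_COMPLETE = "mute_until_first_bot_complete"
    ALWAYS = "always"  # hypothetical extra strategy, for illustration only


# The two strategies that treat the first bot speech differently.
CONFLICTING = {MuteStrategy.FIRST_SPEECH, MuteStrategy.MUTE_UNTIL_FIRST_BOT_COMPLETE}


@dataclass
class MuteConfig:
    """Hypothetical stand-in for STTMuteConfig that validates on creation."""

    strategies: set = field(default_factory=set)

    def __post_init__(self):
        # Reject the config only when *both* conflicting strategies are present.
        if CONFLICTING <= self.strategies:
            raise ValueError(
                "MUTE_UNTIL_FIRST_BOT_COMPLETE and FIRST_SPEECH cannot be used together"
            )
```

Validating in `__post_init__` means an invalid combination fails when the config is built rather than later, while the pipeline is running.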
### Deprecated
- `STTMuteFilter` constructor's `stt_service` parameter is now deprecated and
will be removed in a future version. The filter now manages mute state
internally instead of querying the STT service.
- `RTVI.observer()` is now deprecated, instantiate an `RTVIObserver` directly
instead.
- All RTVI frame processors (e.g. `RTVISpeakingProcessor`,
`RTVIBotLLMProcessor`) are now deprecated, instantiate an `RTVIObserver`
instead.
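A deprecated constructor parameter such as `stt_service` is commonly kept working while emitting a warning. The sketch below shows that general pattern with Python's standard `warnings` module; `MuteFilterSketch` is a hypothetical class, and nothing here is taken from Pipecat's actual `STTMuteFilter` code.

```python
import warnings
from typing import Optional


class MuteFilterSketch:
    """Hypothetical filter that still accepts a deprecated keyword argument."""

    def __init__(self, *, stt_service: Optional[object] = None):
        if stt_service is not None:
            # stacklevel=2 attributes the warning to the caller's line,
            # which is where the deprecated argument was passed.
            warnings.warn(
                "The `stt_service` parameter is deprecated and will be "
                "removed in a future version.",
                DeprecationWarning,
                stacklevel=2,
            )
        # Mute state is tracked by the filter itself instead of being
        # queried from the STT service.
        self._muted = False
```

Callers that omit the argument see no warning, so existing code can migrate gradually.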
### Fixed
- Fixed a `FalImageGenService` issue that was causing the event loop to be
blocked while loading the downloaded image.
- Fixed a `CartesiaTTSService` service issue that would cause audio overlapping
in some cases.
- Fixed a websocket-based service issue (e.g. `CartesiaTTSService`) that was
preventing a reconnection after the server disconnected cleanly, which was
causing an infinite loop instead.
- Fixed a `BaseOutputTransport` issue that was causing upstream frames to not
be pushed upstream.
- Fixed multiple issues where user transcriptions were not being handled
properly. It was possible for short utterances to not trigger VAD which would
cause user transcriptions to be ignored. It was also possible for one or more
transcriptions to be generated after VAD in which case they would also be
ignored.
- Fixed an issue that was causing `BotStoppedSpeakingFrame` to be generated too
late. This could then cause issues unblocking `STTMuteFilter` later than
desired.
- Fixed an issue that was causing `AudioBufferProcessor` to not record
synchronized audio.
- Fixed an `RTVI` issue that was causing `bot-tts-text` messages to be sent
before being processed by the output transport.
- Fixed an ElevenLabs issue (#1192) where we were trying to reconnect/disconnect
the websocket connection even when the connection was already closed.
- Fixed an issue where `has_regular_messages` condition was always true in
`GoogleLLMContext` due to `Part` having `function_call` & `function_response`
with `None` values.
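The clean-disconnection fix listed above rests on the fact that an async iterator over a websocket simply ends when the server closes cleanly, so the receive loop has to raise on its own to trigger a reconnect. The sketch below imitates that with a plain async generator; `CleanDisconnect`, `fake_socket`, `receive_messages` and `receive_with_reconnect` are all hypothetical names, and the retry limit is an assumption added so the example terminates.

```python
import asyncio


class CleanDisconnect(Exception):
    """Hypothetical error signalling that the server closed cleanly."""


async def fake_socket(messages):
    """Stand-in for a websocket: yields messages, then ends without raising."""
    for message in messages:
        yield message


async def receive_messages(socket, handle):
    async for message in socket:
        handle(message)
    # Reaching this point means the iterator finished without an exception,
    # i.e. a clean close. Raise so the caller knows it must reconnect.
    raise CleanDisconnect("server closed the connection")


async def receive_with_reconnect(connect, handle, max_attempts=3):
    """Reconnect each time the receive loop ends, up to max_attempts."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        try:
            await receive_messages(connect(), handle)
        except CleanDisconnect:
            continue  # open a fresh connection on the next iteration
    return attempts
```

For example, `asyncio.run(receive_with_reconnect(lambda: fake_socket(["a", "b"]), print, max_attempts=2))` processes the two messages twice, once per connection. Without the explicit `raise`, the caller could not tell a finished stream from a dropped one.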
### Other
- Added new `instant-voice` example. This example showcases how to enable
instant voice communication as soon as a user connects.
- Added new `local-input-select-stt` example. This example allows you to play
with local audio inputs by selecting them through a nice text interface.
## [0.0.56] - 2025-02-06
### Changed
- Use `gemini-2.0-flash-001` as the default model for `GoogleLLMService`.
- Improved foundational examples 22b, 22c, and 22d to support function calling.
With these base examples, `FunctionCallInProgressFrame` and
`FunctionCallResultFrame` will no longer be blocked by the gates.
@@ -22,7 +170,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
logging setup.
- Fixed a `SentryMetrics` issue that was preventing any metrics to be sent to
Sentry and also was preventing from metrics frames to be pushed to the pipeline.
Sentry and also was preventing from metrics frames to be pushed to the
pipeline.
- Fixed an issue in `BaseOutputTransport` where incoming audio would not be
resampled to the desired output sample rate.
@@ -33,10 +182,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
and should be set manually from the serializer constructor if a different
value is needed.
### Changed
- Use `gemini-2.0-flash-001` as the default model for `GoogleLLMService`.
### Other
- Added a new `sentry-metrics` example.
@@ -119,7 +264,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `AudioBufferProcessor.reset_audio_buffers()` has been removed, use
`AudioBufferProcessor.start_recording()` and
``AudioBufferProcessor.stop_recording()` instead.
`AudioBufferProcessor.stop_recording()` instead.
### Fixed
@@ -185,7 +330,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added `enable_recording` and `geo` parameters to `DailyRoomProperties`.
- Added `RecordingsBucketConfig` to `DailyRoomProperties` to upload recordings to a custom AWS bucket.
- Added `RecordingsBucketConfig` to `DailyRoomProperties` to upload recordings
to a custom AWS bucket.
### Changed

@@ -2,7 +2,7 @@
 <img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
</div></h1>
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) <a href="https://app.commanddash.io/agent/github_pipecat-ai_pipecat"><img src="https://img.shields.io/badge/AI-Code%20Agent-EB9FDA"></a>
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![codecov](https://codecov.io/gh/pipecat-ai/pipecat/graph/badge.svg?token=LNVUIVO4Y9)](https://codecov.io/gh/pipecat-ai/pipecat) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat)
Pipecat is an open source Python framework for building voice and multimodal conversational agents. It handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions, letting you focus on creating engaging experiences.
@@ -55,17 +55,17 @@ pip install "pipecat-ai[option,...]"
### Available services
| Category | Services | Install Command Example |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) | `pip install "pipecat-ai[deepgram]"` |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Together AI](https://docs.pipecat.ai/server/services/llm/together) | `pip install "pipecat-ai[openai]"` |
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) | `pip install "pipecat-ai[cartesia]"` |
| Speech-to-Speech | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) | `pip install "pipecat-ai[openai]"` |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local | `pip install "pipecat-ai[daily]"` |
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) | `pip install "pipecat-ai[tavus,simli]"` |
| Vision & Image | [Moondream](https://docs.pipecat.ai/server/services/vision/moondream), [fal](https://docs.pipecat.ai/server/services/image-generation/fal) | `pip install "pipecat-ai[moondream]"` |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) | `pip install "pipecat-ai[silero]"` |
| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/server/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) | `pip install "pipecat-ai[canonical]"` |
| Category | Services | Install Command Example |
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) | `pip install "pipecat-ai[deepgram]"` |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Together AI](https://docs.pipecat.ai/server/services/llm/together) | `pip install "pipecat-ai[openai]"` |
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) | `pip install "pipecat-ai[cartesia]"` |
| Speech-to-Speech | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) | `pip install "pipecat-ai[google]"` |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local | `pip install "pipecat-ai[daily]"` |
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) | `pip install "pipecat-ai[tavus,simli]"` |
| Vision & Image | [Moondream](https://docs.pipecat.ai/server/services/vision/moondream), [fal](https://docs.pipecat.ai/server/services/image-generation/fal) | `pip install "pipecat-ai[moondream]"` |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) | `pip install "pipecat-ai[silero]"` |
| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/server/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) | `pip install "pipecat-ai[canonical]"` |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
@@ -149,36 +149,40 @@ Sign up [here](https://dashboard.daily.co/u/signup) and [create a room](https://
## Hacking on the framework itself
_Note that you may need to set up a virtual environment before following the instructions below. For instance, you might need to run the following from the root of the repo:_
_Note: You may need to set up a virtual environment before following these instructions. From the root of the repo:_
```shell
python3 -m venv venv
source venv/bin/activate
```
From the root of this repo, run the following:
Install the development dependencies:
```shell
pip install -r dev-requirements.txt
```
This will install the necessary development dependencies. Also, make sure you install the git pre-commit hooks:
Install the git pre-commit hooks (these help ensure your code follows project rules):
```shell
pre-commit install
```
The hooks will just save you time when you submit a PR by making sure your code follows the project rules.
To use the package locally (e.g. to run sample files), run:
Install the `pipecat-ai` package locally in editable mode:
```shell
pip install --editable ".[option,...]"
pip install -e .
```
The `--editable` option makes sure you don't have to run `pip install` again and you can just edit the project files locally.
The `-e` or `--editable` option allows you to modify the code without reinstalling.
If you want to use this package from another directory, you can run:
To include optional dependencies, add them to the install command. For example:
```shell
pip install -e ".[daily,deepgram,cartesia,openai,silero]" # Updated for the services you're using
```
If you want to use this package from another directory:
```shell
pip install "path_to_this_repo[option,...]"

codecov.yml (new file)

@@ -0,0 +1,11 @@
coverage:
range: 50..90 # coverage below 50 is red, above 90 is green, with a color gradient in between
status:
project:
default:
target: auto # auto % coverage target
threshold: 5% # allow for 5% reduction of coverage without failing
# do not run coverage on patch nor changes
patch: false
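
To make the `target: auto` and `threshold: 5%` settings concrete, here is a minimal Python sketch of the pass/fail rule they describe. It is an illustration under simplifying assumptions, not Codecov's implementation: `project_status_passes` and `badge_color` are hypothetical helper names, `auto` is taken to mean "compare against the base commit's coverage", and the threshold is treated as percentage points.

```python
def project_status_passes(
    base_coverage: float, head_coverage: float, threshold: float = 5.0
) -> bool:
    """Return True if head coverage is within `threshold` points of the base.

    Assumption: `target: auto` means the base commit's coverage is the target.
    """
    return head_coverage >= base_coverage - threshold


def badge_color(coverage: float, low: float = 50.0, high: float = 90.0) -> str:
    """Map a coverage percentage onto the `range: 50..90` color bands."""
    if coverage < low:
        return "red"
    if coverage > high:
        return "green"
    return "gradient"  # somewhere between red and green


# A drop from 80% to 76% stays inside the 5-point allowance; 80% to 74% does not.
print(project_status_passes(80.0, 76.0))
print(project_status_passes(80.0, 74.0))
```

With `patch: false`, only this project-level check would apply; no separate status is computed for the lines changed in a pull request.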


@@ -1,11 +1,12 @@
build~=1.2.2
coverage~=7.6.12
grpcio-tools~=1.67.1
pip-tools~=7.4.1
pre-commit~=4.0.1
pyright~=1.1.392
pyright~=1.1.393
pytest~=8.3.4
pytest-asyncio~=0.25.2
ruff~=0.9.1
ruff~=0.9.5
setuptools~=70.0.0
setuptools_scm~=8.1.0
python-dotenv~=1.0.1


@@ -12,7 +12,7 @@
"@daily-co/daily-js": "0.74.0"
},
"devDependencies": {
"vite": "^6.0.2"
"vite": "^6.0.9"
}
},
"node_modules/@babel/runtime": {
@@ -1007,15 +1007,14 @@
}
},
"node_modules/vite": {
"version": "6.0.7",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.0.7.tgz",
"integrity": "sha512-RDt8r/7qx9940f8FcOIAH9PTViRrghKaK2K1jY3RaAURrEUbm9Du1mJ72G+jlhtG3WwodnfzY8ORQZbBavZEAQ==",
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.1.0.tgz",
"integrity": "sha512-RjjMipCKVoR4hVfPY6GQTgveinjNuyLw+qruksLDvA5ktI1150VmcMBKmQaEWJhg/j6Uaf6dNCNA0AfdzUb/hQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"esbuild": "^0.24.2",
"postcss": "^8.4.49",
"rollup": "^4.23.0"
"postcss": "^8.5.1",
"rollup": "^4.30.1"
},
"bin": {
"vite": "bin/vite.js"


@@ -12,7 +12,7 @@
"license": "ISC",
"description": "",
"devDependencies": {
"vite": "^6.0.2"
"vite": "^6.0.9"
},
"dependencies": {
"@daily-co/daily-js": "0.74.0"


@@ -16,8 +16,7 @@ from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.audio import LocalAudioTransport
from pipecat.transports.local.audio import LocalAudioTransport, LocalAudioTransportParams
load_dotenv(override=True)
@@ -26,7 +25,7 @@ logger.add(sys.stderr, level="DEBUG")
async def main():
transport = LocalAudioTransport(TransportParams(audio_out_enabled=True))
transport = LocalAudioTransport(LocalAudioTransportParams(audio_out_enabled=True))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
@@ -41,7 +40,7 @@ async def main():
await asyncio.sleep(1)
await task.queue_frames([TTSSpeakFrame("Hello there, how is it going!"), EndFrame()])
runner = PipelineRunner()
runner = PipelineRunner(handle_sigint=False if sys.platform == "win32" else True)
await asyncio.gather(runner.run(task), say_something())
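
The `handle_sigint` change in this hunk turns off the runner's SIGINT handling on Windows. A likely reason (an assumption; the commit does not state it) is that asyncio's `loop.add_signal_handler` is not implemented on Windows event loops. The conditional expression can also be written as a plain comparison. The sketch below uses a hypothetical helper, `should_handle_sigint`, to show the equivalent form:

```python
import sys


def should_handle_sigint(platform: str = sys.platform) -> bool:
    """Return False on Windows, True elsewhere.

    Equivalent to `False if platform == "win32" else True`.
    """
    return platform != "win32"


# Both spellings agree for every platform string.
for name in ("win32", "linux", "darwin"):
    assert should_handle_sigint(name) == (False if name == "win32" else True)
```

The result could then be passed as `PipelineRunner(handle_sigint=should_handle_sigint())`, matching the keyword argument used in the diff.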


@@ -27,7 +27,7 @@ from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.cartesia import CartesiaHttpTTSService
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
@@ -91,7 +91,7 @@ async def main():
),
)
tts = CartesiaHttpTTSService(
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22", # British Lady
)


@@ -9,6 +9,7 @@ import os
import sys
import aiohttp
from deepgram import LiveOptions
from dotenv import load_dotenv
from loguru import logger
from runner import configure
@@ -44,7 +45,23 @@ async def main():
),
)
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
# stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
stt = DeepgramSTTService(
api_key=os.getenv("DEEPGRAM_API_KEY"),
# url=deepgram_url,
live_options=LiveOptions(
encoding="linear16",
language="en-US",
model="nova-3",
channels=1,
interim_results=True,
# smart_format=smart_format,
# endpointing=endpointing,
vad_events=True,
diarize=True,
filler_words=True,
),
)
tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-helios-en")


@@ -18,7 +18,7 @@ from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.openai import OpenAILLMService, OpenAITTSService
from pipecat.services.openai import OpenAILLMService, OpenAISTTService, OpenAITTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
@@ -37,12 +37,22 @@ async def main():
"Respond bot",
DailyParams(
audio_out_enabled=True,
transcription_enabled=True,
audio_out_sample_rate=24000,
transcription_enabled=False,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
vad_audio_passthrough=True,
),
)
# You can also use an OpenAI-compatible API such as Groq.
# stt = OpenAISTTService(
# base_url="https://api.groq.com/openai/v1",
# api_key="gsk_***",
# model="whisper-large-v3",
# )
stt = OpenAISTTService(api_key=os.getenv("OPENAI_API_KEY"), model="whisper-1")
tts = OpenAITTSService(api_key=os.getenv("OPENAI_API_KEY"), voice="alloy")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
@@ -60,6 +70,7 @@ async def main():
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS


@@ -18,9 +18,7 @@ from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.deepgram import DeepgramSTTService
from pipecat.services.google import GoogleTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.services.google import GoogleLLMService, GoogleSTTService, GoogleTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.services.daily import DailyParams, DailyTransport
@@ -46,14 +44,16 @@ async def main():
),
)
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
stt = GoogleSTTService(
params=GoogleSTTService.InputParams(languages=Language.EN_US),
)
tts = GoogleTTSService(
voice_id="en-US-Journey-F",
params=GoogleTTSService.InputParams(language=Language.EN_US),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
llm = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
messages = [
{


@@ -19,7 +19,7 @@ from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.openai import OpenAILLMService
from pipecat.services.rime import RimeHttpTTSService
from pipecat.services.rime import RimeTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
@@ -44,10 +44,9 @@ async def main():
),
)
tts = RimeHttpTTSService(
tts = RimeTTSService(
api_key=os.getenv("RIME_API_KEY", ""),
voice_id="rex",
params=RimeHttpTTSService.InputParams(reduce_latency=True),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")


@@ -16,8 +16,7 @@ from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.whisper import WhisperSTTService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.audio import LocalAudioTransport
from pipecat.transports.local.audio import LocalAudioTransport, LocalAudioTransportParams
load_dotenv(override=True)
@@ -34,7 +33,7 @@ class TranscriptionLogger(FrameProcessor):
async def main():
transport = LocalAudioTransport(TransportParams(audio_in_enabled=True))
transport = LocalAudioTransport(LocalAudioTransportParams(audio_in_enabled=True))
stt = WhisperSTTService()
@@ -44,7 +43,7 @@ async def main():
task = PipelineTask(pipeline)
runner = PipelineRunner()
runner = PipelineRunner(handle_sigint=False if sys.platform == "win32" else True)
await runner.run(task)


@@ -20,7 +20,7 @@ from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.groq import GroqLLMService
from pipecat.services.groq import GroqLLMService, GroqSTTService
from pipecat.services.openai import OpenAILLMContext
from pipecat.transports.services.daily import DailyParams, DailyTransport
@@ -50,20 +50,20 @@ async def main():
"Respond bot",
DailyParams(
audio_out_enabled=True,
transcription_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
vad_audio_passthrough=True,
),
)
stt = GroqSTTService(api_key=os.getenv("GROQ_API_KEY"), model="distil-whisper-large-v3-en")
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22", # British Lady
)
llm = GroqLLMService(
api_key=os.getenv("GROQ_API_KEY"), model="llama3-groq-70b-8192-tool-use-preview"
)
llm = GroqLLMService(api_key=os.getenv("GROQ_API_KEY"), model="llama-3.3-70b-versatile")
# Register a function_name of None to get all functions
# sent to the same callback with an additional function_name parameter.
llm.register_function(None, fetch_weather_from_api, start_callback=start_fetch_weather)
@@ -105,6 +105,7 @@ async def main():
pipeline = Pipeline(
[
transport.input(),
stt,
context_aggregator.user(),
llm,
tts,


@@ -0,0 +1,106 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
"""This example demonstrates using the Perplexity API as a drop-in replacement for OpenAI.
Note that while this file is in the function-calling examples, Perplexity's API does not
currently support function calling. The example shows basic chat completion functionality
using Perplexity's API while maintaining compatibility with the OpenAI interface.
"""
import asyncio
import os
import sys
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from openai.types.chat import ChatCompletionToolParam
from runner import configure
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.openai import OpenAILLMContext, OpenAILLMService
from pipecat.services.perplexity import PerplexityLLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, token) = await configure(session)
transport = DailyTransport(
room_url,
token,
"Respond bot",
DailyParams(
audio_out_enabled=True,
transcription_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22", # British Lady
)
llm = PerplexityLLMService(api_key=os.getenv("PERPLEXITY_API_KEY"), model="sonar")
messages = [
{
"role": "user",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[
transport.input(),
context_aggregator.user(),
llm,
tts,
transport.output(),
context_aggregator.assistant(),
]
)
task = PipelineTask(
pipeline,
PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
report_only_initial_ttfb=True,
),
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_transcription(participant["id"])
# Kick off the conversation.
await task.queue_frames([context_aggregator.user().get_context_frame()])
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())


@@ -497,7 +497,7 @@ class UserAggregatorBuffer(LLMResponseAggregator):
if isinstance(frame, UserStartedSpeakingFrame):
self._transcription = ""
async def _push_aggregation(self):
async def push_aggregation(self):
if self._aggregation:
self._transcription = self._aggregation
self._aggregation = ""


@@ -61,9 +61,11 @@ async def main():
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
# Configure the mute processor with both strategies
stt_mute_processor = STTMuteFilter(
stt_service=stt,
config=STTMuteConfig(
strategies={STTMuteStrategy.FIRST_SPEECH, STTMuteStrategy.FUNCTION_CALL}
strategies={
STTMuteStrategy.MUTE_UNTIL_FIRST_BOT_COMPLETE,
STTMuteStrategy.FUNCTION_CALL,
}
),
)


@@ -143,7 +143,14 @@ class InputTranscriptionContextFilter(FrameProcessor):
return
try:
message = frame.context.messages[-1]
# Make sure we're working with a GoogleLLMContext
context = GoogleLLMContext.upgrade_to_google(frame.context)
message = context.messages[-1]
if not isinstance(message, glm.Content):
logger.error(f"Expected glm.Content, got {type(message)}")
return
last_part = message.parts[-1]
if not (
message.role == "user"


@@ -6,6 +6,7 @@
import asyncio
import os
import sqlite3
import sys
from typing import List, Optional
@@ -44,22 +45,33 @@ class TranscriptHandler:
output_file: Optional path to file where transcript is saved. If None, outputs to log only.
"""
def __init__(self, output_file: Optional[str] = None):
"""Initialize handler with optional file output.
def __init__(self, output_file: Optional[str] = None, output_db: Optional[str] = None):
"""Initialize handler with optional file or database output.
Args:
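output_file: Path to output file. If None, outputs to log only.
output_db: Path to a SQLite database file. If None, no database is used.
"""
self.messages: List[TranscriptionMessage] = []
self.output_file: Optional[str] = output_file
self.output_db: Optional[str] = output_db
if self.output_db:
# Connect to the path the caller passed in rather than a hardcoded file name
self.con = sqlite3.connect(self.output_db)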
self.db = self.con.cursor()
table = self.db.execute("SELECT name FROM sqlite_master WHERE name='messages'")
if not (table.fetchone()):
self.db.execute(
"CREATE TABLE messages(role TEXT, content TEXT, timestamp DATETIME DEFAULT CURRENT_TIMESTAMP )"
)
logger.debug(
f"TranscriptHandler initialized {'with output_file=' + output_file if output_file else 'with log output only'}"
f"TranscriptHandler initialized; output file: {output_file}, output DB: {output_db}"
)
async def save_message(self, message: TranscriptionMessage):
"""Save a single transcript message.
Outputs the message to the log and optionally to a file.
Outputs the message to the log and optionally to a SQLite database or file.
Args:
message: The message to save
@@ -78,6 +90,14 @@ class TranscriptHandler:
except Exception as e:
logger.error(f"Error saving transcript message to file: {e}")
# and/or to a SQLite database
if self.output_db:
self.db.execute(
"INSERT INTO messages VALUES (?, ?, ?)",
(message.role, message.content, message.timestamp),
)
self.con.commit()
async def on_transcript_update(
self, processor: TranscriptProcessor, frame: TranscriptionUpdateFrame
):
@@ -136,8 +156,11 @@ async def main():
# Create transcript processor and handler
transcript = TranscriptProcessor()
# Select a TranscriptHandler output method
# Uncomment only one of the following lines:
transcript_handler = TranscriptHandler() # Output to log only
# transcript_handler = TranscriptHandler(output_file="transcript.txt") # Output to file and log
# transcript_handler = TranscriptHandler(output_db="example.db") # Output to SQLite DB and log
pipeline = Pipeline(
[
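
The SQLite path added to `TranscriptHandler` in this diff can be tried in isolation. The following is a minimal standalone sketch, not the example's code: `open_transcript_db`, `save_message`, and `load_messages` are hypothetical helper names. It reuses the `messages(role, content, timestamp)` schema from the diff and defaults to an in-memory database so it leaves no file behind.

```python
import sqlite3
from typing import List, Optional, Tuple


def open_transcript_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a transcript database with the schema used in the diff."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS messages("
        "role TEXT, content TEXT, timestamp DATETIME DEFAULT CURRENT_TIMESTAMP)"
    )
    return con


def save_message(
    con: sqlite3.Connection, role: str, content: str, timestamp: Optional[str] = None
) -> None:
    """Insert one message; omit the timestamp column so the default applies."""
    if timestamp is None:
        con.execute("INSERT INTO messages (role, content) VALUES (?, ?)", (role, content))
    else:
        con.execute(
            "INSERT INTO messages (role, content, timestamp) VALUES (?, ?, ?)",
            (role, content, timestamp),
        )
    con.commit()


def load_messages(con: sqlite3.Connection) -> List[Tuple[str, str, str]]:
    """Return all stored messages in insertion order."""
    return con.execute(
        "SELECT role, content, timestamp FROM messages ORDER BY rowid"
    ).fetchall()


con = open_transcript_db()
save_message(con, "user", "Hello", "2025-02-14T18:47:33")
save_message(con, "assistant", "Hi there")  # timestamp filled in by SQLite
for row in load_messages(con):
    print(row)
```

One design point: the column default `CURRENT_TIMESTAMP` only applies when the column is left out of the `INSERT`. Binding `None` to a three-placeholder `INSERT INTO messages VALUES (?, ?, ?)` stores `NULL` instead, which is why the sketch branches on whether a timestamp was supplied.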


@@ -89,6 +89,7 @@ async def main():
api_key=os.getenv("GOOGLE_API_KEY"),
system_instruction=system_instruction,
tools=tools,
model="gemini-1.5-flash-002",
)
context = OpenAILLMContext(


@@ -0,0 +1,254 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
"""CrossFit Games 2025 Rulebook RAG Demo.
This example demonstrates a Model-Assisted Generation (MAG) chatbot using Google's Gemini model.
This example uses 2 Gemini models:
- Gemini 2.0 Flash: This is the voice model that is used to generate the response.
- Gemini 2.0 Flash Lite: This is the model that is used to answer questions about the CrossFit Games 2025 rulebook - information that isn't yet publicly
indexed by Gemini (or any other LLM).
How it works:
- The voice model (Gemini 2.0 Flash) is configured to call a function whenever the user asks a question.
- The function call is a tool call to the MAG model (Gemini 2.0 Flash Lite).
- The MAG model generates a response based on the question. The MAG model has the entire contents of the CrossFit Games 2025 rulebook in its context window.
- The response is returned to the voice model (Gemini 2.0 Flash), which then generates the response to the user.
Why this works:
- Gemini 2.0 Flash is fast
- Gemini 2.0 Flash Lite is faster
- Gemini 2.0 Flash Lite has a large (1 million tokens) context window
- IMPORTANT: The generated response from Gemini 2.0 Flash Lite is limited to 50 words or less and 64 tokens.
You can see this in the RAG_PROMPT variable and the generation_config in the query_knowledge_base function.
Long generations are slower and more expensive; in the world of Voice AI, we don't need long generations.
Example questions to ask and compare to other RAG solutions:
- What lenses are not allowed?
- How many people can be on a team?
- What do winning gyms get?
- What happens if I skip a workout?
- Can I switch my team members for the Games?
- What happens if I start too early?
Notes:
- The RAG model is Gemini 2.0 Flash Lite.
- The voice model is Gemini 2.0 Flash.
- The RAG content is stored in the assets/rag-content.txt file.
- The model for voice is Gemini 2.0 Flash, but can be easily switched to any other model.
Customization options:
- update assets/rag-content.txt with your own knowledge base
- increase/decrease the RAG_MODEL's generation length
- use a different voice model
- play with the RAG_PROMPT
- change the function calling logic
"""
import asyncio
import json
import os
import sys
import time
import aiohttp
import google.generativeai as genai
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.google import GoogleLLMService
from pipecat.services.openai import OpenAILLMContext
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="INFO")
video_participant_id = None
def get_rag_content():
"""Get the RAG content from the file."""
script_dir = os.path.dirname(os.path.abspath(__file__))
rag_content_path = os.path.join(script_dir, "assets", "rag-content.txt")
with open(rag_content_path, "r") as f:
return f.read()
RAG_MODEL = "gemini-2.0-flash-lite-preview-02-05"
VOICE_MODEL = "gemini-2.0-flash"
RAG_CONTENT = get_rag_content()
RAG_PROMPT = f"""
You are a helpful assistant designed to answer user questions based solely on the provided knowledge base.
**Instructions:**
1. **Knowledge Base Only:** Answer questions *exclusively* using the information in the "Knowledge Base" section below. Do not use any outside information.
2. **Conversation History:** Use the "Conversation History" (ordered oldest to newest) to understand the context of the current question.
3. **Concise Response:** Respond in 50 words or fewer. The response will be spoken, so avoid symbols, abbreviations, or complex formatting. Use plain, natural language.
4. **Unknown Answer:** If the answer is not found within the "Knowledge Base," respond with "I don't know." Do not guess or make up an answer.
5. Do not introduce your response. Just provide the answer.
6. You must follow all instructions.
**Input Format:**
Each request will include:
* **Conversation History:** (A list of previous user and assistant messages, if any)
**Knowledge Base:**
Here is the knowledge base you have access to:
{RAG_CONTENT}
"""
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
async def query_knowledge_base(
    function_name, tool_call_id, arguments, llm, context, result_callback
):
    """Query the knowledge base for the answer to the question."""
    logger.info(f"Querying knowledge base for question: {arguments['question']}")
    client = genai.GenerativeModel(
        model_name=RAG_MODEL,
        system_instruction=RAG_PROMPT,
        generation_config=genai.types.GenerationConfig(
            temperature=0.1,
            max_output_tokens=64,
        ),
    )
    # for our case, the first two messages are the instructions and the user message
    # so we remove them.
    conversation_turns = context.messages[2:]
    # convert to standard messages
    messages = []
    for turn in conversation_turns:
        messages.extend(context.to_standard_messages(turn))
    def _is_tool_call(turn):
        if turn.get("role", None) == "tool":
            return True
        if turn.get("tool_calls", None):
            return True
        return False
    # filter out tool calls
    messages = [turn for turn in messages if not _is_tool_call(turn)]
    # use the last 3 turns as the conversation history/context
    messages = messages[-3:]
    messages_json = json.dumps(messages, ensure_ascii=False, indent=2)
    logger.info(f"Conversation turns: {messages_json}")
    start = time.perf_counter()
    response = client.generate_content(
        contents=[messages_json],
    )
    end = time.perf_counter()
    logger.info(f"Time taken: {end - start:.2f} seconds")
    logger.info(response.text)
    await result_callback(response.text)
async def main():
    async with aiohttp.ClientSession() as session:
        (room_url, token) = await configure(session)
        transport = DailyTransport(
            room_url,
            token,
            "Gemini RAG Bot",
            DailyParams(
                audio_out_enabled=True,
                transcription_enabled=True,
                vad_enabled=True,
                vad_analyzer=SileroVADAnalyzer(),
            ),
        )
        tts = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
            voice_id="f9836c6e-a0bd-460e-9d3c-f7299fa60f94",  # Southern Lady
        )
        llm = GoogleLLMService(
            model=VOICE_MODEL,
            api_key=os.getenv("GOOGLE_API_KEY"),
        )
        llm.register_function("query_knowledge_base", query_knowledge_base)
        tools = [
            {
                "function_declarations": [
                    {
                        "name": "query_knowledge_base",
                        "description": "Query the knowledge base for the answer to the question.",
                        "parameters": {
                            "type": "object",
                            "properties": {
                                "question": {
                                    "type": "string",
                                    "description": "The question to query the knowledge base with.",
                                },
                            },
                        },
                    },
                ],
            },
        ]
        system_prompt = """\
You are a helpful assistant who converses with a user and answers questions.
You have access to the tool, query_knowledge_base, that allows you to query the knowledge base for the answer to the user's question.
Your response will be turned into speech so use only simple words and punctuation.
"""
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "Greet the user."},
        ]
        context = OpenAILLMContext(messages, tools)
        context_aggregator = llm.create_context_aggregator(context)
        pipeline = Pipeline(
            [
                transport.input(),
                context_aggregator.user(),
                llm,
                tts,
                transport.output(),
                context_aggregator.assistant(),
            ]
        )
        task = PipelineTask(
            pipeline,
            PipelineParams(
                allow_interruptions=True,
                enable_metrics=True,
                enable_usage_metrics=True,
            ),
        )
        @transport.event_handler("on_first_participant_joined")
        async def on_first_participant_joined(transport, participant):
            global video_participant_id
            video_participant_id = participant["id"]
            await transport.capture_participant_transcription(participant["id"])
            await transport.capture_participant_video(video_participant_id, framerate=0)
            # Kick off the conversation.
            await task.queue_frames([context_aggregator.user().get_context_frame()])
        runner = PipelineRunner()
        await runner.run(task)
if __name__ == "__main__":
    asyncio.run(main())

File diff suppressed because it is too large

examples/instant-voice/.gitignore

@@ -0,0 +1,7 @@
venv
.idea
server/.env
client/javascript/node_modules
client/javascript/.vite
bkp


@@ -0,0 +1,55 @@
# Instant Voice
This demo shows how to enable voice communication the moment a user connects.
Optimizations on both the server and the client let users start speaking immediately after pressing the connect button.
## How It Works
### Server-Side Improvements:
- A **pool of Daily rooms** is managed to ensure quick connections.
- When a user connects, an existing room from the pool is assigned.
- A new room is created asynchronously to maintain the predefined pool size.
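
The pooling behaviour described above can be sketched as follows. This is a minimal illustration rather than the demo's actual server code: the `RoomPool` class and the `create_room` coroutine are hypothetical names, and a real `create_room` would call the Daily REST API to create a room and return its URL.

```python
import asyncio
from typing import Awaitable, Callable


class RoomPool:
    """Keeps a fixed number of pre-created rooms ready to hand out."""

    def __init__(self, create_room: Callable[[], Awaitable[str]], size: int) -> None:
        # create_room is a hypothetical coroutine that returns a room URL.
        self._create_room = create_room
        self._size = size
        self._rooms = asyncio.Queue()
        self._refills = set()  # keep references so refill tasks are not garbage collected

    async def fill(self) -> None:
        """Create rooms concurrently until the pool holds `size` rooms."""
        missing = self._size - self._rooms.qsize()
        rooms = await asyncio.gather(*(self._create_room() for _ in range(missing)))
        for room in rooms:
            self._rooms.put_nowait(room)

    async def _refill_one(self) -> None:
        self._rooms.put_nowait(await self._create_room())

    async def acquire(self) -> str:
        """Hand out a ready room and schedule a replacement in the background."""
        room = await self._rooms.get()
        task = asyncio.create_task(self._refill_one())
        self._refills.add(task)
        task.add_done_callback(self._refills.discard)
        return room

    def available(self) -> int:
        """Number of rooms currently waiting in the pool."""
        return self._rooms.qsize()
```

In this sketch the caller never waits for room creation unless the pool is empty, because the replacement room is created after the existing one has already been returned.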
### Client-Side Improvements:
- With the **DailyTransport** option `bufferLocalAudioUntilBotReady` enabled, users can start speaking as soon as the client receives the `AUDIO_BUFFERING_STARTED` event (typically within ~1s).
- This allows users to speak even before the bot is fully ready or the WebRTC connection is fully established.
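
The idea of buffering audio until the bot is ready can be illustrated with a small, language-neutral sketch. It is written in Python purely for illustration and is not the Daily transport's implementation: `BufferUntilReady`, `push`, and `mark_ready` are hypothetical names, and `send` stands in for whatever forwards audio to the bot.

```python
from collections import deque
from typing import Callable, Deque


class BufferUntilReady:
    """Holds audio chunks until the receiver is ready, then forwards them in order."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        self._send = send  # hypothetical callback that delivers a chunk to the bot
        self._pending: Deque[bytes] = deque()
        self._ready = False

    def push(self, chunk: bytes) -> None:
        """Forward the chunk if the receiver is ready, otherwise keep it for later."""
        if self._ready:
            self._send(chunk)
        else:
            self._pending.append(chunk)

    def mark_ready(self) -> None:
        """Flush everything captured so far, then switch to pass-through."""
        while self._pending:
            self._send(self._pending.popleft())
        self._ready = True
```

Because the pending chunks are flushed in capture order before pass-through begins, speech recorded before the bot is ready reaches it ahead of anything said afterwards.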
## Quick Start
### 1. Start the Bot Server
1. Navigate to the server directory:
```bash
cd server
```
2. Create and activate a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Copy the `.env.example` file to `.env` and configure it:
- Add your API keys.
5. Start the server:
```bash
python src/server.py
```
### 2. Connect Using the Client App
For client-side setup, refer to the [JavaScript Guide](client/javascript/README.md).
## Important Notes
- The bot server **must** be running before using the client implementation.
- Ensure your environment variables are correctly set up.
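
One way to check the environment before starting the server is a short preflight script like the one below. It is only a sketch: the variable names `DAILY_API_KEY` and `GOOGLE_API_KEY` are assumptions based on the keys listed under Requirements, so adjust them to match your `.env.example`.

```python
import os

# Assumed variable names; replace with the ones from your .env.example.
REQUIRED_VARS = ("DAILY_API_KEY", "GOOGLE_API_KEY")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("Environment looks complete.")
```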
## Requirements
- **Python 3.10+**
- **Node.js 18+** (for the JavaScript client; required by Vite 6)
- **Daily API key**
- **Google API key**
- **Modern web browser with WebRTC support**


@@ -0,0 +1,27 @@
# JavaScript Implementation
Basic implementation using the [Pipecat JavaScript SDK](https://docs.pipecat.ai/client/js/introduction).
## Setup
1. Run the bot server. See the [server README](../../README.md).
2. Navigate to the `client/javascript` directory:
```bash
cd client/javascript
```
3. Install dependencies:
```bash
npm install
```
4. Run the client app:
```bash
npm run dev
```
5. Visit http://localhost:5173 in your browser.


@@ -0,0 +1,37 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AI Chatbot</title>
</head>
<body>
<div class="container">
<div class="status-bar">
<div class="status">
Buffering audio: <span id="buffering-status">No</span>
</div>
<div class="status">
Transport: <span id="connection-status">Disconnected</span>
</div>
<div class="controls">
<button id="connect-btn">Connect</button>
<button id="disconnect-btn" disabled>Disconnect</button>
</div>
</div>
<audio id="bot-audio" autoplay></audio>
<div class="debug-panel">
<h3>Debug Info</h3>
<div id="debug-log"></div>
</div>
</div>
<script type="module" src="/src/app.ts"></script>
<link rel="stylesheet" href="/src/style.css">
</body>
</html>

File diff suppressed because it is too large


@@ -0,0 +1,24 @@
{
"name": "client",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"dev": "vite",
"build": "tsc && vite build",
"preview": "vite preview"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"devDependencies": {
"@types/node": "^22.13.1",
"@vitejs/plugin-react-swc": "^3.7.2",
"typescript": "^5.7.3",
"vite": "^6.0.2"
},
"dependencies": {
"@pipecat-ai/client-js": "^0.3.2",
"@pipecat-ai/daily-transport": "^0.3.5"
}
}


@@ -0,0 +1,268 @@
/**
* Copyright (c) 2024-2025, Daily
*
* SPDX-License-Identifier: BSD 2-Clause License
*/
/**
* RTVI Client Implementation
*
* This client connects to an RTVI-compatible bot server using WebRTC (via Daily).
* It handles audio/video streaming and manages the connection lifecycle.
*
* Requirements:
* - A running RTVI bot server (defaults to http://localhost:7860)
* - The server must implement the /connect endpoint that returns Daily.co room credentials
* - Browser with WebRTC support
*/
import {
Participant,
RTVIClient,
RTVIClientOptions,
RTVIEvent,
} from '@pipecat-ai/client-js';
import { DailyTransport } from '@pipecat-ai/daily-transport';
import SoundUtils from "./util/soundUtils";
import { InstantVoiceHelper } from "./util/instantVoiceHelper";
/**
* InstantVoiceClient handles the connection and media management for a real-time
* voice and video interaction with an AI bot.
*/
class InstantVoiceClient {
private declare rtviClient: RTVIClient;
private connectBtn: HTMLButtonElement | null = null;
private disconnectBtn: HTMLButtonElement | null = null;
private statusSpan: HTMLElement | null = null;
private bufferingAudioSpan: HTMLElement | null = null;
private debugLog: HTMLElement | null = null;
private botAudio: HTMLAudioElement;
private declare startTime: number;
constructor() {
this.botAudio = document.createElement('audio');
this.botAudio.autoplay = true;
document.body.appendChild(this.botAudio);
this.setupDOMElements();
this.setupEventListeners();
this.initializeRTVIClient();
}
/**
* Set up references to DOM elements and create necessary media elements
*/
private setupDOMElements(): void {
this.connectBtn = document.getElementById('connect-btn') as HTMLButtonElement;
this.disconnectBtn = document.getElementById('disconnect-btn') as HTMLButtonElement;
this.statusSpan = document.getElementById('connection-status');
this.bufferingAudioSpan = document.getElementById('buffering-status');
this.debugLog = document.getElementById('debug-log');
}
/**
* Set up event listeners for connect/disconnect buttons
*/
private setupEventListeners(): void {
this.connectBtn?.addEventListener('click', () => this.connect());
this.disconnectBtn?.addEventListener('click', () => this.disconnect());
}
private initializeRTVIClient(): void {
const transport = new DailyTransport({
bufferLocalAudioUntilBotReady: true
});
const RTVIConfig: RTVIClientOptions = {
transport,
params: {
// The baseURL and endpoint of your bot server that the client will connect to
baseUrl: 'http://localhost:7860',
endpoints: { connect: '/connect' },
},
enableMic: true,
enableCam: false,
callbacks: {
onConnected: () => {
this.updateStatus('Connected');
if (this.connectBtn) this.connectBtn.disabled = true;
if (this.disconnectBtn) this.disconnectBtn.disabled = false;
},
onDisconnected: () => {
this.updateStatus('Disconnected');
this.updateBufferingStatus('No');
if (this.connectBtn) this.connectBtn.disabled = false;
if (this.disconnectBtn) this.disconnectBtn.disabled = true;
this.log('Client disconnected');
},
onBotConnected: (participant: Participant) => {
this.log(`onBotConnected, timeTaken: ${Date.now() - this.startTime}`);
},
onBotReady: (data) => {
this.log(`onBotReady, timeTaken: ${Date.now() - this.startTime}`);
this.log(`Bot ready: ${JSON.stringify(data)}`);
this.setupMediaTracks();
},
onUserTranscript: (data) => {
if (data.final) {
this.log(`User: ${data.text}`);
}
},
onBotTranscript: (data) => this.log(`Bot: ${data.text}`),
onMessageError: (error) => console.error('Message error:', error),
onError: (error) => console.error('Error:', error),
},
}
this.rtviClient = new RTVIClient(RTVIConfig);
this.rtviClient.registerHelper("transport", new InstantVoiceHelper({
callbacks: {
onAudioBufferingStarted: () => {
SoundUtils.beep();
this.updateBufferingStatus('Yes');
this.log(`onAudioBufferingStarted, timeTaken: ${Date.now() - this.startTime}`);
},
onAudioBufferingStopped: () => {
this.updateBufferingStatus('No');
this.log(`onAudioBufferingStopped, timeTaken: ${Date.now() - this.startTime}`);
}
}
}
));
this.setupTrackListeners();
}
/**
* Add a timestamped message to the debug log
*/
private log(message: string): void {
if (!this.debugLog) return;
const entry = document.createElement('div');
entry.textContent = `${new Date().toISOString()} - ${message}`;
if (message.startsWith('User: ')) {
entry.style.color = '#2196F3';
} else if (message.startsWith('Bot: ')) {
entry.style.color = '#4CAF50';
}
this.debugLog.appendChild(entry);
this.debugLog.scrollTop = this.debugLog.scrollHeight;
console.log(message);
}
/**
* Update the connection status display
*/
private updateStatus(status: string): void {
if (this.statusSpan) {
this.statusSpan.textContent = status;
}
this.log(`Status: ${status}`);
}
/**
* Update the audio buffering status display
*/
private updateBufferingStatus(status: string): void {
if (this.bufferingAudioSpan) {
this.bufferingAudioSpan.textContent = status;
}
this.log(`BufferingStatus: ${status}`);
}
/**
* Check for available media tracks and set them up if present
* This is called when the bot is ready or when the transport state changes to ready
*/
setupMediaTracks() {
if (!this.rtviClient) return;
const tracks = this.rtviClient.tracks();
if (tracks.bot?.audio) {
this.setupAudioTrack(tracks.bot.audio);
}
}
/**
* Set up listeners for track events (start/stop)
* This handles new tracks being added during the session
*/
setupTrackListeners() {
if (!this.rtviClient) return;
// Listen for new tracks starting
this.rtviClient.on(RTVIEvent.TrackStarted, (track, participant) => {
// Only handle non-local (bot) tracks
if (!participant?.local && track.kind === 'audio') {
this.setupAudioTrack(track);
}
});
// Listen for tracks stopping
this.rtviClient.on(RTVIEvent.TrackStopped, (track, participant) => {
this.log(`Track stopped: ${track.kind} from ${participant?.name || 'unknown'}`);
});
}
/**
* Set up an audio track for playback
* Handles both initial setup and track updates
*/
private setupAudioTrack(track: MediaStreamTrack): void {
this.log('Setting up audio track');
if (this.botAudio.srcObject && "getAudioTracks" in this.botAudio.srcObject) {
const oldTrack = this.botAudio.srcObject.getAudioTracks()[0];
if (oldTrack?.id === track.id) return;
}
this.botAudio.srcObject = new MediaStream([track]);
}
/**
* Initialize and connect to the bot
* This sets up the RTVI client, initializes devices, and establishes the connection
*/
public async connect(): Promise<void> {
try {
this.startTime = Date.now();
this.log('Connecting to bot...');
await this.rtviClient.connect();
} catch (error) {
this.log(`Error connecting: ${(error as Error).message}`);
this.updateStatus('Error');
this.updateBufferingStatus('No');
// Clean up if there's an error
if (this.rtviClient) {
try {
await this.rtviClient.disconnect();
} catch (disconnectError) {
this.log(`Error during disconnect: ${disconnectError}`);
}
}
}
}
/**
* Disconnect from the bot and clean up media resources
*/
public async disconnect(): Promise<void> {
try {
await this.rtviClient.disconnect();
if (this.botAudio.srcObject && "getAudioTracks" in this.botAudio.srcObject) {
this.botAudio.srcObject.getAudioTracks().forEach((track) => track.stop());
this.botAudio.srcObject = null;
}
} catch (error) {
this.log(`Error disconnecting: ${(error as Error).message}`);
}
}
}
declare global {
interface Window {
InstantVoiceClient: typeof InstantVoiceClient;
}
}
window.addEventListener('DOMContentLoaded', () => {
window.InstantVoiceClient = InstantVoiceClient;
new InstantVoiceClient();
});


@@ -0,0 +1,98 @@
body {
margin: 0;
padding: 20px;
font-family: Arial, sans-serif;
background-color: #f0f0f0;
}
.container {
max-width: 1200px;
margin: 0 auto;
}
.status-bar {
display: flex;
justify-content: space-between;
align-items: center;
padding: 10px;
background-color: #fff;
border-radius: 8px;
margin-bottom: 20px;
}
.controls button {
padding: 8px 16px;
margin-left: 10px;
border: none;
border-radius: 4px;
cursor: pointer;
}
#connect-btn {
background-color: #4caf50;
color: white;
}
#disconnect-btn {
background-color: #f44336;
color: white;
}
button:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.main-content {
background-color: #fff;
border-radius: 8px;
padding: 20px;
margin-bottom: 20px;
}
.bot-container {
display: flex;
flex-direction: column;
align-items: center;
}
#bot-video-container {
width: 640px;
height: 360px;
background-color: #e0e0e0;
border-radius: 8px;
margin: 20px auto;
overflow: hidden;
display: flex;
align-items: center;
justify-content: center;
}
#bot-video-container video {
width: 100%;
height: 100%;
object-fit: cover;
}
.debug-panel {
background-color: #fff;
border-radius: 8px;
padding: 20px;
}
.debug-panel h3 {
margin: 0 0 10px 0;
font-size: 16px;
font-weight: bold;
}
#debug-log {
height: 500px;
overflow-y: auto;
background-color: #f8f8f8;
padding: 10px;
border-radius: 4px;
font-family: monospace;
font-size: 12px;
line-height: 1.4;
}


@@ -0,0 +1,39 @@
import {RTVIClientHelper, RTVIClientHelperOptions, RTVIMessage} from "@pipecat-ai/client-js";
import {DailyRTVIMessageType} from '@pipecat-ai/daily-transport';
export type InstantVoiceHelperCallbacks = Partial<{
onAudioBufferingStarted: () => void;
onAudioBufferingStopped: () => void;
}>;
// --- Interface and class
export interface InstantVoiceHelperOptions extends RTVIClientHelperOptions {
callbacks?: InstantVoiceHelperCallbacks;
}
export class InstantVoiceHelper extends RTVIClientHelper {
protected declare _options: InstantVoiceHelperOptions;
constructor(options: InstantVoiceHelperOptions) {
super(options);
}
handleMessage(rtviMessage: RTVIMessage): void {
switch (rtviMessage.type) {
case DailyRTVIMessageType.AUDIO_BUFFERING_STARTED:
if (this._options.callbacks?.onAudioBufferingStarted) {
this._options.callbacks?.onAudioBufferingStarted()
}
break;
case DailyRTVIMessageType.AUDIO_BUFFERING_STOPPED:
if (this._options.callbacks?.onAudioBufferingStopped) {
this._options.callbacks?.onAudioBufferingStopped()
}
break;
}
}
getMessageTypes(): string[] {
return [DailyRTVIMessageType.AUDIO_BUFFERING_STARTED, DailyRTVIMessageType.AUDIO_BUFFERING_STOPPED];
}
}


@@ -0,0 +1,8 @@
class SoundUtils {
static beep() {
const snd = new Audio("data:audio/wav;base64,//uQRAAAAWMSLwUIYAAsYkXgoQwAEaYLWfkWgAI0wWs/ItAAAGDgYtAgAyN+QWaAAihwMWm4G8QQRDiMcCBcH3Cc+CDv/7xA4Tvh9Rz/y8QADBwMWgQAZG/ILNAARQ4GLTcDeIIIhxGOBAuD7hOfBB3/94gcJ3w+o5/5eIAIAAAVwWgQAVQ2ORaIQwEMAJiDg95G4nQL7mQVWI6GwRcfsZAcsKkJvxgxEjzFUgfHoSQ9Qq7KNwqHwuB13MA4a1q/DmBrHgPcmjiGoh//EwC5nGPEmS4RcfkVKOhJf+WOgoxJclFz3kgn//dBA+ya1GhurNn8zb//9NNutNuhz31f////9vt///z+IdAEAAAK4LQIAKobHItEIYCGAExBwe8jcToF9zIKrEdDYIuP2MgOWFSE34wYiR5iqQPj0JIeoVdlG4VD4XA67mAcNa1fhzA1jwHuTRxDUQ//iYBczjHiTJcIuPyKlHQkv/LHQUYkuSi57yQT//uggfZNajQ3Vmz+Zt//+mm3Wm3Q576v////+32///5/EOgAAADVghQAAAAA//uQZAUAB1WI0PZugAAAAAoQwAAAEk3nRd2qAAAAACiDgAAAAAAABCqEEQRLCgwpBGMlJkIz8jKhGvj4k6jzRnqasNKIeoh5gI7BJaC1A1AoNBjJgbyApVS4IDlZgDU5WUAxEKDNmmALHzZp0Fkz1FMTmGFl1FMEyodIavcCAUHDWrKAIA4aa2oCgILEBupZgHvAhEBcZ6joQBxS76AgccrFlczBvKLC0QI2cBoCFvfTDAo7eoOQInqDPBtvrDEZBNYN5xwNwxQRfw8ZQ5wQVLvO8OYU+mHvFLlDh05Mdg7BT6YrRPpCBznMB2r//xKJjyyOh+cImr2/4doscwD6neZjuZR4AgAABYAAAABy1xcdQtxYBYYZdifkUDgzzXaXn98Z0oi9ILU5mBjFANmRwlVJ3/6jYDAmxaiDG3/6xjQQCCKkRb/6kg/wW+kSJ5//rLobkLSiKmqP/0ikJuDaSaSf/6JiLYLEYnW/+kXg1WRVJL/9EmQ1YZIsv/6Qzwy5qk7/+tEU0nkls3/zIUMPKNX/6yZLf+kFgAfgGyLFAUwY//uQZAUABcd5UiNPVXAAAApAAAAAE0VZQKw9ISAAACgAAAAAVQIygIElVrFkBS+Jhi+EAuu+lKAkYUEIsmEAEoMeDmCETMvfSHTGkF5RWH7kz/ESHWPAq/kcCRhqBtMdokPdM7vil7RG98A2sc7zO6ZvTdM7pmOUAZTnJW+NXxqmd41dqJ6mLTXxrPpnV8avaIf5SvL7pndPvPpndJR9Kuu8fePvuiuhorgWjp7Mf/PRjxcFCPDkW31srioCExivv9lcwKEaHsf/7ow2Fl1T/9RkXgEhYElAoCLFtMArxwivDJJ+bR1HTKJdlEoTELCIqgEwVGSQ+hIm0NbK8WXcTEI0UPoa2NbG4y2K00JEWbZavJXkYaqo9CRHS55FcZTjKEk3NKoCYUnSQ0rWxrZbFKbKIhOKPZe1cJKzZSaQrIyULHDZmV5K4xySsDRKWOruanGtjLJXFEmwaIbDLX0hIPBUQPVFVkQkDoUNfSoDgQGKPekoxeGzA4DUvnn4bxzcZrtJyipKfPNy5w+9lnXwgqsiyHNeSVpemw4bWb9psYeq//uQZBoABQt4yMVxYAIAAAkQoAAAHvYpL5m6AAgAACXDAAAAD59jblTirQe9upFsmZbpMudy7Lz1X1DYsxOOSWpfPqNX2WqktK0DMvuGwlbNj44TleLPQ+Gsfb+GOWOKJoIrWb3cIMeeON6lz2umTqMXV8Mj30yWPpjoSa9ujK8SyeJP5y5mOW1D6hvLepeveEAEDo0mgCRClOEgANv3B9a6fikgUSu/DmAMATrGx7nng5p5iimPNZsfQLYB2sDLIkzRKZOHGAaUyDcpFBSLG9MCQALgAIg
Qs2YunOszLSAyQYPVC2YdGGeHD2dTdJk1pAHGAWDjnkcLKFymS3RQZTInzySoBwMG0QueC3gMsCEYxUqlrcxK6k1LQQcsmyYeQPdC2YfuGPASCBkcVMQQqpVJshui1tkXQJQV0OXGAZMXSOEEBRirXbVRQW7ugq7IM7rPWSZyDlM3IuNEkxzCOJ0ny2ThNkyRai1b6ev//3dzNGzNb//4uAvHT5sURcZCFcuKLhOFs8mLAAEAt4UWAAIABAAAAAB4qbHo0tIjVkUU//uQZAwABfSFz3ZqQAAAAAngwAAAE1HjMp2qAAAAACZDgAAAD5UkTE1UgZEUExqYynN1qZvqIOREEFmBcJQkwdxiFtw0qEOkGYfRDifBui9MQg4QAHAqWtAWHoCxu1Yf4VfWLPIM2mHDFsbQEVGwyqQoQcwnfHeIkNt9YnkiaS1oizycqJrx4KOQjahZxWbcZgztj2c49nKmkId44S71j0c8eV9yDK6uPRzx5X18eDvjvQ6yKo9ZSS6l//8elePK/Lf//IInrOF/FvDoADYAGBMGb7FtErm5MXMlmPAJQVgWta7Zx2go+8xJ0UiCb8LHHdftWyLJE0QIAIsI+UbXu67dZMjmgDGCGl1H+vpF4NSDckSIkk7Vd+sxEhBQMRU8j/12UIRhzSaUdQ+rQU5kGeFxm+hb1oh6pWWmv3uvmReDl0UnvtapVaIzo1jZbf/pD6ElLqSX+rUmOQNpJFa/r+sa4e/pBlAABoAAAAA3CUgShLdGIxsY7AUABPRrgCABdDuQ5GC7DqPQCgbbJUAoRSUj+NIEig0YfyWUho1VBBBA//uQZB4ABZx5zfMakeAAAAmwAAAAF5F3P0w9GtAAACfAAAAAwLhMDmAYWMgVEG1U0FIGCBgXBXAtfMH10000EEEEEECUBYln03TTTdNBDZopopYvrTTdNa325mImNg3TTPV9q3pmY0xoO6bv3r00y+IDGid/9aaaZTGMuj9mpu9Mpio1dXrr5HERTZSmqU36A3CumzN/9Robv/Xx4v9ijkSRSNLQhAWumap82WRSBUqXStV/YcS+XVLnSS+WLDroqArFkMEsAS+eWmrUzrO0oEmE40RlMZ5+ODIkAyKAGUwZ3mVKmcamcJnMW26MRPgUw6j+LkhyHGVGYjSUUKNpuJUQoOIAyDvEyG8S5yfK6dhZc0Tx1KI/gviKL6qvvFs1+bWtaz58uUNnryq6kt5RzOCkPWlVqVX2a/EEBUdU1KrXLf40GoiiFXK///qpoiDXrOgqDR38JB0bw7SoL+ZB9o1RCkQjQ2CBYZKd/+VJxZRRZlqSkKiws0WFxUyCwsKiMy7hUVFhIaCrNQsKkTIsLivwKKigsj8XYlwt/WKi2N4d//uQRCSAAjURNIHpMZBGYiaQPSYyAAABLAAAAAAAACWAAAAApUF/Mg+0aohSIRobBAsMlO//Kk4soosy1JSFRYWaLC4qZBYWFRGZdwqKiwkNBVmoWFSJkWFxX4FFRQWR+LsS4W/rFRb/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////VEFHAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAU291bmRib3kuZGUAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMjAwNGh0dHA6Ly93d3cuc291bmRib3kuZGUAAAAAAAAAACU=");
void snd.play();
}
}
export default SoundUtils;


@@ -0,0 +1,111 @@
{
"compilerOptions": {
/* Visit https://aka.ms/tsconfig to read more about this file */
/* Projects */
// "incremental": true, /* Save .tsbuildinfo files to allow for incremental compilation of projects. */
// "composite": true, /* Enable constraints that allow a TypeScript project to be used with project references. */
// "tsBuildInfoFile": "./.tsbuildinfo", /* Specify the path to .tsbuildinfo incremental compilation file. */
// "disableSourceOfProjectReferenceRedirect": true, /* Disable preferring source files instead of declaration files when referencing composite projects. */
// "disableSolutionSearching": true, /* Opt a project out of multi-project reference checking when editing. */
// "disableReferencedProjectLoad": true, /* Reduce the number of projects loaded automatically by TypeScript. */
/* Language and Environment */
"target": "es2016", /* Set the JavaScript language version for emitted JavaScript and include compatible library declarations. */
// "lib": [], /* Specify a set of bundled library declaration files that describe the target runtime environment. */
// "jsx": "preserve", /* Specify what JSX code is generated. */
// "experimentalDecorators": true, /* Enable experimental support for legacy experimental decorators. */
// "emitDecoratorMetadata": true, /* Emit design-type metadata for decorated declarations in source files. */
// "jsxFactory": "", /* Specify the JSX factory function used when targeting React JSX emit, e.g. 'React.createElement' or 'h'. */
// "jsxFragmentFactory": "", /* Specify the JSX Fragment reference used for fragments when targeting React JSX emit e.g. 'React.Fragment' or 'Fragment'. */
// "jsxImportSource": "", /* Specify module specifier used to import the JSX factory functions when using 'jsx: react-jsx*'. */
// "reactNamespace": "", /* Specify the object invoked for 'createElement'. This only applies when targeting 'react' JSX emit. */
// "noLib": true, /* Disable including any library files, including the default lib.d.ts. */
// "useDefineForClassFields": true, /* Emit ECMAScript-standard-compliant class fields. */
// "moduleDetection": "auto", /* Control what method is used to detect module-format JS files. */
/* Modules */
"module": "commonjs", /* Specify what module code is generated. */
// "rootDir": "./", /* Specify the root folder within your source files. */
// "moduleResolution": "node10", /* Specify how TypeScript looks up a file from a given module specifier. */
// "baseUrl": "./", /* Specify the base directory to resolve non-relative module names. */
// "paths": {}, /* Specify a set of entries that re-map imports to additional lookup locations. */
// "rootDirs": [], /* Allow multiple folders to be treated as one when resolving modules. */
// "typeRoots": [], /* Specify multiple folders that act like './node_modules/@types'. */
// "types": [], /* Specify type package names to be included without being referenced in a source file. */
// "allowUmdGlobalAccess": true, /* Allow accessing UMD globals from modules. */
// "moduleSuffixes": [], /* List of file name suffixes to search when resolving a module. */
// "allowImportingTsExtensions": true, /* Allow imports to include TypeScript file extensions. Requires '--moduleResolution bundler' and either '--noEmit' or '--emitDeclarationOnly' to be set. */
// "rewriteRelativeImportExtensions": true, /* Rewrite '.ts', '.tsx', '.mts', and '.cts' file extensions in relative import paths to their JavaScript equivalent in output files. */
// "resolvePackageJsonExports": true, /* Use the package.json 'exports' field when resolving package imports. */
// "resolvePackageJsonImports": true, /* Use the package.json 'imports' field when resolving imports. */
// "customConditions": [], /* Conditions to set in addition to the resolver-specific defaults when resolving imports. */
// "noUncheckedSideEffectImports": true, /* Check side effect imports. */
// "resolveJsonModule": true, /* Enable importing .json files. */
// "allowArbitraryExtensions": true, /* Enable importing files with any extension, provided a declaration file is present. */
// "noResolve": true, /* Disallow 'import's, 'require's or '<reference>'s from expanding the number of files TypeScript should add to a project. */
/* JavaScript Support */
// "allowJs": true, /* Allow JavaScript files to be a part of your program. Use the 'checkJS' option to get errors from these files. */
// "checkJs": true, /* Enable error reporting in type-checked JavaScript files. */
// "maxNodeModuleJsDepth": 1, /* Specify the maximum folder depth used for checking JavaScript files from 'node_modules'. Only applicable with 'allowJs'. */
/* Emit */
// "declaration": true, /* Generate .d.ts files from TypeScript and JavaScript files in your project. */
// "declarationMap": true, /* Create sourcemaps for d.ts files. */
// "emitDeclarationOnly": true, /* Only output d.ts files and not JavaScript files. */
// "sourceMap": true, /* Create source map files for emitted JavaScript files. */
// "inlineSourceMap": true, /* Include sourcemap files inside the emitted JavaScript. */
// "noEmit": true, /* Disable emitting files from a compilation. */
// "outFile": "./", /* Specify a file that bundles all outputs into one JavaScript file. If 'declaration' is true, also designates a file that bundles all .d.ts output. */
// "outDir": "./", /* Specify an output folder for all emitted files. */
// "removeComments": true, /* Disable emitting comments. */
// "importHelpers": true, /* Allow importing helper functions from tslib once per project, instead of including them per-file. */
// "downlevelIteration": true, /* Emit more compliant, but verbose and less performant JavaScript for iteration. */
// "sourceRoot": "", /* Specify the root path for debuggers to find the reference source code. */
// "mapRoot": "", /* Specify the location where debugger should locate map files instead of generated locations. */
// "inlineSources": true, /* Include source code in the sourcemaps inside the emitted JavaScript. */
// "emitBOM": true, /* Emit a UTF-8 Byte Order Mark (BOM) in the beginning of output files. */
// "newLine": "crlf", /* Set the newline character for emitting files. */
// "stripInternal": true, /* Disable emitting declarations that have '@internal' in their JSDoc comments. */
// "noEmitHelpers": true, /* Disable generating custom helper functions like '__extends' in compiled output. */
// "noEmitOnError": true, /* Disable emitting files if any type checking errors are reported. */
// "preserveConstEnums": true, /* Disable erasing 'const enum' declarations in generated code. */
// "declarationDir": "./", /* Specify the output directory for generated declaration files. */
/* Interop Constraints */
// "isolatedModules": true, /* Ensure that each file can be safely transpiled without relying on other imports. */
// "verbatimModuleSyntax": true, /* Do not transform or elide any imports or exports not marked as type-only, ensuring they are written in the output file's format based on the 'module' setting. */
// "isolatedDeclarations": true, /* Require sufficient annotation on exports so other tools can trivially generate declaration files. */
// "allowSyntheticDefaultImports": true, /* Allow 'import x from y' when a module doesn't have a default export. */
"esModuleInterop": true, /* Emit additional JavaScript to ease support for importing CommonJS modules. This enables 'allowSyntheticDefaultImports' for type compatibility. */
// "preserveSymlinks": true, /* Disable resolving symlinks to their realpath. This correlates to the same flag in node. */
"forceConsistentCasingInFileNames": true, /* Ensure that casing is correct in imports. */
/* Type Checking */
"strict": true, /* Enable all strict type-checking options. */
// "noImplicitAny": true, /* Enable error reporting for expressions and declarations with an implied 'any' type. */
// "strictNullChecks": true, /* When type checking, take into account 'null' and 'undefined'. */
// "strictFunctionTypes": true, /* When assigning functions, check to ensure parameters and the return values are subtype-compatible. */
// "strictBindCallApply": true, /* Check that the arguments for 'bind', 'call', and 'apply' methods match the original function. */
// "strictPropertyInitialization": true, /* Check for class properties that are declared but not set in the constructor. */
// "strictBuiltinIteratorReturn": true, /* Built-in iterators are instantiated with a 'TReturn' type of 'undefined' instead of 'any'. */
// "noImplicitThis": true, /* Enable error reporting when 'this' is given the type 'any'. */
// "useUnknownInCatchVariables": true, /* Default catch clause variables as 'unknown' instead of 'any'. */
// "alwaysStrict": true, /* Ensure 'use strict' is always emitted. */
// "noUnusedLocals": true, /* Enable error reporting when local variables aren't read. */
// "noUnusedParameters": true, /* Raise an error when a function parameter isn't read. */
// "exactOptionalPropertyTypes": true, /* Interpret optional property types as written, rather than adding 'undefined'. */
// "noImplicitReturns": true, /* Enable error reporting for codepaths that do not explicitly return in a function. */
// "noFallthroughCasesInSwitch": true, /* Enable error reporting for fallthrough cases in switch statements. */
// "noUncheckedIndexedAccess": true, /* Add 'undefined' to a type when accessed using an index. */
// "noImplicitOverride": true, /* Ensure overriding members in derived classes are marked with an override modifier. */
// "noPropertyAccessFromIndexSignature": true, /* Enforces using indexed accessors for keys declared using an indexed type. */
// "allowUnusedLabels": true, /* Disable error reporting for unused labels. */
// "allowUnreachableCode": true, /* Disable error reporting for unreachable code. */
/* Completeness */
// "skipDefaultLibCheck": true, /* Skip type checking .d.ts files that are included with TypeScript. */
"skipLibCheck": true /* Skip type checking all .d.ts files. */
}
}


@@ -0,0 +1,15 @@
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react-swc';
export default defineConfig({
plugins: [react()],
server: {
proxy: {
// Proxy /connect requests to the backend server
'/connect': {
target: 'http://0.0.0.0:7860', // Replace with your backend URL
changeOrigin: true,
},
},
},
});


@@ -0,0 +1,339 @@
# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.
# yarn lockfile v1
"@babel/runtime@^7.12.5":
version "7.26.0"
resolved "https://registry.npmjs.org/@babel/runtime/-/runtime-7.26.0.tgz"
integrity sha512-FDSOghenHTiToteC/QRlv2q3DhPZ/oOXTBoirfWNx1Cx3TMVcGWQtMMmQcSvb/JjpNeGzx8Pq/b4fKEJuWm1sw==
dependencies:
regenerator-runtime "^0.14.0"
"@daily-co/daily-js@^0.73.0":
version "0.73.0"
resolved "https://registry.npmjs.org/@daily-co/daily-js/-/daily-js-0.73.0.tgz"
integrity sha512-Wz8c60hgmkx8fcEeDAi4L4J0rbafiihWKyXFyhYoFYPsw2OdChHpA4RYwIB+1enRws5IK+/HdmzFDYLQsB4A6w==
dependencies:
"@babel/runtime" "^7.12.5"
"@sentry/browser" "^8.33.1"
bowser "^2.8.1"
dequal "^2.0.3"
events "^3.1.0"
"@esbuild/darwin-arm64@0.24.0":
version "0.24.0"
resolved "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.24.0.tgz"
integrity sha512-CKyDpRbK1hXwv79soeTJNHb5EiG6ct3efd/FTPdzOWdbZZfGhpbcqIpiD0+vwmpu0wTIL97ZRPZu8vUt46nBSw==
"@pipecat-ai/client-js@^0.3.2", "@pipecat-ai/client-js@~0.3.2":
version "0.3.2"
resolved "https://registry.npmjs.org/@pipecat-ai/client-js/-/client-js-0.3.2.tgz"
integrity sha512-psunOVrJjPka2SWlq53vxVWCA0Vt8pSXsXtn8pOLC0YTKFsUx+b7Z6quYUJcDZjCe1aAg9cKETek3Xal3Co8Tg==
dependencies:
"@types/events" "^3.0.3"
clone-deep "^4.0.1"
events "^3.3.0"
typed-emitter "^2.1.0"
uuid "^10.0.0"
"@pipecat-ai/daily-transport@^0.3.5":
version "0.3.5"
resolved "https://registry.npmjs.org/@pipecat-ai/daily-transport/-/daily-transport-0.3.5.tgz"
integrity sha512-nJ0TvWPCqXPmU81U8cXOqk5mUEEvEuI06Mis+N0jN8KZUrNy1pP08iWbs07ObmIXdnQcoL+kQmHOerT4q/bF0w==
dependencies:
"@daily-co/daily-js" "^0.73.0"
"@rollup/rollup-darwin-arm64@4.28.0":
version "4.28.0"
resolved "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.28.0.tgz"
integrity sha512-lmKx9yHsppblnLQZOGxdO66gT77bvdBtr/0P+TPOseowE7D9AJoBw8ZDULRasXRWf1Z86/gcOdpBrV6VDUY36Q==
"@sentry-internal/browser-utils@8.49.0":
version "8.49.0"
resolved "https://registry.npmjs.org/@sentry-internal/browser-utils/-/browser-utils-8.49.0.tgz"
integrity sha512-XkPHHdFqsN7EPaB+QGUOEmpFqXiqP67t2rRZ1HG1UwJoe0PhJEKNy7b4+WRwmT7ODSt+PvFk1gNBlJBpThwH7Q==
dependencies:
"@sentry/core" "8.49.0"
"@sentry-internal/feedback@8.49.0":
version "8.49.0"
resolved "https://registry.npmjs.org/@sentry-internal/feedback/-/feedback-8.49.0.tgz"
integrity sha512-v/wf7WvPxEvZUB7xrCnecI3fhevVo84hw8WlxgZIz6mLUHXEIX8xYWc9H8Yet/KKJ2uEB8GQ8aDsY6S1hVEIUA==
dependencies:
"@sentry/core" "8.49.0"
"@sentry-internal/replay-canvas@8.49.0":
version "8.49.0"
resolved "https://registry.npmjs.org/@sentry-internal/replay-canvas/-/replay-canvas-8.49.0.tgz"
integrity sha512-/yXxI7f+Wu24FIYoRE7A0AidNxORuhAyPzb5ey1wFqMXP72nG8dXhOpcl0w+bi554FkqkLjdeUDhSOBWYZXH9g==
dependencies:
"@sentry-internal/replay" "8.49.0"
"@sentry/core" "8.49.0"
"@sentry-internal/replay@8.49.0":
version "8.49.0"
resolved "https://registry.npmjs.org/@sentry-internal/replay/-/replay-8.49.0.tgz"
integrity sha512-BDiiCBxskkktTd6FNplBc9V8l14R4T/AwRIZj2itX4xnuHewTTDjVbeyvGol4roA4r+V0Mzoi31hLEGI6yFQ5Q==
dependencies:
"@sentry-internal/browser-utils" "8.49.0"
"@sentry/core" "8.49.0"
"@sentry/browser@^8.33.1":
version "8.49.0"
resolved "https://registry.npmjs.org/@sentry/browser/-/browser-8.49.0.tgz"
integrity sha512-dS4Sw2h8EixHeXOIR++XEVMTen6xCGcIQ/XhJbsjqvddXeIijW0WkxSeTfPkfs0dsqFHSisWmlmo0xhHbXvEsQ==
dependencies:
"@sentry-internal/browser-utils" "8.49.0"
"@sentry-internal/feedback" "8.49.0"
"@sentry-internal/replay" "8.49.0"
"@sentry-internal/replay-canvas" "8.49.0"
"@sentry/core" "8.49.0"
"@sentry/core@8.49.0":
version "8.49.0"
resolved "https://registry.npmjs.org/@sentry/core/-/core-8.49.0.tgz"
integrity sha512-/OAm6LdHhh8TvfDAucWfSJV7M03IOHrJm5LVjrrKr4gwQ1HKd4CDbARsBbPwHIzSRAle0IgG3sbJxEvv52JUIw==
"@swc/core-darwin-arm64@1.10.14":
version "1.10.14"
resolved "https://registry.npmjs.org/@swc/core-darwin-arm64/-/core-darwin-arm64-1.10.14.tgz"
integrity sha512-Dh4VyrhDDb05tdRmqJ/MucOPMTnrB4pRJol18HVyLlqu1HOT5EzonUniNTCdQbUXjgdv5UVJSTE1lYTzrp+myA==
"@swc/core@^1.7.26":
version "1.10.14"
resolved "https://registry.npmjs.org/@swc/core/-/core-1.10.14.tgz"
integrity sha512-WSrnE6JRnH20ZYjOOgSS4aOaPv9gxlkI2KRkN24kagbZnPZMnN8bZZyzw1rrLvwgpuRGv17Uz+hflosbR+SP6w==
dependencies:
"@swc/counter" "^0.1.3"
"@swc/types" "^0.1.17"
optionalDependencies:
"@swc/core-darwin-arm64" "1.10.14"
"@swc/core-darwin-x64" "1.10.14"
"@swc/core-linux-arm-gnueabihf" "1.10.14"
"@swc/core-linux-arm64-gnu" "1.10.14"
"@swc/core-linux-arm64-musl" "1.10.14"
"@swc/core-linux-x64-gnu" "1.10.14"
"@swc/core-linux-x64-musl" "1.10.14"
"@swc/core-win32-arm64-msvc" "1.10.14"
"@swc/core-win32-ia32-msvc" "1.10.14"
"@swc/core-win32-x64-msvc" "1.10.14"
"@swc/counter@^0.1.3":
version "0.1.3"
resolved "https://registry.npmjs.org/@swc/counter/-/counter-0.1.3.tgz"
integrity sha512-e2BR4lsJkkRlKZ/qCHPw9ZaSxc0MVUd7gtbtaB7aMvHeJVYe8sOB8DBZkP2DtISHGSku9sCK6T6cnY0CtXrOCQ==
"@swc/types@^0.1.17":
version "0.1.17"
resolved "https://registry.npmjs.org/@swc/types/-/types-0.1.17.tgz"
integrity sha512-V5gRru+aD8YVyCOMAjMpWR1Ui577DD5KSJsHP8RAxopAH22jFz6GZd/qxqjO6MJHQhcsjvjOFXyDhyLQUnMveQ==
dependencies:
"@swc/counter" "^0.1.3"
"@types/estree@1.0.6":
version "1.0.6"
resolved "https://registry.npmjs.org/@types/estree/-/estree-1.0.6.tgz"
integrity sha512-AYnb1nQyY49te+VRAVgmzfcgjYS91mY5P0TKUDCLEM+gNnA+3T6rWITXRLYCpahpqSQbN5cE+gHpnPyXjHWxcw==
"@types/events@^3.0.3":
version "3.0.3"
resolved "https://registry.npmjs.org/@types/events/-/events-3.0.3.tgz"
integrity sha512-trOc4AAUThEz9hapPtSd7wf5tiQKvTtu5b371UxXdTuqzIh0ArcRspRP0i0Viu+LXstIQ1z96t1nsPxT9ol01g==
"@types/node@^18.0.0 || ^20.0.0 || >=22.0.0", "@types/node@^22.13.1":
version "22.13.1"
resolved "https://registry.npmjs.org/@types/node/-/node-22.13.1.tgz"
integrity sha512-jK8uzQlrvXqEU91UxiK5J7pKHyzgnI1Qnl0QDHIgVGuolJhRb9EEl28Cj9b3rGR8B2lhFCtvIm5os8lFnO/1Ew==
dependencies:
undici-types "~6.20.0"
"@vitejs/plugin-react-swc@^3.7.2":
version "3.7.2"
resolved "https://registry.npmjs.org/@vitejs/plugin-react-swc/-/plugin-react-swc-3.7.2.tgz"
integrity sha512-y0byko2b2tSVVf5Gpng1eEhX1OvPC7x8yns1Fx8jDzlJp4LS6CMkCPfLw47cjyoMrshQDoQw4qcgjsU9VvlCew==
dependencies:
"@swc/core" "^1.7.26"
bowser@^2.8.1:
version "2.11.0"
resolved "https://registry.npmjs.org/bowser/-/bowser-2.11.0.tgz"
integrity sha512-AlcaJBi/pqqJBIQ8U9Mcpc9i8Aqxn88Skv5d+xBX006BY5u8N3mGLHa5Lgppa7L/HfwgwLgZ6NYs+Ag6uUmJRA==
clone-deep@^4.0.1:
version "4.0.1"
resolved "https://registry.npmjs.org/clone-deep/-/clone-deep-4.0.1.tgz"
integrity sha512-neHB9xuzh/wk0dIHweyAXv2aPGZIVk3pLMe+/RNzINf17fe0OG96QroktYAUm7SM1PBnzTabaLboqqxDyMU+SQ==
dependencies:
is-plain-object "^2.0.4"
kind-of "^6.0.2"
shallow-clone "^3.0.0"
dequal@^2.0.3:
version "2.0.3"
resolved "https://registry.npmjs.org/dequal/-/dequal-2.0.3.tgz"
integrity sha512-0je+qPKHEMohvfRTCEo3CrPG6cAzAYgmzKyxRiYSSDkS6eGJdyVJm7WaYA5ECaAD9wLB2T4EEeymA5aFVcYXCA==
esbuild@^0.24.0:
version "0.24.0"
resolved "https://registry.npmjs.org/esbuild/-/esbuild-0.24.0.tgz"
integrity sha512-FuLPevChGDshgSicjisSooU0cemp/sGXR841D5LHMB7mTVOmsEHcAxaH3irL53+8YDIeVNQEySh4DaYU/iuPqQ==
optionalDependencies:
"@esbuild/aix-ppc64" "0.24.0"
"@esbuild/android-arm" "0.24.0"
"@esbuild/android-arm64" "0.24.0"
"@esbuild/android-x64" "0.24.0"
"@esbuild/darwin-arm64" "0.24.0"
"@esbuild/darwin-x64" "0.24.0"
"@esbuild/freebsd-arm64" "0.24.0"
"@esbuild/freebsd-x64" "0.24.0"
"@esbuild/linux-arm" "0.24.0"
"@esbuild/linux-arm64" "0.24.0"
"@esbuild/linux-ia32" "0.24.0"
"@esbuild/linux-loong64" "0.24.0"
"@esbuild/linux-mips64el" "0.24.0"
"@esbuild/linux-ppc64" "0.24.0"
"@esbuild/linux-riscv64" "0.24.0"
"@esbuild/linux-s390x" "0.24.0"
"@esbuild/linux-x64" "0.24.0"
"@esbuild/netbsd-x64" "0.24.0"
"@esbuild/openbsd-arm64" "0.24.0"
"@esbuild/openbsd-x64" "0.24.0"
"@esbuild/sunos-x64" "0.24.0"
"@esbuild/win32-arm64" "0.24.0"
"@esbuild/win32-ia32" "0.24.0"
"@esbuild/win32-x64" "0.24.0"
events@^3.1.0, events@^3.3.0:
version "3.3.0"
resolved "https://registry.npmjs.org/events/-/events-3.3.0.tgz"
integrity sha512-mQw+2fkQbALzQ7V0MY0IqdnXNOeTtP4r0lN9z7AAawCXgqea7bDii20AYrIBrFd/Hx0M2Ocz6S111CaFkUcb0Q==
fsevents@~2.3.2, fsevents@~2.3.3:
version "2.3.3"
resolved "https://registry.npmjs.org/fsevents/-/fsevents-2.3.3.tgz"
integrity sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==
is-plain-object@^2.0.4:
version "2.0.4"
resolved "https://registry.npmjs.org/is-plain-object/-/is-plain-object-2.0.4.tgz"
integrity sha512-h5PpgXkWitc38BBMYawTYMWJHFZJVnBquFE57xFpjB8pJFiF6gZ+bU+WyI/yqXiFR5mdLsgYNaPe8uao6Uv9Og==
dependencies:
isobject "^3.0.1"
isobject@^3.0.1:
version "3.0.1"
resolved "https://registry.npmjs.org/isobject/-/isobject-3.0.1.tgz"
integrity sha512-WhB9zCku7EGTj/HQQRz5aUQEUeoQZH2bWcltRErOpymJ4boYE6wL9Tbr23krRPSZ+C5zqNSrSw+Cc7sZZ4b7vg==
kind-of@^6.0.2:
version "6.0.3"
resolved "https://registry.npmjs.org/kind-of/-/kind-of-6.0.3.tgz"
integrity sha512-dcS1ul+9tmeD95T+x28/ehLgd9mENa3LsvDTtzm3vyBEO7RPptvAD+t44WVXaUjTBRcrpFeFlC8WCruUR456hw==
nanoid@^3.3.7:
version "3.3.8"
resolved "https://registry.npmjs.org/nanoid/-/nanoid-3.3.8.tgz"
integrity sha512-WNLf5Sd8oZxOm+TzppcYk8gVOgP+l58xNy58D0nbUnOxOWRWvlcCV4kUF7ltmI6PsrLl/BgKEyS4mqsGChFN0w==
picocolors@^1.1.1:
version "1.1.1"
resolved "https://registry.npmjs.org/picocolors/-/picocolors-1.1.1.tgz"
integrity sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==
postcss@^8.4.49:
version "8.4.49"
resolved "https://registry.npmjs.org/postcss/-/postcss-8.4.49.tgz"
integrity sha512-OCVPnIObs4N29kxTjzLfUryOkvZEq+pf8jTF0lg8E7uETuWHA+v7j3c/xJmiqpX450191LlmZfUKkXxkTry7nA==
dependencies:
nanoid "^3.3.7"
picocolors "^1.1.1"
source-map-js "^1.2.1"
regenerator-runtime@^0.14.0:
version "0.14.1"
resolved "https://registry.npmjs.org/regenerator-runtime/-/regenerator-runtime-0.14.1.tgz"
integrity sha512-dYnhHh0nJoMfnkZs6GmmhFknAGRrLznOu5nc9ML+EJxGvrx6H7teuevqVqCuPcPK//3eDrrjQhehXVx9cnkGdw==
rollup@^4.23.0:
version "4.28.0"
resolved "https://registry.npmjs.org/rollup/-/rollup-4.28.0.tgz"
integrity sha512-G9GOrmgWHBma4YfCcX8PjH0qhXSdH8B4HDE2o4/jaxj93S4DPCIDoLcXz99eWMji4hB29UFCEd7B2gwGJDR9cQ==
dependencies:
"@types/estree" "1.0.6"
optionalDependencies:
"@rollup/rollup-android-arm-eabi" "4.28.0"
"@rollup/rollup-android-arm64" "4.28.0"
"@rollup/rollup-darwin-arm64" "4.28.0"
"@rollup/rollup-darwin-x64" "4.28.0"
"@rollup/rollup-freebsd-arm64" "4.28.0"
"@rollup/rollup-freebsd-x64" "4.28.0"
"@rollup/rollup-linux-arm-gnueabihf" "4.28.0"
"@rollup/rollup-linux-arm-musleabihf" "4.28.0"
"@rollup/rollup-linux-arm64-gnu" "4.28.0"
"@rollup/rollup-linux-arm64-musl" "4.28.0"
"@rollup/rollup-linux-powerpc64le-gnu" "4.28.0"
"@rollup/rollup-linux-riscv64-gnu" "4.28.0"
"@rollup/rollup-linux-s390x-gnu" "4.28.0"
"@rollup/rollup-linux-x64-gnu" "4.28.0"
"@rollup/rollup-linux-x64-musl" "4.28.0"
"@rollup/rollup-win32-arm64-msvc" "4.28.0"
"@rollup/rollup-win32-ia32-msvc" "4.28.0"
"@rollup/rollup-win32-x64-msvc" "4.28.0"
fsevents "~2.3.2"
rxjs@*:
version "7.8.1"
resolved "https://registry.npmjs.org/rxjs/-/rxjs-7.8.1.tgz"
integrity sha512-AA3TVj+0A2iuIoQkWEK/tqFjBq2j+6PO6Y0zJcvzLAFhEFIO3HL0vls9hWLncZbAAbK0mar7oZ4V079I/qPMxg==
dependencies:
tslib "^2.1.0"
shallow-clone@^3.0.0:
version "3.0.1"
resolved "https://registry.npmjs.org/shallow-clone/-/shallow-clone-3.0.1.tgz"
integrity sha512-/6KqX+GVUdqPuPPd2LxDDxzX6CAbjJehAAOKlNpqqUpAqPM6HeL8f+o3a+JsyGjn2lv0WY8UsTgUJjU9Ok55NA==
dependencies:
kind-of "^6.0.2"
source-map-js@^1.2.1:
version "1.2.1"
resolved "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.1.tgz"
integrity sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==
tslib@^2.1.0:
version "2.8.1"
resolved "https://registry.npmjs.org/tslib/-/tslib-2.8.1.tgz"
integrity sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w==
typed-emitter@^2.1.0:
version "2.1.0"
resolved "https://registry.npmjs.org/typed-emitter/-/typed-emitter-2.1.0.tgz"
integrity sha512-g/KzbYKbH5C2vPkaXGu8DJlHrGKHLsM25Zg9WuC9pMGfuvT+X25tZQWo5fK1BjBm8+UrVE9LDCvaY0CQk+fXDA==
optionalDependencies:
rxjs "*"
typescript@^5.7.3:
version "5.7.3"
resolved "https://registry.npmjs.org/typescript/-/typescript-5.7.3.tgz"
integrity sha512-84MVSjMEHP+FQRPy3pX9sTVV/INIex71s9TL2Gm5FG/WG1SqXeKyZ0k7/blY/4FdOzI12CBy1vGc4og/eus0fw==
undici-types@~6.20.0:
version "6.20.0"
resolved "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz"
integrity sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==
uuid@^10.0.0:
version "10.0.0"
resolved "https://registry.npmjs.org/uuid/-/uuid-10.0.0.tgz"
integrity sha512-8XkAphELsDnEGrDxUOHB3RGvXz6TeuYSGEZBOjtTtPm2lwhGBjLgOzLHB63IUWfBpNucQjND6d3AOudO+H3RWQ==
"vite@^4 || ^5 || ^6", vite@^6.0.2:
version "6.0.2"
resolved "https://registry.npmjs.org/vite/-/vite-6.0.2.tgz"
integrity sha512-XdQ+VsY2tJpBsKGs0wf3U/+azx8BBpYRHFAyKm5VeEZNOJZRB63q7Sc8Iup3k0TrN3KO6QgyzFf+opSbfY1y0g==
dependencies:
esbuild "^0.24.0"
postcss "^8.4.49"
rollup "^4.23.0"
optionalDependencies:
fsevents "~2.3.3"

View File

@@ -0,0 +1,3 @@
DAILY_API_KEY=
GOOGLE_API_KEY=
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)

View File

@@ -0,0 +1,11 @@
[tool.ruff]
exclude = [".git", "*_pb2.py"]
line-length = 100
[tool.ruff.lint]
select = [
"I", # Import rules
]
[tool.ruff.lint.pydocstyle]
convention = "google"

View File

@@ -0,0 +1,4 @@
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[openai,silero,websocket,google,daily]

View File

@@ -0,0 +1,200 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
from contextlib import asynccontextmanager
from typing import Any, Dict, List
import aiohttp
import uvicorn
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
# Load environment variables
load_dotenv(override=True)
NUMBER_OF_ROOMS = 1
class RoomPool:
"""Manages a pool of pre-created rooms for quick allocation."""
def __init__(self, daily_rest_helper: DailyRESTHelper):
self.daily_rest_helper = daily_rest_helper
self.pool: List[Dict[str, str]] = []
self.lock = asyncio.Lock()
async def fill_pool(self, count: int):
"""Fills the pool with `count` new rooms."""
for _ in range(count):
await self.add_room()
async def add_room(self):
"""Creates a new room and adds it to the pool."""
try:
room = await self.daily_rest_helper.create_room(DailyRoomParams())
if not room.url:
raise HTTPException(status_code=500, detail="Failed to create room")
user_token = await self.daily_rest_helper.get_token(room.url)
if not user_token:
raise HTTPException(status_code=500, detail="Failed to get user token")
bot_token = await self.daily_rest_helper.get_token(room.url)
if not bot_token:
raise HTTPException(status_code=500, detail="Failed to get bot token")
async with self.lock:
self.pool.append(
{"room_url": room.url, "user_token": user_token, "bot_token": bot_token}
)
except Exception as e:
print(f"Error adding room to pool: {e}")
async def get_room(self) -> Dict[str, str]:
"""Retrieves a room from the pool and requests a new one to maintain the size."""
async with self.lock:
if not self.pool:
raise HTTPException(status_code=503, detail="No available rooms")
room = self.pool.pop(0) # Get first available room
# Start a background task to replenish the pool
asyncio.create_task(self.add_room())
return room
async def delete_room(self, room_url: str):
"""Deletes a room when it is not needed anymore"""
await self.daily_rest_helper.delete_room_by_url(room_url)
async def cleanup(self):
for room in self.pool:
room_url = room["room_url"]
await self.delete_room(room_url)
class BotManager:
"""Manages bot subprocesses asynchronously."""
def __init__(self):
self.bot_procs: Dict[int, asyncio.subprocess.Process] = {}
self.room_mappings: Dict[int, str] = {} # Maps process ID to room URL
async def start_bot(self, room_url: str, token: str) -> int:
bot_file = "single_bot"
command = f"python3 -m {bot_file} -u {room_url} -t {token}"
try:
proc = await asyncio.create_subprocess_shell(
command,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
if proc.pid is None:
raise HTTPException(status_code=500, detail="Failed to get subprocess PID")
self.bot_procs[proc.pid] = proc
self.room_mappings[proc.pid] = room_url
# Monitor the process and delete the room when it exits
asyncio.create_task(self._monitor_process(proc.pid))
return proc.pid
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
async def _monitor_process(self, pid: int):
"""Monitors a bot process and deletes the associated room when it exits."""
proc = self.bot_procs.get(pid)
if proc:
await proc.wait() # Wait for the process to exit
room_url = self.room_mappings.pop(pid, None)
if room_url:
await room_pool.delete_room(room_url)
print(f"Deleted room: {room_url}")
del self.bot_procs[pid]
async def cleanup(self):
"""Terminates all running bot processes and deletes associated rooms."""
for pid, proc in list(self.bot_procs.items()):
try:
proc.terminate()
await asyncio.wait_for(proc.wait(), timeout=5)
room_url = self.room_mappings.pop(pid, None)
if room_url:
await room_pool.delete_room(room_url) # Delete room when process terminates
print(f"Deleted room: {room_url}")
except asyncio.TimeoutError:
print(f"Process {pid} did not terminate in time.")
except Exception as e:
print(f"Error terminating process {pid}: {e}")
# Clear remaining mappings
self.bot_procs.clear()
self.room_mappings.clear()
# Global instances
bot_manager = BotManager()
room_pool: RoomPool # Will be initialized in lifespan
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Handles FastAPI startup and shutdown."""
global room_pool
aiohttp_session = aiohttp.ClientSession()
daily_rest_helper = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
room_pool = RoomPool(daily_rest_helper)
await room_pool.fill_pool(NUMBER_OF_ROOMS) # Fill pool on startup
yield # Run app
await bot_manager.cleanup()
await room_pool.cleanup()
await aiohttp_session.close()
# Initialize FastAPI app with lifespan manager
app = FastAPI(lifespan=lifespan)
# Configure CORS to allow requests from any origin
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
@app.post("/connect")
async def bot_connect(request: Request) -> Dict[Any, Any]:
try:
room = await room_pool.get_room()
await bot_manager.start_bot(room["room_url"], room["bot_token"])
except HTTPException as e:
return {"error": str(e)}
return {
"room_url": room["room_url"],
"token": room["user_token"],
}
if __name__ == "__main__":
uvicorn.run(app, host="0.0.0.0", port=7860)

View File

@@ -0,0 +1,120 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import asyncio
import os
import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIProcessor
from pipecat.services.gemini_multimodal_live import GeminiMultimodalLiveLLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
SYSTEM_INSTRUCTION = """
You are Gemini Chatbot, a friendly, helpful robot.
Your goal is to demonstrate your capabilities in a succinct way.
Your output will be converted to audio so don't include special characters in your answers.
Respond to what the user said in a creative and helpful way. Keep your responses brief. One or two sentences at most.
"""
def extract_arguments():
parser = argparse.ArgumentParser(description="Instant Voice Example")
parser.add_argument(
"-u", "--url", type=str, required=True, help="URL of the Daily room to join"
)
parser.add_argument(
"-t", "--token", type=str, required=False, help="Token of the Daily room to join"
)
args, unknown = parser.parse_known_args()
url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
token = args.token
return url, token
async def main():
room_url, token = extract_arguments()
print(f"room_url: {room_url}")
daily_transport = DailyTransport(
room_url,
token,
"Instant voice Chatbot",
DailyParams(
audio_out_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
vad_audio_passthrough=True,
),
)
llm = GeminiMultimodalLiveLLMService(
api_key=os.getenv("GOOGLE_API_KEY"),
voice_id="Puck", # Aoede, Charon, Fenrir, Kore, Puck
transcribe_user_audio=True,
transcribe_model_audio=True,
system_instruction=SYSTEM_INSTRUCTION,
)
context = OpenAILLMContext()
context_aggregator = llm.create_context_aggregator(context)
# RTVI events for Pipecat client UI
rtvi = RTVIProcessor(config=RTVIConfig(config=[]), transport=daily_transport)
pipeline = Pipeline(
[
daily_transport.input(),
context_aggregator.user(),
rtvi,
llm, # LLM
daily_transport.output(),
context_aggregator.assistant(),
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
allow_interruptions=True,
observers=[rtvi.observer()],
),
)
@rtvi.event_handler("on_client_ready")
async def on_client_ready(rtvi):
await rtvi.set_bot_ready()
@daily_transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await task.queue_frames([context_aggregator.user().get_context_frame()])
@daily_transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
print(f"Participant left: {participant}")
await task.cancel()
runner = PipelineRunner(handle_sigint=False)
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,88 @@
# Pipecat Audio Transcription Example 🚀🎙️
Welcome to the **Pipecat Audio Transcription Example**!
This project showcases how to integrate the awesome [pipecat](https://github.com/pipecat-ai/pipecat) library with a neat textual interface (powered by [Textual](https://github.com/Textualize/textual)) to select audio devices and perform real-time speech-to-text (STT) transcription using [Whisper](https://github.com/openai/whisper).
> **Note:** Although the script allows you to select both input and output audio devices, this example only utilizes the audio **input** for transcription.
---
## 🎉 Features
- **Interactive Audio Device Selection:**
Choose your preferred audio input device using a cool, textual UI.
- **State-of-the-Art Transcription:**
Leverage Whisper's large model (running on CUDA) for high-quality, real-time STT.
- **Live Transcription Logging:**
Watch your spoken words transform into text on your console instantly.
- **Easy Setup:**
Everything you need is in the [`requirements.txt`](./requirements.txt).
---
## 🎥 Demo
Get a quick glimpse of the app in action!
*(Don't worry, I'll be adding a GIF demo here soon!)*
![Demo GIF](demo.gif)
---
## 🔧 Installation
Install Dependencies:
```bash
pip install -r requirements.txt
```
---
## 🚀 Usage
Run the main script:
```bash
python bot.py
```
When the app launches, you'll see a textual interface that lets you select your audio input device. Once you have selected a device, the app begins capturing audio and transcribing it with Whisper.
---
## ⚙️ How It Works
1. **LocalAudioTransport:**
Captures audio from your chosen input device.
2. **WhisperSTTService:**
Processes the audio stream using Whisper's large model for speech-to-text conversion.
3. **TranscriptionLogger:**
Logs the transcribed text to the console as soon as it's processed.
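
The three stages above can be sketched without any audio hardware or pipecat installed. This is a minimal illustration of the data flow only: `AudioChunk`, `Transcription`, `FakeSTT`, `TranscriptLogger`, and `run_pipeline` are hypothetical stand-ins, not pipecat classes, and the "transcription" simply decodes bytes as UTF-8 instead of running Whisper.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AudioChunk:
    """Stand-in for a raw audio frame coming from the input device."""

    samples: bytes


@dataclass
class Transcription:
    """Stand-in for a transcription frame produced by the STT stage."""

    text: str


class FakeSTT:
    """Hypothetical STT stage: 'transcribes' by decoding bytes as UTF-8."""

    async def process(self, frame):
        if isinstance(frame, AudioChunk):
            return Transcription(text=frame.samples.decode("utf-8"))
        return frame  # pass through anything that is not audio


class TranscriptLogger:
    """Logging stage: records and prints transcriptions, passes frames on."""

    def __init__(self):
        self.lines = []

    async def process(self, frame):
        if isinstance(frame, Transcription):
            self.lines.append(frame.text)
            print(f"Transcription: {frame.text}")
        return frame


async def run_pipeline(source, stages):
    """Push each frame from `source` through every stage, in order."""
    for frame in source:
        for stage in stages:
            frame = await stage.process(frame)


def demo():
    logger = TranscriptLogger()
    chunks = [AudioChunk(b"hello"), AudioChunk(b"world")]
    asyncio.run(run_pipeline(chunks, [FakeSTT(), logger]))
    return logger.lines
```

In the real example, `LocalAudioTransport`, `WhisperSTTService`, and `TranscriptionLogger` play these roles, and pipecat's `Pipeline` handles the frame passing.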
---
## 📦 Dependencies
The project relies on:
- [pipecat](https://github.com/pipecat-ai/pipecat): For building the audio processing pipeline.
- [Textual](https://github.com/Textualize/textual): For the interactive terminal UI.
- [Whisper](https://github.com/openai/whisper): For state-of-the-art STT transcription.
---
## Example improvements:
I plan to improve this example with local LLM calls and audio output.

View File

@@ -0,0 +1,65 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import sys
from typing import Tuple
from dotenv import load_dotenv
from loguru import logger
from select_audio_device import AudioDevice, run_device_selector
from pipecat.frames.frames import Frame, TranscriptionFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.whisper import Model, WhisperSTTService
from pipecat.transports.local.audio import LocalAudioTransport, LocalAudioTransportParams
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
class TranscriptionLogger(FrameProcessor):
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
async def main(input_device: int, output_device: int):
transport = LocalAudioTransport(
LocalAudioTransportParams(
audio_in_enabled=True,
audio_out_enabled=False,
input_device_index=input_device,
output_device_index=output_device,
)
)
stt = WhisperSTTService(device="cuda", model=Model.LARGE, no_speech_prob=0.3)
tl = TranscriptionLogger()
pipeline = Pipeline([transport.input(), stt, tl])
task = PipelineTask(pipeline)
runner = PipelineRunner(handle_sigint=sys.platform != "win32")
await runner.run(task)
if __name__ == "__main__":
res: Tuple[AudioDevice, AudioDevice, int] = asyncio.run(
run_device_selector() # runs the Textual app that lets the user select the audio devices
)
asyncio.run(main(res[0].index, res[1].index))

Binary file not shown.


View File

@@ -0,0 +1,8 @@
--extra-index-url https://download.pytorch.org/whl/cu124
torch==2.5.0+cu124
torchvision
torchaudio
pipecat-ai[whisper,openai]
textual==1.0.0
pydantic-settings==2.7.1
pyaudio==0.2.14

View File

@@ -0,0 +1,247 @@
from typing import List, Optional, Tuple
import pyaudio
from pydantic import BaseModel, ConfigDict, Field
from pydantic_settings import BaseSettings
from textual.app import App, ComposeResult
from textual.containers import Container
from textual.widgets import Footer, Header, Label, ListItem, ListView, Select
from textual.widgets.option_list import Option
# ─── DATA MODELS ───────────────────────────────────────────────────────────────
class HostApi(BaseModel):
index: int
struct_version: int = Field(..., alias="structVersion")
type: int
name: str
device_count: int = Field(..., alias="deviceCount")
default_input_device: int = Field(..., alias="defaultInputDevice")
default_output_device: int = Field(..., alias="defaultOutputDevice")
class AudioDevice(BaseModel):
model_config = ConfigDict(populate_by_name=True)
index: int
struct_version: int = Field(..., alias="structVersion")
name: str
host_api: int = Field(..., alias="hostApi")
max_input_channels: int = Field(..., alias="maxInputChannels")
max_output_channels: int = Field(..., alias="maxOutputChannels")
default_low_input_latency: float = Field(..., alias="defaultLowInputLatency")
default_low_output_latency: float = Field(..., alias="defaultLowOutputLatency")
default_high_input_latency: float = Field(..., alias="defaultHighInputLatency")
default_high_output_latency: float = Field(..., alias="defaultHighOutputLatency")
default_sample_rate: float = Field(..., alias="defaultSampleRate")
# ─── SETTINGS MODEL ───────────────────────────────────────────────────────────
class AudioSettings(BaseSettings): # to save settings to a file
host_api: Optional[int] = None
input_device: Optional[AudioDevice] = None
output_device: Optional[AudioDevice] = None
class Config:
env_file = "settings.env" # or adjust as needed
def save_to_json(self, filepath: str) -> None:
with open(filepath, "w") as f:
f.write(self.model_dump_json(indent=2))
# ─── TEXTUAL APP ──────────────────────────────────────────────────────────────
class AudioDeviceSelectorApp(App):
CSS = """
Screen {
align: center middle;
}
#container {
width: 80%;
border: round green;
padding: 1 2;
}
"""
def __init__(
self,
default_host_api: Optional[int] = None,
default_input_device: Optional[AudioDevice] = None,
default_output_device: Optional[AudioDevice] = None,
**kwargs,
) -> None:
super().__init__(**kwargs)
# Save defaults passed from settings.
self.default_host_api: Optional[int] = default_host_api
self.default_input_device: Optional[AudioDevice] = default_input_device
self.default_output_device: Optional[AudioDevice] = default_output_device
self.pyaudio_instance = pyaudio.PyAudio()
# Static data structures: host APIs and devices as well-typed models.
self.host_apis: List[HostApi] = []
self.current_host_api: Optional[int] = None
self.all_input_devices: List[AudioDevice] = []
self.all_output_devices: List[AudioDevice] = []
self.input_devices: List[AudioDevice] = []
self.output_devices: List[AudioDevice] = []
# Stage management: first select input, then output.
self.stage: str = "input"
self.selected_input_device: Optional[AudioDevice] = None
self.selected_output_device: Optional[AudioDevice] = None
host_api_count: int = self.pyaudio_instance.get_host_api_count()
for i in range(host_api_count):
raw_api = self.pyaudio_instance.get_host_api_info_by_index(i)
# Inject the index (if not already present)
raw_api["index"] = i
try:
api = HostApi.model_validate(raw_api)
self.host_apis.append(api)
except Exception:
# Skip APIs that don't conform.
continue
def compose(self) -> ComposeResult:
options: List[Tuple[str, Option]] = [
(
api.name,
Option(
prompt=str(api.name) if api.name else f"Host API {api.index}",
id=str(api.index),
),
)
for api in self.host_apis
]
yield Header()
yield Footer()
with Container(id="container"):
yield Label("Select Host API:", id="host-api-label")
# Create the Select widget with the host API options built above.
self.host_api_select: Select[HostApi] = Select(options=options, id="host-api-select")
yield self.host_api_select
self.prompt = Label("Select Input Audio Device:", id="prompt")
yield self.prompt
self.list_view = ListView(id="device-list")
yield self.list_view
def on_mount(self) -> None:
# Host APIs were loaded in __init__ and the dropdown options were built in compose().
self.host_api_select.refresh() # Force a redraw
# Determine the default host API.
if self.default_host_api is not None:
self.current_host_api = self.default_host_api
else:
default_api_info = self.pyaudio_instance.get_default_host_api_info()
self.current_host_api = default_api_info["index"]
# Delay setting the dropdown's value until the widget is fully initialized.
self.set_timer(
0,
lambda: setattr(self.host_api_select, "value", str(self.current_host_api)),
)
# Load all devices and parse them into AudioDevice objects.
device_count: int = self.pyaudio_instance.get_device_count()
for i in range(device_count):
raw_device = self.pyaudio_instance.get_device_info_by_index(i)
raw_device["index"] = i
try:
device = AudioDevice.model_validate(raw_device)
except Exception:
# Skip devices missing required fields.
continue
if device.max_input_channels > 0:
self.all_input_devices.append(device)
if device.max_output_channels > 0:
self.all_output_devices.append(device)
self.filter_devices()
self.populate_list(self.input_devices)
if self.default_input_device:
self._select_default_in_list(self.default_input_device)
def filter_devices(self) -> None:
"""Filter devices based on the selected host API."""
self.input_devices = [
d for d in self.all_input_devices if d.host_api == self.current_host_api
]
self.output_devices = [
d for d in self.all_output_devices if d.host_api == self.current_host_api
]
def populate_list(self, devices: List[AudioDevice]) -> None:
"""Populate the ListView with a list of AudioDevice objects."""
self.list_view.clear()
for dev in devices:
item_text: str = f"{dev.name} (Index: {dev.index})"
item = ListItem(Label(item_text))
# Attach the AudioDevice instance to the widget.
item.device_info = dev # type: ignore
self.list_view.append(item)
def _select_default_in_list(self, default_device: AudioDevice) -> None:
"""Pre-select the default device if present in the current list."""
for idx, item in enumerate(self.list_view.children):
if hasattr(item, "device_info") and item.device_info.index == default_device.index:
self.list_view.index = idx
break
    async def on_select_changed(self, event: Select.Changed) -> None:
        """Handle changes in the host API dropdown."""
        if event.select.id == "host-api-select":
            # The dropdown value is the host API index as a string (it is set
            # from str(self.current_host_api) on mount), so convert it directly.
            self.current_host_api = int(event.value)
            self.filter_devices()
            if self.stage == "input":
                self.populate_list(self.input_devices)
                if self.default_input_device:
                    self._select_default_in_list(self.default_input_device)
            elif self.stage == "output":
                self.populate_list(self.output_devices)
                if self.default_output_device:
                    self._select_default_in_list(self.default_output_device)
    async def on_list_view_selected(self, message: ListView.Selected) -> None:
        """Record device selection and switch stages."""
        selected_item = message.item
        device_info: AudioDevice = selected_item.device_info  # type: ignore
        if self.stage == "input":
            self.selected_input_device = device_info
            self.stage = "output"
            self.prompt.update("Select Output Audio Device:")
            self.populate_list(self.output_devices)
            if self.default_output_device:
                self._select_default_in_list(self.default_output_device)
        elif self.stage == "output":
            self.selected_output_device = device_info
            await self.action_quit()
# ─── HELPER FUNCTIONS ─────────────────────────────────────────────────────────
async def run_device_selector(
    default_host_api: Optional[int] = None,
    default_input_device: Optional[AudioDevice] = None,
    default_output_device: Optional[AudioDevice] = None,
) -> Tuple[AudioDevice, AudioDevice, int]:
    app = AudioDeviceSelectorApp(
        default_host_api=default_host_api,
        default_input_device=default_input_device,
        default_output_device=default_output_device,
    )
    await app.run_async()
    # The current_host_api is guaranteed to be set.
    return app.selected_input_device, app.selected_output_device, app.current_host_api  # type: ignore
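
The filtering and default pre-selection logic in the selector above can be exercised without PyAudio or Textual. The following is a minimal sketch only: `Device`, `split_by_direction`, `filter_by_host_api`, and `default_position` are hypothetical names introduced for illustration and are not part of the commit. It assumes a device record carries the same fields the selector reads (`index`, `name`, `host_api`, and the two channel counts).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for AudioDevice with only the fields used here."""

    index: int
    name: str
    host_api: int
    max_input_channels: int
    max_output_channels: int


def split_by_direction(devices: List[Device]) -> Tuple[List[Device], List[Device]]:
    """Split devices into inputs and outputs; a device may appear in both."""
    inputs = [d for d in devices if d.max_input_channels > 0]
    outputs = [d for d in devices if d.max_output_channels > 0]
    return inputs, outputs


def filter_by_host_api(devices: List[Device], host_api: int) -> List[Device]:
    """Keep only the devices that belong to the selected host API."""
    return [d for d in devices if d.host_api == host_api]


def default_position(devices: List[Device], default: Optional[Device]) -> Optional[int]:
    """Return the list position of the default device, or None if it is absent."""
    if default is None:
        return None
    for pos, dev in enumerate(devices):
        if dev.index == default.index:
            return pos
    return None
```

Matching on `index` rather than on list position mirrors `_select_default_in_list`: after a host API change the filtered list may no longer contain the default device, in which case nothing is pre-selected.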


@@ -13,7 +13,7 @@
"@pipecat-ai/daily-transport": "^0.3.4"
},
"devDependencies": {
"vite": "^6.0.2"
"vite": "^6.0.9"
}
},
"node_modules/@babel/runtime": {
@@ -45,14 +45,13 @@
}
},
"node_modules/@esbuild/aix-ppc64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.24.0.tgz",
"integrity": "sha512-WtKdFM7ls47zkKHFVzMz8opM7LkcsIp9amDUBIAWirg70RM71WRSjdILPsY5Uv1D42ZpUfaPILDlfactHgsRkw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.24.2.tgz",
"integrity": "sha512-thpVCb/rhxE/BnMLQ7GReQLLN8q9qbHmI55F4489/ByVg2aQaQ6kbcLb6FHkocZzQhxc4gx0sCk0tJkKBFzDhA==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"aix"
@@ -62,14 +61,13 @@
}
},
"node_modules/@esbuild/android-arm": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.24.0.tgz",
"integrity": "sha512-arAtTPo76fJ/ICkXWetLCc9EwEHKaeya4vMrReVlEIUCAUncH7M4bhMQ+M9Vf+FFOZJdTNMXNBrWwW+OXWpSew==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.24.2.tgz",
"integrity": "sha512-tmwl4hJkCfNHwFB3nBa8z1Uy3ypZpxqxfTQOcHX+xRByyYgunVbZ9MzUUfb0RxaHIMnbHagwAxuTL+tnNM+1/Q==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -79,14 +77,13 @@
}
},
"node_modules/@esbuild/android-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.24.0.tgz",
"integrity": "sha512-Vsm497xFM7tTIPYK9bNTYJyF/lsP590Qc1WxJdlB6ljCbdZKU9SY8i7+Iin4kyhV/KV5J2rOKsBQbB77Ab7L/w==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.24.2.tgz",
"integrity": "sha512-cNLgeqCqV8WxfcTIOeL4OAtSmL8JjcN6m09XIgro1Wi7cF4t/THaWEa7eL5CMoMBdjoHOTh/vwTO/o2TRXIyzg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -96,14 +93,13 @@
}
},
"node_modules/@esbuild/android-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.24.0.tgz",
"integrity": "sha512-t8GrvnFkiIY7pa7mMgJd7p8p8qqYIz1NYiAoKc75Zyv73L3DZW++oYMSHPRarcotTKuSs6m3hTOa5CKHaS02TQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.24.2.tgz",
"integrity": "sha512-B6Q0YQDqMx9D7rvIcsXfmJfvUYLoP722bgfBlO5cGvNVb5V/+Y7nhBE3mHV9OpxBf4eAS2S68KZztiPaWq4XYw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -113,14 +109,13 @@
}
},
"node_modules/@esbuild/darwin-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.24.0.tgz",
"integrity": "sha512-CKyDpRbK1hXwv79soeTJNHb5EiG6ct3efd/FTPdzOWdbZZfGhpbcqIpiD0+vwmpu0wTIL97ZRPZu8vUt46nBSw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.24.2.tgz",
"integrity": "sha512-kj3AnYWc+CekmZnS5IPu9D+HWtUI49hbnyqk0FLEJDbzCIQt7hg7ucF1SQAilhtYpIujfaHr6O0UHlzzSPdOeA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -130,14 +125,13 @@
}
},
"node_modules/@esbuild/darwin-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.24.0.tgz",
"integrity": "sha512-rgtz6flkVkh58od4PwTRqxbKH9cOjaXCMZgWD905JOzjFKW+7EiUObfd/Kav+A6Gyud6WZk9w+xu6QLytdi2OA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.24.2.tgz",
"integrity": "sha512-WeSrmwwHaPkNR5H3yYfowhZcbriGqooyu3zI/3GGpF8AyUdsrrP0X6KumITGA9WOyiJavnGZUwPGvxvwfWPHIA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -147,14 +141,13 @@
}
},
"node_modules/@esbuild/freebsd-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.24.0.tgz",
"integrity": "sha512-6Mtdq5nHggwfDNLAHkPlyLBpE5L6hwsuXZX8XNmHno9JuL2+bg2BX5tRkwjyfn6sKbxZTq68suOjgWqCicvPXA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.24.2.tgz",
"integrity": "sha512-UN8HXjtJ0k/Mj6a9+5u6+2eZ2ERD7Edt1Q9IZiB5UZAIdPnVKDoG7mdTVGhHJIeEml60JteamR3qhsr1r8gXvg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
@@ -164,14 +157,13 @@
}
},
"node_modules/@esbuild/freebsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.24.0.tgz",
"integrity": "sha512-D3H+xh3/zphoX8ck4S2RxKR6gHlHDXXzOf6f/9dbFt/NRBDIE33+cVa49Kil4WUjxMGW0ZIYBYtaGCa2+OsQwQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.24.2.tgz",
"integrity": "sha512-TvW7wE/89PYW+IevEJXZ5sF6gJRDY/14hyIGFXdIucxCsbRmLUcjseQu1SyTko+2idmCw94TgyaEZi9HUSOe3Q==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
@@ -181,14 +173,13 @@
}
},
"node_modules/@esbuild/linux-arm": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.24.0.tgz",
"integrity": "sha512-gJKIi2IjRo5G6Glxb8d3DzYXlxdEj2NlkixPsqePSZMhLudqPhtZ4BUrpIuTjJYXxvF9njql+vRjB2oaC9XpBw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.24.2.tgz",
"integrity": "sha512-n0WRM/gWIdU29J57hJyUdIsk0WarGd6To0s+Y+LwvlC55wt+GT/OgkwoXCXvIue1i1sSNWblHEig00GBWiJgfA==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -198,14 +189,13 @@
}
},
"node_modules/@esbuild/linux-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.24.0.tgz",
"integrity": "sha512-TDijPXTOeE3eaMkRYpcy3LarIg13dS9wWHRdwYRnzlwlA370rNdZqbcp0WTyyV/k2zSxfko52+C7jU5F9Tfj1g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.24.2.tgz",
"integrity": "sha512-7HnAD6074BW43YvvUmE/35Id9/NB7BeX5EoNkK9obndmZBUk8xmJJeU7DwmUeN7tkysslb2eSl6CTrYz6oEMQg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -215,14 +205,13 @@
}
},
"node_modules/@esbuild/linux-ia32": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.24.0.tgz",
"integrity": "sha512-K40ip1LAcA0byL05TbCQ4yJ4swvnbzHscRmUilrmP9Am7//0UjPreh4lpYzvThT2Quw66MhjG//20mrufm40mA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.24.2.tgz",
"integrity": "sha512-sfv0tGPQhcZOgTKO3oBE9xpHuUqguHvSo4jl+wjnKwFpapx+vUDcawbwPNuBIAYdRAvIDBfZVvXprIj3HA+Ugw==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -232,14 +221,13 @@
}
},
"node_modules/@esbuild/linux-loong64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.24.0.tgz",
"integrity": "sha512-0mswrYP/9ai+CU0BzBfPMZ8RVm3RGAN/lmOMgW4aFUSOQBjA31UP8Mr6DDhWSuMwj7jaWOT0p0WoZ6jeHhrD7g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.24.2.tgz",
"integrity": "sha512-CN9AZr8kEndGooS35ntToZLTQLHEjtVB5n7dl8ZcTZMonJ7CCfStrYhrzF97eAecqVbVJ7APOEe18RPI4KLhwQ==",
"cpu": [
"loong64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -249,14 +237,13 @@
}
},
"node_modules/@esbuild/linux-mips64el": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.24.0.tgz",
"integrity": "sha512-hIKvXm0/3w/5+RDtCJeXqMZGkI2s4oMUGj3/jM0QzhgIASWrGO5/RlzAzm5nNh/awHE0A19h/CvHQe6FaBNrRA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.24.2.tgz",
"integrity": "sha512-iMkk7qr/wl3exJATwkISxI7kTcmHKE+BlymIAbHO8xanq/TjHaaVThFF6ipWzPHryoFsesNQJPE/3wFJw4+huw==",
"cpu": [
"mips64el"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -266,14 +253,13 @@
}
},
"node_modules/@esbuild/linux-ppc64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.24.0.tgz",
"integrity": "sha512-HcZh5BNq0aC52UoocJxaKORfFODWXZxtBaaZNuN3PUX3MoDsChsZqopzi5UupRhPHSEHotoiptqikjN/B77mYQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.24.2.tgz",
"integrity": "sha512-shsVrgCZ57Vr2L8mm39kO5PPIb+843FStGt7sGGoqiiWYconSxwTiuswC1VJZLCjNiMLAMh34jg4VSEQb+iEbw==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -283,14 +269,13 @@
}
},
"node_modules/@esbuild/linux-riscv64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.24.0.tgz",
"integrity": "sha512-bEh7dMn/h3QxeR2KTy1DUszQjUrIHPZKyO6aN1X4BCnhfYhuQqedHaa5MxSQA/06j3GpiIlFGSsy1c7Gf9padw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.24.2.tgz",
"integrity": "sha512-4eSFWnU9Hhd68fW16GD0TINewo1L6dRrB+oLNNbYyMUAeOD2yCK5KXGK1GH4qD/kT+bTEXjsyTCiJGHPZ3eM9Q==",
"cpu": [
"riscv64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -300,14 +285,13 @@
}
},
"node_modules/@esbuild/linux-s390x": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.24.0.tgz",
"integrity": "sha512-ZcQ6+qRkw1UcZGPyrCiHHkmBaj9SiCD8Oqd556HldP+QlpUIe2Wgn3ehQGVoPOvZvtHm8HPx+bH20c9pvbkX3g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.24.2.tgz",
"integrity": "sha512-S0Bh0A53b0YHL2XEXC20bHLuGMOhFDO6GN4b3YjRLK//Ep3ql3erpNcPlEFed93hsQAjAQDNsvcK+hV90FubSw==",
"cpu": [
"s390x"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -317,14 +301,13 @@
}
},
"node_modules/@esbuild/linux-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.24.0.tgz",
"integrity": "sha512-vbutsFqQ+foy3wSSbmjBXXIJ6PL3scghJoM8zCL142cGaZKAdCZHyf+Bpu/MmX9zT9Q0zFBVKb36Ma5Fzfa8xA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.24.2.tgz",
"integrity": "sha512-8Qi4nQcCTbLnK9WoMjdC9NiTG6/E38RNICU6sUNqK0QFxCYgoARqVqxdFmWkdonVsvGqWhmm7MO0jyTqLqwj0Q==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -333,15 +316,30 @@
"node": ">=18"
}
},
"node_modules/@esbuild/netbsd-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.24.2.tgz",
"integrity": "sha512-wuLK/VztRRpMt9zyHSazyCVdCXlpHkKm34WUyinD2lzK07FAHTq0KQvZZlXikNWkDGoT6x3TD51jKQ7gMVpopw==",
"cpu": [
"arm64"
],
"dev": true,
"optional": true,
"os": [
"netbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/netbsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.24.0.tgz",
"integrity": "sha512-hjQ0R/ulkO8fCYFsG0FZoH+pWgTTDreqpqY7UnQntnaKv95uP5iW3+dChxnx7C3trQQU40S+OgWhUVwCjVFLvg==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.24.2.tgz",
"integrity": "sha512-VefFaQUc4FMmJuAxmIHgUmfNiLXY438XrL4GDNV1Y1H/RW3qow68xTwjZKfj/+Plp9NANmzbH5R40Meudu8mmw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"netbsd"
@@ -351,14 +349,13 @@
}
},
"node_modules/@esbuild/openbsd-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.24.0.tgz",
"integrity": "sha512-MD9uzzkPQbYehwcN583yx3Tu5M8EIoTD+tUgKF982WYL9Pf5rKy9ltgD0eUgs8pvKnmizxjXZyLt0z6DC3rRXg==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.24.2.tgz",
"integrity": "sha512-YQbi46SBct6iKnszhSvdluqDmxCJA+Pu280Av9WICNwQmMxV7nLRHZfjQzwbPs3jeWnuAhE9Jy0NrnJ12Oz+0A==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
@@ -368,14 +365,13 @@
}
},
"node_modules/@esbuild/openbsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.24.0.tgz",
"integrity": "sha512-4ir0aY1NGUhIC1hdoCzr1+5b43mw99uNwVzhIq1OY3QcEwPDO3B7WNXBzaKY5Nsf1+N11i1eOfFcq+D/gOS15Q==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.24.2.tgz",
"integrity": "sha512-+iDS6zpNM6EnJyWv0bMGLWSWeXGN/HTaF/LXHXHwejGsVi+ooqDfMCCTerNFxEkM3wYVcExkeGXNqshc9iMaOA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
@@ -385,14 +381,13 @@
}
},
"node_modules/@esbuild/sunos-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.24.0.tgz",
"integrity": "sha512-jVzdzsbM5xrotH+W5f1s+JtUy1UWgjU0Cf4wMvffTB8m6wP5/kx0KiaLHlbJO+dMgtxKV8RQ/JvtlFcdZ1zCPA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.24.2.tgz",
"integrity": "sha512-hTdsW27jcktEvpwNHJU4ZwWFGkz2zRJUz8pvddmXPtXDzVKTTINmlmga3ZzwcuMpUvLw7JkLy9QLKyGpD2Yxig==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"sunos"
@@ -402,14 +397,13 @@
}
},
"node_modules/@esbuild/win32-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.24.0.tgz",
"integrity": "sha512-iKc8GAslzRpBytO2/aN3d2yb2z8XTVfNV0PjGlCxKo5SgWmNXx82I/Q3aG1tFfS+A2igVCY97TJ8tnYwpUWLCA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.24.2.tgz",
"integrity": "sha512-LihEQ2BBKVFLOC9ZItT9iFprsE9tqjDjnbulhHoFxYQtQfai7qfluVODIYxt1PgdoyQkz23+01rzwNwYfutxUQ==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -419,14 +413,13 @@
}
},
"node_modules/@esbuild/win32-ia32": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.24.0.tgz",
"integrity": "sha512-vQW36KZolfIudCcTnaTpmLQ24Ha1RjygBo39/aLkM2kmjkWmZGEJ5Gn9l5/7tzXA42QGIoWbICfg6KLLkIw6yw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.24.2.tgz",
"integrity": "sha512-q+iGUwfs8tncmFC9pcnD5IvRHAzmbwQ3GPS5/ceCyHdjXubwQWI12MKWSNSMYLJMq23/IUCvJMS76PDqXe1fxA==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -436,14 +429,13 @@
}
},
"node_modules/@esbuild/win32-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.24.0.tgz",
"integrity": "sha512-7IAFPrjSQIJrGsK6flwg7NFmwBoSTyF3rl7If0hNUFQU4ilTsEPL6GuMuU9BfIWVVGuRnuIidkSMC+c0Otu8IA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.24.2.tgz",
"integrity": "sha512-7VTgWzgMGvup6aSqDPLiW5zHaxYJGTO4OokMjIlrCtf+VpEL+cXKtCvg723iguPYI5oaUNdS+/V7OU2gvXVWEg==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -478,252 +470,247 @@
}
},
"node_modules/@rollup/rollup-android-arm-eabi": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.28.0.tgz",
"integrity": "sha512-wLJuPLT6grGZsy34g4N1yRfYeouklTgPhH1gWXCYspenKYD0s3cR99ZevOGw5BexMNywkbV3UkjADisozBmpPQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.34.6.tgz",
"integrity": "sha512-+GcCXtOQoWuC7hhX1P00LqjjIiS/iOouHXhMdiDSnq/1DGTox4SpUvO52Xm+div6+106r+TcvOeo/cxvyEyTgg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
]
},
"node_modules/@rollup/rollup-android-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.28.0.tgz",
"integrity": "sha512-eiNkznlo0dLmVG/6wf+Ifi/v78G4d4QxRhuUl+s8EWZpDewgk7PX3ZyECUXU0Zq/Ca+8nU8cQpNC4Xgn2gFNDA==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.34.6.tgz",
"integrity": "sha512-E8+2qCIjciYUnCa1AiVF1BkRgqIGW9KzJeesQqVfyRITGQN+dFuoivO0hnro1DjT74wXLRZ7QF8MIbz+luGaJA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
]
},
"node_modules/@rollup/rollup-darwin-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.28.0.tgz",
"integrity": "sha512-lmKx9yHsppblnLQZOGxdO66gT77bvdBtr/0P+TPOseowE7D9AJoBw8ZDULRasXRWf1Z86/gcOdpBrV6VDUY36Q==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.34.6.tgz",
"integrity": "sha512-z9Ib+OzqN3DZEjX7PDQMHEhtF+t6Mi2z/ueChQPLS/qUMKY7Ybn5A2ggFoKRNRh1q1T03YTQfBTQCJZiepESAg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@rollup/rollup-darwin-x64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.28.0.tgz",
"integrity": "sha512-8hxgfReVs7k9Js1uAIhS6zq3I+wKQETInnWQtgzt8JfGx51R1N6DRVy3F4o0lQwumbErRz52YqwjfvuwRxGv1w==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.34.6.tgz",
"integrity": "sha512-PShKVY4u0FDAR7jskyFIYVyHEPCPnIQY8s5OcXkdU8mz3Y7eXDJPdyM/ZWjkYdR2m0izD9HHWA8sGcXn+Qrsyg==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@rollup/rollup-freebsd-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.28.0.tgz",
"integrity": "sha512-lA1zZB3bFx5oxu9fYud4+g1mt+lYXCoch0M0V/xhqLoGatbzVse0wlSQ1UYOWKpuSu3gyN4qEc0Dxf/DII1bhQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.34.6.tgz",
"integrity": "sha512-YSwyOqlDAdKqs0iKuqvRHLN4SrD2TiswfoLfvYXseKbL47ht1grQpq46MSiQAx6rQEN8o8URtpXARCpqabqxGQ==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
]
},
"node_modules/@rollup/rollup-freebsd-x64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.28.0.tgz",
"integrity": "sha512-aI2plavbUDjCQB/sRbeUZWX9qp12GfYkYSJOrdYTL/C5D53bsE2/nBPuoiJKoWp5SN78v2Vr8ZPnB+/VbQ2pFA==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.34.6.tgz",
"integrity": "sha512-HEP4CgPAY1RxXwwL5sPFv6BBM3tVeLnshF03HMhJYCNc6kvSqBgTMmsEjb72RkZBAWIqiPUyF1JpEBv5XT9wKQ==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
]
},
"node_modules/@rollup/rollup-linux-arm-gnueabihf": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.28.0.tgz",
"integrity": "sha512-WXveUPKtfqtaNvpf0iOb0M6xC64GzUX/OowbqfiCSXTdi/jLlOmH0Ba94/OkiY2yTGTwteo4/dsHRfh5bDCZ+w==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.34.6.tgz",
"integrity": "sha512-88fSzjC5xeH9S2Vg3rPgXJULkHcLYMkh8faix8DX4h4TIAL65ekwuQMA/g2CXq8W+NJC43V6fUpYZNjaX3+IIg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm-musleabihf": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.28.0.tgz",
"integrity": "sha512-yLc3O2NtOQR67lI79zsSc7lk31xjwcaocvdD1twL64PK1yNaIqCeWI9L5B4MFPAVGEVjH5k1oWSGuYX1Wutxpg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.34.6.tgz",
"integrity": "sha512-wM4ztnutBqYFyvNeR7Av+reWI/enK9tDOTKNF+6Kk2Q96k9bwhDDOlnCUNRPvromlVXo04riSliMBs/Z7RteEg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.28.0.tgz",
"integrity": "sha512-+P9G9hjEpHucHRXqesY+3X9hD2wh0iNnJXX/QhS/J5vTdG6VhNYMxJ2rJkQOxRUd17u5mbMLHM7yWGZdAASfcg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.34.6.tgz",
"integrity": "sha512-9RyprECbRa9zEjXLtvvshhw4CMrRa3K+0wcp3KME0zmBe1ILmvcVHnypZ/aIDXpRyfhSYSuN4EPdCCj5Du8FIA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm64-musl": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.28.0.tgz",
"integrity": "sha512-1xsm2rCKSTpKzi5/ypT5wfc+4bOGa/9yI/eaOLW0oMs7qpC542APWhl4A37AENGZ6St6GBMWhCCMM6tXgTIplw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.34.6.tgz",
"integrity": "sha512-qTmklhCTyaJSB05S+iSovfo++EwnIEZxHkzv5dep4qoszUMX5Ca4WM4zAVUMbfdviLgCSQOu5oU8YoGk1s6M9Q==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-loongarch64-gnu": {
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loongarch64-gnu/-/rollup-linux-loongarch64-gnu-4.34.6.tgz",
"integrity": "sha512-4Qmkaps9yqmpjY5pvpkfOerYgKNUGzQpFxV6rnS7c/JfYbDSU0y6WpbbredB5cCpLFGJEqYX40WUmxMkwhWCjw==",
"cpu": [
"loong64"
],
"dev": true,
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-powerpc64le-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.28.0.tgz",
"integrity": "sha512-zgWxMq8neVQeXL+ouSf6S7DoNeo6EPgi1eeqHXVKQxqPy1B2NvTbaOUWPn/7CfMKL7xvhV0/+fq/Z/J69g1WAQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.34.6.tgz",
"integrity": "sha512-Zsrtux3PuaxuBTX/zHdLaFmcofWGzaWW1scwLU3ZbW/X+hSsFbz9wDIp6XvnT7pzYRl9MezWqEqKy7ssmDEnuQ==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-riscv64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.28.0.tgz",
"integrity": "sha512-VEdVYacLniRxbRJLNtzwGt5vwS0ycYshofI7cWAfj7Vg5asqj+pt+Q6x4n+AONSZW/kVm+5nklde0qs2EUwU2g==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.34.6.tgz",
"integrity": "sha512-aK+Zp+CRM55iPrlyKiU3/zyhgzWBxLVrw2mwiQSYJRobCURb781+XstzvA8Gkjg/hbdQFuDw44aUOxVQFycrAg==",
"cpu": [
"riscv64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-s390x-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.28.0.tgz",
"integrity": "sha512-LQlP5t2hcDJh8HV8RELD9/xlYtEzJkm/aWGsauvdO2ulfl3QYRjqrKW+mGAIWP5kdNCBheqqqYIGElSRCaXfpw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.34.6.tgz",
"integrity": "sha512-WoKLVrY9ogmaYPXwTH326+ErlCIgMmsoRSx6bO+l68YgJnlOXhygDYSZe/qbUJCSiCiZAQ+tKm88NcWuUXqOzw==",
"cpu": [
"s390x"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-x64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.28.0.tgz",
"integrity": "sha512-Nl4KIzteVEKE9BdAvYoTkW19pa7LR/RBrT6F1dJCV/3pbjwDcaOq+edkP0LXuJ9kflW/xOK414X78r+K84+msw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.34.6.tgz",
"integrity": "sha512-Sht4aFvmA4ToHd2vFzwMFaQCiYm2lDFho5rPcvPBT5pCdC+GwHG6CMch4GQfmWTQ1SwRKS0dhDYb54khSrjDWw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-x64-musl": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.28.0.tgz",
"integrity": "sha512-eKpJr4vBDOi4goT75MvW+0dXcNUqisK4jvibY9vDdlgLx+yekxSm55StsHbxUsRxSTt3JEQvlr3cGDkzcSP8bw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.34.6.tgz",
"integrity": "sha512-zmmpOQh8vXc2QITsnCiODCDGXFC8LMi64+/oPpPx5qz3pqv0s6x46ps4xoycfUiVZps5PFn1gksZzo4RGTKT+A==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-win32-arm64-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.28.0.tgz",
"integrity": "sha512-Vi+WR62xWGsE/Oj+mD0FNAPY2MEox3cfyG0zLpotZdehPFXwz6lypkGs5y38Jd/NVSbOD02aVad6q6QYF7i8Bg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.34.6.tgz",
"integrity": "sha512-3/q1qUsO/tLqGBaD4uXsB6coVGB3usxw3qyeVb59aArCgedSF66MPdgRStUd7vbZOsko/CgVaY5fo2vkvPLWiA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
]
},
"node_modules/@rollup/rollup-win32-ia32-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.28.0.tgz",
"integrity": "sha512-kN/Vpip8emMLn/eOza+4JwqDZBL6MPNpkdaEsgUtW1NYN3DZvZqSQrbKzJcTL6hd8YNmFTn7XGWMwccOcJBL0A==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.34.6.tgz",
"integrity": "sha512-oLHxuyywc6efdKVTxvc0135zPrRdtYVjtVD5GUm55I3ODxhU/PwkQFD97z16Xzxa1Fz0AEe4W/2hzRtd+IfpOA==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
]
},
"node_modules/@rollup/rollup-win32-x64-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.28.0.tgz",
"integrity": "sha512-Bvno2/aZT6usSa7lRDL2+hMjVAGjuqaymF1ApZm31JXzniR/hvr14jpU+/z4X6Gt5BPlzosscyJZGUvguXIqeQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.34.6.tgz",
"integrity": "sha512-0PVwmgzZ8+TZ9oGBmdZoQVXflbvuwzN/HRclujpl4N/q3i+y0lqLw8n1bXA8ru3sApDjlmONaNAuYr38y1Kr9w==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -808,8 +795,7 @@
"version": "1.0.6",
"resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.6.tgz",
"integrity": "sha512-AYnb1nQyY49te+VRAVgmzfcgjYS91mY5P0TKUDCLEM+gNnA+3T6rWITXRLYCpahpqSQbN5cE+gHpnPyXjHWxcw==",
"dev": true,
"license": "MIT"
"dev": true
},
"node_modules/@types/events": {
"version": "3.0.3",
@@ -847,12 +833,11 @@
}
},
"node_modules/esbuild": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.24.0.tgz",
"integrity": "sha512-FuLPevChGDshgSicjisSooU0cemp/sGXR841D5LHMB7mTVOmsEHcAxaH3irL53+8YDIeVNQEySh4DaYU/iuPqQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.24.2.tgz",
"integrity": "sha512-+9egpBW8I3CD5XPe0n6BfT5fxLzxrlDzqydF3aviG+9ni1lDC/OvMHcxqEFV0+LANZG5R1bFMWfUrjVsdwxJvA==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"bin": {
"esbuild": "bin/esbuild"
},
@@ -860,30 +845,31 @@
"node": ">=18"
},
"optionalDependencies": {
"@esbuild/aix-ppc64": "0.24.0",
"@esbuild/android-arm": "0.24.0",
"@esbuild/android-arm64": "0.24.0",
"@esbuild/android-x64": "0.24.0",
"@esbuild/darwin-arm64": "0.24.0",
"@esbuild/darwin-x64": "0.24.0",
"@esbuild/freebsd-arm64": "0.24.0",
"@esbuild/freebsd-x64": "0.24.0",
"@esbuild/linux-arm": "0.24.0",
"@esbuild/linux-arm64": "0.24.0",
"@esbuild/linux-ia32": "0.24.0",
"@esbuild/linux-loong64": "0.24.0",
"@esbuild/linux-mips64el": "0.24.0",
"@esbuild/linux-ppc64": "0.24.0",
"@esbuild/linux-riscv64": "0.24.0",
"@esbuild/linux-s390x": "0.24.0",
"@esbuild/linux-x64": "0.24.0",
"@esbuild/netbsd-x64": "0.24.0",
"@esbuild/openbsd-arm64": "0.24.0",
"@esbuild/openbsd-x64": "0.24.0",
"@esbuild/sunos-x64": "0.24.0",
"@esbuild/win32-arm64": "0.24.0",
"@esbuild/win32-ia32": "0.24.0",
"@esbuild/win32-x64": "0.24.0"
"@esbuild/aix-ppc64": "0.24.2",
"@esbuild/android-arm": "0.24.2",
"@esbuild/android-arm64": "0.24.2",
"@esbuild/android-x64": "0.24.2",
"@esbuild/darwin-arm64": "0.24.2",
"@esbuild/darwin-x64": "0.24.2",
"@esbuild/freebsd-arm64": "0.24.2",
"@esbuild/freebsd-x64": "0.24.2",
"@esbuild/linux-arm": "0.24.2",
"@esbuild/linux-arm64": "0.24.2",
"@esbuild/linux-ia32": "0.24.2",
"@esbuild/linux-loong64": "0.24.2",
"@esbuild/linux-mips64el": "0.24.2",
"@esbuild/linux-ppc64": "0.24.2",
"@esbuild/linux-riscv64": "0.24.2",
"@esbuild/linux-s390x": "0.24.2",
"@esbuild/linux-x64": "0.24.2",
"@esbuild/netbsd-arm64": "0.24.2",
"@esbuild/netbsd-x64": "0.24.2",
"@esbuild/openbsd-arm64": "0.24.2",
"@esbuild/openbsd-x64": "0.24.2",
"@esbuild/sunos-x64": "0.24.2",
"@esbuild/win32-arm64": "0.24.2",
"@esbuild/win32-ia32": "0.24.2",
"@esbuild/win32-x64": "0.24.2"
}
},
"node_modules/events": {
@@ -901,7 +887,6 @@
"integrity": "sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -951,7 +936,6 @@
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"bin": {
"nanoid": "bin/nanoid.cjs"
},
@@ -963,13 +947,12 @@
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/picocolors/-/picocolors-1.1.1.tgz",
"integrity": "sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==",
"dev": true,
"license": "ISC"
"dev": true
},
"node_modules/postcss": {
"version": "8.4.49",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.4.49.tgz",
"integrity": "sha512-OCVPnIObs4N29kxTjzLfUryOkvZEq+pf8jTF0lg8E7uETuWHA+v7j3c/xJmiqpX450191LlmZfUKkXxkTry7nA==",
"version": "8.5.2",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.2.tgz",
"integrity": "sha512-MjOadfU3Ys9KYoX0AdkBlFEF1Vx37uCCeN4ZHnmwm9FfpbsGWMZeBLMmmpY+6Ocqod7mkdZ0DT31OlbsFrLlkA==",
"dev": true,
"funding": [
{
@@ -985,9 +968,8 @@
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"dependencies": {
"nanoid": "^3.3.7",
"nanoid": "^3.3.8",
"picocolors": "^1.1.1",
"source-map-js": "^1.2.1"
},
@@ -1002,11 +984,10 @@
"license": "MIT"
},
"node_modules/rollup": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/rollup/-/rollup-4.28.0.tgz",
"integrity": "sha512-G9GOrmgWHBma4YfCcX8PjH0qhXSdH8B4HDE2o4/jaxj93S4DPCIDoLcXz99eWMji4hB29UFCEd7B2gwGJDR9cQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/rollup/-/rollup-4.34.6.tgz",
"integrity": "sha512-wc2cBWqJgkU3Iz5oztRkQbfVkbxoz5EhnCGOrnJvnLnQ7O0WhQUYyv18qQI79O8L7DdHrrlJNeCHd4VGpnaXKQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"@types/estree": "1.0.6"
},
@@ -1018,24 +999,25 @@
"npm": ">=8.0.0"
},
"optionalDependencies": {
"@rollup/rollup-android-arm-eabi": "4.28.0",
"@rollup/rollup-android-arm64": "4.28.0",
"@rollup/rollup-darwin-arm64": "4.28.0",
"@rollup/rollup-darwin-x64": "4.28.0",
"@rollup/rollup-freebsd-arm64": "4.28.0",
"@rollup/rollup-freebsd-x64": "4.28.0",
"@rollup/rollup-linux-arm-gnueabihf": "4.28.0",
"@rollup/rollup-linux-arm-musleabihf": "4.28.0",
"@rollup/rollup-linux-arm64-gnu": "4.28.0",
"@rollup/rollup-linux-arm64-musl": "4.28.0",
"@rollup/rollup-linux-powerpc64le-gnu": "4.28.0",
"@rollup/rollup-linux-riscv64-gnu": "4.28.0",
"@rollup/rollup-linux-s390x-gnu": "4.28.0",
"@rollup/rollup-linux-x64-gnu": "4.28.0",
"@rollup/rollup-linux-x64-musl": "4.28.0",
"@rollup/rollup-win32-arm64-msvc": "4.28.0",
"@rollup/rollup-win32-ia32-msvc": "4.28.0",
"@rollup/rollup-win32-x64-msvc": "4.28.0",
"@rollup/rollup-android-arm-eabi": "4.34.6",
"@rollup/rollup-android-arm64": "4.34.6",
"@rollup/rollup-darwin-arm64": "4.34.6",
"@rollup/rollup-darwin-x64": "4.34.6",
"@rollup/rollup-freebsd-arm64": "4.34.6",
"@rollup/rollup-freebsd-x64": "4.34.6",
"@rollup/rollup-linux-arm-gnueabihf": "4.34.6",
"@rollup/rollup-linux-arm-musleabihf": "4.34.6",
"@rollup/rollup-linux-arm64-gnu": "4.34.6",
"@rollup/rollup-linux-arm64-musl": "4.34.6",
"@rollup/rollup-linux-loongarch64-gnu": "4.34.6",
"@rollup/rollup-linux-powerpc64le-gnu": "4.34.6",
"@rollup/rollup-linux-riscv64-gnu": "4.34.6",
"@rollup/rollup-linux-s390x-gnu": "4.34.6",
"@rollup/rollup-linux-x64-gnu": "4.34.6",
"@rollup/rollup-linux-x64-musl": "4.34.6",
"@rollup/rollup-win32-arm64-msvc": "4.34.6",
"@rollup/rollup-win32-ia32-msvc": "4.34.6",
"@rollup/rollup-win32-x64-msvc": "4.34.6",
"fsevents": "~2.3.2"
}
},
@@ -1066,7 +1048,6 @@
"resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.1.tgz",
"integrity": "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==",
"dev": true,
"license": "BSD-3-Clause",
"engines": {
"node": ">=0.10.0"
}
@@ -1101,15 +1082,14 @@
}
},
"node_modules/vite": {
"version": "6.0.2",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.0.2.tgz",
"integrity": "sha512-XdQ+VsY2tJpBsKGs0wf3U/+azx8BBpYRHFAyKm5VeEZNOJZRB63q7Sc8Iup3k0TrN3KO6QgyzFf+opSbfY1y0g==",
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.1.0.tgz",
"integrity": "sha512-RjjMipCKVoR4hVfPY6GQTgveinjNuyLw+qruksLDvA5ktI1150VmcMBKmQaEWJhg/j6Uaf6dNCNA0AfdzUb/hQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"esbuild": "^0.24.0",
"postcss": "^8.4.49",
"rollup": "^4.23.0"
"esbuild": "^0.24.2",
"postcss": "^8.5.1",
"rollup": "^4.30.1"
},
"bin": {
"vite": "bin/vite.js"


@@ -12,7 +12,7 @@
"license": "ISC",
"description": "",
"devDependencies": {
"vite": "^6.0.2"
"vite": "^6.0.9"
},
"dependencies": {
"@pipecat-ai/client-js": "^0.3.2",


@@ -23,7 +23,7 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIProcessor
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.deepgram import DeepgramSTTService
from pipecat.services.google import GoogleLLMService, LLMSearchResponseFrame
from pipecat.services.google import GoogleLLMService, GoogleRTVIObserver, LLMSearchResponseFrame
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat.utils.text.markdown_text_filter import MarkdownTextFilter
@@ -102,6 +102,7 @@ async def main():
llm = GoogleLLMService(
api_key=os.getenv("GOOGLE_API_KEY"),
model="gemini-1.5-flash-002",
system_instruction=system_instruction,
tools=tools,
)
@@ -141,7 +142,7 @@ async def main():
pipeline,
PipelineParams(
allow_interruptions=True,
observers=[rtvi.observer()],
observers=[GoogleRTVIObserver(rtvi)],
),
)


@@ -13,7 +13,7 @@
"@pipecat-ai/daily-transport": "^0.3.4"
},
"devDependencies": {
"vite": "^6.0.2"
"vite": "^6.0.9"
}
},
"node_modules/@babel/runtime": {
@@ -45,14 +45,13 @@
}
},
"node_modules/@esbuild/aix-ppc64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.24.0.tgz",
"integrity": "sha512-WtKdFM7ls47zkKHFVzMz8opM7LkcsIp9amDUBIAWirg70RM71WRSjdILPsY5Uv1D42ZpUfaPILDlfactHgsRkw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.24.2.tgz",
"integrity": "sha512-thpVCb/rhxE/BnMLQ7GReQLLN8q9qbHmI55F4489/ByVg2aQaQ6kbcLb6FHkocZzQhxc4gx0sCk0tJkKBFzDhA==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"aix"
@@ -62,14 +61,13 @@
}
},
"node_modules/@esbuild/android-arm": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.24.0.tgz",
"integrity": "sha512-arAtTPo76fJ/ICkXWetLCc9EwEHKaeya4vMrReVlEIUCAUncH7M4bhMQ+M9Vf+FFOZJdTNMXNBrWwW+OXWpSew==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.24.2.tgz",
"integrity": "sha512-tmwl4hJkCfNHwFB3nBa8z1Uy3ypZpxqxfTQOcHX+xRByyYgunVbZ9MzUUfb0RxaHIMnbHagwAxuTL+tnNM+1/Q==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -79,14 +77,13 @@
}
},
"node_modules/@esbuild/android-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.24.0.tgz",
"integrity": "sha512-Vsm497xFM7tTIPYK9bNTYJyF/lsP590Qc1WxJdlB6ljCbdZKU9SY8i7+Iin4kyhV/KV5J2rOKsBQbB77Ab7L/w==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.24.2.tgz",
"integrity": "sha512-cNLgeqCqV8WxfcTIOeL4OAtSmL8JjcN6m09XIgro1Wi7cF4t/THaWEa7eL5CMoMBdjoHOTh/vwTO/o2TRXIyzg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -96,14 +93,13 @@
}
},
"node_modules/@esbuild/android-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.24.0.tgz",
"integrity": "sha512-t8GrvnFkiIY7pa7mMgJd7p8p8qqYIz1NYiAoKc75Zyv73L3DZW++oYMSHPRarcotTKuSs6m3hTOa5CKHaS02TQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.24.2.tgz",
"integrity": "sha512-B6Q0YQDqMx9D7rvIcsXfmJfvUYLoP722bgfBlO5cGvNVb5V/+Y7nhBE3mHV9OpxBf4eAS2S68KZztiPaWq4XYw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -113,14 +109,13 @@
}
},
"node_modules/@esbuild/darwin-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.24.0.tgz",
"integrity": "sha512-CKyDpRbK1hXwv79soeTJNHb5EiG6ct3efd/FTPdzOWdbZZfGhpbcqIpiD0+vwmpu0wTIL97ZRPZu8vUt46nBSw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.24.2.tgz",
"integrity": "sha512-kj3AnYWc+CekmZnS5IPu9D+HWtUI49hbnyqk0FLEJDbzCIQt7hg7ucF1SQAilhtYpIujfaHr6O0UHlzzSPdOeA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -130,14 +125,13 @@
}
},
"node_modules/@esbuild/darwin-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.24.0.tgz",
"integrity": "sha512-rgtz6flkVkh58od4PwTRqxbKH9cOjaXCMZgWD905JOzjFKW+7EiUObfd/Kav+A6Gyud6WZk9w+xu6QLytdi2OA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.24.2.tgz",
"integrity": "sha512-WeSrmwwHaPkNR5H3yYfowhZcbriGqooyu3zI/3GGpF8AyUdsrrP0X6KumITGA9WOyiJavnGZUwPGvxvwfWPHIA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -147,14 +141,13 @@
}
},
"node_modules/@esbuild/freebsd-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.24.0.tgz",
"integrity": "sha512-6Mtdq5nHggwfDNLAHkPlyLBpE5L6hwsuXZX8XNmHno9JuL2+bg2BX5tRkwjyfn6sKbxZTq68suOjgWqCicvPXA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.24.2.tgz",
"integrity": "sha512-UN8HXjtJ0k/Mj6a9+5u6+2eZ2ERD7Edt1Q9IZiB5UZAIdPnVKDoG7mdTVGhHJIeEml60JteamR3qhsr1r8gXvg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
@@ -164,14 +157,13 @@
}
},
"node_modules/@esbuild/freebsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.24.0.tgz",
"integrity": "sha512-D3H+xh3/zphoX8ck4S2RxKR6gHlHDXXzOf6f/9dbFt/NRBDIE33+cVa49Kil4WUjxMGW0ZIYBYtaGCa2+OsQwQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.24.2.tgz",
"integrity": "sha512-TvW7wE/89PYW+IevEJXZ5sF6gJRDY/14hyIGFXdIucxCsbRmLUcjseQu1SyTko+2idmCw94TgyaEZi9HUSOe3Q==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
@@ -181,14 +173,13 @@
}
},
"node_modules/@esbuild/linux-arm": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.24.0.tgz",
"integrity": "sha512-gJKIi2IjRo5G6Glxb8d3DzYXlxdEj2NlkixPsqePSZMhLudqPhtZ4BUrpIuTjJYXxvF9njql+vRjB2oaC9XpBw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.24.2.tgz",
"integrity": "sha512-n0WRM/gWIdU29J57hJyUdIsk0WarGd6To0s+Y+LwvlC55wt+GT/OgkwoXCXvIue1i1sSNWblHEig00GBWiJgfA==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -198,14 +189,13 @@
}
},
"node_modules/@esbuild/linux-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.24.0.tgz",
"integrity": "sha512-TDijPXTOeE3eaMkRYpcy3LarIg13dS9wWHRdwYRnzlwlA370rNdZqbcp0WTyyV/k2zSxfko52+C7jU5F9Tfj1g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.24.2.tgz",
"integrity": "sha512-7HnAD6074BW43YvvUmE/35Id9/NB7BeX5EoNkK9obndmZBUk8xmJJeU7DwmUeN7tkysslb2eSl6CTrYz6oEMQg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -215,14 +205,13 @@
}
},
"node_modules/@esbuild/linux-ia32": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.24.0.tgz",
"integrity": "sha512-K40ip1LAcA0byL05TbCQ4yJ4swvnbzHscRmUilrmP9Am7//0UjPreh4lpYzvThT2Quw66MhjG//20mrufm40mA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.24.2.tgz",
"integrity": "sha512-sfv0tGPQhcZOgTKO3oBE9xpHuUqguHvSo4jl+wjnKwFpapx+vUDcawbwPNuBIAYdRAvIDBfZVvXprIj3HA+Ugw==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -232,14 +221,13 @@
}
},
"node_modules/@esbuild/linux-loong64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.24.0.tgz",
"integrity": "sha512-0mswrYP/9ai+CU0BzBfPMZ8RVm3RGAN/lmOMgW4aFUSOQBjA31UP8Mr6DDhWSuMwj7jaWOT0p0WoZ6jeHhrD7g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.24.2.tgz",
"integrity": "sha512-CN9AZr8kEndGooS35ntToZLTQLHEjtVB5n7dl8ZcTZMonJ7CCfStrYhrzF97eAecqVbVJ7APOEe18RPI4KLhwQ==",
"cpu": [
"loong64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -249,14 +237,13 @@
}
},
"node_modules/@esbuild/linux-mips64el": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.24.0.tgz",
"integrity": "sha512-hIKvXm0/3w/5+RDtCJeXqMZGkI2s4oMUGj3/jM0QzhgIASWrGO5/RlzAzm5nNh/awHE0A19h/CvHQe6FaBNrRA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.24.2.tgz",
"integrity": "sha512-iMkk7qr/wl3exJATwkISxI7kTcmHKE+BlymIAbHO8xanq/TjHaaVThFF6ipWzPHryoFsesNQJPE/3wFJw4+huw==",
"cpu": [
"mips64el"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -266,14 +253,13 @@
}
},
"node_modules/@esbuild/linux-ppc64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.24.0.tgz",
"integrity": "sha512-HcZh5BNq0aC52UoocJxaKORfFODWXZxtBaaZNuN3PUX3MoDsChsZqopzi5UupRhPHSEHotoiptqikjN/B77mYQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.24.2.tgz",
"integrity": "sha512-shsVrgCZ57Vr2L8mm39kO5PPIb+843FStGt7sGGoqiiWYconSxwTiuswC1VJZLCjNiMLAMh34jg4VSEQb+iEbw==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -283,14 +269,13 @@
}
},
"node_modules/@esbuild/linux-riscv64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.24.0.tgz",
"integrity": "sha512-bEh7dMn/h3QxeR2KTy1DUszQjUrIHPZKyO6aN1X4BCnhfYhuQqedHaa5MxSQA/06j3GpiIlFGSsy1c7Gf9padw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.24.2.tgz",
"integrity": "sha512-4eSFWnU9Hhd68fW16GD0TINewo1L6dRrB+oLNNbYyMUAeOD2yCK5KXGK1GH4qD/kT+bTEXjsyTCiJGHPZ3eM9Q==",
"cpu": [
"riscv64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -300,14 +285,13 @@
}
},
"node_modules/@esbuild/linux-s390x": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.24.0.tgz",
"integrity": "sha512-ZcQ6+qRkw1UcZGPyrCiHHkmBaj9SiCD8Oqd556HldP+QlpUIe2Wgn3ehQGVoPOvZvtHm8HPx+bH20c9pvbkX3g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.24.2.tgz",
"integrity": "sha512-S0Bh0A53b0YHL2XEXC20bHLuGMOhFDO6GN4b3YjRLK//Ep3ql3erpNcPlEFed93hsQAjAQDNsvcK+hV90FubSw==",
"cpu": [
"s390x"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -317,14 +301,13 @@
}
},
"node_modules/@esbuild/linux-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.24.0.tgz",
"integrity": "sha512-vbutsFqQ+foy3wSSbmjBXXIJ6PL3scghJoM8zCL142cGaZKAdCZHyf+Bpu/MmX9zT9Q0zFBVKb36Ma5Fzfa8xA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.24.2.tgz",
"integrity": "sha512-8Qi4nQcCTbLnK9WoMjdC9NiTG6/E38RNICU6sUNqK0QFxCYgoARqVqxdFmWkdonVsvGqWhmm7MO0jyTqLqwj0Q==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -333,15 +316,30 @@
"node": ">=18"
}
},
"node_modules/@esbuild/netbsd-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.24.2.tgz",
"integrity": "sha512-wuLK/VztRRpMt9zyHSazyCVdCXlpHkKm34WUyinD2lzK07FAHTq0KQvZZlXikNWkDGoT6x3TD51jKQ7gMVpopw==",
"cpu": [
"arm64"
],
"dev": true,
"optional": true,
"os": [
"netbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/netbsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.24.0.tgz",
"integrity": "sha512-hjQ0R/ulkO8fCYFsG0FZoH+pWgTTDreqpqY7UnQntnaKv95uP5iW3+dChxnx7C3trQQU40S+OgWhUVwCjVFLvg==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.24.2.tgz",
"integrity": "sha512-VefFaQUc4FMmJuAxmIHgUmfNiLXY438XrL4GDNV1Y1H/RW3qow68xTwjZKfj/+Plp9NANmzbH5R40Meudu8mmw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"netbsd"
@@ -351,14 +349,13 @@
}
},
"node_modules/@esbuild/openbsd-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.24.0.tgz",
"integrity": "sha512-MD9uzzkPQbYehwcN583yx3Tu5M8EIoTD+tUgKF982WYL9Pf5rKy9ltgD0eUgs8pvKnmizxjXZyLt0z6DC3rRXg==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.24.2.tgz",
"integrity": "sha512-YQbi46SBct6iKnszhSvdluqDmxCJA+Pu280Av9WICNwQmMxV7nLRHZfjQzwbPs3jeWnuAhE9Jy0NrnJ12Oz+0A==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
@@ -368,14 +365,13 @@
}
},
"node_modules/@esbuild/openbsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.24.0.tgz",
"integrity": "sha512-4ir0aY1NGUhIC1hdoCzr1+5b43mw99uNwVzhIq1OY3QcEwPDO3B7WNXBzaKY5Nsf1+N11i1eOfFcq+D/gOS15Q==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.24.2.tgz",
"integrity": "sha512-+iDS6zpNM6EnJyWv0bMGLWSWeXGN/HTaF/LXHXHwejGsVi+ooqDfMCCTerNFxEkM3wYVcExkeGXNqshc9iMaOA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
@@ -385,14 +381,13 @@
}
},
"node_modules/@esbuild/sunos-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.24.0.tgz",
"integrity": "sha512-jVzdzsbM5xrotH+W5f1s+JtUy1UWgjU0Cf4wMvffTB8m6wP5/kx0KiaLHlbJO+dMgtxKV8RQ/JvtlFcdZ1zCPA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.24.2.tgz",
"integrity": "sha512-hTdsW27jcktEvpwNHJU4ZwWFGkz2zRJUz8pvddmXPtXDzVKTTINmlmga3ZzwcuMpUvLw7JkLy9QLKyGpD2Yxig==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"sunos"
@@ -402,14 +397,13 @@
}
},
"node_modules/@esbuild/win32-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.24.0.tgz",
"integrity": "sha512-iKc8GAslzRpBytO2/aN3d2yb2z8XTVfNV0PjGlCxKo5SgWmNXx82I/Q3aG1tFfS+A2igVCY97TJ8tnYwpUWLCA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.24.2.tgz",
"integrity": "sha512-LihEQ2BBKVFLOC9ZItT9iFprsE9tqjDjnbulhHoFxYQtQfai7qfluVODIYxt1PgdoyQkz23+01rzwNwYfutxUQ==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -419,14 +413,13 @@
}
},
"node_modules/@esbuild/win32-ia32": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.24.0.tgz",
"integrity": "sha512-vQW36KZolfIudCcTnaTpmLQ24Ha1RjygBo39/aLkM2kmjkWmZGEJ5Gn9l5/7tzXA42QGIoWbICfg6KLLkIw6yw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.24.2.tgz",
"integrity": "sha512-q+iGUwfs8tncmFC9pcnD5IvRHAzmbwQ3GPS5/ceCyHdjXubwQWI12MKWSNSMYLJMq23/IUCvJMS76PDqXe1fxA==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -436,14 +429,13 @@
}
},
"node_modules/@esbuild/win32-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.24.0.tgz",
"integrity": "sha512-7IAFPrjSQIJrGsK6flwg7NFmwBoSTyF3rl7If0hNUFQU4ilTsEPL6GuMuU9BfIWVVGuRnuIidkSMC+c0Otu8IA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.24.2.tgz",
"integrity": "sha512-7VTgWzgMGvup6aSqDPLiW5zHaxYJGTO4OokMjIlrCtf+VpEL+cXKtCvg723iguPYI5oaUNdS+/V7OU2gvXVWEg==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -478,252 +470,247 @@
}
},
"node_modules/@rollup/rollup-android-arm-eabi": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.28.0.tgz",
"integrity": "sha512-wLJuPLT6grGZsy34g4N1yRfYeouklTgPhH1gWXCYspenKYD0s3cR99ZevOGw5BexMNywkbV3UkjADisozBmpPQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.34.6.tgz",
"integrity": "sha512-+GcCXtOQoWuC7hhX1P00LqjjIiS/iOouHXhMdiDSnq/1DGTox4SpUvO52Xm+div6+106r+TcvOeo/cxvyEyTgg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
]
},
"node_modules/@rollup/rollup-android-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.28.0.tgz",
"integrity": "sha512-eiNkznlo0dLmVG/6wf+Ifi/v78G4d4QxRhuUl+s8EWZpDewgk7PX3ZyECUXU0Zq/Ca+8nU8cQpNC4Xgn2gFNDA==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.34.6.tgz",
"integrity": "sha512-E8+2qCIjciYUnCa1AiVF1BkRgqIGW9KzJeesQqVfyRITGQN+dFuoivO0hnro1DjT74wXLRZ7QF8MIbz+luGaJA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
]
},
"node_modules/@rollup/rollup-darwin-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.28.0.tgz",
"integrity": "sha512-lmKx9yHsppblnLQZOGxdO66gT77bvdBtr/0P+TPOseowE7D9AJoBw8ZDULRasXRWf1Z86/gcOdpBrV6VDUY36Q==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.34.6.tgz",
"integrity": "sha512-z9Ib+OzqN3DZEjX7PDQMHEhtF+t6Mi2z/ueChQPLS/qUMKY7Ybn5A2ggFoKRNRh1q1T03YTQfBTQCJZiepESAg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@rollup/rollup-darwin-x64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.28.0.tgz",
"integrity": "sha512-8hxgfReVs7k9Js1uAIhS6zq3I+wKQETInnWQtgzt8JfGx51R1N6DRVy3F4o0lQwumbErRz52YqwjfvuwRxGv1w==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.34.6.tgz",
"integrity": "sha512-PShKVY4u0FDAR7jskyFIYVyHEPCPnIQY8s5OcXkdU8mz3Y7eXDJPdyM/ZWjkYdR2m0izD9HHWA8sGcXn+Qrsyg==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@rollup/rollup-freebsd-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.28.0.tgz",
"integrity": "sha512-lA1zZB3bFx5oxu9fYud4+g1mt+lYXCoch0M0V/xhqLoGatbzVse0wlSQ1UYOWKpuSu3gyN4qEc0Dxf/DII1bhQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.34.6.tgz",
"integrity": "sha512-YSwyOqlDAdKqs0iKuqvRHLN4SrD2TiswfoLfvYXseKbL47ht1grQpq46MSiQAx6rQEN8o8URtpXARCpqabqxGQ==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
]
},
"node_modules/@rollup/rollup-freebsd-x64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.28.0.tgz",
"integrity": "sha512-aI2plavbUDjCQB/sRbeUZWX9qp12GfYkYSJOrdYTL/C5D53bsE2/nBPuoiJKoWp5SN78v2Vr8ZPnB+/VbQ2pFA==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.34.6.tgz",
"integrity": "sha512-HEP4CgPAY1RxXwwL5sPFv6BBM3tVeLnshF03HMhJYCNc6kvSqBgTMmsEjb72RkZBAWIqiPUyF1JpEBv5XT9wKQ==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
]
},
"node_modules/@rollup/rollup-linux-arm-gnueabihf": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.28.0.tgz",
"integrity": "sha512-WXveUPKtfqtaNvpf0iOb0M6xC64GzUX/OowbqfiCSXTdi/jLlOmH0Ba94/OkiY2yTGTwteo4/dsHRfh5bDCZ+w==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.34.6.tgz",
"integrity": "sha512-88fSzjC5xeH9S2Vg3rPgXJULkHcLYMkh8faix8DX4h4TIAL65ekwuQMA/g2CXq8W+NJC43V6fUpYZNjaX3+IIg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm-musleabihf": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.28.0.tgz",
"integrity": "sha512-yLc3O2NtOQR67lI79zsSc7lk31xjwcaocvdD1twL64PK1yNaIqCeWI9L5B4MFPAVGEVjH5k1oWSGuYX1Wutxpg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.34.6.tgz",
"integrity": "sha512-wM4ztnutBqYFyvNeR7Av+reWI/enK9tDOTKNF+6Kk2Q96k9bwhDDOlnCUNRPvromlVXo04riSliMBs/Z7RteEg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.28.0.tgz",
"integrity": "sha512-+P9G9hjEpHucHRXqesY+3X9hD2wh0iNnJXX/QhS/J5vTdG6VhNYMxJ2rJkQOxRUd17u5mbMLHM7yWGZdAASfcg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.34.6.tgz",
"integrity": "sha512-9RyprECbRa9zEjXLtvvshhw4CMrRa3K+0wcp3KME0zmBe1ILmvcVHnypZ/aIDXpRyfhSYSuN4EPdCCj5Du8FIA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm64-musl": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.28.0.tgz",
"integrity": "sha512-1xsm2rCKSTpKzi5/ypT5wfc+4bOGa/9yI/eaOLW0oMs7qpC542APWhl4A37AENGZ6St6GBMWhCCMM6tXgTIplw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.34.6.tgz",
"integrity": "sha512-qTmklhCTyaJSB05S+iSovfo++EwnIEZxHkzv5dep4qoszUMX5Ca4WM4zAVUMbfdviLgCSQOu5oU8YoGk1s6M9Q==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-loongarch64-gnu": {
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loongarch64-gnu/-/rollup-linux-loongarch64-gnu-4.34.6.tgz",
"integrity": "sha512-4Qmkaps9yqmpjY5pvpkfOerYgKNUGzQpFxV6rnS7c/JfYbDSU0y6WpbbredB5cCpLFGJEqYX40WUmxMkwhWCjw==",
"cpu": [
"loong64"
],
"dev": true,
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-powerpc64le-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.28.0.tgz",
"integrity": "sha512-zgWxMq8neVQeXL+ouSf6S7DoNeo6EPgi1eeqHXVKQxqPy1B2NvTbaOUWPn/7CfMKL7xvhV0/+fq/Z/J69g1WAQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.34.6.tgz",
"integrity": "sha512-Zsrtux3PuaxuBTX/zHdLaFmcofWGzaWW1scwLU3ZbW/X+hSsFbz9wDIp6XvnT7pzYRl9MezWqEqKy7ssmDEnuQ==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-riscv64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.28.0.tgz",
"integrity": "sha512-VEdVYacLniRxbRJLNtzwGt5vwS0ycYshofI7cWAfj7Vg5asqj+pt+Q6x4n+AONSZW/kVm+5nklde0qs2EUwU2g==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.34.6.tgz",
"integrity": "sha512-aK+Zp+CRM55iPrlyKiU3/zyhgzWBxLVrw2mwiQSYJRobCURb781+XstzvA8Gkjg/hbdQFuDw44aUOxVQFycrAg==",
"cpu": [
"riscv64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-s390x-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.28.0.tgz",
"integrity": "sha512-LQlP5t2hcDJh8HV8RELD9/xlYtEzJkm/aWGsauvdO2ulfl3QYRjqrKW+mGAIWP5kdNCBheqqqYIGElSRCaXfpw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.34.6.tgz",
"integrity": "sha512-WoKLVrY9ogmaYPXwTH326+ErlCIgMmsoRSx6bO+l68YgJnlOXhygDYSZe/qbUJCSiCiZAQ+tKm88NcWuUXqOzw==",
"cpu": [
"s390x"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-x64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.28.0.tgz",
"integrity": "sha512-Nl4KIzteVEKE9BdAvYoTkW19pa7LR/RBrT6F1dJCV/3pbjwDcaOq+edkP0LXuJ9kflW/xOK414X78r+K84+msw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.34.6.tgz",
"integrity": "sha512-Sht4aFvmA4ToHd2vFzwMFaQCiYm2lDFho5rPcvPBT5pCdC+GwHG6CMch4GQfmWTQ1SwRKS0dhDYb54khSrjDWw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-x64-musl": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.28.0.tgz",
"integrity": "sha512-eKpJr4vBDOi4goT75MvW+0dXcNUqisK4jvibY9vDdlgLx+yekxSm55StsHbxUsRxSTt3JEQvlr3cGDkzcSP8bw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.34.6.tgz",
"integrity": "sha512-zmmpOQh8vXc2QITsnCiODCDGXFC8LMi64+/oPpPx5qz3pqv0s6x46ps4xoycfUiVZps5PFn1gksZzo4RGTKT+A==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-win32-arm64-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.28.0.tgz",
"integrity": "sha512-Vi+WR62xWGsE/Oj+mD0FNAPY2MEox3cfyG0zLpotZdehPFXwz6lypkGs5y38Jd/NVSbOD02aVad6q6QYF7i8Bg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.34.6.tgz",
"integrity": "sha512-3/q1qUsO/tLqGBaD4uXsB6coVGB3usxw3qyeVb59aArCgedSF66MPdgRStUd7vbZOsko/CgVaY5fo2vkvPLWiA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
]
},
"node_modules/@rollup/rollup-win32-ia32-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.28.0.tgz",
"integrity": "sha512-kN/Vpip8emMLn/eOza+4JwqDZBL6MPNpkdaEsgUtW1NYN3DZvZqSQrbKzJcTL6hd8YNmFTn7XGWMwccOcJBL0A==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.34.6.tgz",
"integrity": "sha512-oLHxuyywc6efdKVTxvc0135zPrRdtYVjtVD5GUm55I3ODxhU/PwkQFD97z16Xzxa1Fz0AEe4W/2hzRtd+IfpOA==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
]
},
"node_modules/@rollup/rollup-win32-x64-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.28.0.tgz",
"integrity": "sha512-Bvno2/aZT6usSa7lRDL2+hMjVAGjuqaymF1ApZm31JXzniR/hvr14jpU+/z4X6Gt5BPlzosscyJZGUvguXIqeQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.34.6.tgz",
"integrity": "sha512-0PVwmgzZ8+TZ9oGBmdZoQVXflbvuwzN/HRclujpl4N/q3i+y0lqLw8n1bXA8ru3sApDjlmONaNAuYr38y1Kr9w==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -808,8 +795,7 @@
"version": "1.0.6",
"resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.6.tgz",
"integrity": "sha512-AYnb1nQyY49te+VRAVgmzfcgjYS91mY5P0TKUDCLEM+gNnA+3T6rWITXRLYCpahpqSQbN5cE+gHpnPyXjHWxcw==",
"dev": true,
"license": "MIT"
"dev": true
},
"node_modules/@types/events": {
"version": "3.0.3",
@@ -847,12 +833,11 @@
}
},
"node_modules/esbuild": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.24.0.tgz",
"integrity": "sha512-FuLPevChGDshgSicjisSooU0cemp/sGXR841D5LHMB7mTVOmsEHcAxaH3irL53+8YDIeVNQEySh4DaYU/iuPqQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.24.2.tgz",
"integrity": "sha512-+9egpBW8I3CD5XPe0n6BfT5fxLzxrlDzqydF3aviG+9ni1lDC/OvMHcxqEFV0+LANZG5R1bFMWfUrjVsdwxJvA==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"bin": {
"esbuild": "bin/esbuild"
},
@@ -860,30 +845,31 @@
"node": ">=18"
},
"optionalDependencies": {
"@esbuild/aix-ppc64": "0.24.0",
"@esbuild/android-arm": "0.24.0",
"@esbuild/android-arm64": "0.24.0",
"@esbuild/android-x64": "0.24.0",
"@esbuild/darwin-arm64": "0.24.0",
"@esbuild/darwin-x64": "0.24.0",
"@esbuild/freebsd-arm64": "0.24.0",
"@esbuild/freebsd-x64": "0.24.0",
"@esbuild/linux-arm": "0.24.0",
"@esbuild/linux-arm64": "0.24.0",
"@esbuild/linux-ia32": "0.24.0",
"@esbuild/linux-loong64": "0.24.0",
"@esbuild/linux-mips64el": "0.24.0",
"@esbuild/linux-ppc64": "0.24.0",
"@esbuild/linux-riscv64": "0.24.0",
"@esbuild/linux-s390x": "0.24.0",
"@esbuild/linux-x64": "0.24.0",
"@esbuild/netbsd-x64": "0.24.0",
"@esbuild/openbsd-arm64": "0.24.0",
"@esbuild/openbsd-x64": "0.24.0",
"@esbuild/sunos-x64": "0.24.0",
"@esbuild/win32-arm64": "0.24.0",
"@esbuild/win32-ia32": "0.24.0",
"@esbuild/win32-x64": "0.24.0"
"@esbuild/aix-ppc64": "0.24.2",
"@esbuild/android-arm": "0.24.2",
"@esbuild/android-arm64": "0.24.2",
"@esbuild/android-x64": "0.24.2",
"@esbuild/darwin-arm64": "0.24.2",
"@esbuild/darwin-x64": "0.24.2",
"@esbuild/freebsd-arm64": "0.24.2",
"@esbuild/freebsd-x64": "0.24.2",
"@esbuild/linux-arm": "0.24.2",
"@esbuild/linux-arm64": "0.24.2",
"@esbuild/linux-ia32": "0.24.2",
"@esbuild/linux-loong64": "0.24.2",
"@esbuild/linux-mips64el": "0.24.2",
"@esbuild/linux-ppc64": "0.24.2",
"@esbuild/linux-riscv64": "0.24.2",
"@esbuild/linux-s390x": "0.24.2",
"@esbuild/linux-x64": "0.24.2",
"@esbuild/netbsd-arm64": "0.24.2",
"@esbuild/netbsd-x64": "0.24.2",
"@esbuild/openbsd-arm64": "0.24.2",
"@esbuild/openbsd-x64": "0.24.2",
"@esbuild/sunos-x64": "0.24.2",
"@esbuild/win32-arm64": "0.24.2",
"@esbuild/win32-ia32": "0.24.2",
"@esbuild/win32-x64": "0.24.2"
}
},
"node_modules/events": {
@@ -901,7 +887,6 @@
"integrity": "sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -951,7 +936,6 @@
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"bin": {
"nanoid": "bin/nanoid.cjs"
},
@@ -963,13 +947,12 @@
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/picocolors/-/picocolors-1.1.1.tgz",
"integrity": "sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==",
"dev": true,
"license": "ISC"
"dev": true
},
"node_modules/postcss": {
"version": "8.4.49",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.4.49.tgz",
"integrity": "sha512-OCVPnIObs4N29kxTjzLfUryOkvZEq+pf8jTF0lg8E7uETuWHA+v7j3c/xJmiqpX450191LlmZfUKkXxkTry7nA==",
"version": "8.5.2",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.2.tgz",
"integrity": "sha512-MjOadfU3Ys9KYoX0AdkBlFEF1Vx37uCCeN4ZHnmwm9FfpbsGWMZeBLMmmpY+6Ocqod7mkdZ0DT31OlbsFrLlkA==",
"dev": true,
"funding": [
{
@@ -985,9 +968,8 @@
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"dependencies": {
"nanoid": "^3.3.7",
"nanoid": "^3.3.8",
"picocolors": "^1.1.1",
"source-map-js": "^1.2.1"
},
@@ -1002,11 +984,10 @@
"license": "MIT"
},
"node_modules/rollup": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/rollup/-/rollup-4.28.0.tgz",
"integrity": "sha512-G9GOrmgWHBma4YfCcX8PjH0qhXSdH8B4HDE2o4/jaxj93S4DPCIDoLcXz99eWMji4hB29UFCEd7B2gwGJDR9cQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/rollup/-/rollup-4.34.6.tgz",
"integrity": "sha512-wc2cBWqJgkU3Iz5oztRkQbfVkbxoz5EhnCGOrnJvnLnQ7O0WhQUYyv18qQI79O8L7DdHrrlJNeCHd4VGpnaXKQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"@types/estree": "1.0.6"
},
@@ -1018,24 +999,25 @@
"npm": ">=8.0.0"
},
"optionalDependencies": {
"@rollup/rollup-android-arm-eabi": "4.28.0",
"@rollup/rollup-android-arm64": "4.28.0",
"@rollup/rollup-darwin-arm64": "4.28.0",
"@rollup/rollup-darwin-x64": "4.28.0",
"@rollup/rollup-freebsd-arm64": "4.28.0",
"@rollup/rollup-freebsd-x64": "4.28.0",
"@rollup/rollup-linux-arm-gnueabihf": "4.28.0",
"@rollup/rollup-linux-arm-musleabihf": "4.28.0",
"@rollup/rollup-linux-arm64-gnu": "4.28.0",
"@rollup/rollup-linux-arm64-musl": "4.28.0",
"@rollup/rollup-linux-powerpc64le-gnu": "4.28.0",
"@rollup/rollup-linux-riscv64-gnu": "4.28.0",
"@rollup/rollup-linux-s390x-gnu": "4.28.0",
"@rollup/rollup-linux-x64-gnu": "4.28.0",
"@rollup/rollup-linux-x64-musl": "4.28.0",
"@rollup/rollup-win32-arm64-msvc": "4.28.0",
"@rollup/rollup-win32-ia32-msvc": "4.28.0",
"@rollup/rollup-win32-x64-msvc": "4.28.0",
"@rollup/rollup-android-arm-eabi": "4.34.6",
"@rollup/rollup-android-arm64": "4.34.6",
"@rollup/rollup-darwin-arm64": "4.34.6",
"@rollup/rollup-darwin-x64": "4.34.6",
"@rollup/rollup-freebsd-arm64": "4.34.6",
"@rollup/rollup-freebsd-x64": "4.34.6",
"@rollup/rollup-linux-arm-gnueabihf": "4.34.6",
"@rollup/rollup-linux-arm-musleabihf": "4.34.6",
"@rollup/rollup-linux-arm64-gnu": "4.34.6",
"@rollup/rollup-linux-arm64-musl": "4.34.6",
"@rollup/rollup-linux-loongarch64-gnu": "4.34.6",
"@rollup/rollup-linux-powerpc64le-gnu": "4.34.6",
"@rollup/rollup-linux-riscv64-gnu": "4.34.6",
"@rollup/rollup-linux-s390x-gnu": "4.34.6",
"@rollup/rollup-linux-x64-gnu": "4.34.6",
"@rollup/rollup-linux-x64-musl": "4.34.6",
"@rollup/rollup-win32-arm64-msvc": "4.34.6",
"@rollup/rollup-win32-ia32-msvc": "4.34.6",
"@rollup/rollup-win32-x64-msvc": "4.34.6",
"fsevents": "~2.3.2"
}
},
@@ -1066,7 +1048,6 @@
"resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.1.tgz",
"integrity": "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==",
"dev": true,
"license": "BSD-3-Clause",
"engines": {
"node": ">=0.10.0"
}
@@ -1101,15 +1082,14 @@
}
},
"node_modules/vite": {
"version": "6.0.2",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.0.2.tgz",
"integrity": "sha512-XdQ+VsY2tJpBsKGs0wf3U/+azx8BBpYRHFAyKm5VeEZNOJZRB63q7Sc8Iup3k0TrN3KO6QgyzFf+opSbfY1y0g==",
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.1.0.tgz",
"integrity": "sha512-RjjMipCKVoR4hVfPY6GQTgveinjNuyLw+qruksLDvA5ktI1150VmcMBKmQaEWJhg/j6Uaf6dNCNA0AfdzUb/hQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"esbuild": "^0.24.0",
"postcss": "^8.4.49",
"rollup": "^4.23.0"
"esbuild": "^0.24.2",
"postcss": "^8.5.1",
"rollup": "^4.30.1"
},
"bin": {
"vite": "bin/vite.js"

@@ -12,7 +12,7 @@
"license": "ISC",
"description": "",
"devDependencies": {
"vite": "^6.0.2"
"vite": "^6.0.9"
},
"dependencies": {
"@pipecat-ai/client-js": "^0.3.2",

@@ -25,7 +25,7 @@
"globals": "^15.12.0",
"typescript": "~5.6.2",
"typescript-eslint": "^8.15.0",
"vite": "^6.0.1"
"vite": "^6.0.9"
}
},
"node_modules/@ampproject/remapping": {
@@ -353,14 +353,13 @@
}
},
"node_modules/@esbuild/aix-ppc64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.24.0.tgz",
"integrity": "sha512-WtKdFM7ls47zkKHFVzMz8opM7LkcsIp9amDUBIAWirg70RM71WRSjdILPsY5Uv1D42ZpUfaPILDlfactHgsRkw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.24.2.tgz",
"integrity": "sha512-thpVCb/rhxE/BnMLQ7GReQLLN8q9qbHmI55F4489/ByVg2aQaQ6kbcLb6FHkocZzQhxc4gx0sCk0tJkKBFzDhA==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"aix"
@@ -370,14 +369,13 @@
}
},
"node_modules/@esbuild/android-arm": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.24.0.tgz",
"integrity": "sha512-arAtTPo76fJ/ICkXWetLCc9EwEHKaeya4vMrReVlEIUCAUncH7M4bhMQ+M9Vf+FFOZJdTNMXNBrWwW+OXWpSew==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.24.2.tgz",
"integrity": "sha512-tmwl4hJkCfNHwFB3nBa8z1Uy3ypZpxqxfTQOcHX+xRByyYgunVbZ9MzUUfb0RxaHIMnbHagwAxuTL+tnNM+1/Q==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -387,14 +385,13 @@
}
},
"node_modules/@esbuild/android-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.24.0.tgz",
"integrity": "sha512-Vsm497xFM7tTIPYK9bNTYJyF/lsP590Qc1WxJdlB6ljCbdZKU9SY8i7+Iin4kyhV/KV5J2rOKsBQbB77Ab7L/w==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.24.2.tgz",
"integrity": "sha512-cNLgeqCqV8WxfcTIOeL4OAtSmL8JjcN6m09XIgro1Wi7cF4t/THaWEa7eL5CMoMBdjoHOTh/vwTO/o2TRXIyzg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -404,14 +401,13 @@
}
},
"node_modules/@esbuild/android-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.24.0.tgz",
"integrity": "sha512-t8GrvnFkiIY7pa7mMgJd7p8p8qqYIz1NYiAoKc75Zyv73L3DZW++oYMSHPRarcotTKuSs6m3hTOa5CKHaS02TQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.24.2.tgz",
"integrity": "sha512-B6Q0YQDqMx9D7rvIcsXfmJfvUYLoP722bgfBlO5cGvNVb5V/+Y7nhBE3mHV9OpxBf4eAS2S68KZztiPaWq4XYw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
@@ -421,14 +417,13 @@
}
},
"node_modules/@esbuild/darwin-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.24.0.tgz",
"integrity": "sha512-CKyDpRbK1hXwv79soeTJNHb5EiG6ct3efd/FTPdzOWdbZZfGhpbcqIpiD0+vwmpu0wTIL97ZRPZu8vUt46nBSw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.24.2.tgz",
"integrity": "sha512-kj3AnYWc+CekmZnS5IPu9D+HWtUI49hbnyqk0FLEJDbzCIQt7hg7ucF1SQAilhtYpIujfaHr6O0UHlzzSPdOeA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -438,14 +433,13 @@
}
},
"node_modules/@esbuild/darwin-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.24.0.tgz",
"integrity": "sha512-rgtz6flkVkh58od4PwTRqxbKH9cOjaXCMZgWD905JOzjFKW+7EiUObfd/Kav+A6Gyud6WZk9w+xu6QLytdi2OA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.24.2.tgz",
"integrity": "sha512-WeSrmwwHaPkNR5H3yYfowhZcbriGqooyu3zI/3GGpF8AyUdsrrP0X6KumITGA9WOyiJavnGZUwPGvxvwfWPHIA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -455,14 +449,13 @@
}
},
"node_modules/@esbuild/freebsd-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.24.0.tgz",
"integrity": "sha512-6Mtdq5nHggwfDNLAHkPlyLBpE5L6hwsuXZX8XNmHno9JuL2+bg2BX5tRkwjyfn6sKbxZTq68suOjgWqCicvPXA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.24.2.tgz",
"integrity": "sha512-UN8HXjtJ0k/Mj6a9+5u6+2eZ2ERD7Edt1Q9IZiB5UZAIdPnVKDoG7mdTVGhHJIeEml60JteamR3qhsr1r8gXvg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
@@ -472,14 +465,13 @@
}
},
"node_modules/@esbuild/freebsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.24.0.tgz",
"integrity": "sha512-D3H+xh3/zphoX8ck4S2RxKR6gHlHDXXzOf6f/9dbFt/NRBDIE33+cVa49Kil4WUjxMGW0ZIYBYtaGCa2+OsQwQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.24.2.tgz",
"integrity": "sha512-TvW7wE/89PYW+IevEJXZ5sF6gJRDY/14hyIGFXdIucxCsbRmLUcjseQu1SyTko+2idmCw94TgyaEZi9HUSOe3Q==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
@@ -489,14 +481,13 @@
}
},
"node_modules/@esbuild/linux-arm": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.24.0.tgz",
"integrity": "sha512-gJKIi2IjRo5G6Glxb8d3DzYXlxdEj2NlkixPsqePSZMhLudqPhtZ4BUrpIuTjJYXxvF9njql+vRjB2oaC9XpBw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.24.2.tgz",
"integrity": "sha512-n0WRM/gWIdU29J57hJyUdIsk0WarGd6To0s+Y+LwvlC55wt+GT/OgkwoXCXvIue1i1sSNWblHEig00GBWiJgfA==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -506,14 +497,13 @@
}
},
"node_modules/@esbuild/linux-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.24.0.tgz",
"integrity": "sha512-TDijPXTOeE3eaMkRYpcy3LarIg13dS9wWHRdwYRnzlwlA370rNdZqbcp0WTyyV/k2zSxfko52+C7jU5F9Tfj1g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.24.2.tgz",
"integrity": "sha512-7HnAD6074BW43YvvUmE/35Id9/NB7BeX5EoNkK9obndmZBUk8xmJJeU7DwmUeN7tkysslb2eSl6CTrYz6oEMQg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -523,14 +513,13 @@
}
},
"node_modules/@esbuild/linux-ia32": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.24.0.tgz",
"integrity": "sha512-K40ip1LAcA0byL05TbCQ4yJ4swvnbzHscRmUilrmP9Am7//0UjPreh4lpYzvThT2Quw66MhjG//20mrufm40mA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.24.2.tgz",
"integrity": "sha512-sfv0tGPQhcZOgTKO3oBE9xpHuUqguHvSo4jl+wjnKwFpapx+vUDcawbwPNuBIAYdRAvIDBfZVvXprIj3HA+Ugw==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -540,14 +529,13 @@
}
},
"node_modules/@esbuild/linux-loong64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.24.0.tgz",
"integrity": "sha512-0mswrYP/9ai+CU0BzBfPMZ8RVm3RGAN/lmOMgW4aFUSOQBjA31UP8Mr6DDhWSuMwj7jaWOT0p0WoZ6jeHhrD7g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.24.2.tgz",
"integrity": "sha512-CN9AZr8kEndGooS35ntToZLTQLHEjtVB5n7dl8ZcTZMonJ7CCfStrYhrzF97eAecqVbVJ7APOEe18RPI4KLhwQ==",
"cpu": [
"loong64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -557,14 +545,13 @@
}
},
"node_modules/@esbuild/linux-mips64el": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.24.0.tgz",
"integrity": "sha512-hIKvXm0/3w/5+RDtCJeXqMZGkI2s4oMUGj3/jM0QzhgIASWrGO5/RlzAzm5nNh/awHE0A19h/CvHQe6FaBNrRA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.24.2.tgz",
"integrity": "sha512-iMkk7qr/wl3exJATwkISxI7kTcmHKE+BlymIAbHO8xanq/TjHaaVThFF6ipWzPHryoFsesNQJPE/3wFJw4+huw==",
"cpu": [
"mips64el"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -574,14 +561,13 @@
}
},
"node_modules/@esbuild/linux-ppc64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.24.0.tgz",
"integrity": "sha512-HcZh5BNq0aC52UoocJxaKORfFODWXZxtBaaZNuN3PUX3MoDsChsZqopzi5UupRhPHSEHotoiptqikjN/B77mYQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.24.2.tgz",
"integrity": "sha512-shsVrgCZ57Vr2L8mm39kO5PPIb+843FStGt7sGGoqiiWYconSxwTiuswC1VJZLCjNiMLAMh34jg4VSEQb+iEbw==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -591,14 +577,13 @@
}
},
"node_modules/@esbuild/linux-riscv64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.24.0.tgz",
"integrity": "sha512-bEh7dMn/h3QxeR2KTy1DUszQjUrIHPZKyO6aN1X4BCnhfYhuQqedHaa5MxSQA/06j3GpiIlFGSsy1c7Gf9padw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.24.2.tgz",
"integrity": "sha512-4eSFWnU9Hhd68fW16GD0TINewo1L6dRrB+oLNNbYyMUAeOD2yCK5KXGK1GH4qD/kT+bTEXjsyTCiJGHPZ3eM9Q==",
"cpu": [
"riscv64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -608,14 +593,13 @@
}
},
"node_modules/@esbuild/linux-s390x": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.24.0.tgz",
"integrity": "sha512-ZcQ6+qRkw1UcZGPyrCiHHkmBaj9SiCD8Oqd556HldP+QlpUIe2Wgn3ehQGVoPOvZvtHm8HPx+bH20c9pvbkX3g==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.24.2.tgz",
"integrity": "sha512-S0Bh0A53b0YHL2XEXC20bHLuGMOhFDO6GN4b3YjRLK//Ep3ql3erpNcPlEFed93hsQAjAQDNsvcK+hV90FubSw==",
"cpu": [
"s390x"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -625,14 +609,13 @@
}
},
"node_modules/@esbuild/linux-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.24.0.tgz",
"integrity": "sha512-vbutsFqQ+foy3wSSbmjBXXIJ6PL3scghJoM8zCL142cGaZKAdCZHyf+Bpu/MmX9zT9Q0zFBVKb36Ma5Fzfa8xA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.24.2.tgz",
"integrity": "sha512-8Qi4nQcCTbLnK9WoMjdC9NiTG6/E38RNICU6sUNqK0QFxCYgoARqVqxdFmWkdonVsvGqWhmm7MO0jyTqLqwj0Q==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
@@ -641,15 +624,30 @@
"node": ">=18"
}
},
"node_modules/@esbuild/netbsd-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.24.2.tgz",
"integrity": "sha512-wuLK/VztRRpMt9zyHSazyCVdCXlpHkKm34WUyinD2lzK07FAHTq0KQvZZlXikNWkDGoT6x3TD51jKQ7gMVpopw==",
"cpu": [
"arm64"
],
"dev": true,
"optional": true,
"os": [
"netbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/netbsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.24.0.tgz",
"integrity": "sha512-hjQ0R/ulkO8fCYFsG0FZoH+pWgTTDreqpqY7UnQntnaKv95uP5iW3+dChxnx7C3trQQU40S+OgWhUVwCjVFLvg==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.24.2.tgz",
"integrity": "sha512-VefFaQUc4FMmJuAxmIHgUmfNiLXY438XrL4GDNV1Y1H/RW3qow68xTwjZKfj/+Plp9NANmzbH5R40Meudu8mmw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"netbsd"
@@ -659,14 +657,13 @@
}
},
"node_modules/@esbuild/openbsd-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.24.0.tgz",
"integrity": "sha512-MD9uzzkPQbYehwcN583yx3Tu5M8EIoTD+tUgKF982WYL9Pf5rKy9ltgD0eUgs8pvKnmizxjXZyLt0z6DC3rRXg==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.24.2.tgz",
"integrity": "sha512-YQbi46SBct6iKnszhSvdluqDmxCJA+Pu280Av9WICNwQmMxV7nLRHZfjQzwbPs3jeWnuAhE9Jy0NrnJ12Oz+0A==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
@@ -676,14 +673,13 @@
}
},
"node_modules/@esbuild/openbsd-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.24.0.tgz",
"integrity": "sha512-4ir0aY1NGUhIC1hdoCzr1+5b43mw99uNwVzhIq1OY3QcEwPDO3B7WNXBzaKY5Nsf1+N11i1eOfFcq+D/gOS15Q==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.24.2.tgz",
"integrity": "sha512-+iDS6zpNM6EnJyWv0bMGLWSWeXGN/HTaF/LXHXHwejGsVi+ooqDfMCCTerNFxEkM3wYVcExkeGXNqshc9iMaOA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
@@ -693,14 +689,13 @@
}
},
"node_modules/@esbuild/sunos-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.24.0.tgz",
"integrity": "sha512-jVzdzsbM5xrotH+W5f1s+JtUy1UWgjU0Cf4wMvffTB8m6wP5/kx0KiaLHlbJO+dMgtxKV8RQ/JvtlFcdZ1zCPA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.24.2.tgz",
"integrity": "sha512-hTdsW27jcktEvpwNHJU4ZwWFGkz2zRJUz8pvddmXPtXDzVKTTINmlmga3ZzwcuMpUvLw7JkLy9QLKyGpD2Yxig==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"sunos"
@@ -710,14 +705,13 @@
}
},
"node_modules/@esbuild/win32-arm64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.24.0.tgz",
"integrity": "sha512-iKc8GAslzRpBytO2/aN3d2yb2z8XTVfNV0PjGlCxKo5SgWmNXx82I/Q3aG1tFfS+A2igVCY97TJ8tnYwpUWLCA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.24.2.tgz",
"integrity": "sha512-LihEQ2BBKVFLOC9ZItT9iFprsE9tqjDjnbulhHoFxYQtQfai7qfluVODIYxt1PgdoyQkz23+01rzwNwYfutxUQ==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -727,14 +721,13 @@
}
},
"node_modules/@esbuild/win32-ia32": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.24.0.tgz",
"integrity": "sha512-vQW36KZolfIudCcTnaTpmLQ24Ha1RjygBo39/aLkM2kmjkWmZGEJ5Gn9l5/7tzXA42QGIoWbICfg6KLLkIw6yw==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.24.2.tgz",
"integrity": "sha512-q+iGUwfs8tncmFC9pcnD5IvRHAzmbwQ3GPS5/ceCyHdjXubwQWI12MKWSNSMYLJMq23/IUCvJMS76PDqXe1fxA==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -744,14 +737,13 @@
}
},
"node_modules/@esbuild/win32-x64": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.24.0.tgz",
"integrity": "sha512-7IAFPrjSQIJrGsK6flwg7NFmwBoSTyF3rl7If0hNUFQU4ilTsEPL6GuMuU9BfIWVVGuRnuIidkSMC+c0Otu8IA==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.24.2.tgz",
"integrity": "sha512-7VTgWzgMGvup6aSqDPLiW5zHaxYJGTO4OokMjIlrCtf+VpEL+cXKtCvg723iguPYI5oaUNdS+/V7OU2gvXVWEg==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -1097,252 +1089,247 @@
}
},
"node_modules/@rollup/rollup-android-arm-eabi": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.28.0.tgz",
"integrity": "sha512-wLJuPLT6grGZsy34g4N1yRfYeouklTgPhH1gWXCYspenKYD0s3cR99ZevOGw5BexMNywkbV3UkjADisozBmpPQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.34.6.tgz",
"integrity": "sha512-+GcCXtOQoWuC7hhX1P00LqjjIiS/iOouHXhMdiDSnq/1DGTox4SpUvO52Xm+div6+106r+TcvOeo/cxvyEyTgg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
]
},
"node_modules/@rollup/rollup-android-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.28.0.tgz",
"integrity": "sha512-eiNkznlo0dLmVG/6wf+Ifi/v78G4d4QxRhuUl+s8EWZpDewgk7PX3ZyECUXU0Zq/Ca+8nU8cQpNC4Xgn2gFNDA==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.34.6.tgz",
"integrity": "sha512-E8+2qCIjciYUnCa1AiVF1BkRgqIGW9KzJeesQqVfyRITGQN+dFuoivO0hnro1DjT74wXLRZ7QF8MIbz+luGaJA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
]
},
"node_modules/@rollup/rollup-darwin-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.28.0.tgz",
"integrity": "sha512-lmKx9yHsppblnLQZOGxdO66gT77bvdBtr/0P+TPOseowE7D9AJoBw8ZDULRasXRWf1Z86/gcOdpBrV6VDUY36Q==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.34.6.tgz",
"integrity": "sha512-z9Ib+OzqN3DZEjX7PDQMHEhtF+t6Mi2z/ueChQPLS/qUMKY7Ybn5A2ggFoKRNRh1q1T03YTQfBTQCJZiepESAg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@rollup/rollup-darwin-x64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.28.0.tgz",
"integrity": "sha512-8hxgfReVs7k9Js1uAIhS6zq3I+wKQETInnWQtgzt8JfGx51R1N6DRVy3F4o0lQwumbErRz52YqwjfvuwRxGv1w==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.34.6.tgz",
"integrity": "sha512-PShKVY4u0FDAR7jskyFIYVyHEPCPnIQY8s5OcXkdU8mz3Y7eXDJPdyM/ZWjkYdR2m0izD9HHWA8sGcXn+Qrsyg==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@rollup/rollup-freebsd-arm64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.28.0.tgz",
"integrity": "sha512-lA1zZB3bFx5oxu9fYud4+g1mt+lYXCoch0M0V/xhqLoGatbzVse0wlSQ1UYOWKpuSu3gyN4qEc0Dxf/DII1bhQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.34.6.tgz",
"integrity": "sha512-YSwyOqlDAdKqs0iKuqvRHLN4SrD2TiswfoLfvYXseKbL47ht1grQpq46MSiQAx6rQEN8o8URtpXARCpqabqxGQ==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
]
},
"node_modules/@rollup/rollup-freebsd-x64": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.28.0.tgz",
"integrity": "sha512-aI2plavbUDjCQB/sRbeUZWX9qp12GfYkYSJOrdYTL/C5D53bsE2/nBPuoiJKoWp5SN78v2Vr8ZPnB+/VbQ2pFA==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.34.6.tgz",
"integrity": "sha512-HEP4CgPAY1RxXwwL5sPFv6BBM3tVeLnshF03HMhJYCNc6kvSqBgTMmsEjb72RkZBAWIqiPUyF1JpEBv5XT9wKQ==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
]
},
"node_modules/@rollup/rollup-linux-arm-gnueabihf": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.28.0.tgz",
"integrity": "sha512-WXveUPKtfqtaNvpf0iOb0M6xC64GzUX/OowbqfiCSXTdi/jLlOmH0Ba94/OkiY2yTGTwteo4/dsHRfh5bDCZ+w==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.34.6.tgz",
"integrity": "sha512-88fSzjC5xeH9S2Vg3rPgXJULkHcLYMkh8faix8DX4h4TIAL65ekwuQMA/g2CXq8W+NJC43V6fUpYZNjaX3+IIg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm-musleabihf": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.28.0.tgz",
"integrity": "sha512-yLc3O2NtOQR67lI79zsSc7lk31xjwcaocvdD1twL64PK1yNaIqCeWI9L5B4MFPAVGEVjH5k1oWSGuYX1Wutxpg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.34.6.tgz",
"integrity": "sha512-wM4ztnutBqYFyvNeR7Av+reWI/enK9tDOTKNF+6Kk2Q96k9bwhDDOlnCUNRPvromlVXo04riSliMBs/Z7RteEg==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.28.0.tgz",
"integrity": "sha512-+P9G9hjEpHucHRXqesY+3X9hD2wh0iNnJXX/QhS/J5vTdG6VhNYMxJ2rJkQOxRUd17u5mbMLHM7yWGZdAASfcg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.34.6.tgz",
"integrity": "sha512-9RyprECbRa9zEjXLtvvshhw4CMrRa3K+0wcp3KME0zmBe1ILmvcVHnypZ/aIDXpRyfhSYSuN4EPdCCj5Du8FIA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm64-musl": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.28.0.tgz",
"integrity": "sha512-1xsm2rCKSTpKzi5/ypT5wfc+4bOGa/9yI/eaOLW0oMs7qpC542APWhl4A37AENGZ6St6GBMWhCCMM6tXgTIplw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.34.6.tgz",
"integrity": "sha512-qTmklhCTyaJSB05S+iSovfo++EwnIEZxHkzv5dep4qoszUMX5Ca4WM4zAVUMbfdviLgCSQOu5oU8YoGk1s6M9Q==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-loongarch64-gnu": {
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loongarch64-gnu/-/rollup-linux-loongarch64-gnu-4.34.6.tgz",
"integrity": "sha512-4Qmkaps9yqmpjY5pvpkfOerYgKNUGzQpFxV6rnS7c/JfYbDSU0y6WpbbredB5cCpLFGJEqYX40WUmxMkwhWCjw==",
"cpu": [
"loong64"
],
"dev": true,
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-powerpc64le-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.28.0.tgz",
"integrity": "sha512-zgWxMq8neVQeXL+ouSf6S7DoNeo6EPgi1eeqHXVKQxqPy1B2NvTbaOUWPn/7CfMKL7xvhV0/+fq/Z/J69g1WAQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.34.6.tgz",
"integrity": "sha512-Zsrtux3PuaxuBTX/zHdLaFmcofWGzaWW1scwLU3ZbW/X+hSsFbz9wDIp6XvnT7pzYRl9MezWqEqKy7ssmDEnuQ==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-riscv64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.28.0.tgz",
"integrity": "sha512-VEdVYacLniRxbRJLNtzwGt5vwS0ycYshofI7cWAfj7Vg5asqj+pt+Q6x4n+AONSZW/kVm+5nklde0qs2EUwU2g==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.34.6.tgz",
"integrity": "sha512-aK+Zp+CRM55iPrlyKiU3/zyhgzWBxLVrw2mwiQSYJRobCURb781+XstzvA8Gkjg/hbdQFuDw44aUOxVQFycrAg==",
"cpu": [
"riscv64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-s390x-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.28.0.tgz",
"integrity": "sha512-LQlP5t2hcDJh8HV8RELD9/xlYtEzJkm/aWGsauvdO2ulfl3QYRjqrKW+mGAIWP5kdNCBheqqqYIGElSRCaXfpw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.34.6.tgz",
"integrity": "sha512-WoKLVrY9ogmaYPXwTH326+ErlCIgMmsoRSx6bO+l68YgJnlOXhygDYSZe/qbUJCSiCiZAQ+tKm88NcWuUXqOzw==",
"cpu": [
"s390x"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-x64-gnu": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.28.0.tgz",
"integrity": "sha512-Nl4KIzteVEKE9BdAvYoTkW19pa7LR/RBrT6F1dJCV/3pbjwDcaOq+edkP0LXuJ9kflW/xOK414X78r+K84+msw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.34.6.tgz",
"integrity": "sha512-Sht4aFvmA4ToHd2vFzwMFaQCiYm2lDFho5rPcvPBT5pCdC+GwHG6CMch4GQfmWTQ1SwRKS0dhDYb54khSrjDWw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-x64-musl": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.28.0.tgz",
"integrity": "sha512-eKpJr4vBDOi4goT75MvW+0dXcNUqisK4jvibY9vDdlgLx+yekxSm55StsHbxUsRxSTt3JEQvlr3cGDkzcSP8bw==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.34.6.tgz",
"integrity": "sha512-zmmpOQh8vXc2QITsnCiODCDGXFC8LMi64+/oPpPx5qz3pqv0s6x46ps4xoycfUiVZps5PFn1gksZzo4RGTKT+A==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-win32-arm64-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.28.0.tgz",
"integrity": "sha512-Vi+WR62xWGsE/Oj+mD0FNAPY2MEox3cfyG0zLpotZdehPFXwz6lypkGs5y38Jd/NVSbOD02aVad6q6QYF7i8Bg==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.34.6.tgz",
"integrity": "sha512-3/q1qUsO/tLqGBaD4uXsB6coVGB3usxw3qyeVb59aArCgedSF66MPdgRStUd7vbZOsko/CgVaY5fo2vkvPLWiA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
]
},
"node_modules/@rollup/rollup-win32-ia32-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.28.0.tgz",
"integrity": "sha512-kN/Vpip8emMLn/eOza+4JwqDZBL6MPNpkdaEsgUtW1NYN3DZvZqSQrbKzJcTL6hd8YNmFTn7XGWMwccOcJBL0A==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.34.6.tgz",
"integrity": "sha512-oLHxuyywc6efdKVTxvc0135zPrRdtYVjtVD5GUm55I3ODxhU/PwkQFD97z16Xzxa1Fz0AEe4W/2hzRtd+IfpOA==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
]
},
"node_modules/@rollup/rollup-win32-x64-msvc": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.28.0.tgz",
"integrity": "sha512-Bvno2/aZT6usSa7lRDL2+hMjVAGjuqaymF1ApZm31JXzniR/hvr14jpU+/z4X6Gt5BPlzosscyJZGUvguXIqeQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.34.6.tgz",
"integrity": "sha512-0PVwmgzZ8+TZ9oGBmdZoQVXflbvuwzN/HRclujpl4N/q3i+y0lqLw8n1bXA8ru3sApDjlmONaNAuYr38y1Kr9w==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -2066,12 +2053,11 @@
"license": "ISC"
},
"node_modules/esbuild": {
"version": "0.24.0",
"resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.24.0.tgz",
"integrity": "sha512-FuLPevChGDshgSicjisSooU0cemp/sGXR841D5LHMB7mTVOmsEHcAxaH3irL53+8YDIeVNQEySh4DaYU/iuPqQ==",
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.24.2.tgz",
"integrity": "sha512-+9egpBW8I3CD5XPe0n6BfT5fxLzxrlDzqydF3aviG+9ni1lDC/OvMHcxqEFV0+LANZG5R1bFMWfUrjVsdwxJvA==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"bin": {
"esbuild": "bin/esbuild"
},
@@ -2079,30 +2065,31 @@
"node": ">=18"
},
"optionalDependencies": {
"@esbuild/aix-ppc64": "0.24.0",
"@esbuild/android-arm": "0.24.0",
"@esbuild/android-arm64": "0.24.0",
"@esbuild/android-x64": "0.24.0",
"@esbuild/darwin-arm64": "0.24.0",
"@esbuild/darwin-x64": "0.24.0",
"@esbuild/freebsd-arm64": "0.24.0",
"@esbuild/freebsd-x64": "0.24.0",
"@esbuild/linux-arm": "0.24.0",
"@esbuild/linux-arm64": "0.24.0",
"@esbuild/linux-ia32": "0.24.0",
"@esbuild/linux-loong64": "0.24.0",
"@esbuild/linux-mips64el": "0.24.0",
"@esbuild/linux-ppc64": "0.24.0",
"@esbuild/linux-riscv64": "0.24.0",
"@esbuild/linux-s390x": "0.24.0",
"@esbuild/linux-x64": "0.24.0",
"@esbuild/netbsd-x64": "0.24.0",
"@esbuild/openbsd-arm64": "0.24.0",
"@esbuild/openbsd-x64": "0.24.0",
"@esbuild/sunos-x64": "0.24.0",
"@esbuild/win32-arm64": "0.24.0",
"@esbuild/win32-ia32": "0.24.0",
"@esbuild/win32-x64": "0.24.0"
"@esbuild/aix-ppc64": "0.24.2",
"@esbuild/android-arm": "0.24.2",
"@esbuild/android-arm64": "0.24.2",
"@esbuild/android-x64": "0.24.2",
"@esbuild/darwin-arm64": "0.24.2",
"@esbuild/darwin-x64": "0.24.2",
"@esbuild/freebsd-arm64": "0.24.2",
"@esbuild/freebsd-x64": "0.24.2",
"@esbuild/linux-arm": "0.24.2",
"@esbuild/linux-arm64": "0.24.2",
"@esbuild/linux-ia32": "0.24.2",
"@esbuild/linux-loong64": "0.24.2",
"@esbuild/linux-mips64el": "0.24.2",
"@esbuild/linux-ppc64": "0.24.2",
"@esbuild/linux-riscv64": "0.24.2",
"@esbuild/linux-s390x": "0.24.2",
"@esbuild/linux-x64": "0.24.2",
"@esbuild/netbsd-arm64": "0.24.2",
"@esbuild/netbsd-x64": "0.24.2",
"@esbuild/openbsd-arm64": "0.24.2",
"@esbuild/openbsd-x64": "0.24.2",
"@esbuild/sunos-x64": "0.24.2",
"@esbuild/win32-arm64": "0.24.2",
"@esbuild/win32-ia32": "0.24.2",
"@esbuild/win32-x64": "0.24.2"
}
},
"node_modules/escalade": {
@@ -2445,7 +2432,6 @@
"integrity": "sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
@@ -2825,7 +2811,6 @@
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"bin": {
"nanoid": "bin/nanoid.cjs"
},
@@ -2951,9 +2936,9 @@
}
},
"node_modules/postcss": {
"version": "8.4.49",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.4.49.tgz",
"integrity": "sha512-OCVPnIObs4N29kxTjzLfUryOkvZEq+pf8jTF0lg8E7uETuWHA+v7j3c/xJmiqpX450191LlmZfUKkXxkTry7nA==",
"version": "8.5.2",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.2.tgz",
"integrity": "sha512-MjOadfU3Ys9KYoX0AdkBlFEF1Vx37uCCeN4ZHnmwm9FfpbsGWMZeBLMmmpY+6Ocqod7mkdZ0DT31OlbsFrLlkA==",
"dev": true,
"funding": [
{
@@ -2969,9 +2954,8 @@
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"dependencies": {
"nanoid": "^3.3.7",
"nanoid": "^3.3.8",
"picocolors": "^1.1.1",
"source-map-js": "^1.2.1"
},
@@ -3083,11 +3067,10 @@
}
},
"node_modules/rollup": {
"version": "4.28.0",
"resolved": "https://registry.npmjs.org/rollup/-/rollup-4.28.0.tgz",
"integrity": "sha512-G9GOrmgWHBma4YfCcX8PjH0qhXSdH8B4HDE2o4/jaxj93S4DPCIDoLcXz99eWMji4hB29UFCEd7B2gwGJDR9cQ==",
"version": "4.34.6",
"resolved": "https://registry.npmjs.org/rollup/-/rollup-4.34.6.tgz",
"integrity": "sha512-wc2cBWqJgkU3Iz5oztRkQbfVkbxoz5EhnCGOrnJvnLnQ7O0WhQUYyv18qQI79O8L7DdHrrlJNeCHd4VGpnaXKQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"@types/estree": "1.0.6"
},
@@ -3099,24 +3082,25 @@
"npm": ">=8.0.0"
},
"optionalDependencies": {
"@rollup/rollup-android-arm-eabi": "4.28.0",
"@rollup/rollup-android-arm64": "4.28.0",
"@rollup/rollup-darwin-arm64": "4.28.0",
"@rollup/rollup-darwin-x64": "4.28.0",
"@rollup/rollup-freebsd-arm64": "4.28.0",
"@rollup/rollup-freebsd-x64": "4.28.0",
"@rollup/rollup-linux-arm-gnueabihf": "4.28.0",
"@rollup/rollup-linux-arm-musleabihf": "4.28.0",
"@rollup/rollup-linux-arm64-gnu": "4.28.0",
"@rollup/rollup-linux-arm64-musl": "4.28.0",
"@rollup/rollup-linux-powerpc64le-gnu": "4.28.0",
"@rollup/rollup-linux-riscv64-gnu": "4.28.0",
"@rollup/rollup-linux-s390x-gnu": "4.28.0",
"@rollup/rollup-linux-x64-gnu": "4.28.0",
"@rollup/rollup-linux-x64-musl": "4.28.0",
"@rollup/rollup-win32-arm64-msvc": "4.28.0",
"@rollup/rollup-win32-ia32-msvc": "4.28.0",
"@rollup/rollup-win32-x64-msvc": "4.28.0",
"@rollup/rollup-android-arm-eabi": "4.34.6",
"@rollup/rollup-android-arm64": "4.34.6",
"@rollup/rollup-darwin-arm64": "4.34.6",
"@rollup/rollup-darwin-x64": "4.34.6",
"@rollup/rollup-freebsd-arm64": "4.34.6",
"@rollup/rollup-freebsd-x64": "4.34.6",
"@rollup/rollup-linux-arm-gnueabihf": "4.34.6",
"@rollup/rollup-linux-arm-musleabihf": "4.34.6",
"@rollup/rollup-linux-arm64-gnu": "4.34.6",
"@rollup/rollup-linux-arm64-musl": "4.34.6",
"@rollup/rollup-linux-loongarch64-gnu": "4.34.6",
"@rollup/rollup-linux-powerpc64le-gnu": "4.34.6",
"@rollup/rollup-linux-riscv64-gnu": "4.34.6",
"@rollup/rollup-linux-s390x-gnu": "4.34.6",
"@rollup/rollup-linux-x64-gnu": "4.34.6",
"@rollup/rollup-linux-x64-musl": "4.34.6",
"@rollup/rollup-win32-arm64-msvc": "4.34.6",
"@rollup/rollup-win32-ia32-msvc": "4.34.6",
"@rollup/rollup-win32-x64-msvc": "4.34.6",
"fsevents": "~2.3.2"
}
},
@@ -3213,7 +3197,6 @@
"resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.1.tgz",
"integrity": "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==",
"dev": true,
"license": "BSD-3-Clause",
"engines": {
"node": ">=0.10.0"
}
@@ -3395,15 +3378,14 @@
}
},
"node_modules/vite": {
"version": "6.0.2",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.0.2.tgz",
"integrity": "sha512-XdQ+VsY2tJpBsKGs0wf3U/+azx8BBpYRHFAyKm5VeEZNOJZRB63q7Sc8Iup3k0TrN3KO6QgyzFf+opSbfY1y0g==",
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.1.0.tgz",
"integrity": "sha512-RjjMipCKVoR4hVfPY6GQTgveinjNuyLw+qruksLDvA5ktI1150VmcMBKmQaEWJhg/j6Uaf6dNCNA0AfdzUb/hQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"esbuild": "^0.24.0",
"postcss": "^8.4.49",
"rollup": "^4.23.0"
"esbuild": "^0.24.2",
"postcss": "^8.5.1",
"rollup": "^4.30.1"
},
"bin": {
"vite": "bin/vite.js"

View File

@@ -27,6 +27,6 @@
"globals": "^15.12.0",
"typescript": "~5.6.2",
"typescript-eslint": "^8.15.0",
"vite": "^6.0.1"
"vite": "^6.0.9"
}
}

View File

@@ -40,7 +40,7 @@ from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIProcessor
from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIObserver, RTVIProcessor
from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
@@ -176,7 +176,7 @@ async def main():
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
observers=[rtvi.observer()],
observers=[RTVIObserver(rtvi)],
),
)
await task.queue_frame(quiet_frame)

View File

@@ -40,7 +40,7 @@ from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIProcessor
from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIObserver, RTVIProcessor
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
@@ -202,7 +202,7 @@ async def main():
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
observers=[rtvi.observer()],
observers=[RTVIObserver(rtvi)],
),
)
await task.queue_frame(quiet_frame)

View File

@@ -74,6 +74,8 @@ If you'd like to run a custom domain or port:
➡️ Open the host URL in your browser `http://localhost:7860`
If you've run previous versions of the demo, make sure to set `ENV=dev`, and remove the `RUN_AS_VM` line from the .env file.
---
## Improvements to make

View File

@@ -3,6 +3,4 @@ DAILY_SAMPLE_ROOM_URL=
ELEVENLABS_API_KEY=
ELEVENLABS_VOICE_ID=
GOOGLE_API_KEY=
ENV= # dev | production
RUN_AS_VM= # Set this if you want to run bots on process (not launch a new VM)
ENV=dev

View File

@@ -2,5 +2,4 @@ async_timeout
fastapi
uvicorn
python-dotenv
-e "../..[daily,silero,openai,fal,cartesia,google]"
-e "../../../python-genai"
pipecat-ai[daily,silero,openai,cartesia,google]

View File

@@ -23,8 +23,7 @@ from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.fal import FalImageGenService
from pipecat.services.google import GoogleLLMService
from pipecat.services.google import GoogleImageGenService, GoogleLLMService
from pipecat.transports.services.daily import (
DailyParams,
DailyTransport,

View File

@@ -90,7 +90,7 @@ async def run_bot(websocket_client: WebSocket, stream_sid: str, testing: bool):
# NOTE: Watch out! This will save all the conversation in memory. You can
# pass `buffer_size` to get periodic callbacks.
audiobuffer = AudioBufferProcessor()
audiobuffer = AudioBufferProcessor(user_continuous_stream=not testing)
pipeline = Pipeline(
[

View File

@@ -124,7 +124,7 @@ async def run_client(client_name: str, server_url: str, duration_secs: int):
# NOTE: Watch out! This will save all the conversation in memory. You can
# pass `buffer_size` to get periodic callbacks.
audiobuffer = AudioBufferProcessor()
audiobuffer = AudioBufferProcessor(user_continuous_stream=False)
pipeline = Pipeline(
[

View File

@@ -55,7 +55,7 @@ elevenlabs = [ "websockets~=13.1" ]
fal = [ "fal-client~=0.5.6" ]
fish = [ "ormsgpack~=1.7.0", "websockets~=13.1" ]
gladia = [ "websockets~=13.1" ]
google = [ "google-generativeai~=0.8.3", "google-cloud-texttospeech~=2.24.0", "google-genai~=0.7.0" ]
google = [ "google-cloud-speech~=2.31.0", "google-cloud-texttospeech~=2.25.0", "google-genai~=1.2.0", "google-generativeai~=0.8.4" ]
grok = [ "openai~=1.59.6" ]
groq = [ "openai~=1.59.6" ]
gstreamer = [ "pygobject~=3.50.0" ]
@@ -71,7 +71,9 @@ nim = [ "openai~=1.59.6" ]
noisereduce = [ "noisereduce~=3.0.3" ]
openai = [ "openai~=1.59.6", "websockets~=13.1", "python-deepcompare~=2.1.0" ]
openpipe = [ "openpipe~=4.45.0" ]
perplexity = [ "openai~=1.59.6" ]
playht = [ "pyht~=0.1.6", "websockets~=13.1" ]
rime = [ "websockets~=13.1" ]
riva = [ "nvidia-riva-client~=2.18.0" ]
sentry = [ "sentry-sdk~=2.20.0" ]
silero = [ "onnxruntime~=1.20.1" ]
@@ -111,3 +113,8 @@ select = [
[tool.ruff.lint.pydocstyle]
convention = "google"
[tool.coverage.run]
command_line = "--module pytest"
source = ["src"]
omit = ["*/tests/*"]

View File

@@ -565,6 +565,22 @@ class UserStoppedSpeakingFrame(SystemFrame):
pass
@dataclass
class EmulateUserStartedSpeakingFrame(SystemFrame):
"""Emitted by internal processors upstream to emulate VAD behavior when a
user starts speaking."""
pass
@dataclass
class EmulateUserStoppedSpeakingFrame(SystemFrame):
"""Emitted by internal processors upstream to emulate VAD behavior when a
user stops speaking."""
pass
@dataclass
class BotInterruptionFrame(SystemFrame):
"""Emitted when the bot should be interrupted. This will mainly cause the
@@ -618,6 +634,13 @@ class FunctionCallInProgressFrame(SystemFrame):
arguments: str
@dataclass
class STTMuteFrame(SystemFrame):
"""System frame to mute/unmute the STT service."""
mute: bool
@dataclass
class TransportMessageUrgentFrame(SystemFrame):
message: Any
@@ -752,13 +775,6 @@ class TTSUpdateSettingsFrame(ServiceUpdateSettingsFrame):
pass
@dataclass
class STTMuteFrame(ControlFrame):
"""Control frame to mute/unmute the STT service."""
mute: bool
@dataclass
class STTUpdateSettingsFrame(ServiceUpdateSettingsFrame):
pass

View File

@@ -4,9 +4,16 @@
# SPDX-License-Identifier: BSD 2-Clause License
#
from typing import List, Optional, Type
import asyncio
import time
from abc import abstractmethod
from typing import List
from pipecat.frames.frames import (
CancelFrame,
EmulateUserStartedSpeakingFrame,
EmulateUserStoppedSpeakingFrame,
EndFrame,
Frame,
InterimTranscriptionFrame,
LLMFullResponseEndFrame,
@@ -15,6 +22,7 @@ from pipecat.frames.frames import (
LLMMessagesFrame,
LLMMessagesUpdateFrame,
LLMSetToolsFrame,
StartFrame,
StartInterruptionFrame,
TextFrame,
TranscriptionFrame,
@@ -28,121 +36,105 @@ from pipecat.processors.aggregators.openai_llm_context import (
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
class LLMResponseAggregator(FrameProcessor):
class BaseLLMResponseAggregator(FrameProcessor):
"""This is the base class for all LLM response aggregators. These
aggregators process incoming frames and aggregate content until they are
ready to push the aggregation. In the case of a user, an aggregation might
be a full transcription received from the STT service.
The LLM response aggregators also keep a store (e.g. a message list or an
LLM context) of the current conversation, that is, they store the messages
said by the user or by the bot.
"""
def __init__(self, **kwargs):
super().__init__(**kwargs)
@property
@abstractmethod
def messages(self) -> List[dict]:
"""Returns the messages from the current conversation."""
pass
@property
@abstractmethod
def role(self) -> str:
"""Returns the role (e.g. user, assistant...) for this aggregator."""
pass
@abstractmethod
def add_messages(self, messages):
"""Add the given messages to the conversation."""
pass
@abstractmethod
def set_messages(self, messages):
"""Reset the conversation with the given messages."""
pass
@abstractmethod
def set_tools(self, tools):
"""Set LLM tools to be used in the current conversation."""
pass
@abstractmethod
def reset(self):
"""Reset the internals of this aggregator. This should not modify the
internal messages."""
pass
@abstractmethod
async def push_aggregation(self):
pass
class LLMResponseAggregator(BaseLLMResponseAggregator):
"""This is a base LLM aggregator that uses a simple list of messages to
store the conversation. It pushes `LLMMessagesFrame` as an aggregation
frame.
"""
def __init__(
self,
*,
messages: List[dict],
role: str,
start_frame,
end_frame,
accumulator_frame: Type[TextFrame],
interim_accumulator_frame: Optional[Type[TextFrame]] = None,
handle_interruptions: bool = False,
expect_stripped_words: bool = True, # if True, need to add spaces between words
role: str = "user",
**kwargs,
):
super().__init__()
super().__init__(**kwargs)
self._messages = messages
self._role = role
self._start_frame = start_frame
self._end_frame = end_frame
self._accumulator_frame = accumulator_frame
self._interim_accumulator_frame = interim_accumulator_frame
self._handle_interruptions = handle_interruptions
self._expect_stripped_words = expect_stripped_words
# Reset our accumulator state.
self._reset()
self._aggregation = ""
self.reset()
@property
def messages(self):
def messages(self) -> List[dict]:
return self._messages
@property
def role(self):
def role(self) -> str:
return self._role
#
# Frame processor
#
def add_messages(self, messages):
self._messages.extend(messages)
# Use cases implemented:
#
# S: Start, E: End, T: Transcription, I: Interim, X: Text
#
# S E -> None
# S T E -> X
# S I T E -> X
# S I E T -> X
# S I E I T -> X
# S E T -> X
# S E I T -> X
#
# The following case would not be supported:
#
# S I E T1 I T2 -> X
#
# and T2 would be dropped.
def set_messages(self, messages):
self.reset()
self._messages.clear()
self._messages.extend(messages)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
def set_tools(self, tools):
pass
send_aggregation = False
def reset(self):
self._aggregation = ""
if isinstance(frame, self._start_frame):
self._aggregation = ""
self._aggregating = True
self._seen_start_frame = True
self._seen_end_frame = False
self._seen_interim_results = False
await self.push_frame(frame, direction)
elif isinstance(frame, self._end_frame):
self._seen_end_frame = True
self._seen_start_frame = False
# We might have received the end frame but we might still be
# aggregating (i.e. we have seen interim results but not the final
# text).
self._aggregating = self._seen_interim_results or len(self._aggregation) == 0
# Send the aggregation if we are not aggregating anymore (i.e. no
# more interim results received).
send_aggregation = not self._aggregating
await self.push_frame(frame, direction)
elif isinstance(frame, self._accumulator_frame):
if self._aggregating:
if self._expect_stripped_words:
self._aggregation += f" {frame.text}" if self._aggregation else frame.text
else:
self._aggregation += frame.text
# We have received a complete sentence, so if we have seen the
# end frame and we were still aggregating, it means we should
# send the aggregation.
send_aggregation = self._seen_end_frame
# We just got our final result, so let's reset interim results.
self._seen_interim_results = False
elif self._interim_accumulator_frame and isinstance(frame, self._interim_accumulator_frame):
self._seen_interim_results = True
elif self._handle_interruptions and isinstance(frame, StartInterruptionFrame):
await self._push_aggregation()
# Reset anyways
self._reset()
await self.push_frame(frame, direction)
elif isinstance(frame, LLMMessagesAppendFrame):
self._add_messages(frame.messages)
elif isinstance(frame, LLMMessagesUpdateFrame):
self._set_messages(frame.messages)
elif isinstance(frame, LLMSetToolsFrame):
self._set_tools(frame.tools)
else:
await self.push_frame(frame, direction)
if send_aggregation:
await self._push_aggregation()
async def _push_aggregation(self):
async def push_aggregation(self):
if len(self._aggregation) > 0:
self._messages.append({"role": self._role, "content": self._aggregation})
@@ -153,109 +145,27 @@ class LLMResponseAggregator(FrameProcessor):
frame = LLMMessagesFrame(self._messages)
await self.push_frame(frame)
# TODO-CB: Types
def _add_messages(self, messages):
self._messages.extend(messages)
def _set_messages(self, messages):
self._reset()
self._messages.clear()
self._messages.extend(messages)
class LLMContextResponseAggregator(BaseLLMResponseAggregator):
"""This is a base LLM aggregator that uses an LLM context to store the
conversation. It pushes `OpenAILLMContextFrame` as an aggregation frame.
def _set_tools(self, tools):
# noop in the base class
pass
def _reset(self):
self._aggregation = ""
self._aggregating = False
self._seen_start_frame = False
self._seen_end_frame = False
self._seen_interim_results = False
class LLMAssistantResponseAggregator(LLMResponseAggregator):
def __init__(self, messages: List[dict] = []):
super().__init__(
messages=messages,
role="assistant",
start_frame=LLMFullResponseStartFrame,
end_frame=LLMFullResponseEndFrame,
accumulator_frame=TextFrame,
handle_interruptions=True,
)
class LLMUserResponseAggregator(LLMResponseAggregator):
def __init__(self, messages: List[dict] = []):
super().__init__(
messages=messages,
role="user",
start_frame=UserStartedSpeakingFrame,
end_frame=UserStoppedSpeakingFrame,
accumulator_frame=TranscriptionFrame,
interim_accumulator_frame=InterimTranscriptionFrame,
)
class LLMFullResponseAggregator(FrameProcessor):
"""This class aggregates Text frames until it receives a
LLMFullResponseEndFrame, then emits the concatenated text as
a single text frame.
given the following frames:
TextFrame("Hello,")
TextFrame(" world.")
TextFrame(" I am")
TextFrame(" an LLM.")
LLMFullResponseEndFrame()]
this processor will yield nothing for the first 4 frames, then
TextFrame("Hello, world. I am an LLM.")
LLMFullResponseEndFrame()
when passed the last frame.
>>> async def print_frames(aggregator, frame):
... async for frame in aggregator.process_frame(frame):
... if isinstance(frame, TextFrame):
... print(frame.text)
... else:
... print(frame.__class__.__name__)
>>> aggregator = LLMFullResponseAggregator()
>>> asyncio.run(print_frames(aggregator, TextFrame("Hello,")))
>>> asyncio.run(print_frames(aggregator, TextFrame(" world.")))
>>> asyncio.run(print_frames(aggregator, TextFrame(" I am")))
>>> asyncio.run(print_frames(aggregator, TextFrame(" an LLM.")))
>>> asyncio.run(print_frames(aggregator, LLMFullResponseEndFrame()))
Hello, world. I am an LLM.
LLMFullResponseEndFrame
"""
def __init__(self):
super().__init__()
self._aggregation = ""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, TextFrame):
self._aggregation += frame.text
elif isinstance(frame, LLMFullResponseEndFrame):
await self.push_frame(TextFrame(self._aggregation))
await self.push_frame(frame)
self._aggregation = ""
else:
await self.push_frame(frame, direction)
class LLMContextAggregator(LLMResponseAggregator):
def __init__(self, *, context: OpenAILLMContext, **kwargs):
def __init__(self, *, context: OpenAILLMContext, role: str, **kwargs):
super().__init__(**kwargs)
self._context = context
self._role = role
self._aggregation = ""
@property
def messages(self) -> List[dict]:
return self._context.get_messages()
@property
def role(self) -> str:
return self._role
@property
def context(self):
@@ -264,23 +174,25 @@ class LLMContextAggregator(LLMResponseAggregator):
def get_context_frame(self) -> OpenAILLMContextFrame:
return OpenAILLMContextFrame(context=self._context)
async def push_context_frame(self):
async def push_context_frame(self, direction: FrameDirection = FrameDirection.DOWNSTREAM):
frame = self.get_context_frame()
await self.push_frame(frame)
await self.push_frame(frame, direction)
# TODO-CB: Types
def _add_messages(self, messages):
def add_messages(self, messages):
self._context.add_messages(messages)
def _set_messages(self, messages):
def set_messages(self, messages):
self._context.set_messages(messages)
def _set_tools(self, tools: List):
def set_tools(self, tools: List):
self._context.set_tools(tools)
async def _push_aggregation(self):
def reset(self):
self._aggregation = ""
async def push_aggregation(self):
if len(self._aggregation) > 0:
self._context.add_message({"role": self._role, "content": self._aggregation})
self._context.add_message({"role": self.role, "content": self._aggregation})
# Reset the aggregation. Reset it before pushing it down, otherwise
# if the task gets cancelled we won't be able to clear things up.
@@ -290,31 +202,239 @@ class LLMContextAggregator(LLMResponseAggregator):
await self.push_frame(frame)
# Reset our accumulator state.
self._reset()
self.reset()
class LLMAssistantContextAggregator(LLMContextAggregator):
def __init__(self, context: OpenAILLMContext, *, expect_stripped_words: bool = True):
super().__init__(
messages=[],
context=context,
role="assistant",
start_frame=LLMFullResponseStartFrame,
end_frame=LLMFullResponseEndFrame,
accumulator_frame=TextFrame,
handle_interruptions=True,
expect_stripped_words=expect_stripped_words,
)
class LLMUserContextAggregator(LLMContextResponseAggregator):
"""This is a user LLM aggregator that uses an LLM context to store the
conversation. It aggregates transcriptions from the STT service and it has
logic to handle multiple scenarios where transcriptions are received between
VAD events (`UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame`) or
even outside them, or when there are no VAD events at all.
"""
def __init__(
self,
context: OpenAILLMContext,
aggregation_timeout: float = 1.0,
bot_interruption_timeout: float = 2.0,
**kwargs,
):
super().__init__(context=context, role="user", **kwargs)
self._aggregation_timeout = aggregation_timeout
self._bot_interruption_timeout = bot_interruption_timeout
self._seen_interim_results = False
self._user_speaking = False
self._last_user_speaking_time = 0
self._emulating_vad = False
self._aggregation_event = asyncio.Event()
self._aggregation_task = None
self.reset()
def reset(self):
super().reset()
self._seen_interim_results = False
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, StartFrame):
await self._start(frame)
await self.push_frame(frame, direction)
elif isinstance(frame, EndFrame):
await self._stop(frame)
await self.push_frame(frame, direction)
elif isinstance(frame, CancelFrame):
await self._cancel(frame)
await self.push_frame(frame, direction)
elif isinstance(frame, UserStartedSpeakingFrame):
await self._handle_user_started_speaking(frame)
await self.push_frame(frame, direction)
elif isinstance(frame, UserStoppedSpeakingFrame):
await self._handle_user_stopped_speaking(frame)
await self.push_frame(frame, direction)
elif isinstance(frame, TranscriptionFrame):
await self._handle_transcription(frame)
elif isinstance(frame, InterimTranscriptionFrame):
await self._handle_interim_transcription(frame)
elif isinstance(frame, LLMMessagesAppendFrame):
self.add_messages(frame.messages)
elif isinstance(frame, LLMMessagesUpdateFrame):
self.set_messages(frame.messages)
elif isinstance(frame, LLMSetToolsFrame):
self.set_tools(frame.tools)
else:
await self.push_frame(frame, direction)
async def _start(self, frame: StartFrame):
self._create_aggregation_task()
async def _stop(self, frame: EndFrame):
await self._cancel_aggregation_task()
async def _cancel(self, frame: CancelFrame):
await self._cancel_aggregation_task()
async def _handle_user_started_speaking(self, _: UserStartedSpeakingFrame):
self._last_user_speaking_time = time.time()
self._user_speaking = True
async def _handle_user_stopped_speaking(self, _: UserStoppedSpeakingFrame):
self._last_user_speaking_time = time.time()
self._user_speaking = False
if not self._seen_interim_results:
await self.push_aggregation()
async def _handle_transcription(self, frame: TranscriptionFrame):
self._aggregation += f" {frame.text}" if self._aggregation else frame.text
# We just got a final result, so let's reset interim results.
self._seen_interim_results = False
# Reset aggregation timer.
self._aggregation_event.set()
async def _handle_interim_transcription(self, _: InterimTranscriptionFrame):
self._seen_interim_results = True
# Reset aggregation timer.
self._aggregation_event.set()
def _create_aggregation_task(self):
self._aggregation_task = self.create_task(self._aggregation_task_handler())
async def _cancel_aggregation_task(self):
if self._aggregation_task:
await self.cancel_task(self._aggregation_task)
self._aggregation_task = None
async def _aggregation_task_handler(self):
while True:
try:
await asyncio.wait_for(self._aggregation_event.wait(), self._aggregation_timeout)
await self._maybe_push_bot_interruption()
except asyncio.TimeoutError:
if not self._user_speaking:
await self.push_aggregation()
# If we are emulating VAD we still need to send the user stopped
# speaking frame.
if self._emulating_vad:
await self.push_frame(
EmulateUserStoppedSpeakingFrame(), FrameDirection.UPSTREAM
)
self._emulating_vad = False
finally:
self._aggregation_event.clear()
async def _maybe_push_bot_interruption(self):
"""If the user stopped speaking a while back and we got a transcription
frame we might want to interrupt the bot.
"""
if not self._user_speaking:
diff_time = time.time() - self._last_user_speaking_time
if diff_time > self._bot_interruption_timeout:
# If we reach this case we received a transcription but VAD was
# not able to detect voice (e.g. when you whisper a short
# utterance). So, we need to emulate VAD (i.e. user
# start/stopped speaking).
await self.push_frame(EmulateUserStartedSpeakingFrame(), FrameDirection.UPSTREAM)
self._emulating_vad = True
# Reset time so we don't interrupt again right away.
self._last_user_speaking_time = time.time()
class LLMUserContextAggregator(LLMContextAggregator):
def __init__(self, context: OpenAILLMContext):
super().__init__(
messages=[],
context=context,
role="user",
start_frame=UserStartedSpeakingFrame,
end_frame=UserStoppedSpeakingFrame,
accumulator_frame=TranscriptionFrame,
interim_accumulator_frame=InterimTranscriptionFrame,
)
class LLMAssistantContextAggregator(LLMContextResponseAggregator):
"""This is an assistant LLM aggregator that uses an LLM context to store the
conversation. It aggregates text frames received between
`LLMFullResponseStartFrame` and `LLMFullResponseEndFrame`.
"""
def __init__(self, context: OpenAILLMContext, *, expect_stripped_words: bool = True, **kwargs):
super().__init__(context=context, role="assistant", **kwargs)
self._expect_stripped_words = expect_stripped_words
self._started = False
self.reset()
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, StartInterruptionFrame):
await self.push_aggregation()
# Reset anyways
self.reset()
await self.push_frame(frame, direction)
elif isinstance(frame, LLMFullResponseStartFrame):
await self._handle_llm_start(frame)
elif isinstance(frame, LLMFullResponseEndFrame):
await self._handle_llm_end(frame)
elif isinstance(frame, TextFrame):
await self._handle_text(frame)
elif isinstance(frame, LLMMessagesAppendFrame):
self.add_messages(frame.messages)
elif isinstance(frame, LLMMessagesUpdateFrame):
self.set_messages(frame.messages)
elif isinstance(frame, LLMSetToolsFrame):
self.set_tools(frame.tools)
else:
await self.push_frame(frame, direction)
async def _handle_llm_start(self, _: LLMFullResponseStartFrame):
self._started = True
async def _handle_llm_end(self, _: LLMFullResponseEndFrame):
self._started = False
await self.push_aggregation()
async def _handle_text(self, frame: TextFrame):
if not self._started:
return
if self._expect_stripped_words:
self._aggregation += f" {frame.text}" if self._aggregation else frame.text
else:
self._aggregation += frame.text
class LLMUserResponseAggregator(LLMUserContextAggregator):
def __init__(self, messages: List[dict] = [], **kwargs):
super().__init__(context=OpenAILLMContext(messages), **kwargs)
async def push_aggregation(self):
if len(self._aggregation) > 0:
self._context.add_message({"role": self.role, "content": self._aggregation})
# Reset the aggregation. Reset it before pushing it down, otherwise
# if the task gets cancelled we won't be able to clear things up.
self._aggregation = ""
frame = LLMMessagesFrame(self._context.messages)
await self.push_frame(frame)
# Reset our accumulator state.
self.reset()
class LLMAssistantResponseAggregator(LLMAssistantContextAggregator):
def __init__(self, messages: List[dict] = [], **kwargs):
super().__init__(context=OpenAILLMContext(messages), **kwargs)
async def push_aggregation(self):
if len(self._aggregation) > 0:
self._context.add_message({"role": self.role, "content": self._aggregation})
# Reset the aggregation. Reset it before pushing it down, otherwise
# if the task gets cancelled we won't be able to clear things up.
self._aggregation = ""
frame = LLMMessagesFrame(self._context.messages)
await self.push_frame(frame)
# Reset our accumulator state.
self.reset()
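
Reviewer note: the new `LLMUserContextAggregator` relies on a timer that is reset through an `asyncio.Event`. Every transcription sets the event, and the aggregation is only pushed once `asyncio.wait_for` times out with no new activity. The sketch below isolates that pattern so it can be read on its own. It is a simplified illustration, not the pipecat implementation: `DebouncedAggregator`, `add_text`, `run`, `pushed` and `demo` are hypothetical names, and the VAD emulation and bot-interruption logic from the diff are left out.

```python
import asyncio
from typing import List


class DebouncedAggregator:
    """Hypothetical stand-in for the timeout logic; not a pipecat class."""

    def __init__(self, timeout: float = 0.1):
        self._timeout = timeout
        self._event = asyncio.Event()
        self._aggregation = ""
        self.pushed: List[str] = []

    def add_text(self, text: str) -> None:
        # Join stripped words with a space, as the diff does for transcriptions.
        self._aggregation += f" {text}" if self._aggregation else text
        # Signal activity so the waiting task restarts its timer.
        self._event.set()

    async def run(self) -> None:
        while True:
            try:
                # Returns early when new text arrives; the loop then waits again.
                await asyncio.wait_for(self._event.wait(), self._timeout)
            except asyncio.TimeoutError:
                # No activity for `timeout` seconds: flush what we have.
                if self._aggregation:
                    self.pushed.append(self._aggregation)
                    self._aggregation = ""
            finally:
                self._event.clear()


async def demo() -> List[str]:
    agg = DebouncedAggregator(timeout=0.1)
    task = asyncio.create_task(agg.run())
    agg.add_text("Hello,")
    await asyncio.sleep(0.01)  # shorter than the timeout: same aggregation
    agg.add_text("world.")
    await asyncio.sleep(0.4)   # longer than the timeout: aggregation is pushed
    agg.add_text("Bye.")
    await asyncio.sleep(0.4)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return agg.pushed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Clearing the event in `finally` mirrors the handler in the diff: whether the wait ended because of activity or because of a timeout, the next iteration starts with a fresh timer.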

View File

@@ -4,131 +4,15 @@
# SPDX-License-Identifier: BSD 2-Clause License
#
from typing import Optional
from pipecat.frames.frames import (
Frame,
InterimTranscriptionFrame,
StartInterruptionFrame,
TextFrame,
TranscriptionFrame,
UserStartedSpeakingFrame,
UserStoppedSpeakingFrame,
)
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.frames.frames import TextFrame
from pipecat.processors.aggregators.llm_response import LLMUserResponseAggregator
class ResponseAggregator(FrameProcessor):
"""This frame processor aggregates frames between a start and an end frame
into complete text frame sentences.
class UserResponseAggregator(LLMUserResponseAggregator):
def __init__(self, **kwargs):
super().__init__(**kwargs)
For example, frame input/output:
UserStartedSpeakingFrame() -> None
TranscriptionFrame("Hello,") -> None
TranscriptionFrame(" world.") -> None
UserStoppedSpeakingFrame() -> TextFrame("Hello world.")
Doctest: FIXME to work with asyncio
>>> async def print_frames(aggregator, frame):
... async for frame in aggregator.process_frame(frame):
... if isinstance(frame, TextFrame):
... print(frame.text)
>>> aggregator = ResponseAggregator(start_frame = UserStartedSpeakingFrame,
... end_frame=UserStoppedSpeakingFrame,
... accumulator_frame=TranscriptionFrame,
... pass_through=False)
>>> asyncio.run(print_frames(aggregator, UserStartedSpeakingFrame()))
>>> asyncio.run(print_frames(aggregator, TranscriptionFrame("Hello,", 1, 1)))
>>> asyncio.run(print_frames(aggregator, TranscriptionFrame("world.", 1, 2)))
>>> asyncio.run(print_frames(aggregator, UserStoppedSpeakingFrame()))
Hello, world.
"""
def __init__(
self,
*,
start_frame,
end_frame,
accumulator_frame: TextFrame,
interim_accumulator_frame: Optional[TextFrame] = None,
):
super().__init__()
self._start_frame = start_frame
self._end_frame = end_frame
self._accumulator_frame = accumulator_frame
self._interim_accumulator_frame = interim_accumulator_frame
# Reset our accumulator state.
self._reset()
#
# Frame processor
#
# Use cases implemented:
#
# S: Start, E: End, T: Transcription, I: Interim, X: Text
#
# S E -> None
# S T E -> X
# S I T E -> X
# S I E T -> X
# S I E I T -> X
# S E T -> X
# S E I T -> X
#
# The following case would not be supported:
#
# S I E T1 I T2 -> X
#
# and T2 would be dropped.
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
send_aggregation = False
if isinstance(frame, self._start_frame):
self._aggregating = True
self._seen_start_frame = True
self._seen_end_frame = False
self._seen_interim_results = False
await self.push_frame(frame, direction)
elif isinstance(frame, self._end_frame):
self._seen_end_frame = True
self._seen_start_frame = False
# We might have received the end frame but we might still be
# aggregating (i.e. we have seen interim results but not the final
# text).
self._aggregating = self._seen_interim_results or len(self._aggregation) == 0
# Send the aggregation if we are not aggregating anymore (i.e. no
# more interim results received).
send_aggregation = not self._aggregating
await self.push_frame(frame, direction)
elif isinstance(frame, self._accumulator_frame):
if self._aggregating:
self._aggregation += f" {frame.text}"
# We have received a complete sentence, so if we have seen the
# end frame and we were still aggregating, it means we should
# send the aggregation.
send_aggregation = self._seen_end_frame
# We just got our final result, so let's reset interim results.
self._seen_interim_results = False
elif self._interim_accumulator_frame and isinstance(frame, self._interim_accumulator_frame):
self._seen_interim_results = True
else:
await self.push_frame(frame, direction)
if send_aggregation:
await self._push_aggregation()
async def _push_aggregation(self):
async def push_aggregation(self):
if len(self._aggregation) > 0:
frame = TextFrame(self._aggregation.strip())
@@ -139,21 +23,4 @@ class ResponseAggregator(FrameProcessor):
await self.push_frame(frame)
# Reset our accumulator state.
self._reset()
def _reset(self):
self._aggregation = ""
self._aggregating = False
self._seen_start_frame = False
self._seen_end_frame = False
self._seen_interim_results = False
class UserResponseAggregator(ResponseAggregator):
def __init__(self):
super().__init__(
start_frame=UserStartedSpeakingFrame,
end_frame=UserStoppedSpeakingFrame,
accumulator_frame=TranscriptionFrame,
interim_accumulator_frame=InterimTranscriptionFrame,
)
self.reset()
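The "Use cases implemented" comment in the removed `ResponseAggregator` is easier to follow when the state machine is run by hand. The sketch below is a standalone model under stated assumptions: frames are replaced by single-letter event kinds (`S`, `E`, `T`, `I`), the class and the `run` helper are hypothetical names, and pushing frames downstream is replaced by appending to a list. It only mirrors the flag handling visible in the diff.

```python
class AggregatorSketch:
    """Standalone model of the removed start/end/transcription state machine."""

    def __init__(self):
        self.output = []  # completed sentences, in emission order
        self._reset()

    def _reset(self):
        self.aggregation = ""
        self.aggregating = False
        self.seen_end = False
        self.seen_interim = False

    def feed(self, kind: str, text: str = ""):
        send = False
        if kind == "S":  # start frame
            self.aggregating = True
            self.seen_end = False
            self.seen_interim = False
        elif kind == "E":  # end frame
            self.seen_end = True
            # Keep aggregating if a final transcription is still expected.
            self.aggregating = self.seen_interim or len(self.aggregation) == 0
            send = not self.aggregating
        elif kind == "T":  # final transcription
            if self.aggregating:
                self.aggregation += f" {text}"
                send = self.seen_end
            self.seen_interim = False
        elif kind == "I":  # interim transcription
            self.seen_interim = True
        if send and self.aggregation:
            self.output.append(self.aggregation.strip())
            self._reset()


def run(events):
    """Feed (kind, text) pairs and return the emitted sentences."""
    agg = AggregatorSketch()
    for kind, text in events:
        agg.feed(kind, text)
    return agg.output


print(run([("S", ""), ("I", "He"), ("E", ""), ("T", "Hello.")]))
```

Feeding the unsupported sequence `S I E T1 I T2` into this model emits only `T1`, which matches the note in the comment that `T2` would be dropped.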


@@ -32,6 +32,10 @@ class AudioBufferProcessor(FrameProcessor):
in the case of stereo the left channel will be used for the user's audio and
the right channel for the bot.
Most of the time, user audio will be a continuous stream but it's possible
that in some cases only the spoken audio is sent. To accommodate those
cases make sure to set `user_continuous_stream` accordingly.
"""
def __init__(
@@ -40,6 +44,7 @@ class AudioBufferProcessor(FrameProcessor):
sample_rate: Optional[int] = None,
num_channels: int = 1,
buffer_size: int = 0,
user_continuous_stream: bool = True,
**kwargs,
):
super().__init__(**kwargs)
@@ -47,10 +52,12 @@ class AudioBufferProcessor(FrameProcessor):
self._sample_rate = 0
self._num_channels = num_channels
self._buffer_size = buffer_size
self._user_continuous_stream = user_continuous_stream
self._user_audio_buffer = bytearray()
self._bot_audio_buffer = bytearray()
# Intermittent (non-continuous) user stream variables
self._last_user_frame_at = 0
self._last_bot_frame_at = 0
@@ -98,7 +105,40 @@ class AudioBufferProcessor(FrameProcessor):
if isinstance(frame, StartFrame):
self._update_sample_rate(frame)
if self._recording and isinstance(frame, InputAudioRawFrame):
if self._recording:
if self._user_continuous_stream:
await self._handle_continuous_stream(frame)
else:
await self._handle_intermittent_stream(frame)
if self._buffer_size > 0 and len(self._user_audio_buffer) > self._buffer_size:
await self._call_on_audio_data_handler()
if isinstance(frame, (CancelFrame, EndFrame)):
await self.stop_recording()
await self.push_frame(frame, direction)
def _update_sample_rate(self, frame: StartFrame):
self._sample_rate = self._init_sample_rate or frame.audio_out_sample_rate
async def _handle_continuous_stream(self, frame: Frame):
if isinstance(frame, InputAudioRawFrame):
# Add user audio.
resampled = await self._resample_audio(frame)
self._user_audio_buffer.extend(resampled)
# Sync the bot's buffer to the user's buffer by adding silence if needed
if len(self._user_audio_buffer) > len(self._bot_audio_buffer):
silence_size = len(self._user_audio_buffer) - len(self._bot_audio_buffer)
silence = b"\x00" * silence_size
self._bot_audio_buffer.extend(silence)
elif self._recording and isinstance(frame, OutputAudioRawFrame):
# Add bot audio.
resampled = await self._resample_audio(frame)
self._bot_audio_buffer.extend(resampled)
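The buffer synchronisation in `_handle_continuous_stream` comes down to padding the shorter buffer with zero bytes. A minimal sketch of that step, assuming plain `bytearray` buffers; `pad_bot_buffer` is a hypothetical name and resampling is left out:

```python
def pad_bot_buffer(user_buf: bytearray, bot_buf: bytearray) -> int:
    """Extend bot_buf with silence until it is as long as user_buf.

    Returns the number of silence bytes that were added.
    """
    missing = len(user_buf) - len(bot_buf)
    if missing > 0:
        bot_buf.extend(b"\x00" * missing)
        return missing
    return 0


user = bytearray(b"\x01" * 8)
bot = bytearray(b"\x02" * 3)
added = pad_bot_buffer(user, bot)
print(added, len(bot))
```

The padding only ever grows the bot buffer, so bot audio that arrives later is appended after the silence and stays aligned with the user's timeline.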
async def _handle_intermittent_stream(self, frame: Frame):
if isinstance(frame, InputAudioRawFrame):
# Add silence if we need to.
silence = self._compute_silence(self._last_user_frame_at)
self._user_audio_buffer.extend(silence)
@@ -117,17 +157,6 @@ class AudioBufferProcessor(FrameProcessor):
# Save time of frame so we can compute silence.
self._last_bot_frame_at = time.time()
if self._buffer_size > 0 and len(self._user_audio_buffer) > self._buffer_size:
await self._call_on_audio_data_handler()
if isinstance(frame, (CancelFrame, EndFrame)):
await self.stop_recording()
await self.push_frame(frame, direction)
def _update_sample_rate(self, frame: StartFrame):
self._sample_rate = self._init_sample_rate or frame.audio_out_sample_rate
async def _call_on_audio_data_handler(self):
if not self.has_audio() or not self._recording:
return
@@ -155,10 +184,9 @@ class AudioBufferProcessor(FrameProcessor):
def _compute_silence(self, from_time: float) -> bytes:
quiet_time = time.time() - from_time
# We should get audio frames very frequently. We pick 100ms because
# that's big enough, but it could be even a bit slower since we usually
# do 20ms audio frames.
if from_time == 0 or quiet_time < 0.1:
# We should get audio frames very frequently. We introduce silence only
# if there's a big enough gap of 1s.
if from_time == 0 or quiet_time < 1.0:
return b""
num_bytes = int(quiet_time * self._sample_rate) * 2
silence = b"\x00" * num_bytes
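The arithmetic in `_compute_silence` can be checked with fixed timestamps. The sketch below is an illustration, not the library function: it takes the current time as a parameter (the original calls `time.time()`), and it assumes the `* 2` stands for two bytes per sample, that is 16-bit mono audio. `compute_silence` and `min_gap_s` are hypothetical names.

```python
def compute_silence(
    from_time: float, now: float, sample_rate: int, min_gap_s: float = 1.0
) -> bytes:
    """Return zero-filled audio covering the gap between from_time and now."""
    quiet_time = now - from_time
    # No previous frame yet, or the gap is too short to count as silence.
    if from_time == 0 or quiet_time < min_gap_s:
        return b""
    # Assumption: 2 bytes per sample (16-bit), one channel.
    num_bytes = int(quiet_time * sample_rate) * 2
    return b"\x00" * num_bytes


# A 1.5 s gap at 16 kHz: 24000 samples, 2 bytes each.
print(len(compute_silence(10.0, 11.5, 16000)))
```

With the threshold raised from 100 ms to 1 s in this diff, short pauses between frames no longer insert extra silence.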


@@ -23,6 +23,7 @@ from pipecat.frames.frames import (
Frame,
FunctionCallInProgressFrame,
FunctionCallResultFrame,
StartFrame,
StartInterruptionFrame,
StopInterruptionFrame,
STTMuteFrame,
@@ -37,16 +38,18 @@ class STTMuteStrategy(Enum):
"""Strategies determining when STT should be muted.
Attributes:
FIRST_SPEECH: Mute only during first bot speech
FIRST_SPEECH: Mute only during first detected bot speech
MUTE_UNTIL_FIRST_BOT_COMPLETE: Start muted and remain muted until first bot speech completes
FUNCTION_CALL: Mute during function calls
ALWAYS: Mute during all bot speech
CUSTOM: Allow custom logic via callback
"""
FIRST_SPEECH = "first_speech" # Mute only during first bot speech
FUNCTION_CALL = "function_call" # Mute during function calls
ALWAYS = "always" # Mute during all bot speech
CUSTOM = "custom" # Allow custom logic via callback
FIRST_SPEECH = "first_speech"
MUTE_UNTIL_FIRST_BOT_COMPLETE = "mute_until_first_bot_complete"
FUNCTION_CALL = "function_call"
ALWAYS = "always"
CUSTOM = "custom"
@dataclass
@@ -57,12 +60,25 @@ class STTMuteConfig:
strategies: Set of muting strategies to apply
should_mute_callback: Optional callback for custom muting logic.
Only required when using STTMuteStrategy.CUSTOM
Note:
MUTE_UNTIL_FIRST_BOT_COMPLETE and FIRST_SPEECH strategies should not be used together
as they handle the first bot speech differently.
"""
strategies: set[STTMuteStrategy]
# Optional callback for custom muting logic
should_mute_callback: Optional[Callable[["STTMuteFilter"], Awaitable[bool]]] = None
def __post_init__(self):
if (
STTMuteStrategy.MUTE_UNTIL_FIRST_BOT_COMPLETE in self.strategies
and STTMuteStrategy.FIRST_SPEECH in self.strategies
):
raise ValueError(
"MUTE_UNTIL_FIRST_BOT_COMPLETE and FIRST_SPEECH strategies should not be used together"
)
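The `__post_init__` check above is a general dataclass pattern for rejecting conflicting options at construction time. A minimal, self-contained sketch follows; `MuteStrategy` and `MuteConfig` are simplified stand-ins for illustration, not the pipecat classes.

```python
from dataclasses import dataclass
from enum import Enum


class MuteStrategy(Enum):
    FIRST_SPEECH = "first_speech"
    MUTE_UNTIL_FIRST_BOT_COMPLETE = "mute_until_first_bot_complete"
    ALWAYS = "always"


@dataclass
class MuteConfig:
    strategies: set

    def __post_init__(self):
        # Both strategies handle the first bot speech, so they cannot be combined.
        conflicting = {
            MuteStrategy.FIRST_SPEECH,
            MuteStrategy.MUTE_UNTIL_FIRST_BOT_COMPLETE,
        }
        if conflicting <= self.strategies:
            raise ValueError(
                "MUTE_UNTIL_FIRST_BOT_COMPLETE and FIRST_SPEECH should not be used together"
            )


config = MuteConfig(strategies={MuteStrategy.ALWAYS, MuteStrategy.FIRST_SPEECH})
print(sorted(s.value for s in config.strategies))
```

Because the check runs in `__post_init__`, an invalid combination fails when the config object is created rather than later inside the pipeline.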
class STTMuteFilter(FrameProcessor):
"""A processor that handles STT muting and interruption control.
@@ -71,28 +87,40 @@ class STTMuteFilter(FrameProcessor):
feature. When STT is muted, interruptions are automatically disabled.
Args:
stt_service: Service handling speech-to-text functionality
config: Configuration specifying muting strategies
stt_service: STT service instance (deprecated, will be removed in future version)
**kwargs: Additional arguments passed to parent class
"""
def __init__(self, stt_service: STTService, config: STTMuteConfig, **kwargs):
def __init__(
self, *, config: STTMuteConfig, stt_service: Optional[STTService] = None, **kwargs
):
super().__init__(**kwargs)
self._stt_service = stt_service
self._config = config
if stt_service is not None:
import warnings
warnings.warn(
"The stt_service parameter is deprecated and will be removed in a future version. "
"STTMuteFilter now manages mute state internally.",
DeprecationWarning,
stacklevel=2,
)
self._first_speech_handled = False
self._bot_is_speaking = False
self._function_call_in_progress = False
self._is_muted = False # Initialize as unmuted, will set state on StartFrame if needed
@property
def is_muted(self) -> bool:
"""Returns whether STT is currently muted."""
return self._stt_service.is_muted
return self._is_muted
async def _handle_mute_state(self, should_mute: bool):
"""Handles both STT muting and interruption control."""
if should_mute != self.is_muted:
logger.debug(f"STT {'muting' if should_mute else 'unmuting'}")
self._is_muted = should_mute
await self.push_frame(STTMuteFrame(mute=should_mute))
async def _should_mute(self) -> bool:
@@ -112,6 +140,10 @@ class STTMuteFilter(FrameProcessor):
self._first_speech_handled = True
return True
case STTMuteStrategy.MUTE_UNTIL_FIRST_BOT_COMPLETE:
if not self._first_speech_handled:
return True
case STTMuteStrategy.CUSTOM:
if self._bot_is_speaking and self._config.should_mute_callback:
should_mute = await self._config.should_mute_callback(self)
@@ -121,25 +153,31 @@ class STTMuteFilter(FrameProcessor):
return False
async def process_frame(self, frame: Frame, direction: FrameDirection):
"""Processes incoming frames and manages muting state."""
await super().process_frame(frame, direction)
"""Processes incoming frames and manages muting state."""
# Handle function call state changes
if isinstance(frame, FunctionCallInProgressFrame):
# Determine if we need to change mute state based on frame type
should_mute = None
# Process frames to determine mute state
if isinstance(frame, StartFrame):
should_mute = await self._should_mute()
elif isinstance(frame, FunctionCallInProgressFrame):
self._function_call_in_progress = True
await self._handle_mute_state(await self._should_mute())
should_mute = await self._should_mute()
elif isinstance(frame, FunctionCallResultFrame):
self._function_call_in_progress = False
await self._handle_mute_state(await self._should_mute())
# Handle bot speaking state changes
should_mute = await self._should_mute()
elif isinstance(frame, BotStartedSpeakingFrame):
self._bot_is_speaking = True
await self._handle_mute_state(await self._should_mute())
should_mute = await self._should_mute()
elif isinstance(frame, BotStoppedSpeakingFrame):
self._bot_is_speaking = False
await self._handle_mute_state(await self._should_mute())
if not self._first_speech_handled:
self._first_speech_handled = True
should_mute = await self._should_mute()
# Handle frame propagation
# Then push the original frame
if isinstance(
frame,
(
@@ -157,3 +195,7 @@ class STTMuteFilter(FrameProcessor):
else:
# Pass all other frames through
await self.push_frame(frame, direction)
# Finally handle mute state change if needed
if should_mute is not None and should_mute != self.is_muted:
await self._handle_mute_state(should_mute)
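The reworked `process_frame` first decides whether the mute state should change, then forwards the original frame, and only afterwards emits the mute frame. The ordering can be sketched with a small synchronous model. Everything here is hypothetical (`MuteTracker`, string frame names), and the model forwards every frame, whereas the real filter suppresses some frames while muted.

```python
class MuteTracker:
    """Toy model: record the order in which frames would be emitted."""

    def __init__(self):
        self.is_muted = False
        self.emitted = []

    def handle(self, frame_name: str, should_mute=None):
        # 1. Forward the original frame first.
        self.emitted.append(frame_name)
        # 2. Emit a mute frame only when the state actually changes.
        if should_mute is not None and should_mute != self.is_muted:
            self.is_muted = should_mute
            self.emitted.append(f"STTMuteFrame(mute={should_mute})")


tracker = MuteTracker()
tracker.handle("BotStartedSpeakingFrame", True)
tracker.handle("FunctionCallInProgressFrame", True)  # already muted, no new frame
tracker.handle("BotStoppedSpeakingFrame", False)
print(tracker.emitted)
```

Using `None` for "no decision" keeps frames that are irrelevant to muting from touching the state at all.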


@@ -73,10 +73,11 @@ class FrameProcessor:
self._metrics.set_processor_name(self.name)
# Processors have an input queue. The input queue will be processed
# immediately (default) or it will block if `pause_processing_frames()` is
# called. To resume processing frames we need to call
# `resume_processing_frames()`.
# immediately (default) or it will block if `pause_processing_frames()`
# is called. To resume processing frames we need to call
# `resume_processing_frames()` which will wake up the event.
self.__should_block_frames = False
self.__input_event = asyncio.Event()
self.__input_frame_task: Optional[asyncio.Task] = None
# Every processor in Pipecat should only output frames from a single
@@ -335,8 +336,8 @@ class FrameProcessor:
def __create_input_task(self):
if not self.__input_frame_task:
self.__should_block_frames = False
self.__input_event.clear()
self.__input_queue = asyncio.Queue()
self.__input_event = asyncio.Event()
self.__input_frame_task = self.create_task(self.__input_frame_task_handler())
async def __cancel_input_task(self):
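The hunks above create the `asyncio.Event` once in `__init__` and call `clear()` when the input task is (re)created, instead of allocating a new event each time. How a blocking flag and a reusable event can pause and resume a queue consumer is sketched below. This is an assumption-laden illustration: `PausableWorker` and its methods are hypothetical and do not reproduce the actual `FrameProcessor` task handler.

```python
import asyncio


class PausableWorker:
    def __init__(self):
        self.queue = asyncio.Queue()
        self.processed = []
        self._should_block = False
        self._resume_event = asyncio.Event()  # created once, reused via clear()

    def pause(self):
        self._should_block = True

    def resume(self):
        self._resume_event.set()

    async def run(self, count: int):
        for _ in range(count):
            if self._should_block:
                # Block until resume() sets the event, then re-arm it.
                await self._resume_event.wait()
                self._resume_event.clear()
                self._should_block = False
            self.processed.append(await self.queue.get())


async def demo():
    worker = PausableWorker()
    for item in range(3):
        worker.queue.put_nowait(item)
    worker.pause()
    task = asyncio.create_task(worker.run(3))
    await asyncio.sleep(0.01)  # give the worker a chance to run
    seen_while_paused = list(worker.processed)
    worker.resume()
    await task
    return seen_while_paused, worker.processed


print(asyncio.run(demo()))
```

Reusing one event avoids a subtle hazard of the old approach: a waiter holding a reference to a replaced event would never be woken by `set()` on the new one.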


@@ -5,6 +5,7 @@
#
import asyncio
import base64
from dataclasses import dataclass
from typing import (
Any,
@@ -31,6 +32,7 @@ from pipecat.frames.frames import (
ErrorFrame,
Frame,
FunctionCallResultFrame,
InputAudioRawFrame,
InterimTranscriptionFrame,
LLMFullResponseEndFrame,
LLMFullResponseStartFrame,
@@ -58,7 +60,9 @@ from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContextFrame,
)
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.google.frames import LLMSearchOrigin, LLMSearchResponseFrame
from pipecat.transports.base_input import BaseInputTransport
from pipecat.transports.base_output import BaseOutputTransport
from pipecat.transports.base_transport import BaseTransport
from pipecat.utils.string import match_endofsentence
RTVI_PROTOCOL_VERSION = "0.3.0"
@@ -296,12 +300,6 @@ class RTVITextMessageData(BaseModel):
text: str
class RTVISearchResponseMessageData(BaseModel):
search_result: Optional[str]
rendered_content: Optional[str]
origins: List[LLMSearchOrigin]
class RTVIBotTranscriptionMessage(BaseModel):
label: RTVIMessageLiteral = RTVI_MESSAGE_LABEL
type: Literal["bot-transcription"] = "bot-transcription"
@@ -314,12 +312,6 @@ class RTVIBotLLMTextMessage(BaseModel):
data: RTVITextMessageData
class RTVIBotLLMSearchResponseMessage(BaseModel):
label: Literal["rtvi-ai"] = "rtvi-ai"
type: Literal["bot-llm-search-response"] = "bot-llm-search-response"
data: RTVISearchResponseMessageData
class RTVIBotTTSTextMessage(BaseModel):
label: RTVIMessageLiteral = RTVI_MESSAGE_LABEL
type: Literal["bot-tts-text"] = "bot-tts-text"
@@ -397,6 +389,15 @@ class RTVISpeakingProcessor(RTVIFrameProcessor):
def __init__(self, **kwargs):
super().__init__(**kwargs)
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"'RTVISpeakingProcessor' is deprecated, use an 'RTVIObserver' instead.",
DeprecationWarning,
)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
@@ -432,6 +433,15 @@ class RTVIUserTranscriptionProcessor(RTVIFrameProcessor):
def __init__(self, **kwargs):
super().__init__(**kwargs)
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"'RTVIUserTranscriptionProcessor' is deprecated, use an 'RTVIObserver' instead.",
DeprecationWarning,
)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
@@ -463,6 +473,15 @@ class RTVIUserLLMTextProcessor(RTVIFrameProcessor):
def __init__(self, **kwargs):
super().__init__(**kwargs)
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"'RTVIUserLLMTextProcessor' is deprecated, use an 'RTVIObserver' instead.",
DeprecationWarning,
)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
@@ -490,6 +509,15 @@ class RTVIBotTranscriptionProcessor(RTVIFrameProcessor):
super().__init__()
self._aggregation = ""
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"'RTVIBotTranscriptionProcessor' is deprecated, use an 'RTVIObserver' instead.",
DeprecationWarning,
)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
@@ -513,6 +541,15 @@ class RTVIBotLLMProcessor(RTVIFrameProcessor):
def __init__(self, **kwargs):
super().__init__(**kwargs)
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"'RTVIBotLLMProcessor' is deprecated, use an 'RTVIObserver' instead.",
DeprecationWarning,
)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
@@ -531,6 +568,15 @@ class RTVIBotTTSProcessor(RTVIFrameProcessor):
def __init__(self, **kwargs):
super().__init__(**kwargs)
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"'RTVIBotTTSProcessor' is deprecated, use an 'RTVIObserver' instead.",
DeprecationWarning,
)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
@@ -549,6 +595,15 @@ class RTVIMetricsProcessor(RTVIFrameProcessor):
def __init__(self, **kwargs):
super().__init__(**kwargs)
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"'RTVIMetricsProcessor' is deprecated, use an 'RTVIObserver' instead.",
DeprecationWarning,
)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
@@ -582,10 +637,18 @@ class RTVIMetricsProcessor(RTVIFrameProcessor):
class RTVIObserver(BaseObserver):
"""This is a pipeline frame observer that is used to send RTVI server
messages to clients. The observer does not handle incoming RTVI client
messages, which is done by the RTVIProcessor.
"""Pipeline frame observer for RTVI server message handling.
This observer monitors pipeline frames and converts them into appropriate RTVI messages
for client communication. It handles various frame types including speech events,
transcriptions, LLM responses, and TTS events.
Note:
This observer only handles outgoing messages. Incoming RTVI client messages
are handled by the RTVIProcessor.
Args:
rtvi (FrameProcessor): The RTVI processor to push frames to.
"""
def __init__(self, rtvi: FrameProcessor):
@@ -602,10 +665,22 @@ class RTVIObserver(BaseObserver):
direction: FrameDirection,
timestamp: int,
):
"""Process a frame being pushed through the pipeline.
Args:
src: Source processor pushing the frame
dst: Destination processor receiving the frame
frame: The frame being pushed
direction: Direction of frame flow in pipeline
timestamp: Time when frame was pushed
"""
# If we have already seen this frame, let's skip it.
if frame.id in self._frames_seen:
return
self._frames_seen.add(frame.id)
# This tells whether the frame is already processed. If false, we will try
# again the next time we see the frame.
mark_as_seen = True
if isinstance(frame, (UserStartedSpeakingFrame, UserStoppedSpeakingFrame)):
await self._handle_interruptions(frame)
@@ -618,24 +693,34 @@ class RTVIObserver(BaseObserver):
elif isinstance(frame, UserStartedSpeakingFrame):
await self._push_bot_transcription()
elif isinstance(frame, LLMFullResponseStartFrame):
await self._push_transport_message_urgent(RTVIBotLLMStartedMessage())
await self.push_transport_message_urgent(RTVIBotLLMStartedMessage())
elif isinstance(frame, LLMFullResponseEndFrame):
await self._push_transport_message_urgent(RTVIBotLLMStoppedMessage())
await self.push_transport_message_urgent(RTVIBotLLMStoppedMessage())
elif isinstance(frame, LLMTextFrame):
await self._handle_llm_text_frame(frame)
elif isinstance(frame, LLMSearchResponseFrame):
await self._handle_llm_search_response_frame(frame)
elif isinstance(frame, TTSStartedFrame):
await self._push_transport_message_urgent(RTVIBotTTSStartedMessage())
await self.push_transport_message_urgent(RTVIBotTTSStartedMessage())
elif isinstance(frame, TTSStoppedFrame):
await self._push_transport_message_urgent(RTVIBotTTSStoppedMessage())
await self.push_transport_message_urgent(RTVIBotTTSStoppedMessage())
elif isinstance(frame, TTSTextFrame):
message = RTVIBotTTSTextMessage(data=RTVITextMessageData(text=frame.text))
await self._push_transport_message_urgent(message)
if isinstance(src, BaseOutputTransport):
message = RTVIBotTTSTextMessage(data=RTVITextMessageData(text=frame.text))
await self.push_transport_message_urgent(message)
else:
mark_as_seen = False
elif isinstance(frame, MetricsFrame):
await self._handle_metrics(frame)
async def _push_transport_message_urgent(self, model: BaseModel, exclude_none: bool = True):
if mark_as_seen:
self._frames_seen.add(frame.id)
async def push_transport_message_urgent(self, model: BaseModel, exclude_none: bool = True):
"""Push an urgent transport message to the RTVI processor.
Args:
model: The message model to send
exclude_none: Whether to exclude None values from the model dump
"""
frame = TransportMessageUrgentFrame(message=model.model_dump(exclude_none=exclude_none))
await self._rtvi.push_frame(frame)
@@ -644,7 +729,7 @@ class RTVIObserver(BaseObserver):
message = RTVIBotTranscriptionMessage(
data=RTVITextMessageData(text=self._bot_transcription)
)
await self._push_transport_message_urgent(message)
await self.push_transport_message_urgent(message)
self._bot_transcription = ""
async def _handle_interruptions(self, frame: Frame):
@@ -655,7 +740,7 @@ class RTVIObserver(BaseObserver):
message = RTVIUserStoppedSpeakingMessage()
if message:
await self._push_transport_message_urgent(message)
await self.push_transport_message_urgent(message)
async def _handle_bot_speaking(self, frame: Frame):
message = None
@@ -665,26 +750,16 @@ class RTVIObserver(BaseObserver):
message = RTVIBotStoppedSpeakingMessage()
if message:
await self._push_transport_message_urgent(message)
await self.push_transport_message_urgent(message)
async def _handle_llm_text_frame(self, frame: LLMTextFrame):
message = RTVIBotLLMTextMessage(data=RTVITextMessageData(text=frame.text))
await self._push_transport_message_urgent(message)
await self.push_transport_message_urgent(message)
self._bot_transcription += frame.text
if match_endofsentence(self._bot_transcription):
await self._push_bot_transcription()
async def _handle_llm_search_response_frame(self, frame: LLMSearchResponseFrame):
message = RTVIBotLLMSearchResponseMessage(
data=RTVISearchResponseMessageData(
search_result=frame.search_result,
origins=frame.origins,
rendered_content=frame.rendered_content,
)
)
await self._push_transport_message_urgent(message)
async def _handle_user_transcriptions(self, frame: Frame):
message = None
if isinstance(frame, TranscriptionFrame):
@@ -701,7 +776,7 @@ class RTVIObserver(BaseObserver):
)
if message:
await self._push_transport_message_urgent(message)
await self.push_transport_message_urgent(message)
async def _handle_context(self, frame: OpenAILLMContextFrame):
try:
@@ -715,7 +790,7 @@ class RTVIObserver(BaseObserver):
else:
text = content
rtvi_message = RTVIUserLLMTextMessage(data=RTVITextMessageData(text=text))
await self._push_transport_message_urgent(rtvi_message)
await self.push_transport_message_urgent(rtvi_message)
except TypeError as e:
logger.warning(f"Caught an error while trying to handle context: {e}")
@@ -740,7 +815,7 @@ class RTVIObserver(BaseObserver):
metrics["characters"].append(d.model_dump(exclude_none=True))
message = RTVIMetricsMessage(data=metrics)
await self._push_transport_message_urgent(message)
await self.push_transport_message_urgent(message)
class RTVIProcessor(FrameProcessor):
@@ -748,6 +823,7 @@ class RTVIProcessor(FrameProcessor):
self,
*,
config: RTVIConfig = RTVIConfig(config=[]),
transport: Optional[BaseTransport] = None,
**kwargs,
):
super().__init__(**kwargs)
@@ -773,7 +849,24 @@ class RTVIProcessor(FrameProcessor):
self._register_event_handler("on_bot_started")
self._register_event_handler("on_client_ready")
self._input_transport = None
self._transport = transport
if self._transport:
input_transport = self._transport.input()
if isinstance(input_transport, BaseInputTransport):
self._input_transport = input_transport
self._input_transport.enable_audio_in_stream_on_start(False)
def observer(self) -> RTVIObserver:
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"'RTVI.observer()' is deprecated, instantiate an 'RTVIObserver' directly instead.",
DeprecationWarning,
)
return RTVIObserver(self)
def register_action(self, action: RTVIAction):
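The deprecation blocks added throughout this file share one pattern: a temporary `"always"` filter inside `warnings.catch_warnings()`. Python's default filters ignore `DeprecationWarning` unless it is triggered from `__main__`, so the local filter makes the message visible without changing global filter state. A minimal sketch, with `old_api` as a hypothetical function and `stacklevel=2` added here so the warning points at the caller (the RTVI hunks do not pass it):

```python
import warnings


def old_api():
    """Hypothetical deprecated function that always announces its deprecation."""
    with warnings.catch_warnings():
        # Temporarily show the warning every time, whatever the global filters say.
        warnings.simplefilter("always")
        warnings.warn(
            "'old_api' is deprecated, use 'new_api' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    return "still works"


# Capture the warning instead of printing it, to show that it fires.
with warnings.catch_warnings(record=True) as caught:
    result = old_api()

print(result, [str(w.message) for w in caught])
```

The filter change is undone when the inner `with` block exits, so callers' own warning configuration is left untouched.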
@@ -933,6 +1026,8 @@ class RTVIProcessor(FrameProcessor):
case "llm-function-call-result":
data = RTVILLMFunctionCallResultData.model_validate(message.data)
await self._handle_function_call_result(data)
case "raw-audio" | "raw-audio-batch":
await self._handle_audio_buffer(message.data)
case _:
await self._send_error_response(message.id, f"Unsupported type {message.type}")
@@ -945,9 +1040,34 @@ class RTVIProcessor(FrameProcessor):
logger.warning(f"Exception processing message: {e}")
async def _handle_client_ready(self, request_id: str):
logger.debug("Received client-ready")
if self._input_transport:
self._input_transport.start_audio_in_streaming()
self._client_ready_id = request_id
await self.set_client_ready()
async def _handle_audio_buffer(self, data):
if not self._input_transport:
return
# Extract audio batch ensuring it's a list
audio_list = data.get("base64AudioBatch") or [data.get("base64Audio")]
try:
for base64_audio in filter(None, audio_list): # Filter out None values
pcm_bytes = base64.b64decode(base64_audio)
frame = InputAudioRawFrame(
audio=pcm_bytes,
sample_rate=data["sampleRate"],
num_channels=data["numChannels"],
)
await self._input_transport.push_audio_frame(frame)
except (KeyError, TypeError, ValueError) as e:
# Handle missing keys, decoding errors, and invalid types
logger.error(f"Error processing audio buffer: {e}")
async def _handle_describe_config(self, request_id: str):
services = list(self._registered_services.values())
message = RTVIDescribeConfig(id=request_id, data=RTVIDescribeConfigData(config=services))
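The batch extraction in `_handle_audio_buffer` accepts either a single `base64Audio` value or a `base64AudioBatch` list. The decoding step alone can be sketched as follows; `decode_audio_batch` is a hypothetical helper, and frame construction and the transport push are omitted.

```python
import base64


def decode_audio_batch(data: dict) -> list[bytes]:
    """Decode every base64 audio chunk found in an RTVI-style payload."""
    # Prefer the batch; fall back to a one-element list with the single value.
    audio_list = data.get("base64AudioBatch") or [data.get("base64Audio")]
    # filter(None, ...) drops missing or empty entries before decoding.
    return [base64.b64decode(item) for item in filter(None, audio_list)]


single = {"base64Audio": base64.b64encode(b"\x01\x02").decode()}
batch = {"base64AudioBatch": [base64.b64encode(b"a").decode(), base64.b64encode(b"b").decode()]}
print(decode_audio_batch(single), decode_audio_batch(batch), decode_audio_batch({}))
```

Invalid base64 input raises `binascii.Error`, which is a subclass of `ValueError`, so an `except (KeyError, TypeError, ValueError)` clause like the one in the diff also covers decoding failures.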


@@ -15,6 +15,7 @@ from loguru import logger
from pipecat.audio.utils import calculate_audio_volume, exp_smoothing
from pipecat.frames.frames import (
AudioRawFrame,
BotStoppedSpeakingFrame,
CancelFrame,
EndFrame,
ErrorFrame,
@@ -75,13 +76,13 @@ class AIService(FrameProcessor):
)
for key, value in settings.items():
print("Update request for:", key, value)
logger.debug(f"Update request for: {key} {value}")
if key in self._settings:
logger.info(f"Updating LLM setting {key} to: [{value}]")
self._settings[key] = value
elif key in SessionProperties.model_fields:
print("Attempting to update", key, value)
logger.debug(f"Attempting to update {key} {value}")
try:
from pipecat.services.openai_realtime_beta.events import (
@@ -212,6 +213,8 @@ class TTSService(AIService):
push_silence_after_stop: bool = False,
# if push_silence_after_stop is True, send this amount of audio silence
silence_time_s: float = 2.0,
# if True, we will pause processing frames while we are receiving audio
pause_frame_processing: bool = False,
# TTS output sample rate
sample_rate: Optional[int] = None,
text_filter: Optional[BaseTextFilter] = None,
@@ -224,6 +227,7 @@ class TTSService(AIService):
self._stop_frame_timeout_s: float = stop_frame_timeout_s
self._push_silence_after_stop: bool = push_silence_after_stop
self._silence_time_s: float = silence_time_s
self._pause_frame_processing: bool = pause_frame_processing
self._init_sample_rate = sample_rate
self._sample_rate = 0
self._voice_id: str = ""
@@ -234,6 +238,7 @@ class TTSService(AIService):
self._stop_frame_queue: asyncio.Queue = asyncio.Queue()
self._current_sentence: str = ""
self._processing_text: bool = False
@property
def sample_rate(self) -> int:
@@ -299,6 +304,7 @@ class TTSService(AIService):
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if (
isinstance(frame, TextFrame)
and not isinstance(frame, InterimTranscriptionFrame)
@@ -307,9 +313,16 @@ class TTSService(AIService):
await self._process_text_frame(frame)
elif isinstance(frame, StartInterruptionFrame):
await self._handle_interruption(frame, direction)
await self.push_frame(frame, direction)
elif isinstance(frame, (LLMFullResponseEndFrame, EndFrame)):
# We pause processing incoming frames if the LLM response included
# text (it might be that it's only a function calling response). We
# pause to avoid audio overlapping.
await self._maybe_pause_frame_processing()
sentence = self._current_sentence
self._current_sentence = ""
self._processing_text = False
await self._push_tts_frames(sentence)
if isinstance(frame, LLMFullResponseEndFrame):
if self._push_text_frames:
@@ -318,9 +331,16 @@ class TTSService(AIService):
await self.push_frame(frame, direction)
elif isinstance(frame, TTSSpeakFrame):
await self._push_tts_frames(frame.text)
# We pause processing incoming frames because we are sending data to
# the TTS. We pause to avoid audio overlapping.
await self._maybe_pause_frame_processing()
await self.flush_audio()
self._processing_text = False
elif isinstance(frame, TTSUpdateSettingsFrame):
await self._update_settings(frame.settings)
elif isinstance(frame, BotStoppedSpeakingFrame):
await self._maybe_resume_frame_processing()
await self.push_frame(frame, direction)
else:
await self.push_frame(frame, direction)
@@ -347,9 +367,17 @@ class TTSService(AIService):
async def _handle_interruption(self, frame: StartInterruptionFrame, direction: FrameDirection):
self._current_sentence = ""
self._processing_text = False
if self._text_filter:
self._text_filter.handle_interruption()
await self.push_frame(frame, direction)
async def _maybe_pause_frame_processing(self):
if self._processing_text and self._pause_frame_processing:
await self.pause_processing_frames()
async def _maybe_resume_frame_processing(self):
if self._pause_frame_processing:
await self.resume_processing_frames()
async def _process_text_frame(self, frame: TextFrame):
text: Optional[str] = None
@@ -371,6 +399,11 @@ class TTSService(AIService):
if not text.strip():
return
# This is just a flag that indicates whether we sent something to the TTS
# service. It is cleared after we send text because of a TTSSpeakFrame
# or when we receive an LLMFullResponseEndFrame.
self._processing_text = True
await self.start_processing_metrics()
if self._text_filter:
self._text_filter.reset_interruption()
@@ -419,7 +452,7 @@ class WordTTSService(TTSService):
async def start(self, frame: StartFrame):
await super().start(frame)
await self._create_words_task()
self._create_words_task()
async def stop(self, frame: EndFrame):
await super().stop(frame)
@@ -439,7 +472,7 @@ class WordTTSService(TTSService):
await super()._handle_interruption(frame, direction)
self.reset_word_timestamps()
async def _create_words_task(self):
def _create_words_task(self):
self._words_task = self.create_task(self._words_task_handler())
async def _stop_words_task(self):
@@ -469,6 +502,115 @@ class WordTTSService(TTSService):
self._words_queue.task_done()
class AudioContextWordTTSService(WordTTSService):
"""This service allows us to send multiple TTS requests to the TTS service.
Each request can be multiple sentences long, and the sentences are grouped by
context. For this to work, the TTS service needs to support handling multiple
requests at once (i.e. multiple simultaneous contexts).
The audio received from the TTS will be played in context order. That is, if
we requested audio for a context "A" and then audio for context "B", the
audio from context ID "A" will be played first.
"""
def __init__(self, **kwargs):
super().__init__(**kwargs)
self._contexts_queue = asyncio.Queue()
self._contexts: Dict[str, asyncio.Queue] = {}
self._audio_context_task = None
async def create_audio_context(self, context_id: str):
"""Create a new audio context."""
await self._contexts_queue.put(context_id)
self._contexts[context_id] = asyncio.Queue()
logger.trace(f"{self} created audio context {context_id}")
async def append_to_audio_context(self, context_id: str, frame: TTSAudioRawFrame):
"""Append audio to an existing context."""
if self.audio_context_available(context_id):
logger.trace(f"{self} appending audio {frame} to audio context {context_id}")
await self._contexts[context_id].put(frame)
else:
logger.warning(f"{self} unable to append audio to context {context_id}")
async def remove_audio_context(self, context_id: str):
"""Remove an existing audio context."""
if self.audio_context_available(context_id):
# We just mark the audio context for deletion by appending
# None. Once we reach None while handling audio we know we can
# safely remove the context.
logger.trace(f"{self} marking audio context {context_id} for deletion")
await self._contexts[context_id].put(None)
else:
logger.warning(f"{self} unable to remove context {context_id}")
def audio_context_available(self, context_id: str) -> bool:
"""Checks whether the given audio context is registered."""
return context_id in self._contexts
async def start(self, frame: StartFrame):
await super().start(frame)
self._create_audio_context_task()
async def stop(self, frame: EndFrame):
await super().stop(frame)
await self._stop_audio_context_task()
async def cancel(self, frame: CancelFrame):
await super().cancel(frame)
await self._stop_audio_context_task()
async def _handle_interruption(self, frame: StartInterruptionFrame, direction: FrameDirection):
await super()._handle_interruption(frame, direction)
await self._stop_audio_context_task()
self._create_audio_context_task()
def _create_audio_context_task(self):
self._contexts_queue = asyncio.Queue()
self._contexts: Dict[str, asyncio.Queue] = {}
self._audio_context_task = self.create_task(self._audio_context_task_handler())
async def _stop_audio_context_task(self):
if self._audio_context_task:
await self.cancel_task(self._audio_context_task)
self._audio_context_task = None
async def _audio_context_task_handler(self):
"""In this task we process audio contexts in order."""
while True:
context_id = await self._contexts_queue.get()
# Process the audio context until the context doesn't have more
# audio available (i.e. we find None).
await self._handle_audio_context(context_id)
# We just finished processing the context, so we can safely remove it.
del self._contexts[context_id]
self._contexts_queue.task_done()
# Append some silence between sentences.
silence = b"\x00" * self.sample_rate
frame = TTSAudioRawFrame(audio=silence, sample_rate=self.sample_rate, num_channels=1)
await self.push_frame(frame)
async def _handle_audio_context(self, context_id: str):
# If we don't receive any audio during this time, we consider the context finished.
AUDIO_CONTEXT_TIMEOUT = 3.0
queue = self._contexts[context_id]
running = True
while running:
try:
frame = await asyncio.wait_for(queue.get(), timeout=AUDIO_CONTEXT_TIMEOUT)
if frame:
await self.push_frame(frame)
running = frame is not None
except asyncio.TimeoutError:
# We didn't get audio, so let's consider this context finished.
logger.trace(f"{self} time out on audio context {context_id}")
break
class STTService(AIService):
"""STTService is a base class for speech-to-text services."""
@@ -525,9 +667,13 @@ class STTService(AIService):
else:
logger.warning(f"Unknown setting for STT service: {key}")
async def process_audio_frame(self, frame: AudioRawFrame):
if not self._muted:
await self.process_generator(self.run_stt(frame.audio))
async def process_audio_frame(self, frame: AudioRawFrame, direction: FrameDirection):
if self._muted:
return
await self.process_generator(self.run_stt(frame.audio))
if self._audio_passthrough:
await self.push_frame(frame, direction)
async def process_frame(self, frame: Frame, direction: FrameDirection):
"""Processes a frame of audio data, either buffering or transcribing it."""
@@ -537,9 +683,7 @@ class STTService(AIService):
# In this service we accumulate audio internally and at the end we
# push a TextFrame. We also push audio downstream in case someone
# else needs it.
await self.process_audio_frame(frame)
if self._audio_passthrough:
await self.push_frame(frame, direction)
await self.process_audio_frame(frame, direction)
elif isinstance(frame, STTUpdateSettingsFrame):
await self._update_settings(frame.settings)
elif isinstance(frame, STTMuteFrame):
@@ -575,7 +719,7 @@ class SegmentedSTTService(STTService):
self._smoothing_factor = 0.2
self._prev_volume = 0
async def process_audio_frame(self, frame: AudioRawFrame):
async def process_audio_frame(self, frame: AudioRawFrame, direction: FrameDirection):
# Try to filter out empty background noise
volume = self._get_smoothed_volume(frame)
if volume >= self._min_volume:
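
The ordering guarantee described for `AudioContextWordTTSService` (audio for context "A" plays before audio for context "B", even if "B" arrives first) can be illustrated outside pipecat. The following is a minimal sketch using only `asyncio`: raw `bytes` stand in for `TTSAudioRawFrame`, and `OrderedAudioContexts` with its `create`, `append`, `finish`, and `drain` methods are hypothetical names chosen for illustration, not pipecat API.

```python
import asyncio
from typing import Dict, List, Optional


class OrderedAudioContexts:
    """Toy bookkeeping: one queue of context IDs (creation order) plus one
    audio queue per context. Hypothetical, not the pipecat class."""

    def __init__(self) -> None:
        self._order: asyncio.Queue = asyncio.Queue()
        self._contexts: Dict[str, asyncio.Queue] = {}
        self.played: List[bytes] = []

    async def create(self, context_id: str) -> None:
        await self._order.put(context_id)
        self._contexts[context_id] = asyncio.Queue()

    async def append(self, context_id: str, chunk: bytes) -> None:
        if context_id in self._contexts:
            await self._contexts[context_id].put(chunk)

    async def finish(self, context_id: str) -> None:
        # None marks the end of a context, mirroring the approach in the diff.
        if context_id in self._contexts:
            await self._contexts[context_id].put(None)

    async def drain(self, count: int, timeout: float = 3.0) -> None:
        """Play `count` contexts strictly in creation order."""
        for _ in range(count):
            context_id = await self._order.get()
            queue = self._contexts[context_id]
            while True:
                try:
                    chunk: Optional[bytes] = await asyncio.wait_for(queue.get(), timeout)
                except asyncio.TimeoutError:
                    break  # No audio within `timeout` seconds: treat as finished.
                if chunk is None:
                    break
                self.played.append(chunk)
            del self._contexts[context_id]


async def demo() -> List[bytes]:
    contexts = OrderedAudioContexts()
    await contexts.create("A")
    await contexts.create("B")
    # Audio for "B" arrives before audio for "A".
    await contexts.append("B", b"b1")
    await contexts.append("A", b"a1")
    await contexts.append("A", b"a2")
    await contexts.finish("B")
    await contexts.finish("A")
    await contexts.drain(count=2)
    return contexts.played


print(asyncio.run(demo()))
```

Because the consumer always takes the next context ID from the creation-order queue, chunks that arrive early for a later context simply wait in that context's own queue until the earlier contexts are finished.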


@@ -126,9 +126,11 @@ class AnthropicLLMService(LLMService):
def create_context_aggregator(
context: OpenAILLMContext, *, assistant_expect_stripped_words: bool = True
) -> AnthropicContextAggregatorPair:
if isinstance(context, OpenAILLMContext):
context = AnthropicLLMContext.from_openai_context(context)
user = AnthropicUserContextAggregator(context)
assistant = AnthropicAssistantContextAggregator(
user, expect_stripped_words=assistant_expect_stripped_words
context, expect_stripped_words=assistant_expect_stripped_words
)
return AnthropicContextAggregatorPair(_user=user, _assistant=assistant)
@@ -651,11 +653,8 @@ class AnthropicLLMContext(OpenAILLMContext):
class AnthropicUserContextAggregator(LLMUserContextAggregator):
def __init__(self, context: OpenAILLMContext | AnthropicLLMContext):
super().__init__(context=context)
if isinstance(context, OpenAILLMContext):
self._context = AnthropicLLMContext.from_openai_context(context)
def __init__(self, context: OpenAILLMContext | AnthropicLLMContext, **kwargs):
super().__init__(context=context, **kwargs)
async def process_frame(self, frame, direction):
await super().process_frame(frame, direction)
@@ -703,9 +702,8 @@ class AnthropicUserContextAggregator(LLMUserContextAggregator):
class AnthropicAssistantContextAggregator(LLMAssistantContextAggregator):
def __init__(self, user_context_aggregator: AnthropicUserContextAggregator, **kwargs):
super().__init__(context=user_context_aggregator._context, **kwargs)
self._user_context_aggregator = user_context_aggregator
def __init__(self, context: OpenAILLMContext | AnthropicLLMContext, **kwargs):
super().__init__(context=context, **kwargs)
self._function_call_in_progress = None
self._function_call_result = None
self._pending_image_frame_message = None
@@ -725,7 +723,7 @@ class AnthropicAssistantContextAggregator(LLMAssistantContextAggregator):
):
self._function_call_in_progress = None
self._function_call_result = frame
await self._push_aggregation()
await self.push_aggregation()
else:
logger.warning(
"FunctionCallResultFrame tool_call_id != InProgressFrame tool_call_id"
@@ -734,9 +732,9 @@ class AnthropicAssistantContextAggregator(LLMAssistantContextAggregator):
self._function_call_result = None
elif isinstance(frame, AnthropicImageMessageFrame):
self._pending_image_frame_message = frame
await self._push_aggregation()
await self.push_aggregation()
async def _push_aggregation(self):
async def push_aggregation(self):
if not (
self._aggregation or self._function_call_result or self._pending_image_frame_message
):
@@ -746,7 +744,7 @@ class AnthropicAssistantContextAggregator(LLMAssistantContextAggregator):
properties: Optional[FunctionCallResultProperties] = None
aggregation = self._aggregation
self._reset()
self.reset()
try:
if self._function_call_result:
@@ -799,15 +797,14 @@ class AnthropicAssistantContextAggregator(LLMAssistantContextAggregator):
run_llm = True
if run_llm:
await self._user_context_aggregator.push_context_frame()
await self.push_context_frame(FrameDirection.UPSTREAM)
# Emit the on_context_updated callback once the function call result is added to the context
if properties and properties.on_context_updated is not None:
await properties.on_context_updated()
# Push context frame
frame = OpenAILLMContextFrame(self._context)
await self.push_frame(frame)
await self.push_context_frame()
# Push timestamp frame with current time
timestamp_frame = OpenAILLMContextAssistantTimestampFrame(timestamp=time_now_iso8601())


@@ -577,35 +577,43 @@ class AzureTTSService(AzureBaseTTSService):
logger.debug(f"Generating TTS: [{text}]")
try:
await self.start_ttfb_metrics()
yield TTSStartedFrame()
if self._speech_synthesizer is None:
error_msg = "Speech synthesizer not initialized."
logger.error(error_msg)
yield ErrorFrame(error_msg)
return
ssml = self._construct_ssml(text)
try:
await self.start_ttfb_metrics()
yield TTSStartedFrame()
# Start synthesis
self._speech_synthesizer.speak_ssml_async(ssml)
ssml = self._construct_ssml(text)
self._speech_synthesizer.speak_ssml_async(ssml)
await self.start_tts_usage_metrics(text)
await self.start_tts_usage_metrics(text)
# Stream audio chunks as they arrive
while True:
chunk = await self._audio_queue.get()
if chunk is None: # End of stream
break
# Stream audio chunks as they arrive
while True:
chunk = await self._audio_queue.get()
if chunk is None: # End of stream
break
await self.stop_ttfb_metrics()
yield TTSAudioRawFrame(
audio=chunk,
sample_rate=self.sample_rate,
num_channels=1,
)
await self.stop_ttfb_metrics()
yield TTSStoppedFrame()
yield TTSAudioRawFrame(
audio=chunk,
sample_rate=self.sample_rate,
num_channels=1,
)
yield TTSStoppedFrame()
except Exception as e:
logger.error(f"{self} error during synthesis: {e}")
yield TTSStoppedFrame()
# Could add reconnection logic here if needed
return
except Exception as e:
logger.error(f"{self} error generating TTS: {e}")
yield ErrorFrame(f"{self} error: {str(e)}")
logger.error(f"{self} exception: {e}")
class AzureHttpTTSService(AzureBaseTTSService):


@@ -0,0 +1,183 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
from typing import AsyncGenerator, Optional
from loguru import logger
from pipecat.frames.frames import ErrorFrame, Frame, TranscriptionFrame
from pipecat.services.ai_services import SegmentedSTTService
from pipecat.transcriptions.language import Language
from pipecat.utils.time import time_now_iso8601
try:
from openai import AsyncOpenAI
from openai.types.audio import Transcription
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
"In order to use OpenAI, you need to `pip install pipecat-ai[openai]`. Also, set `OPENAI_API_KEY` environment variable."
)
raise Exception(f"Missing module: {e}")
def language_to_whisper_language(language: Language) -> Optional[str]:
"""Language support for Whisper API.
Docs: https://platform.openai.com/docs/guides/speech-to-text#supported-languages
"""
BASE_LANGUAGES = {
Language.AF: "af",
Language.AR: "ar",
Language.HY: "hy",
Language.AZ: "az",
Language.BE: "be",
Language.BS: "bs",
Language.BG: "bg",
Language.CA: "ca",
Language.ZH: "zh",
Language.HR: "hr",
Language.CS: "cs",
Language.DA: "da",
Language.NL: "nl",
Language.EN: "en",
Language.ET: "et",
Language.FI: "fi",
Language.FR: "fr",
Language.GL: "gl",
Language.DE: "de",
Language.EL: "el",
Language.HE: "he",
Language.HI: "hi",
Language.HU: "hu",
Language.IS: "is",
Language.ID: "id",
Language.IT: "it",
Language.JA: "ja",
Language.KN: "kn",
Language.KK: "kk",
Language.KO: "ko",
Language.LV: "lv",
Language.LT: "lt",
Language.MK: "mk",
Language.MS: "ms",
Language.MR: "mr",
Language.MI: "mi",
Language.NE: "ne",
Language.NO: "no",
Language.FA: "fa",
Language.PL: "pl",
Language.PT: "pt",
Language.RO: "ro",
Language.RU: "ru",
Language.SR: "sr",
Language.SK: "sk",
Language.SL: "sl",
Language.ES: "es",
Language.SW: "sw",
Language.SV: "sv",
Language.TL: "tl",
Language.TA: "ta",
Language.TH: "th",
Language.TR: "tr",
Language.UK: "uk",
Language.UR: "ur",
Language.VI: "vi",
Language.CY: "cy",
}
result = BASE_LANGUAGES.get(language)
# If not found in base languages, try to find the base language from a variant
if not result:
lang_str = str(language.value)
base_code = lang_str.split("-")[0].lower()
result = base_code if base_code in BASE_LANGUAGES.values() else None
return result
class BaseWhisperSTTService(SegmentedSTTService):
"""Base class for Whisper-based speech-to-text services.
Provides common functionality for services implementing the Whisper API interface,
including metrics generation and error handling.
Args:
model: Name of the Whisper model to use.
api_key: Service API key. Defaults to None.
base_url: Service API base URL. Defaults to None.
language: Language of the audio input. Defaults to English.
prompt: Optional text to guide the model's style or continue a previous segment.
temperature: Sampling temperature between 0 and 1. Defaults to None.
**kwargs: Additional arguments passed to SegmentedSTTService.
"""
def __init__(
self,
*,
model: str,
api_key: Optional[str] = None,
base_url: Optional[str] = None,
language: Optional[Language] = Language.EN,
prompt: Optional[str] = None,
temperature: Optional[float] = None,
**kwargs,
):
super().__init__(**kwargs)
self.set_model_name(model)
self._client = self._create_client(api_key, base_url)
self._language = self.language_to_service_language(language or Language.EN)
self._prompt = prompt
self._temperature = temperature
def _create_client(self, api_key: Optional[str], base_url: Optional[str]):
return AsyncOpenAI(api_key=api_key, base_url=base_url)
async def set_model(self, model: str):
self.set_model_name(model)
def can_generate_metrics(self) -> bool:
return True
def language_to_service_language(self, language: Language) -> Optional[str]:
return language_to_whisper_language(language)
async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
try:
await self.start_processing_metrics()
await self.start_ttfb_metrics()
response = await self._transcribe(audio)
await self.stop_ttfb_metrics()
await self.stop_processing_metrics()
text = response.text.strip()
if text:
logger.debug(f"Transcription: [{text}]")
yield TranscriptionFrame(text, "", time_now_iso8601())
else:
logger.warning("Received empty transcription from API")
except Exception as e:
logger.exception(f"Exception during transcription: {e}")
yield ErrorFrame(f"Error during transcription: {str(e)}")
async def _transcribe(self, audio: bytes) -> Transcription:
"""Transcribe audio data to text.
Args:
audio: Raw audio data in WAV format.
Returns:
Transcription: Object containing the transcribed text.
Raises:
NotImplementedError: Must be implemented by subclasses.
"""
raise NotImplementedError
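
The fallback in `language_to_whisper_language` (use the exact mapping if present, otherwise strip the region suffix and accept the base code only if it is a known Whisper language) can be sketched in isolation. This is a reduced illustration, assuming a small hypothetical `Lang` enum and a three-entry table in place of pipecat's `Language` enum and the full mapping above.

```python
from enum import Enum
from typing import Optional


class Lang(Enum):
    # Hypothetical stand-in for pipecat's Language enum.
    EN = "en"
    EN_US = "en-US"
    FR = "fr"
    FR_CA = "fr-CA"
    YUE = "yue"


# Reduced table for illustration; the real one lists every supported language.
BASE_LANGUAGES = {Lang.EN: "en", Lang.FR: "fr"}


def to_whisper_language(language: Lang) -> Optional[str]:
    result = BASE_LANGUAGES.get(language)
    if not result:
        # Regional variant: keep only the part before the first hyphen.
        base_code = str(language.value).split("-")[0].lower()
        result = base_code if base_code in BASE_LANGUAGES.values() else None
    return result


for lang in (Lang.EN, Lang.EN_US, Lang.FR_CA, Lang.YUE):
    print(lang.name, to_whisper_language(lang))
```

In this sketch `EN_US` and `FR_CA` resolve to their base codes, while `YUE` returns `None` because its base code is not in the table.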


@@ -27,7 +27,7 @@ from pipecat.frames.frames import (
TTSStoppedFrame,
)
from pipecat.processors.frame_processor import FrameDirection
from pipecat.services.ai_services import TTSService, WordTTSService
from pipecat.services.ai_services import AudioContextWordTTSService, TTSService
from pipecat.services.websocket_service import WebsocketService
from pipecat.transcriptions.language import Language
@@ -75,7 +75,7 @@ def language_to_cartesia_language(language: Language) -> Optional[str]:
return result
class CartesiaTTSService(WordTTSService, WebsocketService):
class CartesiaTTSService(AudioContextWordTTSService, WebsocketService):
class InputParams(BaseModel):
language: Optional[Language] = Language.EN
speed: Optional[Union[str, float]] = ""
@@ -105,10 +105,11 @@ class CartesiaTTSService(WordTTSService, WebsocketService):
# if we're interrupted. Cartesia gives us word-by-word timestamps. We
# can use those to generate text frames ourselves aligned with the
# playout timing of the audio!
WordTTSService.__init__(
AudioContextWordTTSService.__init__(
self,
aggregate_sentences=True,
push_text_frames=False,
pause_frame_processing=True,
sample_rate=sample_rate,
**kwargs,
)
@@ -191,12 +192,12 @@ class CartesiaTTSService(WordTTSService, WebsocketService):
self._receive_task = self.create_task(self._receive_task_handler(self.push_error))
async def _disconnect(self):
await self._disconnect_websocket()
if self._receive_task:
await self.cancel_task(self._receive_task)
self._receive_task = None
await self._disconnect_websocket()
async def _connect_websocket(self):
try:
logger.debug("Connecting to Cartesia")
@@ -239,21 +240,19 @@ class CartesiaTTSService(WordTTSService, WebsocketService):
logger.trace(f"{self}: flushing audio")
msg = self._build_msg(text="", continue_transcript=False)
await self._websocket.send(msg)
self._context_id = None
async def _receive_messages(self):
async for message in self._get_websocket():
msg = json.loads(message)
if not msg or msg["context_id"] != self._context_id:
if not msg or not self.audio_context_available(msg["context_id"]):
continue
if msg["type"] == "done":
await self.stop_ttfb_metrics()
# Unset _context_id but not the _context_id_start_timestamp
# because we are likely still playing out audio and need the
# timestamp to set send context frames.
self._context_id = None
await self.add_word_timestamps(
[("TTSStoppedFrame", 0), ("LLMFullResponseEndFrame", 0), ("Reset", 0)]
)
await self.remove_audio_context(msg["context_id"])
elif msg["type"] == "timestamps":
await self.add_word_timestamps(
list(zip(msg["word_timestamps"]["words"], msg["word_timestamps"]["start"]))
@@ -266,28 +265,16 @@ class CartesiaTTSService(WordTTSService, WebsocketService):
sample_rate=self.sample_rate,
num_channels=1,
)
await self.push_frame(frame)
await self.append_to_audio_context(msg["context_id"], frame)
elif msg["type"] == "error":
logger.error(f"{self} error: {msg}")
await self.push_frame(TTSStoppedFrame())
await self.stop_all_metrics()
await self.push_error(ErrorFrame(f"{self} error: {msg['error']}"))
self._context_id = None
else:
logger.error(f"{self} error, unknown message type: {msg}")
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
# If we received a TTSSpeakFrame and the LLM response included text (it
# might be that it's only a function calling response) we pause
# processing more frames until we receive a BotStoppedSpeakingFrame.
if isinstance(frame, TTSSpeakFrame):
await self.pause_processing_frames()
elif isinstance(frame, LLMFullResponseEndFrame) and self._context_id:
await self.pause_processing_frames()
elif isinstance(frame, BotStoppedSpeakingFrame):
await self.resume_processing_frames()
async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
logger.debug(f"Generating TTS: [{text}]")
@@ -299,6 +286,7 @@ class CartesiaTTSService(WordTTSService, WebsocketService):
await self.start_ttfb_metrics()
yield TTSStartedFrame()
self._context_id = str(uuid.uuid4())
await self.create_audio_context(self._context_id)
msg = self._build_msg(text=text or " ") # Text must contain at least one character


@@ -192,6 +192,7 @@ class ElevenLabsTTSService(WordTTSService, WebsocketService):
push_text_frames=False,
push_stop_frames=True,
stop_frame_timeout_s=2.0,
pause_frame_processing=True,
sample_rate=sample_rate,
**kwargs,
)
@@ -289,19 +290,6 @@ class ElevenLabsTTSService(WordTTSService, WebsocketService):
if isinstance(frame, TTSStoppedFrame):
await self.add_word_timestamps([("LLMFullResponseEndFrame", 0), ("Reset", 0)])
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
# If we received a TTSSpeakFrame and the LLM response included text (it
# might be that it's only a function calling response) we pause
# processing more frames until we receive a BotStoppedSpeakingFrame.
if isinstance(frame, TTSSpeakFrame):
await self.pause_processing_frames()
elif isinstance(frame, LLMFullResponseEndFrame) and self._started:
await self.pause_processing_frames()
elif isinstance(frame, BotStoppedSpeakingFrame):
await self.resume_processing_frames()
async def _connect(self):
await self._connect_websocket()
@@ -370,8 +358,13 @@ class ElevenLabsTTSService(WordTTSService, WebsocketService):
except Exception as e:
logger.error(f"{self} error closing websocket: {e}")
def _get_websocket(self):
if self._websocket:
return self._websocket
raise Exception("Websocket not connected")
async def _receive_messages(self):
async for message in self._websocket:
async for message in self._get_websocket():
msg = json.loads(message)
if msg.get("audio"):
await self.stop_ttfb_metrics()
@@ -388,7 +381,11 @@ class ElevenLabsTTSService(WordTTSService, WebsocketService):
async def _keepalive_task_handler(self):
while True:
await asyncio.sleep(10)
await self._send_text("")
try:
await self._send_text("")
except websockets.ConnectionClosed as e:
logger.warning(f"{self} keepalive error: {e}")
break
async def _send_text(self, text: str):
if self._websocket:


@@ -4,6 +4,7 @@
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import io
import os
from typing import AsyncGenerator, Dict, Optional, Union
@@ -53,6 +54,11 @@ class FalImageGenService(ImageGenService):
os.environ["FAL_KEY"] = key
async def run_image_gen(self, prompt: str) -> AsyncGenerator[Frame, None]:
def load_image_bytes(encoded_image: bytes):
buffer = io.BytesIO(encoded_image)
image = Image.open(buffer)
return (image.tobytes(), image.size, image.format)
logger.debug(f"Generating image from prompt: {prompt}")
response = await fal_client.run_async(
@@ -73,10 +79,8 @@ class FalImageGenService(ImageGenService):
logger.debug(f"Downloading image {image_url} ...")
async with self._aiohttp_session.get(image_url) as response:
logger.debug(f"Downloaded image {image_url}")
image_stream = io.BytesIO(await response.content.read())
image = Image.open(image_stream)
encoded_image = await response.content.read()
(image_bytes, size, format) = await asyncio.to_thread(load_image_bytes, encoded_image)
frame = URLImageRawFrame(
url=image_url, image=image.tobytes(), size=image.size, format=image.format
)
frame = URLImageRawFrame(url=image_url, image=image_bytes, size=size, format=format)
yield frame
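
The change above moves the blocking image decode into a worker thread with `asyncio.to_thread` so the event loop is not stalled. A minimal sketch of the same pattern follows, assuming Python 3.9 or later; `zlib.decompress` is used as a stand-in for the PIL decode, and `decode_payload` is a hypothetical helper name.

```python
import asyncio
import zlib
from typing import Tuple


def decode_payload(encoded: bytes) -> Tuple[bytes, int]:
    # Blocking work, standing in for Image.open(...) and image.tobytes().
    raw = zlib.decompress(encoded)
    return raw, len(raw)


async def main() -> Tuple[bytes, int]:
    encoded = zlib.compress(b"\x00" * 1_000_000)
    # decode_payload runs in a worker thread; the event loop remains free
    # to schedule other tasks while this coroutine awaits the result.
    return await asyncio.to_thread(decode_payload, encoded)


raw, size = asyncio.run(main())
print(size)
```

Returning plain values (bytes and a size) from the helper, as the diff does with `(image_bytes, size, format)`, keeps the object that was built in the worker thread from being shared across threads.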


@@ -60,7 +60,7 @@ class FishAudioTTSService(TTSService, WebsocketService):
params: InputParams = InputParams(),
**kwargs,
):
super().__init__(sample_rate=sample_rate, **kwargs)
super().__init__(pause_frame_processing=True, sample_rate=sample_rate, **kwargs)
self._api_key = api_key
self._base_url = "wss://api.fish.audio/v1/tts/live"
@@ -166,16 +166,6 @@ class FishAudioTTSService(TTSService, WebsocketService):
except Exception as e:
logger.error(f"Error processing message: {e}")
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, TTSSpeakFrame):
await self.pause_processing_frames()
elif isinstance(frame, LLMFullResponseEndFrame) and self._request_id:
await self.pause_processing_frames()
elif isinstance(frame, BotStoppedSpeakingFrame):
await self.resume_processing_frames()
async def _handle_interruption(self, frame: StartInterruptionFrame, direction: FrameDirection):
await super()._handle_interruption(frame, direction)
await self.stop_all_metrics()
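
The per-service `process_frame` overrides removed here (and in the Cartesia and ElevenLabs diffs) are replaced by the `pause_frame_processing` option handled in the base `TTSService`. The gating logic can be sketched as below. This is a simplified illustration: `PausingTTS`, its `paused` attribute, and the `on_*` method names are hypothetical and only mirror the `_processing_text` and `_pause_frame_processing` flags shown earlier in the diff.

```python
import asyncio
from typing import Tuple


class PausingTTS:
    """Hypothetical miniature of the pause/resume gating, not the pipecat class."""

    def __init__(self, pause_frame_processing: bool = False) -> None:
        self._pause_frame_processing = pause_frame_processing
        self._processing_text = False
        self.paused = False

    async def on_text(self, text: str) -> None:
        if text.strip():
            self._processing_text = True  # Something was sent to the TTS.

    async def on_response_end(self) -> None:
        # Pause only if the option is enabled and text was actually sent
        # (a function-call-only response sends no text).
        if self._processing_text and self._pause_frame_processing:
            self.paused = True
        self._processing_text = False

    async def on_bot_stopped_speaking(self) -> None:
        if self._pause_frame_processing:
            self.paused = False


async def scenario(enabled: bool, text: str) -> Tuple[bool, bool]:
    tts = PausingTTS(pause_frame_processing=enabled)
    await tts.on_text(text)
    await tts.on_response_end()
    paused_after_end = tts.paused
    await tts.on_bot_stopped_speaking()
    return paused_after_end, tts.paused


print(asyncio.run(scenario(True, "Hello there.")))
print(asyncio.run(scenario(True, "")))
print(asyncio.run(scenario(False, "Hello there.")))
```

Each tuple reports whether the service was paused after the response ended and whether it is still paused after the bot stopped speaking.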


@@ -115,10 +115,10 @@ class GeminiMultimodalLiveUserContextAggregator(OpenAIUserContextAggregator):
class GeminiMultimodalLiveAssistantContextAggregator(OpenAIAssistantContextAggregator):
async def _push_aggregation(self):
async def push_aggregation(self):
# We don't want to store any images in the context. Revisit this later when the API evolves.
self._pending_image_frame_message = None
await super()._push_aggregation()
await super().push_aggregation()
@dataclass
@@ -706,6 +706,6 @@ class GeminiMultimodalLiveLLMService(LLMService):
GeminiMultimodalLiveContext.upgrade(context)
user = GeminiMultimodalLiveUserContextAggregator(context)
assistant = GeminiMultimodalLiveAssistantContextAggregator(
user, expect_stripped_words=assistant_expect_stripped_words
context, expect_stripped_words=assistant_expect_stripped_words
)
return GeminiMultimodalLiveContextAggregatorPair(_user=user, _assistant=assistant)


@@ -1,2 +1,3 @@
from .frames import LLMSearchResponseFrame
from .google import *
from .rtvi import *


@@ -8,24 +8,35 @@ import asyncio
import base64
import io
import json
import os
import time
# Suppress gRPC fork warnings
os.environ["GRPC_ENABLE_FORK_SUPPORT"] = "false"
from dataclasses import dataclass
from typing import Any, AsyncGenerator, Dict, List, Literal, Optional
from typing import Any, AsyncGenerator, Dict, List, Literal, Optional, Union
from loguru import logger
from PIL import Image
from pydantic import BaseModel, Field
from pydantic import BaseModel, Field, field_validator
from pipecat.frames.frames import (
AudioRawFrame,
CancelFrame,
EndFrame,
ErrorFrame,
Frame,
FunctionCallResultProperties,
InterimTranscriptionFrame,
LLMFullResponseEndFrame,
LLMFullResponseStartFrame,
LLMMessagesFrame,
LLMTextFrame,
LLMUpdateSettingsFrame,
OpenAILLMContextAssistantTimestampFrame,
StartFrame,
TranscriptionFrame,
TTSAudioRawFrame,
TTSStartedFrame,
TTSStoppedFrame,
@@ -38,7 +49,7 @@ from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContextFrame,
)
from pipecat.processors.frame_processor import FrameDirection
from pipecat.services.ai_services import ImageGenService, LLMService, TTSService
from pipecat.services.ai_services import ImageGenService, LLMService, STTService, TTSService
from pipecat.services.google.frames import LLMSearchResponseFrame
from pipecat.services.openai import (
OpenAIAssistantContextAggregator,
@@ -51,10 +62,13 @@ try:
import google.ai.generativelanguage as glm
import google.generativeai as gai
from google import genai
from google.cloud import texttospeech_v1
from google.api_core.client_options import ClientOptions
from google.cloud import speech_v2, texttospeech_v1
from google.cloud.speech_v2.types import cloud_speech
from google.genai import types
from google.generativeai.types import GenerationConfig
from google.oauth2 import service_account
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
@@ -63,7 +77,7 @@ except ModuleNotFoundError as e:
raise Exception(f"Missing module: {e}")
def language_to_google_language(language: Language) -> Optional[str]:
def language_to_google_tts_language(language: Language) -> Optional[str]:
language_map = {
# Afrikaans
Language.AF: "af-ZA",
@@ -223,8 +237,307 @@ def language_to_google_language(language: Language) -> Optional[str]:
return language_map.get(language)
def language_to_google_stt_language(language: Language) -> Optional[str]:
"""Maps Language enum to Google Speech-to-Text V2 language codes.
Args:
language: Language enum value.
Returns:
Optional[str]: Google STT language code or None if not supported.
"""
language_map = {
# Afrikaans
Language.AF: "af-ZA",
Language.AF_ZA: "af-ZA",
# Albanian
Language.SQ: "sq-AL",
Language.SQ_AL: "sq-AL",
# Amharic
Language.AM: "am-ET",
Language.AM_ET: "am-ET",
# Arabic
Language.AR: "ar-EG", # Default to Egypt
Language.AR_AE: "ar-AE",
Language.AR_BH: "ar-BH",
Language.AR_DZ: "ar-DZ",
Language.AR_EG: "ar-EG",
Language.AR_IQ: "ar-IQ",
Language.AR_JO: "ar-JO",
Language.AR_KW: "ar-KW",
Language.AR_LB: "ar-LB",
Language.AR_MA: "ar-MA",
Language.AR_OM: "ar-OM",
Language.AR_QA: "ar-QA",
Language.AR_SA: "ar-SA",
Language.AR_SY: "ar-SY",
Language.AR_TN: "ar-TN",
Language.AR_YE: "ar-YE",
# Armenian
Language.HY: "hy-AM",
Language.HY_AM: "hy-AM",
# Azerbaijani
Language.AZ: "az-AZ",
Language.AZ_AZ: "az-AZ",
# Basque
Language.EU: "eu-ES",
Language.EU_ES: "eu-ES",
# Bengali
Language.BN: "bn-IN", # Default to India
Language.BN_BD: "bn-BD",
Language.BN_IN: "bn-IN",
# Bosnian
Language.BS: "bs-BA",
Language.BS_BA: "bs-BA",
# Bulgarian
Language.BG: "bg-BG",
Language.BG_BG: "bg-BG",
# Burmese
Language.MY: "my-MM",
Language.MY_MM: "my-MM",
# Catalan
Language.CA: "ca-ES",
Language.CA_ES: "ca-ES",
# Chinese
Language.ZH: "cmn-Hans-CN", # Default to Simplified Chinese
Language.ZH_CN: "cmn-Hans-CN",
Language.ZH_HK: "cmn-Hans-HK",
Language.ZH_TW: "cmn-Hant-TW",
Language.YUE: "yue-Hant-HK", # Cantonese
Language.YUE_CN: "yue-Hant-HK",
# Croatian
Language.HR: "hr-HR",
Language.HR_HR: "hr-HR",
# Czech
Language.CS: "cs-CZ",
Language.CS_CZ: "cs-CZ",
# Danish
Language.DA: "da-DK",
Language.DA_DK: "da-DK",
# Dutch
Language.NL: "nl-NL", # Default to Netherlands
Language.NL_BE: "nl-BE",
Language.NL_NL: "nl-NL",
# English
Language.EN: "en-US", # Default to US
Language.EN_AU: "en-AU",
Language.EN_CA: "en-CA",
Language.EN_GB: "en-GB",
Language.EN_GH: "en-GH",
Language.EN_HK: "en-HK",
Language.EN_IN: "en-IN",
Language.EN_IE: "en-IE",
Language.EN_KE: "en-KE",
Language.EN_NG: "en-NG",
Language.EN_NZ: "en-NZ",
Language.EN_PH: "en-PH",
Language.EN_SG: "en-SG",
Language.EN_TZ: "en-TZ",
Language.EN_US: "en-US",
Language.EN_ZA: "en-ZA",
# Estonian
Language.ET: "et-EE",
Language.ET_EE: "et-EE",
# Filipino
Language.FIL: "fil-PH",
Language.FIL_PH: "fil-PH",
# Finnish
Language.FI: "fi-FI",
Language.FI_FI: "fi-FI",
# French
Language.FR: "fr-FR", # Default to France
Language.FR_BE: "fr-BE",
Language.FR_CA: "fr-CA",
Language.FR_CH: "fr-CH",
Language.FR_FR: "fr-FR",
# Galician
Language.GL: "gl-ES",
Language.GL_ES: "gl-ES",
# Georgian
Language.KA: "ka-GE",
Language.KA_GE: "ka-GE",
# German
Language.DE: "de-DE", # Default to Germany
Language.DE_AT: "de-AT",
Language.DE_CH: "de-CH",
Language.DE_DE: "de-DE",
# Greek
Language.EL: "el-GR",
Language.EL_GR: "el-GR",
# Gujarati
Language.GU: "gu-IN",
Language.GU_IN: "gu-IN",
# Hebrew
Language.HE: "iw-IL",
Language.HE_IL: "iw-IL",
# Hindi
Language.HI: "hi-IN",
Language.HI_IN: "hi-IN",
# Hungarian
Language.HU: "hu-HU",
Language.HU_HU: "hu-HU",
# Icelandic
Language.IS: "is-IS",
Language.IS_IS: "is-IS",
# Indonesian
Language.ID: "id-ID",
Language.ID_ID: "id-ID",
# Italian
Language.IT: "it-IT",
Language.IT_IT: "it-IT",
Language.IT_CH: "it-CH",
# Japanese
Language.JA: "ja-JP",
Language.JA_JP: "ja-JP",
# Javanese
Language.JV: "jv-ID",
Language.JV_ID: "jv-ID",
# Kannada
Language.KN: "kn-IN",
Language.KN_IN: "kn-IN",
# Kazakh
Language.KK: "kk-KZ",
Language.KK_KZ: "kk-KZ",
# Khmer
Language.KM: "km-KH",
Language.KM_KH: "km-KH",
# Korean
Language.KO: "ko-KR",
Language.KO_KR: "ko-KR",
# Lao
Language.LO: "lo-LA",
Language.LO_LA: "lo-LA",
# Latvian
Language.LV: "lv-LV",
Language.LV_LV: "lv-LV",
# Lithuanian
Language.LT: "lt-LT",
Language.LT_LT: "lt-LT",
# Macedonian
Language.MK: "mk-MK",
Language.MK_MK: "mk-MK",
# Malay
Language.MS: "ms-MY",
Language.MS_MY: "ms-MY",
# Malayalam
Language.ML: "ml-IN",
Language.ML_IN: "ml-IN",
# Marathi
Language.MR: "mr-IN",
Language.MR_IN: "mr-IN",
# Mongolian
Language.MN: "mn-MN",
Language.MN_MN: "mn-MN",
# Nepali
Language.NE: "ne-NP",
Language.NE_NP: "ne-NP",
# Norwegian
Language.NO: "no-NO",
Language.NB: "no-NO",
Language.NB_NO: "no-NO",
# Persian
Language.FA: "fa-IR",
Language.FA_IR: "fa-IR",
# Polish
Language.PL: "pl-PL",
Language.PL_PL: "pl-PL",
# Portuguese
Language.PT: "pt-PT", # Default to Portugal
Language.PT_BR: "pt-BR",
Language.PT_PT: "pt-PT",
# Punjabi
Language.PA: "pa-Guru-IN",
Language.PA_IN: "pa-Guru-IN",
# Romanian
Language.RO: "ro-RO",
Language.RO_RO: "ro-RO",
# Russian
Language.RU: "ru-RU",
Language.RU_RU: "ru-RU",
# Serbian
Language.SR: "sr-RS",
Language.SR_RS: "sr-RS",
# Sinhala
Language.SI: "si-LK",
Language.SI_LK: "si-LK",
# Slovak
Language.SK: "sk-SK",
Language.SK_SK: "sk-SK",
# Slovenian
Language.SL: "sl-SI",
Language.SL_SI: "sl-SI",
# Spanish
Language.ES: "es-ES", # Default to Spain
Language.ES_AR: "es-AR",
Language.ES_BO: "es-BO",
Language.ES_CL: "es-CL",
Language.ES_CO: "es-CO",
Language.ES_CR: "es-CR",
Language.ES_DO: "es-DO",
Language.ES_EC: "es-EC",
Language.ES_ES: "es-ES",
Language.ES_GT: "es-GT",
Language.ES_HN: "es-HN",
Language.ES_MX: "es-MX",
Language.ES_NI: "es-NI",
Language.ES_PA: "es-PA",
Language.ES_PE: "es-PE",
Language.ES_PR: "es-PR",
Language.ES_PY: "es-PY",
Language.ES_SV: "es-SV",
Language.ES_US: "es-US",
Language.ES_UY: "es-UY",
Language.ES_VE: "es-VE",
# Sundanese
Language.SU: "su-ID",
Language.SU_ID: "su-ID",
# Swahili
Language.SW: "sw-TZ", # Default to Tanzania
Language.SW_KE: "sw-KE",
Language.SW_TZ: "sw-TZ",
# Swedish
Language.SV: "sv-SE",
Language.SV_SE: "sv-SE",
# Tamil
Language.TA: "ta-IN", # Default to India
Language.TA_IN: "ta-IN",
Language.TA_MY: "ta-MY",
Language.TA_SG: "ta-SG",
Language.TA_LK: "ta-LK",
# Telugu
Language.TE: "te-IN",
Language.TE_IN: "te-IN",
# Thai
Language.TH: "th-TH",
Language.TH_TH: "th-TH",
# Turkish
Language.TR: "tr-TR",
Language.TR_TR: "tr-TR",
# Ukrainian
Language.UK: "uk-UA",
Language.UK_UA: "uk-UA",
# Urdu
Language.UR: "ur-IN", # Default to India
Language.UR_IN: "ur-IN",
Language.UR_PK: "ur-PK",
# Uzbek
Language.UZ: "uz-UZ",
Language.UZ_UZ: "uz-UZ",
# Vietnamese
Language.VI: "vi-VN",
Language.VI_VN: "vi-VN",
# Xhosa
Language.XH: "xh-ZA",
# Zulu
Language.ZU: "zu-ZA",
Language.ZU_ZA: "zu-ZA",
}
return language_map.get(language)
class GoogleUserContextAggregator(OpenAIUserContextAggregator):
async def _push_aggregation(self):
async def push_aggregation(self):
if len(self._aggregation) > 0:
self._context.add_message(
glm.Content(role="user", parts=[glm.Part(text=self._aggregation)])
@@ -239,11 +552,11 @@ class GoogleUserContextAggregator(OpenAIUserContextAggregator):
await self.push_frame(frame)
# Reset our accumulator state.
self._reset()
self.reset()
class GoogleAssistantContextAggregator(OpenAIAssistantContextAggregator):
async def _push_aggregation(self):
async def push_aggregation(self):
if not (
self._aggregation or self._function_call_result or self._pending_image_frame_message
):
@@ -253,7 +566,7 @@ class GoogleAssistantContextAggregator(OpenAIAssistantContextAggregator):
properties: Optional[FunctionCallResultProperties] = None
aggregation = self._aggregation
self._reset()
self.reset()
try:
if self._function_call_result:
@@ -313,15 +626,14 @@ class GoogleAssistantContextAggregator(OpenAIAssistantContextAggregator):
run_llm = True
if run_llm:
await self._user_context_aggregator.push_context_frame()
await self.push_context_frame(FrameDirection.UPSTREAM)
# Emit the on_context_updated callback once the function call result is added to the context
if properties and properties.on_context_updated is not None:
await properties.on_context_updated()
# Push context frame
frame = OpenAILLMContextFrame(self._context)
await self.push_frame(frame)
await self.push_context_frame()
# Push timestamp frame with current time
timestamp_frame = OpenAILLMContextAssistantTimestampFrame(timestamp=time_now_iso8601())
@@ -604,9 +916,9 @@ class GoogleLLMContext(OpenAILLMContext):
# Check if we only have function-related messages (no regular text)
has_regular_messages = any(
len(msg.parts) == 1
and hasattr(msg.parts[0], "text")
and not hasattr(msg.parts[0], "function_call")
and not hasattr(msg.parts[0], "function_response")
and getattr(msg.parts[0], "text", None)
and not getattr(msg.parts[0], "function_call", None)
and not getattr(msg.parts[0], "function_response", None)
for msg in self._messages
)
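The check above can be exercised in isolation. What follows is a minimal sketch, assuming hypothetical `Part` and `Message` stand-ins for the Google content types (they are not the real `glm` classes). In the stand-in every field exists on every part, which is the situation where `hasattr` is always true and only the truthiness of `getattr(..., None)` says anything useful.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Part:
    # Hypothetical stand-in: every field always exists, unset ones are None.
    text: Optional[str] = None
    function_call: Optional[dict] = None
    function_response: Optional[dict] = None


@dataclass
class Message:
    parts: List[Part] = field(default_factory=list)


def has_regular_messages(messages: List[Message]) -> bool:
    """True if any message is a single text part carrying no function data."""
    return any(
        len(msg.parts) == 1
        and bool(getattr(msg.parts[0], "text", None))
        and not getattr(msg.parts[0], "function_call", None)
        and not getattr(msg.parts[0], "function_response", None)
        for msg in messages
    )


only_functions = [Message([Part(function_call={"name": "get_weather"})])]
with_text = only_functions + [Message([Part(text="hello")])]

print(has_regular_messages(only_functions))  # False
print(has_regular_messages(with_text))  # True
```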
@@ -621,7 +933,7 @@ class GoogleLLMContext(OpenAILLMContext):
class GoogleLLMService(LLMService):
"""This class implements inference with Google's AI models
"""This class implements inference with Google's AI models.
This service translates internally from OpenAILLMContext to the messages format
expected by the Google AI model. We are using the OpenAILLMContext as a lingua
@@ -862,7 +1174,7 @@ class GoogleLLMService(LLMService):
) -> GoogleContextAggregatorPair:
user = GoogleUserContextAggregator(context)
assistant = GoogleAssistantContextAggregator(
user, expect_stripped_words=assistant_expect_stripped_words
context, expect_stripped_words=assistant_expect_stripped_words
)
return GoogleContextAggregatorPair(_user=user, _assistant=assistant)
@@ -927,7 +1239,7 @@ class GoogleTTSService(TTSService):
return True
def language_to_service_language(self, language: Language) -> Optional[str]:
return language_to_google_language(language)
return language_to_google_tts_language(language)
def _construct_ssml(self, text: str) -> str:
ssml = "<speak>"
@@ -1097,3 +1409,451 @@ class GoogleImageGenService(ImageGenService):
except Exception as e:
logger.error(f"{self} error generating image: {e}")
yield ErrorFrame(f"Image generation error: {str(e)}")
class GoogleSTTService(STTService):
"""Google Cloud Speech-to-Text V2 service implementation.
Provides real-time speech recognition using Google Cloud's Speech-to-Text V2 API
with streaming support. Handles audio transcription and optional voice activity detection.
Attributes:
InputParams: Configuration parameters for the STT service.
"""
# Google Cloud's STT service has a connection time limit of 5 minutes per stream.
# They've shared an "endless streaming" example that guided this implementation:
# https://cloud.google.com/speech-to-text/docs/transcribe-streaming-audio#endless-streaming
STREAMING_LIMIT = 240000 # 4 minutes in milliseconds
class InputParams(BaseModel):
"""Configuration parameters for Google Speech-to-Text.
Attributes:
languages: Single language or list of recognition languages. First language is primary.
model: Speech recognition model to use.
use_separate_recognition_per_channel: Process each audio channel separately.
enable_automatic_punctuation: Add punctuation to transcripts.
enable_spoken_punctuation: Include spoken punctuation in transcript.
enable_spoken_emojis: Include spoken emojis in transcript.
profanity_filter: Filter profanity from transcript.
enable_word_time_offsets: Include timing information for each word.
enable_word_confidence: Include confidence scores for each word.
enable_interim_results: Stream partial recognition results.
enable_voice_activity_events: Detect voice activity in audio.
"""
languages: Union[Language, List[Language]] = Field(default_factory=lambda: [Language.EN_US])
model: Optional[str] = "latest_long"
use_separate_recognition_per_channel: Optional[bool] = False
enable_automatic_punctuation: Optional[bool] = True
enable_spoken_punctuation: Optional[bool] = False
enable_spoken_emojis: Optional[bool] = False
profanity_filter: Optional[bool] = False
enable_word_time_offsets: Optional[bool] = False
enable_word_confidence: Optional[bool] = False
enable_interim_results: Optional[bool] = True
enable_voice_activity_events: Optional[bool] = False
@field_validator("languages", mode="before")
@classmethod
def validate_languages(cls, v) -> List[Language]:
if isinstance(v, Language):
return [v]
return v
@property
def language_list(self) -> List[Language]:
"""Get languages as a guaranteed list."""
assert isinstance(self.languages, list)
return self.languages
def __init__(
self,
*,
credentials: Optional[str] = None,
credentials_path: Optional[str] = None,
location: str = "global",
sample_rate: Optional[int] = None,
params: InputParams = InputParams(),
**kwargs,
):
"""Initialize the Google STT service.
Args:
credentials: JSON string containing Google Cloud service account credentials.
credentials_path: Path to service account credentials JSON file.
location: Google Cloud location (e.g., "global", "us-central1").
sample_rate: Audio sample rate in Hertz.
params: Configuration parameters for the service.
**kwargs: Additional arguments passed to STTService.
Raises:
ValueError: If neither credentials nor credentials_path is provided.
ValueError: If project ID is not found in credentials.
"""
super().__init__(sample_rate=sample_rate, **kwargs)
self._location = location
self._stream = None
self._config = None
self._request_queue = asyncio.Queue()
self._streaming_task = None
# Used for keep-alive logic
self._stream_start_time = 0
self._last_audio_input = []
self._audio_input = []
self._result_end_time = 0
self._is_final_end_time = 0
self._final_request_end_time = 0
self._bridging_offset = 0
self._last_transcript_was_final = False
self._new_stream = True
self._restart_counter = 0
# Configure client options based on location
client_options = None
if self._location != "global":
client_options = ClientOptions(api_endpoint=f"{self._location}-speech.googleapis.com")
# Extract project ID and create client
if credentials:
json_account_info = json.loads(credentials)
self._project_id = json_account_info.get("project_id")
creds = service_account.Credentials.from_service_account_info(json_account_info)
elif credentials_path:
with open(credentials_path) as f:
json_account_info = json.load(f)
self._project_id = json_account_info.get("project_id")
creds = service_account.Credentials.from_service_account_file(credentials_path)
else:
raise ValueError("Either credentials or credentials_path must be provided")
if not self._project_id:
raise ValueError("Project ID not found in credentials")
self._client = speech_v2.SpeechAsyncClient(credentials=creds, client_options=client_options)
self._settings = {
"language_codes": [
self.language_to_service_language(lang) for lang in params.language_list
],
"model": params.model,
"use_separate_recognition_per_channel": params.use_separate_recognition_per_channel,
"enable_automatic_punctuation": params.enable_automatic_punctuation,
"enable_spoken_punctuation": params.enable_spoken_punctuation,
"enable_spoken_emojis": params.enable_spoken_emojis,
"profanity_filter": params.profanity_filter,
"enable_word_time_offsets": params.enable_word_time_offsets,
"enable_word_confidence": params.enable_word_confidence,
"enable_interim_results": params.enable_interim_results,
"enable_voice_activity_events": params.enable_voice_activity_events,
}
def language_to_service_language(self, language: Language | List[Language]) -> str | List[str]:
"""Convert Language enum(s) to Google STT language code(s).
Args:
language: Single Language enum or list of Language enums.
Returns:
str | List[str]: Google STT language code(s).
"""
if isinstance(language, list):
return [language_to_google_stt_language(lang) or "en-US" for lang in language]
return language_to_google_stt_language(language) or "en-US"
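The lookup-plus-fallback behaviour can be illustrated with a small standalone sketch. The three-member `Language` enum, the `LANGUAGE_MAP` contents, and the function names below are hypothetical stand-ins chosen for the example; only the `.get()` lookup and the `or "en-US"` fallback mirror the code above.

```python
from enum import Enum
from typing import Dict, List, Optional, Union


class Language(Enum):
    # Hypothetical three-member stand-in for pipecat's Language enum.
    FR = "fr"
    FR_CA = "fr-CA"
    CY = "cy"  # deliberately missing from the map below


LANGUAGE_MAP: Dict[Language, str] = {
    Language.FR: "fr-FR",  # a bare language code defaults to one region
    Language.FR_CA: "fr-CA",
}


def to_google_language(language: Language) -> Optional[str]:
    """Return the mapped code, or None when the language is not in the map."""
    return LANGUAGE_MAP.get(language)


def to_service_language(
    language: Union[Language, List[Language]],
) -> Union[str, List[str]]:
    """Map one language or a list of them, falling back to "en-US"."""
    if isinstance(language, list):
        return [to_google_language(lang) or "en-US" for lang in language]
    return to_google_language(language) or "en-US"


print(to_service_language(Language.FR))  # fr-FR
print(to_service_language([Language.FR_CA, Language.CY]))  # ['fr-CA', 'en-US']
```

One consequence worth keeping in mind: an unmapped language is silently replaced by "en-US" rather than reported, so a caller cannot tell from the return value that the fallback was used.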
async def _reconnect_if_needed(self):
"""Reconnect the stream if it's currently active."""
if self._streaming_task:
logger.debug("Reconnecting stream due to configuration changes")
await self._disconnect()
await self._connect()
async def set_languages(self, languages: List[Language]):
"""Update the service's recognition languages.
Args:
languages: List of languages for recognition. First language is primary.
"""
logger.debug(f"Switching STT languages to: {languages}")
self._settings["language_codes"] = [
self.language_to_service_language(lang) for lang in languages
]
# Recreate stream with new languages
await self._reconnect_if_needed()
async def set_model(self, model: str):
"""Update the service's recognition model."""
logger.debug(f"Switching STT model to: {model}")
await super().set_model(model)
self._settings["model"] = model
# Recreate stream with new model
await self._reconnect_if_needed()
async def start(self, frame: StartFrame):
await super().start(frame)
await self._connect()
async def stop(self, frame: EndFrame):
await super().stop(frame)
await self._disconnect()
async def cancel(self, frame: CancelFrame):
await super().cancel(frame)
await self._disconnect()
async def update_options(
self,
*,
languages: Optional[List[Language]] = None,
model: Optional[str] = None,
enable_automatic_punctuation: Optional[bool] = None,
enable_spoken_punctuation: Optional[bool] = None,
enable_spoken_emojis: Optional[bool] = None,
profanity_filter: Optional[bool] = None,
enable_word_time_offsets: Optional[bool] = None,
enable_word_confidence: Optional[bool] = None,
enable_interim_results: Optional[bool] = None,
enable_voice_activity_events: Optional[bool] = None,
location: Optional[str] = None,
) -> None:
"""Update service options dynamically.
Args:
languages: New list of recognition languages.
model: New recognition model.
enable_automatic_punctuation: Enable/disable automatic punctuation.
enable_spoken_punctuation: Enable/disable spoken punctuation.
enable_spoken_emojis: Enable/disable spoken emojis.
profanity_filter: Enable/disable profanity filter.
enable_word_time_offsets: Enable/disable word timing info.
enable_word_confidence: Enable/disable word confidence scores.
enable_interim_results: Enable/disable interim results.
enable_voice_activity_events: Enable/disable voice activity detection.
location: New Google Cloud location.
Note:
Changes that affect the streaming configuration will cause
the stream to be reconnected.
"""
# Update settings with new values
if languages is not None:
logger.debug(f"Updating language to: {languages}")
self._settings["language_codes"] = [
self.language_to_service_language(lang) for lang in languages
]
if model is not None:
logger.debug(f"Updating model to: {model}")
self._settings["model"] = model
if enable_automatic_punctuation is not None:
logger.debug(f"Updating automatic punctuation to: {enable_automatic_punctuation}")
self._settings["enable_automatic_punctuation"] = enable_automatic_punctuation
if enable_spoken_punctuation is not None:
logger.debug(f"Updating spoken punctuation to: {enable_spoken_punctuation}")
self._settings["enable_spoken_punctuation"] = enable_spoken_punctuation
if enable_spoken_emojis is not None:
logger.debug(f"Updating spoken emojis to: {enable_spoken_emojis}")
self._settings["enable_spoken_emojis"] = enable_spoken_emojis
if profanity_filter is not None:
logger.debug(f"Updating profanity filter to: {profanity_filter}")
self._settings["profanity_filter"] = profanity_filter
if enable_word_time_offsets is not None:
logger.debug(f"Updating word time offsets to: {enable_word_time_offsets}")
self._settings["enable_word_time_offsets"] = enable_word_time_offsets
if enable_word_confidence is not None:
logger.debug(f"Updating word confidence to: {enable_word_confidence}")
self._settings["enable_word_confidence"] = enable_word_confidence
if enable_interim_results is not None:
logger.debug(f"Updating interim results to: {enable_interim_results}")
self._settings["enable_interim_results"] = enable_interim_results
if enable_voice_activity_events is not None:
logger.debug(f"Updating voice activity events to: {enable_voice_activity_events}")
self._settings["enable_voice_activity_events"] = enable_voice_activity_events
if location is not None:
logger.debug(f"Updating location to: {location}")
self._location = location
# Reconnect the stream for updates
await self._reconnect_if_needed()
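The method above treats `None` as "leave this setting unchanged", which is why each branch tests `is not None` instead of truthiness: `False` is a valid new value. A condensed sketch of the same semantics follows; `apply_updates` is a hypothetical helper written for illustration, not part of the service.

```python
from typing import Any, Dict


def apply_updates(settings: Dict[str, Any], **updates: Any) -> Dict[str, Any]:
    """Copy `settings`, overriding only the keys whose update is not None."""
    merged = dict(settings)
    for key, value in updates.items():
        if value is not None:
            merged[key] = value
    return merged


settings = {"model": "latest_long", "enable_interim_results": True}

# None leaves the model alone; False is applied because it is not None.
updated = apply_updates(settings, model=None, enable_interim_results=False)
print(updated)  # {'model': 'latest_long', 'enable_interim_results': False}
```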
async def _connect(self):
"""Initialize streaming recognition config and stream."""
logger.debug("Connecting to Google Speech-to-Text")
# Set stream start time
self._stream_start_time = int(time.time() * 1000)
self._new_stream = True
self._config = cloud_speech.StreamingRecognitionConfig(
config=cloud_speech.RecognitionConfig(
explicit_decoding_config=cloud_speech.ExplicitDecodingConfig(
encoding=cloud_speech.ExplicitDecodingConfig.AudioEncoding.LINEAR16,
sample_rate_hertz=self.sample_rate,
audio_channel_count=1,
),
language_codes=self._settings["language_codes"],
model=self._settings["model"],
features=cloud_speech.RecognitionFeatures(
enable_automatic_punctuation=self._settings["enable_automatic_punctuation"],
enable_spoken_punctuation=self._settings["enable_spoken_punctuation"],
enable_spoken_emojis=self._settings["enable_spoken_emojis"],
profanity_filter=self._settings["profanity_filter"],
enable_word_time_offsets=self._settings["enable_word_time_offsets"],
enable_word_confidence=self._settings["enable_word_confidence"],
),
),
streaming_features=cloud_speech.StreamingRecognitionFeatures(
enable_voice_activity_events=self._settings["enable_voice_activity_events"],
interim_results=self._settings["enable_interim_results"],
),
)
self._streaming_task = self.create_task(self._stream_audio())
async def _disconnect(self):
"""Clean up streaming recognition resources."""
if self._streaming_task:
logger.debug("Disconnecting from Google Speech-to-Text")
# Send sentinel value to stop request generator
await self._request_queue.put(None)
await self.cancel_task(self._streaming_task)
self._streaming_task = None
# Clear any remaining items in the queue
while not self._request_queue.empty():
try:
self._request_queue.get_nowait()
self._request_queue.task_done()
except asyncio.QueueEmpty:
break
async def _request_generator(self):
"""Generates requests for the streaming recognize method."""
recognizer_path = f"projects/{self._project_id}/locations/{self._location}/recognizers/_"
logger.trace(f"Using recognizer path: {recognizer_path}")
try:
# Send initial config
yield cloud_speech.StreamingRecognizeRequest(
recognizer=recognizer_path,
streaming_config=self._config,
)
while True:
try:
audio_data = await self._request_queue.get()
if audio_data is None: # Sentinel value to stop
break
# Check streaming limit
if (int(time.time() * 1000) - self._stream_start_time) > self.STREAMING_LIMIT:
logger.debug("Streaming limit reached, initiating graceful reconnection")
# Instead of immediate reconnection, we'll break and let the stream close naturally
self._last_audio_input = self._audio_input
self._audio_input = []
self._restart_counter += 1
# Put the current audio chunk back in the queue
await self._request_queue.put(audio_data)
break
self._audio_input.append(audio_data)
yield cloud_speech.StreamingRecognizeRequest(audio=audio_data)
except asyncio.CancelledError:
break
finally:
self._request_queue.task_done()
except Exception as e:
logger.error(f"Error in request generator: {e}")
raise
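The sentinel handshake between `_disconnect` and `_request_generator` can be tried without any Google client. This is a minimal sketch using only `asyncio`; the `chunks` and `drain` names are hypothetical, and the streaming-limit branch is left out to keep the example deterministic.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Tuple


async def chunks(queue: "asyncio.Queue[Optional[bytes]]") -> AsyncIterator[bytes]:
    """Yield queued audio until the None sentinel arrives."""
    while True:
        item = await queue.get()
        if item is None:  # sentinel: the consumer asked us to stop
            break
        yield item


def drain(queue: "asyncio.Queue[Optional[bytes]]") -> int:
    """Discard whatever is still queued and report how many items were dropped."""
    dropped = 0
    while not queue.empty():
        try:
            queue.get_nowait()
            dropped += 1
        except asyncio.QueueEmpty:
            break
    return dropped


async def demo() -> Tuple[List[bytes], int]:
    queue: "asyncio.Queue[Optional[bytes]]" = asyncio.Queue()
    for item in (b"a", b"b", None, b"late"):
        await queue.put(item)
    sent = [chunk async for chunk in chunks(queue)]
    return sent, drain(queue)


sent, dropped = asyncio.run(demo())
print(sent)  # [b'a', b'b']
print(dropped)  # 1
```

Audio queued after the sentinel is never sent; draining it keeps stale audio from leaking into the next stream.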
async def _stream_audio(self):
"""Handle bi-directional streaming with Google STT."""
try:
while True:
try:
# Start bi-directional streaming
streaming_recognize = await self._client.streaming_recognize(
requests=self._request_generator()
)
# Process responses
await self._process_responses(streaming_recognize)
# If we're here, check if we need to reconnect
if (int(time.time() * 1000) - self._stream_start_time) > self.STREAMING_LIMIT:
logger.debug("Reconnecting stream after timeout")
# Reset stream start time
self._stream_start_time = int(time.time() * 1000)
continue
else:
# Normal stream end
break
except Exception as e:
logger.error(f"Stream error, attempting to reconnect: {e}")
await asyncio.sleep(1) # Brief delay before reconnecting
self._stream_start_time = int(time.time() * 1000)
continue
except Exception as e:
logger.error(f"Error in streaming task: {e}")
await self.push_frame(ErrorFrame(str(e)))
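The same elapsed-time comparison appears in the request generator, in the streaming loop, and in response processing. A small sketch of that comparison, assuming a hypothetical `limit_reached` helper and the two durations quoted in the class comment (a 5-minute service limit and a 4-minute restart threshold):

```python
STREAMING_LIMIT_MS = 240_000  # 4 minutes, matching STREAMING_LIMIT above
SERVICE_LIMIT_MS = 300_000  # the 5-minute connection limit cited in the class comment


def limit_reached(
    stream_start_ms: int, now_ms: int, limit_ms: int = STREAMING_LIMIT_MS
) -> bool:
    """True once the stream has been open for strictly more than limit_ms."""
    return (now_ms - stream_start_ms) > limit_ms


headroom_ms = SERVICE_LIMIT_MS - STREAMING_LIMIT_MS

print(limit_reached(0, 240_000))  # False
print(limit_reached(0, 240_001))  # True
print(headroom_ms)  # 60000
```

Restarting at 4 minutes leaves a minute of headroom before the service would close the connection on its own.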
async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
"""Process an audio chunk for STT transcription."""
if self._streaming_task:
# Queue the audio data
await self._request_queue.put(audio)
yield None
async def _process_responses(self, streaming_recognize):
"""Process streaming recognition responses."""
try:
async for response in streaming_recognize:
# Check streaming limit
if (int(time.time() * 1000) - self._stream_start_time) > self.STREAMING_LIMIT:
logger.debug("Stream timeout reached in response processing")
break
if not response.results:
continue
for result in response.results:
if not result.alternatives:
continue
transcript = result.alternatives[0].transcript
if not transcript:
continue
primary_language = self._settings["language_codes"][0]
if result.is_final:
self._last_transcript_was_final = True
await self.push_frame(
TranscriptionFrame(transcript, "", time_now_iso8601(), primary_language)
)
else:
self._last_transcript_was_final = False
await self.push_frame(
InterimTranscriptionFrame(
transcript, "", time_now_iso8601(), primary_language
)
)
except Exception as e:
logger.error(f"Error processing Google STT responses: {e}")
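The response loop above keeps only the first alternative of each result and splits results into final and interim transcripts. A minimal sketch of that selection logic, assuming hypothetical `Alternative` and `Result` dataclasses in place of the Google response types and plain tuples in place of pipecat frames:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alternative:
    transcript: str


@dataclass
class Result:
    alternatives: List[Alternative]
    is_final: bool


def classify(results: List[Result]) -> List[Tuple[str, str]]:
    """Return (kind, transcript) pairs, skipping empty results."""
    out: List[Tuple[str, str]] = []
    for result in results:
        if not result.alternatives:
            continue
        transcript = result.alternatives[0].transcript  # top hypothesis only
        if not transcript:
            continue
        out.append(("final" if result.is_final else "interim", transcript))
    return out


results = [
    Result([Alternative("hel")], is_final=False),
    Result([], is_final=False),
    Result([Alternative("")], is_final=True),
    Result([Alternative("hello"), Alternative("hallo")], is_final=True),
]
print(classify(results))  # [('interim', 'hel'), ('final', 'hello')]
```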


@@ -0,0 +1,54 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
from typing import List, Literal, Optional
from pydantic import BaseModel
from pipecat.frames.frames import Frame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.processors.frameworks.rtvi import RTVIObserver
from pipecat.services.google.frames import LLMSearchOrigin, LLMSearchResponseFrame
class RTVISearchResponseMessageData(BaseModel):
search_result: Optional[str]
rendered_content: Optional[str]
origins: List[LLMSearchOrigin]
class RTVIBotLLMSearchResponseMessage(BaseModel):
label: Literal["rtvi-ai"] = "rtvi-ai"
type: Literal["bot-llm-search-response"] = "bot-llm-search-response"
data: RTVISearchResponseMessageData
class GoogleRTVIObserver(RTVIObserver):
def __init__(self, rtvi: FrameProcessor):
super().__init__(rtvi)
async def on_push_frame(
self,
src: FrameProcessor,
dst: FrameProcessor,
frame: Frame,
direction: FrameDirection,
timestamp: int,
):
await super().on_push_frame(src, dst, frame, direction, timestamp)
if isinstance(frame, LLMSearchResponseFrame):
await self._handle_llm_search_response_frame(frame)
async def _handle_llm_search_response_frame(self, frame: LLMSearchResponseFrame):
message = RTVIBotLLMSearchResponseMessage(
data=RTVISearchResponseMessageData(
search_result=frame.search_result,
origins=frame.origins,
rendered_content=frame.rendered_content,
)
)
await self.push_transport_message_urgent(message)


@@ -17,6 +17,7 @@ from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContext,
OpenAILLMContextFrame,
)
from pipecat.processors.frame_processor import FrameDirection
from pipecat.services.openai import (
OpenAIAssistantContextAggregator,
OpenAILLMService,
@@ -27,7 +28,7 @@ from pipecat.services.openai import (
class GrokAssistantContextAggregator(OpenAIAssistantContextAggregator):
"""Custom assistant context aggregator for Grok that handles empty content requirement."""
async def _push_aggregation(self):
async def push_aggregation(self):
if not (
self._aggregation or self._function_call_result or self._pending_image_frame_message
):
@@ -37,7 +38,7 @@ class GrokAssistantContextAggregator(OpenAIAssistantContextAggregator):
properties: Optional[FunctionCallResultProperties] = None
aggregation = self._aggregation
self._reset()
self.reset()
try:
if self._function_call_result:
@@ -91,14 +92,13 @@ class GrokAssistantContextAggregator(OpenAIAssistantContextAggregator):
run_llm = True
if run_llm:
await self._user_context_aggregator.push_context_frame()
await self.push_context_frame(FrameDirection.UPSTREAM)
# Emit the on_context_updated callback once the function call result is added to the context
if properties and properties.on_context_updated is not None:
await properties.on_context_updated()
frame = OpenAILLMContextFrame(self._context)
await self.push_frame(frame)
await self.push_context_frame()
except Exception as e:
logger.error(f"Error processing frame: {e}")
@@ -212,6 +212,6 @@ class GrokLLMService(OpenAILLMService):
) -> GrokContextAggregatorPair:
user = OpenAIUserContextAggregator(context)
assistant = GrokAssistantContextAggregator(
user, expect_stripped_words=assistant_expect_stripped_words
context, expect_stripped_words=assistant_expect_stripped_words
)
return GrokContextAggregatorPair(_user=user, _assistant=assistant)
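The constructor change in this diff passes the context object to the assistant aggregator directly, where it used to be reached through the user aggregator. A minimal sketch of the resulting wiring follows, with hypothetical `Context`, `UserAggregator`, and `AssistantAggregator` classes that only model the shared reference, not pipecat's frame handling.

```python
from typing import Any, List, Tuple


class Context:
    """Hypothetical stand-in for the shared LLM context."""

    def __init__(self) -> None:
        self.messages: List[dict] = []


class UserAggregator:
    def __init__(self, context: Context, **kwargs: Any) -> None:
        self._context = context


class AssistantAggregator:
    def __init__(self, context: Context, **kwargs: Any) -> None:
        # Takes the context itself; no reference to the user aggregator is kept.
        self._context = context
        self.expect_stripped_words = kwargs.get("expect_stripped_words", True)


def create_context_aggregator(
    context: Context, *, assistant_expect_stripped_words: bool = True
) -> Tuple[UserAggregator, AssistantAggregator]:
    user = UserAggregator(context)
    assistant = AssistantAggregator(
        context, expect_stripped_words=assistant_expect_stripped_words
    )
    return user, assistant


context = Context()
user, assistant = create_context_aggregator(context)
user._context.messages.append({"role": "user", "content": "hi"})
print(assistant._context.messages)  # [{'role': 'user', 'content': 'hi'}]
```

Both aggregators still observe the same messages, but the assistant side no longer depends on the user side existing.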


@@ -5,9 +5,13 @@
#
from typing import Optional
from loguru import logger
from pipecat.services.base_whisper import BaseWhisperSTTService, Transcription
from pipecat.services.openai import OpenAILLMService
from pipecat.transcriptions.language import Language
class GroqLLMService(OpenAILLMService):
@@ -19,7 +23,7 @@ class GroqLLMService(OpenAILLMService):
Args:
api_key (str): The API key for accessing Groq's API
base_url (str, optional): The base URL for Groq API. Defaults to "https://api.groq.com/openai/v1"
model (str, optional): The model identifier to use. Defaults to "llama-3.1-70b-versatile"
model (str, optional): The model identifier to use. Defaults to "llama-3.3-70b-versatile"
**kwargs: Additional keyword arguments passed to OpenAILLMService
"""
@@ -28,7 +32,7 @@ class GroqLLMService(OpenAILLMService):
*,
api_key: str,
base_url: str = "https://api.groq.com/openai/v1",
model: str = "llama-3.1-70b-versatile",
model: str = "llama-3.3-70b-versatile",
**kwargs,
):
super().__init__(api_key=api_key, base_url=base_url, model=model, **kwargs)
@@ -37,3 +41,60 @@ class GroqLLMService(OpenAILLMService):
"""Create OpenAI-compatible client for Groq API endpoint."""
logger.debug(f"Creating Groq client with api {base_url}")
return super().create_client(api_key, base_url, **kwargs)
class GroqSTTService(BaseWhisperSTTService):
"""Groq Whisper speech-to-text service.
Uses Groq's Whisper API to convert audio to text. Requires a Groq API key
set via the api_key parameter or GROQ_API_KEY environment variable.
Args:
model: Whisper model to use. Defaults to "whisper-large-v3-turbo".
api_key: Groq API key. Defaults to None.
base_url: API base URL. Defaults to "https://api.groq.com/openai/v1".
language: Language of the audio input. Defaults to English.
prompt: Optional text to guide the model's style or continue a previous segment.
temperature: Optional sampling temperature between 0 and 1. Defaults to None, in which case it is omitted from the request.
**kwargs: Additional arguments passed to BaseWhisperSTTService.
"""
def __init__(
self,
*,
model: str = "whisper-large-v3-turbo",
api_key: Optional[str] = None,
base_url: str = "https://api.groq.com/openai/v1",
language: Optional[Language] = Language.EN,
prompt: Optional[str] = None,
temperature: Optional[float] = None,
**kwargs,
):
super().__init__(
model=model,
api_key=api_key,
base_url=base_url,
language=language,
prompt=prompt,
temperature=temperature,
**kwargs,
)
async def _transcribe(self, audio: bytes) -> Transcription:
assert self._language is not None # Assigned in the BaseWhisperSTTService class
# Build kwargs dict with only set parameters
kwargs = {
"file": ("audio.wav", audio, "audio/wav"),
"model": self.model_name,
"response_format": "json",
"language": self._language,
}
if self._prompt is not None:
kwargs["prompt"] = self._prompt
if self._temperature is not None:
kwargs["temperature"] = self._temperature
return await self._client.audio.transcriptions.create(**kwargs)
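The request-building step in `_transcribe` can be isolated as a pure function. This is a sketch only: `build_transcription_kwargs` is a hypothetical helper, and no API call is made. It shows why the checks use `is not None`: a temperature of `0.0` is falsy but must still be sent.

```python
from typing import Any, Dict, Optional


def build_transcription_kwargs(
    audio: bytes,
    model: str,
    language: str,
    prompt: Optional[str] = None,
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Build the keyword arguments, including optional ones only when set."""
    kwargs: Dict[str, Any] = {
        "file": ("audio.wav", audio, "audio/wav"),
        "model": model,
        "language": language,
    }
    if prompt is not None:
        kwargs["prompt"] = prompt
    if temperature is not None:
        kwargs["temperature"] = temperature
    return kwargs


minimal = build_transcription_kwargs(b"\x00\x00", "whisper-large-v3-turbo", "en")
explicit = build_transcription_kwargs(
    b"\x00\x00", "whisper-large-v3-turbo", "en", temperature=0.0
)

print(sorted(minimal))  # ['file', 'language', 'model']
print(sorted(explicit))  # ['file', 'language', 'model', 'temperature']
```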


@@ -73,6 +73,7 @@ class LmntTTSService(TTSService, WebsocketService):
TTSService.__init__(
self,
push_stop_frames=True,
pause_frame_processing=True,
sample_rate=sample_rate,
**kwargs,
)


@@ -48,7 +48,13 @@ from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContextFrame,
)
from pipecat.processors.frame_processor import FrameDirection
from pipecat.services.ai_services import ImageGenService, LLMService, TTSService
from pipecat.services.ai_services import (
ImageGenService,
LLMService,
TTSService,
)
from pipecat.services.base_whisper import BaseWhisperSTTService, Transcription
from pipecat.transcriptions.language import Language
from pipecat.utils.time import time_now_iso8601
try:
@@ -104,7 +110,7 @@ class BaseOpenAILLMService(LLMService):
seed: Optional[int] = Field(default_factory=lambda: NOT_GIVEN, ge=0)
temperature: Optional[float] = Field(default_factory=lambda: NOT_GIVEN, ge=0.0, le=2.0)
# Note: top_k is currently not supported by the OpenAI client library,
# so top_k is ignore right now.
# so top_k is ignored right now.
top_k: Optional[int] = Field(default=None, ge=0)
top_p: Optional[float] = Field(default_factory=lambda: NOT_GIVEN, ge=0.0, le=1.0)
max_tokens: Optional[int] = Field(default_factory=lambda: NOT_GIVEN, ge=1)
@@ -349,7 +355,7 @@ class OpenAILLMService(BaseOpenAILLMService):
) -> OpenAIContextAggregatorPair:
user = OpenAIUserContextAggregator(context)
assistant = OpenAIAssistantContextAggregator(
user, expect_stripped_words=assistant_expect_stripped_words
context, expect_stripped_words=assistant_expect_stripped_words
)
return OpenAIContextAggregatorPair(_user=user, _assistant=assistant)
@@ -391,6 +397,62 @@ class OpenAIImageGenService(ImageGenService):
yield frame
class OpenAISTTService(BaseWhisperSTTService):
"""OpenAI Whisper speech-to-text service.
Uses OpenAI's Whisper API to convert audio to text. Requires an OpenAI API key
set via the api_key parameter or OPENAI_API_KEY environment variable.
Args:
model: Whisper model to use. Defaults to "whisper-1".
api_key: OpenAI API key. Defaults to None.
base_url: API base URL. Defaults to None.
language: Language of the audio input. Defaults to English.
prompt: Optional text to guide the model's style or continue a previous segment.
temperature: Optional sampling temperature between 0 and 1. Defaults to None, in which case it is omitted from the request.
**kwargs: Additional arguments passed to BaseWhisperSTTService.
"""
def __init__(
self,
*,
model: str = "whisper-1",
api_key: Optional[str] = None,
base_url: Optional[str] = None,
language: Optional[Language] = Language.EN,
prompt: Optional[str] = None,
temperature: Optional[float] = None,
**kwargs,
):
super().__init__(
model=model,
api_key=api_key,
base_url=base_url,
language=language,
prompt=prompt,
temperature=temperature,
**kwargs,
)
async def _transcribe(self, audio: bytes) -> Transcription:
assert self._language is not None # Assigned in the BaseWhisperSTTService class
# Build kwargs dict with only set parameters
kwargs = {
"file": ("audio.wav", audio, "audio/wav"),
"model": self.model_name,
"language": self._language,
}
if self._prompt is not None:
kwargs["prompt"] = self._prompt
if self._temperature is not None:
kwargs["temperature"] = self._temperature
return await self._client.audio.transcriptions.create(**kwargs)
class OpenAITTSService(TTSService):
"""OpenAI Text-to-Speech service that generates audio from text.
@@ -493,8 +555,8 @@ class OpenAIImageMessageFrame(Frame):
class OpenAIUserContextAggregator(LLMUserContextAggregator):
def __init__(self, context: OpenAILLMContext):
super().__init__(context=context)
def __init__(self, context: OpenAILLMContext, **kwargs):
super().__init__(context=context, **kwargs)
async def process_frame(self, frame, direction):
await super().process_frame(frame, direction)
@@ -530,9 +592,8 @@ class OpenAIUserContextAggregator(LLMUserContextAggregator):
class OpenAIAssistantContextAggregator(LLMAssistantContextAggregator):
def __init__(self, user_context_aggregator: OpenAIUserContextAggregator, **kwargs):
super().__init__(context=user_context_aggregator._context, **kwargs)
self._user_context_aggregator = user_context_aggregator
def __init__(self, context: OpenAILLMContext, **kwargs):
super().__init__(context=context, **kwargs)
self._function_calls_in_progress = {}
self._function_call_result = None
self._pending_image_frame_message = None
@@ -552,7 +613,7 @@ class OpenAIAssistantContextAggregator(LLMAssistantContextAggregator):
del self._function_calls_in_progress[frame.tool_call_id]
self._function_call_result = frame
# TODO-CB: Kwin wants us to refactor this out of here but I REFUSE
await self._push_aggregation()
await self.push_aggregation()
else:
logger.warning(
"FunctionCallResultFrame tool_call_id does not match any function call in progress"
@@ -560,9 +621,9 @@ class OpenAIAssistantContextAggregator(LLMAssistantContextAggregator):
self._function_call_result = None
elif isinstance(frame, OpenAIImageMessageFrame):
self._pending_image_frame_message = frame
await self._push_aggregation()
await self.push_aggregation()
async def _push_aggregation(self):
async def push_aggregation(self):
if not (
self._aggregation or self._function_call_result or self._pending_image_frame_message
):
@@ -572,7 +633,7 @@ class OpenAIAssistantContextAggregator(LLMAssistantContextAggregator):
properties: Optional[FunctionCallResultProperties] = None
aggregation = self._aggregation
self._reset()
self.reset()
try:
if self._function_call_result:
@@ -624,15 +685,14 @@ class OpenAIAssistantContextAggregator(LLMAssistantContextAggregator):
run_llm = True
if run_llm:
await self._user_context_aggregator.push_context_frame()
await self.push_context_frame(FrameDirection.UPSTREAM)
# Emit the on_context_updated callback once the function call result is added to the context
if properties and properties.on_context_updated is not None:
await properties.on_context_updated()
# Push context frame
frame = OpenAILLMContextFrame(self._context)
await self.push_frame(frame)
await self.push_context_frame()
# Push timestamp frame with current time
timestamp_frame = OpenAILLMContextAssistantTimestampFrame(timestamp=time_now_iso8601())


@@ -166,7 +166,7 @@ class OpenAIRealtimeUserContextAggregator(OpenAIUserContextAggregator):
if isinstance(frame, LLMSetToolsFrame):
await self.push_frame(frame, direction)
async def _push_aggregation(self):
async def push_aggregation(self):
# for the moment, ignore all user input coming into the pipeline.
# todo: think about whether/how to fix this to allow for text input from
# upstream (transport/transcription, or other sources)
@@ -174,7 +174,7 @@ class OpenAIRealtimeUserContextAggregator(OpenAIUserContextAggregator):
class OpenAIRealtimeAssistantContextAggregator(OpenAIAssistantContextAggregator):
async def _push_aggregation(self):
async def push_aggregation(self):
# the only thing we implement here is function calling. in all other cases, messages
# are added to the context when we receive openai realtime api events
if not self._function_call_result:
@@ -182,7 +182,7 @@ class OpenAIRealtimeAssistantContextAggregator(OpenAIAssistantContextAggregator)
properties: Optional[FunctionCallResultProperties] = None
self._reset()
self.reset()
try:
run_llm = True
frame = self._function_call_result
@@ -217,8 +217,8 @@ class OpenAIRealtimeAssistantContextAggregator(OpenAIAssistantContextAggregator)
# The standard function callback code path pushes the FunctionCallResultFrame from the llm itself,
# so we didn't have a chance to add the result to the openai realtime api context. Let's push a
# special frame to do that.
await self._user_context_aggregator.push_frame(
RealtimeFunctionCallResultFrame(result_frame=frame)
await self.push_frame(
RealtimeFunctionCallResultFrame(result_frame=frame), FrameDirection.UPSTREAM
)
if properties and properties.run_llm is not None:
# If the tool call result has a run_llm property, use it
@@ -228,14 +228,13 @@ class OpenAIRealtimeAssistantContextAggregator(OpenAIAssistantContextAggregator)
run_llm = not bool(self._function_calls_in_progress)
if run_llm:
await self._user_context_aggregator.push_context_frame()
await self.push_context_frame(FrameDirection.UPSTREAM)
# Emit the on_context_updated callback once the function call result is added to the context
if properties and properties.on_context_updated is not None:
await properties.on_context_updated()
frame = OpenAILLMContextFrame(self._context)
await self.push_frame(frame)
await self.push_context_frame()
except Exception as e:
logger.error(f"Error processing frame: {e}")


@@ -568,6 +568,6 @@ class OpenAIRealtimeBetaLLMService(LLMService):
OpenAIRealtimeLLMContext.upgrade_to_realtime(context)
user = OpenAIRealtimeUserContextAggregator(context)
assistant = OpenAIRealtimeAssistantContextAggregator(
user, expect_stripped_words=assistant_expect_stripped_words
context, expect_stripped_words=assistant_expect_stripped_words
)
return OpenAIContextAggregatorPair(_user=user, _assistant=assistant)


@@ -0,0 +1,141 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
from typing import List
from loguru import logger
from pipecat.metrics.metrics import LLMTokenUsage
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.openai import OpenAILLMService
try:
from openai import (
NOT_GIVEN,
AsyncStream,
)
from openai.types.chat import ChatCompletionChunk, ChatCompletionMessageParam
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
"In order to use Perplexity, you need to `pip install pipecat-ai[perplexity]`. Also, set `PERPLEXITY_API_KEY` environment variable."
)
raise Exception(f"Missing module: {e}")
class PerplexityLLMService(OpenAILLMService):
"""A service for interacting with Perplexity's API.
This service extends OpenAILLMService to work with Perplexity's API while maintaining
compatibility with the OpenAI-style interface. It specifically handles the difference
in token usage reporting between Perplexity (incremental) and OpenAI (final summary).
Args:
api_key (str): The API key for accessing Perplexity's API
base_url (str, optional): The base URL for Perplexity's API. Defaults to "https://api.perplexity.ai"
model (str, optional): The model identifier to use. Defaults to "sonar"
**kwargs: Additional keyword arguments passed to OpenAILLMService
"""
def __init__(
self,
*,
api_key: str,
base_url: str = "https://api.perplexity.ai",
model: str = "sonar",
**kwargs,
):
super().__init__(api_key=api_key, base_url=base_url, model=model, **kwargs)
# Counters for accumulating token usage metrics
self._prompt_tokens = 0
self._completion_tokens = 0
self._total_tokens = 0
self._has_reported_prompt_tokens = False
self._is_processing = False
async def get_chat_completions(
self, context: OpenAILLMContext, messages: List[ChatCompletionMessageParam]
) -> AsyncStream[ChatCompletionChunk]:
"""Get chat completions from Perplexity API using OpenAI-compatible parameters.
Args:
context: The context containing conversation history and settings
messages: The messages to send to the API
Returns:
A stream of chat completion chunks
"""
params = {
"model": self.model_name,
"stream": True,
"messages": messages,
}
# Add OpenAI-compatible parameters if they're set
if self._settings["frequency_penalty"] is not NOT_GIVEN:
params["frequency_penalty"] = self._settings["frequency_penalty"]
if self._settings["presence_penalty"] is not NOT_GIVEN:
params["presence_penalty"] = self._settings["presence_penalty"]
if self._settings["temperature"] is not NOT_GIVEN:
params["temperature"] = self._settings["temperature"]
if self._settings["top_p"] is not NOT_GIVEN:
params["top_p"] = self._settings["top_p"]
if self._settings["max_tokens"] is not NOT_GIVEN:
params["max_tokens"] = self._settings["max_tokens"]
chunks = await self._client.chat.completions.create(**params)
return chunks
async def _process_context(self, context: OpenAILLMContext):
"""Process a context through the LLM and accumulate token usage metrics.
This method overrides the parent class implementation to handle
Perplexity's incremental token reporting style, accumulating the counts
and reporting them once at the end of processing.
Args:
context (OpenAILLMContext): The context to process, containing messages
and other information needed for the LLM interaction.
"""
# Reset all counters and flags at the start of processing
self._prompt_tokens = 0
self._completion_tokens = 0
self._total_tokens = 0
self._has_reported_prompt_tokens = False
self._is_processing = True
try:
await super()._process_context(context)
finally:
self._is_processing = False
# Report final accumulated token usage at the end of processing
if self._prompt_tokens > 0 or self._completion_tokens > 0:
self._total_tokens = self._prompt_tokens + self._completion_tokens
tokens = LLMTokenUsage(
prompt_tokens=self._prompt_tokens,
completion_tokens=self._completion_tokens,
total_tokens=self._total_tokens,
)
await super().start_llm_usage_metrics(tokens)
async def start_llm_usage_metrics(self, tokens: LLMTokenUsage):
"""Accumulate token usage metrics during processing.
Perplexity reports token usage incrementally during streaming,
unlike OpenAI which provides a final summary. We accumulate the
counts and report the total at the end of processing.
"""
if not self._is_processing:
return
# Record prompt tokens the first time we see them
if not self._has_reported_prompt_tokens and tokens.prompt_tokens > 0:
self._prompt_tokens = tokens.prompt_tokens
self._has_reported_prompt_tokens = True
# Update completion tokens count if it has increased
if tokens.completion_tokens > self._completion_tokens:
self._completion_tokens = tokens.completion_tokens
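Note on the token accounting above: the override of `start_llm_usage_metrics` swallows each incremental usage update and `_process_context` reports a single total once streaming ends. The following is a minimal standalone sketch of that accumulation pattern, not the service itself. `Usage` and `UsageAccumulator` are hypothetical stand-ins for `LLMTokenUsage` and the counters kept on the service, and the sample numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical stand-in for pipecat's LLMTokenUsage."""

    prompt_tokens: int
    completion_tokens: int
    total_tokens: int = 0


class UsageAccumulator:
    """Collapses incremental usage updates into one final report."""

    def __init__(self) -> None:
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self._seen_prompt = False

    def update(self, usage: Usage) -> None:
        # Record prompt tokens only the first time a non-zero value appears.
        if not self._seen_prompt and usage.prompt_tokens > 0:
            self.prompt_tokens = usage.prompt_tokens
            self._seen_prompt = True
        # Completion tokens are treated as a running total: keep the largest seen.
        if usage.completion_tokens > self.completion_tokens:
            self.completion_tokens = usage.completion_tokens

    def final(self) -> Usage:
        # The total is derived once, at the end of processing.
        total = self.prompt_tokens + self.completion_tokens
        return Usage(self.prompt_tokens, self.completion_tokens, total)


acc = UsageAccumulator()
for update in (Usage(12, 1), Usage(12, 5), Usage(12, 9)):
    acc.update(update)
print(acc.final())
```

Keeping the maximum rather than summing the updates matters here: if each update already carries a running total, summing would overcount the completion tokens.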


@@ -120,6 +120,7 @@ class PlayHTTTSService(TTSService, WebsocketService):
):
TTSService.__init__(
self,
pause_frame_processing=True,
sample_rate=sample_rate,
**kwargs,
)
@@ -269,19 +270,6 @@ class PlayHTTTSService(TTSService, WebsocketService):
except json.JSONDecodeError:
logger.error(f"Invalid JSON message: {message}")
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
# If we received a TTSSpeakFrame and the LLM response included text (it
# might be that it's only a function calling response) we pause
# processing more frames until we receive a BotStoppedSpeakingFrame.
if isinstance(frame, TTSSpeakFrame):
await self.pause_processing_frames()
elif isinstance(frame, LLMFullResponseEndFrame) and self._request_id:
await self.pause_processing_frames()
elif isinstance(frame, BotStoppedSpeakingFrame):
await self.resume_processing_frames()
async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
logger.debug(f"Generating TTS: [{text}]")


@@ -4,6 +4,9 @@
# SPDX-License-Identifier: BSD 2-Clause License
#
import base64
import json
import uuid
from typing import AsyncGenerator, Optional
import aiohttp
@@ -11,13 +14,327 @@ from loguru import logger
from pydantic import BaseModel
from pipecat.frames.frames import (
BotStoppedSpeakingFrame,
CancelFrame,
EndFrame,
ErrorFrame,
Frame,
LLMFullResponseEndFrame,
StartFrame,
StartInterruptionFrame,
TTSAudioRawFrame,
TTSSpeakFrame,
TTSStartedFrame,
TTSStoppedFrame,
)
from pipecat.services.ai_services import TTSService
from pipecat.processors.frame_processor import FrameDirection
from pipecat.services.ai_services import AudioContextWordTTSService, TTSService
from pipecat.services.websocket_service import WebsocketService
from pipecat.transcriptions.language import Language
try:
import websockets
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
"In order to use Rime, you need to `pip install pipecat-ai[rime]`. Also, set `RIME_API_KEY` environment variable."
)
raise Exception(f"Missing module: {e}")
def language_to_rime_language(language: Language) -> str:
"""Convert pipecat Language to Rime language code.
Args:
language: The pipecat Language enum value.
Returns:
str: Three-letter language code used by Rime (e.g., 'eng' for English).
"""
LANGUAGE_MAP = {
Language.EN: "eng",
Language.ES: "spa",
}
return LANGUAGE_MAP.get(language, "eng")
class RimeTTSService(AudioContextWordTTSService, WebsocketService):
"""Text-to-Speech service using Rime's websocket API.
Uses Rime's websocket JSON API to convert text to speech with word-level timing
information. Supports interruptions and maintains context across multiple messages
within a turn.
"""
class InputParams(BaseModel):
"""Configuration parameters for Rime TTS service."""
language: Optional[Language] = Language.EN
speed_alpha: Optional[float] = 1.0
reduce_latency: Optional[bool] = False
def __init__(
self,
*,
api_key: str,
voice_id: str,
url: str = "wss://users-ws.rime.ai/ws2",
model: str = "mistv2",
sample_rate: Optional[int] = None,
params: InputParams = InputParams(),
**kwargs,
):
"""Initialize Rime TTS service.
Args:
api_key: Rime API key for authentication.
voice_id: ID of the voice to use.
url: Rime websocket API endpoint.
model: Model ID to use for synthesis.
sample_rate: Audio sample rate in Hz.
params: Additional configuration parameters.
"""
# Initialize with parent class settings for proper frame handling
AudioContextWordTTSService.__init__(
self,
aggregate_sentences=True,
push_text_frames=False,
push_stop_frames=True,
stop_frame_timeout_s=2.0,
pause_frame_processing=True,
sample_rate=sample_rate,
**kwargs,
)
WebsocketService.__init__(self)
# Store service configuration
self._api_key = api_key
self._url = url
self._voice_id = voice_id
self._model = model
self._settings = {
"speaker": voice_id,
"modelId": model,
"audioFormat": "pcm",
"samplingRate": 0,
"lang": self.language_to_service_language(params.language)
if params.language
else "eng",
"speedAlpha": params.speed_alpha,
"reduceLatency": params.reduce_latency,
}
# State tracking
self._context_id = None # Tracks current turn
self._receive_task = None
self._cumulative_time = 0 # Accumulates time across messages
def can_generate_metrics(self) -> bool:
return True
def language_to_service_language(self, language: Language) -> str | None:
"""Convert pipecat language to Rime language code."""
return language_to_rime_language(language)
async def set_model(self, model: str):
"""Update the TTS model."""
self._model = model
await super().set_model(model)
def _build_msg(self, text: str = "") -> dict:
"""Build JSON message for Rime API."""
return {"text": text, "contextId": self._context_id}
def _build_clear_msg(self) -> dict:
"""Build clear operation message."""
return {"operation": "clear"}
def _build_eos_msg(self) -> dict:
"""Build end-of-stream operation message."""
return {"operation": "eos"}
async def start(self, frame: StartFrame):
"""Start the service and establish websocket connection."""
await super().start(frame)
self._settings["samplingRate"] = self.sample_rate
await self._connect()
async def stop(self, frame: EndFrame):
"""Stop the service and close connection."""
await super().stop(frame)
await self._disconnect()
async def cancel(self, frame: CancelFrame):
"""Cancel current operation and clean up."""
await super().cancel(frame)
await self._disconnect()
async def _connect(self):
"""Establish websocket connection and start receive task."""
await self._connect_websocket()
self._receive_task = self.create_task(self._receive_task_handler(self.push_error))
async def _disconnect(self):
"""Close websocket connection and clean up tasks."""
await self._disconnect_websocket()
if self._receive_task:
await self.cancel_task(self._receive_task)
self._receive_task = None
async def _connect_websocket(self):
"""Connect to Rime websocket API with configured settings."""
try:
params = "&".join(f"{k}={v}" for k, v in self._settings.items())
url = f"{self._url}?{params}"
headers = {"Authorization": f"Bearer {self._api_key}"}
self._websocket = await websockets.connect(url, extra_headers=headers)
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
async def _disconnect_websocket(self):
"""Close websocket connection and reset state."""
try:
await self.stop_all_metrics()
if self._websocket:
await self._websocket.send(json.dumps(self._build_eos_msg()))
await self._websocket.close()
self._websocket = None
self._context_id = None
except Exception as e:
logger.error(f"{self} error closing websocket: {e}")
def _get_websocket(self):
"""Get active websocket connection or raise exception."""
if self._websocket:
return self._websocket
raise Exception("Websocket not connected")
async def _handle_interruption(self, frame: StartInterruptionFrame, direction: FrameDirection):
"""Handle interruption by clearing current context."""
await super()._handle_interruption(frame, direction)
await self.stop_all_metrics()
if self._context_id:
await self._get_websocket().send(json.dumps(self._build_clear_msg()))
self._context_id = None
def _calculate_word_times(self, words: list, starts: list, ends: list) -> list:
"""Calculate word timing pairs with proper spacing and punctuation.
Args:
words: List of words from Rime.
starts: List of start times for each word.
ends: List of end times for each word.
Returns:
List of (word, timestamp) pairs with proper timing.
"""
word_pairs = []
for i, (word, start_time, _) in enumerate(zip(words, starts, ends)):
if not word.strip():
continue
# Adjust timing by adding cumulative time
adjusted_start = start_time + self._cumulative_time
# Handle punctuation by appending to previous word
is_punctuation = bool(word.strip(",.!?") == "")
if is_punctuation and word_pairs:
prev_word, prev_time = word_pairs[-1]
word_pairs[-1] = (prev_word + word, prev_time)
else:
word_pairs.append((word, adjusted_start))
return word_pairs
async def flush_audio(self):
if not self._context_id or not self._websocket:
return
logger.trace(f"{self}: flushing audio")
self._context_id = None
async def _receive_messages(self):
"""Process incoming websocket messages."""
async for message in self._get_websocket():
msg = json.loads(message)
if not msg or not self.audio_context_available(msg["contextId"]):
continue
if msg["type"] == "chunk":
# Process audio chunk
await self.stop_ttfb_metrics()
self.start_word_timestamps()
frame = TTSAudioRawFrame(
audio=base64.b64decode(msg["data"]),
sample_rate=self.sample_rate,
num_channels=1,
)
await self.append_to_audio_context(msg["contextId"], frame)
elif msg["type"] == "timestamps":
# Process word timing information
timestamps = msg.get("word_timestamps", {})
words = timestamps.get("words", [])
starts = timestamps.get("start", [])
ends = timestamps.get("end", [])
if words and starts:
# Calculate word timing pairs
word_pairs = self._calculate_word_times(words, starts, ends)
if word_pairs:
await self.add_word_timestamps(word_pairs)
self._cumulative_time = ends[-1] + self._cumulative_time
logger.debug(f"Updated cumulative time to: {self._cumulative_time}")
elif msg["type"] == "error":
logger.error(f"{self} error: {msg}")
await self.push_frame(TTSStoppedFrame())
await self.stop_all_metrics()
await self.push_error(ErrorFrame(f"{self} error: {msg['message']}"))
self._context_id = None
async def push_frame(self, frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM):
"""Push frame and handle end-of-turn conditions."""
await super().push_frame(frame, direction)
if isinstance(frame, (TTSStoppedFrame, StartInterruptionFrame)):
if isinstance(frame, TTSStoppedFrame):
await self.add_word_timestamps([("LLMFullResponseEndFrame", 0), ("Reset", 0)])
async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
"""Generate speech from text.
Args:
text: The text to convert to speech.
Yields:
Frames containing audio data and timing information.
"""
logger.debug(f"Generating TTS: [{text}]")
try:
if not self._websocket:
await self._connect()
try:
if not self._context_id:
await self.start_ttfb_metrics()
yield TTSStartedFrame()
self._cumulative_time = 0
self._context_id = str(uuid.uuid4())
await self.create_audio_context(self._context_id)
msg = self._build_msg(text=text)
await self._get_websocket().send(json.dumps(msg))
await self.start_tts_usage_metrics(text)
except Exception as e:
logger.error(f"{self} error sending message: {e}")
yield TTSStoppedFrame()
await self._disconnect()
await self._connect()
return
yield None
except Exception as e:
logger.error(f"{self} exception: {e}")
class RimeHttpTTSService(TTSService):
@@ -33,7 +350,7 @@ class RimeHttpTTSService(TTSService):
*,
api_key: str,
voice_id: str = "eva",
model: str = "mist",
model: str = "mistv2",
sample_rate: Optional[int] = None,
params: InputParams = InputParams(),
**kwargs,
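Note on `_calculate_word_times` in the websocket service above: it skips empty tokens, shifts each start time by the time accumulated from earlier messages in the turn, and folds punctuation-only tokens into the preceding word. Below is a standalone sketch of the same logic as a plain function. The name `merge_word_times`, the `offset` parameter (standing in for `self._cumulative_time`) and the sample input are assumptions made for illustration.

```python
from typing import List, Tuple


def merge_word_times(
    words: List[str], starts: List[float], offset: float = 0.0
) -> List[Tuple[str, float]]:
    """Pair words with start times, attaching punctuation to the previous word."""
    pairs: List[Tuple[str, float]] = []
    for word, start in zip(words, starts):
        # Skip empty or whitespace-only tokens.
        if not word.strip():
            continue
        # A token made up only of , . ! ? is glued onto the previous word
        # and inherits that word's timestamp.
        if word.strip(",.!?") == "" and pairs:
            prev_word, prev_time = pairs[-1]
            pairs[-1] = (prev_word + word, prev_time)
        else:
            # Shift by the time already consumed by earlier messages.
            pairs.append((word, start + offset))
    return pairs


pairs = merge_word_times(["Hello", ",", "world", "!"], [0.0, 0.4, 0.5, 0.9], offset=2.0)
print(pairs)
```

A punctuation token that arrives before any word has nothing to attach to, so it is kept as its own entry, which mirrors the `and word_pairs` guard in the diff.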


@@ -41,7 +41,7 @@ class SimliVideoService(FrameProcessor):
self._pipecat_resampler_event = asyncio.Event()
self._pipecat_resampler: AudioResampler = None
self._simli_resampler = AudioResampler("s16", 1, 16000)
self._simli_resampler = AudioResampler("s16", "mono", 16000)
self._audio_task: asyncio.Task = None
self._video_task: asyncio.Task = None


@@ -10,6 +10,7 @@ from typing import Awaitable, Callable, Optional
import websockets
from loguru import logger
from websockets.protocol import State
from pipecat.frames.frames import ErrorFrame
@@ -83,8 +84,13 @@ class WebsocketService(ABC):
while True:
try:
await self._receive_messages()
logger.debug(f"{self} connection established successfully")
retry_count = 0 # Reset counter on successful message receive
if self._websocket and self._websocket.state == State.CLOSED:
raise websockets.ConnectionClosedOK(
self._websocket.close_rcvd,
self._websocket.close_sent,
self._websocket.close_rcvd_then_sent,
)
except Exception as e:
retry_count += 1
if retry_count >= MAX_RETRIES:


@@ -15,6 +15,7 @@ from loguru import logger
from pipecat.frames.frames import ErrorFrame, Frame, TranscriptionFrame
from pipecat.services.ai_services import SegmentedSTTService
from pipecat.transcriptions.language import Language
from pipecat.utils.time import time_now_iso8601
try:
@@ -26,18 +27,219 @@ except ModuleNotFoundError as e:
class Model(Enum):
"""Class of basic Whisper model selection options"""
"""Class of basic Whisper model selection options.
Available models:
Multilingual models:
TINY: Smallest multilingual model
BASE: Basic multilingual model
MEDIUM: Good balance for multilingual
LARGE: Best quality multilingual
DISTIL_LARGE_V2: Fast multilingual
English-only models:
DISTIL_MEDIUM_EN: Fast English-only
"""
# Multilingual models
TINY = "tiny"
BASE = "base"
MEDIUM = "medium"
LARGE = "large-v3"
DISTIL_LARGE_V2 = "Systran/faster-distil-whisper-large-v2"
# English-only models
DISTIL_MEDIUM_EN = "Systran/faster-distil-whisper-medium.en"
def language_to_whisper_language(language: Language) -> Optional[str]:
"""Maps pipecat Language enum to Whisper language codes.
Args:
language: A Language enum value representing the input language.
Returns:
str or None: The corresponding Whisper language code, or None if not supported.
Note:
Only includes languages officially supported by Whisper.
"""
language_map = {
# Arabic
Language.AR: "ar",
Language.AR_AE: "ar",
Language.AR_BH: "ar",
Language.AR_DZ: "ar",
Language.AR_EG: "ar",
Language.AR_IQ: "ar",
Language.AR_JO: "ar",
Language.AR_KW: "ar",
Language.AR_LB: "ar",
Language.AR_LY: "ar",
Language.AR_MA: "ar",
Language.AR_OM: "ar",
Language.AR_QA: "ar",
Language.AR_SA: "ar",
Language.AR_SY: "ar",
Language.AR_TN: "ar",
Language.AR_YE: "ar",
# Bengali
Language.BN: "bn",
Language.BN_BD: "bn",
Language.BN_IN: "bn",
# Czech
Language.CS: "cs",
Language.CS_CZ: "cs",
# Danish
Language.DA: "da",
Language.DA_DK: "da",
# German
Language.DE: "de",
Language.DE_AT: "de",
Language.DE_CH: "de",
Language.DE_DE: "de",
# Greek
Language.EL: "el",
Language.EL_GR: "el",
# English
Language.EN: "en",
Language.EN_AU: "en",
Language.EN_CA: "en",
Language.EN_GB: "en",
Language.EN_HK: "en",
Language.EN_IE: "en",
Language.EN_IN: "en",
Language.EN_KE: "en",
Language.EN_NG: "en",
Language.EN_NZ: "en",
Language.EN_PH: "en",
Language.EN_SG: "en",
Language.EN_TZ: "en",
Language.EN_US: "en",
Language.EN_ZA: "en",
# Spanish
Language.ES: "es",
Language.ES_AR: "es",
Language.ES_BO: "es",
Language.ES_CL: "es",
Language.ES_CO: "es",
Language.ES_CR: "es",
Language.ES_CU: "es",
Language.ES_DO: "es",
Language.ES_EC: "es",
Language.ES_ES: "es",
Language.ES_GQ: "es",
Language.ES_GT: "es",
Language.ES_HN: "es",
Language.ES_MX: "es",
Language.ES_NI: "es",
Language.ES_PA: "es",
Language.ES_PE: "es",
Language.ES_PR: "es",
Language.ES_PY: "es",
Language.ES_SV: "es",
Language.ES_US: "es",
Language.ES_UY: "es",
Language.ES_VE: "es",
# Persian
Language.FA: "fa",
Language.FA_IR: "fa",
# Finnish
Language.FI: "fi",
Language.FI_FI: "fi",
# French
Language.FR: "fr",
Language.FR_BE: "fr",
Language.FR_CA: "fr",
Language.FR_CH: "fr",
Language.FR_FR: "fr",
# Hindi
Language.HI: "hi",
Language.HI_IN: "hi",
# Hungarian
Language.HU: "hu",
Language.HU_HU: "hu",
# Indonesian
Language.ID: "id",
Language.ID_ID: "id",
# Italian
Language.IT: "it",
Language.IT_IT: "it",
# Japanese
Language.JA: "ja",
Language.JA_JP: "ja",
# Korean
Language.KO: "ko",
Language.KO_KR: "ko",
# Dutch
Language.NL: "nl",
Language.NL_BE: "nl",
Language.NL_NL: "nl",
# Polish
Language.PL: "pl",
Language.PL_PL: "pl",
# Portuguese
Language.PT: "pt",
Language.PT_BR: "pt",
Language.PT_PT: "pt",
# Romanian
Language.RO: "ro",
Language.RO_RO: "ro",
# Russian
Language.RU: "ru",
Language.RU_RU: "ru",
# Slovak
Language.SK: "sk",
Language.SK_SK: "sk",
# Swedish
Language.SV: "sv",
Language.SV_SE: "sv",
# Thai
Language.TH: "th",
Language.TH_TH: "th",
# Turkish
Language.TR: "tr",
Language.TR_TR: "tr",
# Ukrainian
Language.UK: "uk",
Language.UK_UA: "uk",
# Urdu
Language.UR: "ur",
Language.UR_IN: "ur",
Language.UR_PK: "ur",
# Vietnamese
Language.VI: "vi",
Language.VI_VN: "vi",
# Chinese
Language.ZH: "zh",
Language.ZH_CN: "zh",
Language.ZH_HK: "zh",
Language.ZH_TW: "zh",
}
return language_map.get(language)
class WhisperSTTService(SegmentedSTTService):
"""Class to transcribe audio with a locally-downloaded Whisper model"""
"""Class to transcribe audio with a locally-downloaded Whisper model.
This service uses Faster Whisper to perform speech-to-text transcription on audio
segments. It supports multiple languages and various model sizes.
Args:
model: The Whisper model to use for transcription. Can be a Model enum or string.
device: The device to run inference on ('cpu', 'cuda', or 'auto').
compute_type: The compute type for inference ('default', 'int8', 'int8_float16', etc.).
no_speech_prob: Probability threshold for filtering out non-speech segments.
language: The default language for transcription.
**kwargs: Additional arguments passed to SegmentedSTTService.
Attributes:
_device: The device used for inference.
_compute_type: The compute type for inference.
_no_speech_prob: Threshold for non-speech filtering.
_model: The loaded Whisper model instance.
_settings: Dictionary containing service settings.
"""
def __init__(
self,
@@ -46,6 +248,7 @@ class WhisperSTTService(SegmentedSTTService):
device: str = "auto",
compute_type: str = "default",
no_speech_prob: float = 0.4,
language: Language = Language.EN,
**kwargs,
):
super().__init__(**kwargs)
@@ -54,14 +257,47 @@ class WhisperSTTService(SegmentedSTTService):
self.set_model_name(model if isinstance(model, str) else model.value)
self._no_speech_prob = no_speech_prob
self._model: Optional[WhisperModel] = None
self._settings = {
"language": language,
}
self._load()
def can_generate_metrics(self) -> bool:
"""Indicates whether this service can generate metrics.
Returns:
bool: True, as this service supports metric generation.
"""
return True
def language_to_service_language(self, language: Language) -> Optional[str]:
"""Convert from pipecat Language to Whisper language code.
Args:
language: The Language enum value to convert.
Returns:
str or None: The corresponding Whisper language code, or None if not supported.
"""
return language_to_whisper_language(language)
async def set_language(self, language: Language):
"""Set the language for transcription.
Args:
language: The Language enum value to use for transcription.
"""
logger.info(f"Switching STT language to: [{language}]")
self._settings["language"] = language
def _load(self):
"""Loads the Whisper model. Note that if this is the first time
this model is being run, it will take time to download.
"""Loads the Whisper model.
Note:
If this is the first time this model is being run,
it will take time to download from the Hugging Face model hub.
"""
logger.debug("Loading Whisper model...")
self._model = WhisperModel(
@@ -70,7 +306,19 @@ class WhisperSTTService(SegmentedSTTService):
logger.debug("Loaded Whisper model")
async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
"""Transcribes given audio using Whisper"""
"""Transcribes given audio using Whisper.
Args:
audio: Raw audio bytes in 16-bit PCM format.
Yields:
Frame: Either a TranscriptionFrame containing the transcribed text
or an ErrorFrame if transcription fails.
Note:
The audio is expected to be 16-bit signed PCM data.
The service will normalize it to float32 in the range [-1, 1].
"""
if not self._model:
logger.error(f"{self} error: Whisper model not available")
yield ErrorFrame("Whisper model not available")
@@ -82,7 +330,10 @@ class WhisperSTTService(SegmentedSTTService):
# Divide by 32768 because we have signed 16-bit data.
audio_float = np.frombuffer(audio, dtype=np.int16).astype(np.float32) / 32768.0
segments, _ = await asyncio.to_thread(self._model.transcribe, audio_float)
whisper_lang = self.language_to_service_language(self._settings["language"])
segments, _ = await asyncio.to_thread(
self._model.transcribe, audio_float, language=whisper_lang
)
text: str = ""
for segment in segments:
if segment.no_speech_prob < self._no_speech_prob:
@@ -93,4 +344,4 @@ class WhisperSTTService(SegmentedSTTService):
if text:
logger.debug(f"Transcription: [{text}]")
yield TranscriptionFrame(text, "", time_now_iso8601())
yield TranscriptionFrame(text, "", time_now_iso8601(), self._settings["language"])
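Note on the audio handling in `run_stt` above: the service divides signed 16-bit samples by 32768 to get floats in [-1, 1) before calling the model. The sketch below shows that scaling using only the standard library so it runs without NumPy; the diff itself uses `np.frombuffer`. The function name `pcm16_to_float` and the little-endian byte order are assumptions of this example.

```python
import struct
from typing import List


def pcm16_to_float(audio: bytes) -> List[float]:
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    count = len(audio) // 2  # two bytes per sample
    samples = struct.unpack(f"<{count}h", audio[: count * 2])
    # Dividing by 32768 maps the int16 range [-32768, 32767] onto [-1.0, 1.0).
    return [sample / 32768.0 for sample in samples]


raw = struct.pack("<3h", 0, 16384, -32768)
floats = pcm16_to_float(raw)
print(floats)
```

The divisor is 32768 rather than 32767 because the most negative int16 value is -32768, so the result never falls below -1.0.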


@@ -5,6 +5,7 @@
#
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, Sequence, Tuple
from pipecat.frames.frames import (
@@ -12,6 +13,7 @@ from pipecat.frames.frames import (
Frame,
HeartbeatFrame,
StartFrame,
SystemFrame,
)
from pipecat.observers.base_observer import BaseObserver
from pipecat.pipeline.pipeline import Pipeline
@@ -20,6 +22,16 @@ from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
@dataclass
class SleepFrame(SystemFrame):
"""This frame is used by test framework to introduce some sleep time before
the next frame is pushed. This is useful to control system frames vs data or
control frames.
"""
sleep: float = 0.1
class HeartbeatsObserver(BaseObserver):
def __init__(
self,
@@ -44,7 +56,11 @@ class HeartbeatsObserver(BaseObserver):
class QueuedFrameProcessor(FrameProcessor):
def __init__(
self, queue: asyncio.Queue, queue_direction: FrameDirection, ignore_start: bool = True
self,
*,
queue: asyncio.Queue,
queue_direction: FrameDirection,
ignore_start: bool = True,
):
super().__init__()
self._queue = queue
@@ -72,21 +88,35 @@ async def run_test(
) -> Tuple[Sequence[Frame], Sequence[Frame]]:
received_up = asyncio.Queue()
received_down = asyncio.Queue()
source = QueuedFrameProcessor(received_up, FrameDirection.UPSTREAM, ignore_start)
sink = QueuedFrameProcessor(received_down, FrameDirection.DOWNSTREAM, ignore_start)
source = QueuedFrameProcessor(
queue=received_up,
queue_direction=FrameDirection.UPSTREAM,
ignore_start=ignore_start,
)
sink = QueuedFrameProcessor(
queue=received_down,
queue_direction=FrameDirection.DOWNSTREAM,
ignore_start=ignore_start,
)
pipeline = Pipeline([source, processor, sink])
task = PipelineTask(pipeline, params=PipelineParams(start_metadata=start_metadata))
for frame in frames_to_send:
await task.queue_frame(frame)
async def push_frames():
# Just give a little head start to the runner.
await asyncio.sleep(0.01)
for frame in frames_to_send:
if isinstance(frame, SleepFrame):
await asyncio.sleep(frame.sleep)
else:
await task.queue_frame(frame)
if send_end_frame:
await task.queue_frame(EndFrame())
if send_end_frame:
await task.queue_frame(EndFrame())
runner = PipelineRunner()
await runner.run(task)
await asyncio.gather(runner.run(task), push_frames())
#
# Down frames
@@ -98,6 +128,7 @@ async def run_test(
received_down_frames.append(frame)
print("received DOWN frames =", received_down_frames)
print("expected DOWN frames =", expected_down_frames)
assert len(received_down_frames) == len(expected_down_frames)
@@ -113,6 +144,7 @@ async def run_test(
received_up_frames.append(frame)
print("received UP frames =", received_up_frames)
print("expected UP frames =", expected_up_frames)
assert len(received_up_frames) == len(expected_up_frames)
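Note on the `run_test` change above: frames are now pushed from a coroutine that runs concurrently with the pipeline runner, and a `SleepFrame` in the input list pauses the producer instead of being queued. The sketch below reproduces that pattern with a plain `asyncio.Queue`. `Sleep`, `producer`, `consumer` and the `None` end marker are hypothetical stand-ins for `SleepFrame`, `push_frames`, the pipeline runner and `EndFrame`.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Sleep:
    """Hypothetical marker: pause the producer instead of queueing an item."""

    seconds: float = 0.1


async def consumer(queue: asyncio.Queue, received: List[Any]) -> None:
    # Stand-in for the pipeline runner: drain items until the end marker.
    while True:
        item = await queue.get()
        if item is None:
            break
        received.append(item)


async def producer(queue: asyncio.Queue, items: List[Any]) -> None:
    # Give the consumer a small head start, as the diff does for the runner.
    await asyncio.sleep(0.01)
    for item in items:
        if isinstance(item, Sleep):
            await asyncio.sleep(item.seconds)
        else:
            await queue.put(item)
    await queue.put(None)  # end marker, playing the role of EndFrame


async def run(items: List[Any]) -> List[Any]:
    queue: asyncio.Queue = asyncio.Queue()
    received: List[Any] = []
    # Run both sides concurrently so items arrive while the consumer is live.
    await asyncio.gather(consumer(queue, received), producer(queue, items))
    return received


result = asyncio.run(run(["a", Sleep(0.05), "b"]))
print(result)
```

Running the producer alongside the consumer, rather than filling the queue beforehand, is what lets a test control the timing between items.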


@@ -54,6 +54,9 @@ class Language(StrEnum):
AZ = "az"
AZ_AZ = "az-AZ"
# Belarusian
BE = "be"
# Bulgarian
BG = "bg"
BG_BG = "bg-BG"
@@ -98,6 +101,7 @@ class Language(StrEnum):
     EN_AU = "en-AU"
     EN_CA = "en-CA"
     EN_GB = "en-GB"
+    EN_GH = "en-GH"
     EN_HK = "en-HK"
     EN_IE = "en-IE"
     EN_IN = "en-IN"
@@ -205,6 +209,7 @@ class Language(StrEnum):
     # Italian
     IT = "it"
     IT_IT = "it-IT"
+    IT_CH = "it-CH"
     # Inuktitut
     IU_CANS = "iu-Cans"
@@ -264,6 +269,9 @@ class Language(StrEnum):
     MN = "mn"
     MN_MN = "mn-MN"
+    # Maori
+    MI = "mi"
     # Marathi
     MR = "mr"
     MR_IN = "mr-IN"

@@ -14,6 +14,8 @@ from pipecat.audio.vad.vad_analyzer import VADAnalyzer, VADState
 from pipecat.frames.frames import (
     BotInterruptionFrame,
     CancelFrame,
+    EmulateUserStartedSpeakingFrame,
+    EmulateUserStoppedSpeakingFrame,
     EndFrame,
     FilterUpdateSettingsFrame,
     Frame,
@@ -47,6 +49,13 @@ class BaseInputTransport(FrameProcessor):
         # if passthrough is enabled.
         self._audio_task = None
+    def enable_audio_in_stream_on_start(self, enabled: bool) -> None:
+        logger.debug(f"Enabling audio on start. {enabled}")
+        self._params.audio_in_stream_on_start = enabled
+    def start_audio_in_streaming(self):
+        pass
     @property
     def sample_rate(self) -> int:
         return self._sample_rate
@@ -105,9 +114,13 @@ class BaseInputTransport(FrameProcessor):
             await self.cancel(frame)
             await self.push_frame(frame, direction)
         elif isinstance(frame, BotInterruptionFrame):
-            logger.debug("Bot interruption")
-            await self._start_interruption()
-            await self.push_frame(StartInterruptionFrame())
+            await self._handle_bot_interruption(frame)
+        elif isinstance(frame, EmulateUserStartedSpeakingFrame):
+            logger.debug("Emulating user started speaking")
+            await self._handle_user_interruption(UserStartedSpeakingFrame())
+        elif isinstance(frame, EmulateUserStoppedSpeakingFrame):
+            logger.debug("Emulating user stopped speaking")
+            await self._handle_user_interruption(UserStoppedSpeakingFrame())
         # All other system frames
         elif isinstance(frame, SystemFrame):
             await self.push_frame(frame, direction)
@@ -130,7 +143,13 @@ class BaseInputTransport(FrameProcessor):
     # Handle interruptions
     #
-    async def _handle_interruptions(self, frame: Frame):
+    async def _handle_bot_interruption(self, frame: BotInterruptionFrame):
+        logger.debug("Bot interruption")
+        if self.interruptions_allowed:
+            await self._start_interruption()
+            await self.push_frame(StartInterruptionFrame())
+    async def _handle_user_interruption(self, frame: Frame):
         if isinstance(frame, UserStartedSpeakingFrame):
             logger.debug("User started speaking")
             # Make sure we notify about interruptions quickly out-of-band.
@@ -176,7 +195,7 @@ class BaseInputTransport(FrameProcessor):
                 frame = UserStoppedSpeakingFrame()
             if frame:
-                await self._handle_interruptions(frame)
+                await self._handle_user_interruption(frame)
             vad_state = new_vad_state
         return vad_state
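
The input-transport hunks above route three frame types through two handlers: bot interruptions are now gated on `interruptions_allowed`, and the two `Emulate...` frames are translated into the regular user speaking frames. The sketch below mirrors only that dispatch with stand-in classes. It assumes nothing about pipecat beyond the names visible in the diff; `SketchInputTransport`, its `pushed` list, and the simplified `_handle_user_interruption` body are hypothetical.

```python
import asyncio


# Empty stand-ins for the frame types named in the diff.
class BotInterruptionFrame: pass
class EmulateUserStartedSpeakingFrame: pass
class EmulateUserStoppedSpeakingFrame: pass
class UserStartedSpeakingFrame: pass
class UserStoppedSpeakingFrame: pass
class StartInterruptionFrame: pass


class SketchInputTransport:
    """Hypothetical reduction of the dispatch logic, not the real class."""

    def __init__(self, interruptions_allowed: bool):
        self.interruptions_allowed = interruptions_allowed
        self.pushed = []  # names of frames pushed, recorded for inspection

    async def push_frame(self, frame) -> None:
        self.pushed.append(type(frame).__name__)

    async def process_frame(self, frame) -> None:
        if isinstance(frame, BotInterruptionFrame):
            await self._handle_bot_interruption(frame)
        elif isinstance(frame, EmulateUserStartedSpeakingFrame):
            await self._handle_user_interruption(UserStartedSpeakingFrame())
        elif isinstance(frame, EmulateUserStoppedSpeakingFrame):
            await self._handle_user_interruption(UserStoppedSpeakingFrame())

    async def _handle_bot_interruption(self, frame) -> None:
        # Interrupt only when the pipeline allows interruptions.
        if self.interruptions_allowed:
            await self.push_frame(StartInterruptionFrame())

    async def _handle_user_interruption(self, frame) -> None:
        # Simplified assumption: the real handler does more than forward.
        await self.push_frame(frame)


async def demo(allowed: bool) -> list:
    transport = SketchInputTransport(interruptions_allowed=allowed)
    for frame in (
        BotInterruptionFrame(),
        EmulateUserStartedSpeakingFrame(),
        EmulateUserStoppedSpeakingFrame(),
    ):
        await transport.process_frame(frame)
    return transport.pushed


pushed_blocked = asyncio.run(demo(allowed=False))
pushed_allowed = asyncio.run(demo(allowed=True))
```

With interruptions disabled, the bot interruption is dropped while the emulated user frames still reach the user handler.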

@@ -14,7 +14,6 @@ from loguru import logger
 from PIL import Image
 from pipecat.audio.utils import create_default_resampler
-from pipecat.audio.vad.vad_analyzer import VAD_STOP_SECS
 from pipecat.frames.frames import (
     BotSpeakingFrame,
     BotStartedSpeakingFrame,
@@ -38,6 +37,8 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
 from pipecat.transports.base_transport import TransportParams
 from pipecat.utils.time import nanoseconds_to_seconds
+BOT_VAD_STOP_SECS = 0.3
 class BaseOutputTransport(FrameProcessor):
     def __init__(self, params: TransportParams, **kwargs):
@@ -169,6 +170,8 @@ class BaseOutputTransport(FrameProcessor):
         # TODO(aleix): Images and audio should support presentation timestamps.
         elif frame.pts:
             await self._sink_clock_queue.put((frame.pts, frame.id, frame))
+        elif direction == FrameDirection.UPSTREAM:
+            await self.push_frame(frame, direction)
         else:
             await self._sink_queue.put(frame)
@@ -321,15 +324,10 @@ class BaseOutputTransport(FrameProcessor):
                     )
                     yield frame
-        vad_stop_secs = (
-            self._params.vad_analyzer.params.stop_secs
-            if self._params.vad_analyzer
-            else VAD_STOP_SECS
-        )
         if self._params.audio_out_mixer:
-            return with_mixer(vad_stop_secs)
+            return with_mixer(BOT_VAD_STOP_SECS)
         else:
-            return without_mixer(vad_stop_secs)
+            return without_mixer(BOT_VAD_STOP_SECS)
     async def _sink_task_handler(self):
         async for frame in self._next_frame():

@@ -39,6 +39,7 @@ class TransportParams(BaseModel):
     audio_in_sample_rate: Optional[int] = None
     audio_in_channels: int = 1
     audio_in_filter: Optional[BaseAudioFilter] = None
+    audio_in_stream_on_start: bool = True
     vad_enabled: bool = False
     vad_audio_passthrough: bool = False
     vad_analyzer: Optional[VADAnalyzer] = None

@@ -26,10 +26,18 @@ except ModuleNotFoundError as e:
     raise Exception(f"Missing module: {e}")
+class LocalAudioTransportParams(TransportParams):
+    input_device_index: Optional[int] = None
+    output_device_index: Optional[int] = None
 class LocalAudioInputTransport(BaseInputTransport):
-    def __init__(self, py_audio: pyaudio.PyAudio, params: TransportParams):
+    _params: LocalAudioTransportParams
+    def __init__(self, py_audio: pyaudio.PyAudio, params: LocalAudioTransportParams):
         super().__init__(params)
         self._py_audio = py_audio
         self._in_stream = None
         self._sample_rate = 0
@@ -46,6 +54,7 @@ class LocalAudioInputTransport(BaseInputTransport):
             frames_per_buffer=num_frames,
             stream_callback=self._audio_in_callback,
             input=True,
+            input_device_index=self._params.input_device_index,
         )
         self._in_stream.start_stream()
@@ -69,9 +78,12 @@ class LocalAudioInputTransport(BaseInputTransport):
 class LocalAudioOutputTransport(BaseOutputTransport):
-    def __init__(self, py_audio: pyaudio.PyAudio, params: TransportParams):
+    _params: LocalAudioTransportParams
+    def __init__(self, py_audio: pyaudio.PyAudio, params: LocalAudioTransportParams):
         super().__init__(params)
         self._py_audio = py_audio
         self._out_stream = None
         self._sample_rate = 0
@@ -89,6 +101,7 @@ class LocalAudioOutputTransport(BaseOutputTransport):
             channels=self._params.audio_out_channels,
             rate=self._sample_rate,
             output=True,
+            output_device_index=self._params.output_device_index,
         )
         self._out_stream.start_stream()
@@ -106,7 +119,7 @@ class LocalAudioOutputTransport(BaseOutputTransport):
 class LocalAudioTransport(BaseTransport):
-    def __init__(self, params: TransportParams):
+    def __init__(self, params: LocalAudioTransportParams):
         super().__init__()
         self._params = params
         self._pyaudio = pyaudio.PyAudio()
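
The new `input_device_index` and `output_device_index` fields are passed straight to the PyAudio stream as the matching keyword arguments, so a caller needs a device index to make use of them. One way to pick an index by name is sketched below. `find_device_index` is a hypothetical helper, not part of pipecat or PyAudio, and the sample list only imitates the dictionaries that PyAudio's `get_device_info_by_index()` returns (keys `index`, `name`, `maxInputChannels`, `maxOutputChannels`).

```python
from typing import Any, Mapping, Optional, Sequence


def find_device_index(
    devices: Sequence[Mapping[str, Any]],
    name_part: str,
    channel_key: str,
) -> Optional[int]:
    """Return the index of the first device whose name contains name_part
    and that has at least one channel of the requested kind."""
    wanted = name_part.lower()
    for info in devices:
        if wanted in str(info["name"]).lower() and int(info[channel_key]) > 0:
            return int(info["index"])
    return None


# Sample data shaped like PyAudio device info dictionaries (illustrative).
devices = [
    {"index": 0, "name": "Built-in Microphone", "maxInputChannels": 2, "maxOutputChannels": 0},
    {"index": 1, "name": "Built-in Output", "maxInputChannels": 0, "maxOutputChannels": 2},
    {"index": 2, "name": "USB Headset", "maxInputChannels": 1, "maxOutputChannels": 2},
]

input_index = find_device_index(devices, "usb", "maxInputChannels")
output_index = find_device_index(devices, "built-in", "maxOutputChannels")
missing_index = find_device_index(devices, "bluetooth", "maxInputChannels")
```

With a real `pyaudio.PyAudio()` instance `p`, the list could be built from `p.get_device_info_by_index(i)` for each `i` in `range(p.get_device_count())`, and the results handed to `LocalAudioTransportParams(input_device_index=..., output_device_index=...)`. A `None` result matches the field default, which leaves the device choice to PyAudio. Note that the Tk transport in this diff names the equivalent fields `audio_input_device_index` and `audio_output_device_index`.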

@@ -34,8 +34,15 @@ except ModuleNotFoundError as e:
     raise Exception(f"Missing module: {e}")
+class TkTransportParams(TransportParams):
+    audio_input_device_index: Optional[int] = None
+    audio_output_device_index: Optional[int] = None
 class TkInputTransport(BaseInputTransport):
-    def __init__(self, py_audio: pyaudio.PyAudio, params: TransportParams):
+    _params: TkTransportParams
+    def __init__(self, py_audio: pyaudio.PyAudio, params: TkTransportParams):
         super().__init__(params)
         self._py_audio = py_audio
         self._in_stream = None
@@ -54,6 +61,7 @@ class TkInputTransport(BaseInputTransport):
             frames_per_buffer=num_frames,
             stream_callback=self._audio_in_callback,
             input=True,
+            input_device_index=self._params.audio_input_device_index,
         )
         self._in_stream.start_stream()
@@ -76,6 +84,8 @@ class TkInputTransport(BaseInputTransport):
 class TkOutputTransport(BaseOutputTransport):
+    _params: TkTransportParams
     def __init__(self, tk_root: tk.Tk, py_audio: pyaudio.PyAudio, params: TransportParams):
         super().__init__(params)
         self._py_audio = py_audio
@@ -103,6 +113,7 @@ class TkOutputTransport(BaseOutputTransport):
             channels=self._params.audio_out_channels,
             rate=self._sample_rate,
             output=True,
+            output_device_index=self._params.audio_output_device_index,
         )
         self._out_stream.start_stream()

Some files were not shown because too many files have changed in this diff.