Compare commits

493 Commits

Author SHA1 Message Date
Mark Backman
708ef71c96 Update python-compatibility workflow to include new user project check 2025-08-09 20:19:16 -04:00
Aleix Conchillo Flaqué
241ab19228 update uv.lock with numba dependency 2025-08-08 15:12:55 -07:00
Mark Backman
c08e8ec8fb Merge pull request #2391 from pipecat-ai/mb/readme-local-dev
Update README with local dev setup for contributors
2025-08-08 11:15:58 -07:00
Mark Backman
eb9bc9644e Merge pull request #2400 from pipecat-ai/mb/pin-numba-0.61.2
fix: pin numba to >=0.61.2
2025-08-08 11:15:22 -07:00
Mark Backman
3a306dae90 fix: pin numba to >=0.61.2 2025-08-08 10:52:47 -04:00
Mark Backman
c42cc8254f Update README with local dev setup for contributors 2025-08-07 22:07:35 -04:00
Aleix Conchillo Flaqué
a8e21f7d5d Merge pull request #2395 from pipecat-ai/aleix/examples-15-inherit-parallel-pipeline
examples(foundational): move 15/15a logic into its own processor
2025-08-07 17:59:28 -07:00
Aleix Conchillo Flaqué
c6ef8de578 scripts(evals): fix 14v-function-calling-openai.py 2025-08-07 17:57:47 -07:00
Aleix Conchillo Flaqué
fc571fba42 examples(foundational): move 15/15a logic into its own processor 2025-08-07 17:57:47 -07:00
Mark Backman
0502ee2b5a Merge pull request #2394 from pipecat-ai/mb/uv-lock
Update uv.lock
2025-08-07 15:25:38 -07:00
Mark Backman
9ec047094b Update uv.lock 2025-08-07 18:24:47 -04:00
Mark Backman
d991c106c8 Merge pull request #2393 from pipecat-ai/mb/openai-dep
fix: pin openai package upper bound to <=1.99.1
2025-08-07 15:19:05 -07:00
Mark Backman
312fb23c89 fix: pin openai package upper bound to <=1.99.1 2025-08-07 18:00:25 -04:00
Aleix Conchillo Flaqué
4d7f21d44e Merge pull request #2392 from pipecat-ai/aleix/avoid-using-tts-say
deprecate TTSService.say() method
2025-08-07 13:55:49 -07:00
Aleix Conchillo Flaqué
ec25d0a7c9 examples(foundational): fix 20a-persistent-context-openai 2025-08-07 13:48:32 -07:00
Aleix Conchillo Flaqué
2b8218deaa examples(foundational): use TTSSpeakFrame instead of TTSService.say() 2025-08-07 13:48:32 -07:00
Aleix Conchillo Flaqué
11119430cd TTSService: deprecate say() method 2025-08-07 13:48:32 -07:00
kompfner
9ca79232c1 Merge pull request #2380 from pipecat-ai/pk/deprecate-llm-messages-frame
Deprecate `LLMMessagesFrame`, `LLMUserResponseAggregator`, and `LLMAssistantResponseAggregator`
2025-08-07 15:13:01 -04:00
Paul Kompfner
9ea06c33f7 Bump deprecation version of LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator (the deprecation slipped past the 0.0.78 release) 2025-08-07 14:56:50 -04:00
Paul Kompfner
30a1dd202e Move deprecation of LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator into the next release in the changelog 2025-08-07 14:55:11 -04:00
Paul Kompfner
809ab0b7b6 Improve printed deprecation warning 2025-08-07 14:45:35 -04:00
Paul Kompfner
2b5db9c562 Remove redundant deprecation warning in docstring 2025-08-07 14:45:35 -04:00
Paul Kompfner
b4a886b59f Remove redundant deprecation warning in docstring 2025-08-07 14:45:35 -04:00
Paul Kompfner
07eb00722b Fix langchain unit test 2025-08-07 14:45:35 -04:00
Paul Kompfner
96652b8fba Add new deprecations to changelog 2025-08-07 14:45:30 -04:00
Paul Kompfner
df1fcf0c68 Remove unused import 2025-08-07 14:43:37 -04:00
Paul Kompfner
711f740d9e Update UserResponseAggregator to avoid using the now-deprecated LLMUserResponseAggregator 2025-08-07 14:43:37 -04:00
Paul Kompfner
a0bda98c20 Update langchain to avoid using the now-deprecated LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator 2025-08-07 14:43:37 -04:00
Paul Kompfner
1c1bae35ab Mention deprecation in docstring for LLMMessagesFrame 2025-08-07 14:43:37 -04:00
Paul Kompfner
56c52c2cf2 Deprecate LLMUserResponseAggregator and LLMAssistantResponseAggregator, which depend on the now-deprecated LLMMessagesFrame. 2025-08-07 14:43:37 -04:00
Paul Kompfner
740aee1a1a Fix an issue in AnthropicLLMContext where we would never initialize turns_above_cache_threshold if we were upgrading from an OpenAILLMContext.
I noticed this when working on 22c-natural-conversation-mixed-llms.py
2025-08-07 14:43:37 -04:00
Paul Kompfner
f0391c3280 Progress on updating foundational examples to avoid using the newly-deprecated LLMMessagesFrame.
Skipping over 07b-interruptible-langchain.py for now, as it requires deeper changes involving `LLMUserResponseAggregator` and `LLMAssistantResponseAggregator`.
2025-08-07 14:43:37 -04:00
Paul Kompfner
64e48e4660 Deprecate LLMMessagesFrame.
The same functionality can be achieved using either:
- `LLMMessagesUpdateFrame` with the desired messages, with `run_llm` set to `True`
- `OpenAILLMContextFrame` with a new context initialized with the desired messages
2025-08-07 14:43:37 -04:00
Paul Kompfner
b8147bdbbd Add missing Deepgram key to env.example 2025-08-07 14:43:37 -04:00
Aleix Conchillo Flaqué
315e45d41b Merge pull request #2389 from pipecat-ai/aleix/pipecat-0.0.78
update CHANGELOG for 0.0.78
2025-08-07 11:34:27 -07:00
Aleix Conchillo Flaqué
c057139c48 update CHANGELOG for 0.0.78 2025-08-07 11:14:54 -07:00
Mark Backman
c61e07132d Merge pull request #2390 from pipecat-ai/mb/optionally-ignore-emulated-speech
feat: Add option to ignore emulated user speech while the bot is spea…
2025-08-07 11:14:46 -07:00
Mark Backman
a5f5e418a8 feat: Add option to ignore emulated user speech while the bot is speaking 2025-08-07 14:08:11 -04:00
Mark Backman
31acfaa091 Merge pull request #2388 from pipecat-ai/14v-adding-openai-stt-tts-llm-functioncalling
14v adding OpenAI stt tts llm functioncalling
2025-08-07 10:22:35 -07:00
Mark Backman
69541c8835 Linting fix, plus update eval suite with 14v and others, tiny fix for 14m, too 2025-08-07 13:20:45 -04:00
Varun Singh
af94620839 Add OpenAI function calling example with Pipecat
Introduces a new example script demonstrating how to use OpenAI's function calling capabilities within a Pipecat pipeline. The example integrates OpenAI STT, TTS, and LLM services, registers a weather function, and sets up a pipeline for real-time audio interaction over WebRTC.
2025-08-07 13:20:45 -04:00
Filipi da Silva Fuchter
cec8a74293 Merge pull request #2386 from pipecat-ai/filipi/parallel_pipeline
Only push the StartFrame when all parallel pipelines have processed it
2025-08-07 14:20:30 -03:00
Filipi Fuchter
228a55ac1e Only push the StartFrame when all parallel pipelines have processed it. 2025-08-07 14:18:21 -03:00
Vanessa Pyne
ab9831daf0 Merge pull request #2382 from pipecat-ai/vp-trace-ignore-message
log: warning -> trace for elevenlabs tts unavailable context
2025-08-07 09:35:57 -05:00
Vanessa Pyne
e8c3f5dea6 Update src/pipecat/services/elevenlabs/tts.py
Co-authored-by: Mark Backman <mark@daily.co>
2025-08-07 09:23:33 -05:00
Mark Backman
4288b5e780 Merge pull request #2381 from pipecat-ai/aleix/runner-args-pipeline-idle-timeout
allow specifying PipelineTask idle timeout to runner arguments
2025-08-07 04:47:08 -07:00
Mark Backman
23343dd7e7 Remove idle_timeout_secs from quickstart 2025-08-07 07:44:21 -04:00
Mark Backman
88de5dd415 Merge pull request #2383 from pipecat-ai/aleix/riva-stt-iterator-exception
properly handle concurrent.futures.CancelledError
2025-08-07 04:39:56 -07:00
Mark Backman
33f87589d1 Merge pull request #2384 from pipecat-ai/aleix/release-evals-soniox-inworld-asyncai
scripts(evals): added soniox, inworld and asyncai
2025-08-07 04:35:18 -07:00
Aleix Conchillo Flaqué
7ed14ad91f scripts(evals): added soniox, inworld and asyncai 2025-08-06 23:14:50 -07:00
Aleix Conchillo Flaqué
86c6141580 DailyTransport: handle future cancellation 2025-08-06 23:03:20 -07:00
Aleix Conchillo Flaqué
c97643c797 RivaSTTService: always use WatchdogQueue 2025-08-06 23:00:03 -07:00
Aleix Conchillo Flaqué
434d346079 RivaSTTService: handle future cancellation 2025-08-06 22:59:52 -07:00
vipyne
64ae8d2394 log: warning -> trace for elevenlabs tts unavailable context 2025-08-06 22:40:47 -05:00
Aleix Conchillo Flaqué
786f24c9db examples(foundational): use RunnerArgs.pipeline_idle_timeout_secs 2025-08-06 19:38:06 -07:00
Aleix Conchillo Flaqué
38951aab56 scripts(evals): use RunnerArguments.pipeline_idle_timeout_secs 2025-08-06 19:37:29 -07:00
Aleix Conchillo Flaqué
ed8b0655a8 scripts(evals): fix runner eval cancellation
We need to call asyncio.gather() just once, not for every cancelled task.
2025-08-06 19:36:42 -07:00
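A minimal sketch of the cancellation pattern described in the commit above: cancel every task first, then await them all with one `asyncio.gather()` call. It is an illustration only, not the eval runner's code; `worker`, `cancel_all`, and `main` are hypothetical names.

```python
import asyncio


async def worker(delay: float) -> None:
    # Stand-in for a long-running eval task (hypothetical).
    await asyncio.sleep(delay)


async def cancel_all(tasks: list[asyncio.Task]) -> list:
    # Request cancellation on every task first...
    for task in tasks:
        task.cancel()
    # ...then await them all with a single gather() call.
    # return_exceptions=True collects CancelledError instead of raising it.
    return await asyncio.gather(*tasks, return_exceptions=True)


async def main() -> list:
    tasks = [asyncio.create_task(worker(60)) for _ in range(3)]
    await asyncio.sleep(0)  # give the tasks a chance to start
    return await cancel_all(tasks)


results = asyncio.run(main())
```

Each entry in `results` is the `CancelledError` of one task, so a caller can check that all of them were cancelled.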
Aleix Conchillo Flaqué
0b2b9f5f1b RunnerArguments: add pipeline_idle_timeout_secs 2025-08-06 19:35:40 -07:00
Filipi da Silva Fuchter
ad1841b739 Merge pull request #2377 from pipecat-ai/filipi/fast_api_freeze_issue
Fixed an issue in BaseOutputTransport where the loop could consume all CPU.
2025-08-06 14:58:36 -03:00
Mark Backman
b0c002c128 Merge pull request #2378 from pipecat-ai/mb/pyproject-compat-updates
Add new python-compatibility workflow to check for dependency compatib…
2025-08-06 10:40:29 -07:00
Mark Backman
820176084c Add support for 3.13 by bumping min version for vllm to 0.9.0, adding support for torch and torchaudio up to the next major version 2025-08-06 13:36:01 -04:00
Mark Backman
5b7e31beff README updates for python versions 2025-08-06 13:36:01 -04:00
Mark Backman
41a22d3bf4 Add new python-compatibility workflow to check for dependency compatibility across supported python versions 2025-08-06 13:36:01 -04:00
Filipi Fuchter
84fecabac5 Removing audio sleep from FastAPI and WebSocket server when they are not connected. 2025-08-06 14:02:51 -03:00
Filipi Fuchter
bbe01d10ef Fixed an issue in BaseOutputTransport where the loop could consume all CPU. 2025-08-06 12:42:58 -03:00
Mark Backman
4364990fd0 Merge pull request #2375 from fabrice404/gladia-region-selection
Gladia region selection
2025-08-06 07:01:24 -07:00
Fabrice Lamant
e576fa481f Add new region feature for GladiaSTTService in CHANGELOG 2025-08-06 15:31:10 +02:00
Mark Backman
ac6b59cae2 Merge pull request #2372 from pipecat-ai/mb/dotenv-dev
Wider package support for python-dotenv dev dep
2025-08-06 06:06:01 -07:00
Mark Backman
12e168e740 Wider package support for python-dotenv dev dep 2025-08-06 09:04:01 -04:00
Mark Backman
ac354f66ed Merge pull request #2371 from pipecat-ai/mb/docs-gen-with-uv
Update docs auto-generation to use uv
2025-08-06 06:02:52 -07:00
Mark Backman
eead793927 Merge pull request #2370 from pipecat-ai/mb/update-workflows-for-uv
Update workflows for uv
2025-08-06 05:54:55 -07:00
Fabrice Lamant
0594a203fc Add new region parameter to Gladia 2025-08-06 14:28:06 +02:00
Mark Backman
2337a2d92d Remove dev-requirements.txt and mentions of it 2025-08-05 21:46:50 -04:00
Mark Backman
b3e2603553 Update workflows for uv 2025-08-05 21:45:48 -04:00
Mark Backman
29229df719 Speed up builds, mocking large packages 2025-08-05 21:34:40 -04:00
Aleix Conchillo Flaqué
61f4dd2ff2 scripts(evals): fix 14e-function-calling-google 2025-08-05 17:44:45 -07:00
Mark Backman
42094fb206 Update docs auto-generation to use uv 2025-08-05 20:37:27 -04:00
Aleix Conchillo Flaqué
58c41f112a DailyRunnerArguments: make body optional (fix) 2025-08-05 16:59:36 -07:00
Aleix Conchillo Flaqué
fa55e2ca9b Merge pull request #2369 from pipecat-ai/aleix/pipeline-task-cancellation-fix
PipelineTask: always try to cancel things
2025-08-05 16:56:23 -07:00
Aleix Conchillo Flaqué
313fdc92a1 DailyRunnerArguments: make body optional 2025-08-05 16:39:18 -07:00
Aleix Conchillo Flaqué
d22d2da03d PipelineTask: always try to cancel things
In a previous commit we only cleaned things up if the user ran
`task.cancel()`. However, if the task finished cleanly we were not cancelling
anything.
2025-08-05 16:24:59 -07:00
Aleix Conchillo Flaqué
de2ae9a2ec Merge pull request #2368 from pipecat-ai/aleix/release-evals-runner-args-fix
pass runner arguments to release evals
2025-08-05 16:23:32 -07:00
Aleix Conchillo Flaqué
52a6d8013c scripts(evals): pass runner arguments to run_bot() 2025-08-05 16:13:32 -07:00
Aleix Conchillo Flaqué
f14cbae9b5 DailyRunnerArguments: make token optional
DailyTransport can get a None token value.
2025-08-05 15:46:12 -07:00
Aleix Conchillo Flaqué
8fe906438a Merge pull request #2358 from pipecat-ai/aleix/system-frames-queued
system frames are now queued
2025-08-05 15:09:52 -07:00
Mark Backman
d8f4db8827 Merge pull request #2367 from richtermb/richtermb/fix-errorframe-docstring
Rename 'source' parameter to 'processor' in ErrorFrame class document…
2025-08-05 15:09:18 -07:00
Aleix Conchillo Flaqué
a5ea6e1642 FrameProcessor: system frames are now queued
System frames are now queued. Before, system frames could be generated from any
task and would not guarantee any order which was causing undesired
behavior. Also, it was possible to get into some rare recursion issues because
of the way system frames were executed (they were executed in-place, meaning
calling `push_frame()` would finish after the system frame traversed all the
pipeline). This makes system frames more deterministic.
2025-08-05 15:05:50 -07:00
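The difference between in-place and queued handling can be sketched with a toy processor. This is not Pipecat's `FrameProcessor`; the class `QueuedProcessor`, the string frames, and the `"STOP"` sentinel are assumptions made for the illustration. The point is that `push_frame()` returns once the frame is enqueued, and a single consumer handles frames in arrival order.

```python
import asyncio


class QueuedProcessor:
    """Toy processor (hypothetical): frames pass through a queue."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self.handled: list[str] = []

    async def push_frame(self, frame: str) -> None:
        # Returns as soon as the frame is enqueued; it does not wait
        # for the frame to be handled.
        await self._queue.put(frame)

    async def run(self) -> None:
        # A single consumer handles frames strictly in arrival order.
        while True:
            frame = await self._queue.get()
            if frame == "STOP":
                break
            self.handled.append(frame)


async def main() -> list[str]:
    proc = QueuedProcessor()
    consumer = asyncio.create_task(proc.run())
    # Frames pushed from separate tasks are serialized by the queue.
    await asyncio.gather(proc.push_frame("start"), proc.push_frame("cancel"))
    await proc.push_frame("STOP")
    await consumer
    return proc.handled


handled = asyncio.run(main())
```

Because nothing runs in-place inside `push_frame()`, a handler that pushes another frame cannot recurse into itself.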
richtermb
e777e78510 Rename 'source' parameter to 'processor' in ErrorFrame class documentation for clarity. 2025-08-05 15:02:00 -07:00
Aleix Conchillo Flaqué
49a5a1e375 PipelineTask: improve task cancellation 2025-08-05 14:49:23 -07:00
Aleix Conchillo Flaqué
61cb45d61b PipelineTask: also wait on CancelFrame
Previously, CancelFrames didn't need to be waited for because system frames were
processed in-place and therefore calling push_frame() would finalize after it
traversed all the pipeline. Now, system frames are queued so we need to wait
until CancelFrame reaches the end of the pipeline.
2025-08-05 14:49:23 -07:00
Aleix Conchillo Flaqué
6c6deb4e85 Merge pull request #2366 from pipecat-ai/aleix/run-bot-runner-arguments
add sigint/sigterm to RunnerArguments
2025-08-05 14:46:19 -07:00
Aleix Conchillo Flaqué
66ad29b2b1 example: pass RunnerArguments to run_bot()
This lets us get handle_sigint from RunnerArguments which knows where the
application is running and if SIGINT/SIGTERM should be handled or not.
2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
21e4f0d56d PipelineRunner: argument ordering 2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
627b44bac2 runner: use new RunnerArguments handle_sigint/handle_sigterm
This allows us to control application behavior from the runner arguments, which
depend on the environment they run in.
2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
e2a576beca RunnerArguments: add handle_sigint/handle_sigterm 2025-08-05 14:32:28 -07:00
Mark Backman
2981afb117 Merge pull request #2361 from pipecat-ai/mb/fix-changelog-simli
Fix Simli changelog entry placement
2025-08-05 14:12:38 -07:00
Mark Backman
d422c57b52 Merge pull request #2304 from pipecat-ai/mb/cartesia-cjk-lang-support
CartesiaTTSService: Add CJK lang support for word timestamps
2025-08-05 14:08:53 -07:00
Mark Backman
06d8bbd154 Fix Simli changelog entry placement 2025-08-05 17:07:58 -04:00
Mark Backman
35108afeb8 Merge pull request #2360 from pipecat-ai/mb/add-heygen-readme
Add HeyGen to the README page
2025-08-05 14:05:33 -07:00
Mark Backman
a0e2a2754a Merge pull request #2327 from richtermb/richtermb/push-more-error-frames
Add source parameter to ErrorFrame and set it in FrameProcessor. Upda…
2025-08-05 14:04:52 -07:00
Mark Backman
b8d620c8bb Merge pull request #2362 from pipecat-ai/mb/aws-stt-languages
AWSTranscribeSTTService add support for new languages
2025-08-05 14:00:50 -07:00
Mark Backman
f26bbe4092 Merge pull request #2363 from pipecat-ai/mb/update-14p
Update 14p, add 14p to evals, add Google creds to env.example
2025-08-05 14:00:13 -07:00
Mark Backman
52cb23f8d5 Merge pull request #2364 from pipecat-ai/mb/11labs-default-model
ElevenLabs TTS services: revert to Turbo v2.5 as default model
2025-08-05 13:59:59 -07:00
Filipi da Silva Fuchter
17e7f8a2cd Merge pull request #2352 from pipecat-ai/filipi/webrtc_audio_frame
Determine whether the bot is speaking based on the SpeechOutputAudioRawFrame
2025-08-05 17:26:44 -03:00
richtermb
efddc4732c Refactor ErrorFrame: rename source field to processor for clarity and update related references in FrameProcessor. 2025-08-05 13:25:08 -07:00
richtermb
4476a76ad7 Merge branch 'main' into richtermb/push-more-error-frames 2025-08-05 13:23:24 -07:00
Filipi Fuchter
64592b274b Fixed an issue where BotStartedSpeakingFrame and BotStoppedSpeakingFrame
were not emitted when using `TavusVideoService` or `HeyGenVideoService`.
2025-08-05 17:11:34 -03:00
Aleix Conchillo Flaqué
95c661bdaa Merge pull request #2365 from pipecat-ai/aleix/update-release-evals-for-new-runner
scripts(evals): update to use new runner function
2025-08-05 13:07:57 -07:00
Aleix Conchillo Flaqué
5546c8e01c scripts(evals): update to use new runner function 2025-08-05 11:46:28 -07:00
Mark Backman
14e02c1b08 ElevenLabs TTS services: revert to Turbo v2.5 as default model 2025-08-05 13:44:37 -04:00
Mark Backman
ba5a5c7187 Update 14p, add 14p to evals, add Google creds to env.example 2025-08-05 13:30:36 -04:00
Mark Backman
2378cba155 AWSTranscribeSTTService add support for new languages 2025-08-05 13:01:06 -04:00
Mark Backman
1138c92a00 Merge pull request #2217 from simliai/main
feat: Add Simli Trinity models support to pipecat
2025-08-05 09:01:20 -07:00
Antonyesk601
fb82dc8308 Update CHANGELOG.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-08-05 17:46:01 +02:00
Mark Backman
c8a15f30fa Add HeyGen to the README page 2025-08-05 10:54:49 -04:00
antonyesk601
72168070f1 update changelog 2025-08-05 14:18:41 +00:00
Mark Backman
50083d1144 Merge pull request #2342 from pipecat-ai/mb/runner-connect-request-body
Development runner handles body information in the RTVI connect request
2025-08-05 05:15:55 -07:00
Mark Backman
64732518c6 Development runner handles body information in the RTVI connect request 2025-08-05 07:26:34 -04:00
Mark Backman
c3d8ea210f CartesiaTTSService: Add CJK lang support for word timestamps 2025-08-05 07:17:40 -04:00
Filipi da Silva Fuchter
98ed614f63 Merge pull request #2357 from pipecat-ai/filipi/latency_observer
Added detailed latency logging to UserBotLatencyLogObserver.
2025-08-05 08:11:48 -03:00
Filipi Fuchter
e43bdff31e Added detailed latency logging to UserBotLatencyLogObserver. 2025-08-04 19:36:30 -03:00
Mark Backman
42e48381fe Merge pull request #2355 from pipecat-ai/mb/update-readme-for-uv
Update the README with uv-centric steps
2025-08-04 15:28:07 -07:00
Mark Backman
df7ba64b4a Merge pull request #2354 from pipecat-ai/mb/revert-43-inline-script
Remove inline script from foundational 43a
2025-08-04 15:27:28 -07:00
Mark Backman
ac9b2e67a7 Merge pull request #2349 from pipecat-ai/mb/runner-support-daily-url-arg
daily runner util: remove arg parsing, add auto room, token generation
2025-08-04 13:44:25 -07:00
Mark Backman
c9918607cf Merge pull request #2335 from pipecat-ai/mb/quickstart-runner-improvements
Improve quickstart logging, runner startup message
2025-08-04 13:43:42 -07:00
Mark Backman
cfda410a43 Remove foundational requirements.txt file 2025-08-04 16:38:37 -04:00
Mark Backman
c773ddf83d Update foundational examples README 2025-08-04 16:26:11 -04:00
Mark Backman
54d5ebbc20 Update the README with uv-centric steps 2025-08-04 16:11:38 -04:00
Mark Backman
35002cd727 Remove inline script from foundational 43a 2025-08-04 15:46:18 -04:00
Mark Backman
53d75faa47 Merge pull request #2330 from pipecat-ai/mb/runner-clean-proxy-name
Runner: strip protocol from proxy address
2025-08-04 10:42:16 -07:00
Mark Backman
2901dddc2b Merge pull request #2338 from pipecat-ai/mb/update-release-evals-tavus
Add Tavus, HeyGen, Simli to release-evals
2025-08-04 10:38:27 -07:00
Mark Backman
3a8d809837 Runner: strip protocol from proxy address 2025-08-04 13:38:02 -04:00
Mark Backman
1b3c2bee30 Merge pull request #2331 from pipecat-ai/mb/more-foundational
Updating more foundational examples
2025-08-04 10:37:15 -07:00
Mark Backman
69f049cb63 Merge pull request #2328 from pipecat-ai/mb/04b-example-cleanup
Align 04b livekit example with other foundational examples
2025-08-04 10:36:57 -07:00
Vanessa Pyne
96b1000e52 Merge pull request #2341 from getchannel/realtime-text
Hotfix: Correct Gemini Live API class to fix 1007 payload error.
2025-08-04 11:03:57 -05:00
Filipi da Silva Fuchter
0184a8c231 Merge pull request #2351 from pipecat-ai/filipi/tavus_transport_ready
Changed `TavusVideoService` to send audio or video frames only after the transport is ready, preventing warning messages at startup.
2025-08-04 11:48:27 -03:00
Filipi Fuchter
c22866ed58 Mentioning the TavusVideoService fix in the changelog. 2025-08-04 11:46:24 -03:00
Filipi Fuchter
0e533d21be Only send audio and video from the tavus video service if the transport is ready. 2025-08-04 10:52:30 -03:00
Mark Backman
6f6f4c3dea Merge pull request #2348 from sam-s10s/speechmatics-stt
Fix for Speechmatics STT
2025-08-04 06:15:39 -07:00
Mark Backman
f609971637 daily runner util: remove arg parsing, add auto room, token generation 2025-08-03 21:50:44 -04:00
Mark Backman
54ff10ae86 Merge pull request #2332 from hankehly/fix-piper-tts-json-payload
Fix PiperTTSService to send TTS input as JSON object
2025-08-03 17:39:04 -07:00
hankehly
77057eb829 Fix ruff formatting 2025-08-04 08:13:16 +09:00
Mark Backman
2b1a7b840d Merge pull request #2346 from adenta/heygen-testing
me trying to get heygen working
2025-08-03 14:11:14 -07:00
Sam Sykes
e07db88bc0 Updated changelog. 2025-08-03 22:11:10 +01:00
Sam Sykes
c2282b0e73 Set user_id to "" (not None) for RTVIProcessor. 2025-08-03 22:08:22 +01:00
Andrew Denta
593bf09d8d update script 2025-08-03 17:01:27 -04:00
Sam Sykes
534ed77ebf Merge branch 'main' into speechmatics-stt 2025-08-03 21:51:35 +01:00
Andrew Denta
193299988d me trying to get heygen working 2025-08-03 13:42:07 -04:00
Aleix Conchillo Flaqué
d589bcb345 Merge pull request #2344 from pipecat-ai/aleix/daily-python-0.19.6
pyproject: update daily-python to 0.19.6
2025-08-03 10:15:15 -07:00
Aleix Conchillo Flaqué
011ebc2801 Merge pull request #2345 from pipecat-ai/aleix/task-observer-signature-performance
TaskObserver: don't inspect on_push_frame signature for every frame
2025-08-03 10:15:00 -07:00
Aleix Conchillo Flaqué
3a72e94d0c LLMService: only do handle function inspection once 2025-08-03 10:09:19 -07:00
Aleix Conchillo Flaqué
d6d39fc873 TaskObserver: don't inspect on_push_frame signature for every frame 2025-08-03 10:09:17 -07:00
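As an illustration of the optimization named in the two commits above, the sketch below inspects a callback's signature once at setup and reuses the cached result for every frame. It is not the `TaskObserver` or `LLMService` code; `Observer`, `Dispatcher`, and the one-parameter callback are hypothetical.

```python
import inspect


class Observer:
    # Hypothetical observer with a one-argument callback.
    def on_push_frame(self, data):
        return data


class Dispatcher:
    def __init__(self, observer) -> None:
        self._observer = observer
        # Inspect the callback signature once, at setup time...
        params = inspect.signature(observer.on_push_frame).parameters
        self._param_count = len(params)

    def dispatch(self, frame):
        # ...and reuse the cached count for every frame.
        if self._param_count == 1:
            return self._observer.on_push_frame(frame)
        raise TypeError("unsupported on_push_frame signature")


dispatcher = Dispatcher(Observer())
results = [dispatcher.dispatch(i) for i in range(3)]
```

`inspect.signature()` is comparatively expensive, so calling it per frame adds avoidable overhead on a hot path.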
Pete
258e83c904 Fix: Correct Gemini Live API text input to prevent 1007 WebSocket errors
- Restore TextInputMessage.realtimeInput structure for correct API format
- Remove invalid turnComplete message from _send_user_text method
- turnComplete is only valid for clientContent, not realtimeInput messages
- realtimeInput text completion is automatically inferred by the API

This fixes WebSocket 1007 errors caused by mixing realtimeInput and
clientContent message types in violation of the Gemini Live API contract.
2025-08-03 10:58:59 -04:00
Mark Backman
061f2086b2 Merge pull request #2343 from pipecat-ai/mb/update-pre-commit-ruff-version
Update pre-commit-config ruff version
2025-08-03 03:38:54 -07:00
Aleix Conchillo Flaqué
a1f3f51168 pyproject: update daily-python to 0.19.6 2025-08-02 20:02:22 -07:00
hankehly
2177a2b805 Remove trailing space 2025-08-03 10:34:27 +09:00
hankehly
68164415ce Format changelog entry 2025-08-03 10:26:37 +09:00
hankehly
7646599b66 Add changelog entry 2025-08-03 10:23:58 +09:00
Mark Backman
e467eaf130 Merge pull request #2334 from Designedforusers/fix/tavus-transport-daily-callbacks
fix: Add missing transcription callbacks to TavusTransport for 0.0.77 compatibility
2025-08-02 16:09:57 -07:00
Designedforusers
9d6d53629e style: Apply ruff formatting to fix line length
Fixed a line length issue in tavus.py where the on_transcription_stopped callback was exceeding the maximum line length. Split the partial() call across multiple lines for better readability and compliance with project style guidelines.
2025-08-02 18:27:09 -04:00
Mark Backman
89596cfec4 Update pre-commit-config ruff version 2025-08-02 18:06:06 -04:00
Designedforusers
5e338ecaf1 refactor: Remove redundant transcription callback methods
As suggested in PR review, removed the _on_transcription_stopped and
_on_transcription_error method definitions. Now using the consistent
partial(self._on_handle_callback, ...) pattern for these callbacks,
matching how all other callbacks are handled.

This simplifies the code while maintaining the same functionality.
2025-08-02 15:02:54 -04:00
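The `partial(self._on_handle_callback, ...)` pattern mentioned in the commit above can be sketched as follows. The class `Client`, the argument layout of `_on_handle_callback`, and the recorded `events` list are assumptions for the illustration; the real TavusTransportClient may differ.

```python
from functools import partial


class Client:
    """Toy client (hypothetical) that routes all callbacks to one handler."""

    def __init__(self) -> None:
        self.events: list[tuple] = []
        # Bind the event name up front instead of defining one method per
        # callback; extra arguments are supplied when the callback fires.
        self.on_transcription_stopped = partial(
            self._on_handle_callback, "on_transcription_stopped"
        )
        self.on_transcription_error = partial(
            self._on_handle_callback, "on_transcription_error"
        )

    def _on_handle_callback(self, event_name: str, *args) -> None:
        # A single generic handler records the event name and arguments.
        self.events.append((event_name, args))


client = Client()
client.on_transcription_stopped("user-1", False)
client.on_transcription_error("timeout")
```

After these two calls, `client.events` holds one tuple per callback, each tagged with the name that was bound by `partial`.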
Designedforusers
62319021f8 docs: Add changelog entry for TavusVideoService fix
Added changelog entry as requested by maintainers for the fix addressing
missing transcription callbacks in TavusVideoService.
2025-08-02 14:53:44 -04:00
Pete
cccd82a617 Refactor TextInputMessage class to replace realtimeInput with a text attribute.
This was sending a 1007 because it was wrapping RealtimeInput in the json.

- Updated the `TextInputMessage` class to directly store text input as a string.
- Modified the `from_text` class method to create an instance using the new `text` attribute.
2025-08-02 14:34:00 -04:00
Mark Backman
f552ba1f5e Merge pull request #2336 from pipecat-ai/mb/suppress-pydub-warning
Suppress pydub (cartesia dependency) SyntaxWarning
2025-08-02 10:01:05 -07:00
Mark Backman
b9a2a9b729 Add Tavus, HeyGen, Simli to release-evals 2025-08-02 09:35:06 -04:00
Mark Backman
e43b3869c3 Suppress pydub SyntaxWarning from the cartesia module 2025-08-02 08:49:59 -04:00
Mark Backman
55731df999 Improve quickstart logging, runner startup message 2025-08-02 08:40:05 -04:00
Designedforusers
3a7ea25077 fix: Add missing transcription callbacks to TavusTransport for 0.0.77 compatibility
TavusTransport was broken in Pipecat 0.0.77 due to PR #2292 adding required
callbacks (on_transcription_stopped, on_transcription_error) to DailyCallbacks.

This fix adds placeholder implementations of these callbacks to TavusTransportClient,
allowing TavusTransport to initialize properly. These callbacks are not used by
Tavus (which handles avatar video, not transcription) but are required by the
DailyCallbacks validation.

Fixes initialization error:
- 2 validation errors for DailyCallbacks
- on_transcription_stopped: Field required
- on_transcription_error: Field required
2025-08-02 05:56:46 -04:00
hankehly
694922f627 Fix PiperTTSService to send TTS input as JSON object 2025-08-02 15:29:16 +09:00
Mark Backman
cc9950e72d Updating more foundational examples 2025-08-01 19:58:40 -04:00
richtermb
6814c390ba Update CHANGELOG to reflect the addition of the source field in ErrorFrame for improved error tracking. 2025-08-01 14:47:57 -07:00
Richter Brzeski
c2d05ad23b Merge branch 'pipecat-ai:main' into richtermb/push-more-error-frames 2025-08-01 14:47:08 -07:00
Mark Backman
ee56d8572d Merge pull request #2329 from pipecat-ai/mb/fix-livekit-empty-audio-frames
fix: LiveKitTransport, don't push empty AudioRawFrames
2025-08-01 12:53:05 -07:00
richtermb
91568eeddc Update type hint for source in ErrorFrame to use forward declaration for improved clarity. 2025-08-01 12:52:56 -07:00
richtermb
165d6b4c1d Update CHANGELOG to include new source field in ErrorFrame for error tracking. 2025-08-01 12:25:29 -07:00
Mark Backman
1d8abe3c1c fix: LiveKitTransport, don't push empty AudioRawFrames 2025-08-01 14:57:53 -04:00
Mark Backman
a6e69d6aad Merge pull request #2325 from pipecat-ai/mb/dependency-groups
Move dev to [dependency-groups], update uv.lock
2025-08-01 11:54:21 -07:00
Mark Backman
519da9cc61 Align 04b livekit example with other foundational examples 2025-08-01 14:28:15 -04:00
richtermb
ead4e97ab5 Add source parameter to ErrorFrame and set it in FrameProcessor. Updated error handling in AnthropicLLMService and DeepgramSTTService to include ErrorFrame with source information. 2025-08-01 11:14:50 -07:00
Mark Backman
0c021378b0 Merge pull request #2326 from pipecat-ai/readme-quickstart-link
Update README.md
2025-08-01 10:45:30 -07:00
Mark Backman
e22c7e8ad5 Update README.md 2025-08-01 13:40:03 -04:00
Mark Backman
b71057bf7c Move dev to [dependency-groups], update uv.lock 2025-08-01 09:43:56 -04:00
Mark Backman
0865f6cd7d Merge pull request #2318 from pipecat-ai/mb/add-asyncio-readme
Add AsyncAI TTS to README vendor list
2025-08-01 06:11:13 -07:00
Mark Backman
610b1ab065 Merge pull request #2319 from pipecat-ai/mb/use-new-runner
Update foundational examples to use new runner
2025-08-01 06:11:03 -07:00
Mark Backman
3a2a226668 Merge pull request #2320 from pipecat-ai/mb/uv-lock-init
Add initial uv.lock file
2025-08-01 06:07:53 -07:00
Mark Backman
8e4b7352fd Merge pull request #2321 from pipecat-ai/mb/dev-requirements
Add dev to optional-dependencies
2025-08-01 06:02:58 -07:00
Mark Backman
637d372fe4 Add dev to optional-dependencies 2025-07-31 23:39:23 -04:00
Mark Backman
ac15fe8ae4 Add workflow to update lockfile with pyproject.toml changes 2025-07-31 23:08:21 -04:00
Mark Backman
07239c0b8b Add initial uv.lock file 2025-07-31 22:46:44 -04:00
Mark Backman
367b2fbe3c Update requirements.txt 2025-07-31 22:12:57 -04:00
Mark Backman
f1b1d5b130 Update foundational examples to use the development runner 2025-07-31 22:11:32 -04:00
Mark Backman
ff45b77fdf Remove examples runner 2025-07-31 21:22:04 -04:00
Mark Backman
e522b7ae96 Add AsyncAI TTS to README vendor list 2025-07-31 19:33:37 -04:00
Mark Backman
b8eef4f93b Merge pull request #2314 from pipecat-ai/mb/sync-quickstart-example
Add workflow to sync quickstart to pipecat-quickstart repo
2025-07-31 15:34:26 -07:00
Mark Backman
dcc205996a Merge pull request #2317 from pipecat-ai/mb/release-prep-0.0.77
Changelog update for 0.0.77
2025-07-31 15:34:02 -07:00
Mark Backman
9f61af4d1b Changelog update for 0.0.77 2025-07-31 18:19:05 -04:00
Sam Sykes
e8faf28e6a Doc fix for incorrect argument name. 2025-07-31 22:30:54 +01:00
Filipi da Silva Fuchter
40d53b3d84 Merge pull request #2316 from sam-s10s/speechmatics-stt
Updated to SpeechmaticsSTTService
2025-07-31 18:28:16 -03:00
Sam Sykes
7c223a86c2 Fix to missing deprecated attribute enable_speaker_diarization. 2025-07-31 22:25:46 +01:00
Sam Sykes
2d3f61aa07 Updated Speechmatics Plugin (#2225)
Changes
Split out module attributes to make engine settings clearer
Removed internal audio buffer to use latest Speechmatics python SDK (0.4.0)
Use diarization for improved VAD in multi-speaker situations
Support custom dictionary / vocabulary with attributes
Deprecated attributes superseded by re-organised attributes

Diarization Enhancements
Focus on specific speakers (using speaker labels)
Ignore specific speakers (using speaker labels)
Separate transcription formats for active and inactive speakers
Support for known speakers
2025-07-31 17:51:38 -03:00
Mark Backman
e05a47744d Merge pull request #2311 from pipecat-ai/mb/quickstart-fixups
Set quickstart min pipecat-ai version to 0.0.77, remove non-quickstart examples
2025-07-31 13:42:10 -07:00
Aleix Conchillo Flaqué
6ffaab2b93 CHANGELOG cosmetics 2025-07-31 13:39:37 -07:00
Aleix Conchillo Flaqué
c2d8844903 Merge pull request #2312 from pipecat-ai/aleix/srhinos/main
Enable Interruption Support for LLMUserResponseAggregator
2025-07-31 13:30:57 -07:00
Mark Backman
e8caba7723 Add workflow to sync quickstart to pipecat-quickstart repo 2025-07-31 15:56:18 -04:00
Mark Backman
df96ef7d37 Remove non-quickstart demos 2025-07-31 15:38:54 -04:00
Aleix Conchillo Flaqué
7553f670af fix formatting and update CHANGELOG 2025-07-31 10:41:11 -07:00
Mark Backman
6960f5861b Set example min pipecat-ai version to 0.0.77 due to runner requirement 2025-07-31 12:21:58 -04:00
Mark Backman
b5edbbc0ca Merge pull request #2309 from pipecat-ai/mb/remove-runner-examples
Remove examples/runner-examples
2025-07-31 09:18:22 -07:00
Aleix Conchillo Flaqué
e78d9c2c95 Merge pull request #2293 from azain47/azain47/fix-piper-tts-service
Fix Piper TTS Service
2025-07-31 08:32:17 -07:00
Vanessa Pyne
b25547a98b Merge pull request #2305 from pipecat-ai/vp-changelog-text-input
update changelog
2025-07-31 10:16:47 -05:00
Mark Backman
e80281c3c4 Remove examples/runner-examples 2025-07-31 10:59:06 -04:00
Mark Backman
d692843e5b Merge pull request #2308 from pipecat-ai/mb/change-neuphonic-url
NeuphonicTTSService: change the default url value to the global endpoint
2025-07-31 07:38:57 -07:00
Mark Backman
eaad3c5d55 NeuphonicTTSService: change the default url value to the global endpoint 2025-07-31 10:24:54 -04:00
vipyne
f2a1c66379 update changelog 2025-07-31 08:55:25 -05:00
Vanessa Pyne
af8de227bb Merge pull request #2223 from getchannel/realtime-text
Add text input handling to unify context for realtimeInput stream of GeminiMultimodalLiveService
2025-07-31 08:53:39 -05:00
Mark Backman
7cd78dd286 Merge pull request #2303 from pipecat-ai/mb/add-new-quickstart-demos
Add quickstart demos
2025-07-31 05:56:00 -07:00
Mark Backman
226b516948 Add quickstart demos 2025-07-30 22:14:10 -04:00
Mark Backman
aa85fffa57 New runner module (#2269)
* Adds pipecat.runner.run - FastAPI-based development server with automatic bot discovery

* Adds new RunnerArguments types for different transports

* Adds new runner utils for creating transports and parsing data

* Adds new Daily and LiveKit utils for setup
2025-07-30 22:02:28 -04:00
srhinos
8b97ab70ff Enable Interruption Support for LLMUserResponseAggregator 2025-07-30 20:48:31 -04:00
Filipi da Silva Fuchter
9013b2929a Merge pull request #2300 from pipecat-ai/filipi/fast_api_race_condition
Fixed a race condition in FastAPIWebsocketClient
2025-07-30 18:10:09 -03:00
Filipi Fuchter
0c6e12a9b0 Fixed a race condition in FastAPIWebsocketClient that occurred when attempting to send a message while the client was disconnecting. 2025-07-30 18:07:40 -03:00
Aleix Conchillo Flaqué
efb24071d5 Merge pull request #2301 from pipecat-ai/aleix/daily-python-0.19.5
pyproject: update daily-python to 0.19.5
2025-07-30 14:01:27 -07:00
Filipi da Silva Fuchter
318ebec67e Merge pull request #2298 from pipecat-ai/filipi/google_interruptions
Fixed an issue in GoogleLLMService where interruptions did not work when an interruption strategy was used.
2025-07-30 17:49:07 -03:00
Aleix Conchillo Flaqué
c679227aa8 pyproject: update daily-python to 0.19.5 2025-07-30 13:19:48 -07:00
Filipi Fuchter
392853f5fa Fixed an issue in GoogleLLMService where interruptions did not work when an interruption strategy was used. 2025-07-30 12:10:32 -03:00
Mark Backman
98d27caab3 Merge pull request #2296 from pipecat-ai/mb/switch-rime-voices
Added the ability to switch voices to RimeTTSService
2025-07-30 07:55:52 -07:00
Mark Backman
0fa51968bf Added the ability to switch voices to RimeTTSService 2025-07-30 10:53:14 -04:00
Mark Backman
92aee2634b Merge pull request #2291 from pipecat-ai/mb/remove-on-client-closed 2025-07-30 06:36:32 -07:00
Filipi da Silva Fuchter
bff6a93f31 Merge pull request #2150 from pipecat-ai/filipi/hey_gen
HeyGen implementation for Pipecat - HeyGenVideoService
2025-07-30 09:10:07 -03:00
Filipi Fuchter
6e921cdf45 HeyGen implementation for Pipecat - HeyGenVideoService 2025-07-30 09:07:15 -03:00
Azain.
1e2b066cf3 Fix tts.py
Update Piper TTS Service to work with the newer Piper GPL version, which uses JSON as its payload.
2025-07-30 13:41:27 +05:30
Pete
2af3b6329d Ruff format debug 2025-07-29 17:48:11 -04:00
Pete
8ca06e5887 Add InputTextRawFrame class for handling raw text input in frames
- Introduced `InputTextRawFrame` to represent raw text input from users or programs.
- Updated `GeminiMultimodalLiveLLMService` to process `InputTextRawFrame` and send user text via the Gemini Live API's realtime input stream.
- Enhanced `_send_user_text` method documentation for clarity on its functionality and usage.
2025-07-29 17:43:14 -04:00
Mark Backman
c145a9ef13 Merge pull request #2288 from pipecat-ai/mb/stt-mute-examples
Update placement of STTMuteFilter in examples to reflect the new reco…
2025-07-29 12:10:36 -07:00
Mark Backman
b523f9a4c6 Merge pull request #2248 from ashotbagh/feat/async-tts
feat(tts): integrate Async TTS engine into pipecat
2025-07-29 12:10:11 -07:00
Mark Backman
7f184422d0 Merge branch 'main' into feat/async-tts 2025-07-29 12:06:56 -07:00
Aleix Conchillo Flaqué
fa4c3ec6bf Merge pull request #2287 from pipecat-ai/aleix/asyncio-trace-logging
utils(asyncio): use trace logging for some cancelling messages
2025-07-29 11:56:25 -07:00
Aleix Conchillo Flaqué
9fafc10844 Merge pull request #2292 from richtermb/transcription-error-callback
Add on_transcription_error callback to DailyCallbacks and handle tran…
2025-07-29 11:56:02 -07:00
richtermb
67107d02ed Refactor callback invocation for on_transcription_stopped in DailyTransportClient for improved readability 2025-07-29 11:53:41 -07:00
richtermb
c1df19982c Add on_transcription_stopped callback to DailyTransport for handling transcription stop events 2025-07-29 11:50:16 -07:00
richtermb
444b1b5b02 Add on_transcription_stopped callback to DailyCallbacks and implement handling in DailyTransport for transcription stop events 2025-07-29 11:49:28 -07:00
Mark Backman
ebfa4f2d5e Push the STTMuteFrame upstream and downstream 2025-07-29 14:37:36 -04:00
Mark Backman
e961c438e7 Update placement of STTMuteFilter in examples to reflect the new recommendation 2025-07-29 14:36:39 -04:00
richtermb
d3d36a89e2 Add _on_transcription_error method to DailyTransport for handling transcription error events 2025-07-29 10:48:50 -07:00
richtermb
fa6e5ce4a7 Add on_transcription_error callback to DailyCallbacks and handle transcription errors in DailyTransportClient 2025-07-29 10:43:18 -07:00
Mark Backman
3ffb261864 Remove use of on_client_closed event in foundational examples 2025-07-29 13:28:33 -04:00
Mark Backman
f69a02b7a7 Merge pull request #2290 from pipecat-ai/mb/remove-examples
Removed most pipecat examples, relocating to pipecat-examples repo
2025-07-29 10:02:24 -07:00
Mark Backman
f1f4aed398 Remove random Dockerfile, update README links 2025-07-29 11:41:27 -04:00
Mark Backman
414c245c92 Remove android.yaml github workflow 2025-07-29 11:34:48 -04:00
Mark Backman
3f57d94c0b Update examples README 2025-07-29 11:28:40 -04:00
Mark Backman
15e3c69ddc Removed most pipecat examples, relocating to pipecat-examples repo 2025-07-29 11:17:01 -04:00
Ashot
39b00f5269 chore: address review comments 2025-07-29 18:20:50 +04:00
Mark Backman
4c368c78c6 Merge pull request #2289 from tomoima525/tomoima525/transcription-bucket-params
Add transcription_bucket param for rest helper
2025-07-29 05:05:08 -07:00
Tomoaki Imai
6eb00a99cb update Changelog for transcription_bucket params addition 2025-07-29 20:38:12 +09:00
Tomoaki Imai
3ae8cf1916 Add transcription_bucket param for rest helper 2025-07-29 18:58:38 +09:00
Aleix Conchillo Flaqué
03e87469df utils(asyncio): use trace logging for some cancelling messages 2025-07-28 17:43:41 -07:00
Mark Backman
70255d3c81 Merge pull request #2274 from pipecat-ai/mb/remove-message-name
Remove message["name"] addition when pushing
2025-07-28 17:29:23 -07:00
Mark Backman
96a72d0647 Remove message[name] addition when pushing 2025-07-28 20:13:50 -04:00
Mark Backman
27d4910694 Merge pull request #2286 from pipecat-ai/mb/fix-transcript-processor-newline
fix: Improve TranscriptProcessor detection for transcript type
2025-07-28 17:07:43 -07:00
Mark Backman
50242f4ad8 fix: Improve TranscriptProcessor detection for transcript type 2025-07-28 19:56:36 -04:00
Mark Backman
c9dda5251c Merge pull request #2284 from pipecat-ai/mb/yank-74-75
Add yanked notices to 0.0.74 and 0.0.75 changelogs
2025-07-28 12:55:47 -07:00
Mark Backman
419cc9ac68 Add yanked notices to 0.0.74 and 0.0.75 changelogs 2025-07-28 13:36:02 -04:00
Ashot
83b4747196 chore: address review comments 2025-07-28 17:52:17 +04:00
Ashot
a13b954415 formatting/cleanup: address Copilot PR review comments 2025-07-28 17:43:17 +04:00
Ashot
f2e9562f1b feat(tts): integrate Async TTS engine into pipecat 2025-07-28 17:42:57 +04:00
Mark Backman
afed9a61f2 Merge pull request #2268 from pipecat-ai/mb/inworld-changelog
Add changelog entry for InworldTTSService
2025-07-28 05:59:57 -07:00
Mark Backman
f0de27b35e Merge pull request #2273 from pipecat-ai/mb/gitignore-plivo-stream
.gitignore: add plivo-chatbot streams.xml, .python-version
2025-07-28 05:59:41 -07:00
Filipi da Silva Fuchter
9d5510ee47 Merge pull request #2265 from pipecat-ai/filipi/small_webrtc_buffer_processor
Fixed an issue in AudioBufferProcessor when using SmallWebRTCTransport
2025-07-28 09:23:58 -03:00
Mark Backman
434c3fc527 Merge pull request #2279 from Allenmylath/patch-26
Update README.md
2025-07-28 04:51:58 -07:00
allenmylath
aba79a9478 Update README.md
Explanatory comment. Even though the code allows for transport params, the README does not clearly state which options are available.
2025-07-28 11:24:27 +05:30
Mark Backman
fc96e091a9 Update NVIDIA in README 2025-07-26 15:01:52 -04:00
Mark Backman
851a27c082 Add Groq TTS to README 2025-07-26 14:58:07 -04:00
Mark Backman
a72d93dc6d Add Inworld to README 2025-07-26 14:55:19 -04:00
Mark Backman
c971232f20 .gitignore: add plivo-chatbot streams.xml, .python-version 2025-07-26 10:19:50 -04:00
Mark Backman
4b2ba2d69f Merge pull request #2270 from Allenmylath/patch-25
Update README.md
2025-07-26 04:55:59 -07:00
allenmylath
240a698fab Update README.md
Recreating the examples without activating Bedrock models will cause errors. Warning and link added.
2025-07-26 14:51:19 +05:30
Mark Backman
9aaae01063 Add changelog entry for InworldTTSService 2025-07-25 21:46:02 -04:00
Mark Backman
41c8d22cf3 Merge pull request #2208 from padillamt/mtp/add-inworld-tts
Inworld HTTP TTS Service
2025-07-25 17:13:37 -07:00
padillamt
b68f044ef7 mtpadilla: updated example to reflect parameter placement changes in base Inworld TTS class 2025-07-25 15:13:43 -07:00
padillamt
e140bd6960 mtpadilla: moved model and voice id setting into the class constructor 2025-07-25 14:04:49 -07:00
Filipi Fuchter
e86b55e2b3 Fixed an issue in AudioBufferProcessor when using SmallWebRTCTransport where, if the microphone was muted, track timing was not respected. 2025-07-25 17:01:41 -03:00
padillamt
4a9bec5b35 mtpadilla: stop metrics at result chunk 2025-07-25 11:14:20 -07:00
padillamt
37361391d9 mtpadilla: removed ability to set base_url via constructor, set internally based on streaming variable 2025-07-25 09:16:56 -07:00
Filipi da Silva Fuchter
4b3726eba4 Merge pull request #2260 from pipecat-ai/filipi/audio_resampler
Fixed an issue in `AudioBufferProcessor` that caused garbled audio
2025-07-25 09:27:42 -03:00
padillamt
8e66794759 mtpadilla: switch to Deepgram ASR for lower latency 2025-07-24 22:22:36 -07:00
padillamt
acc5b9f210 inworld: change to function that stops all processing metrics
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 22:07:15 -07:00
padillamt
f982ace4c5 inworld: removal of unnecessary setting of sampling rate since it matches the default 2025-07-24 21:56:01 -07:00
padillamt
5fb1899aeb inworld: removal of unnecessary default assignment as already handled 2025-07-24 21:42:42 -07:00
padillamt
7483422bd9 inworld: change set_voice to use self._settings
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:23:03 -07:00
padillamt
16c20f3a99 inworld: removal of unnecessary default assignment since already done
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:15:34 -07:00
padillamt
d248c102c8 inworld: removal of unnecessary default assignment since already done
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:15:20 -07:00
padillamt
662550cc5e mtpadilla: remove unused imports 2025-07-24 21:05:22 -07:00
padillamt
067f64389b mtpadilla: no longer needed so making empty 2025-07-24 20:44:27 -07:00
padillamt
81048ce43a mtpadilla: rename 07aa-interruptible-inworld-http.py to 07ab-interruptible-inworld-http.py 2025-07-24 20:42:29 -07:00
padillamt
f6440ee6e1 mtpadilla: correct Examples header in comments 2025-07-24 13:36:40 -07:00
padillamt
da8c67114a mtpadilla: make streaming the default for example 2025-07-24 13:35:29 -07:00
Mark Backman
d8ea1311ff Merge pull request #2254 from pipecat-ai/mb/11labs-create-context-id
ElevenLabsTTSService: Only reset the context_id when interruptions ar…
2025-07-24 13:32:37 -07:00
Mark Backman
2be615066c Merge pull request #2261 from pipecat-ai/mb/foundational-requirements
Foundational requirements.txt: add silero, websocket optional dep, re…
2025-07-24 11:06:16 -07:00
Mark Backman
75c2ffc0b5 Check if audio context is already available, create one if not 2025-07-24 13:57:46 -04:00
Mark Backman
2297eb217e ElevenLabsTTSService: Only reset the context_id when interruptions are enabled 2025-07-24 13:53:44 -04:00
Mark Backman
1bb821a07d Foundational requirements.txt: add silero, websocket optional dep, remove fastapi 2025-07-24 13:49:44 -04:00
Filipi Fuchter
970b8044a0 Fixed an issue in AudioBufferProcessor that caused garbled audio when enable_turn_audio was enabled and audio resampling was required. 2025-07-24 13:25:48 -03:00
Filipi da Silva Fuchter
d8bcb81f35 Merge pull request #2259 from pipecat-ai/filipi/eleven_labs_delayed_messages
Play delayed messages from `ElevenLabsTTSService` if they still belong to the current context.
2025-07-24 12:07:06 -03:00
Filipi da Silva Fuchter
3ce0ab8c6d Removing extra space.
Co-authored-by: Mark Backman <mark@daily.co>
2025-07-24 12:05:17 -03:00
Filipi Fuchter
097d786431 Fixing ruff format. 2025-07-24 12:03:17 -03:00
Filipi Fuchter
662f04879c Play delayed messages from ElevenLabsTTSService if they still belong to the current context. 2025-07-24 12:00:14 -03:00
Mark Backman
7a69f57e11 Merge pull request #2255 from pipecat-ai/mb/pyproject-versions-for-uv
pyproject.toml dependency updates to support better cross compatibility
2025-07-24 06:43:35 -07:00
Mark Backman
5b7b4efdc9 Add broader version support for stable core dependencies, up to the next major version 2025-07-24 09:40:52 -04:00
Mark Backman
cfa26524ca Add support for fastapi>=0.115.6,<0.117.0 2025-07-24 09:37:42 -04:00
Mark Backman
3d4ab7158d pyproject.toml dependency updates to support better cross compatibility 2025-07-24 09:37:42 -04:00
Mark Backman
26d1ca3c98 Merge pull request #2256 from pipecat-ai/mb/refactor-neuphonic-http
NeuphonicHttpTTSService: Refactor to use POST API
2025-07-24 06:36:23 -07:00
Mark Backman
083b32887e NeuphonicHttpTTSService: Refactor to use POST API 2025-07-24 01:05:37 -04:00
padillamt
b6367965cb mtpadilla: consolidate streaming and non-streaming options into a single class with common API, with boolean switch variable added (streaming) 2025-07-23 16:50:32 -07:00
padillamt
147bf9cfe8 mtpadilla: addition of non-streaming option with own dedicated class, and related additional non-streaming test option 2025-07-23 15:28:43 -07:00
Mark Backman
3391929127 Merge pull request #2252 from pipecat-ai/mb/example-axios-version-bump
Update axios in daily-pstn-server example due to transitive vulnerabi…
2025-07-23 13:30:58 -07:00
padillamt
a5d353030e mtpadilla: small formatting fix to comments 2025-07-23 12:02:58 -07:00
padillamt
f29024bcc0 mtpadilla: update comments regarding temperature parameter 2025-07-23 11:47:26 -07:00
Mark Backman
ebf9bc2741 Merge pull request #2246 from ydlamba/ydlamba/missing-livekit-event
fix(livekit): emit on_audio_track_subscribed event
2025-07-23 11:27:10 -07:00
Mark Backman
f5edde42f6 Update axios in daily-pstn-server example due to transitive vulnerability with form-data 2025-07-23 14:22:13 -04:00
Filipi da Silva Fuchter
37bb7ef926 Merge pull request #2239 from pipecat-ai/filipi/daily_log
Added `set_log_level` to `DailyTransport`
2025-07-23 14:48:34 -03:00
Filipi Fuchter
a63d1530a4 Added set_log_level to DailyTransport. 2025-07-23 14:43:53 -03:00
Yash Dev Lamba
960bc9df5b chore(changelog): add entry for LiveKitTransport audio subscribed event fix 2025-07-23 22:41:20 +05:30
Mark Backman
e2a153ee01 Merge pull request #2242 from pipecat-ai/mb/websockets-14
Upgrade websockets to support asyncio implementation
2025-07-23 08:58:08 -07:00
Mark Backman
300f19ad23 Port to the websockets asyncio implementation, support for websockets 13 and 14 2025-07-23 11:54:25 -04:00
Mark Backman
7955080da2 Change extra_headers to additional_headers, update websocket version support 2025-07-23 11:53:43 -04:00
Mark Backman
994e82c1ef Merge pull request #2243 from pipecat-ai/mb/word-wrangler-twilio-readme
Update Word Wrangler phone bot README to include deployment info
2025-07-23 07:04:19 -07:00
Mark Backman
b07b947352 Merge pull request #2244 from pipecat-ai/mb/upgrade-deepgram-4.7.0
Deepgram: Update optional dependency to 4.7.0
2025-07-23 07:04:02 -07:00
Filipi da Silva Fuchter
a6527c3856 Merge pull request #2240 from pipecat-ai/filipi/sig_term
Adding support for handle_sigterm
2025-07-23 08:15:50 -03:00
antonyesk601
1cbf7ae480 fix: remove unused variable; fix: remove redundant logic 2025-07-23 08:26:44 +00:00
Yash Dev Lamba
0e6874b605 fix(livekit): emit on_audio_track_subscribed event 2025-07-23 08:23:45 +05:30
Mark Backman
9ba172c49f Merge pull request #2236 from dbtreasure/fix/python-311-compatibility
Fix Python 3.11+ compatibility by pinning numba/llvmlite versions
2025-07-22 18:20:38 -07:00
dbtreasure
f710c94b6e Address code review feedback: remove explicit llvmlite pin
- Remove explicit llvmlite>=0.44.0 pin as numba>=0.61.0 automatically pulls compatible version
- Add changelog entry for Python 3.11+ dependency fix

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-22 18:45:32 -06:00
dbtreasure
6e3a0a2d5d Add explicit numba/llvmlite pins for Python 3.11+ compatibility
Fixes dependency resolution issues where transitive dependencies
through resampy would install incompatible versions:
- numba>=0.61.0 (supports Python 3.10-3.13)
- llvmlite>=0.44.0 (supports Python 3.10-3.13)

Previously, older versions (numba 0.53.1, llvmlite 0.36.0) only
supported Python 3.6-3.9, causing deployment failures on Python 3.11+.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-22 18:45:02 -06:00
Mark Backman
9530b8b842 Merge pull request #2235 from pipecat-ai/mb/nltk-tokenizer
Update match_endofsentence to use NLTK sentence tokenizer
2025-07-22 17:22:23 -07:00
Mark Backman
26c937af87 Update match_endofsentence to use NLTK sentence tokenizer 2025-07-22 20:19:29 -04:00
Mark Backman
976f6168f0 Deepgram: Update optional dependency to 4.7.0 2025-07-22 20:15:30 -04:00
Mark Backman
0be64e0fd9 Update Word Wrangler phone bot README to include deployment info 2025-07-22 20:10:20 -04:00
Filipi Fuchter
7d527c3a6b Mentioning the new field in the changelog. 2025-07-22 19:32:52 -03:00
Filipi Fuchter
c6f6930c27 Adding support for handle_sigterm 2025-07-22 17:24:07 -03:00
Mark Backman
c33dfe8309 Merge pull request #2233 from pipecat-ai/mb/enable-tracing-flag
fix: enable_tracing PipelineParam controls the service class decorators
2025-07-22 08:14:32 -07:00
Mark Backman
769cd1ef06 fix: enable_tracing PipelineParam controls the service class decorators 2025-07-22 11:10:53 -04:00
Mark Backman
6d72f60571 Merge pull request #2234 from pipecat-ai/mb/fix-minimax-pitch
fix: MiniMaxHttpTTSService pitch, add base_url arg
2025-07-22 08:10:01 -07:00
Mark Backman
e8d0712ac1 Merge pull request #2238 from pipecat-ai/mb/patch-form-data
Fix form-data vulnerability in pipecat-cloud-daily-pstn-server
2025-07-22 08:09:49 -07:00
Mark Backman
88b2c817ac Fix form-data vulnerability in pipecat-cloud-daily-pstn-server 2025-07-22 10:08:25 -04:00
Mark Backman
f8f6c9918d Merge pull request #2237 from pipecat-ai/mb/pipecat-cloud-example-pipeline-runner-args
Update Pipecat Cloud example to use handle_sigint=False in PipelineRu…
2025-07-22 06:55:56 -07:00
Mark Backman
8ee608bbfe Update Pipecat Cloud example to use handle_sigint=False in PipelineRunner args 2025-07-22 09:52:57 -04:00
Mark Backman
fad2ba4570 Merge pull request #2204 from yousifa/mcp-FunctionCallParams 2025-07-22 05:01:32 -07:00
Mark Backman
f609f7eb53 fix: MiniMaxHttpTTSService pitch, add base_url arg 2025-07-21 21:16:35 -04:00
Mark Backman
ea09813a2b Merge pull request #2227 from pipecat-ai/mb/fix-11labs-wordtimestamps
fix: Improve ElevenLabsTTSService word/timestamp calculation accuracy
2025-07-21 16:07:07 -07:00
Mark Backman
53abfc27a7 fix: Improve ElevenLabsTTSService word/timestamp calculation accuracy 2025-07-21 18:48:38 -04:00
padillamt
1915407ff7 inworld: removed unreferenced is_first_chunk variable 2025-07-21 15:30:48 -07:00
Mark Backman
9c72e96a2c Merge pull request #2230 from pipecat-ai/mb/livekit-tenacity
Livekit: change tenacity supported versions
2025-07-21 15:28:38 -07:00
Mark Backman
f66c67c4ab Merge pull request #2232 from pipecat-ai/mb/fix-ollama-args
Fix: Ollama kwargs error
2025-07-21 15:26:13 -07:00
Mark Backman
b623face03 Add Ollama function calling example 14u 2025-07-21 17:52:16 -04:00
Mark Backman
698d60f3ae fix: OLLamaLLMService pass base_url as kwarg 2025-07-21 17:51:11 -04:00
Mark Backman
c9717a23a5 Livekit: change tenacity supported versions 2025-07-21 17:30:18 -04:00
padillamt
076a675a75 inworld: Fix...Set sample_rate=None in InworldHttpTTSService to match Cartesia pattern 2025-07-21 13:50:36 -07:00
padillamt
0d5292c4ef inworld: typo fix in voice name 2025-07-21 13:48:13 -07:00
padillamt
4853d5d55c inworld: updated InworldHttpTTSService initialization 2025-07-21 13:27:25 -07:00
padillamt
8eda2435a2 inworld: removed explicit references to language since our models currently infer that from the text. 2025-07-21 13:24:10 -07:00
Mark Backman
d981ce6e56 Merge pull request #2226 from pipecat-ai/mb/11labs-speed-docstring
Fix 11Labs speed docstring
2025-07-21 13:21:45 -07:00
padillamt
54ff946976 inworld: largely adjustments for docstring compatibility 2025-07-21 12:07:58 -07:00
Mark Backman
1bbd3bd8ab Fix 11Labs speed docstring 2025-07-21 14:58:12 -04:00
padillamt
aadd088b50 inworld: commented out contents as per Pipecat guidance that this pattern is being retired 2025-07-21 10:52:55 -07:00
padillamt
4250aa6616 inworld: removal of backup copy, no longer needed 2025-07-21 10:11:50 -07:00
Kwindla Hultman Kramer
a20915caa7 Merge pull request #2224 from pipecat-ai/khk/mps
Add MPS backend auto-detection to local smart-turn v2
2025-07-21 09:24:51 -07:00
Vanessa Pyne
28cab5a606 Merge pull request #1932 from getchannel/groundingMetadata
Add groundingMetadata to Gemini Multimodal Live Service
2025-07-21 10:09:26 -05:00
Vanessa Pyne
cfea56064d small merge-main nit fixes - gemini_multimodal_live events.py 2025-07-21 09:54:15 -05:00
Vanessa Pyne
8467d87cfc small main-merge fixes - gemini.py 2025-07-21 09:52:32 -05:00
Kwindla Hultman Kramer
b20d020bea Add MPS backend auto-detection to local smart-turn v2 2025-07-20 20:18:45 -04:00
padillamt
e3711f96a3 inworld: added detailed comments 2025-07-20 17:06:35 -07:00
Pete
948257c66e Merge branch 'main' into groundingMetadata 2025-07-20 19:54:30 -04:00
Pete
b54d1fb7fd Resolve merge conflict and remove duplicate File API initialization
- Remove duplicate file_api initialization lines
- Keep grounding metadata tracking functionality
- Maintain clean code structure
2025-07-20 19:15:40 -04:00
Pete
ec361df0d1 Fix final ruff linting issues
- Remove duplicate import in __init__.py
- Clean up extra blank lines in gemini.py
- Remove extra blank line in _create_single_response method
2025-07-20 18:58:54 -04:00
Pete
b1a5cddde4 Refactor whitespace and formatting in multiple files
- Clean up unnecessary whitespace in `gemini.py`, `events.py`, and `file_api.py`
- Ensure consistent formatting in `26g-gemini-multimodal-live-groundingMetadata.py`
- Improve readability by aligning code and removing trailing spaces
2025-07-20 18:40:12 -04:00
Pete
e165d38277 remove truncated logging from debug 2025-07-20 18:27:21 -04:00
Pete
8ba340a8a5 remove debug logging 2025-07-20 18:21:42 -04:00
Pete
8f74b97591 Refactor _send_user_text method in Gemini multimodal service to streamline event creation for turn completion 2025-07-20 18:08:45 -04:00
Pete
1d69cd1a5e Remove debug logging from _send_user_text method in Gemini multimodal service 2025-07-20 18:04:57 -04:00
Pete
bd7a0f27cc Add text input handling to Gemini multimodal service
- Updated `RealtimeInput` to include an optional `text` parameter.
- Introduced `TextInputMessage` class for encapsulating text input data.
- Implemented `_send_user_text` method to send text input to the Gemini Live API.
- Enhanced message processing to support text input alongside media chunks.
2025-07-20 17:39:31 -04:00
padillamt
5d8c184d99 inworld: commit of original text file and changes that copy openai's with Inworld TTS as only change 2025-07-18 16:30:03 -07:00
padillamt
1bc442e329 inworld: docstring fix 2025-07-18 15:13:19 -07:00
kompfner
d4e33663b2 Merge pull request #2214 from pipecat-ai/pk/fix-google-llm-context
Fixed an issue in `GoogleLLMContext` where it would inject the `syste…
2025-07-18 09:28:28 -04:00
marcus-daily
d7d1b16dad Removing old import 2025-07-18 12:48:06 +01:00
marcus-daily
0bc2ea13f2 Updating changelog 2025-07-18 12:48:06 +01:00
marcus-daily
b5d1301221 Fix linter warnings 2025-07-18 12:48:06 +01:00
marcus-daily
ed8f30ec71 Add support for running smart-turn-v2 locally 2025-07-18 12:48:06 +01:00
antonyesk601
688031efd6 fix: use undeclared variable _preinitialized. fix: double send of start frame 2025-07-18 08:23:04 +00:00
kompfner
a74a935ca0 Merge pull request #1910 from matejmarinko-soniox/main
Add Soniox STT service integration
2025-07-17 09:29:07 -04:00
antonyesk601
0f9e69d3c7 feat: Add Simli Trinity models support to pipecat 2025-07-17 11:55:40 +00:00
padillamt
f3984aec33 inworld: added (empty) requirements for Inworld to be explicit reg dependencies 2025-07-16 13:21:32 -07:00
Paul Kompfner
7cfd56699b Fixed an issue in GoogleLLMContext where it would inject the system_message as a "user" message into cases where it was not meant to; it was only meant to do that when there were no "regular" (non-function-call) messages in the context, to ensure that inference would run properly. 2025-07-16 16:07:53 -04:00
Matej Marinko
cb984237a7 Fix lint error 2025-07-16 16:54:28 +02:00
Matej Marinko
c969fdddb9 Rename and simplify VAD finalization parameter usage 2025-07-16 09:47:34 +02:00
padillamt
2b76823b01 inworld: added comments to track a few things to confirm 2025-07-15 18:17:30 -07:00
padillamt
ca936bd569 inworld: added Inworld to list of needed credentials 2025-07-15 18:11:50 -07:00
padillamt
c67b779b91 inworld: first commit of Inworld example file for TTS 2025-07-15 17:21:16 -07:00
padillamt
913dba3b74 inworld: class name change 2025-07-15 17:15:57 -07:00
padillamt
384838147a inworld: removed unnecessary code from stop() and cancel() 2025-07-15 16:56:18 -07:00
padillamt
7861b911c0 inworld: first commit of __init__ and tts.py files 2025-07-15 16:50:50 -07:00
Mark Backman
9931ad2ce1 Merge pull request #2199 from Dev-Khant/add-host-support-in-Mem0
Add `host` support in Mem0 Memory
2025-07-15 11:41:15 -07:00
Filipi da Silva Fuchter
fd73feb645 Merge pull request #2201 from pipecat-ai/filipi/stt_issue
Only create the EmulateUserStartedSpeakingFrame if we have received a transcription
2025-07-15 13:56:11 -03:00
Yousif Astarabadi
ee78428a2a formatted 2025-07-14 20:38:28 -07:00
Yousif Astarabadi
ae02249255 mcp_tool_wrapper using FunctionCallParams 2025-07-14 20:31:22 -07:00
Filipi Fuchter
727af2e6fb Only create the EmulateUserStartedSpeakingFrame if we have received a transcription. 2025-07-14 17:38:03 -03:00
Mark Backman
8fd5576879 Merge pull request #2198 from Allenmylath/patch-24
Update app.py
2025-07-14 06:37:42 -07:00
kompfner
1f85dcee7c Merge pull request #2171 from pipecat-ai/pk/aws-strands-demo
Minimal AWS Strands demo
2025-07-14 09:32:16 -04:00
Dev Khant
138890bc5c Add support in Mem0 Memory 2025-07-14 18:08:25 +05:30
Filipi da Silva Fuchter
a094efc9e6 Merge pull request #2196 from pipecat-ai/mb/lmnt-model
LmntTTSService: update the default model to blizzard
2025-07-14 09:15:17 -03:00
allenmylath
1f9e2fdecc Update app.py
misleading comment. no endpoints.py
2025-07-14 14:02:35 +05:30
Mark Backman
4a2b4660bc LmntTTSService: update the default model to blizzard 2025-07-13 10:54:43 -07:00
Mark Backman
b3ac90015a Merge pull request #2195 from Trinary-Projects/transformers_ver_patch
Update transformers dep. to >=4.48.0 for Ultravox
2025-07-11 23:31:47 -07:00
Jaideep
2fe06f0a4e Update pyproject.toml 2025-07-12 11:34:45 +05:30
Mark Backman
1836a7484e Merge pull request #2193 from pipecat-ai/mb/changelog-0.0.76
Prepare changelog for 0.0.76 release
2025-07-11 16:15:34 -07:00
Mark Backman
25a5c5aaab Prepare changelog for 0.0.76 release 2025-07-11 16:08:08 -07:00
mattie ruth backman
24694e2558 Changelog entry 2025-07-11 14:30:12 -07:00
mattie ruth backman
2325edd9ba Add a text entry box to the simple-chatbot example 2025-07-11 14:30:12 -07:00
mattie ruth backman
fad5713ade Fix append-to-context function call 2025-07-11 14:30:12 -07:00
Paul Kompfner
fe8573322f AWS Strands demos 2025-07-11 16:42:27 -04:00
Mark Backman
06c1255abe fix: use a different aggregation timeout for emulated user speech (#2185)
* fix: use a different aggregation timeout for emulated user speech

* Add SpeechControlParamsFrame

* Update test_context_aggregator tests
2025-07-11 16:33:44 -04:00
Mark Backman
f108a67635 Merge pull request #2189 from pipecat-ai/mb/numpy-version-bump
Update numpy, transformers to support newer versions
2025-07-11 12:02:02 -07:00
Mark Backman
bf580d061d Update numpy, transformers to support newer versions 2025-07-11 11:58:31 -07:00
Filipi da Silva Fuchter
b005bd7b98 Merge pull request #2184 from pipecat-ai/filipi/twilio_issue
Fixing an issue where Pipecat was not receiving the user's audio
2025-07-11 15:32:28 -03:00
Filipi Fuchter
75f8baab33 Mentioning the fixes in the changelog. 2025-07-11 11:56:16 -03:00
Matej Marinko
5c3fb73cef Rename example 2025-07-11 16:07:24 +02:00
Filipi Fuchter
5c3f4180b9 Refactored VAD analyzer to process multiple audio frames in a single iteration if needed. 2025-07-11 10:59:32 -03:00
Mark Backman
6cd6e7ceed Merge pull request #2186 from pipecat-ai/mb/fix-pre-commit-config
Update .pre-commit-config.yaml to use pyproject.toml linting rules
2025-07-11 06:34:01 -07:00
Filipi Fuchter
1a146c2a64 Not serializing a JSON in case we have no audio. 2025-07-11 10:15:09 -03:00
Filipi Fuchter
eaeb9e6efa Not creating InputAudioRawFrame in case we don't have bytes. Fixed for Plivo, Exotel and Telnyx. 2025-07-11 09:51:38 -03:00
Matej Marinko
2e84c91748 Remove outdated parameter 2025-07-11 08:52:39 +02:00
Matej Marinko
650d45c1f4 Use single sample rate parameter 2025-07-11 08:27:06 +02:00
Filipi Fuchter
f4f65024ef Refactoring the test client to use the new version of the Pipecat Client SDK. 2025-07-10 21:57:25 -03:00
Filipi Fuchter
1200aa4fb8 Not creating InputAudioRawFrame in case we don't have bytes. 2025-07-10 21:56:34 -03:00
Filipi da Silva Fuchter
6762363685 Merge pull request #2183 from pipecat-ai/filipi/parallel_pipeline_issue
Fixed an issue in ParallelPipeline that caused errors when attempting to drain the queues.
2025-07-10 21:51:04 -03:00
Filipi Fuchter
b2ead325c4 Fixed an issue in ParallelPipeline that caused errors when attempting to drain the queues. 2025-07-10 21:50:35 -03:00
Mark Backman
4e24b915cc Update .pre-commit-config.yaml to use pyproject.toml linting rules 2025-07-10 16:10:27 -07:00
kompfner
b610ee26ba Merge pull request #2181 from pipecat-ai/pk/fix-aws-nova-sonic-pipeline-freeze
Fix a pipeline freeze when using AWS Nova Sonic. The freeze occurs if…
2025-07-10 16:30:55 -04:00
Paul Kompfner
2b867f1613 Fix a pipeline freeze when using AWS Nova Sonic. The freeze occurs if the user starts speaking before we've finished sending the "trigger" audio (AWS Nova Sonic can only start speaking in response to a user utterance, so we have a simulated user utterance to "trigger" the bot speaking without the user having actually spoken first). 2025-07-10 15:57:05 -04:00
Mark Backman
7b8fe565c7 Merge pull request #2182 from pipecat-ai/mb/run-example-usage
run.py: Add example usage to the module docstring
2025-07-10 12:48:29 -07:00
Mark Backman
a246862910 run.py: Add example usage to the module docstring 2025-07-10 11:41:49 -07:00
Filipi da Silva Fuchter
106809f3fd Merge pull request #2166 from carolin-tavus/remove-persona-microphone-check
feat: Remove persona microphone check
2025-07-10 15:28:35 -03:00
carolin-tavus
f0d8499f7e feat: avoid checking microphone enabled 2025-07-10 09:40:27 +00:00
Mark Backman
332ca3d55e Merge pull request #2177 from pipecat-ai/mb/fix-ruff-improvements
Make fix-ruff.sh more flexible, use pyproject rules
2025-07-09 12:33:05 -07:00
Mark Backman
a48f5d5796 Make fix-ruff.sh more flexible, use pyproject rules 2025-07-09 11:48:17 -07:00
Mark Backman
f04f047428 Merge pull request #2176 from pipecat-ai/mb/pre-commit-config
Add docstring checking to .pre-commit-config.yaml
2025-07-09 11:47:25 -07:00
Mark Backman
4e61fd33ea Add docstring checking to .pre-commit-config.yaml 2025-07-09 11:18:40 -07:00
Matej Marinko
61ac77be72 Update docs 2025-07-09 11:59:45 +02:00
Matej Marinko
c093eb5b63 Move config to main file 2025-07-09 10:20:37 +02:00
Matej Marinko
98e24131bd Send raw result 2025-07-09 09:59:04 +02:00
Matej Marinko
7becce9e8c Add transcript tracing 2025-07-09 09:37:58 +02:00
Matej Marinko
3cdaeb719a Update examples to new format 2025-07-09 09:28:43 +02:00
Matej Marinko
8daaea5969 Minor code cleanup 2025-07-09 09:03:02 +02:00
matejmarinko-soniox
dc47516e14 Update src/pipecat/services/soniox/config.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-09 08:04:59 +02:00
Mark Backman
0fcc4f822f Merge pull request #2173 from captaincaius/fix-nextjs-webhook-example-null-check
fix nextjs webhook example num_endpoints null check
2025-07-08 14:10:16 -07:00
Captain Caius
c0ed061ff5 fix nextjs webhook example num_endpoints null check 2025-07-08 13:40:26 -07:00
Matej Marinko
0f727248d2 Merge branch 'main' of github.com:pipecat-ai/pipecat 2025-07-08 08:20:10 +02:00
Pete
7ed4fe50d4 Update gemini.py
-FunctionCallFromLLM
-Delete duplicate Gemini imports
2025-07-03 19:39:44 -04:00
Pete
6f66ec1727 Update gemini.py
tab indentation fix
2025-07-03 18:55:21 -04:00
Pete
c7e758fc36 Merge branch 'main' into groundingMetadata 2025-07-03 18:47:47 -04:00
Pete
14c22234bb Fix parameter name consistency in parse_server_event function
- Change function body to use 'str' parameter consistently
- Matches pattern used in OpenAI Realtime Beta service
- Fixes bug where parameter was named 'str' but body used 'message_str'
- Maintains consistency with existing codebase patterns
2025-07-03 18:02:24 -04:00
Pete
d565e9ae53 Update grounding metadata example with final refinements
- Reorganize imports and transport_params structure
- Remove copyright header for consistency
- Enhance grounding metadata logging with better formatting
- Remove unnecessary PipelineParams configuration
- Update message content formatting

Completes incorporation of draft PR #2121 changes
2025-07-03 17:53:55 -04:00
Pete
4951c97eab Clean up verbose logging in grounding metadata implementation
- Remove debug logging from grounding metadata event handlers
- Simplify logging in _process_grounding_metadata method
- Clean up example file logging for better readability
- Remove verbose event parsing comments

Based on suggestions from draft PR #2121
2025-07-03 17:49:27 -04:00
Pete
9b38f3e2fa Delete examples/foundational/26f-gemini-multimodal-live-files-api.py 2025-07-03 17:15:18 -04:00
Pete
a297e4208e Merge branch 'main' into groundingMetadata 2025-06-30 19:48:55 -04:00
Pete
1cf0b35ac1 Merge branch 'main' into groundingMetadata 2025-06-24 22:00:16 -04:00
Matej Marinko
c54084b7a4 Fix deadlock on STT service stop 2025-06-23 14:18:29 +02:00
Pete
e3fe040017 Update gemini.py 2025-06-21 14:43:15 -04:00
Pete
ae5e3e2dc4 Merge branch 'main' into groundingMetadata 2025-06-21 12:16:32 -04:00
Pete
77378d2779 Merge branch 'pipecat-ai:main' into groundingMetadata 2025-06-21 12:08:49 -04:00
Pete
4106f0dabe Merge branch 'pipecat-ai:main' into main 2025-06-21 10:54:25 -04:00
Pete
2ed1ed6821 Merge branch 'pipecat-ai:main' into main 2025-06-14 16:23:27 -04:00
Matej Marinko
6d3a38842d Merge branch 'main' of github.com:pipecat-ai/pipecat 2025-06-12 11:32:38 +02:00
Pete
7360f79413 Merge branch 'pipecat-ai:main' into main 2025-06-11 13:16:19 -04:00
Pete
8d55e13750 remove audio_transcriber from gemini.py
unecessary import removed.
2025-06-10 11:22:16 -04:00
Pete
737e8e79c9 Merge branch 'main' into groundingMetadata 2025-06-10 11:12:35 -04:00
Pete
4d977fede0 Merge branch 'main' into main 2025-06-10 11:07:59 -04:00
getchannel
8070e156d8 Add groundingMetadata events.py 2025-05-30 18:07:09 -04:00
getchannel
43c6f1f5cd Add groundingMetadata and logging gemini.py 2025-05-30 18:01:15 -04:00
getchannel
f53f5445ba Create 26g-gemini-multimodal-live-groundingMetadata.py 2025-05-30 17:36:36 -04:00
getchannel
7263d11ee4 update correct upload endpoint file_api.py 2025-05-30 13:41:55 -04:00
getchannel
f2d5b9ad69 Create 26f-gemini-multimodal-live-files-api.py
This is an example to test usage of the Files API integration. Specifically with the Gemini Multimodal Live Service.
2025-05-30 13:04:52 -04:00
getchannel
40c7e3c52c Update gemini.py 2025-05-30 12:19:40 -04:00
Matej Marinko
ee5fea4221 Fix auto finalization cycle 2025-05-29 14:58:35 +02:00
Matej Marinko
db7b60cfe9 Auto finalize fix 2025-05-29 13:24:53 +02:00
Matej Marinko
51b79bd6a1 Minor code style changes 2025-05-29 10:11:11 +02:00
Matej Marinko
95fe762776 Fix typo 2025-05-29 09:23:37 +02:00
Matej Marinko
2968c846ce Add Soniox STT service 2025-05-28 09:35:21 +02:00
getchannel
e27da96cdc Rename file_api to file_api.py
added proper .py to file name.
2025-05-13 22:01:02 -04:00
getchannel
d86502e79a add file_api __init__.py 2025-05-09 10:53:31 -04:00
getchannel
59c7744590 add FileData class events.py 2025-05-09 10:52:04 -04:00
getchannel
949971dea9 Create file_api 2025-05-09 10:51:24 -04:00
getchannel
cd4a893c65 add FileAPI to gemini.py 2025-05-09 10:50:27 -04:00
1064 changed files with 19095 additions and 104696 deletions


@@ -1,60 +0,0 @@
name: android
on:
push:
branches:
- main
paths:
- "examples/simple-chatbot/client/android/**"
- "examples/p2p-webrtc/video-transform/client/android/**"
pull_request:
branches:
- "**"
paths:
- "examples/simple-chatbot/client/android/**"
- "examples/p2p-webrtc/video-transform/client/android/**"
workflow_dispatch:
inputs:
sdk_git_ref:
type: string
description: "Which git ref of the app to build"
concurrency:
group: build-android-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
sdk:
name: "Demo apps"
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.sdk_git_ref || github.ref }}
- name: "Install Java"
uses: actions/setup-java@v4
with:
distribution: 'temurin'
java-version: '17'
- name: "Example app: Simple Chatbot"
working-directory: examples/simple-chatbot/client/android
run: ./gradlew :simple-chatbot-client:assembleDebug
- name: Upload Simple Chatbot APK
uses: actions/upload-artifact@v4
with:
name: Simple Chatbot Android Client
path: examples/simple-chatbot/client/android/simple-chatbot-client/build/outputs/apk/debug/simple-chatbot-client-debug.apk
- name: "Example app: Small WebRTC Client"
working-directory: examples/p2p-webrtc/video-transform/client/android
run: ./gradlew :small-webrtc-client:assembleDebug
- name: Upload Small WebRTC APK
uses: actions/upload-artifact@v4
with:
name: Small WebRTC Android Client
path: examples/p2p-webrtc/video-transform/client/android/small-webrtc-client/build/outputs/apk/debug/small-webrtc-client-debug.apk


@@ -21,24 +21,20 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Build project
run: |
source .venv/bin/activate
python -m build
- name: Install project and other Python dependencies
run: |
source .venv/bin/activate
pip install --editable .
run: uv build
- name: Install project in editable mode
run: uv pip install --editable .


@@ -18,35 +18,28 @@ jobs:
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
version: "latest"
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
with:
python-version: "3.10"
- name: Cache virtual environment
uses: actions/cache@v3
with:
# We are hashing dev-requirements.txt and test-requirements.txt which
# contain all dependencies needed to run the tests.
key: venv-${{ runner.os }}-${{ steps.setup_python.outputs.python-version}}-${{ hashFiles('dev-requirements.txt') }}-${{ hashFiles('test-requirements.txt') }}
path: .venv
run: uv python install 3.10
- name: Install system packages
id: install_system_packages
run: |
sudo apt-get install -y portaudio19-dev
- name: Setup virtual environment
- name: Install dependencies
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt -r test-requirements.txt
uv sync --group dev --extra anthropic --extra aws --extra google --extra langchain
- name: Run tests with coverage
run: |
source .venv/bin/activate
coverage run
coverage xml
uv run coverage run
uv run coverage xml
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v5
with:


@@ -22,25 +22,22 @@ jobs:
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: "3.10"
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install development Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Ruff formatter
id: ruff-format
run: |
source .venv/bin/activate
ruff format --diff
run: uv run ruff format --diff
- name: Ruff linter (all rules)
id: ruff-check
run: |
source .venv/bin/activate
ruff check
run: uv run ruff check


@@ -17,23 +17,17 @@ jobs:
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.gitref }}
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Build project
run: |
source .venv/bin/activate
python -m build
run: uv build
- name: Upload wheels
uses: actions/upload-artifact@v4
with:


@@ -12,23 +12,16 @@ jobs:
with:
fetch-tags: true
fetch-depth: 100
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Build project
run: |
source .venv/bin/activate
python -m build
run: uv build
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
@@ -38,7 +31,7 @@ jobs:
publish-to-test-pypi:
name: "Publish to Test PyPI"
runs-on: ubuntu-latest
needs: [ build ]
needs: [build]
environment:
name: testpypi
url: https://pypi.org/p/pipecat-ai


@@ -0,0 +1,123 @@
name: Python Compatibility Test
on:
push:
branches: [main, develop]
paths: ['pyproject.toml']
pull_request:
branches: [main, develop]
paths: ['pyproject.toml']
jobs:
test-dev-environment:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: ['3.10.18', '3.11.13', '3.12.11', '3.13.5']
name: Dev Environment - Python ${{ matrix.python-version }}
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y \
portaudio19-dev \
libcairo2-dev \
libgirepository1.0-dev \
pkg-config
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
version: 'latest'
- name: Set up Python ${{ matrix.python-version }}
run: |
uv python install ${{ matrix.python-version }}
uv python pin ${{ matrix.python-version }}
- name: Test uv sync with all extras (Python < 3.13)
if: "!startsWith(matrix.python-version, '3.13.')"
run: |
uv sync --group dev --all-extras --no-extra krisp
- name: Test uv sync without PyTorch extras (Python 3.13+)
if: startsWith(matrix.python-version, '3.13.')
run: |
uv sync --group dev --all-extras \
--no-extra krisp \
--no-extra ultravox \
--no-extra local-smart-turn \
--no-extra moondream \
--no-extra mlx-whisper
- name: Verify dev installation
run: |
uv run python --version
uv run python -c "import pipecat; print('✅ Dev environment - Pipecat imports successfully')"
test-user-experience:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: ['3.10.18', '3.11.13', '3.12.11', '3.13.5']
name: User Experience - Python ${{ matrix.python-version }}
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y \
portaudio19-dev \
libcairo2-dev \
libgirepository1.0-dev \
pkg-config
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
version: 'latest'
- name: Set up Python ${{ matrix.python-version }}
run: |
uv python install ${{ matrix.python-version }}
- name: Build local package
run: |
uv build
- name: Create test project
run: |
mkdir test-project
cd test-project
uv init --python ${{ matrix.python-version }}
- name: Test comprehensive extras with uv add (Python 3.10-3.12)
if: "!startsWith(matrix.python-version, '3.13.')"
run: |
cd test-project
# Use uv add with built wheel to leverage dependency management
uv add "../dist/pipecat_ai-"*".whl[anthropic,assemblyai,asyncai,aws,aws-nova-sonic,azure,cartesia,cerebras,deepseek,daily,deepgram,elevenlabs,fal,fireworks,fish,gladia,google,grok,groq,gstreamer,heygen,inworld,koala,langchain,livekit,lmnt,local,mcp,mem0,mlx-whisper,moondream,nim,neuphonic,noisereduce,openai,openpipe,openrouter,perplexity,playht,qwen,rime,riva,runner,sambanova,sentry,local-smart-turn,remote-smart-turn,silero,simli,soniox,soundfile,speechmatics,tavus,together,tracing,ultravox,webrtc,websocket,whisper]"
- name: Test Python 3.13 compatible extras with uv add
if: startsWith(matrix.python-version, '3.13.')
run: |
cd test-project
# Use uv add with built wheel and Python 3.13 compatible extras
uv add "../dist/pipecat_ai-"*".whl[anthropic,assemblyai,asyncai,aws,aws-nova-sonic,azure,cartesia,cerebras,deepseek,daily,deepgram,elevenlabs,fal,fireworks,fish,gladia,google,grok,groq,gstreamer,heygen,inworld,koala,langchain,livekit,lmnt,local,mcp,mem0,nim,neuphonic,noisereduce,openai,openpipe,openrouter,perplexity,playht,qwen,rime,riva,runner,sambanova,sentry,remote-smart-turn,silero,simli,soniox,soundfile,speechmatics,tavus,together,tracing,webrtc,websocket,whisper]"
- name: Verify user installation
run: |
cd test-project
uv run python --version
uv run python -c "import pipecat; print('✅ User experience - Pipecat imports successfully')"
# Test that basic functionality works
uv run python -c "from pipecat.pipeline.pipeline import Pipeline; print('✅ Pipeline import works')"

56
.github/workflows/sync-quickstart.yaml vendored Normal file

@@ -0,0 +1,56 @@
name: Sync Quickstart to pipecat-quickstart repo
on:
push:
branches: [main]
paths:
- 'examples/quickstart/**'
workflow_dispatch: # Manual trigger
jobs:
sync-quickstart:
runs-on: ubuntu-latest
steps:
- name: Checkout main repo
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Checkout quickstart repo
uses: actions/checkout@v4
with:
repository: pipecat-ai/pipecat-quickstart
token: ${{ secrets.QUICKSTART_SYNC_TOKEN }}
path: quickstart-repo
- name: Sync files (excluding READMEs)
run: |
# Copy code files only, skip READMEs
cp examples/quickstart/bot.py quickstart-repo/
cp examples/quickstart/requirements.txt quickstart-repo/
cp examples/quickstart/env.example quickstart-repo/
# Copy any other files that aren't README.md
find examples/quickstart -type f \
-not -name "README.md" \
-not -name "*.md" \
-exec cp {} quickstart-repo/ \;
- name: Commit and push changes
run: |
cd quickstart-repo
git config user.name "GitHub Action"
git config user.email "action@github.com"
git add .
# Only commit if there are changes
if ! git diff --staged --quiet; then
git commit -m "Sync from pipecat main repo
Updated files from examples/quickstart/
Commit: ${{ github.sha }}
"
git push
else
echo "No changes to sync"
fi


@@ -22,31 +22,23 @@ jobs:
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
version: "latest"
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
with:
python-version: "3.10"
- name: Cache virtual environment
uses: actions/cache@v3
with:
# We are hashing dev-requirements.txt and test-requirements.txt which
# contain all dependencies needed to run the tests.
key: venv-${{ runner.os }}-${{ steps.setup_python.outputs.python-version}}-${{ hashFiles('dev-requirements.txt') }}-${{ hashFiles('test-requirements.txt') }}
path: .venv
run: uv python install 3.10
- name: Install system packages
id: install_system_packages
run: |
sudo apt-get install -y portaudio19-dev
- name: Setup virtual environment
- name: Install dependencies
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt -r test-requirements.txt
uv sync --group dev --extra anthropic --extra aws --extra google --extra langchain
- name: Test with pytest
run: |
source .venv/bin/activate
pytest
uv run pytest

42
.github/workflows/update-lockfile.yaml vendored Normal file

@@ -0,0 +1,42 @@
name: Update lockfile
on:
push:
paths:
- 'pyproject.toml'
branches:
- main
workflow_dispatch: # Allows manual triggering from GitHub UI
jobs:
update-lockfile:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
# This gives the workflow permission to push back to the repo
token: ${{ secrets.GITHUB_TOKEN }}
- name: Install uv
uses: astral-sh/setup-uv@v1
- name: Update lockfile
run: uv lock
- name: Check for changes
id: verify-changed-files
run: |
if [ -n "$(git status --porcelain)" ]; then
echo "changed=true" >> $GITHUB_OUTPUT
else
echo "changed=false" >> $GITHUB_OUTPUT
fi
- name: Commit lockfile
if: steps.verify-changed-files.outputs.changed == 'true'
run: |
git config --local user.email "action@github.com"
git config --local user.name "GitHub Action"
git add uv.lock
git commit -m "chore: update uv.lock after dependency changes"
git push

7
.gitignore vendored

@@ -31,8 +31,6 @@ MANIFEST
fly.toml
# Examples
examples/telnyx-chatbot/templates/streams.xml
examples/twilio-chatbot/templates/streams.xml
examples/**/node_modules/
examples/**/.expo/
examples/**/dist/
@@ -50,4 +48,7 @@ examples/**/web-build/
# Documentation
docs/api/_build/
docs/api/api
docs/api/api
# uv
.python-version


@@ -1,8 +1,8 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.9.7
rev: v0.12.1
hooks:
- id: ruff
language_version: python3
args: [ --select, I, ]
args: [--fix]
- id: ruff-format


@@ -9,22 +9,14 @@ build:
- python3-dev
- libasound2-dev
jobs:
pre_build:
- python -m pip install --upgrade pip
- pip install wheel setuptools
post_build:
- echo "Build completed"
post_install:
- pip install uv
- UV_PROJECT_ENVIRONMENT=$READTHEDOCS_VIRTUALENV_PATH uv sync --group docs --all-extras --no-extra krisp --no-extra gstreamer --no-extra ultravox --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
sphinx:
configuration: docs/api/conf.py
fail_on_warning: false
python:
install:
- requirements: docs/api/requirements.txt
- method: pip
path: .
search:
ranking:
api/*: 5


@@ -5,7 +5,406 @@ All notable changes to **Pipecat** will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.0.75] - 2025-07-08
## Unreleased
### Changed
- Updated `pyproject.toml` to once again pin `numba` to `>=0.61.2` in order to
resolve package versioning issues.
### Other
- Updated `15-switch-voices.py` and `15a-switch-languages.py` examples to show
how to enclose complex logic (e.g. `ParallelPipeline`) into a single processor
so the main pipeline becomes simpler.
## [0.0.79] - 2025-08-07
### Changed
- Changed `pipecat-ai`'s `openai` dependency to `>=1.74.0,<=1.99.1` due to a
breaking change in `openai` 1.99.2 ([commit](https://github.com/openai/openai-python/commit/657f551dbe583ffb259d987dafae12c6211fba06))
### Deprecated
- `TTSService.say()` is deprecated; push a `TTSSpeakFrame` instead. Calling
functions directly is a discouraged pattern in Pipecat because, for example,
it might cause issues with frame ordering.
- `LLMMessagesFrame` is deprecated, in favor of either:
- `LLMMessagesUpdateFrame` with `run_llm=True`
- `OpenAILLMContextFrame` with desired messages in a new context
- `LLMUserResponseAggregator` and `LLMAssistantResponseAggregator` are
deprecated, as they depended on the now-deprecated `LLMMessagesFrame`. Use
`LLMUserContextAggregator` and `LLMAssistantContextAggregator` (or
LLM-specific subclasses thereof) instead.
## [0.0.78] - 2025-08-07
### Added
- Added `enable_emulated_vad_interruptions` to `LLMUserAggregatorParams`.
When user speech is emulated (e.g. when a transcription is received but
VAD doesn't detect speech), this parameter controls whether the emulated
speech can interrupt the bot. Default is False (emulated speech is ignored
while the bot is speaking).
- Added new `handle_sigint` and `handle_sigterm` to `RunnerArguments`. This
allows applications to know what settings they should use for the environment
they are running on. Also, added `pipeline_idle_timeout_secs` to be able to
control the `PipelineTask` idle timeout.
- Added `processor` field to `ErrorFrame` to indicate the `FrameProcessor` that
generated the error.
- Added new language support for `AWSTranscribeSTTService`. All languages
supporting streaming data input are now supported:
https://docs.aws.amazon.com/transcribe/latest/dg/supported-languages.html
- Added support for Simli Trinity Avatars. A new `is_trinity_avatar` parameter
has been introduced to specify whether the provided `faceId` corresponds to a
Trinity avatar, which is required for optimal Trinity avatar performance.
- The development runner now handles custom `body` data for `DailyTransport`.
The `body` data is passed to the Pipecat client. You can POST to the `/start`
endpoint with a request body of:
```
{
"createDailyRoom": true,
"dailyRoomProperties": { "start_video_off": true },
"body": { "custom_data": "value" }
}
```
The `body` information is parsed and used in the application. The
`dailyRoomProperties` are currently not handled.
- Added detailed latency logging to `UserBotLatencyLogObserver`, capturing
average response time between user stop and bot start, as well as minimum and
maximum response latency.
- Added Chinese, Japanese, Korean word timestamp support to
`CartesiaTTSService`.
- Added `region` parameter to `GladiaSTTService`. Accepted values: eu-west
(default), us-west.
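The `/start` request body shown above for `DailyTransport` can be built from Python along these lines. This is a minimal sketch rather than the runner's own client code: the URL (`http://localhost:7860/start`) and the helper name `build_start_request` are assumptions for illustration, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: the development runner is reachable at this address; adjust as needed.
START_URL = "http://localhost:7860/start"


def build_start_request(custom_body: dict, url: str = START_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the runner's /start endpoint."""
    payload = {
        "createDailyRoom": True,
        # Per the note above, dailyRoomProperties are currently not handled.
        "dailyRoomProperties": {"start_video_off": True},
        # Custom data that is passed through to the application.
        "body": custom_body,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_start_request({"custom_data": "value"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually send it, pass `request` to `urllib.request.urlopen()` while the runner is up.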
### Changed
- System frames are now queued. Before, system frames could be generated from
any task and would not guarantee any order which was causing undesired
behavior. Also, it was possible to get into some rare recursion issues because
of the way system frames were executed (they were executed in-place, meaning
calling `push_frame()` would finish after the system frame traversed all the
pipeline). This makes system frames more deterministic.
- Changed the default model for both `ElevenLabsTTSService` and
`ElevenLabsHttpTTSService` to `eleven_turbo_v2_5`. The rationale for this
change is that the Turbo v2.5 model exhibits the most stable voice quality
along with very low latency TTFB; latencies are on par with the Flash v2.5
model. Also, the Turbo v2.5 model outputs word/timestamp alignment data with
correct spacing.
- The development runner's `/connect` and `/start` endpoints now both return
`dailyRoom` and `dailyToken` in place of the previous `room_url` and `token`.
- Updated the `pipecat.runner.daily` utility to only take `DAILY_API_URL` and
`DAILY_SAMPLE_ROOM_URL` environment variables instead of argparsing `-u` and
`-k`, respectively.
- Updated `daily-python` to 0.19.6.
- Changed `TavusVideoService` to send audio or video frames only after the
transport is ready, preventing warning messages at startup.
- The development runner now strips any provided protocol (e.g. https://) from
the proxy address and issues a warning. It also strips trailing `/`.
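The proxy-address cleanup described in the last item can be illustrated with a small helper. This is a sketch of the described behavior only, not the runner's actual implementation; the function name `normalize_proxy_address` and the warning text are hypothetical.

```python
import re
import warnings


def normalize_proxy_address(proxy: str) -> str:
    """Return the proxy address without a leading scheme or trailing slashes."""
    cleaned = proxy.strip()
    # Match a URL scheme such as "https://" or "wss://" at the start.
    match = re.match(r"^[A-Za-z][A-Za-z0-9+.-]*://", cleaned)
    if match:
        warnings.warn(f"Stripping protocol '{match.group(0)}' from proxy address")
        cleaned = cleaned[match.end():]
    # Trailing "/" characters are dropped as well.
    return cleaned.rstrip("/")


print(normalize_proxy_address("https://example.ngrok.io/"))
print(normalize_proxy_address("example.ngrok.io"))
```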
### Deprecated
- In `pipecat.runner.daily`, the `configure_with_args()` function is
deprecated. Use the `configure()` function instead.
- The development runner's `/connect` endpoint is deprecated and will be
removed in a future version. Use the `/start` endpoint in its place. In the
meantime, both endpoints work and deliver equivalent functionality.
### Fixed
- Fixed a `DailyTransport` issue that would result in an unhandled
`concurrent.futures.CancelledError` when a future is cancelled.
- Fixed a `RivaSTTService` issue that would result in an unhandled
`concurrent.futures.CancelledError` when a future is cancelled while reading
audio chunks from the incoming audio stream.
- Fixed an issue in the `BaseOutputTransport`, mainly reproducible with
`FastAPIWebsocketOutputTransport` when the audio mixer was enabled, where the
loop could consume 100% CPU by continuously returning without delay, preventing
other asyncio tasks (such as cancellation or shutdown signals) from being
processed.
- Fixed an issue where `BotStartedSpeakingFrame` and `BotStoppedSpeakingFrame`
were not emitted when using `TavusVideoService` or `HeyGenVideoService`.
- Fixed an issue in `LiveKitTransport` where empty `AudioRawFrame`s were pushed
down the pipeline. This resulted in warnings by the STT processor.
- Fixed `PiperTTSService` to send text as a JSON object in the request body,
resolving compatibility with Piper's HTTP API.
- Fixed an issue with the `TavusVideoService` where an error was thrown due to
missing transcription callbacks.
- Fixed an issue in `SpeechmaticsSTTService` where the `user_id` was set to
`None` when diarization is not enabled.
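The 100% CPU issue noted above for `BaseOutputTransport` is an instance of a general asyncio pitfall: a loop that never awaits cannot be cancelled and starves other tasks. The toy example below illustrates the idea under stated assumptions; it does not reproduce Pipecat's code, and `drain`, `idle_delay`, and the queue contents are all hypothetical.

```python
import asyncio


async def drain(queue: asyncio.Queue, sent: list, idle_delay: float = 0.01) -> None:
    """Forward queued items; sleep briefly when idle so other tasks can run."""
    while True:
        try:
            item = queue.get_nowait()
        except asyncio.QueueEmpty:
            # Without this await the loop would spin without yielding to the
            # event loop, so cancellation would never be delivered.
            await asyncio.sleep(idle_delay)
            continue
        sent.append(item)


async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    sent: list = []
    task = asyncio.create_task(drain(queue, sent))
    queue.put_nowait(b"chunk-1")
    queue.put_nowait(b"chunk-2")
    await asyncio.sleep(0.05)  # let the loop drain the queue and go idle
    task.cancel()  # delivered at the next await point inside drain()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return sent


result = asyncio.run(main())
print(result)
```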
### Performance
- Fixed an issue in `TaskObserver` (a proxy to all observers) that was degrading
global performance.
### Other
- Added `07aa-interruptible-soniox.py`, `07ab-interruptible-inworld-http.py`,
`07ac-interruptible-asyncai.py` and `07ac-interruptible-asyncai-http.py`
release evals.
## [0.0.77] - 2025-07-31
### Added
- Added `InputTextRawFrame` frame type to handle user text input with Gemini
Multimodal Live.
- Added `HeyGenVideoService`. This is an integration for HeyGen Interactive
Avatar. A video service that handles audio streaming and requests HeyGen to
generate avatar video responses. (see https://www.heygen.com/)
- Added the ability to switch voices to `RimeTTSService`.
- Added unified development runner for building voice AI bots across multiple
transports
- `pipecat.runner.run` FastAPI-based development server with automatic bot
discovery
- `pipecat.runner.types` Runner session argument types
(`DailyRunnerArguments`, `SmallWebRTCRunnerArguments`,
`WebSocketRunnerArguments`)
- `pipecat.runner.utils.create_transport()` Factory function for creating
transports from session arguments
- `pipecat.runner.daily` and `pipecat.runner.livekit` Configuration
utilities for Daily and LiveKit setups
- Support for all transport types: Daily, WebRTC, Twilio, Telnyx, Plivo
- Automatic telephony provider detection and serializer configuration
- ESP32 WebRTC compatibility with SDP munging
- Environment detection (`ENV=local`) for conditional features
- Added Async.ai TTS integration (https://async.ai/)
- `AsyncAITTSService` WebSocket-based streaming TTS with interruption
support
- `AsyncAIHttpTTSService` HTTP-based streaming TTS service
- Example scripts:
- `examples/foundational/07ac-interruptible-asyncai.py` (WebSocket demo)
- `examples/foundational/07ac-interruptible-asyncai-http.py` (HTTP demo)
- Added `transcription_bucket` params support to the `DailyRESTHelper`.
- Added a new TTS service, `InworldTTSService`. This service provides
low-latency, high-quality speech generation using Inworld's streaming API.
- Added a new field `handle_sigterm` to `PipelineRunner`. It defaults to
`False`. This field handles SIGTERM signals. The `handle_sigint` field still
defaults to `True`, but now it handles only SIGINT signals.
- Added foundational example `14u-function-calling-ollama.py` for Ollama
function calling.
- Added `LocalSmartTurnAnalyzerV2`, which supports local on-device inference
with the new `smart-turn-v2` turn detection model.
- Added `set_log_level` to `DailyTransport`, allowing setting the logging level
for Daily's internal logging system.
- Added `on_transcription_stopped` and `on_transcription_error` to Daily
callbacks.
### Changed
- Changed the default `url` for `NeuphonicTTSService` to
`wss://api.neuphonic.com` as it provides better global performance. You can
set the URL to other URLs, such as the previous default:
`wss://eu-west-1.api.neuphonic.com`.
- Updated `daily-python` to 0.19.5.
- `STTMuteFilter` now pushes the `STTMuteFrame` upstream and downstream, to
allow for more flexible `STTMuteFilter` placement.
- Play delayed messages from `ElevenLabsTTSService` if they still belong to the
current context.
- Dependency compatibility improvements: Relaxed version constraints for core
dependencies to support broader version ranges while maintaining stability:
- `aiohttp`, `Markdown`, `nltk`, `numpy`, `Pillow`, `pydantic`, `openai`,
`numba`: Now support up to the next major version (e.g. `numpy>=1.26.4,<3`)
- `pyht`: Relaxed to `>=0.1.6` to resolve `grpcio` conflicts with
`nvidia-riva-client`
- `fastapi`: Updated to support versions `>=0.115.6,<0.117.0`
- `torch`/`torchaudio`: Changed from exact pinning (`==2.5.0`) to compatible
range (`~=2.5.0`)
- `aws_sdk_bedrock_runtime`: Added Python 3.12+ constraint via environment
marker
- `numba`: Reduced minimum version to `0.60.0` for better compatibility
- Changed `NeuphonicHttpTTSService` to use a POST based request instead of the
`pyneuphonic` package. This removes a package requirement, allowing Neuphonic
to work with more services.
- Updated `ElevenLabsTTSService` to handle the case where
`allow_interruptions=False`. Now, when interruptions are disabled, the same
context ID will be used throughout the conversation.
- Updated the `deepgram` optional dependency to 4.7.0, which downgrades the
`tasks cancelled error` to a debug log. This removes the log from appearing
in Pipecat logs upon leaving.
- Upgraded the `websockets` implementation to the new asyncio implementation.
Along with this change, we're updating support for versions >=13.1.0 and
<15.0.0. All services have been updated to use the asyncio implementation.
- Updated `MiniMaxHttpTTSService` with a `base_url` arg where you can specify
the Global endpoint (default) or Mainland China.
- Replaced regex-based sentence detection in `match_endofsentence` with NLTK's
punkt_tab tokenizer for more reliable sentence boundary detection.
- Changed the `livekit` optional dependency for `tenacity` to
`tenacity>=8.2.3,<10.0.0` in order to support the `google-genai` package.
- For `LmntTTSService`, changed the default `model` to `blizzard`, LMNT's
recommended model.
- Updated `SpeechmaticsSTTService`:
- Added support for additional diarization options.
- Added foundational example `07a-interruptible-speechmatics-vad.py`, which
uses VAD detection provided by `SpeechmaticsSTTService`.
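The relaxed version ranges in the dependency compatibility item above can be read as in the following minimal sketch. It uses only the standard library, assumes plain three-component numeric versions (no pre-release tags), and the helper names `parse`, `in_range`, and `compatible_release` are hypothetical; real resolvers such as pip or uv follow the full PEP 440 rules.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '2.5.1' into (2, 5, 1). Pre-release tags are not handled."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """True when lower <= version < upper, as in `numpy>=1.26.4,<3`."""
    return parse(lower) <= parse(version) < parse(upper)


def compatible_release(version: str, base: str) -> bool:
    """Approximate `~=base`: at least `base`, same leading components.

    For example, `~=2.5.0` behaves like `>=2.5.0, ==2.5.*`.
    """
    base_parts = parse(base)
    prefix = base_parts[:-1]
    candidate = parse(version)
    return candidate >= base_parts and candidate[: len(prefix)] == prefix


# "Up to the next major version": any 1.x or 2.x release from 1.26.4 on.
numpy_ok = in_range("2.3.1", "1.26.4", "3")
numpy_too_new = in_range("3.0.0", "1.26.4", "3")

# Compatible range for torch: patch releases of 2.5 only.
torch_patch_ok = compatible_release("2.5.1", "2.5.0")
torch_minor_ok = compatible_release("2.6.0", "2.5.0")
```

Under these assumptions, `numpy_ok` and `torch_patch_ok` are true while `numpy_too_new` and `torch_minor_ok` are false, which is the difference between an upper bound at the next major version and a compatible-release range.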
### Fixed
- Fixed a `LLMUserResponseAggregator` issue where interruptions were not being
handled properly.
- Fixed `PiperTTSService` to work with newer Piper GPL.
- Fixed a race condition in `FastAPIWebsocketClient` that occurred when
attempting to send a message while the client was disconnecting.
- Fixed an issue in `GoogleLLMService` where interruptions did not work when an
interruption strategy was used.
- Fixed an issue in the `TranscriptProcessor` where newline characters could
cause the transcript output to be corrupted (e.g. missing all spaces).
- Fixed an issue in `AudioBufferProcessor` when using `SmallWebRTCTransport`
where, if the microphone was muted, track timing was not respected.
- Fixed an error that occurs when pushing an `LLMMessagesFrame`. Only some LLM
services, like Grok, are impacted by this issue. The fix is to remove the
optional `name` property that was being added to the message.
- Fixed an issue in `AudioBufferProcessor` that caused garbled audio when
`enable_turn_audio` was enabled and audio resampling was required.
- Fixed a dependency issue for uv users where an `llvmlite` version required
Python 3.9.
- Fixed an issue in `MiniMaxHttpTTSService` where the `pitch` param was the
incorrect type.
- Fixed an issue with OpenTelemetry tracing where the `enable_tracing` flag did
not disable the internal tracing decorator functions.
- Fixed an issue in `OLLamaLLMService` where kwargs were not passed correctly
to the parent class.
- Fixed an issue in `ElevenLabsTTSService` where the word/timestamp pairs were
calculating word boundaries incorrectly.
- Fixed an issue where, in some edge cases, the
`EmulateUserStartedSpeakingFrame` could be created even if we didn't have a
transcription.
- Fixed an issue in `GoogleLLMContext` where it would inject the
`system_message` as a "user" message into cases where it was not meant to;
it was only meant to do that when there were no "regular" (non-function-call)
messages in the context, to ensure that inference would run properly.
- Fixed an issue in `LiveKitTransport` where the `on_audio_track_subscribed`
event was never emitted.
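The `LLMMessagesFrame` fix above removes the optional `name` property from messages before they reach the LLM service. A minimal sketch of that idea follows, assuming messages are plain dictionaries in the usual chat format; the function `strip_name` is hypothetical and is not Pipecat's actual implementation.

```python
from typing import Any


def strip_name(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the messages without the optional `name` key.

    The input list and its dictionaries are left unmodified.
    """
    return [
        {key: value for key, value in message.items() if key != "name"}
        for message in messages
    ]


original = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!", "name": "caller"},
]
cleaned = strip_name(original)
```

Copying rather than mutating keeps the caller's context intact, which matters if the same message list is shared with services that do accept `name`.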
### Other
- Added new quickstart demos:
- examples/quickstart: voice AI bot quickstart
- examples/client-server-web: client/server starter example
- examples/phone-bot-twilio: Twilio starter example
- Removed most of the examples from the pipecat repo. Examples can now be
found in: https://github.com/pipecat-ai/pipecat-examples.
## [0.0.76] - 2025-07-11
### Added
- Added `SpeechControlParamsFrame`, a new `SystemFrame` that notifies
downstream processors of the VAD and Turn analyzer params. This frame is
pushed by the `BaseInputTransport` at Start and any time a
`VADParamsUpdateFrame` is received.
### Changed
- Two package dependencies have been updated:
- `numpy` now supports 1.26.0 and newer
- `transformers` now supports 4.48.0 and newer
### Fixed
- Fixed an issue with RTVI's handling of `append-to-context`.
- Fixed an issue where using audio input with a sample rate requiring resampling
could result in empty audio being passed to STT services, causing errors.
- Fixed the VAD analyzer to process the full audio buffer as long as it contains
more than the minimum required bytes per iteration, instead of only analyzing
the first chunk.
- Fixed an issue in `ParallelPipeline` that caused errors when attempting to
drain the queues.
- Fixed an issue with emulated VAD timeout inconsistency in
`LLMUserContextAggregator`. Previously, emulated VAD scenarios (where
transcription is received without VAD detection) used a hardcoded
`aggregation_timeout` (default 0.5s) instead of matching the VAD's
`stop_secs` parameter (default 0.8s). This created different user experiences
between real VAD and emulated VAD scenarios. Now, emulated VAD timeouts
automatically synchronize with the VAD's `stop_secs` parameter.
- Fixed a pipeline freeze when using AWS Nova Sonic, which would occur if the
user started speaking early, while the bot was still working through
`trigger_assistant_response()`.
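The VAD analyzer fix above (processing the full audio buffer rather than only the first chunk) can be pictured with the sketch below. It is an illustration only: `split_vad_chunks` and `min_bytes` are hypothetical names, and treating "enough bytes" as `>= min_bytes` is an assumption, not the analyzer's confirmed behavior.

```python
def split_vad_chunks(buffer: bytes, min_bytes: int) -> tuple[list[bytes], bytes]:
    """Split `buffer` into as many full `min_bytes` chunks as it holds.

    Returns the list of full chunks plus the leftover bytes, which a caller
    would keep until more audio arrives.
    """
    if min_bytes <= 0:
        raise ValueError("min_bytes must be positive")
    chunks: list[bytes] = []
    offset = 0
    # Keep consuming while a full chunk is still available, instead of
    # stopping after the first one.
    while len(buffer) - offset >= min_bytes:
        chunks.append(buffer[offset : offset + min_bytes])
        offset += min_bytes
    return chunks, buffer[offset:]


chunks, leftover = split_vad_chunks(b"abcdefgh", 3)
```

Here the eight-byte buffer yields two full three-byte chunks and two leftover bytes; analyzing only the first chunk would have ignored the second one.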
## [0.0.75] - 2025-07-08 [YANKED]
**This release has been yanked due to resampling issues affecting audio output
quality and critical bugs impacting `ParallelPipelines` functionality.**
**Please upgrade to version 0.0.76 or later.**
### Added
@@ -66,7 +465,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Remove unnecessary push task in each `FrameProcessor`.
## [0.0.74] - 2025-07-03
## [0.0.74] - 2025-07-03 [YANKED]
**This release has been yanked due to resampling issues affecting audio output
quality and critical bugs impacting `ParallelPipelines` functionality.**
**Please upgrade to version 0.0.76 or later.**
### Added


@@ -1,40 +0,0 @@
# setup
FROM python:3.11.5
WORKDIR /app
COPY requirements.txt /app
COPY *.py /app
COPY pyproject.toml /app
COPY src/ /app/src/
COPY examples/ /app/examples/
WORKDIR /app
RUN ls --recursive /app/
RUN pip3 install --upgrade -r requirements.txt
RUN python -m build .
RUN pip3 install .
RUN pip3 install gunicorn
# If running on Ubuntu, Azure TTS requires some extra config
# https://learn.microsoft.com/en-us/azure/ai-services/speech-service/quickstarts/setup-platform?pivots=programming-language-python&tabs=linux%2Cubuntu%2Cdotnetcli%2Cdotnet%2Cjre%2Cmaven%2Cnodejs%2Cmac%2Cpypi
RUN wget -O - https://www.openssl.org/source/openssl-1.1.1w.tar.gz | tar zxf -
WORKDIR openssl-1.1.1w
RUN ./config --prefix=/usr/local
RUN make -j $(nproc)
RUN make install_sw install_ssldirs
RUN ldconfig -v
ENV SSL_CERT_DIR=/etc/ssl/certs
#ENV LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
RUN apt clean
RUN apt-get update
RUN apt-get -y install build-essential libssl-dev ca-certificates libasound2 wget
ENV PYTHONUNBUFFERED=1
WORKDIR /app
EXPOSE 8000
# run
CMD ["gunicorn", "--workers=2", "--log-level", "debug", "--chdir", "examples/server", "--capture-output", "daily-bot-manager:app", "--bind=0.0.0.0:8000"]

README.md

@@ -8,7 +8,7 @@
**Pipecat** is an open-source Python framework for building real-time voice and multimodal conversational agents. Orchestrate audio and video, AI services, different transports, and conversation pipelines effortlessly—so you can focus on what makes your agent unique.
> Want to dive right in? [Install Pipecat](https://docs.pipecat.ai/getting-started/installation) then try the [quickstart](https://docs.pipecat.ai/getting-started/quickstart).
> Want to dive right in? Try the [quickstart](https://docs.pipecat.ai/getting-started/quickstart).
## 🚀 What You Can Build
@@ -31,11 +31,11 @@
## 🎬 See it in action
<p float="left">
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/simple-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/storytelling-chatbot/image.png" width="400" /></a>
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/simple-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/storytelling-chatbot/image.png" width="400" /></a>
<br/>
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/translation-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/moondream-chatbot/image.png" width="400" /></a>
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/translation-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/moondream-chatbot/image.png" width="400" /></a>
</p>
## 📱 Client SDKs
@@ -51,98 +51,123 @@ You can connect to Pipecat from any platform using our official SDKs:
## 🧩 Available services
| Category | Services |
| ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Parakeet (NVIDIA)](https://docs.pipecat.ai/server/services/stt/parakeet), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [FastPitch (NVIDIA)](https://docs.pipecat.ai/server/services/tts/fastpitch), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx) |
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
| Category | Services |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx) |
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
## ⚡ Getting started
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when youre ready.
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you're ready.
```shell
# Install the module
pip install pipecat-ai
1. Install uv
# Set up your environment
cp dot-env.template .env
```
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
To keep things lightweight, only the core framework is included by default. If you need support for third-party AI services, you can add the necessary dependencies with:
> **Need help?** Refer to the [uv install documentation](https://docs.astral.sh/uv/getting-started/installation/).
```shell
pip install "pipecat-ai[option,...]"
```
2. Install the module
```bash
# For new projects
uv init my-pipecat-app
cd my-pipecat-app
uv add pipecat-ai
# Or for existing projects
uv add pipecat-ai
```
3. Set up your environment
```bash
cp env.example .env
```
4. To keep things lightweight, only the core framework is included by default. If you need support for third-party AI services, you can add the necessary dependencies with:
```bash
uv add "pipecat-ai[option,...]"
```
> **Using pip?** You can still use `pip install pipecat-ai` and `pip install "pipecat-ai[option,...]"` to get set up.
## 🧪 Code examples
- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
- [Example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development
- [Example apps](https://github.com/pipecat-ai/pipecat-examples) — complete applications that you can use as starting points for development
## 🛠️ Hacking on the framework itself
## 🛠️ Contributing to the framework
1. Set up a virtual environment before following these instructions. From the root of the repo:
### Prerequisites
```shell
python3 -m venv venv
source venv/bin/activate
**Python Version:** 3.10+
### Setup Steps
1. Clone the repository and navigate to it:
```bash
git clone https://github.com/pipecat-ai/pipecat.git
cd pipecat
```
2. Install the development dependencies:
2. Install development and testing dependencies:
```shell
pip install -r dev-requirements.txt
```bash
uv sync --group dev --all-extras --no-extra gstreamer --no-extra krisp --no-extra local
```
3. Install the git pre-commit hooks (these help ensure your code follows project rules):
3. Install the git pre-commit hooks:
```shell
pre-commit install
```bash
uv run pre-commit install
```
4. Install the `pipecat-ai` package locally in editable mode:
### Python 3.13+ Compatibility
```shell
pip install -e .
```
Some features require PyTorch, which doesn't yet support Python 3.13+. Install using:
> The `-e` or `--editable` option allows you to modify the code without reinstalling.
```bash
uv sync --group dev --all-extras \
--no-extra gstreamer \
--no-extra krisp \
--no-extra local \
--no-extra local-smart-turn \
--no-extra mlx-whisper \
--no-extra moondream \
--no-extra ultravox
```
5. Include optional dependencies as needed. For example:
> **Tip:** For full compatibility, use Python 3.12: `uv python pin 3.12`
```shell
pip install -e ".[daily,deepgram,cartesia,openai,silero]"
```
6. (Optional) If you want to use this package from another directory:
```shell
pip install "path_to_this_repo[option,...]"
```
> **Note**: Some extras (local, gstreamer) require system dependencies. See documentation if you encounter build errors.
### Running tests
Install the test dependencies:
To run all tests, from the root directory:
```shell
pip install -r test-requirements.txt
```bash
uv run pytest
```
From the root directory, run:
Run a specific test suite:
```shell
pytest
```bash
uv run pytest tests/test_name.py
```
### Setting up your editor


@@ -1,13 +0,0 @@
build~=1.2.2
coverage~=7.9.1
grpcio-tools~=1.67.1
pip-tools~=7.4.1
pre-commit~=4.2.0
pyright~=1.1.402
pytest~=8.4.1
pytest-asyncio~=1.0.0
pytest-aiohttp==1.1.0
ruff~=0.12.1
setuptools~=78.1.1
setuptools_scm~=8.3.1
python-dotenv~=1.1.1


@@ -1,10 +1,27 @@
#!/bin/bash
# Build docs using uv
echo "Installing dependencies with uv..."
uv sync --group docs --all-extras --no-extra krisp --no-extra gstreamer --no-extra ultravox --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
# Check if sphinx-build is available
if ! uv run sphinx-build --version &> /dev/null; then
echo "Error: sphinx-build is not available" >&2
exit 1
fi
# Clean previous build
rm -rf _build
echo "Building documentation..."
# Build docs matching ReadTheDocs configuration
sphinx-build -b html -d _build/doctrees . _build/html -W --keep-going
uv run sphinx-build -b html -d _build/doctrees . _build/html -W --keep-going
# Open docs (MacOS)
open _build/html/index.html
if [ $? -eq 0 ]; then
echo "Documentation built successfully!"
# Open docs (MacOS)
open _build/html/index.html
else
echo "Documentation build failed!" >&2
exit 1
fi


@@ -1,4 +1,5 @@
import logging
import os
import sys
from datetime import datetime
from pathlib import Path
@@ -28,6 +29,7 @@ extensions = [
suppress_warnings = [
"autodoc.mocked_object",
"toc.not_included",
]
# Napoleon settings
@@ -45,84 +47,40 @@ autodoc_default_options = {
# Mock imports for optional dependencies
autodoc_mock_imports = [
"riva",
"livekit",
"pyht", # Base PlayHT package
"pyht.async_client", # PlayHT specific imports
"pyht.client",
"pyht.protos",
"pyht.protos.api_pb2",
"pipecat_ai_playht", # PlayHT wrapper
"aiortc",
"aiortc.mediastreams",
"cv2",
"av",
"pyneuphonic",
"mem0",
"mlx_whisper",
"anthropic",
"assemblyai",
"boto3",
"azure",
"cartesia",
"deepgram",
"elevenlabs",
"fal",
"gladia",
"google",
"krisp",
"langchain",
"lmnt",
"noisereduce",
"openpipe",
"simli",
"soundfile",
# Krisp - has build issues on some platforms
"pipecat_ai_krisp",
"pyaudio",
"krisp",
# System-specific GUI libraries
"_tkinter",
"tkinter",
"daily",
"daily_python",
# Moondream dependencies
"torch",
"transformers",
"intel_extension_for_pytorch",
# Ultravox dependencies
"huggingface_hub",
# Platform-specific audio libraries (if needed)
"gi",
"gi.require_version",
"gi.repository",
# OpenCV - sometimes has import issues during docs build
"cv2",
# Heavy ML packages excluded from ReadTheDocs
# ultravox dependencies
"vllm",
"vllm.engine.arg_utils",
# local-smart-turn dependencies
"coremltools",
"coremltools.models",
"coremltools.models.MLModel",
"torch",
"torch.nn",
"torch.nn.functional",
"torchaudio",
# moondream dependencies
"transformers",
"transformers.AutoTokenizer",
# Langchain dependencies
"langchain_core",
"langchain_core.messages",
"langchain_core.runnables",
"langchain_core.messages.AIMessageChunk",
"langchain_core.runnables.Runnable",
# LiveKit dependencies
"livekit",
"livekit.rtc",
"livekit_api",
"livekit_protocol",
"tenacity",
"tenacity.retry",
"tenacity.stop_after_attempt",
"tenacity.wait_exponential",
"rtc",
"rtc.Room",
"rtc.RoomOptions",
"rtc.AudioSource",
"rtc.LocalAudioTrack",
"rtc.TrackPublishOptions",
"rtc.TrackSource",
"rtc.AudioStream",
"rtc.AudioFrameEvent",
"rtc.AudioFrame",
"rtc.Track",
"rtc.TrackKind",
"rtc.RemoteParticipant",
"rtc.RemoteTrackPublication",
"rtc.DataPacket",
# Riva dependencies
"transformers.AutoFeatureExtractor",
"AutoFeatureExtractor",
"timm",
"einops",
"intel_extension_for_pytorch",
"huggingface_hub",
# riva dependencies
"riva",
"riva.client",
"riva.client.Auth",
@@ -132,57 +90,14 @@ autodoc_mock_imports = [
"riva.client.AudioEncoding",
"riva.client.proto.riva_tts_pb2",
"riva.client.SpeechSynthesisService",
# Local CoreML Smart Turn dependencies
"coremltools",
"coremltools.models",
"coremltools.models.MLModel",
"torch",
"torch.nn",
"torch.nn.functional",
"transformers",
"transformers.AutoFeatureExtractor",
# Also add specific classes that are imported
"AutoFeatureExtractor",
# Sentry dependencies
"sentry_sdk",
# AWS Nova Sonic dependencies
"aws_sdk_bedrock_runtime",
"aws_sdk_bedrock_runtime.client",
"aws_sdk_bedrock_runtime.config",
"aws_sdk_bedrock_runtime.models",
"smithy_aws_core",
"smithy_aws_core.credentials_resolvers",
"smithy_aws_core.credentials_resolvers.static",
"smithy_aws_core.identity",
"smithy_core",
"smithy_core.aio",
"smithy_core.aio.eventstream",
# MCP dependencies (you may already have these)
"mcp",
"mcp.client",
"mcp.client.session_group",
"mcp.client.sse",
"mcp.client.stdio",
"mcp.ClientSession",
"mcp.StdioServerParameters",
# gstreamer
"gi",
"gi.require_version",
"gi.repository",
# Protobuf mocks
"pipecat.frames.protobufs.frames_pb2",
"pipecat.serializers.protobuf",
"google.protobuf",
"google.protobuf.descriptor",
"google.protobuf.descriptor_pool",
"google.protobuf.runtime_version",
"google.protobuf.symbol_database",
"google.protobuf.internal.builder",
# MLX dependencies (Apple Silicon specific)
"mlx",
"mlx_whisper", # Note: might need underscore format too
]
# HTML output settings
html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"]
html_static_path = ["_static"] if os.path.exists("_static") else []
autodoc_typehints = "signature" # Show type hints in the signature only, not in the docstring
html_show_sphinx = False
@@ -201,6 +116,7 @@ def import_core_modules():
"pipecat.clocks",
"pipecat.metrics",
"pipecat.observers",
"pipecat.runner",
"pipecat.serializers",
"pipecat.sync",
"pipecat.transcriptions",


@@ -14,7 +14,7 @@ Quick Links
* `Join our Community <https://discord.gg/pipecat>`_
.. toctree::
:maxdepth: 3
:maxdepth: 2
:caption: API Reference
:hidden:
@@ -26,6 +26,7 @@ Quick Links
Observers <api/pipecat.observers>
Pipeline <api/pipecat.pipeline>
Processors <api/pipecat.processors>
Runner <api/pipecat.runner>
Serializers <api/pipecat.serializers>
Services <api/pipecat.services>
Sync <api/pipecat.sync>


@@ -1,55 +0,0 @@
# Sphinx dependencies
sphinx>=8.1.3
sphinx-rtd-theme
sphinx-markdown-builder
sphinx-autodoc-typehints
toml
# Install all extras individually to ensure they're properly resolved
pipecat-ai[anthropic]
pipecat-ai[assemblyai]
pipecat-ai[aws]
pipecat-ai[azure]
pipecat-ai[cartesia]
pipecat-ai[cerebras]
pipecat-ai[deepseek]
pipecat-ai[daily]
pipecat-ai[deepgram]
pipecat-ai[elevenlabs]
pipecat-ai[fal]
pipecat-ai[fireworks]
pipecat-ai[fish]
pipecat-ai[gladia]
pipecat-ai[google]
pipecat-ai[grok]
pipecat-ai[groq]
# pipecat-ai[krisp] # Mocked
pipecat-ai[koala]
# pipecat-ai[langchain] # Mocked
# pipecat-ai[livekit] # Mocked
pipecat-ai[lmnt]
pipecat-ai[local]
# pipecat-ai[local-smart-turn] # Mocked
# pipecat-ai[mem0] # Mocked
# pipecat-ai[mlx-whisper] # Mocked
# pipecat-ai[moondream] # Mocked
pipecat-ai[nim]
# pipecat-ai[neuphonic] # Mocked
pipecat-ai[noisereduce]
pipecat-ai[openai]
# pipecat-ai[openpipe]
# pipecat-ai[playht] # Mocked due to grpcio conflict with riva
pipecat-ai[qwen]
pipecat-ai[remote-smart-turn]
# pipecat-ai[riva] # Mocked
pipecat-ai[sambanova]
pipecat-ai[silero]
pipecat-ai[simli]
pipecat-ai[soundfile]
pipecat-ai[speechmatics]
pipecat-ai[tavus]
pipecat-ai[together]
# pipecat-ai[ultravox] # Mocked
# pipecat-ai[webrtc] # Mocked
pipecat-ai[websocket]
pipecat-ai[whisper]

View File

@@ -1,6 +1,10 @@
# Anthropic
ANTHROPIC_API_KEY=...
# Async
ASYNCAI_API_KEY=...
ASYNCAI_VOICE_ID=...
# AWS
AWS_SECRET_ACCESS_KEY=...
AWS_ACCESS_KEY_ID=...
@@ -25,6 +29,9 @@ CARTESIA_API_KEY=...
DAILY_API_KEY=...
DAILY_SAMPLE_ROOM_URL=https://...
# Deepgram
DEEPGRAM_API_KEY=...
# ElevenLabs
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
@@ -40,6 +47,13 @@ FIREWORKS_API_KEY=...
# Gladia
GLADIA_API_KEY=...
GLADIA_REGION=...
# Google
GOOGLE_API_KEY=...
GOOGLE_CLOUD_PROJECT_ID=...
GOOGLE_TEST_CREDENTIALS=...
GOOGLE_VERTEX_TEST_CREDENTIALS=...
# LMNT
LMNT_API_KEY=...
@@ -76,6 +90,9 @@ GROQ_API_KEY=...
# Grok
GROK_API_KEY=...
# Inworld
INWORLD_API_KEY=...
# Together.ai
TOGETHER_API_KEY=...
@@ -109,12 +126,17 @@ MINIMAX_GROUP_ID=...
# Sarvam AI
SARVAM_API_KEY=...
# Soniox
SONIOX_API_KEY=
# Speechmatics
SPEECHMATICS_API_KEY=...
# SambaNova
SAMBANOVA_API_KEY=...
# Sentry
SENTRY_DSN=...
# Heygen
HEYGEN_API_KEY=...

View File

@@ -1,88 +1,31 @@
# Pipecat Examples
This directory contains examples to help you learn how to build with Pipecat.
# Pipecat &mdash; Examples
## Getting Started
## Foundational snippets
Small snippets that build on each other, introducing one or two concepts at a time.
New to Pipecat? Start here:
➡️ [Take a look](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational)
- **[Quickstart](quickstart/)** - Get your first voice AI bot running in 5 minutes _(coming soon)_
- **[Client/Server Web](client-server-web/)** - Learn to build web applications with Pipecat's client SDKs _(coming soon)_
- **[Phone Bot with Twilio](phone-bot-twilio/)** - Connect your bot to a phone number _(coming soon)_
## Chatbot examples
Collection of self-contained real-time voice and video AI demo applications built with Pipecat.
## Foundational Examples
### Quickstart
Single-file examples that introduce core Pipecat concepts one at a time. These examples:
Each project has its own set of dependencies and configuration variables. They intentionally avoid shared code across projects &mdash; you can grab whichever demo folder you want to work with as a starting point.
- Build on each other progressively
- Focus on specific features or integrations
- Are used for testing with every Pipecat release
We recommend you start with a virtual environment:
See the **[Foundational Examples README](foundational/)** for the complete list.
```shell
cd pipecat-ai/examples/simple-chatbot
## More Advanced Examples
python -m venv venv
Ready to explore complex use cases? Visit **[pipecat-examples](https://github.com/pipecat-ai/pipecat-examples)** for:
source venv/bin/activate
pip install -r requirements.txt
```
Next, follow the steps in the README for each demo.
Make sure you `pip install -r requirements.txt` for each demo project, so you can be sure to have the necessary service dependencies that extend the functionality of Pipecat. You can read more about the framework architecture [here](https://github.com/pipecat-ai/pipecat/tree/main/docs).
## Projects:
| Project | Description | Services |
|----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------|
| [Simple Chatbot](simple-chatbot) | Basic voice-driven conversational bot. A good starting point for learning the flow of the framework. | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI |
| [Storytelling Chatbot](storytelling-chatbot) | Stitches together multiple third-party services to create a collaborative storytime experience. | Deepgram, ElevenLabs, OpenAI, Fal, Daily, Custom UI |
| [Translation Chatbot](translation-chatbot) | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI |
| [Moondream Chatbot](moondream-chatbot) | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU** | Deepgram, ElevenLabs, OpenAI, Moondream, Daily, Daily Prebuilt UI |
| [Patient intake](patient-intake) | A chatbot that can call functions in response to user input. | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI |
| [Phone Chatbot](phone-chatbot) | A chatbot that connects to PSTN/SIP phone calls, powered by Daily or Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [Twilio Chatbot](twilio-chatbot) | A chatbot that connects to an incoming phone call from Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [studypal](studypal) | A chatbot to have a conversation about any article on the web | |
| [WebSocket Chatbot Server](websocket-server) | A real-time websocket server that handles audio streaming and bot interactions with speech-to-text and text-to-speech capabilities. | Cartesia, Deepgram, OpenAI, Websockets |
> [!IMPORTANT]
> These example projects use Daily as a WebRTC transport and can be joined using their hosted Prebuilt UI.
> It provides a quick way to join a real-time session with your bot and test your ideas without building any frontend code. If you'd like to see an example of a custom UI, try Storybot.
## FAQ
### Deployment
For each of these demos we've included a `Dockerfile`. Out of the box, this should provide everything needed to get the respective demo running on a VM:
```shell
docker build -t username/app:tag .
docker run -p 7860:7860 --env-file ./.env username/app:tag
docker push ...
```
### SSL
If you're working with a custom UI (such as with the Storytelling Chatbot), it's important to ensure your deployment platform supports HTTPS, as accessing user devices such as mics and webcams requires SSL.
If you try to run a custom UI without SSL, you may see an error in the console telling you that `navigator` is undefined, or no devices are available.
### Are these examples production ready?
Yes, kind of.
These demos attempt to keep things simple and are unopinionated regarding environment or scalability.
We're using FastAPI to spawn a subprocess for the bots / agents &mdash; useful for small tests, but not so great for production grade apps with many concurrent users. You can see how this works in each project's `start` endpoint in `server.py`.
Creating virtualized worker pools and on-demand instances is out of scope for these examples, but we hope to add some examples to this repo soon!
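The subprocess-per-bot approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in `server.py`: the helper names `build_bot_command` and `spawn`, the module name `bot`, and the `-u`/`-t` flags are assumptions chosen for the example. It passes the arguments as a list, so no shell is involved in launching the child process.

```python
import subprocess
import sys


def build_bot_command(module: str, room_url: str, token: str) -> list[str]:
    """Build the argv for one bot process (module name and flags are hypothetical)."""
    return [sys.executable, "-m", module, "-u", room_url, "-t", token]


def spawn(cmd: list[str]) -> subprocess.Popen:
    """Start a child process from an argument list, without going through a shell."""
    return subprocess.Popen(cmd)


if __name__ == "__main__":
    # Demo with a harmless child process instead of a real bot module.
    proc = spawn([sys.executable, "-c", "print('bot started')"])
    proc.wait()
    print(build_bot_command("bot", "https://example.daily.co/room", "token"))
```

One process per session keeps the demos simple, but each bot holds its own interpreter and memory, which is why this pattern does not scale to many concurrent users.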
For projects that have CUDA as a requirement, such as Moondream Chatbot, be sure to deploy to a GPU-powered platform (such as [fly.io](https://fly.io) or [Runpod](https://runpod.io)).
## Getting help
➡️ [Join our Discord](https://discord.gg/pipecat)
➡️ [Reach us on Twitter](https://x.com/pipecat_ai)
- Production-ready applications
- Multi-platform client implementations
- Telephony integrations
- Multimodal and creative applications
- Deployment and monitoring examples

View File

@@ -1,45 +0,0 @@
# Bot ready signaling
A simple Pipecat example demonstrating how to handle signaling between the client and the bot,
ensuring that the bot starts sending audio only when the client is available,
thereby avoiding the risk of cutting off the beginning of the audio.
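The idea can be sketched with plain `asyncio`, independent of any transport. This is only an illustration under stated assumptions: `bot_session`, `simulate_client`, and the greeting text are hypothetical names, and the `asyncio.Event` stands in for the "playable" app message that the client sends once it can play audio.

```python
import asyncio


async def bot_session(client_ready: asyncio.Event, spoken: list[str]) -> None:
    """Hold the greeting until the client reports that it can play audio."""
    await client_ready.wait()
    spoken.append("Hello! How can I help?")


async def simulate_client(client_ready: asyncio.Event) -> None:
    """Stand-in for the client's "playable" message arriving a little later."""
    await asyncio.sleep(0.01)
    client_ready.set()


async def main() -> list[str]:
    client_ready = asyncio.Event()
    spoken: list[str] = []
    # The bot task starts first but does nothing until the event is set.
    await asyncio.gather(bot_session(client_ready, spoken), simulate_client(client_ready))
    return spoken


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because the bot waits on the event rather than on a fixed delay, the first words are produced only after the client has signaled readiness.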
## Quick Start
### First, start the bot server:
1. Navigate to the server directory:
```bash
cd server
```
2. Create and activate a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install requirements:
```bash
pip install -r requirements.txt
```
4. Copy env.example to .env and configure:
- Add your API keys
5. Start the server:
```bash
python server.py
```
### Next, connect using the client app:
For client-side setup, refer to the [JavaScript Guide](client/javascript/README.md).
## Important Note
Ensure the bot server is running before using any client implementations.
## Requirements
- Python 3.10+
- Node.js 16+ (for JavaScript)
- Daily API key
- Cartesia API key
- Modern web browser with WebRTC support

View File

@@ -1,27 +0,0 @@
# JavaScript Implementation
Basic implementation using the [Pipecat JavaScript SDK](https://docs.pipecat.ai/client/js/introduction).
## Setup
1. Run the bot server. See the [server README](../../README).
2. Navigate to the `client/javascript` directory:
```bash
cd client/javascript
```
3. Install dependencies:
```bash
npm install
```
4. Run the client app:
```
npm run dev
```
5. Visit http://localhost:5173 in your browser.

View File

@@ -1,34 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AI Chatbot</title>
</head>
<body>
<div class="container">
<div class="status-bar">
<div class="status">
Status: <span id="connection-status">Disconnected</span>
</div>
<div class="controls">
<button id="connect-btn">Connect</button>
<button id="disconnect-btn" disabled>Disconnect</button>
</div>
</div>
<audio id="bot-audio" autoplay></audio>
<div class="debug-panel">
<h3>Debug Info</h3>
<div id="debug-log"></div>
</div>
</div>
<script type="module" src="/src/app.js"></script>
<link rel="stylesheet" href="/src/style.css">
</body>
</html>

File diff suppressed because it is too large

View File

@@ -1,20 +0,0 @@
{
"name": "client",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"devDependencies": {
"vite": "^6.3.5"
},
"dependencies": {
"@daily-co/daily-js": "0.74.0"
}
}

View File

@@ -1,216 +0,0 @@
/**
* Copyright (c) 2024-2025, Daily
*
* SPDX-License-Identifier: BSD 2-Clause License
*/
import Daily from "@daily-co/daily-js";
/**
* ChatbotClient handles the connection and media management for a real-time
* voice interaction with an AI bot.
*/
class ChatbotClient {
constructor() {
// Initialize client state
this.dailyCallObject = null;
this.setupDOMElements();
this.setupEventListeners();
}
/**
* Set up references to DOM elements and create necessary media elements
*/
setupDOMElements() {
// Get references to UI control elements
this.connectBtn = document.getElementById('connect-btn');
this.disconnectBtn = document.getElementById('disconnect-btn');
this.statusSpan = document.getElementById('connection-status');
this.debugLog = document.getElementById('debug-log');
// Create an audio element for bot's voice output
this.botAudio = document.createElement('audio');
this.botAudio.autoplay = true;
this.botAudio.playsInline = true;
document.body.appendChild(this.botAudio);
}
/**
* Set up event listeners for connect/disconnect buttons
*/
setupEventListeners() {
this.connectBtn.addEventListener('click', () => this.connect());
this.disconnectBtn.addEventListener('click', () => this.disconnect());
}
/**
* Add a timestamped message to the debug log
*/
log(message) {
const entry = document.createElement('div');
entry.textContent = `${new Date().toISOString()} - ${message}`;
// Add styling based on message type
if (message.startsWith('User: ')) {
entry.style.color = '#2196F3'; // blue for user
} else if (message.startsWith('Bot: ')) {
entry.style.color = '#4CAF50'; // green for bot
}
this.debugLog.appendChild(entry);
this.debugLog.scrollTop = this.debugLog.scrollHeight;
console.log(message);
}
/**
* Update the connection status display
*/
updateStatus(status) {
this.statusSpan.textContent = status;
this.log(`Status: ${status}`);
}
handleEventToConsole (evt) {
this.log(`Received event: ${evt.action}`);
};
/**
* Set up listeners for track events (start/stop)
* This handles new tracks being added during the session
*/
setupTrackListeners() {
if (!this.dailyCallObject) return;
this.dailyCallObject.on("joined-meeting", () => {
this.updateStatus('Connected');
this.connectBtn.disabled = true;
this.disconnectBtn.disabled = false;
this.log('Client connected');
});
this.dailyCallObject.on("track-started", (evt) => {
if (evt.track.kind === "audio" && evt.participant.local === false) {
this.log("Audio track started.")
this.setupAudioTrack(evt.track);
}
});
this.dailyCallObject.on("track-stopped", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-joined", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-updated", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-left", () => {
// When the bot leaves, we are also disconnecting from the call
this.disconnect()
});
this.dailyCallObject.on("left-meeting", () => {
this.updateStatus('Disconnected');
this.connectBtn.disabled = false;
this.disconnectBtn.disabled = true;
this.log('Client disconnected');
});
this.dailyCallObject.on("error", this.handleEventToConsole.bind(this));
}
/**
* Set up an audio track for playback
* Handles both initial setup and track updates
*/
setupAudioTrack(track) {
this.log(`Setting up audio track, track state: ${track.readyState}, muted: ${track.muted}`);
// Check if we're already playing this track
if (this.botAudio.srcObject) {
const oldTrack = this.botAudio.srcObject.getAudioTracks()[0];
if (oldTrack?.id === track.id) return;
}
// Create a new MediaStream with the track and set it as the audio source
this.botAudio.srcObject = new MediaStream([track]);
this.botAudio.onplaying = async (event) => {
this.log("onplaying")
this.log("Will send the audio message to play the audio at the next tick")
this.dailyCallObject.sendAppMessage("playable")
}
}
async fetchRoomInfo() {
let connectUrl = '/connect'
let res = await fetch(connectUrl, {
method: "POST",
mode: "cors",
headers: new Headers({
"Content-Type": "application/json"
}),
})
if (res.ok) {
return res.json();
}
}
/**
* Initialize and connect to the bot
* This creates the Daily call object, registers event listeners, and joins the room
*/
async connect() {
try {
// Initialize the client
this.dailyCallObject = Daily.createCallObject({
subscribeToTracksAutomatically: true,
});
// Set up listeners for media track events
this.setupTrackListeners();
this.log('Creating the bot...');
let roomInfo = await this.fetchRoomInfo()
// Connect to the bot
this.log('Connecting to bot...');
// Exposed on window only to make debugging easier
window.callObject = this.dailyCallObject;
await this.dailyCallObject.join({
url: roomInfo.room_url,
});
this.log('Connection complete');
} catch (error) {
// Handle any errors during connection
this.log(`Error connecting: ${error.message}`);
this.log(`Error stack: ${error.stack}`);
this.updateStatus('Error');
// Clean up if there's an error
if (this.dailyCallObject) {
try {
await this.dailyCallObject.leave();
} catch (disconnectError) {
this.log(`Error during disconnect: ${disconnectError.message}`);
}
}
}
}
/**
* Disconnect from the bot and clean up media resources
*/
async disconnect() {
if (this.dailyCallObject) {
try {
// Leave the call and destroy the Daily call object
await this.dailyCallObject.leave();
await this.dailyCallObject.destroy();
this.dailyCallObject = null;
// Clean up audio
if (this.botAudio.srcObject) {
this.botAudio.srcObject.getTracks().forEach((track) => track.stop());
this.botAudio.srcObject = null;
}
} catch (error) {
this.log(`Error disconnecting: ${error.message}`);
}
}
}
}
// Initialize the client when the page loads
window.addEventListener('DOMContentLoaded', () => {
new ChatbotClient();
});

View File

@@ -1,98 +0,0 @@
body {
margin: 0;
padding: 20px;
font-family: Arial, sans-serif;
background-color: #f0f0f0;
}
.container {
max-width: 1200px;
margin: 0 auto;
}
.status-bar {
display: flex;
justify-content: space-between;
align-items: center;
padding: 10px;
background-color: #fff;
border-radius: 8px;
margin-bottom: 20px;
}
.controls button {
padding: 8px 16px;
margin-left: 10px;
border: none;
border-radius: 4px;
cursor: pointer;
}
#connect-btn {
background-color: #4caf50;
color: white;
}
#disconnect-btn {
background-color: #f44336;
color: white;
}
button:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.main-content {
background-color: #fff;
border-radius: 8px;
padding: 20px;
margin-bottom: 20px;
}
.bot-container {
display: flex;
flex-direction: column;
align-items: center;
}
#bot-video-container {
width: 640px;
height: 360px;
background-color: #e0e0e0;
border-radius: 8px;
margin: 20px auto;
overflow: hidden;
display: flex;
align-items: center;
justify-content: center;
}
#bot-video-container video {
width: 100%;
height: 100%;
object-fit: cover;
}
.debug-panel {
background-color: #fff;
border-radius: 8px;
padding: 20px;
}
.debug-panel h3 {
margin: 0 0 10px 0;
font-size: 16px;
font-weight: bold;
}
#debug-log {
height: 200px;
overflow-y: auto;
background-color: #f8f8f8;
padding: 10px;
border-radius: 4px;
font-family: monospace;
font-size: 12px;
line-height: 1.4;
}

View File

@@ -1,13 +0,0 @@
import { defineConfig } from 'vite';
export default defineConfig({
server: {
proxy: {
// Proxy /connect requests to the backend server
'/connect': {
target: 'http://0.0.0.0:7860', // Replace with your backend URL
changeOrigin: true,
},
},
},
});

View File

@@ -1,60 +0,0 @@
# React Native Implementation
Basic implementation using the [Pipecat React Native SDK](https://docs.pipecat.ai/client/react-native/introduction).
## Usage
### Expo requirements
This project cannot be used with an [Expo Go](https://docs.expo.dev/workflow/expo-go/) app because [it requires custom native code](https://docs.expo.io/workflow/customizing/).
When a project requires custom native code or a config plugin, we need to transition from using [Expo Go](https://docs.expo.dev/workflow/expo-go/)
to a [development build](https://docs.expo.dev/development/introduction/).
More details about the custom native code used by this demo can be found in [rn-daily-js-expo-config-plugin](https://github.com/daily-co/rn-daily-js-expo-config-plugin).
### Building remotely
If you do not have experience with Xcode and Android Studio builds or do not have them installed locally on your computer, you will need to follow [this guide from Expo to use EAS Build](https://docs.expo.dev/development/create-development-builds/#create-and-install-eas-build).
### Building locally
You will need to have installed locally on your computer:
- [Xcode](https://developer.apple.com/xcode/) to build for iOS;
- [Android Studio](https://developer.android.com/studio) to build for Android;
#### Install the demo dependencies
```bash
# Use the version of node specified in .nvmrc
nvm i
# Install dependencies
npm i
# Before a native app can be compiled, the native source code must be generated.
npx expo prebuild
# Configure the environment variable to connect to the local server
cp env.example .env
# edit .env and add your local ip address, for example: http://192.168.1.16:7860
```
#### Running on Android
After plugging in an Android device [configured for debugging](https://developer.android.com/studio/debug/dev-options), run the following command:
```
npm run android
```
#### Running on iOS
Run the following command:
```
npm run ios
```
#### Connect to the server
Use the http://localhost:5173 in your app.

View File

@@ -1,75 +0,0 @@
{
"expo": {
"name": "bot-ready-rn",
"slug": "bot-ready-rn",
"version": "1.0.0",
"orientation": "portrait",
"icon": "./assets/icon.png",
"userInterfaceStyle": "light",
"splash": {
"image": "./assets/splash.png",
"resizeMode": "contain",
"backgroundColor": "#ffffff"
},
"updates": {
"fallbackToCacheTimeout": 0
},
"assetBundlePatterns": [
"**/*"
],
"ios": {
"supportsTablet": true,
"bitcode": false,
"bundleIdentifier": "co.daily.expo.BotReady",
"infoPlist": {
"UIBackgroundModes": [
"voip"
]
},
"appleTeamId": "EEBGKV9N3N"
},
"android": {
"adaptiveIcon": {
"foregroundImage": "./assets/adaptive-icon.png",
"backgroundColor": "#FFFFFF"
},
"package": "co.daily.expo.BotReady",
"permissions": [
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.BLUETOOTH",
"android.permission.CAMERA",
"android.permission.INTERNET",
"android.permission.MODIFY_AUDIO_SETTINGS",
"android.permission.RECORD_AUDIO",
"android.permission.SYSTEM_ALERT_WINDOW",
"android.permission.WAKE_LOCK",
"android.permission.FOREGROUND_SERVICE",
"android.permission.FOREGROUND_SERVICE_CAMERA",
"android.permission.FOREGROUND_SERVICE_MICROPHONE",
"android.permission.FOREGROUND_SERVICE_MEDIA_PROJECTION",
"android.permission.POST_NOTIFICATIONS"
]
},
"web": {
"favicon": "./assets/favicon.png"
},
"plugins": [
"@config-plugins/react-native-webrtc",
"@daily-co/config-plugin-rn-daily-js",
[
"expo-build-properties",
{
"android": {
"minSdkVersion": 24,
"compileSdkVersion": 35,
"targetSdkVersion": 34,
"buildToolsVersion": "35.0.0"
},
"ios": {
"deploymentTarget": "15.1"
}
}
]
]
}
}

View File

@@ -1,7 +0,0 @@
module.exports = function(api) {
api.cache(true);
return {
presets: ['babel-preset-expo'],
plugins: [["module:react-native-dotenv"]],
};
};

View File

@@ -1 +0,0 @@
API_BASE_URL=http://YOUR_LOCAL_IP:7860

View File

@@ -1,7 +0,0 @@
import { registerRootComponent } from "expo";
import App from "./src/App";
// registerRootComponent calls AppRegistry.registerComponent('main', () => App);
// It also ensures that the environment is set up appropriately
registerRootComponent(App);

View File

@@ -1,4 +0,0 @@
// Learn more https://docs.expo.io/guides/customizing-metro
const { getDefaultConfig } = require('expo/metro-config');
module.exports = getDefaultConfig(__dirname);

File diff suppressed because it is too large

View File

@@ -1,31 +0,0 @@
{
"name": "bot-ready-rn",
"version": "1.0.0",
"scripts": {
"start": "expo start --dev-client",
"android": "expo run:android --device",
"ios": "expo run:ios --device",
"web": "expo start --web"
},
"dependencies": {
"@config-plugins/react-native-webrtc": "^10.0.0",
"@daily-co/config-plugin-rn-daily-js": "0.0.7",
"@daily-co/react-native-daily-js": "^0.70.0",
"@daily-co/react-native-webrtc": "^118.0.3-daily.2",
"@react-native-async-storage/async-storage": "1.23.1",
"expo": "^52.0.0",
"expo-build-properties": "~0.13.1",
"expo-dev-client": "~5.0.5",
"expo-splash-screen": "~0.29.16",
"expo-status-bar": "~2.0.0",
"react": "18.3.1",
"react-native": "0.76.3",
"react-native-background-timer": "^2.4.1",
"react-native-dotenv": "^3.4.11",
"react-native-get-random-values": "^1.11.0"
},
"devDependencies": {
"@babel/core": "^7.12.9"
},
"private": true
}

View File

@@ -1,121 +0,0 @@
import React, { useState, useEffect } from 'react';
import {SafeAreaView, View, Text, Button, StyleSheet, ScrollView} from 'react-native';
import Daily from "@daily-co/react-native-daily-js";
import { API_BASE_URL } from "@env";
const CallScreen = () => {
const [connectionStatus, setConnectionStatus] = useState('Disconnected');
const [isConnected, setIsConnected] = useState(false);
const [callObject, setCallObject] = useState(null);
const [logs, setLogs] = useState([]);
useEffect(() => {
if (callObject) {
setupTrackListeners(callObject);
}
}, [callObject]);
const log = (message) => {
setLogs((prevLogs) => [...prevLogs, `${new Date().toISOString()} - ${message}`]);
console.log(message);
};
const setupTrackListeners = (callObject) => {
callObject.on("joined-meeting", () => {
setConnectionStatus('Connected');
setIsConnected(true);
log('Client connected');
});
callObject.on("left-meeting", () => {
setConnectionStatus('Disconnected');
setIsConnected(false);
log('Client disconnected');
});
callObject.on("participant-left", () => {
// When the bot leaves, we are also disconnecting from the call
disconnect().catch((err) => {
log(`Failed to disconnect ${err}`);
})
});
// Trigger so the bot can start sending audio
callObject.on("track-started", (evt) => {
if (evt.track.kind === "audio" && evt.participant.local === false) {
handleEventToConsole(evt)
log("Sending the message that will trigger the bot to play the audio.")
callObject.sendAppMessage("playable")
}
});
callObject.on("error", (evt) => log(`Error: ${evt.error}`));
// Other events just for awareness
callObject.on("track-stopped", handleEventToConsole);
callObject.on("participant-joined", handleEventToConsole);
callObject.on("participant-updated", handleEventToConsole);
};
const handleEventToConsole = (evt) => {
log(`Received event: ${evt.action}`);
};
const connect = async () => {
try {
const callObject = Daily.createCallObject({ subscribeToTracksAutomatically: true });
setCallObject(callObject);
const connectionUrl = `${API_BASE_URL}/connect`
const res = await fetch(connectionUrl, { method: "POST", headers: { "Content-Type": "application/json" } });
const roomInfo = await res.json();
await callObject.join({ url: roomInfo.room_url });
} catch (error) {
log(`Error connecting: ${error.message}`);
}
};
const disconnect = async () => {
if (callObject) {
try {
await callObject.leave();
await callObject.destroy();
setCallObject(null);
} catch (error) {
log(`Error disconnecting: ${error.message}`);
}
}
};
return (
<SafeAreaView style={styles.safeArea}>
<View style={styles.container}>
<View style={styles.statusBar}>
<Text>Status: <Text style={styles.status}>{connectionStatus}</Text></Text>
<View style={styles.controls}>
<Button
title={isConnected ? "Disconnect" : "Connect"}
onPress={isConnected ? disconnect : connect}
/>
</View>
</View>
<View style={styles.debugPanel}>
<Text style={styles.debugTitle}>Debug Info</Text>
<ScrollView style={styles.debugLog}>
{logs.map((logEntry, index) => (
<Text key={index} style={styles.logText}>{logEntry}</Text>
))}
</ScrollView>
</View>
</View>
</SafeAreaView>
);
};
const styles = StyleSheet.create({
safeArea: { flex: 1, backgroundColor: '#f0f0f0', padding: 20 },
container: { flex: 1, margin: 20 },
statusBar: { flexDirection: 'row', justifyContent: 'space-between', alignItems: 'center', padding: 10, backgroundColor: '#fff', borderRadius: 8, marginBottom: 20 },
status: { fontWeight: 'bold' },
controls: { flexDirection: 'row', gap: 10 },
debugPanel: { height: '80%', backgroundColor: '#fff', borderRadius: 8, padding: 20},
debugTitle: { fontSize: 16, fontWeight: 'bold' },
debugLog: { height: '100%', overflow: 'scroll', backgroundColor: '#f8f8f8', padding: 10, borderRadius: 4, fontFamily: 'monospace', fontSize: 12, lineHeight: 1.4 },
});
export default CallScreen;

View File

@@ -1,50 +0,0 @@
# Bot ready signaling Server
A FastAPI server that manages bot instances and provides an endpoint for Pipecat client connections.
## Endpoints
- `POST /connect` - Pipecat client connection endpoint
## Environment Variables
Copy `env.example` to `.env` and configure:
```ini
# Required API Keys
DAILY_API_KEY= # Your Daily API key
CARTESIA_API_KEY= # Your Cartesia API key
# Optional Configuration
DAILY_API_URL= # Optional: Daily API URL (defaults to https://api.daily.co/v1)
DAILY_SAMPLE_ROOM_URL= # Optional: Fixed room URL for development
HOST= # Optional: Host address (defaults to 0.0.0.0)
FAST_API_PORT= # Optional: Port number (defaults to 7860)
```
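As a rough sketch of how these variables and their documented defaults could be read in Python, assuming a hypothetical `ServerConfig` dataclass and `load_config` helper that are not part of this example's code:

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    daily_api_key: str
    cartesia_api_key: str
    daily_api_url: str
    host: str
    port: int


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read required keys and apply the defaults listed above (hypothetical helper)."""
    missing = [k for k in ("DAILY_API_KEY", "CARTESIA_API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return ServerConfig(
        daily_api_key=env["DAILY_API_KEY"],
        cartesia_api_key=env["CARTESIA_API_KEY"],
        # Empty values (as in a freshly copied .env) fall back to the defaults.
        daily_api_url=env.get("DAILY_API_URL") or "https://api.daily.co/v1",
        host=env.get("HOST") or "0.0.0.0",
        port=int(env.get("FAST_API_PORT") or "7860"),
    )
```

Calling `load_config()` after `load_dotenv()` would fail early with a clear message if either required key is unset, rather than failing later inside a request.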
## Running the Server
Set up and activate your virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
Install dependencies:
```bash
pip install -r requirements.txt
```
If you want to use the local version of `pipecat` in this repo rather than the last published version, also run:
```bash
pip install --editable "../../../[daily,cartesia,openai]"
```
Run the server:
```bash
python server.py
```

View File

@@ -1,3 +0,0 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=
CARTESIA_API_KEY=

View File

@@ -1,4 +0,0 @@
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,cartesia,openai]

View File

@@ -1,64 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
from typing import Optional
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
(url, token, _) = await configure_with_args(aiohttp_session)
return (url, token)
async def configure_with_args(
aiohttp_session: aiohttp.ClientSession, parser: Optional[argparse.ArgumentParser] = None
):
if not parser:
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument(
"-u", "--url", type=str, required=False, help="URL of the Daily room to join"
)
parser.add_argument(
"-k",
"--apikey",
type=str,
required=False,
help="Daily API Key (needed to create an owner token for the room)",
)
args, unknown = parser.parse_known_args()
url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
key = args.apikey or os.getenv("DAILY_API_KEY")
if not url:
raise Exception(
"No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
)
if not key:
raise Exception(
"No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
)
daily_rest_helper = DailyRESTHelper(
daily_api_key=key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
# Create a meeting token for the given room with an expiration 1 hour in
# the future.
expiry_time: float = 60 * 60
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token, args)

View File

@@ -1,147 +0,0 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
from typing import Any, Dict
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
# Load environment variables from .env file
load_dotenv(override=True)
# Dictionary to track bot processes: {pid: (process, room_url)}
bot_procs = {}
# Store Daily API helpers
daily_helpers = {}
def cleanup():
"""Cleanup function to terminate all bot processes.
Called during server shutdown.
"""
for entry in bot_procs.values():
proc = entry[0]
proc.terminate()
proc.wait()
@asynccontextmanager
async def lifespan(app: FastAPI):
"""FastAPI lifespan manager that handles startup and shutdown tasks.
- Creates aiohttp session
- Initializes Daily API helper
- Cleans up resources on shutdown
"""
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
cleanup()
# Initialize FastAPI app with lifespan manager
app = FastAPI(lifespan=lifespan)
# Configure CORS to allow requests from any origin
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
async def create_room_and_token() -> tuple[str, str]:
"""Helper function to create a Daily room and generate an access token.
Returns:
tuple[str, str]: A tuple containing (room_url, token)
Raises:
HTTPException: If room creation or token generation fails
"""
room = await daily_helpers["rest"].create_room(DailyRoomParams())
if not room.url:
raise HTTPException(status_code=500, detail="Failed to create room")
token = await daily_helpers["rest"].get_token(room.url)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
return room.url, token
@app.post("/connect")
async def bot_connect(request: Request) -> Dict[Any, Any]:
"""Connect endpoint that creates a room and returns connection credentials.
This endpoint is called by client to establish a connection.
Returns:
Dict[Any, Any]: Authentication bundle containing room_url and token
Raises:
HTTPException: If room creation, token generation, or bot startup fails
"""
print("Creating room for RTVI connection")
room_url, token = await create_room_and_token()
print(f"Room URL: {room_url}")
# Start the bot process
try:
bot_file = "signalling_bot"
proc = subprocess.Popen(
[f"python3 -m {bot_file} -u {room_url} -t {token}"],
shell=True,
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
bot_procs[proc.pid] = (proc, room_url)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
# Return the authentication bundle in format expected by DailyTransport
return {"room_url": room_url, "token": token}
if __name__ == "__main__":
import uvicorn
# Parse command line arguments for server configuration
default_host = os.getenv("HOST", "0.0.0.0")
default_port = int(os.getenv("FAST_API_PORT", "7860"))
parser = argparse.ArgumentParser(description="Daily Travel Companion FastAPI server")
parser.add_argument("--host", type=str, default=default_host, help="Host address")
parser.add_argument("--port", type=int, default=default_port, help="Port number")
parser.add_argument("--reload", action="store_true", help="Reload code on change")
config = parser.parse_args()
# Start the FastAPI server
uvicorn.run(
"server:app",
host=config.host,
port=config.port,
reload=config.reload,
)

@@ -1,95 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
from dataclasses import dataclass
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.frames.frames import AudioRawFrame, EndFrame, OutputAudioRawFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
@dataclass
class SilenceFrame(OutputAudioRawFrame):
def __init__(
self,
*,
sample_rate: int,
duration: float,
):
# Initialize the parent class with the silent frame's data
super().__init__(
audio=self.create_silent_audio_frame(sample_rate, 1, duration).audio,
sample_rate=sample_rate,
num_channels=1,
)
@staticmethod
def create_silent_audio_frame(
sample_rate: int, num_channels: int, duration: float
) -> AudioRawFrame:
"""Create an AudioRawFrame containing silence."""
frame_size = num_channels * 2 # 2 bytes per sample for 16-bit audio
total_frames = int(sample_rate * duration)
total_bytes = total_frames * frame_size
silent_audio = bytes(total_bytes) # Create a byte array filled with zeros
return AudioRawFrame(audio=silent_audio, sample_rate=sample_rate, num_channels=num_channels)
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url, None, "Say One Thing", DailyParams(audio_out_enabled=True)
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
runner = PipelineRunner()
task = PipelineTask(Pipeline([tts, transport.output()]))
# Register an event handler so we can play the audio when we receive a specific message
@transport.event_handler("on_app_message")
async def on_app_message(transport, message, sender):
logger.debug(f"Received app message: {message} - {sender}")
if "playable" not in message:
return
await task.queue_frames(
[
SilenceFrame(
sample_rate=task.params.audio_out_sample_rate,
duration=0.5,
),
TTSSpeakFrame("Hello there, how are you doing today?"),
EndFrame(),
]
)
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

@@ -1,161 +0,0 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
runpod.toml

@@ -1,15 +0,0 @@
FROM python:3.10-bullseye
RUN mkdir /app
RUN mkdir /app/assets
RUN mkdir /app/utils
COPY *.py /app/
COPY requirements.txt /app/
WORKDIR /app
RUN pip3 install -r requirements.txt
EXPOSE 7860
CMD ["python3", "server.py"]

@@ -1,37 +0,0 @@
# Simple Chatbot
<img src="image.png" width="420px">
This app connects you to a chatbot powered by GPT-4, complete with animations generated by Stable Video Diffusion.
See a video of it in action: https://x.com/kwindla/status/1778628911817183509
And a quick video walkthrough of the code: https://www.loom.com/share/13df1967161f4d24ade054e7f8753416
The first run may take longer to start because the VAD (Voice Activity Detection) model needs to be downloaded.
## Get started
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp env.example .env # and add your credentials
```
## Run the server
```bash
python server.py
```
Then, visit `http://localhost:7860/` in your browser to start a chatbot session.
## Build and test the Docker image
```bash
docker build -t chatbot .
docker run --env-file .env -p 7860:7860 chatbot
```

@@ -1,170 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import datetime
import io
import os
import sys
import wave
import aiofiles
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
# Create the recordings directory if it doesn't exist
os.makedirs("recordings", exist_ok=True)
async def save_audio(audio: bytes, sample_rate: int, num_channels: int, name: str):
if len(audio) > 0:
filename = os.path.join(
"recordings",
f"{name}_conversation_recording{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}.wav",
)
with io.BytesIO() as buffer:
with wave.open(buffer, "wb") as wf:
wf.setsampwidth(2)
wf.setnchannels(num_channels)
wf.setframerate(sample_rate)
wf.writeframes(audio)
async with aiofiles.open(filename, "wb") as file:
await file.write(buffer.getvalue())
print(f"Merged audio saved to {filename}")
else:
print("No audio data to save")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, token) = await configure(session)
transport = DailyTransport(
room_url,
token,
"Chatbot",
DailyParams(
audio_out_enabled=True,
audio_in_enabled=True,
video_out_enabled=False,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
#
# Spanish
#
# transcription_settings=DailyTranscriptionSettings(
# language="es",
# tier="nova",
# model="2-general"
# )
),
)
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
#
# English
#
voice_id="cgSgspJ2msm6clMCkdW9",
#
# Spanish
#
# model="eleven_multilingual_v2",
# voice_id="gD1IexrzCvsXPHUuT0s3",
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
#
# English
#
"content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your response to 12 words or fewer.",
#
# Spanish
#
# "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
# NOTE: Watch out! This will save all the conversation in memory. You
# can pass `buffer_size` to get periodic callbacks.
audiobuffer = AudioBufferProcessor(enable_turn_audio=True)
pipeline = Pipeline(
[
transport.input(), # microphone
context_aggregator.user(),
llm,
tts,
transport.output(),
audiobuffer, # used to buffer the audio in the pipeline
context_aggregator.assistant(),
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_in_sample_rate=16000,
audio_out_sample_rate=16000,
enable_metrics=True,
enable_usage_metrics=True,
),
)
@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels, "full")
@audiobuffer.event_handler("on_user_turn_audio_data")
async def on_user_turn_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels, "user")
@audiobuffer.event_handler("on_bot_turn_audio_data")
async def on_bot_turn_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels, "bot")
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await audiobuffer.start_recording()
await transport.capture_participant_transcription(participant["id"])
await task.queue_frames([context_aggregator.user().get_context_frame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
print(f"Participant left: {participant}")
await task.cancel()
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

@@ -1,4 +0,0 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=7df...
OPENAI_API_KEY=sk-PL...
ELEVENLABS_API_KEY=aeb...

@@ -1,5 +0,0 @@
aiofiles
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,openai,silero,elevenlabs]

@@ -1,55 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument(
"-u", "--url", type=str, required=False, help="URL of the Daily room to join"
)
parser.add_argument(
"-k",
"--apikey",
type=str,
required=False,
help="Daily API Key (needed to create an owner token for the room)",
)
args, unknown = parser.parse_known_args()
url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
key = args.apikey or os.getenv("DAILY_API_KEY")
if not url:
raise Exception(
"No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
)
if not key:
raise Exception(
"No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
)
daily_rest_helper = DailyRESTHelper(
daily_api_key=key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
# Create a meeting token for the given room with an expiration 1 hour in
# the future.
expiry_time: float = 60 * 60
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token)

@@ -1,139 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse, RedirectResponse
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
MAX_BOTS_PER_ROOM = 1
# Bot sub-process dict for status reporting and concurrency control
bot_procs = {}
daily_helpers = {}
load_dotenv(override=True)
def cleanup():
# Clean up function, just to be extra safe
for entry in bot_procs.values():
proc = entry[0]
proc.terminate()
proc.wait()
@asynccontextmanager
async def lifespan(app: FastAPI):
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
cleanup()
app = FastAPI(lifespan=lifespan)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
@app.get("/")
async def start_agent(request: Request):
print(f"!!! Creating room")
room = await daily_helpers["rest"].create_room(DailyRoomParams())
print(f"!!! Room URL: {room.url}")
# Ensure the room property is present
if not room.url:
raise HTTPException(
status_code=500,
detail="Missing 'room' property in request data. Cannot start agent without a target room!",
)
# Check if there is already an existing process running in this room
num_bots_in_room = sum(
1 for proc in bot_procs.values() if proc[1] == room.url and proc[0].poll() is None
)
if num_bots_in_room >= MAX_BOTS_PER_ROOM:
raise HTTPException(status_code=500, detail=f"Max bot limit reached for room: {room.url}")
# Get the token for the room
token = await daily_helpers["rest"].get_token(room.url)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
# Spawn a new agent, and join the user session
# Note: this is mostly for demonstration purposes (refer to 'deployment' in README)
try:
proc = subprocess.Popen(
[f"python3 -m bot -u {room.url} -t {token}"],
shell=True,
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
bot_procs[proc.pid] = (proc, room.url)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
return RedirectResponse(room.url)
@app.get("/status/{pid}")
def get_status(pid: int):
# Look up the subprocess
proc = bot_procs.get(pid)
# If the subprocess doesn't exist, return an error
if not proc:
raise HTTPException(status_code=404, detail=f"Bot with process id: {pid} not found")
# Check the status of the subprocess
if proc[0].poll() is None:
status = "running"
else:
status = "finished"
return JSONResponse({"bot_id": pid, "status": status})
if __name__ == "__main__":
import uvicorn
default_host = os.getenv("HOST", "0.0.0.0")
default_port = int(os.getenv("FAST_API_PORT", "7860"))
parser = argparse.ArgumentParser(description="Daily Storyteller FastAPI server")
parser.add_argument("--host", type=str, default=default_host, help="Host address")
parser.add_argument("--port", type=int, default=default_port, help="Port number")
parser.add_argument("--reload", action="store_true", help="Reload code on change")
config = parser.parse_args()
uvicorn.run(
"server:app",
host=config.host,
port=config.port,
reload=config.reload,
)

@@ -1,39 +0,0 @@
# Daily Custom Tracks
This example shows how to send and receive Daily custom tracks. We will run a simple `daily-python` application to send an audio file with a custom track (named "pipecat") to a room. Then, the Pipecat bot will mirror that custom track into another custom track (named "pipecat-mirror") in the same room.
## Get started
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
## Run the bot
Start the bot by giving it a Daily room URL.
```bash
python bot.py -u ROOM_URL
```
The bot will wait for the first participant to join. Then, it will mirror a custom track named "pipecat" into a new custom track named "pipecat-mirror".
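
The mirroring step can be sketched without Pipecat. The `AudioFrame` class and `mirror` function below are hypothetical stand-ins (they are not Pipecat types); they only illustrate the rule the bot appears to apply: a frame that arrived on a named custom track is copied to the destination track, and any other frame passes through unchanged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AudioFrame:
    """Hypothetical stand-in for an audio frame; not a Pipecat class."""

    audio: bytes
    sample_rate: int
    num_channels: int
    transport_source: Optional[str] = None
    transport_destination: Optional[str] = None


def mirror(frame: AudioFrame, destination: str) -> AudioFrame:
    """Copy frames that came from a custom track onto `destination`."""
    if frame.transport_source:
        # Same audio payload, re-addressed to the mirror track.
        return AudioFrame(
            audio=frame.audio,
            sample_rate=frame.sample_rate,
            num_channels=frame.num_channels,
            transport_destination=destination,
        )
    # Frames without a custom source are forwarded untouched.
    return frame


incoming = AudioFrame(b"\x00\x00", 16000, 1, transport_source="pipecat")
mirrored = mirror(incoming, "pipecat-mirror")
plain = AudioFrame(b"\x00\x00", 16000, 1)
untouched = mirror(plain, "pipecat-mirror")
```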
## Run the sender
Now, run the custom track sender. This is a simple `daily-python` application that opens an audio file and sends it as a custom track to the same Daily room.
```bash
python custom_track_sender.py -u ROOM_URL -i office-ambience-mono-16000.mp3
```
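
As a rough illustration of how the sender might pace its writes, the sketch below splits raw PCM data into one-second chunks. It assumes 16-bit interleaved samples (2 bytes per sample); the helper name `one_second_chunks` is made up for this example and is not part of `daily-python`.

```python
from typing import Iterator


def one_second_chunks(
    raw_bytes: bytes, sample_rate: int, channels: int, sample_width: int = 2
) -> Iterator[bytes]:
    """Yield successive one-second chunks of interleaved PCM audio.

    One second of audio is sample_rate * channels * sample_width bytes.
    The last chunk may be shorter if the audio is not a whole number of seconds.
    """
    chunk_size = sample_rate * channels * sample_width
    for start in range(0, len(raw_bytes), chunk_size):
        yield raw_bytes[start : start + chunk_size]


# Example: 2.5 seconds of 16 kHz mono 16-bit silence.
pcm = bytes(int(16000 * 2.5) * 2)
chunks = list(one_second_chunks(pcm, sample_rate=16000, channels=1))
print([len(c) for c in chunks])  # [32000, 32000, 16000]
```

Each chunk would then be handed to the custom audio source's write call, as the sender script does in its loop.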
## Open client
Finally, open the client so you can hear both custom tracks.
```bash
open index.html
```
Once the client is opened, copy the URL of the Daily room and join it. You should be able to select which custom track you want to hear.

@@ -1,89 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import sys
import aiohttp
from loguru import logger
from runner import configure
from pipecat.frames.frames import Frame, InputAudioRawFrame, OutputAudioRawFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.transports.services.daily import DailyParams, DailyTransport
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
class CustomTrackMirrorProcessor(FrameProcessor):
def __init__(self, transport_destination: str, **kwargs):
super().__init__(**kwargs)
self._transport_destination = transport_destination
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, InputAudioRawFrame) and frame.transport_source:
output_frame = OutputAudioRawFrame(
audio=frame.audio,
sample_rate=frame.sample_rate,
num_channels=frame.num_channels,
)
output_frame.transport_destination = self._transport_destination
await self.push_frame(output_frame)
else:
await self.push_frame(frame, direction)
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url,
None,
"Custom tracks mirror",
DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
microphone_out_enabled=False, # Disable since we just use custom tracks
audio_out_destinations=["pipecat-mirror"],
),
)
pipeline = Pipeline(
[
transport.input(), # Transport user input
CustomTrackMirrorProcessor("pipecat-mirror"),
transport.output(), # Transport bot output
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_in_sample_rate=16000,
audio_out_sample_rate=16000,
enable_metrics=True,
enable_usage_metrics=True,
),
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_audio(participant["id"], audio_source="pipecat")
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

@@ -1,74 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import time
from daily import CallClient, CustomAudioSource, Daily
from pydub import AudioSegment
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument("-u", "--url", type=str, required=True, help="URL of the Daily room to join")
parser.add_argument(
"-i", "--input", type=str, required=True, help="Input audio file (needs 16000 sample rate)"
)
args, _ = parser.parse_known_args()
audio = AudioSegment.from_mp3(args.input)
raw_bytes = audio.raw_data
sample_rate = audio.frame_rate
channels = audio.channels
print(f"Length: {len(raw_bytes)} bytes")
print(f"Sample rate: {sample_rate}, Channels: {channels}")
# Initialize the Daily context & create call client
Daily.init()
client = CallClient()
# Join the room and indicate we have a custom track named "pipecat".
client.join(
args.url,
client_settings={
"publishing": {
"camera": False,
"microphone": False,
"customAudio": {"pipecat": True},
},
},
)
# Just sleep for a couple of seconds. To do this well we should really use
# completions.
time.sleep(2)
# Create the custom audio source. This is where we will write our audio.
audio_source = CustomAudioSource(sample_rate, channels)
# Create an audio track and assign it our audio source.
client.add_custom_audio_track("pipecat", audio_source)
# Just sleep for a second. To do this well we should really use completions.
time.sleep(1)
try:
# Write one second of audio at a time until the whole file has been sent.
chunk_size = sample_rate * channels * 2
while len(raw_bytes) > 0:
chunk = raw_bytes[:chunk_size]
raw_bytes = raw_bytes[chunk_size:]
audio_source.write_frames(chunk)
except KeyboardInterrupt:
client.leave()
# Just sleep for a second. To do this well we should really use completions.
time.sleep(1)
client.release()

@@ -1,173 +0,0 @@
<html>
<head>
<title>daily custom tracks</title>
</head>
<script crossorigin src="https://unpkg.com/@daily-co/daily-js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.js"></script>
<link
rel="stylesheet"
type="text/css"
href="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.css"
/>
<script>
function enableButton(buttonId, enable) {
const button = document.getElementById(buttonId);
button.disabled = !enable;
}
function enableJoinButton(enable) {
enableButton("join-button", enable);
}
function enableLeaveButton(enable) {
enableButton("leave-button", enable);
}
function destroyPlayers(query) {
const items = document.querySelectorAll(query);
if (items) {
for (const item of items) {
item.remove();
}
}
}
function destroyParticipantPlayers(participantId) {
destroyPlayers(`audio[data-participant-id="${participantId}"]`);
destroyPlayers(`button[data-participant-id="${participantId}"]`);
}
async function startPlayer(player, track) {
player.muted = false;
player.autoplay = true;
if (track != null) {
player.srcObject = new MediaStream([track]);
}
}
async function buildAudioPlayer(track, participantId) {
const audioContainer = document.getElementById("audio-container");
const player = document.createElement("audio");
player.dataset.participantId = participantId;
// Create a new button for controlling audio
const audioControlButton = document.createElement("button");
audioControlButton.className = "ui primary green button"
audioControlButton.innerText = track._mediaTag == "cam-audio" ? "english" : track._mediaTag;
audioControlButton.dataset.participantId = participantId;
audioControlButton.onclick = () => {
if (player.paused) {
player.play();
audioControlButton.className = "ui primary red button"
} else {
player.pause();
audioControlButton.className = "ui primary green button"
}
};
audioContainer.appendChild(player);
audioContainer.appendChild(audioControlButton);
await startPlayer(player, track);
player.pause()
return player;
}
function subscribeToTracks(participantId) {
console.log(`subscribing to track`);
if (participantId === "local") {
return;
}
callObject.updateParticipant(participantId, {
setSubscribedTracks: {
audio: true,
video: false,
custom: true,
},
});
}
function startDaily() {
enableJoinButton(true);
enableLeaveButton(false);
window.callObject = window.DailyIframe.createCallObject({});
callObject.on("participant-joined", (e) => {
if (!e.participant.local) {
console.log("participant-joined", e.participant);
subscribeToTracks(e.participant.session_id);
}
});
callObject.on("participant-left", (e) => {
console.log("participant-left", e.participant.session_id);
destroyParticipantPlayers(e.participant.session_id);
});
callObject.on("track-started", async (e) => {
console.log("track-started", e.track);
if (e.track.kind === "audio") {
await buildAudioPlayer(e.track, e.participant.session_id);
}
});
}
async function joinRoom() {
enableJoinButton(false);
enableLeaveButton(true);
const meetingUrl = document.getElementById("meeting-url").value;
callObject.join({
url: meetingUrl,
startVideoOff: true,
startAudioOff: true,
subscribeToTracksAutomatically: false,
receiveSettings: {
base: { video: { layer: 0 } },
},
});
}
async function leaveRoom() {
enableJoinButton(true);
enableLeaveButton(false);
callObject.leave();
const audioContainer = document.getElementById("audio-container");
audioContainer.replaceChildren();
}
</script>
<body onload="startDaily()">
<div class="ui centered page grid" style="margin-top: 30px">
<div class="ten wide column">
<div class="ui form" style="margin-top: 30px">
<div class="field">
<label>Meeting URL</label>
<input id="meeting-url" value="" />
</div>
</div>
</div>
</div>
<div class="ui centered aligned header" style="margin-top: 30px">
<button id="join-button" class="ui primary button" onclick="joinRoom()">
Join
</button>
<button id="leave-button" class="ui button" onclick="leaveRoom()">
Leave
</button>
</div>
<div id="tile" class="ui container" style="margin-top: 30px">
<div id="tile" class="ui center aligned grid">
<div id="audio-container"></div><br/>
</div>
</div>
</body>
</html>

@@ -1,2 +0,0 @@
pydub
pipecat-ai[daily]

@@ -1,55 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument(
"-u", "--url", type=str, required=False, help="URL of the Daily room to join"
)
parser.add_argument(
"-k",
"--apikey",
type=str,
required=False,
help="Daily API Key (needed to create an owner token for the room)",
)
args, unknown = parser.parse_known_args()
url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
key = args.apikey or os.getenv("DAILY_API_KEY")
if not url:
raise Exception(
"No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
)
if not key:
raise Exception(
"No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
)
daily_rest_helper = DailyRESTHelper(
daily_api_key=key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
# Create a meeting token for the given room with an expiration 1 hour in
# the future.
expiry_time: float = 60 * 60
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token)

@@ -1,15 +0,0 @@
FROM python:3.10-bullseye
RUN mkdir /app
RUN mkdir /app/assets
RUN mkdir /app/utils
COPY *.py /app/
COPY requirements.txt /app/
WORKDIR /app
RUN pip3 install -r requirements.txt
EXPOSE 7860
CMD ["python3", "server.py"]

@@ -1,39 +0,0 @@
# Daily Multi Translation
This example shows how to use Daily to stream multiple simultaneous translations using a single transport. Daily provides custom tracks and in this example we will simultaneously translate incoming audio in English to Spanish, French and German, each of them being sent to a custom track.
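
A minimal sketch of the per-language setup, assuming each translation branch gets its own system prompt and is keyed by the lowercase name of its custom track. The `translation_messages` helper and the `contexts` dictionary are illustrative names only; the prompt wording follows the one used in the bot.

```python
LANGUAGES = ["Spanish", "French", "German"]


def translation_messages(language: str) -> list[dict]:
    """Build the system prompt for one translation branch."""
    return [
        {
            "role": "system",
            "content": (
                "You will be provided with a sentence in English, and your "
                f"task is to only translate it into {language}."
            ),
        }
    ]


# One message list per custom track ("spanish", "french", "german").
contexts = {language.lower(): translation_messages(language) for language in LANGUAGES}
```

Keying the prompts by track name keeps each branch's prompt aligned with the custom track it is sent to.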
## Get started
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp env.example .env # and add your credentials
```
## Run the server
```bash
python server.py
```
Then, visit `http://localhost:7860/` in your browser. This will open a Daily Prebuilt room where you will speak in English (make sure you are not muted).
## Open client
Next, you need to open the client that will listen to the translations.
```bash
open index.html
```
Once the client is opened, copy the URL of the Daily room created above and join it. You should be able to select which translation you want to hear.
## Build and test the Docker image
```bash
docker build -t daily-multi-translation .
docker run --env-file .env -p 7860:7860 daily-multi-translation
```

@@ -1,163 +0,0 @@
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.audio.mixers.soundfile_mixer import SoundfileMixer
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.observers.loggers.transcription_log_observer import TranscriptionLogObserver
from pipecat.pipeline.parallel_pipeline import ParallelPipeline
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
BACKGROUND_SOUND_FILE = "office-ambience-mono-16000.mp3"
async def main():
    async with aiohttp.ClientSession() as session:
        (room_url, token) = await configure(session)

        transport = DailyTransport(
            room_url,
            token,
            "Multi translation bot",
            DailyParams(
                audio_in_enabled=True,
                audio_out_enabled=True,
                audio_out_mixer={
                    "spanish": SoundfileMixer(
                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
                    ),
                    "french": SoundfileMixer(
                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
                    ),
                    "german": SoundfileMixer(
                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
                    ),
                },
                audio_out_destinations=["spanish", "french", "german"],
                microphone_out_enabled=False,  # Disable since we just use custom tracks
                vad_analyzer=SileroVADAnalyzer(),
            ),
        )

        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

        tts_spanish = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
            voice_id="cefcb124-080b-4655-b31f-932f3ee743de",
            transport_destination="spanish",
        )
        tts_french = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
            voice_id="8832a0b5-47b2-4751-bb22-6a8e2149303d",
            transport_destination="french",
        )
        tts_german = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
            voice_id="38aabb6a-f52b-4fb0-a3d1-988518f4dc06",
            transport_destination="german",
        )

        messages_spanish = [
            {
                "role": "system",
                "content": "You will be provided with a sentence in English, and your task is to only translate it into Spanish.",
            },
        ]
        messages_french = [
            {
                "role": "system",
                "content": "You will be provided with a sentence in English, and your task is to only translate it into French.",
            },
        ]
        messages_german = [
            {
                "role": "system",
                "content": "You will be provided with a sentence in English, and your task is to only translate it into German.",
            },
        ]

        llm_spanish = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
        llm_french = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
        llm_german = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

        context_spanish = OpenAILLMContext(messages_spanish)
        context_aggregator_spanish = llm_spanish.create_context_aggregator(context_spanish)
        context_french = OpenAILLMContext(messages_french)
        context_aggregator_french = llm_french.create_context_aggregator(context_french)
        context_german = OpenAILLMContext(messages_german)
        context_aggregator_german = llm_german.create_context_aggregator(context_german)

        pipeline = Pipeline(
            [
                transport.input(),  # Transport user input
                stt,
                ParallelPipeline(
                    # Spanish pipeline.
                    [
                        context_aggregator_spanish.user(),
                        llm_spanish,
                        tts_spanish,
                        context_aggregator_spanish.assistant(),
                    ],
                    # French pipeline.
                    [
                        context_aggregator_french.user(),
                        llm_french,
                        tts_french,
                        context_aggregator_french.assistant(),
                    ],
                    # German pipeline.
                    [
                        context_aggregator_german.user(),
                        llm_german,
                        tts_german,
                        context_aggregator_german.assistant(),
                    ],
                ),
                transport.output(),  # Transport bot output
            ]
        )

        task = PipelineTask(
            pipeline,
            params=PipelineParams(
                audio_in_sample_rate=16000,
                audio_out_sample_rate=16000,
                enable_metrics=True,
                enable_usage_metrics=True,
            ),
            observers=[TranscriptionLogObserver()],
        )

        runner = PipelineRunner()

        await runner.run(task)


if __name__ == "__main__":
    asyncio.run(main())

View File

@@ -1,5 +0,0 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=7df...
OPENAI_API_KEY=sk-PL...
DEEPGRAM_API_KEY=efb...
CARTESIA_API_KEY=aeb...

View File

@@ -1,202 +0,0 @@
<html>
<head>
<title>daily multi translation</title>
</head>
<script crossorigin src="https://unpkg.com/@daily-co/daily-js"></script>
<script
src="https://code.jquery.com/jquery-3.1.1.min.js"
integrity="sha256-hVVnYaiADRTO2PzUGmuLJr8BLUSjGIZsDYGmIJLv2b8="
crossorigin="anonymous"
></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.js"></script>
<link
rel="stylesheet"
type="text/css"
href="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.css"
/>
<script>
function enableButton(buttonId, enable) {
const button = document.getElementById(buttonId);
button.disabled = !enable;
}
function enableJoinButton(enable) {
enableButton("join-button", enable);
}
function enableLeaveButton(enable) {
enableButton("leave-button", enable);
}
function destroyPlayers(query) {
const items = document.querySelectorAll(query);
if (items) {
for (const item of items) {
item.remove();
}
}
}
function destroyParticipantPlayers(participantId) {
destroyPlayers(`video[data-participant-id="${participantId}"]`);
destroyPlayers(`audio[data-participant-id="${participantId}"]`);
destroyPlayers(`button[data-participant-id="${participantId}"]`);
}
async function startPlayer(player, track) {
player.muted = false;
player.autoplay = true;
if (track != null) {
player.srcObject = new MediaStream([track]);
}
}
async function buildVideoPlayer(track, participantId) {
const videoContainer = document.getElementById("video-container");
const player = document.createElement("video");
player.dataset.participantId = participantId;
videoContainer.appendChild(player);
await startPlayer(player, track);
await player.play();
return player;
}
async function buildAudioPlayer(track, participantId) {
const audioContainer = document.getElementById("audio-container");
const player = document.createElement("audio");
player.dataset.participantId = participantId;
// Create a new button for controlling audio
const audioControlButton = document.createElement("button");
audioControlButton.className = "ui primary green button"
audioControlButton.innerText = track._mediaTag == "cam-audio" ? "english" : track._mediaTag;
audioControlButton.dataset.participantId = participantId;
audioControlButton.onclick = () => {
if (player.paused) {
player.play();
audioControlButton.className = "ui primary red button"
} else {
player.pause();
audioControlButton.className = "ui primary green button"
}
};
audioContainer.appendChild(player);
audioContainer.appendChild(audioControlButton);
await startPlayer(player, track);
player.pause()
return player;
}
function subscribeToTracks(participantId) {
console.log(`subscribing to track`);
if (participantId === "local") {
return;
}
callObject.updateParticipant(participantId, {
setSubscribedTracks: {
audio: true,
video: true,
custom: true,
},
});
}
function startDaily() {
enableJoinButton(true);
enableLeaveButton(false);
window.callObject = window.DailyIframe.createCallObject({});
callObject.on("participant-joined", (e) => {
if (!e.participant.local) {
console.log("participant-joined", e.participant);
subscribeToTracks(e.participant.session_id);
}
});
callObject.on("participant-left", (e) => {
console.log("participant-left", e.participant.session_id);
destroyParticipantPlayers(e.participant.session_id);
});
callObject.on("track-started", async (e) => {
console.log("track-started", e.track);
if (e.track.kind === "video") {
await buildVideoPlayer(e.track, e.participant.session_id);
} else if (e.track.kind === "audio") {
await buildAudioPlayer(e.track, e.participant.session_id);
}
});
}
async function joinRoom() {
enableJoinButton(false);
enableLeaveButton(true);
const meetingUrl = document.getElementById("meeting-url").value;
callObject.join({
url: meetingUrl,
startVideoOff: true,
startAudioOff: true,
subscribeToTracksAutomatically: false,
receiveSettings: {
base: { video: { layer: 0 } },
},
});
}
async function leaveRoom() {
enableJoinButton(true);
enableLeaveButton(false);
callObject.leave();
const videoContainer = document.getElementById("video-container");
videoContainer.replaceChildren();
const audioContainer = document.getElementById("audio-container");
audioContainer.replaceChildren();
}
</script>
<body onload="startDaily()">
<div class="ui centered page grid" style="margin-top: 30px">
<div class="ten wide column">
<div class="ui form" style="margin-top: 30px">
<div class="field">
<label>Meeting URL</label>
<input id="meeting-url" value="" />
</div>
</div>
</div>
</div>
<div class="ui centered aligned header" style="margin-top: 30px">
<button id="join-button" class="ui primary button" onclick="joinRoom()">
Join
</button>
<button id="leave-button" class="ui button" onclick="leaveRoom()">
Leave
</button>
</div>
<div id="tile" class="ui container" style="margin-top: 30px">
<div id="tile" class="ui center aligned grid">
<div id="audio-container"></div><br/>
</div>
</div>
<div id="tile" class="ui container" style="margin-top: 30px">
<div id="tile" class="ui center aligned grid">
<div id="video-container" class="ui segment"></div>
</div>
</div>
</body>
</html>

View File

@@ -1,5 +0,0 @@
aiofiles
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,deepgram,openai,silero,cartesia,soundfile]

View File

@@ -1,55 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
    parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
    parser.add_argument(
        "-u", "--url", type=str, required=False, help="URL of the Daily room to join"
    )
    parser.add_argument(
        "-k",
        "--apikey",
        type=str,
        required=False,
        help="Daily API Key (needed to create an owner token for the room)",
    )

    args, unknown = parser.parse_known_args()

    url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
    key = args.apikey or os.getenv("DAILY_API_KEY")

    if not url:
        raise Exception(
            "No Daily room specified. Use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
        )

    if not key:
        raise Exception(
            "No Daily API key specified. Use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
        )

    daily_rest_helper = DailyRESTHelper(
        daily_api_key=key,
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )

    # Create a meeting token for the given room with an expiration 1 hour in
    # the future.
    expiry_time: float = 60 * 60

    token = await daily_rest_helper.get_token(url, expiry_time)

    return (url, token)

View File

@@ -1,139 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse, RedirectResponse
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
MAX_BOTS_PER_ROOM = 1
# Bot sub-process dict for status reporting and concurrency control
bot_procs = {}
daily_helpers = {}
load_dotenv(override=True)
def cleanup():
    # Clean up function, just to be extra safe
    for entry in bot_procs.values():
        proc = entry[0]
        proc.terminate()
        proc.wait()


@asynccontextmanager
async def lifespan(app: FastAPI):
    aiohttp_session = aiohttp.ClientSession()
    daily_helpers["rest"] = DailyRESTHelper(
        daily_api_key=os.getenv("DAILY_API_KEY", ""),
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )
    yield
    await aiohttp_session.close()
    cleanup()


app = FastAPI(lifespan=lifespan)

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


@app.get("/")
async def start_agent(request: Request):
    print("!!! Creating room")
    room = await daily_helpers["rest"].create_room(DailyRoomParams())
    print(f"!!! Room URL: {room.url}")
    # Ensure the room property is present
    if not room.url:
        raise HTTPException(
            status_code=500,
            detail="Missing 'room' property in request data. Cannot start agent without a target room!",
        )

    # Check if there is already an existing process running in this room
    num_bots_in_room = sum(
        1 for proc in bot_procs.values() if proc[1] == room.url and proc[0].poll() is None
    )
    if num_bots_in_room >= MAX_BOTS_PER_ROOM:
        raise HTTPException(status_code=500, detail=f"Max bot limit reached for room: {room.url}")

    # Get the token for the room
    token = await daily_helpers["rest"].get_token(room.url)

    if not token:
        raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")

    # Spawn a new agent, and join the user session
    # Note: this is mostly for demonstration purposes (refer to 'deployment' in README)
    try:
        proc = subprocess.Popen(
            [f"python3 -m bot -u {room.url} -t {token}"],
            shell=True,
            bufsize=1,
            cwd=os.path.dirname(os.path.abspath(__file__)),
        )
        bot_procs[proc.pid] = (proc, room.url)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")

    return RedirectResponse(room.url)


@app.get("/status/{pid}")
def get_status(pid: int):
    # Look up the subprocess
    proc = bot_procs.get(pid)

    # If the subprocess doesn't exist, return an error
    if not proc:
        raise HTTPException(status_code=404, detail=f"Bot with process id: {pid} not found")

    # Check the status of the subprocess
    if proc[0].poll() is None:
        status = "running"
    else:
        status = "finished"

    return JSONResponse({"bot_id": pid, "status": status})


if __name__ == "__main__":
    import uvicorn

    default_host = os.getenv("HOST", "0.0.0.0")
    default_port = int(os.getenv("FAST_API_PORT", "7860"))

    parser = argparse.ArgumentParser(description="Daily Storyteller FastAPI server")
    parser.add_argument("--host", type=str, default=default_host, help="Host address")
    parser.add_argument("--port", type=int, default=default_port, help="Port number")
    parser.add_argument("--reload", action="store_true", help="Reload code on change")

    config = parser.parse_args()

    uvicorn.run(
        "server:app",
        host=config.host,
        port=config.port,
        reload=config.reload,
    )

View File

@@ -1,13 +0,0 @@
FROM python:3.11-bullseye
# Open port 7860 for http service
ENV FAST_API_PORT=7860
EXPOSE 7860
# Install Python dependencies
COPY *.py .
COPY ./requirements.txt requirements.txt
RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
# Start the FastAPI server
CMD python3 bot_runner.py --port ${FAST_API_PORT}

View File

@@ -1,39 +0,0 @@
# Fly.io deployment example
This project modifies the `bot_runner.py` server to launch a new machine for each user session. This is the recommended approach for production: a deployment that runs each bot as a shell process will quickly run out of system resources under load.
For this example, we are using Daily as a WebRTC transport and provisioning a new room and token for each session. You can use another transport, such as WebSockets, by modifying the `bot.py` and `bot_runner.py` files accordingly.
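As a rough sketch of what "a new machine for each user session" means in practice, the runner builds one machine configuration per session and posts it to the Fly Machines API. The function name `build_worker_config` is hypothetical; the field values mirror the `worker_props` dictionary in this example's `bot_runner.py`.

```python
def build_worker_config(image: str, room_url: str, token: str) -> dict:
    """Build the JSON body used to create one short-lived bot machine."""
    # Passing the command as a list avoids re-splitting arguments on whitespace.
    cmd = ["python3", "bot.py", "-u", room_url, "-t", token]
    return {
        "config": {
            "image": image,  # reuse the image the bot runner itself runs on
            "auto_destroy": True,  # remove the machine once the bot exits
            "init": {"cmd": cmd},
            "restart": {"policy": "no"},  # a finished session is not restarted
            "guest": {"cpu_kind": "shared", "cpus": 1, "memory_mb": 1024},
        }
    }


example = build_worker_config(
    "registry.example/bot:latest",  # placeholder image reference
    "https://yourdomain.daily.co/yourroom",
    "example-token",
)
print(example["config"]["init"]["cmd"])
```

Because `auto_destroy` is set and the restart policy is `no`, each machine lives only as long as its session, which is what keeps resource use bounded under load.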
## Setting up your fly.io deployment
### Create your fly.toml file
You can copy the `example-fly.toml` as a reference. Be sure to change the app name to something unique.
### Create your .env file
Copy the base `env.example` to `.env` and enter the necessary API keys.
`FLY_APP_NAME` should match that in the `fly.toml` file.
### Launch a new fly.io project
`fly launch` or `fly launch --org your-org-name`
### Set the necessary app secrets from your .env
Note: you can do this manually via the fly.io dashboard under the "secrets" sub-section of your deployment (e.g. "https://fly.io/apps/fly-app-name/secrets") or run the following terminal command:
`cat .env | tr '\n' ' ' | xargs flyctl secrets set`
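The one-liner above passes every whitespace-separated word in `.env` to `flyctl`, so blank values and inline comments (both present in `env.example`) end up as arguments. A minimal sketch of a stricter alternative follows; the helper name `env_to_secret_args` is hypothetical, and it assumes values never contain the sequence ` #`.

```python
def env_to_secret_args(text: str) -> list[str]:
    """Turn .env content into KEY=VALUE arguments, skipping comments and empty values."""
    args = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "VALUE # note".
        value = value.split(" #", 1)[0].strip()
        if value:
            args.append(f"{key.strip()}={value}")
    return args


sample = """DAILY_API_KEY=abc123
DAILY_SAMPLE_ROOM_URL= # Enter a Daily room URL
FLY_APP_NAME=my-app  # unique name
"""
command = ["flyctl", "secrets", "set", *env_to_secret_args(sample)]
print(command)
```

The resulting list can be handed to `subprocess.run(command, check=True)` once the output looks right.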
### Deploy your machine
`fly deploy`
## Connecting to your bot
Send a POST request to your running fly.io instance:
`curl --location --request POST 'https://YOUR_FLY_APP_NAME/'`
This request waits until the machine enters the `started` state before returning a room URL and token to join.
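The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names are hypothetical, and the response keys `room_url` and `token` are taken from the JSON that `bot_runner.py` returns in this example.

```python
import json
import urllib.request


def build_start_request(base_url: str) -> urllib.request.Request:
    """Create the POST request that asks the runner to start a bot."""
    return urllib.request.Request(
        base_url,
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def parse_start_response(body: bytes) -> tuple[str, str]:
    """Extract the room URL and user token from the runner's JSON reply."""
    data = json.loads(body)
    return data["room_url"], data["token"]


def start_bot(base_url: str, timeout: float = 120.0) -> tuple[str, str]:
    """Send the request; this blocks until the machine has started."""
    with urllib.request.urlopen(build_start_request(base_url), timeout=timeout) as resp:
        return parse_start_response(resp.read())


# Offline demonstration with a canned reply.
print(parse_start_response(b'{"room_url": "https://example.daily.co/room", "token": "abc"}'))
```

A generous timeout is used because the call does not return until the new machine is up.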

View File

@@ -1,113 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import asyncio
import os
import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
daily_api_key = os.getenv("DAILY_API_KEY", "")
daily_api_url = os.getenv("DAILY_API_URL", "https://api.daily.co/v1")
async def main(room_url: str, token: str):
    transport = DailyTransport(
        room_url,
        token,
        "Chatbot",
        DailyParams(
            api_url=daily_api_url,
            api_key=daily_api_key,
            audio_in_enabled=True,
            audio_out_enabled=True,
            video_out_enabled=False,
            vad_analyzer=SileroVADAnalyzer(),
            transcription_enabled=True,
        ),
    )

    tts = ElevenLabsTTSService(
        api_key=os.getenv("ELEVENLABS_API_KEY", ""),
        voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
    )

    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

    messages = [
        {
            "role": "system",
            "content": "You are Chatbot, a friendly, helpful robot. Your output will be converted to audio so don't include special characters other than '!' or '?' in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by saying hello.",
        },
    ]

    context = OpenAILLMContext(messages)
    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
            transport.input(),
            context_aggregator.user(),
            llm,
            tts,
            transport.output(),
            context_aggregator.assistant(),
        ]
    )

    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True,
        ),
    )

    @transport.event_handler("on_first_participant_joined")
    async def on_first_participant_joined(transport, participant):
        await transport.capture_participant_transcription(participant["id"])
        await task.queue_frames([context_aggregator.user().get_context_frame()])

    @transport.event_handler("on_participant_left")
    async def on_participant_left(transport, participant, reason):
        await task.cancel()

    @transport.event_handler("on_call_state_updated")
    async def on_call_state_updated(transport, state):
        if state == "left":
            # Here we don't want to cancel, we just want to finish sending
            # whatever is queued, so we use an EndFrame().
            await task.queue_frame(EndFrame())

    runner = PipelineRunner()

    await runner.run(task)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Pipecat Bot")
    parser.add_argument("-u", type=str, help="Room URL")
    parser.add_argument("-t", type=str, help="Token")
    config = parser.parse_args()

    asyncio.run(main(config.u, config.t))

View File

@@ -1,209 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from pipecat.transports.services.helpers.daily_rest import (
DailyRESTHelper,
DailyRoomObject,
DailyRoomParams,
DailyRoomProperties,
)
load_dotenv(override=True)
# ------------ Configuration ------------ #
MAX_SESSION_TIME = 5 * 60 # 5 minutes
REQUIRED_ENV_VARS = [
"DAILY_API_KEY",
"OPENAI_API_KEY",
"ELEVENLABS_API_KEY",
"ELEVENLABS_VOICE_ID",
"FLY_API_KEY",
"FLY_APP_NAME",
]
FLY_API_HOST = os.getenv("FLY_API_HOST", "https://api.machines.dev/v1")
FLY_APP_NAME = os.getenv("FLY_APP_NAME", "pipecat-fly-example")
FLY_API_KEY = os.getenv("FLY_API_KEY", "")
FLY_HEADERS = {"Authorization": f"Bearer {FLY_API_KEY}", "Content-Type": "application/json"}
daily_helpers = {}
# ----------------- API ----------------- #
@asynccontextmanager
async def lifespan(app: FastAPI):
    aiohttp_session = aiohttp.ClientSession()
    daily_helpers["rest"] = DailyRESTHelper(
        daily_api_key=os.getenv("DAILY_API_KEY", ""),
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )
    yield
    await aiohttp_session.close()


app = FastAPI(lifespan=lifespan)

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# ----------------- Main ----------------- #


async def spawn_fly_machine(room_url: str, token: str):
    async with aiohttp.ClientSession() as session:
        # Use the same image as the bot runner
        async with session.get(
            f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines", headers=FLY_HEADERS
        ) as r:
            if r.status != 200:
                text = await r.text()
                raise Exception(f"Unable to get machine info from Fly: {text}")

            data = await r.json()
            image = data[0]["config"]["image"]

        # Machine configuration
        cmd = f"python3 bot.py -u {room_url} -t {token}"
        cmd = cmd.split()
        worker_props = {
            "config": {
                "image": image,
                "auto_destroy": True,
                "init": {"cmd": cmd},
                "restart": {"policy": "no"},
                "guest": {"cpu_kind": "shared", "cpus": 1, "memory_mb": 1024},
            },
        }

        # Spawn a new machine instance
        async with session.post(
            f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines", headers=FLY_HEADERS, json=worker_props
        ) as r:
            if r.status != 200:
                text = await r.text()
                raise Exception(f"Problem starting a bot worker: {text}")

            data = await r.json()
            # Wait for the machine to enter the started state
            vm_id = data["id"]

        async with session.get(
            f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines/{vm_id}/wait?state=started",
            headers=FLY_HEADERS,
        ) as r:
            if r.status != 200:
                text = await r.text()
                raise Exception(f"Bot was unable to enter started state: {text}")

    print(f"Machine joined room: {room_url}")
@app.post("/")
async def start_bot(request: Request) -> JSONResponse:
    try:
        data = await request.json()
        # Is this a webhook creation request?
        if "test" in data:
            return JSONResponse({"test": True})
    except Exception:
        # A missing or non-JSON body is fine; continue with defaults.
        pass

    # Use specified room URL, or create a new one if not specified
    room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", "")

    if not room_url:
        params = DailyRoomParams(properties=DailyRoomProperties())
        try:
            room: DailyRoomObject = await daily_helpers["rest"].create_room(params=params)
        except Exception as e:
            raise HTTPException(status_code=500, detail=f"Unable to provision room {e}")
    else:
        # Check passed room URL exists, we should assume that it already has a sip set up
        try:
            room: DailyRoomObject = await daily_helpers["rest"].get_room_from_url(room_url)
        except Exception:
            raise HTTPException(status_code=500, detail=f"Room not found: {room_url}")

    # Give the agent a token to join the session
    token = await daily_helpers["rest"].get_token(room.url, MAX_SESSION_TIME)

    if not room or not token:
        raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room_url}")

    # Launch a new fly.io machine, or run as a shell process (not recommended).
    # Environment values are strings, so treat "", "0", "false" and "no" as disabled.
    run_as_process = os.getenv("RUN_AS_PROCESS", "").strip().lower() not in ("", "0", "false", "no")

    if run_as_process:
        try:
            subprocess.Popen(
                [f"python3 -m bot -u {room.url} -t {token}"],
                shell=True,
                bufsize=1,
                cwd=os.path.dirname(os.path.abspath(__file__)),
            )
        except Exception as e:
            raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
    else:
        try:
            await spawn_fly_machine(room.url, token)
        except Exception as e:
            raise HTTPException(status_code=500, detail=f"Failed to spawn VM: {e}")

    # Grab a token for the user to join with
    user_token = await daily_helpers["rest"].get_token(room.url, MAX_SESSION_TIME)

    return JSONResponse(
        {
            "room_url": room.url,
            "token": user_token,
        }
    )


if __name__ == "__main__":
    # Check environment variables
    for env_var in REQUIRED_ENV_VARS:
        if env_var not in os.environ:
            raise Exception(f"Missing environment variable: {env_var}.")

    parser = argparse.ArgumentParser(description="Pipecat Bot Runner")
    parser.add_argument(
        "--host", type=str, default=os.getenv("HOST", "0.0.0.0"), help="Host address"
    )
    parser.add_argument("--port", type=int, default=os.getenv("PORT", 7860), help="Port number")
    parser.add_argument(
        "--reload", action="store_true", default=False, help="Reload code on change"
    )

    config = parser.parse_args()

    try:
        import uvicorn

        uvicorn.run("bot_runner:app", host=config.host, port=config.port, reload=config.reload)
    except KeyboardInterrupt:
        print("Pipecat runner shutting down...")

View File

@@ -1,8 +0,0 @@
DAILY_API_KEY=
DAILY_SAMPLE_ROOM_URL= # Enter a Daily room URL to use a set room URL each time (useful for local testing)
OPENAI_API_KEY=
ELEVENLABS_API_KEY=
ELEVENLABS_VOICE_ID=
FLY_API_KEY=
FLY_APP_NAME=
RUN_AS_PROCESS= # Spawn fly.io machine for each session or run as local process

View File

@@ -1,25 +0,0 @@
# fly.toml app configuration file generated for pipecat-fly-example on 2024-07-01T15:04:53+01:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#
app = 'pipecat-fly-example'
primary_region = 'sjc'
[build]
[env]
FLY_APP_NAME = 'pipecat-fly-example'
[http_service]
internal_port = 7860
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 0
processes = ['app']
[[vm]]
memory = 512
cpu_kind = 'shared'
cpus = 1

View File

@@ -1,5 +0,0 @@
pipecat-ai[daily,openai,silero]
fastapi
uvicorn
python-dotenv
loguru

View File

@@ -1,94 +0,0 @@
# Modal clone
modal-examples
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
dist/
*.egg-info/
*.egg
.installed.cfg
.eggs/
downloads/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
MANIFEST
# Virtual Environments
venv/
env/
.env
.venv/
ENV/
env.bak/
venv.bak/
# IDE
.idea/
.vscode/
.spyderproject
.spyproject
.ropeproject
# Testing and Coverage
.coverage
.coverage.*
htmlcov/
.pytest_cache/
.tox/
.nox/
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
cover/
# Logs and Databases
*.log
*.db
db.sqlite3
db.sqlite3-journal
pip-log.txt
# System Files
.DS_Store
Thumbs.db
desktop.ini
*.swp
*.swo
*.bak
*.tmp
*~
# Build and Documentation
docs/_build/
.pybuilder/
target/
instance/
.webassets-cache
.pdm.toml
.pdm-python
.pdm-build/
__pypackages__/
# Other
*.mo
*.pot
*.sage.py
.mypy_cache/
.dmypy.json
dmypy.json
.pyre/
.pytype/
cython_debug/
.ipynb_checkpoints

View File

@@ -1,91 +0,0 @@
# Deploying Pipecat to Modal.com
Deployment example for [modal.com](https://www.modal.com). This example demonstrates how to deploy a FastAPI web app to Modal with an RTVI-compatible `/connect` endpoint that launches a Pipecat pipeline in a separate Modal container and returns a room URL and token for the client to join. The `/connect` endpoint also accepts a parameter specifying which Pipecat pipeline to launch: `openai`, `gemini`, or `vllm`. The `vllm` pipeline points to a self-hosted, OpenAI-compatible LLM running a Llama model (neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16) deployed to Modal.
![](diagram.jpg)
# Running this Example
## Install the Modal CLI
Set up a Modal account and install the Modal CLI on your machine if you have not already, following the three steps in their [Getting Started Guide](https://modal.com/docs/guide#getting-started).
## Deploy a self-serve LLM
1. Deploy Modal's OpenAI-compatible LLM service:
```bash
git clone https://github.com/modal-labs/modal-examples
cd modal-examples
modal deploy 06_gpu_and_ml/llm-serving/vllm_inference.py
```
Refer to Modal's guide and example for [Deploying an OpenAI-compatible LLM service with vLLM](https://modal.com/docs/examples/vllm_inference) for more details.
2. Take note of the endpoint URL from the previous step, which will look like:
```
https://{your-workspace}--example-vllm-openai-compatible-serve.modal.run
```
You'll need this for the `bot_vllm.py` file in the next section.
**Note:** The default Modal LLM example uses Llama-3.1 and will shut down after 15 minutes of inactivity. Cold starts take 5-10 minutes. To prepare the service, we recommend visiting the `/docs` endpoint (`https://<Modal workspace>--example-vllm-openai-compatible-serve.modal.run/docs`) for your deployed LLM and waiting for it to fully load before connecting your client.
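The warm-up step can also be scripted rather than done in a browser. The following is a minimal sketch, not part of the example: the function names are hypothetical, the URL pattern is the one shown above, and the default of 40 attempts at 15-second intervals is an assumption chosen to cover roughly the 10-minute upper bound on cold starts.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def docs_url(workspace: str) -> str:
    """Build the /docs URL for the deployed LLM from a Modal workspace name."""
    return f"https://{workspace}--example-vllm-openai-compatible-serve.modal.run/docs"


def http_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status for a GET request, or 0 if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:  # connection errors and timeouts
        return 0


def wait_until_ready(
    url: str,
    fetch: Callable[[str], int] = http_status,
    attempts: int = 40,
    delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the URL until it answers 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


print(docs_url("your-workspace"))
```

Calling `wait_until_ready(docs_url("your-workspace"))` before connecting a client blocks until the service responds. The `fetch` and `sleep` parameters exist only so the loop can be exercised without network access.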
## Deploy FastAPI App and Pipecat pipeline to Modal
1. Set up environment variables
```bash
cd server
cp env.example .env
# Modify .env to provide your service API Keys
```
Alternatively, you can configure your Modal app to use [secrets](https://modal.com/docs/guide/secrets)
2. Update the `modal_url` in `server/src/bot_vllm.py` to point to the URL produced by the self-serve LLM deployment described above.
3. From within the `server` directory, test the app locally:
```bash
modal serve app.py
```
4. Deploy to production
```bash
modal deploy app.py
```
5. Note the endpoint URL produced from this deployment. It will look like:
```
https://{your-workspace}--pipecat-modal-fastapi-app.modal.run
```
You'll need this URL for the client's `app.js` configuration mentioned in its README.
## Launch your bots on Modal
### Option 1: Direct Link
Click the URL displayed after running the `modal serve` or `modal deploy` step to launch an agent. You will be redirected to a Daily room to talk with the launched bot. This will use the OpenAI pipeline.
### Option 2: Connect via an RTVI Client
Follow the instructions provided in the [client folder's README](client/javascript/README.md) for building and running a custom client that connects to your Modal endpoint. The provided client includes a dropdown for choosing which bot pipeline to run.
# Navigating your LLM, server, and Pipecat logs
In your [Modal dashboard](https://modal.com/apps), you should have two Apps listed under Live Apps:
1. `example-vllm-openai-compatible`: This App contains the containers and logs used to run your self-hosted LLM. There will be just one App Function listed: `serve`. Click on this function to view logs for your LLM.
2. `pipecat-modal`: This App contains the containers and logs used to run your `connect` endpoints and Pipecat pipelines. It will list two App Functions:
   1. `fastapi_app`: This function runs the endpoints that your client interacts with to start a new pipeline (`/`, `/connect`, `/status`). Click on this function to see logs for each endpoint hit.
   2. `bot_runner`: This function handles launching and running a bot pipeline. Click on this function to get a list of all pipeline runs and access each run's logs.
# Modal + Pipecat Tips
- In most other Pipecat examples, we use `Popen` to launch the pipeline process from the `/connect` endpoint. In this example, we use a Modal function instead. This allows us to run the pipelines using a separately defined Modal image as well as run each pipeline in an isolated container.
- For the FastAPI container and most common Pipecat pipeline containers, a default CPU-only `debian_slim` image should be all that's required. GPU containers are needed only for self-hosted services.
- To minimize cold starts of the pipeline and reduce latency for users, set `min_containers=1` on the Modal Function that launches the pipeline to ensure at least one warm instance of your function is always available.
- For next steps on running a self-hosted LLM and reducing latency, check out all of [Modal's LLM examples](https://modal.com/docs/examples/vllm_inference).

View File

@@ -1 +0,0 @@
node_modules

View File

@@ -1,29 +0,0 @@
# JavaScript Implementation
Basic implementation using the [Pipecat JavaScript SDK](https://docs.pipecat.ai/client/js/introduction).
## Setup
1. Deploy the Modal server. See the main [README](../../README).
2. Navigate to the `client/javascript` directory:
```bash
cd client/javascript
```
3. Modify the `endpoint` URL in `src/app.js` to point to your deployed Modal endpoint.
4. Install dependencies:
```bash
npm install
```
5. Run the client app:
```bash
npm run dev
```
6. Visit http://localhost:5173 in your browser.
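The URL to paste in step 3 follows the shape of the placeholder in `src/app.js` (`https://<your-workspace>--pipecat-modal-fastapi-app.modal.run/connect`). As a rough illustration in Python, the helper below assembles that URL from its parts. `modal_endpoint` is a hypothetical name, and the pattern is inferred from the placeholder rather than taken from Modal's documentation, so treat the URL reported when you deploy as authoritative.

```python
def modal_endpoint(
    workspace: str,
    app_name: str = "pipecat-modal",
    function_name: str = "fastapi_app",
    path: str = "/connect",
) -> str:
    """Assemble the endpoint URL in the shape used by the src/app.js placeholder."""
    # Assumption: underscores in the function name appear as hyphens in the hostname.
    label = f"{app_name}-{function_name}".replace("_", "-")
    return f"https://{workspace}--{label}.modal.run{path}"


print(modal_endpoint("your-workspace"))
```

Running the snippet with your own workspace name gives the string to use as the `endpoint` value in `src/app.js`.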

@@ -1,49 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>AI Chatbot</title>
</head>
<body>
<div class="container">
<div class="status-bar">
<div class="status">
Status: <span id="connection-status">Disconnected</span>
</div>
<div class="controls">
<select id="bot-selector">
<option value="openai">OpenAI</option>
<option value="gemini">Gemini</option>
<option value="vllm">Llama</option>
</select>
<button id="connect-btn">Connect</button>
<button id="disconnect-btn" disabled>Disconnect</button>
</div>
</div>
<div class="main-content">
<div class="bot-container">
<div id="bot-video-container"></div>
<audio id="bot-audio" autoplay></audio>
</div>
</div>
<div class="device-bar">
<div class="device-controls">
<select id="device-selector"></select>
<button id="mic-toggle-btn">Mute Mic</button>
</div>
</div>
<div class="debug-panel">
<h3>Debug Info</h3>
<div id="debug-log"></div>
</div>
</div>
<script type="module" src="/src/app.js"></script>
<link rel="stylesheet" href="/src/style.css" />
</body>
</html>

File diff suppressed because it is too large.

@@ -1,21 +0,0 @@
{
"name": "client",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"devDependencies": {
"vite": "^6.3.5"
},
"dependencies": {
"@pipecat-ai/client-js": "^1.0.0",
"@pipecat-ai/daily-transport": "^1.0.0"
}
}

@@ -1,376 +0,0 @@
/**
* Copyright (c) 2024-2025, Daily
*
* SPDX-License-Identifier: BSD 2-Clause License
*/
/**
* Pipecat Client Implementation
*
* This client connects to an RTVI-compatible bot server using WebRTC (via Daily).
* It handles audio/video streaming and manages the connection lifecycle.
*
* Requirements:
* - A running RTVI bot server (defaults to http://localhost:7860)
* - The server must implement the /connect endpoint that returns Daily.co room credentials
* - Browser with WebRTC support
*/
import { PipecatClient, RTVIEvent } from '@pipecat-ai/client-js';
import { DailyTransport } from '@pipecat-ai/daily-transport';
/**
* ChatbotClient handles the connection and media management for a real-time
* voice and video interaction with an AI bot.
*/
class ChatbotClient {
constructor() {
// Initialize client state
this.pcClient = null;
this.setupDOMElements();
this.initializeClientAndTransport();
this.setupEventListeners();
}
/**
* Set up references to DOM elements and create necessary media elements
*/
setupDOMElements() {
// Get references to UI control elements
this.connectBtn = document.getElementById('connect-btn');
this.disconnectBtn = document.getElementById('disconnect-btn');
this.statusSpan = document.getElementById('connection-status');
this.debugLog = document.getElementById('debug-log');
this.botVideoContainer = document.getElementById('bot-video-container');
this.deviceSelector = document.getElementById('device-selector');
// Create an audio element for bot's voice output
this.botAudio = document.createElement('audio');
this.botAudio.autoplay = true;
this.botAudio.playsInline = true;
document.body.appendChild(this.botAudio);
}
/**
* Set up event listeners for connect/disconnect buttons
*/
setupEventListeners() {
this.connectBtn.addEventListener('click', () => this.connect());
this.disconnectBtn.addEventListener('click', () => this.disconnect());
// Populate device selector
this.pcClient.getAllMics().then((mics) => {
console.log('Available mics:', mics);
mics.forEach((device) => {
const option = document.createElement('option');
option.value = device.deviceId;
option.textContent = device.label || `Microphone ${device.deviceId}`;
this.deviceSelector.appendChild(option);
});
});
this.deviceSelector.addEventListener('change', (event) => {
const selectedDeviceId = event.target.value;
console.log('Selected device ID:', selectedDeviceId);
this.pcClient.updateMic(selectedDeviceId);
});
// Handle mic mute/unmute toggle
const micToggleBtn = document.getElementById('mic-toggle-btn');
micToggleBtn.addEventListener('click', () => {
const micEnabled = this.pcClient.isMicEnabled;
// Flip the mic state and label the button with the next available action
this.pcClient.enableMic(!micEnabled);
micToggleBtn.textContent = micEnabled ? 'Unmute Mic' : 'Mute Mic';
console.log(micEnabled ? 'Mic muted' : 'Mic unmuted');
});
}
/**
* Set up the Pipecat client and Daily transport
*/
async initializeClientAndTransport() {
// Initialize the Pipecat client with a DailyTransport and our configuration
this.pcClient = new PipecatClient({
transport: new DailyTransport(),
enableMic: true, // Enable microphone for user input
enableCam: false,
callbacks: {
// Handle connection state changes
onConnected: () => {
this.updateStatus('Connected');
this.connectBtn.disabled = true;
this.disconnectBtn.disabled = false;
this.log('Client connected');
},
onDisconnected: () => {
this.updateStatus('Disconnected');
this.connectBtn.disabled = false;
this.disconnectBtn.disabled = true;
this.log('Client disconnected');
},
// Handle transport state changes
onTransportStateChanged: (state) => {
this.updateStatus(`Transport: ${state}`);
this.log(`Transport state changed: ${state}`);
if (state === 'connecting') {
window.startTime = Date.now();
}
if (state === 'ready') {
this.setupMediaTracks();
console.warn('TIME TO BOT READY:', Date.now() - window.startTime);
}
},
// Handle bot connection events
onBotConnected: (participant) => {
this.log(`Bot connected: ${JSON.stringify(participant)}`);
},
onBotDisconnected: (participant) => {
this.log(`Bot disconnected: ${JSON.stringify(participant)}`);
},
onBotReady: (data) => {
this.log(`Bot ready: ${JSON.stringify(data)}`);
this.setupMediaTracks();
},
// Transcript events
onUserTranscript: (data) => {
// Only log final transcripts
if (data.final) {
this.log(`User: ${data.text}`);
}
},
onBotTranscript: (data) => {
this.log(`Bot: ${data.text}`);
},
// Error handling
onMessageError: (error) => {
console.log('Message error:', error);
},
onMicUpdated: (data) => {
console.log('Mic updated:', data);
this.deviceSelector.value = data.deviceId;
},
onError: (error) => {
console.log('Error:', JSON.stringify(error));
},
},
});
// Set up listeners for media track events
this.setupTrackListeners();
await this.pcClient.initDevices();
window.client = this.pcClient;
}
/**
* Add a timestamped message to the debug log
*/
log(message) {
const entry = document.createElement('div');
entry.textContent = `${new Date().toISOString()} - ${message}`;
// Add styling based on message type
if (message.startsWith('User: ')) {
entry.style.color = '#2196F3'; // blue for user
} else if (message.startsWith('Bot: ')) {
entry.style.color = '#4CAF50'; // green for bot
}
this.debugLog.appendChild(entry);
this.debugLog.scrollTop = this.debugLog.scrollHeight;
console.log(message);
}
/**
* Update the connection status display
*/
updateStatus(status) {
this.statusSpan.textContent = status;
this.log(`Status: ${status}`);
}
/**
* Check for available media tracks and set them up if present
* This is called when the bot is ready or when the transport state changes to ready
*/
setupMediaTracks() {
if (!this.pcClient) return;
// Get current tracks from the client
const tracks = this.pcClient.tracks();
// Set up any available bot tracks
if (tracks.bot?.audio) {
this.setupAudioTrack(tracks.bot.audio);
}
if (tracks.bot?.video) {
this.setupVideoTrack(tracks.bot.video);
}
}
/**
* Set up listeners for track events (start/stop)
* This handles new tracks being added during the session
*/
setupTrackListeners() {
if (!this.pcClient) return;
// Listen for new tracks starting
this.pcClient.on(RTVIEvent.TrackStarted, (track, participant) => {
// Only handle non-local (bot) tracks
if (!participant?.local) {
if (track.kind === 'audio') {
this.setupAudioTrack(track);
} else if (track.kind === 'video') {
this.setupVideoTrack(track);
}
this.log(
`Track started event: ${track.kind} from ${
participant?.name || 'unknown'
}`
);
} else {
this.log('Local mic unmuted');
}
});
// Listen for tracks stopping
this.pcClient.on(RTVIEvent.TrackStopped, (track, participant) => {
if (participant?.local) {
this.log('Local mic muted');
return;
}
this.log(
`Track stopped event: ${track.kind} from ${
participant?.name || 'unknown'
}`
);
});
}
/**
* Set up an audio track for playback
* Handles both initial setup and track updates
*/
setupAudioTrack(track) {
this.log('Setting up audio track');
// Check if we're already playing this track
if (this.botAudio.srcObject) {
const oldTrack = this.botAudio.srcObject.getAudioTracks()[0];
if (oldTrack?.id === track.id) return;
}
// Create a new MediaStream with the track and set it as the audio source
this.botAudio.srcObject = new MediaStream([track]);
}
/**
* Set up a video track for display
* Handles both initial setup and track updates
*/
setupVideoTrack(track) {
this.log('Setting up video track');
const videoEl = document.createElement('video');
videoEl.autoplay = true;
videoEl.playsInline = true;
videoEl.muted = true;
videoEl.style.width = '100%';
videoEl.style.height = '100%';
videoEl.style.objectFit = 'cover';
// Check if we're already displaying this track
if (this.botVideoContainer.querySelector('video')?.srcObject) {
const oldTrack = this.botVideoContainer
.querySelector('video')
.srcObject.getVideoTracks()[0];
if (oldTrack?.id === track.id) return;
}
// Create a new MediaStream with the track and set it as the video source
videoEl.srcObject = new MediaStream([track]);
this.botVideoContainer.innerHTML = '';
this.botVideoContainer.appendChild(videoEl);
}
/**
* Initialize and connect to the bot
* This initializes devices and establishes the connection to the selected bot
*/
async connect() {
try {
const botSelector = document.getElementById('bot-selector');
const selectedBot = botSelector.value;
// Initialize audio/video devices
this.log('Initializing devices...');
await this.pcClient.initDevices();
// Connect to the bot
this.log(`Connecting to bot: ${selectedBot}`);
await this.pcClient.connect({
// REPLACE WITH YOUR MODAL URL ENDPOINT
endpoint:
'https://<your-workspace>--pipecat-modal-fastapi-app.modal.run/connect',
requestData: {
bot_name: selectedBot,
},
});
this.log('Connection complete');
} catch (error) {
// Handle any errors during connection
console.error('Connection error:', error);
this.log(`Error connecting: ${JSON.stringify(error.message)}`);
this.log(`Error stack: ${error.stack}`);
this.updateStatus('Error');
// Clean up if there's an error
if (this.pcClient) {
try {
await this.pcClient.disconnect();
} catch (disconnectError) {
this.log(`Error during disconnect: ${disconnectError.message}`);
}
}
}
}
/**
* Disconnect from the bot and clean up media resources
*/
async disconnect() {
if (this.pcClient) {
try {
// Disconnect the Pipecat client
await this.pcClient.disconnect();
// Clean up audio
if (this.botAudio.srcObject) {
this.botAudio.srcObject.getTracks().forEach((track) => track.stop());
this.botAudio.srcObject = null;
}
// Clean up video
if (this.botVideoContainer.querySelector('video')?.srcObject) {
const video = this.botVideoContainer.querySelector('video');
video.srcObject.getTracks().forEach((track) => track.stop());
video.srcObject = null;
}
this.botVideoContainer.innerHTML = '';
} catch (error) {
this.log(`Error disconnecting: ${error.message}`);
}
}
}
}
// Initialize the client when the page loads
window.addEventListener('DOMContentLoaded', () => {
new ChatbotClient();
});

@@ -1,135 +0,0 @@
body {
margin: 0;
padding: 20px;
font-family: Arial, sans-serif;
background-color: #f0f0f0;
}
.container {
max-width: 1200px;
margin: 0 auto;
}
.status-bar,
.device-bar {
display: flex;
justify-content: space-between;
align-items: center;
padding: 10px;
background-color: #fff;
border-radius: 8px;
margin-bottom: 20px;
}
.controls,
.device-controls {
display: flex;
align-items: center;
gap: 10px; /* Adds spacing between elements */
}
.device-controls {
margin-left: auto;
}
.controls button,
.device-controls button {
padding: 8px 16px;
margin-left: 10px;
border: none;
border-radius: 4px;
cursor: pointer;
}
#bot-selector,
#device-selector {
padding: 8px 16px;
padding-right: 40px;
border: none;
border-radius: 4px;
background-color: #6c757d; /* Gray background */
color: white; /* White text */
cursor: pointer;
appearance: none; /* Removes default browser styling for dropdowns */
background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 24 24' fill='white'%3E%3Cpath d='M7 10l5 5 5-5z'/%3E%3C/svg%3E"); /* Custom arrow */
background-repeat: no-repeat;
background-position: right 8px center; /* Position the arrow */
}
#bot-selector:focus,
#device-selector:focus {
outline: none;
box-shadow: 0 0 4px rgba(0, 0, 0, 0.3); /* Add a subtle focus effect */
}
#connect-btn {
background-color: #4caf50;
color: white;
}
#disconnect-btn {
background-color: #f44336;
color: white;
}
#mic-toggle-btn {
}
button:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.main-content {
background-color: #fff;
border-radius: 8px;
padding: 20px;
margin-bottom: 20px;
}
.bot-container {
display: flex;
flex-direction: column;
align-items: center;
}
#bot-video-container {
width: 640px;
height: 360px;
background-color: #e0e0e0;
border-radius: 8px;
margin: 20px auto;
overflow: hidden;
display: flex;
align-items: center;
justify-content: center;
}
#bot-video-container video {
width: 100%;
height: 100%;
object-fit: cover;
}
.debug-panel {
background-color: #fff;
border-radius: 8px;
padding: 20px;
}
.debug-panel h3 {
margin: 0 0 10px 0;
font-size: 16px;
font-weight: bold;
}
#debug-log {
height: 200px;
overflow-y: auto;
background-color: #f8f8f8;
padding: 10px;
border-radius: 4px;
font-family: monospace;
font-size: 12px;
line-height: 1.4;
}

Binary file not shown (114 KiB).

@@ -1,307 +0,0 @@
"""modal_example.
This module shows a simple example of how to deploy a bot using Modal and FastAPI.
It includes:
- FastAPI endpoints for starting agents and checking bot statuses.
- Dynamic loading of bot implementations.
- Use of a Daily transport for bot communication.
"""
#
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import importlib
import os
from contextlib import asynccontextmanager
from typing import Any, Dict, Literal
import aiohttp
import modal
from fastapi import APIRouter, FastAPI, HTTPException
from fastapi.responses import JSONResponse, RedirectResponse
from pydantic import BaseModel

# container specifications for the FastAPI web server
web_image = (
    modal.Image.debian_slim(python_version="3.13")
    .pip_install_from_requirements("requirements.txt")
    .pip_install("pipecat-ai[daily]")
    .add_local_dir("src", remote_path="/root/src")
)

# container specifications for the Pipecat pipeline
bot_image = (
    modal.Image.debian_slim(python_version="3.13")
    .apt_install("ffmpeg")
    .pip_install_from_requirements("requirements.txt")
    .pip_install("pipecat-ai[daily,elevenlabs,openai,silero,google]")
    .add_local_dir("src", remote_path="/root/src")
)

app = modal.App("pipecat-modal", secrets=[modal.Secret.from_dotenv()])

router = APIRouter()

bot_jobs = {}
daily_helpers = {}

# Names of all supported bot implementations
# These correspond to the bot files in the src directory
BotName = Literal["openai", "gemini", "vllm"]


def cleanup():
    """Cleanup function to terminate all bot processes.

    Called during server shutdown.
    """
    for entry in bot_jobs.values():
        func = modal.FunctionCall.from_id(entry[0])
        if func:
            func.cancel()

def get_bot_file(bot_name: BotName) -> str:
    """Retrieve the bot file name corresponding to the provided bot_name.

    Args:
        bot_name (BotName): The name of the bot (e.g., 'openai', 'gemini', 'vllm').

    Returns:
        str: The file name corresponding to the bot implementation.

    Raises:
        ValueError: If the bot name is invalid or not supported.
    """
    # bot_implementation = os.getenv("BOT_IMPLEMENTATION", "openai").lower().strip()
    bot_implementation = bot_name.lower().strip()
    if not bot_implementation:
        bot_implementation = "openai"
    if bot_implementation not in ["openai", "gemini", "vllm"]:
        raise ValueError(
            f"Invalid BOT_IMPLEMENTATION: {bot_implementation}. Must be 'openai', 'gemini', or 'vllm'"
        )
    return f"bot_{bot_implementation}"


def get_runner(path: str, bot_file: str) -> callable:
    """Dynamically import the run_bot function based on the bot name.

    Args:
        path (str): The path to the bot files (e.g., 'src').
        bot_file (str): The module name of the bot implementation (e.g., 'bot_openai', 'bot_gemini', 'bot_vllm').

    Returns:
        function: The run_bot function from the specified bot module.

    Raises:
        ImportError: If the specified bot module or run_bot function is not found.
    """
    try:
        # Dynamically construct the module name
        module_name = f"{path}.{bot_file}"
        # Import the module
        module = importlib.import_module(module_name)
        # Get the run_bot function from the module
        return getattr(module, "run_bot")
    except (ImportError, AttributeError) as e:
        raise ImportError(f"Failed to import run_bot from {module_name}: {e}")

async def create_room_and_token() -> tuple[str, str]:
    """Create a Daily room and generate an authentication token.

    This function checks for existing room URL and token in the environment variables.
    If not found, it creates a new room using the Daily API and generates a token for it.

    Returns:
        tuple[str, str]: A tuple containing the room URL and the authentication token.

    Raises:
        HTTPException: If room creation or token generation fails.
    """
    from pipecat.transports.services.helpers.daily_rest import DailyRoomParams

    room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", None)
    token = os.getenv("DAILY_SAMPLE_ROOM_TOKEN", None)
    if not room_url:
        room = await daily_helpers["rest"].create_room(DailyRoomParams())
        if not room.url:
            raise HTTPException(status_code=500, detail="Failed to create room")
        room_url = room.url

        token = await daily_helpers["rest"].get_token(room_url)
        if not token:
            raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room_url}")

    return room_url, token


@app.function(image=bot_image, min_containers=1)
async def bot_runner(room_url, token, bot_name: BotName = "openai"):
    """Launch the provided bot process, providing the given room URL and token for the bot to join.

    Args:
        room_url (str): The URL of the Daily room where the bot and client will communicate.
        token (str): The authentication token for the room.
        bot_name (BotName): The name of the bot implementation to use. Defaults to "openai".

    Raises:
        HTTPException: If the bot pipeline fails to start.
    """
    try:
        path = "src"
        bot_file = get_bot_file(bot_name)
        run_bot = get_runner(path, bot_file)
        print(f"Starting bot process: {bot_file} -u {room_url} -t {token}")
        await run_bot(room_url, token)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to start bot pipeline: {e}")

@asynccontextmanager
async def lifespan(app: FastAPI):
    """FastAPI lifespan manager that handles startup and shutdown tasks.

    - Creates aiohttp session
    - Initializes Daily API helper
    - Cleans up resources on shutdown
    """
    from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper

    aiohttp_session = aiohttp.ClientSession()
    daily_helpers["rest"] = DailyRESTHelper(
        daily_api_key=os.getenv("DAILY_API_KEY", ""),
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )
    yield
    await aiohttp_session.close()
    cleanup()


class ConnectData(BaseModel):
    """Data provided by client to specify the bot pipeline.

    Attributes:
        bot_name (BotName): The name of the bot to connect to. Defaults to "openai".
    """

    bot_name: BotName = "openai"


async def start(data: ConnectData):
    """Internal method to start a bot agent and return the room URL and token.

    Args:
        data (ConnectData): The data containing the bot name to use.

    Returns:
        tuple[str, str]: A tuple containing the room URL and token.
    """
    room_url, token = await create_room_and_token()
    launch_bot_func = modal.Function.from_name("pipecat-modal", "bot_runner")
    # spawn() returns a FunctionCall handle; store its string ID so that
    # cleanup() and /status/{fid} can look the call up with FunctionCall.from_id()
    function_call = launch_bot_func.spawn(room_url, token, data.bot_name)
    function_id = function_call.object_id
    bot_jobs[function_id] = (function_id, room_url)
    return room_url, token

@router.get("/")
async def start_agent():
    """A user endpoint for launching a bot agent and redirecting to the created room URL.

    This function retrieves the bot implementation from the environment,
    starts the bot agent, and redirects the user to the room URL to
    interact with the bot through a Daily Prebuilt Interface.

    Returns:
        RedirectResponse: A response that redirects to the room URL.
    """
    bot_name = os.getenv("BOT_IMPLEMENTATION", "openai").lower().strip()
    print(f"Starting bot: {bot_name}")
    room_url, token = await start(ConnectData(bot_name=bot_name))
    return RedirectResponse(room_url)


@router.post("/connect")
async def rtvi_connect(data: ConnectData) -> Dict[Any, Any]:
    """A user endpoint for launching a bot agent and retrieving the room/token credentials.

    This function retrieves the bot implementation from the request, if provided,
    starts the bot agent, and returns the room URL and token for the bot. This allows the
    client to then connect to the bot using their own RTVI interface.

    Args:
        data (ConnectData): Optional. The data containing the bot name to use.

    Returns:
        Dict[Any, Any]: A dictionary containing the room URL and token.
    """
    # Fall back to the environment default before logging the chosen bot
    if not data.bot_name:
        data.bot_name = os.getenv("BOT_IMPLEMENTATION", "openai").lower().strip()
    print(f"Starting bot: {data.bot_name}")
    room_url, token = await start(data)
    return {"room_url": room_url, "token": token}


@router.get("/status/{fid}")
def get_status(fid: str):
    """Retrieve the status of a bot process by its function ID.

    Args:
        fid (str): The function ID of the bot process.

    Returns:
        JSONResponse: A JSON response containing the bot's status and result code.

    Raises:
        HTTPException: If the bot process with the given ID is not found.
    """
    func = modal.FunctionCall.from_id(fid)
    if not func:
        raise HTTPException(status_code=404, detail=f"Bot with process id: {fid} not found")

    try:
        result = func.get(timeout=0)
        return JSONResponse({"bot_id": fid, "status": "finished", "code": result})
    except modal.exception.OutputExpiredError:
        return JSONResponse({"bot_id": fid, "status": "finished", "code": 404})
    except TimeoutError:
        return JSONResponse({"bot_id": fid, "status": "running", "code": 202})

@app.function(image=web_image, min_containers=1)
@modal.concurrent(max_inputs=1)
@modal.asgi_app()
def fastapi_app():
    """Create and configure the FastAPI application.

    This function initializes the FastAPI app with middleware, routes, and lifespan management.
    It is decorated to be used as a Modal ASGI app.
    """
    from fastapi.middleware.cors import CORSMiddleware

    # Initialize FastAPI app
    web_app = FastAPI(lifespan=lifespan)
    web_app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )
    # Include the endpoints registered on the module-level router
    web_app.include_router(router)
    return web_app

@@ -1,14 +0,0 @@
DAILY_API_KEY=
# determines which bot file to default to: 'openai', 'gemini', or 'vllm'
BOT_IMPLEMENTATION=openai
# needed for the openai bot pipeline
OPENAI_API_KEY=
ELEVENLABS_API_KEY=
# needed for the gemini live bot pipeline
GOOGLE_API_KEY=
# needed if you modified the API Key for your self-hosted LLM
VLLM_API_KEY=

@@ -1,3 +0,0 @@
python-dotenv==1.0.1
modal==1.0.5
fastapi[all]

Binary file not shown (759 KiB).

Binary file not shown (884 KiB).

Binary file not shown (876 KiB).

Binary file not shown (881 KiB).

Some files were not shown because too many files have changed in this diff.