Compare commits

...

1446 Commits

Author SHA1 Message Date
Aleix Conchillo Flaqué
41695806e8 process SystemFrames in a queue as high priority frames 2025-05-07 17:05:35 -07:00
Mark Backman
7280e390d9 Merge pull request #1774 from pipecat-ai/mb/moondream-ex-server
Add load_dotenv to moondream example server
2025-05-07 19:02:30 -04:00
Mark Backman
4efc3f0a39 Merge pull request #1775 from pipecat-ai/mb/patient-ex-env
Add load_dotenv to patient-intake server file
2025-05-07 19:02:20 -04:00
Mark Backman
cb7e7a8aa3 Add load_dotenv to patient-intake server file 2025-05-07 18:40:04 -04:00
Mark Backman
9136402846 Add load_dotenv to moondream example server 2025-05-07 18:29:27 -04:00
Aleix Conchillo Flaqué
260fc76137 Merge pull request #1773 from pipecat-ai/aleix/pipecat-0.0.67
update CHANGELOG for 0.0.67
2025-05-07 15:05:55 -07:00
Aleix Conchillo Flaqué
7cfb9a4d15 update CHANGELOG for 0.0.67 2025-05-07 14:59:16 -07:00
Mark Backman
2089e0c974 Merge pull request #1768 from pipecat-ai/mb/update-observers
Add DebugLogObserver
2025-05-07 17:30:49 -04:00
Mark Backman
9e0b4fe5d1 Replace list with tuple 2025-05-07 17:21:09 -04:00
Mark Backman
75ce632f84 Add DebugLogObserver 2025-05-07 17:21:08 -04:00
Mark Backman
efeb96c4e8 Remove unused imports 2025-05-07 17:20:42 -04:00
kompfner
fb5438e9c2 Merge pull request #1770 from pipecat-ai/pk/amazon-nova-sonic-interruption-reliability
AWS Nova Sonic service - make interruption handling more reliable, in…
2025-05-07 17:16:06 -04:00
Mark Backman
7da9f66e1c Merge pull request #1761 from pipecat-ai/mb/elevenlabs-context-id
Update ElevenLabsTTSService to use the new websocket API
2025-05-07 17:12:06 -04:00
Mark Backman
9e16e3d614 Update ElevenLabsTTSService to use the new websocket API 2025-05-07 17:09:52 -04:00
Paul Kompfner
84d040c6d0 AWS Nova Sonic service - make interruption handling more reliable, in terms of:
- not getting the conversation into a "stuck" state
- not losing assistant text that should've made it into the context
2025-05-07 16:34:18 -04:00
Mark Backman
f3e0beb8f1 Merge pull request #1762 from pipecat-ai/iss-1734-rtvi-function-call-breakage
Revert breaking change in RTVI protocol for function calling
2025-05-07 15:25:22 -04:00
Aleix Conchillo Flaqué
e00a1196ef Merge pull request #1767 from pipecat-ai/aleix/daily-python-0.18.2
pyproject: update daily-python to 0.18.2
2025-05-07 12:19:59 -07:00
Aleix Conchillo Flaqué
3867c0f8e7 Merge pull request #1766 from pipecat-ai/aleix/daily-fix-multiple-audio-video-sources
fix multiple audio video sources
2025-05-07 12:19:46 -07:00
Aleix Conchillo Flaqué
cdf0953722 pyproject: update daily-python to 0.18.2 2025-05-07 11:56:36 -07:00
Aleix Conchillo Flaqué
ed00f7d071 add video_source field to UserImageRequestFrame 2025-05-07 11:50:21 -07:00
Aleix Conchillo Flaqué
a3038afa02 DailyTransport: fix multiple audio/video sources 2025-05-07 11:50:00 -07:00
kompfner
f9ca0b8cc6 Merge pull request #1704 from pipecat-ai/pk/amazon-nova-sonic
Amazon Nova Sonic LLM service
2025-05-07 14:45:28 -04:00
Paul Kompfner
2920aa5af4 [WIP] AWS Nova Sonic service - pull AWS Nova Sonic support out of the aws optional dependency in pyproject.toml and into its own aws-nova-sonic optional dependency. That's because it requires Python >= 3.12, a higher version than the base project's 3.10. This change allows anyone using any of the other AWS services (including our own unit tests) to continue using the lower Python version. 2025-05-07 14:32:32 -04:00
Paul Kompfner
93c9cc4a0e [WIP] AWS Nova Sonic service - minor fix 2025-05-07 13:54:06 -04:00
Paul Kompfner
b53f9235e4 [WIP] AWS Nova Sonic service - remove unnecessary _context_available state, instead just relying on the presence of _context 2025-05-07 13:54:06 -04:00
Paul Kompfner
1491462d15 [WIP] AWS Nova Sonic service - remove _handling_bot_stopped_speaking, which no longer seems to be necessary; I'm no longer observing back-to-back BotStoppedSpeaking frames 2025-05-07 13:54:06 -04:00
Paul Kompfner
c78f779800 [WIP] AWS Nova Sonic service - log an error message if you try to use AWS Nova Sonic without the proper dependency (e.g. without having done pip install pipecat-ai[aws]) 2025-05-07 13:54:06 -04:00
Paul Kompfner
b013e375fb [WIP] AWS Nova Sonic service - simplify a bit of logic (and do the same simplification in the OpenAI Realtime service) 2025-05-07 13:54:06 -04:00
Paul Kompfner
52036138c1 [WIP] AWS Nova Sonic service - remove unnecessary (no-op) code 2025-05-07 13:54:06 -04:00
Paul Kompfner
4ba9a42861 [WIP] AWS Nova Sonic service - add more accurate typing 2025-05-07 13:54:06 -04:00
Paul Kompfner
27bff7a759 [WIP] AWS Nova Sonic service - fix comment 2025-05-07 13:54:06 -04:00
Paul Kompfner
896f8d85f7 [WIP] AWS Nova Sonic service - remove out-of-date TODO comment 2025-05-07 13:54:06 -04:00
Paul Kompfner
ed06cdd2c7 [WIP] AWS Nova Sonic service - add CHANGELOG entry 2025-05-07 13:54:02 -04:00
Paul Kompfner
8473647269 [WIP] AWS Nova Sonic service - update persistent-context example to better avoid saving "transitional", as opposed to meaningful, context messages 2025-05-07 13:52:51 -04:00
Paul Kompfner
5579145a06 [WIP] AWS Nova Sonic service - post-rebase, update examples to play nicely with recent pipecat changes 2025-05-07 13:52:51 -04:00
Paul Kompfner
35848d10b3 [WIP] AWS Nova Sonic service - remove various TODO comments 2025-05-07 13:52:51 -04:00
Paul Kompfner
c7e223e85a [WIP] AWS Nova Sonic service - remove print statements in favor of logger 2025-05-07 13:52:51 -04:00
Paul Kompfner
885b2d1d2f [WIP] AWS Nova Sonic service - make parameters configurable 2025-05-07 13:52:51 -04:00
Paul Kompfner
73020be511 [WIP] AWS Nova Sonic service - minor fix: only try to read received JSON if we have it 2025-05-07 13:52:51 -04:00
Paul Kompfner
d388c057c0 [WIP] AWS Nova Sonic service - recover from unwanted disconnection due to an error 2025-05-07 13:52:51 -04:00
Paul Kompfner
c4d0f91a7f [WIP] AWS Nova Sonic service - remove some old code that was accidentally still there, possibly sending a duplicate system instruction 2025-05-07 13:52:51 -04:00
Paul Kompfner
467233be04 [WIP] AWS Nova Sonic service - support multi-line system prompt 2025-05-07 13:52:51 -04:00
Paul Kompfner
2b02d08f4c [WIP] AWS Nova Sonic service - add comments to examples pointing out the us-east-1 is the only supported region so far 2025-05-07 13:52:51 -04:00
Paul Kompfner
9fe265ea64 [WIP] AWS Nova Sonic service - implement ability to persist and load conversations 2025-05-07 13:52:51 -04:00
Paul Kompfner
cc1f4ba81c [WIP] AWS Nova Sonic service - add a hacky way of programmatically triggering an assistant response 2025-05-07 13:52:51 -04:00
Paul Kompfner
3784bdbd27 [WIP] AWS Nova Sonic service - in our hacky direct manipulation of the context, aggregate assistant text rather than recording every chunk as a separate message 2025-05-07 13:52:51 -04:00
Paul Kompfner
4ffdc3b77c [WIP] AWS Nova Sonic service - do hacky direct manipulation of the context for now, since I can't seem to get assistant context aggregation working properly with frames, grr 2025-05-07 13:52:51 -04:00
Paul Kompfner
38c9fa681a [WIP] AWS Nova Sonic service - Protect against back-to-back BotStoppedSpeaking calls, which I've observed 2025-05-07 13:52:51 -04:00
Paul Kompfner
c477039954 [WIP] AWS Nova Sonic service - just for safety, add a short delay after BotStoppedSpeaking before sending LLMFullResponseEndFrame + TTSStoppedFrame, to give a bit of leeway for the LLM to deliver the "FINAL" text block describing what was said 2025-05-07 13:52:51 -04:00
Paul Kompfner
d6ef3d64ac [WIP] AWS Nova Sonic service - fix context problems of double-counting LLM text, and mis-categorizing user text as LLM text 2025-05-07 13:52:51 -04:00
Paul Kompfner
6938152db6 [WIP] AWS Nova Sonic service - fix comment 2025-05-07 13:52:51 -04:00
Paul Kompfner
2154db07f0 [WIP] AWS Nova Sonic service - remove unnecessary error log 2025-05-07 13:52:51 -04:00
Paul Kompfner
5e0803479e [WIP] AWS Nova Sonic service - add send_transcription_frames option 2025-05-07 13:52:51 -04:00
Paul Kompfner
3960c604a4 [WIP] AWS Nova Sonic service - fix empty assistant conversation history item in the context after tool use 2025-05-07 13:52:51 -04:00
Paul Kompfner
394648f1c9 [WIP] AWS Nova Sonic service - fix user utterances not making it into the context 2025-05-07 13:52:51 -04:00
Paul Kompfner
da5c4953d5 [WIP] AWS Nova Sonic service - allow passing in tools into initializer 2025-05-07 13:52:51 -04:00
Paul Kompfner
2b7e1cb5b1 [WIP] AWS Nova Sonic service - add tool calling 2025-05-07 13:52:51 -04:00
Paul Kompfner
f182eafb40 [WIP] AWS Nova Sonic service - add ability to pass in OpenAILLMContext 2025-05-07 13:52:51 -04:00
Paul Kompfner
9f7f42e885 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
9b8bce1914 [WIP] AWS Nova Sonic service - add voice_id 2025-05-07 13:52:51 -04:00
Paul Kompfner
96d05e12fc [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
68c1069548 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
5b64613f65 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
1f9baefba8 [WIP] AWS Nova Sonic service - added stubs for handling interruption and user-started-speaking frames 2025-05-07 13:52:51 -04:00
Paul Kompfner
0c255d2618 [WIP] AWS Nova Sonic service - added TTSTextFrame and reworked/cleaned up some bookkeeping logic 2025-05-07 13:52:51 -04:00
Paul Kompfner
a38206de9c [WIP] AWS Nova Sonic service - added TranscriptionFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
260f7c9b85 [WIP] AWS Nova Sonic service - format 2025-05-07 13:52:51 -04:00
Paul Kompfner
de294caed9 [WIP] AWS Nova Sonic service - added LLMFullResponseStartFrame, LLMTextFrame, and LLMFullResponseEndFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
e40aa4f99a [WIP] AWS Nova Sonic service - added TTSStartedFrame and TTSStoppedFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
b1d413b9be [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
8cbad070ad [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
13569a5a5a [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
d789334a60 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
7668b27fc0 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
6d30f441e8 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
a9e395b366 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
5e5626f04f [WIP] AWS Nova Sonic service 2025-05-07 13:52:47 -04:00
Aleix Conchillo Flaqué
d80aa5b44e Merge pull request #1753 from pipecat-ai/aleix/add-bedrock-support
Add support for Amazon Bedrock LLMs
2025-05-07 09:31:48 -07:00
Aleix Conchillo Flaqué
80ef6dc4de update README with AWS Bedrock and Transcribe 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
458549f7df AWSBedrockLLMService: fix function calling 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
a8405649d0 aws: use AWS prefix for all services 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
ce1a72850b tests: add bedrock context aggregator tests 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
58de381746 AWS: add missing utils 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
bed2e894a2 BedrockLLMService: pull initial system frame from messages 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
b4de98cfb7 AWS: various cleanups (logs, imports...) 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
a4b9db9e07 fix formatting 2025-05-07 09:26:26 -07:00
Adithya Suresh
664111a3c9 Added cache related info to metrics 2025-05-07 09:26:26 -07:00
Adithya Suresh
aa964847f3 System param to be a list 2025-05-07 09:26:26 -07:00
Adithya Suresh
fa5cac7e0a Bug fix in content format 2025-05-07 09:26:26 -07:00
Adithya Suresh
b2b01861b2 Remove model restriction 2025-05-07 09:26:26 -07:00
Adithya Suresh
f014f718eb Restructured STT and enabled prosody tags for generative Polly 2025-05-07 09:26:26 -07:00
Adithya Suresh
05ae8d3ffa Removed OpenAI based context formatting 2025-05-07 09:26:26 -07:00
Adithya Suresh
88c9e08bd8 Updated tools parsing logic 2025-05-07 09:26:26 -07:00
Adithya Suresh
844f61dfea Initial implementation 2025-05-07 09:26:26 -07:00
Tico Ballagas
acb7d597cb Change example to use generative voices 2025-05-07 09:19:49 -07:00
Tico Ballagas
2b18f60261 Initial implementation of AWS Transcribe TTS 2025-05-07 09:19:49 -07:00
mattie ruth backman
5b66133a6c Revert breaking change in RTVI protocol for function calling 2025-05-07 12:08:28 -04:00
Mark Backman
0c5bc6a57a Merge pull request #1760 from WebinarGeek/wg/daily-active-speaker-event
DailyTransport: added on_active_speaker_changed event handler
2025-05-07 11:17:49 -04:00
Mark Backman
7981e00955 Merge pull request #1759 from pipecat-ai/mb/readme-nvidia-riva
Update README with Riva services
2025-05-07 11:13:51 -04:00
Dan Berg
5e39c0cfeb DailyTransport: added on_active_speaker_changed event handler 2025-05-07 15:22:30 +02:00
Mark Backman
a444701929 Update README with Riva services 2025-05-07 09:02:08 -04:00
Mark Backman
f6c1eb5d9d Merge pull request #1757 from pipecat-ai/mb/remove-canonical
Removing CanonicalMetricsService
2025-05-06 21:38:03 -04:00
Mark Backman
a1d46cb26b Removing CanonicalMetricsService 2025-05-06 21:23:23 -04:00
Aleix Conchillo Flaqué
99ab148d88 Merge pull request #1739 from pipecat-ai/aleix/observers-frame-pushed-class
BaseObserver: add FramePushed class and deprecate multiple arguments
2025-05-06 15:29:05 -07:00
Aleix Conchillo Flaqué
d69fa5dba5 update CHANGELOG with UltravoxSTTService fix 2025-05-06 15:26:25 -07:00
Aleix Conchillo Flaqué
0d30b000af BaseObserver: add FramePushed class and deprecated multiple arguments 2025-05-06 15:26:23 -07:00
Mark Backman
e7c0e742d2 Merge pull request #1752 from pipecat-ai/mb/deepgram-tts-aura-2
Update Deepgram TTS default voice to Aura 2 voice
2025-05-06 16:26:26 -04:00
Mark Backman
2aff2dcca3 Merge pull request #1751 from pipecat-ai/mb/11labs-enable_ssml_parsing
Add enable_ssml_parsing and enable_logging to ElevenLabsTTSService
2025-05-06 16:25:20 -04:00
Mark Backman
288f8865c8 Add enable_logging to ElevenLabsTTSService 2025-05-06 12:13:26 -04:00
Mark Backman
8691870bcb Update Deepgram TTS default voice to Aura 2 voice 2025-05-06 11:29:32 -04:00
Mark Backman
e06146c237 Add enable_ssml_parsing to ElevenLabsTTSService 2025-05-06 11:06:57 -04:00
Aleix Conchillo Flaqué
c68e990cda Merge pull request #1748 from pipecat-ai/aleix/task-manager-dictionary
task manager dictionary and cleanup PipelineTask
2025-05-06 07:57:53 -07:00
Aleix Conchillo Flaqué
4583905313 PipelineTask: cleanup if task is cancelled from outside Pipecat 2025-05-05 21:33:21 -07:00
Aleix Conchillo Flaqué
9cc498b1fa TaskManager: use a dictionary instead of a set to store tasks 2025-05-05 21:27:49 -07:00
Mark Backman
b3c5dc4045 Merge pull request #1443 from adithyaxx/anthropic-client-bug-fixes
Handle missing token counts in AsyncAnthropicBedrock client properly
2025-05-05 21:13:11 -04:00
Aleix Conchillo Flaqué
3824da7261 Merge pull request #1745 from pipecat-ai/aleix/make-sure-transports-are-ready
only send data to transports after they are really ready
2025-05-05 14:21:36 -07:00
Aleix Conchillo Flaqué
855d567b1e only send data to transports after they are really ready 2025-05-05 14:06:58 -07:00
Mark Backman
b323a7bd88 Merge pull request #1742 from pipecat-ai/mb/pcc-krisp-filter
Update pipecat-cloud-example to use Krisp in PCC deployment only
2025-05-05 15:46:12 -04:00
Mark Backman
fa011d0018 Update pipecat-cloud-example to use Krisp in PCC deployment only 2025-05-05 15:09:29 -04:00
Aleix Conchillo Flaqué
e15fa8777a Merge pull request #1737 from CerebriumAI/kyle/fix-ultravox-spacing
[Fix] Ultravox frame spacing issue
2025-05-05 09:34:49 -07:00
Aleix Conchillo Flaqué
2143a6d927 Merge pull request #1732 from pipecat-ai/aleix/daily-remote-custom-tracks
DailyTransport: remove custom tracks before leaving
2025-05-05 08:44:11 -07:00
Aleix Conchillo Flaqué
044e2d3e73 DailyTransport: remove custom tracks before leaving 2025-05-05 08:35:35 -07:00
Kyle Gani
be112ec63f Merge branch 'kyle/fix-ultravox-performance' of github.com:CerebriumAI/pipecat into kyle/fix-ultravox-performance 2025-05-05 17:13:26 +02:00
Kyle Gani
d2f56c4e8f Fix: Spacing issue 2025-05-05 17:13:21 +02:00
Mark Backman
ddc6a9c695 Merge pull request #1670 from pipecat-ai/mb/daily-twilio-sip-example
Add standalone Daily + Twilio SIP example
2025-05-05 10:57:16 -04:00
Mark Backman
2bebdbc371 Merge pull request #1671 from pipecat-ai/khk/rime-arcana
support for rime arcana model
2025-05-05 10:54:50 -04:00
Mark Backman
8b9f1f0608 Add a changelog entry 2025-05-05 10:51:46 -04:00
Kwindla Hultman Kramer
b25f3b2ed2 support for rime arcana model 2025-05-05 10:50:46 -04:00
Mark Backman
a995cf81b6 Merge pull request #1724 from pipecat-ai/mb/demo-fixes
Demo fixes
2025-05-05 08:44:57 -04:00
Aleix Conchillo Flaqué
75d261639f Merge pull request #1726 from pipecat-ai/aleix/pipecat-0.0.66
update CHANGELOG for pipecat 0.0.66
2025-05-02 20:54:57 -07:00
Aleix Conchillo Flaqué
f720d795d0 update CHANGELOG for pipecat 0.0.66 2025-05-02 20:29:51 -07:00
Aleix Conchillo Flaqué
f6fe83e358 Merge pull request #1725 from pipecat-ai/aleix/update-daily-python-0.18.1
update to daily-python 0.18.1
2025-05-02 20:27:50 -07:00
Mark Backman
0513d0b6a8 Update README 2025-05-02 22:44:50 -04:00
Mark Backman
0679bb217d Remove Twilio from phone-chatbot directory 2025-05-02 22:18:50 -04:00
Mark Backman
38bd55e518 Update README 2025-05-02 22:18:50 -04:00
Mark Backman
65c7423280 Add other dial-in event handlers 2025-05-02 22:18:50 -04:00
Mark Backman
f24a85cc94 Add logic to only forward the first on_dialin_ready event 2025-05-02 22:18:50 -04:00
Mark Backman
53887b7c98 Display phone number in WebRTC call 2025-05-02 22:18:50 -04:00
Mark Backman
523c012c38 Use a Twilio asset to ring the phone throughout 2025-05-02 22:18:50 -04:00
Mark Backman
97c28989c1 Add standalone Daily + Twilio SIP example 2025-05-02 22:18:50 -04:00
Mark Backman
c19be6ebb2 Demo fixes 2025-05-02 20:58:10 -04:00
Aleix Conchillo Flaqué
54971a0735 update to daily-python 0.18.1 2025-05-02 17:47:44 -07:00
Mark Backman
4513e81e13 Merge pull request #1723 from pipecat-ai/mb/base-output-bot-speaking-log
Only display the destination in the bot started/stopped speaking log …
2025-05-02 17:32:47 -04:00
Mark Backman
872204b795 Only display the destination in the bot started/stopped speaking log when there is a desintation 2025-05-02 17:29:28 -04:00
Aleix Conchillo Flaqué
a94cbfe6f5 Merge pull request #1722 from pipecat-ai/aleix/base-output-transport-audio-task-fix
BaseOutputTransport: always initialize audio task
2025-05-02 14:26:30 -07:00
Aleix Conchillo Flaqué
7152faafb2 BaseOutputTransport: always initialize audio task
We also use the audio task to also send synchronized images with audio.
2025-05-02 14:23:15 -07:00
Mark Backman
e6aadaccd8 Merge pull request #1721 from pipecat-ai/mb/simli-silent-frames
Fix: SimliVideoService was continuously emitting audio, preventing Bo…
2025-05-02 16:44:39 -04:00
Mark Backman
3a73aa71b8 Merge pull request #1613 from pipecat-ai/mb/improve-storybot-readme
demo: Restructure storytelling-chatbot directory, update README steps…
2025-05-02 16:39:59 -04:00
Mark Backman
814e7509e1 demo: Restructure storytelling-chatbot directory, update README steps, link to vercel demo 2025-05-02 16:37:37 -04:00
Vanessa Pyne
e0cf5ec016 Merge pull request #1705 from pipecat-ai/vp-update-nvidia-models
Riva Service: add magpie-tts-multilingual model
2025-05-02 15:34:23 -05:00
vipyne
667bd32e6a Riva: remove deprecated lines in example 2025-05-02 15:33:10 -05:00
vipyne
b2ecd83706 update CHANGELOG with Riva details 2025-05-02 15:33:10 -05:00
vipyne
b2754117c8 Riva: refactor function_id and model_name 2025-05-02 15:33:10 -05:00
vipyne
6c428c303b update magpie voice 2025-05-02 15:33:10 -05:00
Mark Backman
e7d889a143 Update RivaSTTService to use by default 2025-05-02 15:33:10 -05:00
Mark Backman
da60e7069b Update pyproject.toml to use nvidia-riva-client 2.19.1 2025-05-02 15:33:10 -05:00
Mark Backman
c14406a3b9 Demos use the latest services 2025-05-02 15:33:10 -05:00
Mark Backman
725ab5ec21 Small fixes: No default api_key of None, ParakeetSTTService uses RivaSTTService.InputParams 2025-05-02 15:33:10 -05:00
Mark Backman
daf9d47e58 Update RivaSegmentedSTTService 2025-05-02 15:33:10 -05:00
vipyne
63a65627a2 Riva Service: add magpie-tts-multilingual model 2025-05-02 15:33:10 -05:00
Mark Backman
02c07755b0 Add Changelog entry for PR 1707 2025-05-02 15:33:10 -05:00
Matt Kim
15cbd18acc [Rime] Add phonemizeBetweenBrackets and pauseBetweenBrackets to RimeTTSService (ws)
There is a fix incoming in
2025-05-02 15:33:10 -05:00
Kwindla Hultman Kramer
93c40b87dc small groq updates 2025-05-02 15:33:10 -05:00
Mark Backman
eeaa9f67a1 Fix: SimliVideoService was continuously emitting audio, preventing BotStoppedSpeakingFrame from being sent 2025-05-02 16:32:42 -04:00
Mark Backman
b60691c7b2 Merge pull request #1720 from pipecat-ai/mb/changelog-pr-1707
Add Changelog entry for PR 1707
2025-05-02 16:13:40 -04:00
Mark Backman
2bb1b0b343 Add Changelog entry for PR 1707 2025-05-02 16:09:50 -04:00
Mark Backman
047ef9f86c Merge pull request #1707 from rimelabs/matt/rime/url_param_serialization
[Rime] Add new params to RimeTTSService
2025-05-02 16:08:01 -04:00
Kwindla Hultman Kramer
9a2c603c91 Merge pull request #1711 from pipecat-ai/khk/groq-updates 2025-05-02 12:21:15 -07:00
Filipi da Silva Fuchter
94c4169407 Merge pull request #1717 from pipecat-ai/local_smart_turn_torch
Local smart turn torch
2025-05-02 15:53:30 -03:00
Filipi Fuchter
cb8a551db8 Mentioning the new LocalSmartTurnAnalyzer in the changelog. 2025-05-02 14:32:18 -03:00
Filipi Fuchter
779f09af70 Fixing lint. 2025-05-02 14:22:38 -03:00
Filipi Fuchter
19dc0f2bfb New example using the local smart turn 2025-05-02 14:21:42 -03:00
Filipi Fuchter
f0709e22ba Creating a local smart turn using torch. 2025-05-02 14:21:29 -03:00
Mark Backman
8250736f5e Merge pull request #1708 from pipecat-ai/mb/gemini-user-context
Push GeminiMultimodalLiveLLMService TranscriptionFrame Upstream, remo…
2025-05-02 13:10:27 -04:00
Mark Backman
83348a9f93 Merge pull request #1714 from pipecat-ai/mb/fix-gemini-text-modality
Restore TEXT modalities support to GeminiMultimodalLiveLLMService
2025-05-02 10:41:05 -04:00
Mark Backman
96d40903a9 Only send TTSStoppedFrame from Gemini when in AUDIO mode, only send one LLMFullResponseEndFrame 2025-05-02 10:18:53 -04:00
Aleix Conchillo Flaqué
2560811805 Merge pull request #1697 from pipecat-ai/aleix/daily-custom-audio-tracks
add support for multiple transport destinations
2025-05-02 06:34:09 -07:00
Mark Backman
2b8c44c008 Merge pull request #1710 from pipecat-ai/mb/openai-context-aggregation
fix: OpenAIRealtimeBetaLLMService writes two assistant messages to th…
2025-05-02 07:43:35 -04:00
Mark Backman
38e2d37674 Restore TEXT modalities support to GeminiMultimodalLiveLLMService 2025-05-02 07:36:12 -04:00
Vanessa Pyne
6278561f88 Merge pull request #1709 from pipecat-ai/vp-fix-fastpitch-params-update
Riva TTS: update FastPitch params
2025-05-01 21:23:10 -05:00
Aleix Conchillo Flaqué
750e79c1ce DailyParams: rename to camera/microphone_out_enabled 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
71eb2963c5 examples: added daily-custom-tracks 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
f44e2c86ea BaseOutputTransport: compute sample_rate and audio_chunk_size in main class 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
afe1f0df8c DailyTransport: make sure we can write audio frames to destination 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
458fddfb48 update CHANGELOG with new Daily and Transport features 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
8d915c5ccb DailyParams: allow enabling/disabling camera/microphone tracks 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
304153dd03 TTSService: set transport destination to all TTS frames 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
a6781b7352 rename destination to transport_destination 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
5ad0058303 update CHANGELOG with frame source/destination support 2025-05-01 19:11:13 -07:00
Aleix Conchillo Flaqué
75c039de33 examples: add daily-multi-translation 2025-05-01 19:11:13 -07:00
Aleix Conchillo Flaqué
74e3c3677e DailyTransport: fix audio/video renderers registration 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
dc20327f10 DailyTransport: register audio destination and use custom tracks 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
e738affd29 BaseOutputTransport: allow sending audio/video to multiple destinations 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
ef3d732607 DailyTransport: allow capturing multiple simultaneous audio/video sources 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
6d63cff1bf DailyTransport: custom audio tracks support 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
12f42605a1 pyproject: update daily-python to 0.18.0 2025-05-01 18:58:44 -07:00
Kwindla Hultman Kramer
fac3337927 small groq updates 2025-05-01 17:09:15 -07:00
Mark Backman
76d198151c Push GeminiMultimodalLiveLLMService TranscriptionFrame Upstream, remove direct context addition 2025-05-01 15:41:04 -04:00
Mark Backman
6a907058de fix: OpenAIRealtimeBetaLLMService writes two assistant messages to the context 2025-05-01 15:37:39 -04:00
vipyne
6e1f531f64 Riva TTS: update FastPitch params
91138c3f66 (diff-ece228577b1d233ce600a948243f90cece53e3a9b89554a0b27a48bc4d6e0fdfR45)
2025-05-01 11:14:41 -05:00
Matt Kim
4232cca5b6 [Rime] Add phonemizeBetweenBrackets and pauseBetweenBrackets to RimeTTSService (ws)
There is a fix incoming in
2025-04-30 18:09:22 -07:00
Mark Backman
a6a4d3d71f Merge pull request #1706 from rimelabs/matt/rime/update_url
[Rime] - Update url for Websockets API
2025-04-30 19:14:04 -04:00
Mark Backman
c52de0f5de Merge pull request #1696 from pipecat-ai/mb/fix-gemini-live-context
Fix: GeminiMultimodalLiveLLMService was appending tokens to the context
2025-04-30 19:12:06 -04:00
Mark Backman
a1e1255f16 Strip newlines from generated user transcript 2025-04-30 18:27:46 -04:00
Mark Backman
c4f758725e Ignore TranscriptionFrames too 2025-04-30 18:22:43 -04:00
Aleix Conchillo Flaqué
7bc9a78ce6 udpate CHANGELOG with RTVIObserverParams 2025-04-30 15:13:14 -07:00
Aleix Conchillo Flaqué
f8be71b32c Merge pull request #1688 from pipecat-ai/aleix/add-rtvi-observer-params
RTVIObserver: add RTVIObserverParams to configure what to send
2025-04-30 15:11:18 -07:00
Aleix Conchillo Flaqué
957fa5546d RTVIObserver: add RTVIObserverParams to configure what to send 2025-04-30 15:09:02 -07:00
Aleix Conchillo Flaqué
039cb8fcae Merge pull request #1690 from pipecat-ai/aleix/rtvi-function-call-single-param
RTVIProcessor: use single FunctionCallParams
2025-04-30 15:04:05 -07:00
Mark Backman
8e05f2f1a1 Merge pull request #1702 from pipecat-ai/mb/stt-mute-transcription-frames
Add InterimTranscriptionFrame and TranscriptionFrame to STTMuteFilter…
2025-04-30 17:54:24 -04:00
Matt Kim
8467aa1ed3 [Rime] - Update url for Websockets API
Rime has migrated their Websockets api to the base url `user.rime.ai` along with all other tts endpoints. 

See the [docs](https://docs.rime.ai/api-reference/endpoint/websockets)

`users-ws.rime.ai` is deprecated and will not reflect upgrades to the rime ws api.
2025-04-30 14:20:13 -07:00
Mark Backman
9c5878af3d OpenAI Realtime and Gemini Live push LLMTextFrame again, overwrite the assitant context aggregator for LLMTextFrame 2025-04-30 17:18:20 -04:00
Mark Backman
ef29800fe9 Update the changelog 2025-04-30 16:28:17 -04:00
Mark Backman
7e09933070 OpenAI Realtime should push TTSTextFrame only 2025-04-30 16:28:17 -04:00
Mark Backman
82a9d7f992 Gemini Mulitmodal Live to push TTSTextFrame only 2025-04-30 16:28:17 -04:00
Mark Backman
facbebb15f Transcribe user audio in 26b 2025-04-30 16:28:16 -04:00
Mark Backman
2ba60fc41f Update TranscriptProcessor to handle GeminiMultimodalLiveLLMService changes 2025-04-30 16:28:16 -04:00
Mark Backman
685f951ae2 Fix: GeminiMultimodalLiveLLMService was appending tokens to the context 2025-04-30 16:28:16 -04:00
Mark Backman
27d4c927a8 Merge pull request #1701 from pipecat-ai/mb/gemini-extend-session
Add context_window_compression support to GeminiMultimodalLiveLLMService
2025-04-30 14:35:50 -04:00
Mark Backman
20a59e8c56 Add InterimTranscriptionFrame and TranscriptionFrame to STTMuteFilter frame processing 2025-04-30 10:50:56 -04:00
Mark Backman
d9a0a93667 Add context_window_compression support to GeminiMultimodalLiveLLMService 2025-04-30 09:55:34 -04:00
Mark Backman
154d5d1859 Merge pull request #1699 from pipecat-ai/mb/more-docs-mocks
Additional import mocks to fix docs failure
2025-04-30 08:36:57 -04:00
Mark Backman
a192217256 Additional import mocks to fix docs failure 2025-04-29 21:45:27 -04:00
Mark Backman
10cdc47e05 Merge pull request #1689 from pipecat-ai/mb/handle-http-smart-turn-errors
Handle case where Fal Smart Turn returns a 500 error
2025-04-29 14:21:45 -04:00
Mark Backman
2b4d41a548 Merge pull request #1693 from flixoflax/bump-modal-example-dependencies
Bump modal deployment example dependencies
2025-04-29 14:21:31 -04:00
Filipi da Silva Fuchter
962f8062a5 Merge pull request #1581 from pipecat-ai/voice_agent_ice_servers
Configuring the voice-agent example to wait for all the ice candidates.
2025-04-29 13:30:12 -03:00
Filipi Fuchter
d80d385b2f Adding a section explaining about ice servers. 2025-04-29 12:19:59 -03:00
Filipi Fuchter
b347ca472f Checking the state again to avoid any eventual race condition 2025-04-29 12:04:20 -03:00
Filipi Fuchter
c3c4952abf Reducing the timeout to 2 seconds for gathering the ice candidates. 2025-04-29 11:34:58 -03:00
Filipi Fuchter
f369ab4c1a Printing each new ice candidate. 2025-04-29 11:23:33 -03:00
flixoflax
62b41c6789 removed aiohttp from deps, and version constraint from pipecat-ai 2025-04-29 16:14:51 +02:00
Filipi da Silva Fuchter
2872bc7902 Merge pull request #1587 from pipecat-ai/improving_ice_servers
Updated `SmallWebRTCConnection` to support `ice_servers` with credentials.
2025-04-29 11:07:32 -03:00
Filipi Fuchter
9658b75a10 Configuring the voice-agent example to use ice-servers and wait for all the ice candidates. 2025-04-29 10:52:07 -03:00
flixoflax
63de9039e6 Bumb modal deployment example deps 2025-04-29 15:50:56 +02:00
Filipi Fuchter
9352396d7e Mentioning the new feature in the changelog. 2025-04-29 10:40:09 -03:00
Filipi Fuchter
d1ab1d38b7 Fixing the examples to use the new IceServer structure. 2025-04-29 10:33:19 -03:00
Filipi Fuchter
080f70d91c Allowing to define the username and credential for the ice servers. 2025-04-29 10:32:42 -03:00
Mark Backman
ebed1fc6ea Restructure _send_raw_request to raise errors early, then process the successful response 2025-04-29 09:05:14 -04:00
Aleix Conchillo Flaqué
6821b1cdab RTVIProcessor: use single FunctionCallParams 2025-04-29 05:56:23 -07:00
Mark Backman
144ae9b611 Handle case where Fal Smart Turn returns a 500 error 2025-04-29 08:53:02 -04:00
Aleix Conchillo Flaqué
a2e7331ce2 Merge pull request #1680 from pipecat-ai/aleix/local-input-select-stt-update
examples: update local-input-select-stt
2025-04-28 12:01:12 -07:00
Aleix Conchillo Flaqué
8accd3e387 Merge pull request #1681 from pipecat-ai/aleix/tts-service-llm-full-response-end-fix
TTSService: do not push LLMFullResponseEndFrame if not needed
2025-04-28 12:00:55 -07:00
Mark Backman
3d05a74dc0 Merge pull request #1678 from Dev-Khant/integrate-mem0-oss
Integrate with Mem0 OSS
2025-04-28 14:15:40 -04:00
Aleix Conchillo Flaqué
e3c965f4d5 TTSService: do not push LLMFullResponseEndFrame if not needed 2025-04-28 11:03:22 -07:00
Dev-Khant
5354e5d891 formatting 2025-04-28 22:57:18 +05:30
Aleix Conchillo Flaqué
5784e91cff update .gitignore 2025-04-28 09:34:45 -07:00
Aleix Conchillo Flaqué
bc5f098aaa examples: update local-input-select-stt 2025-04-28 09:28:26 -07:00
Aleix Conchillo Flaqué
93534b4692 Merge pull request #1679 from pipecat-ai/aleix/update-package-lock-apr-28
examples: update package-lock.json
2025-04-28 09:27:01 -07:00
Aleix Conchillo Flaqué
b23d54c609 examples: update package-lock.json 2025-04-28 09:15:14 -07:00
Dev-Khant
aa23a7b1e6 update changelog 2025-04-28 20:03:20 +05:30
Dev-Khant
c0c41789ab Integrate with Mem0 OSS 2025-04-28 17:15:53 +05:30
Aleix Conchillo Flaqué
029ef4f8c2 Merge pull request #1667 from pipecat-ai/aleix/function-call-single-parameter
function call single parameter
2025-04-25 13:53:10 -07:00
Aleix Conchillo Flaqué
9cad6dfce9 RTVIProcessor: simplify handle_function_call() and depreacted handle_function_call_start() 2025-04-25 13:34:05 -07:00
Aleix Conchillo Flaqué
4df6444832 examples: update with single FunctionCallParams parameter 2025-04-25 13:34:05 -07:00
Aleix Conchillo Flaqué
944bc23135 LLMService: use a single FunctionCallParams parameter for function calls 2025-04-25 13:34:03 -07:00
Aleix Conchillo Flaqué
1d863ee7de Merge pull request #1669 from pipecat-ai/aleix/short-utterances-fixes
short utterances fixes
2025-04-25 13:25:15 -07:00
Vanessa Pyne
9ca775d1ab Merge pull request #1668 from pipecat-ai/vp-update-transport-params-in-ex
Update examples with new transport param names
2025-04-25 15:24:28 -05:00
Aleix Conchillo Flaqué
03002ad685 LLMUserContextAggregator: reduce aggregation_timeout to 0.5 2025-04-25 13:21:00 -07:00
Aleix Conchillo Flaqué
99a4154cbc LLMUserContextAggregator: ignore short uterrances while bot speaking 2025-04-25 13:21:00 -07:00
vipyne
0f68cc182d Update examples with new transport param names 2025-04-25 15:18:56 -05:00
Aleix Conchillo Flaqué
3ac50b9902 Merge pull request #1664 from CerebriumAI/kyle/fix-ultravox-performance
Improved: Ultravox performance
2025-04-25 12:59:25 -07:00
Michael Louis
9557705b53 Changed warmup to internal function 2025-04-25 14:40:19 -04:00
Mark Backman
a7718926e9 Merge pull request #1666 from pipecat-ai/mb/add-vad-start-stop-events
Add VADUserStartedSpeakingFrame and VADUserStoppedSpeakingFrame
2025-04-25 10:49:17 -04:00
Mark Backman
dfa10af6ed Simplify VAD events to be detected and emitted from BaseInputTransport 2025-04-25 10:28:44 -04:00
Mark Backman
8485ea6c5e Merge pull request #1650 from pipecat-ai/mb/fal-smart-turn-readme
Add hosted demo link to Fal smart turn README
2025-04-25 09:39:54 -04:00
Mark Backman
b298376766 Add VADUserStartedSpeakingFrame and VADUserStoppedSpeakingFrame 2025-04-25 09:06:54 -04:00
Mark Backman
7cfefe4f84 Merge pull request #1593 from WebinarGeek/wg/gladia-translations
Push gladia translations as a TranscriptionFrame
2025-04-25 08:35:36 -04:00
Mark Backman
71c7373987 Update CHANGELOG 2025-04-25 08:26:40 -04:00
Mark Backman
d1086914fe Add TranslationFrame and use in GladiaSTTService; add 13c-gladia-translation.py 2025-04-25 08:25:39 -04:00
Dan Berg
2fb85941d3 Add hint installing test requirements 2025-04-25 08:24:51 -04:00
Dan Berg
be8788e4da Push gladia translations as a TranscriptionFrame 2025-04-25 08:24:51 -04:00
Mark Backman
acb6abd761 Merge pull request #1644 from pipecat-ai/mb/add-transport-examples
Add transports examples to foundational examples
2025-04-25 08:22:26 -04:00
Mark Backman
a528aad957 Merge pull request #1642 from pipecat-ai/mb/foundational-requirements
Add deepgram and cartesia to foundational example requirements to mak…
2025-04-25 08:22:08 -04:00
Mark Backman
5c13252801 Merge pull request #1658 from pipecat-ai/mb/update-fal-smart-turn-version
Update daily-transport version in fal-smart-turn demo
2025-04-25 08:21:50 -04:00
Mark Backman
a4422ac6c2 Add transports examples to foundational examples 2025-04-25 08:20:04 -04:00
Mark Backman
985a031353 Merge pull request #1657 from pipecat-ai/mb/word-wrangler-example
Add Word Wrangler demos
2025-04-25 08:16:02 -04:00
Kyle Gani
5c4079b286 Cleaned up: layout 2025-04-25 13:18:03 +02:00
Kyle Gani
5489ac5a73 Merge branch 'main' into kyle/fix-ultravox-performance 2025-04-25 13:16:48 +02:00
Kyle Gani
49fbcc86ac Improved: Ultravox performance 2025-04-25 13:12:08 +02:00
Aleix Conchillo Flaqué
d20c3307b9 Merge pull request #1648 from pipecat-ai/aleix/always-push-audio
input transports now always push audio frames
2025-04-24 18:59:09 -07:00
Aleix Conchillo Flaqué
9fd76923fd STTService: passthrough audio frames by default 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
a753a623d4 examples: allow setting custom program arguments 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
4ee6c4b59e BaseOutputTransport: reword camera with video 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
e79a002e5a examples: update camera_* with video_* 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
420912dd4b transports: deprecate TransportParams.camera_* in favor of video_* 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
de7185e8db examples: remove vad_enabled=True 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
8bfcfe8b1d transports: deprecate TransportParams.vad_enabled 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
26d2ce5926 examples: remove vad_audio_passthrough=True 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
9ee56bff9e transports: push audio by default. s/vad_audio_passthrough/audio_in_passthrough/ 2025-04-24 17:14:16 -07:00
Vanessa Pyne
2e0d77e4f0 Merge pull request #1627 from pipecat-ai/vp-mcp-take-3
MCP Service
2025-04-24 18:16:19 -05:00
vipyne
4cc8a4312c Revert "update 39* examples as per #1648"
This reverts commit b29ffeef29.
2025-04-24 18:11:35 -05:00
vipyne
cb7cb381aa Add MCPClient changelog line 2025-04-24 18:05:20 -05:00
vipyne
b29ffeef29 update 39* examples as per #1648 2025-04-24 17:19:50 -05:00
vipyne
b7b2a5b7a1 mcp service fix and add multiple mcp example 2025-04-24 17:13:40 -05:00
vipyne
3384598e07 mcp_service: pr notes 2025-04-24 17:13:40 -05:00
vipyne
c420dbe57f MCP Service
WIP getting mcp.run to work

add mcp[cli] to toml and lint

mcp stdio example

mcp sse example

mcp_run POC

ruff formatting
2025-04-24 17:13:40 -05:00
Mark Backman
fa8aafc7a5 Merge pull request #1660 from pipecat-ai/mb/foundational-examples-readme
Update foundational README with ToC
2025-04-24 18:12:27 -04:00
Mark Backman
4b364dda29 Update foundational README with ToC 2025-04-24 18:04:36 -04:00
Mark Backman
6bb765e40f Update daily-transport version in fal-smart-turn demo 2025-04-24 15:06:08 -04:00
Mark Backman
c80d09f66c Add Word Wrangler demos 2025-04-24 14:31:48 -04:00
Mark Backman
f8ff10c5d5 Add hosted demo link to Fal smart turn README 2025-04-23 21:27:53 -04:00
Mark Backman
09ff836ef6 Merge pull request #1640 from pipecat-ai/mb/fal-smart-turn-example
Fal Smart Turn example
2025-04-23 17:27:27 -04:00
Mark Backman
e446ecac14 Merge pull request #1649 from pipecat-ai/mb/fix-smart-turn-metrics
fix: SmartTurnMetricsData was reporting 0 for inference and processin…
2025-04-23 17:23:16 -04:00
Mark Backman
8c0c8a6153 Code review fixes 2025-04-23 17:22:19 -04:00
Mark Backman
70033ae00b fix: SmartTurnMetricsData was reporting 0 for inference and processing time 2025-04-23 17:17:46 -04:00
Mark Backman
ac9dce63ae README updates 2025-04-23 16:55:35 -04:00
Mark Backman
8b2df48fab Fix DailyTransport bot name 2025-04-23 16:43:54 -04:00
Mark Backman
156a5690fc Merge pull request #1647 from mattmatters/fix-typo
Fix wating -> waiting typo
2025-04-23 16:27:00 -04:00
Matt Lewis
d42c618398 Fix wating -> waiting typo 2025-04-23 15:55:14 -04:00
Mark Backman
b23ca5a4a8 Merge pull request #1641 from pipecat-ai/mb/11labs-input-params
ElevenLabs: InputParams can be set individually
2025-04-23 14:31:37 -04:00
Mark Backman
63a6697a90 ElevenLabs: InputParams can be set individually 2025-04-23 14:28:38 -04:00
Aleix Conchillo Flaqué
f1e45d0f02 Merge pull request #1646 from pipecat-ai/aleix/pipecat-0.0.65
update CHANGELOG for 0.0.65
2025-04-23 11:27:25 -07:00
Mark Backman
4ad227ca2d Merge pull request #1643 from pipecat-ai/mb/gladia-keepalive
Add a keepalive task to GladiaSTTService
2025-04-23 14:27:15 -04:00
Aleix Conchillo Flaqué
66cc18194b update CHANGELOG for 0.0.65 2025-04-23 11:25:32 -07:00
Mark Backman
7d65132c93 Add a keepalive task to GladiaSTTService 2025-04-23 14:21:44 -04:00
Mark Backman
db7d7a4204 Merge pull request #1645 from pipecat-ai/mb/telnyx-auto-hang-up
Add auto_hang_up to Telnyx serializer
2025-04-23 14:20:28 -04:00
Mark Backman
7bbac11084 Add docstrings to TelnyxFrameSerializer 2025-04-23 14:13:20 -04:00
Mark Backman
76c8322b57 Make call_sid optional in TwilioFrameSerializer 2025-04-23 14:08:50 -04:00
Mark Backman
7b1cd3523d Twilio: send only one hangup command 2025-04-23 13:41:36 -04:00
Mark Backman
6bd821ac9a Add auto_hang_up to Telnyx serializer 2025-04-23 13:29:54 -04:00
Mark Backman
a6d51c343e Add deepgram and cartesia to foundational example requirements to make quickstart smoother 2025-04-23 08:47:47 -04:00
Mark Backman
1a5cf7a521 Add local run and deployment steps to README 2025-04-22 21:37:35 -04:00
Mark Backman
69491417ec Fal Smart Turn example 2025-04-22 21:16:41 -04:00
Aleix Conchillo Flaqué
b91780ced2 Merge pull request #1638 from pipecat-ai/aleix/pipecat-0.0.64
update CHANGELOG for 0.0.64
2025-04-22 17:35:25 -07:00
Aleix Conchillo Flaqué
8ded666958 update CHANGELOG for 0.0.64 2025-04-22 17:32:06 -07:00
Filipi da Silva Fuchter
2490c804a5 Merge pull request #1631 from pipecat-ai/smart_turn_timeout
Returning the turn as complete if the request don’t return a result within SmartTurnParams stop_secs
2025-04-22 19:51:10 -03:00
Filipi Fuchter
dd8856a673 Merge branch 'main' into smart_turn_timeout
# Conflicts:
#	dot-env.template
2025-04-22 19:49:32 -03:00
Aleix Conchillo Flaqué
e7da08dab1 move smart turn files to audio.turn.smart_turn package 2025-04-22 15:29:31 -07:00
Aleix Conchillo Flaqué
ae60d42016 s/SmartTurnAnalyzer/HttpSmartTurnAnalyzer/ and add FalSmartTurnAnalyzer 2025-04-22 15:13:12 -07:00
Aleix Conchillo Flaqué
50e8d82ece SmartTurn: some linting cleanup 2025-04-22 14:39:02 -07:00
Mark Backman
cc9901a82f Replace httpx with aiohttp 2025-04-22 17:14:19 -04:00
Aleix Conchillo Flaqué
1fd43e8a3f Merge pull request #1636 from pipecat-ai/aleix/examples-logging
examples: always use loguru for logging
2025-04-22 13:06:40 -07:00
Aleix Conchillo Flaqué
fdc508a1a5 examples: always use loguru for logging 2025-04-22 11:51:49 -07:00
Mark Backman
37269db247 Merge pull request #1634 from pipecat-ai/mb/twilio-end-call
Automatically hangup Twilio calls
2025-04-22 14:05:10 -04:00
Mark Backman
51269aabbd Added cancel method to WebsocketServerOutputTransport 2025-04-22 13:58:39 -04:00
Mark Backman
74ecc19e09 Code review feedback 2025-04-22 13:54:12 -04:00
Mark Backman
c6d48c16df Add twilio to pyproject.toml, update demo to use twilio option 2025-04-22 13:01:56 -04:00
Mark Backman
873d84aa09 Twilio serializer to return None 2025-04-22 12:50:11 -04:00
Mark Backman
7360866c97 Add docstrings 2025-04-22 12:49:17 -04:00
Mark Backman
81f4768661 Automatically hangup Twilio calls 2025-04-22 12:45:34 -04:00
Vanessa Pyne
972d65f61b Merge pull request #1628 from pipecat-ai/vp-typo-fixes
typo fixes in phone-chatbot example
2025-04-22 10:05:56 -05:00
Mark Backman
1da9d398e3 Merge pull request #1619 from pipecat-ai/mb/grok-3-beta
GrokLLMService uses grok-3-beta as default model
2025-04-22 10:33:32 -04:00
Filipi Fuchter
7358bc6428 Returning the turn as complete if the request don’t return a result within SmartTurnParams stop_secs 2025-04-22 10:35:14 -03:00
vipyne
a6af499f84 typo fixes in phone-chatbot example 2025-04-21 23:49:13 -05:00
Aleix Conchillo Flaqué
f9d1a53e28 Merge pull request #1609 from pipecat-ai/aleix/pyproject-py-typed
pyproject: fix license fields
2025-04-21 16:14:22 -07:00
Mark Backman
3f3010af79 Add a SmartTurnMetricsData class, emitted by Metrics Frame in response to smart turn responses 2025-04-21 18:56:14 -04:00
Aleix Conchillo Flaqué
a02d47ddbd Merge pull request #1625 from 0xPatryk/patch-1
Fixed AttributeError: object has no attribute '_sample_rate"
2025-04-21 15:40:54 -07:00
Patryk
a649aff3e7 Fixed AttributeError: 'OpenAITTSService' object has no attribute '_sample_rate' 2025-04-21 11:03:45 +02:00
Mark Backman
a9b551d73e GrokLLMService uses grok-3-beta as default model 2025-04-19 08:05:59 -04:00
Mark Backman
747a821943 Merge pull request #1614 from pipecat-ai/mb/changelog-for-1525
Add CHANGELOG entry for PR 1525
2025-04-19 07:10:13 -04:00
Aleix Conchillo Flaqué
010db3ccd5 README: minor update 2025-04-18 20:57:05 -07:00
Aleix Conchillo Flaqué
db773b8b93 Merge pull request #1616 from pipecat-ai/aleix/new-readme
make README more fun
2025-04-18 18:15:35 -07:00
Mark Backman
16b7bf71b4 Additional README changes 2025-04-18 21:00:57 -04:00
Aleix Conchillo Flaqué
82d19508a4 make README more fun 2025-04-18 14:37:28 -07:00
Mark Backman
dc3646f0e7 Merge pull request #1615 from pipecat-ai/mb/issue-template
Add issue templates and move the pull request template to .github
2025-04-18 14:58:09 -04:00
Mark Backman
62e659cd3a Update to .yml templates so that types are used 2025-04-18 13:21:01 -04:00
Mark Backman
b2945f44fd Add issue templates and move the pull request template to .github 2025-04-18 12:17:46 -04:00
Mark Backman
618fbef81c Add CHANGELOG entry for PR 1525 2025-04-18 11:32:34 -04:00
Mark Backman
70c42dfa6e Merge pull request #1525 from shaiyon/google-default-creds
Enable usage of Application Default Credentials in Google services
2025-04-18 11:31:08 -04:00
Mark Backman
9ab374dd1f Merge pull request #1612 from pipecat-ai/mb/07g-stt-model
examples: Fix 07g by changing STT model
2025-04-18 08:04:20 -04:00
Mark Backman
cc6d284417 examples: Fix 07g by changing STT model 2025-04-18 07:13:34 -04:00
Filipi da Silva Fuchter
f77d8f0b6f Merge pull request #1611 from pipecat-ai/smart_turn_changelog
Mentioning the Smart Turn Detection into the changelog.
2025-04-17 23:02:57 -03:00
Varun Singh
9c0beb05cf Merge pull request #1597 from pipecat-ai/vr000m-opus-added
Changing default codec to OPUS for telephony
2025-04-17 18:42:12 -07:00
Aleix Conchillo Flaqué
858981c404 Merge pull request #1610 from pipecat-ai/aleix/add-base-turn-analyzer
audio: add BaseTurnAnalyzer class
2025-04-17 18:38:08 -07:00
Aleix Conchillo Flaqué
9eed225aa2 audio: add BaseTurnAnalyzer class 2025-04-17 18:37:52 -07:00
Filipi Fuchter
9f7371e485 Mentioning the Smart Turn Detection into the changelog. 2025-04-17 22:31:40 -03:00
Aleix Conchillo Flaqué
d77c37ff14 pyproject: add py.typed (PEP 561) 2025-04-17 17:29:04 -07:00
Aleix Conchillo Flaqué
b4916f9dae pyproject: fix license fields 2025-04-17 17:28:14 -07:00
Aleix Conchillo Flaqué
004a920920 Merge pull request #1563 from Bnowako/packaging-type-information
Add marker file for static type checkers
2025-04-17 17:26:15 -07:00
Filipi da Silva Fuchter
203c5a3a60 Merge pull request #1592 from pipecat-ai/smart_turn
Smart turn
2025-04-17 18:21:47 -03:00
Filipi Fuchter
7f6fb1754b Merge remote-tracking branch 'origin/smart_turn' into smart_turn 2025-04-17 17:53:53 -03:00
Filipi Fuchter
a390ce13a4 Removing the UserEndOfTurnFrame 2025-04-17 17:53:31 -03:00
Filipi da Silva Fuchter
61d31d1c40 Restoring stop_secs to default value.
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-17 17:44:47 -03:00
Filipi da Silva Fuchter
e872ff943a Using the default model for OpenAi.
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-17 17:43:39 -03:00
Filipi da Silva Fuchter
c71005e249 Using the default model for OpenAi.
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-17 17:43:23 -03:00
Filipi Fuchter
6e06bf97c0 Preventing emitting the UserStartedSpeaking event multiple times. 2025-04-17 17:21:29 -03:00
Filipi Fuchter
a80dc94e91 Fixing ruff format. 2025-04-17 16:47:17 -03:00
Filipi Fuchter
3ea9cfd251 Keeping the _speech_triggered as true if the state is incomplete. 2025-04-17 16:46:15 -03:00
Filipi Fuchter
a80f82cdb6 Moving the environment variables to inside the demo. 2025-04-17 16:28:50 -03:00
Aleix Conchillo Flaqué
d24bab354f Merge pull request #1607 from pipecat-ai/aleix/fix-websocket-disconnects
services: fix TTS websocket services disconnections
2025-04-17 12:27:52 -07:00
Filipi Fuchter
53ee3fb64c Changing the log levels used in smart_turn 2025-04-17 16:14:13 -03:00
Filipi Fuchter
3599761e4e Changing the default behavior to only use the last vad segment, and increasing the default stop_secs to 3 2025-04-17 16:07:03 -03:00
Aleix Conchillo Flaqué
c0b3fe3985 services: only read from TTS websocket if websocket connection established 2025-04-17 11:54:07 -07:00
Aleix Conchillo Flaqué
497d48b6c8 services: fix TTS websocket services disconnections
Fixes #1467
2025-04-17 11:29:49 -07:00
Filipi Fuchter
e179916c9c Creating a new param use_only_last_vad_segment 2025-04-17 11:49:51 -03:00
Filipi Fuchter
b0b38beb19 Returning the max duration back to 8 seconds. 2025-04-17 11:39:48 -03:00
Filipi Fuchter
8577139d21 Fixing to keep the last max samples. 2025-04-17 11:39:06 -03:00
Filipi Fuchter
e2fbbb4b40 Renaming the smart turn classes. 2025-04-17 10:43:21 -03:00
Filipi Fuchter
88ce117e84 Changing the max duration default value to 16 seconds. 2025-04-17 10:35:13 -03:00
Filipi Fuchter
266537c3f4 Fixing to respect the stop_secs. 2025-04-17 10:07:08 -03:00
Filipi Fuchter
230d2f80fa Merge branch 'main' into smart_turn 2025-04-17 09:36:30 -03:00
Filipi Fuchter
3f0688aefa Testing smart turn using stop_secs as 5 seconds 2025-04-17 09:36:03 -03:00
Filipi da Silva Fuchter
5be3e6979e Merge pull request #1533 from pipecat-ai/daily_small_webrtc
Example interoping between SmallWebRTC and Daily
2025-04-17 09:19:23 -03:00
Mark Backman
9c19cff818 Merge pull request #1585 from ArmanJR/main
Troubleshooting SSL error
2025-04-16 22:46:45 -04:00
Mark Backman
95f3537bde Merge pull request #1598 from pipecat-ai/mb/11labs-http-timestamps
Added word/timestamp pairs to ElevenLabsHttpTTSService
2025-04-16 22:38:26 -04:00
Mark Backman
7ff748defd Merge pull request #1600 from pipecat-ai/mb/11labs-previous-text
Add previous_text context to ElevenLabsHttpTTSService
2025-04-16 22:33:38 -04:00
Mark Backman
2dafbee2aa Code review fixes 2025-04-16 22:29:33 -04:00
Mark Backman
1e0a9d7b06 Add previous_text context to ElevenLabsHttpTTSService 2025-04-16 22:22:08 -04:00
Mark Backman
4a23e138b1 Added word/timestamp pairs to ElevenLabsHttpTTSService 2025-04-16 22:20:51 -04:00
Mark Backman
384f80983f Added word/timestamp pairs to ElevenLabsHttpTTSService 2025-04-16 21:55:00 -04:00
Aleix Conchillo Flaqué
f6f01ea7e4 Merge pull request #1588 from pipecat-ai/aleix/llm-aggregator-params
LLM aggregator params
2025-04-16 15:25:21 -07:00
Aleix Conchillo Flaqué
f385cc0460 pyproject: add websockets as google dependency 2025-04-16 15:19:25 -07:00
Aleix Conchillo Flaqué
e97de43de2 add LLMUserAggregatorParams and LLMAssistantAggregatorParams 2025-04-16 15:19:19 -07:00
Aleix Conchillo Flaqué
8299c96ad4 Merge pull request #1603 from pipecat-ai/aleix/deepgram-tavus-fixes
deepgram/tavus fixes
2025-04-16 14:55:45 -07:00
Aleix Conchillo Flaqué
e9af585edd DeepgramTTSService: re-add base_url to constructor 2025-04-16 14:54:02 -07:00
Aleix Conchillo Flaqué
31f7082d12 DeepgramTTSService: use Deepgram's asyncrest instead of asyncio.to_thread 2025-04-16 14:40:59 -07:00
Aleix Conchillo Flaqué
6cea71270e tts: use smaller audio chunk sizes 2025-04-16 14:40:59 -07:00
Aleix Conchillo Flaqué
d05b2d0e8d TavusVideoService: fix rate limiting and max size 2025-04-16 14:40:59 -07:00
Filipi Fuchter
a458c1e92b Improving the README and fixing the env.example 2025-04-16 18:38:48 -03:00
Filipi Fuchter
5bbf1d0209 Example interoping between SmallWebRTC and Daily. 2025-04-16 17:14:12 -03:00
Mark Backman
235cd9cecc Merge pull request #1586 from rahultayal22/rah_google_vertex_issue
Fixed params issue in Google Vertex ai
2025-04-16 14:56:46 -04:00
Mark Backman
829f3ed2db Merge pull request #1601 from pipecat-ai/mb/eject-at-exp-token
Add eject_at_token_exp to Daily REST helpers, modify default values
2025-04-16 14:54:41 -04:00
Rahul Tayal
ac64f0ba91 Run ruff on code 2025-04-16 23:19:09 +05:30
Rahul Tayal
ce41a7585b Resolved comment to update change log 2025-04-16 22:24:25 +05:30
Mark Backman
ce92dfb5ec Add eject_at_token_exp to Daily REST helpers, modify default values 2025-04-16 12:26:33 -04:00
Mark Backman
ee132a2188 Merge pull request #1596 from pipecat-ai/mb/gpt-4.1
Update services and examples to use gpt-4.1 by default
2025-04-16 08:37:48 -04:00
Mark Backman
5f3bbf9828 Rely on default OpenAI model for examples and tests 2025-04-16 08:33:34 -04:00
Mark Backman
55d1d81430 Merge pull request #1595 from pipecat-ai/mb/rtvi-start-convo
Update client/server demos to kick off conversation in on_client_read…
2025-04-16 08:23:16 -04:00
Filipi Fuchter
8e36bdbed7 Adding some comments to the code. 2025-04-16 09:11:27 -03:00
Filipi Fuchter
cd8bd7f487 Adding some comments to the code. 2025-04-16 08:58:40 -03:00
Filipi Fuchter
5fa47b7a5c Adding the dependencies for the remote smart turn 2025-04-16 08:45:01 -03:00
Filipi Fuchter
616961b487 Stop removing segments from the end 2025-04-16 08:04:38 -03:00
Filipi Fuchter
650d4d9ee2 Changing the start speech time and adding logs. 2025-04-16 07:55:20 -03:00
Filipi Fuchter
2627cb6bf2 Allowing to define SmartTurnParams 2025-04-16 07:13:13 -03:00
Filipi Fuchter
0e4115049b Refactoring to use keep alive sessions. 2025-04-16 06:44:57 -03:00
Filipi Fuchter
3ebef9346f Adding support for RemoteSmartTurn 2025-04-16 06:33:42 -03:00
Filipi Fuchter
3e2d21779f Refactoring the BaseEndOfTurnAnalyzer to include most of the logic 2025-04-16 06:11:56 -03:00
Filipi Fuchter
cfefcac35f Resetting the silence frames when the user speaks. 2025-04-15 20:51:36 -03:00
Filipi Fuchter
57b39c084f Triggering to check if the turn is complete based on the maximum timeout 2025-04-15 20:42:41 -03:00
Filipi Fuchter
11b6de0900 Triggering to check if the turn is complete each time the user stops speaking based on the vad 2025-04-15 17:28:00 -03:00
Varun Singh
824bc9bf16 Update dial.js 2025-04-15 12:48:33 -07:00
Varun Singh
d0ddef6c12 Update server.py 2025-04-15 12:37:33 -07:00
Mark Backman
ad40a0f076 Update OpenAILLMService and OpenPipeLLMService to use gpt-4.1 by default 2025-04-15 15:11:05 -04:00
Filipi Fuchter
e6325a8229 Integrating with the smart turn model to predict 2025-04-15 16:01:09 -03:00
Mark Backman
6d10732889 Update OpenAILLMService examples to use gpt-4.1 2025-04-15 14:59:55 -04:00
Mark Backman
fdb46a0fa9 Update client/server demos to kick off conversation in on_client_ready handler 2025-04-15 14:50:38 -04:00
Filipi Fuchter
3588b06718 Adding missing torch dependency. 2025-04-15 12:28:36 -03:00
Filipi Fuchter
73874f6ec0 Loading the smart turn model. 2025-04-15 12:11:06 -03:00
Filipi Fuchter
6ab9a8ad7f Starting to create a local smart turn 2025-04-15 11:24:39 -03:00
Filipi Fuchter
821e303249 Bringing Aleix initial implementation for the smart turn. 2025-04-15 10:21:40 -03:00
chadbailey59
efae26a5a8 Client connect/disconnect events for DailyTransport (#1544)
* added multi transport example

* added working example

* restructured example and added readme

* removed image

* cleanup

* changed data type of callback signature

* removed pipecat example

* added changelog
2025-04-14 15:56:41 -05:00
Aleix Conchillo Flaqué
d16ace22ac Merge pull request #1583 from pipecat-ai/aleix/soundfilemixer-constructor-updates
SoundfileMixer: add mixing argument and require keywords
2025-04-14 10:59:30 -07:00
Rahul Tayal
001c26b79c Fixed params issue in Google Vertex ai 2025-04-14 23:29:16 +05:30
Arman
8dc4f1cda0 Troubleshooting SSL error 2025-04-14 13:39:53 -04:00
Aleix Conchillo Flaqué
ab6be11a0e SoundfileMixer: add mixing argument and require keywords 2025-04-14 08:30:56 -07:00
Filipi da Silva Fuchter
054158b0ff Merge pull request #1579 from pipecat-ai/fixing_smallwebrtc_issue
Fixed an issue in `SmallWebRTCTransport`
2025-04-14 10:44:22 -03:00
Filipi da Silva Fuchter
174cf13abd Merge pull request #1580 from pipecat-ai/fixing_voice_agent_example
Fixing the voice agent example to always create the video transceiver.
2025-04-14 10:44:07 -03:00
Filipi Fuchter
099d2c02e1 Fixing the voice agent example to always create the video transceiver. 2025-04-14 10:41:39 -03:00
Filipi Fuchter
e1108466f6 Fixed an issue in SmallWebRTCTransport where an error was thrown if the client did not create a video transceiver. 2025-04-14 10:36:25 -03:00
Mark Backman
edd53d425e Merge pull request #1577 from pipecat-ai/hush/trackStoppedSimpleChatbot
docs: Fix TrackStopped typo in SimpleChatbot
2025-04-14 08:32:58 -04:00
James Hush
b160cf34e9 Remove formatting 2025-04-14 15:13:45 +08:00
James Hush
dae3b927e1 docs: Fix TrackStopped typo in SimpleChatbot 2025-04-14 15:12:17 +08:00
Mark Backman
bd3d30111a Merge pull request #1569 from pipecat-ai/pipecat-0.0.63
Update CHANGELOG for 0.0.63
2025-04-11 20:09:58 -04:00
Mark Backman
8c7e16e717 Update CHANGELOG for 0.0.63 2025-04-11 20:04:50 -04:00
Mark Backman
f6accbd510 Updating foundation examples to use SmallWebRTCTransport and pipecat-ai-small-webrtc-prebuilt (#1534)
Co-authored-by: Filipi Fuchter <filipi@daily.co>
2025-04-11 19:44:16 -04:00
Mark Backman
8186219879 Merge pull request #1513 from pipecat-ai/mb/gemini-context-formatting
Fix: GeminiMultimodalLiveLLMService, add spaces between words in assi…
2025-04-11 15:30:51 -04:00
Mark Backman
b9a2ed5b58 Fix: GeminiMultimodalLiveLLMService, add spaces between words in assistant context messages 2025-04-11 15:14:52 -04:00
Mark Backman
7ac12ffc85 Merge pull request #1550 from pipecat-ai/mb/cartesia-spelling-timestamps
Fix: Cartesia's spelling feature adds whole word to context
2025-04-11 15:13:55 -04:00
Filipi da Silva Fuchter
f623cf96f7 Merge pull request #1560 from pipecat-ai/bot_left_signalling
Bot left signalling message
2025-04-11 16:08:01 -03:00
Mark Backman
06be20eb16 Fix: Cartesia's spelling feature adds whole word to context 2025-04-11 15:04:58 -04:00
Filipi Fuchter
816b3a9545 Fixing ruff format 2025-04-11 15:37:16 -03:00
Filipi Fuchter
255666925b Sending a new signalling message peerLeft. 2025-04-11 15:35:50 -03:00
Mark Backman
0df065fda4 Merge pull request #1566 from pipecat-ai/mb/gemini-live-beta
Add Gemini Live support for languages, native model transcriptions, media resolution, and VAD settings
2025-04-11 12:40:04 -04:00
Mark Backman
241a947b8b Add CHANGELOG entries 2025-04-11 11:46:48 -04:00
Mark Backman
e28c199dd1 Add GeminiMultimodalLiveLLMService support for VAD Params 2025-04-11 11:46:48 -04:00
Filipi da Silva Fuchter
6220ee4efb Merge pull request #1565 from pipecat-ai/fixing_video_transform_demo
Fixing the video transform demo to use 20ms audio.
2025-04-11 11:45:29 -03:00
Filipi Fuchter
b650d043bf Fixing the video transform demo to use 20ms audio. 2025-04-11 11:22:41 -03:00
Mark Backman
121e6d2157 Add media resolution support to GeminiMultimodalLiveLLMService 2025-04-11 10:18:29 -04:00
Mark Backman
dbd7869de7 Add model transcription support 2025-04-11 10:02:52 -04:00
Mark Backman
b7d56d5ff0 Add language support for Gemini Live 2025-04-11 09:21:14 -04:00
Bnowako
61cba0136f Add marker file for static type checkers 2025-04-11 11:00:57 +02:00
Filipi da Silva Fuchter
ed743b55d4 Merge pull request #1561 from pipecat-ai/fixing_voice_agent
Fixing voice agent example
2025-04-10 23:33:35 -03:00
Filipi Fuchter
fb074895f5 Fixing ruff format. 2025-04-10 23:19:31 -03:00
Filipi Fuchter
d916865ccc Fixing voice agent example to work with the last released version of pipecat. 2025-04-10 23:10:50 -03:00
Filipi Fuchter
6378a8ccd3 Starting to implement a signalling message to when the bot has left 2025-04-10 23:02:27 -03:00
Aleix Conchillo Flaqué
5dbb5f176b Merge pull request #1551 from pipecat-ai/aleix/daily-python-0.17.0
pyproject: update daily-python to 0.17.0
2025-04-10 09:06:55 -07:00
Filipi da Silva Fuchter
b89f2611f7 Merge pull request #1539 from pipecat-ai/small_wbertc_mute_state
SmallWebRTC mute state
2025-04-10 11:26:53 -03:00
Filipi Fuchter
db0f783c55 Updating the video-transform demo to use the latest version of the SmallWebRTCTransport. 2025-04-10 11:23:28 -03:00
Filipi Fuchter
20ec323647 Refactoring the video-transform demo to be able to enable or disable the cam. 2025-04-10 11:23:05 -03:00
Filipi Fuchter
f71c09a4fd Added support in SmallWebRTCTransport to detect when remote tracks are muted. 2025-04-10 11:22:37 -03:00
Mark Backman
cba4ebfcf9 Merge pull request #1555 from pipecat-ai/mb/gemini-beta-base 2025-04-10 09:01:16 -04:00
Mark Backman
3b9a8946f9 Update GeminiMultimodalLiveLLMService base_url 2025-04-10 08:17:52 -04:00
Mark Backman
db3620c4be Merge pull request #1553 from balaji-atoa/main
feat: change default model name on live api
2025-04-10 08:10:35 -04:00
Mark Backman
11338ea92d Merge pull request #1552 from pipecat-ai/mb/p2p-capture-image
Add image capture to SmallWebRTCTransport
2025-04-10 07:52:13 -04:00
Filipi da Silva Fuchter
90563a4091 Merge pull request #1542 from pipecat-ai/small_webrtc_prebuilt_ui
Using the small-webrtc-prebuilt-ui
2025-04-10 07:39:26 -03:00
Filipi da Silva Fuchter
937f5f7cb7 Update examples/p2p-webrtc/video-transform/server/requirements.txt
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-10 07:37:23 -03:00
Filipi da Silva Fuchter
4f221b817a Update examples/p2p-webrtc/video-transform/README.md
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-10 07:37:07 -03:00
balaji-atoa
c79c1f65fc feat: change default model name on live api 2025-04-10 11:59:11 +05:30
Mark Backman
8ad2ad0e59 Add image capture to SmallWebRTCTransport 2025-04-09 23:01:06 -04:00
Aleix Conchillo Flaqué
499b258bf9 pyproject: update daily-python to 0.17.0 2025-04-09 18:59:10 -07:00
Filipi Fuchter
05b6a5ae4b Improving the video-transform readme 2025-04-09 15:55:13 -03:00
Filipi Fuchter
65fcea28ce Using the small-webrtc-prebuilt-ui 2025-04-09 15:45:30 -03:00
Kwindla Hultman Kramer
005c0b55b6 Merge pull request #1545 from pipecat-ai/khk/gem-live-0408
Gemini Multimodal Live API base_url format fix
2025-04-08 21:46:30 -07:00
Kwindla Hultman Kramer
1828127f41 small fix to wss base_url 2025-04-08 20:22:26 -07:00
Filipi da Silva Fuchter
77ab841cab Merge pull request #1532 from pipecat-ai/p2p_ios_demo
iOS demo for the p2p-webrtc video-transform example
2025-04-07 16:58:06 -03:00
Filipi Fuchter
3bbc75110a Mentioning the iOS client inside the changelog and fixing the readme. 2025-04-07 16:54:26 -03:00
Filipi Fuchter
b2ce1d9378 Merge branch 'main' into p2p_ios_demo 2025-04-07 16:50:58 -03:00
Filipi Fuchter
58714865df Using the public version of pipecat-client-ios-small-webrtc 2025-04-07 16:48:18 -03:00
Mark Backman
03b3635b0a Merge pull request #1521 from pipecat-ai/mb/increase-bot-vad-stop-secs
Increase BOT_VAD_STOP_SECS for services with slower speech patterns
2025-04-07 14:44:31 -04:00
Mark Backman
aaa7b5e626 Merge pull request #1524 from pipecat-ai/mb/tts-generate-with-text
TTS: Skip generation when there is no text
2025-04-07 14:44:18 -04:00
Varun Singh
0b8486ce39 Merge pull request #1418 from pipecat-ai/vr000m-pcc-dialin-webhook-server
Pipecat Cloud: Companion server to handle webhooks for pinless dial-in
2025-04-07 09:00:38 -07:00
Mark Backman
d4ae091ddd Update port in FastAPI README, add run steps to nextjs README 2025-04-07 11:09:43 -04:00
Mark Backman
9e0a57a6de Rename directories 2025-04-07 10:44:41 -04:00
Mark Backman
fc4c1e4110 README updates 2025-04-07 10:33:18 -04:00
Mark Backman
9b740d9e72 Merge pull request #1537 from pipecat-ai/mb/azure-tts-lang
Fix: Set language for Azure TTS services
2025-04-07 09:46:08 -04:00
Mark Backman
b03563765f Fix: Set language for Azure TTS services 2025-04-07 09:24:31 -04:00
Filipi Fuchter
a1578bd67a iOS demo for the p2p-webrtc video-transform example 2025-04-04 16:40:52 -03:00
Filipi da Silva Fuchter
6466573b84 Merge pull request #1498 from pipecat-ai/aiortc_example_ios
Improvements for the SmallWebRTCTransport
2025-04-04 16:39:06 -03:00
Filipi Fuchter
b42dc83696 Improvements for the SmallWebRTCTransport:
- Wait until the pipeline is ready before triggering the `connected` event.
  - Queue messages if the data channel is not ready.
  - Update the aiortc dependency to fix an issue where the 'video/rtx' MIME type
    was incorrectly handled as a codec retransmission.
  - Avoid initial video delays.
2025-04-04 16:33:57 -03:00
Filipi Fuchter
fe5931b884 Updating aiortc to fix an issue where 'video/rtx' MIMEType retransmission incorrectly handled as a codec 2025-04-04 16:28:54 -03:00
Filipi Fuchter
4b438ff7d7 Allowing ngrok connections to the video-transform demo 2025-04-04 16:28:37 -03:00
Filipi da Silva Fuchter
89a8c16676 Merge pull request #1531 from pipecat-ai/fix_chunk_default_value
Fixed SmallWebRTCTransport to support dynamic chunk values.
2025-04-04 16:04:05 -03:00
Filipi Fuchter
c4c92585f9 Fixed SmallWebRTCTransport to support dynamic chunk values. 2025-04-04 15:38:12 -03:00
Prem Adithya
c510870736 Merge branch 'pipecat-ai:main' into anthropic-client-bug-fixes 2025-04-04 16:41:04 +11:00
Shaiyon Hariri
af23200511 Use default google creds as fallback when not provided in llm_vertex,stt, and tts 2025-04-03 16:42:58 -04:00
Mark Backman
63146d6f85 TTS: Skip generation when there is no text 2025-04-03 16:15:58 -04:00
Mattie Ruth
ec00edc893 Update client examples to use latest versions (#1523) 2025-04-03 15:47:03 -04:00
Mark Backman
a21be058e2 Increase BOT_VAD_STOP_SECS for services with slower speech patterns 2025-04-03 15:25:48 -04:00
Mark Backman
c226c20e12 Merge pull request #1519 from pipecat-ai/mb/ref-docs-toc
Docs: Update ToC With Adapters and Observers
2025-04-03 15:19:35 -04:00
Aleix Conchillo Flaqué
78e6669105 Merge pull request #1514 from pipecat-ai/aleix/producer-consumer-processors
processors: add ProducerProcessor and ConsumerProcessor
2025-04-03 12:18:49 -07:00
Aleix Conchillo Flaqué
79f29e14dd processors: add ProducerProcessor and ConsumerProcessor 2025-04-03 09:44:56 -07:00
Mark Backman
d4a00fd080 Merge pull request #1517 from pipecat-ai/mb/update-simple-chatbot-packages
Update client packages for simple-chatbot JS and React
2025-04-03 10:07:40 -04:00
Mark Backman
d4186fa115 Merge pull request #1518 from pipecat-ai/mb/openai-verse
Add verse voice and bump the OpenAI version
2025-04-03 09:48:09 -04:00
Mark Backman
3536cbcd13 Add docstrings to FunctionSchema, update CONTRIBUTING.md with docstrings guidance, ignore __init__ docstrings if a class is sufficiently documented 2025-04-03 09:21:26 -04:00
Mark Backman
e3bcb70b13 Update ToC With Adapters and Observers 2025-04-03 09:02:09 -04:00
Mark Backman
19a82f9522 Add verse voice and bump the OpenAI version 2025-04-03 08:23:59 -04:00
Mark Backman
8c0a847449 Update client packages for simple-chatbot JS and React 2025-04-03 07:43:25 -04:00
Dominic Stewart
e3704cd1a1 Updated imports to work with pipecat 0.62 (#1515) 2025-04-03 15:07:02 +08:00
Dominic Stewart
1ba037865b Call Transfer demo (#1348)
* Updated code to dial out to an operator, keep track of operator conversation while escalated and then return to conversation when finished

* Removed unnecessary imports

* Updated bot runner code, added call routing file and then updated the call transfer and voicemail detection examples

* Updated the bot files

* Made prompt one level higher in the body and an array

* Updated call transfer examples to work correctly

* Updated gemini voicemail detection example to work

* Added twilio bot support back to the bot_runner

* Moved some state management, participant management and other logic to the helper file.

* Updated comments

* Updated env and requirements file

* Ran the examples and made sure code works. Still need to work on the prompts a bit

* Fixed format issue

* Add support to disable summary in call transfer

* Added support for operator transfer mode

* Updated readme file

* Updated readme based on feedback, and handling of various properties in the json to be more flexible for future examples

* Updated number of endpoints

* Updated readme to remove fly deployment text and replaced with Pipecat Cloud

* Starting to tweak function calls and prompts

* Updated examples to more consistently call the functions and say what they need to say

* Updated examples

* Updated examples

* Updated examples to work correctly

* Add simple bot versions of dialin and dialout

* Refactored the bot runner file to make adding future examples easier

* Based on feedback, removed examples for multiple LLMs and also adjusted voicemail detection code to be simpler

* Made sure to only capture the users transcription once

* Updated readme with latest changes

* Forgot to update the order of examples in one place

* Fixed formatting issue

* Adjusted based on james feedback

* Changed default_mode to default_calltransfer_mode
2025-04-03 09:03:23 +09:00
Aleix Conchillo Flaqué
909520f76e Merge pull request #1508 from pipecat-ai/mb/gemini-push-stop-speaking-frame
LLMAssistantContextAggregator should push BotStoppedSpeakingFrames
2025-04-02 16:25:08 -07:00
Mark Backman
d06cfcd597 Merge pull request #1512 from pipecat-ai/mb/fix-gemini-examples
Examples: Fix context_aggregator.assistant() pipeline position
2025-04-02 19:07:09 -04:00
Mark Backman
2579d0cf57 Examples: Fix context_aggregator.assistant() pipeline position 2025-04-02 16:11:03 -04:00
Mark Backman
1ec20b2e74 Merge pull request #1509 from pipecat-ai/mb/openia-voices
Add new voices to OpenAITTSService
2025-04-02 15:50:39 -04:00
Mark Backman
55a6e5aa4c Add new voices to OpenAITTSService 2025-04-02 12:09:36 -04:00
Varun Singh
2229730169 moving to appropriate directory 2025-04-01 23:45:09 -07:00
Varun Singh
24b54c66ee fixes review comments 2025-04-01 23:39:21 -07:00
Varun Singh
a14205415f replaced dailyAPIKey with pccApiKey, also allow handling of messages when hmac is missing 2025-04-01 23:34:24 -07:00
Varun Singh
18b56d4a10 Fix README.md 2025-04-01 23:32:50 -07:00
Mark Backman
b85bd91d08 LLMAssistantContextAggregator should push BotStoppedSpeakingFrames 2025-04-01 23:35:09 -04:00
Aleix Conchillo Flaqué
23f3285a7d Merge pull request #1507 from pipecat-ai/aleix/pipecat-0.0.62
update CHANGELOG for 0.0.62
2025-04-01 19:00:06 -07:00
Aleix Conchillo Flaqué
94f6436619 update CHANGELOG for 0.0.62 2025-04-01 18:55:04 -07:00
Aleix Conchillo Flaqué
480692971c Merge pull request #1506 from pipecat-ai/aleix/websockets-mixer-loop-fixes
transports(websocket): close connection from last transport
2025-04-01 18:52:47 -07:00
Aleix Conchillo Flaqué
5df5f6ae4c transports(websocket): close connection from last transport 2025-04-01 18:32:03 -07:00
Aleix Conchillo Flaqué
6940112ab9 Merge pull request #1504 from pipecat-ai/aleix/base-output-transport-audio-10ms-chunk-update
TransportParams: set audio_out_10ms_chunks to 4
2025-04-01 15:15:24 -07:00
Aleix Conchillo Flaqué
80584e9138 TransportParams: set audio_out_10ms_chunks to 4 2025-04-01 15:13:28 -07:00
Aleix Conchillo Flaqué
1fd01e715d Merge pull request #1503 from pipecat-ai/aleix/function-call-result-system-frame
frames: make FunctionCallResultFrame a SystemFrame
2025-04-01 15:08:26 -07:00
Aleix Conchillo Flaqué
a7a1cd0cde Merge pull request #1502 from pipecat-ai/aleix/test-user-idle-py310
tests: fix test_user_idle_processor for python 3.10
2025-04-01 15:08:10 -07:00
Aleix Conchillo Flaqué
e5a6b9d2b4 Merge pull request #1500 from pipecat-ai/aleix/base-output-transport-optimize-bot-speaking
BaseOutputTransport: optimize BotSpeakingFrames
2025-04-01 14:59:25 -07:00
Aleix Conchillo Flaqué
169b50af61 frames: make FunctionCallResultFrame a SystemFrame 2025-04-01 14:42:22 -07:00
Aleix Conchillo Flaqué
31311d8ac5 tests: fix test_user_idle_processor for python 3.10 2025-04-01 13:54:59 -07:00
Aleix Conchillo Flaqué
bfd06b321d BaseOutputTransport: optimize BotSpeakingFrames 2025-04-01 11:11:49 -07:00
Aleix Conchillo Flaqué
3efbcab39c Merge pull request #1499 from pipecat-ai/aleix/base-output-transport-set-chunks-size
BaseOutputTransport: allow setting 10ms output audio chunks
2025-04-01 11:10:34 -07:00
Aleix Conchillo Flaqué
b40ca391f5 BaseOutputTransport: allow setting 10ms output audio chunks 2025-04-01 10:48:36 -07:00
Aleix Conchillo Flaqué
43008c8c5b Merge pull request #1501 from pipecat-ai/aleix/transcription-processor-interruption
TranscriptProcessor: send TranscriptionUpdateFrame after interruption
2025-04-01 10:46:16 -07:00
Aleix Conchillo Flaqué
3a37b11e56 TranscriptProcessor: send TranscriptionUpdateFrame after interruption 2025-04-01 10:21:21 -07:00
Mark Backman
9ea81bc982 Merge pull request #1497 from pipecat-ai/mb/gladia-languages
Align languages with Gladia's supported languages, remove audio_enhancer option
2025-04-01 11:54:24 -04:00
Mark Backman
98b499e2e9 Remove audio_enhancer option 2025-04-01 10:26:28 -04:00
Mark Backman
72c8f6c8c3 Update GladiaSTTService language list 2025-04-01 10:17:42 -04:00
Mark Backman
ea61256ddc Merge pull request #1496 from pipecat-ai/mb/gladia-model
Update GladiaSTTService default model
2025-04-01 08:52:13 -04:00
Mark Backman
babafadbe4 Merge pull request #1494 from pipecat-ai/mb/p2p-examples-gitignore
Add .gitignore to p2p video-transform example
2025-04-01 07:39:35 -04:00
Mark Backman
a5660f6dc7 Add .gitignore to p2p video-transform example 2025-04-01 07:20:39 -04:00
Aleix Conchillo Flaqué
64ad916c5f Merge pull request #1492 from pipecat-ai/aleix/downgrade-to-aiohttp-3.11.12
pyproject: downgrade to aiohttp 3.11.12
2025-03-31 19:01:04 -07:00
Aleix Conchillo Flaqué
13d0563298 pyproject: downgrade to aiohttp 3.11.12
See https://pypi.org/project/aiohttp/#history
2025-03-31 18:59:41 -07:00
Mark Backman
20a1dd066d Update GladiaSTTService default model 2025-03-31 19:02:28 -04:00
Mark Backman
56f6e3ceb4 Merge pull request #1490 from pipecat-ai/fix_ruff_format
Fixing ruff format.
2025-03-31 18:37:19 -04:00
Mark Backman
3afab63870 Merge pull request #1488 from pipecat-ai/mb/stt-mute-filter-logline
Clarify the mute/unmute log line in STTMuteFilter
2025-03-31 18:35:47 -04:00
Filipi Fuchter
d3b9a0aab0 Fixing ruff format. 2025-03-31 19:17:40 -03:00
Filipi da Silva Fuchter
6b21081a7d Merge pull request #1487 from pipecat-ai/smallwebrtc_ios_support
SmallWebRTCTransport: Improvements to work with mobile
2025-03-31 19:10:03 -03:00
Aleix Conchillo Flaqué
648bdea64c fix formatting 2025-03-31 15:04:45 -07:00
milo157
ed387e876a Merge pull request #1486 from CerebriumAI/feature/ultravox
Feature/ultravox - bug fixes
2025-03-31 15:03:26 -07:00
Aleix Conchillo Flaqué
2fb9aa4d76 Merge pull request #1489 from pipecat-ai/aleix/base-ai-services-restructure
services: restructure base AI services into modules
2025-03-31 15:00:13 -07:00
Aleix Conchillo Flaqué
9eba8f1637 services: restructure base AI services into modules 2025-03-31 13:53:36 -07:00
Mark Backman
43c255f58a Clarify the mute/unmute log line in STTMuteFilter 2025-03-31 16:45:02 -04:00
Filipi Fuchter
121e70a029 Improvements on the video transform example to work on mobile. 2025-03-31 17:11:38 -03:00
Filipi Fuchter
70e28a0547 Adding support to yuvj420p which is the format that we receive from mobile iOS. 2025-03-31 13:12:20 -03:00
Mark Backman
c9a93f2504 Merge pull request #1469 from pipecat-ai/mb/update-gladia
Refactor GladiaSTTService to support addition params
2025-03-31 11:18:32 -04:00
Adithya Suresh
e8783f6a33 Handle cache token counts being none 2025-03-31 15:25:11 +11:00
Mark Backman
8a12470efd Reorganize into a directory 2025-03-30 20:01:40 -04:00
Mark Backman
05d53bc66f Refactor GladiaSTTService; add support for additional params 2025-03-30 19:54:55 -04:00
Aleix Conchillo Flaqué
e763cd7bee Merge pull request #1471 from pipecat-ai/aleix/services-restructure
services: restructure services into folders
2025-03-30 16:23:26 -07:00
Aleix Conchillo Flaqué
94ec5118e6 track already reported deprecated modules (mark's update) 2025-03-30 16:21:00 -07:00
Aleix Conchillo Flaqué
7203ef6885 examples: use new services packages 2025-03-30 16:21:00 -07:00
Aleix Conchillo Flaqué
3074a62bb1 services: restructure services into folders 2025-03-30 16:21:00 -07:00
Mark Backman
31712b84ac Merge pull request #1479 from pipecat-ai/mb/qwen-pyproject-entry 2025-03-29 22:36:40 -04:00
Mark Backman
c99ec0b0b7 Add placeholder entry for qwen to pyproject.toml 2025-03-29 20:20:48 -04:00
Mark Backman
cd7abd2962 Merge pull request #1478 from pipecat-ai/mb/alibaba-cloud-offerings
Add QwenLLMService
2025-03-29 20:13:21 -04:00
Mark Backman
c7544954cf Merge pull request #1476 from pipecat-ai/mb/ref-docs-mem0-mlx-whisper
Update reference docs generation for mem0 and mlx-whisper
2025-03-29 20:12:58 -04:00
Mark Backman
4f390b15a3 Merge pull request #1477 from pipecat-ai/mb/fix-mem0-example-number
Renumber mem0 example, small changelog updates
2025-03-29 20:04:45 -04:00
Mark Backman
f2a05b065d Add QwenLLMService 2025-03-29 19:43:37 -04:00
Mark Backman
5d5041eb2b Renumber mem0 example, small changelog updates 2025-03-29 18:45:39 -04:00
Mark Backman
f4dc66cb13 Update reference docs generation for mem0 and mlx-whisper 2025-03-29 18:42:08 -04:00
Mark Backman
b88744b18d Merge pull request #1475 from pipecat-ai/khk/whisper-mlx-example
Example and CHANGELOG for WhisperSTTServiceMLX service
2025-03-29 18:09:17 -04:00
Kwindla Hultman Kramer
209de2638d WhisperSTTServiceMLX example and CHANGELOG 2025-03-29 18:04:07 -04:00
Mark Backman
5d829fb6a9 Merge pull request #1474 from pipecat-ai/khk/mem0-changelog
Changelog entry for mem0 service
2025-03-29 18:02:32 -04:00
Mark Backman
a978a5cd4a Fix Whisper formatting 2025-03-29 17:57:50 -04:00
Mark Backman
b9ea3f0fd9 Update README, organize pyproject.toml 2025-03-29 17:56:17 -04:00
Kwindla Hultman Kramer
d2f5ee2915 Changelog entry for mem0 service 2025-03-29 17:55:26 -04:00
Mark Backman
acddddc508 Merge pull request #1472 from pipecat-ai/mb/small-webrtc-readme
Add README link for SmallWebRTCTransport
2025-03-29 17:38:15 -04:00
Kwindla Hultman Kramer
0c2c6fa771 Merge pull request #1383 from zboyles/add-mlx-whisper
Added Support for MLX Whisper models on Apple M-Series
2025-03-29 14:25:37 -07:00
Mark Backman
80088c6138 Merge pull request #1473 from pipecat-ai/mb/ref-docs-updates
Update packages for auto-generating docs
2025-03-29 17:20:46 -04:00
Kwindla Hultman Kramer
766639a9a4 Merge pull request #1388 from deshraj/user/dyadav/mem0-integration
Added mem0 service.
2025-03-29 13:12:58 -07:00
Mark Backman
675e2b1498 Update packages for auto-generating docs 2025-03-29 08:21:58 -04:00
Mark Backman
af6c23f7b1 Add README link for SmallWebRTCTransport 2025-03-28 21:29:24 -04:00
Aleix Conchillo Flaqué
d212e88030 Merge pull request #1468 from pipecat-ai/aleix/smallwebrtc-updates
transports(webrtc): some SmallWebRTC updates
2025-03-28 14:41:45 -07:00
Aleix Conchillo Flaqué
d6758bf2ad transports(webrtc): rename appMessage to app-message 2025-03-28 14:35:11 -07:00
Filipi Fuchter
5abfb15300 Registering the event handlers and fixing the examples. 2025-03-28 17:30:06 -03:00
Aleix Conchillo Flaqué
f576254d61 transports(webrtc): some SmallWebRTC updates 2025-03-28 13:19:23 -07:00
Aleix Conchillo Flaqué
a90807a3d2 Merge pull request #1465 from roey-priel/main
Tavus / Deepgram TTS compatibility
2025-03-28 08:43:09 -07:00
roey
a06fc4ce50 yield outside of the loop 2025-03-28 08:41:36 -07:00
roey
80cb4497f0 Merge pull request #1 from roey-priel/deepgram-tts-tavus-compatibility
Update deepgram.py
2025-03-27 17:06:33 -07:00
roey
8aa878c5e9 Update deepgram.py 2025-03-27 17:05:29 -07:00
Filipi da Silva Fuchter
e982b3d919 Merge pull request #1290 from pipecat-ai/aiortc_example
P2P WebRTC transport option to Pipecat
2025-03-27 18:29:44 -03:00
Filipi Fuchter
8945fd1fc6 Starting the server by default as localhost. 2025-03-27 18:27:56 -03:00
Filipi Fuchter
16b97d151b Adding the SmallWebRTCTransport to the changelog. 2025-03-27 17:56:12 -03:00
Filipi Fuchter
f7ac142ad2 Merge branch 'main' into aiortc_example 2025-03-27 17:50:46 -03:00
Filipi da Silva Fuchter
2355067f61 Merge pull request #1441 from pipecat-ai/aiortc_example_small_webrtc_transport
P2P WebRTC transport - example improvements.
2025-03-27 17:49:12 -03:00
Filipi Fuchter
76f9626d35 Using the @pipecat-ai/small-webrtc-transport from npm. 2025-03-27 17:48:32 -03:00
Filipi da Silva Fuchter
f82c2566e8 Merge pull request #1270 from pipecat-ai/improve_protobuf_serializer
Added support to `ProtobufFrameSerializer` to send the transport messages
2025-03-27 17:28:37 -03:00
Filipi Fuchter
b6007bb3d6 Added support to ProtobufFrameSerializer to send the transport messages 2025-03-27 17:26:03 -03:00
Filipi Fuchter
311a5360ad Renaming the example to p2p-webrtc 2025-03-27 16:46:00 -03:00
Filipi Fuchter
62cb0376f2 Changing the file types. 2025-03-27 16:34:40 -03:00
Filipi Fuchter
91a69b7029 Improving the readmes for the webrtc examples. 2025-03-27 16:32:46 -03:00
Mark Backman
1d4d7f28a1 Merge pull request #1463 from pipecat-ai/mb/add-piper-readme
Add Piper to README
2025-03-27 08:52:31 -04:00
Mark Backman
a55a7bbb96 Add Piper to README 2025-03-27 08:03:16 -04:00
Mark Backman
a394b35e85 Merge pull request #1459 from pipecat-ai/mb/issue-1454
Fix: GoogleTTSService was emitting two TTSStoppedFrames
2025-03-27 08:00:16 -04:00
Mark Backman
aa85df4fd6 Fix: GoogleTTSService was emitting two TTSStoppedFrames 2025-03-27 07:55:19 -04:00
Filipi da Silva Fuchter
3bb1f5f7a8 Merge pull request #1130 from pedro-a-n-moreira/piper-tts
Add support for Piper TTS
2025-03-27 08:08:05 -03:00
Filipi Fuchter
7c115f9d59 Merge branch 'main' into piper-tts
# Conflicts:
#	CHANGELOG.md
2025-03-27 08:01:38 -03:00
Filipi Fuchter
a82b847971 Fixing ruff format. 2025-03-27 07:58:53 -03:00
Filipi Fuchter
50515aa842 Adding PiperTTSService to the changelog. 2025-03-27 07:50:47 -03:00
Filipi Fuchter
b348fde32b Refactoring PiperTTSService to match the others TTS services provided by Pipecat and fixing noise issue due to wav header. 2025-03-27 07:46:38 -03:00
Filipi Fuchter
45787520b2 Refactoring the piper test to use run_test provided by Pipecat 2025-03-27 07:45:28 -03:00
Filipi Fuchter
053bf72da2 Adding pytest-aiohttp to the dev requirements. 2025-03-27 07:44:46 -03:00
Filipi Fuchter
ca4893397a Creating a foundational example which uses the piper service. 2025-03-27 07:44:26 -03:00
Filipi Fuchter
c1f6a4e079 Adding PIPER_BASE_URL to the env template. 2025-03-27 07:44:05 -03:00
Aleix Conchillo Flaqué
135ed811f1 Merge pull request #1460 from pipecat-ai/aleix/segmented-tts-ignore-emulated-frames
segmented tts ignore emulated frames
2025-03-26 16:03:59 -07:00
Aleix Conchillo Flaqué
055a3f1c53 LLMAssistantContextAggregator: stop emulations if the user starts speaking 2025-03-26 14:39:12 -07:00
Aleix Conchillo Flaqué
750bb88586 SegmentedSTTService: ignore emulated frames 2025-03-26 14:38:48 -07:00
Aleix Conchillo Flaqué
c4f9171fe1 frames: indicate if UserStartedSpeakingFrame/UserStoppedSpeakingFrame are emulated 2025-03-26 14:37:36 -07:00
Filipi Fuchter
d223201c3f Merge branch 'main' into piper-tts
# Conflicts:
#	test-requirements.txt
2025-03-26 16:47:45 -03:00
Mark Backman
86701fd3c7 Merge pull request #1457 from pipecat-ai/mb/fix-rtvi-observer-gemini
Fix: Resolve an issue where Google LLM context messages were causing …
2025-03-26 14:18:37 -04:00
Mark Backman
b414077a07 Fix: Resolve an issue where Google LLM context messages were causing a TypeError 2025-03-26 13:55:42 -04:00
kompfner
15f23929e9 Merge pull request #1455 from pipecat-ai/prepare-0.0.61
Update CHANGELOG for 0.0.61
2025-03-26 13:50:29 -04:00
Mark Backman
cc9e4047d0 Merge pull request #1447 from nicougou/feat/support_tts_instruct
feature/support instructions in OpenAITTSService
2025-03-26 13:35:41 -04:00
Paul Kompfner
4ef4dcefce Update CHANGELOG for 0.0.61 2025-03-26 13:06:31 -04:00
kompfner
f3caa8cf7a Merge pull request #1452 from pipecat-ai/daily-python-0.16.1
Bump daily-python dependency to 0.16.1 to pick up a bugfix
2025-03-26 13:01:38 -04:00
Mark Backman
e5470fec7a Merge pull request #1453 from pipecat-ai/khk/groq
New GroqTTSService
2025-03-26 12:49:18 -04:00
Mark Backman
887c197bce Add sample_rate to the constructor 2025-03-26 12:29:40 -04:00
Kwindla Hultman Kramer
f5d49fea81 try/catch import of groq SDK 2025-03-26 12:29:40 -04:00
Kwindla Hultman Kramer
e087f6ec5d GroqTTSService added to CHANGELOG.md 2025-03-26 12:29:39 -04:00
Kwindla Hultman Kramer
406f5a395b fix class heirarchy and audio chunking 2025-03-26 12:29:18 -04:00
Kwindla Hultman Kramer
060bb4c26b wip 2025-03-26 12:29:18 -04:00
Nico
499e69846d review: add changelog entries 2025-03-26 17:13:30 +01:00
Paul Kompfner
e6e339a02e Bump daily-python dependency to 0.16.1 to pick up a bugfix 2025-03-26 11:22:23 -04:00
Nico
dc2ee2bf0a review: remove websocket_base_url 2025-03-26 15:41:42 +01:00
Nico
d982fc35d8 fix: formatter 2025-03-26 15:41:42 +01:00
Nico
72d373e565 feature/support instructions in OpenAITTSService 2025-03-26 15:41:42 +01:00
Aleix Conchillo Flaqué
59fdfe697d Merge pull request #1449 from pipecat-ai/aleix/google-assistant-aggregator-function-call-result
GoogleAssistantContextAggregator: allow any value as function call result
2025-03-26 07:25:34 -07:00
Filipi da Silva Fuchter
97c9e0676e Merge pull request #1451 from pipecat-ai/set-tool-choice-from-context-aggregator
Set tool choice from context aggregator
2025-03-26 09:12:26 -03:00
Filipi Fuchter
aeac40312e Added the feature to change dynamically the tool choice to the changelog. 2025-03-26 09:06:29 -03:00
Filipi Fuchter
ce9f75a851 Fixing the tool choice extra type to be a dict instead of string. 2025-03-26 08:17:50 -03:00
Filipi Fuchter
c45d852f6b Merge branch 'main' into set-tool-choice-from-context-aggregator
# Conflicts:
#	src/pipecat/processors/aggregators/llm_response.py
2025-03-26 07:14:57 -03:00
Deshraj Yadav
55cc1fe9f6 Fix import lines 2025-03-25 23:35:47 -07:00
Deshraj Yadav
1ba7e2d6fa Format imports properly 2025-03-25 23:30:01 -07:00
Deshraj Yadav
1b8d326b49 Run ruff 2025-03-25 23:15:35 -07:00
Aleix Conchillo Flaqué
077952b658 GoogleAssistantContextAggregator: allow any value as function call result 2025-03-25 19:11:27 -07:00
Deshraj Yadav
e694971423 Merge pull request #2 from pipecat-ai/khk/mem0
small changes to make 35-mem0.py
2025-03-25 18:10:36 -07:00
Kwindla Hultman Kramer
d00ae492e5 small changes to make 35-mem0.py like the other foundational single-file examples. 2025-03-25 15:51:38 -07:00
Aleix Conchillo Flaqué
9450b07ec5 Merge pull request #1442 from pipecat-ai/aleix/on-context-updated-as-task
LLMAssistantContextAggregator: create a task to run on_context_updated
2025-03-25 15:39:36 -07:00
Aleix Conchillo Flaqué
19b464ba23 tests: add assistant aggregator function call frame handling 2025-03-25 15:37:06 -07:00
Aleix Conchillo Flaqué
8aebf00c2d GoogleAssistantContextAggregator: function call result should be a JSON object 2025-03-25 15:37:06 -07:00
Aleix Conchillo Flaqué
01458895c2 LLMAssistantContextAggregator: create a task to run on_context_updated 2025-03-25 14:37:11 -07:00
kompfner
2082d023ef Merge pull request #1448 from pipecat-ai/daily-python-0.16.0
Bump daily-python dependency to 0.16.0 to pick up support in `DailyTr…
2025-03-25 17:32:38 -04:00
Paul Kompfner
c99436b80e Bump daily-python dependency to 0.16.0 to pick up support in DailyTransport for updating remote participants' canReceive permission via the update_remote_participants() method 2025-03-25 17:29:48 -04:00
Filipi Fuchter
f884c93826 Refactoring the video-transform example to use pipecat client. 2025-03-25 17:32:25 -03:00
Deshraj Yadav
2780c6eed6 Incorporate suggestions 2025-03-25 10:45:08 -07:00
Deshraj Yadav
7ad36eeaf4 Add mem0 as a service integration 2025-03-25 10:44:12 -07:00
Filipi Fuchter
67a93d09c2 Merge branch 'main' into aiortc_example 2025-03-25 10:31:53 -03:00
Aleix Conchillo Flaqué
f3b50bc3c4 Revert "LLMAssistantContextAggregator: create a task to run on_context_updated"
This reverts commit 397bae29f7.
2025-03-24 15:40:26 -07:00
Aleix Conchillo Flaqué
397bae29f7 LLMAssistantContextAggregator: create a task to run on_context_updated 2025-03-24 15:39:35 -07:00
Mark Backman
3b3fdd0da1 Merge pull request #1439 from pipecat-ai/mb/fix-rtvi-bot-speaking-events
Fix: RTVIObserver now outputs a single bot started and stopped speaki…
2025-03-24 11:44:31 -04:00
Mark Backman
a9b1298f3b Fix: RTVIObserver now outputs a single bot started and stopped speaking event per turn 2025-03-24 10:25:31 -04:00
Filipi Fuchter
2fcf4e6d70 Fixing ruff format 2025-03-24 11:23:55 -03:00
Filipi Fuchter
fcb8b9a5b3 Refactoring how we are creating the answer so we don't need to wait for the client gathering all ice candidates. 2025-03-24 11:12:41 -03:00
Filipi Fuchter
fee0409f63 Logging if the remote peer supports trickle ice. 2025-03-24 08:59:21 -03:00
Filipi Fuchter
3be6973e2c Adding support to define ice servers. 2025-03-24 08:57:24 -03:00
Filipi Fuchter
5184d178ef Merge branch 'main' into aiortc_example 2025-03-24 08:37:08 -03:00
Thomas B.
48e8d3968a fix: recognition language correctly set for Azure STT (#1436) 2025-03-23 19:29:52 -07:00
Aleix Conchillo Flaqué
59644a939a Merge pull request #1434 from pipecat-ai/aleix/examples-07-interruptible-local
examples: add foundational 07x-interruptible-local.py
2025-03-23 05:44:40 -07:00
Aleix Conchillo Flaqué
3311afc581 examples: add foundational 07x-interruptible-local.py 2025-03-22 21:58:55 -07:00
Filipi da Silva Fuchter
a3ccbf91f7 Merge pull request #1429 from pipecat-ai/fixing_set_tool_issue
Only checking the length if tools is a list.
2025-03-21 13:56:45 -03:00
Filipi Fuchter
3ed764a769 Only checking the length if tools is a list. 2025-03-21 12:56:05 -03:00
Mark Backman
be8d5a31f5 Merge pull request #1425 from Allenmylath/patch-25
Update env.example
2025-03-21 08:39:03 -04:00
Mark Backman
480bcc1ab1 Merge pull request #1424 from Allenmylath/patch-24
Update requirements.txt
2025-03-21 08:38:54 -04:00
allenmylath
dd81048ddb Update env.example
EXAMPLE USES CARTESI NOT ELEVNE LABS
2025-03-21 10:11:28 +05:30
allenmylath
04d462ff02 Update requirements.txt
example uses cartesia not elevenlabs
2025-03-21 10:09:09 +05:30
Aleix Conchillo Flaqué
7e7aaeddd9 Merge pull request #1423 from pipecat-ai/aleix/elevenlabs-pcm-8000
ElevenLabs: add support for a sample rate of 8000
2025-03-20 19:34:16 -07:00
Aleix Conchillo Flaqué
e77f7c8456 update ruff and pyright versions 2025-03-20 19:19:08 -07:00
Aleix Conchillo Flaqué
442f18d47b ultravox: fix formatting 2025-03-20 19:19:08 -07:00
Aleix Conchillo Flaqué
fc78e6fc5a ElevenLabs: add support for a sample rate of 8000 2025-03-20 19:13:23 -07:00
Aleix Conchillo Flaqué
d71b520153 update CHANGELOG.md and fix formatting 2025-03-20 18:58:06 -07:00
milo157
3b4d91e1c1 Fixed ultravox service bugs (#1420) 2025-03-20 18:55:43 -07:00
Aleix Conchillo Flaqué
09c62d939a Merge pull request #1422 from pipecat-ai/aleix/pipecat-0.0.60
update CHANGELOG for 0.0.60
2025-03-20 16:25:52 -07:00
Aleix Conchillo Flaqué
f2b9789acf update CHANGELOG for 0.0.60 2025-03-20 16:17:34 -07:00
Aleix Conchillo Flaqué
1592703e77 Merge pull request #1421 from pipecat-ai/aleix/rollback-deepgram-to-3.8.0
pyproject: rollback deepgram-sdk to 3.8.0
2025-03-20 16:16:08 -07:00
Aleix Conchillo Flaqué
66e42ae410 pyproject: rollback deepgram-sdk to 3.8.0 2025-03-20 16:15:43 -07:00
Mark Backman
8d6dbbe293 Merge pull request #1417 from pipecat-ai/mb/update-realtime-transcription
Update InputAudioTranscription to use gpt-4o-transcribe model, update…
2025-03-20 18:49:06 -04:00
Mark Backman
2ac8f2ec2d Fix linting 2025-03-20 18:40:16 -04:00
Paul Kompfner
41688205be Provide new settings in OpenAI Realtime example 2025-03-20 18:23:25 -04:00
Mark Backman
541a4b6063 Update InputAudioTranscription to use gpt-4o-transcribe model, update 19 examples to use FunctionSchema 2025-03-20 18:23:24 -04:00
Aleix Conchillo Flaqué
8f6d92ce7d update CHANGELOG with BaseOpenAILLMService default_headers 2025-03-20 13:47:15 -07:00
Aleix Conchillo Flaqué
96fa6c19a8 Merge pull request #1398 from nicougou/feature/openai_custom_headers
feature: add custom headers to AsyncOpenAI
2025-03-20 13:45:57 -07:00
Varun Singh
c9f7882728 initial commit 2025-03-20 12:31:08 -07:00
Aleix Conchillo Flaqué
0fdd577ae7 Merge pull request #1416 from pipecat-ai/aleix/pipecat-0.0.59
update CHANGELOG for 0.0.59
2025-03-20 11:48:14 -07:00
Aleix Conchillo Flaqué
2133152e5b update CHANGELOG for 0.0.59 2025-03-20 11:42:54 -07:00
Aleix Conchillo Flaqué
c3f3f4603d Merge pull request #1413 from pipecat-ai/aleix/llm-user-aggregator-emulate-fixes
LLMUserContextAggregator: fix emulated user started/stopped speaking issues
2025-03-20 11:41:26 -07:00
Aleix Conchillo Flaqué
b20ce7d655 examples: move 07u-interruptible-neuphonic to 07v 2025-03-20 11:38:29 -07:00
Aleix Conchillo Flaqué
66ba1116a4 pyproject: rollback azure to 1.42.0 2025-03-20 11:23:40 -07:00
Aleix Conchillo Flaqué
08956e914a livekit: remove unnecessary transport cleanup() function 2025-03-20 11:23:40 -07:00
Aleix Conchillo Flaqué
5a39f146f6 LLMUserContextAggregator: fix emulated user started/stopped speaking issues 2025-03-20 11:23:40 -07:00
kompfner
de8a831ee1 Merge pull request #1414 from pipecat-ai/march-main
March OpenAI updates
2025-03-20 14:22:09 -04:00
Aleix Conchillo Flaqué
efa5f133d7 openai_realtime: fix and update function calling 2025-03-20 11:14:59 -07:00
Paul Kompfner
44380bc8c0 Remove duplicate changelog entry due to rebase mistake 2025-03-20 13:51:16 -04:00
Paul Kompfner
721ee75887 Comment tweak 2025-03-20 13:43:00 -04:00
Paul Kompfner
ada68f0699 More robust handling of conversation item retrieval errors in OpenAIRealtimeBetaLLMService 2025-03-20 13:43:00 -04:00
Mark Backman
70dbf0d6fc Updated default models for OpenAISTTService and OpenAITTSService to gpt-4o based models 2025-03-20 13:42:56 -04:00
Paul Kompfner
f0774268cc Rename gpt-4o-transcribe-latest to gpt-4o-transcribe in OpenAIRealtimeBetaLLMService 2025-03-20 13:39:40 -04:00
Chad Bailey
2ae5bdd8a9 lets talk about dogs 2025-03-20 13:39:40 -04:00
Chad Bailey
0d74bcacb7 updated models in the 07g example 2025-03-20 13:39:40 -04:00
Paul Kompfner
f94a099111 Revert the default model to be "gpt-4o-realtime-preview-2024-12-17" In OpenAIRealtimeBetaLLMService 2025-03-20 13:39:36 -04:00
Paul Kompfner
3dd4ef7230 Tweak changelog entries describing slate of recent updates to OpenAIRealtimeBetaLLMService 2025-03-20 13:36:22 -04:00
Paul Kompfner
e707efbffa Update changelog with slate of recent updates to OpenAIRealtimeBetaLLMService 2025-03-20 13:35:12 -04:00
Paul Kompfner
7b594093dd Handle the possibility of multiple concurrent calls to retrieve_conversation_item() in the OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
31317ce77d Add error handling to the retrieve_conversation_item() method of the OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
f693a3c70f Add retrieve_conversation_item() method to OpenAIRealtimeBetaLLMService, using the new conversation.item.retrieve introspection message. 2025-03-20 13:31:28 -04:00
Paul Kompfner
39ca607bbb Add on_conversation_item_created and on_conversation_item_updated events to OpenAIRealtimeBetaLLMService.
The hope is that this will expose to the user conversation item ids at relevant times for them to use with the new `conversation.item.retrieve` introspection message.
2025-03-20 13:31:28 -04:00
Paul Kompfner
9840abd85b Make it so you specifying model=None when creating a InputAudioTranscription results in a validation error 2025-03-20 13:31:28 -04:00
Paul Kompfner
1075c25055 Add new semantic turn detection option to OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
e91610c69e linter fix 2025-03-20 13:31:28 -04:00
Paul Kompfner
1a20d9bed7 Add new input_audio_noise_reduction option to OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
d009b80438 Add new GPT-4o transcription option to OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
kompfner
fe5fc30211 Revert "Add new GPT-4o transcription option to OpenAIRealtimeBetaLLMService" 2025-03-20 13:31:28 -04:00
Paul Kompfner
be2cf6d556 formatting fix 2025-03-20 13:31:28 -04:00
Paul Kompfner
e80bfe22de Add new GPT-4o transcription option to OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
214c8f79eb linter fix 2025-03-20 13:31:28 -04:00
Paul Kompfner
16accafa6d formatting fix 2025-03-20 13:31:28 -04:00
Kwindla Hultman Kramer
4449e9a25b add response.done status=failed error 2025-03-20 13:31:28 -04:00
Kwindla Hultman Kramer
bfdf52bd69 change examples/foundational/19-openai-realtime-beta.py to use the new preview model 2025-03-20 13:31:28 -04:00
Kwindla Hultman Kramer
2b4debec11 add support for conversation.item.input_audio_transcription.delta 2025-03-20 13:31:28 -04:00
Mark Backman
f4626287cd Merge pull request #1411 from pipecat-ai/mb/add-fal-wizper
Add FalSTTService
2025-03-20 13:08:08 -04:00
Mark Backman
e4bb4aacb4 Example: Rename 07 ultravox example 2025-03-20 12:46:00 -04:00
Mark Backman
f298febacf Add FalSTTService 2025-03-20 12:45:16 -04:00
Aleix Conchillo Flaqué
c51291190b Merge pull request #1394 from pipecat-ai/aleix/function-calls-as-tasks
function calls as tasks
2025-03-20 09:34:37 -07:00
Aleix Conchillo Flaqué
e0c3f6ad83 services: mark function calls as completed even the result is None 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
b1d506c137 GoogleAssistantContextAggregator: properly update function response 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
1f6ed01ba6 LLMAssistantContextAggregator: remove tool call id with image requests 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
3e9678db84 user image requests can now be related to function calls 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
d455fd070e update CHANGELOG 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
d1550d5a85 tests: remove TestFrameProcessor, reimplement with run_test() 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
c15286b148 examples: deprecate start_callback from LLMService.register_function() 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
a98000fd1d function calling now run in tasks 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
fc06306efd Merge pull request #1406 from pipecat-ai/aleix/pipeline-task-idle-timeouts
PipelineTask: automatically cancel tasks if pipeline is idle
2025-03-20 08:37:39 -07:00
Mark Backman
039fa59165 Merge pull request #1409 from pipecat-ai/aleix/segmented-stt-service-vad-events
SegmentedSTTService: use VAD events to detect valid audio
2025-03-20 09:11:08 -04:00
Aleix Conchillo Flaqué
0e14cec139 pyproject: update multiple libraries 2025-03-20 01:22:33 -07:00
Aleix Conchillo Flaqué
2417ec4f92 LLMUserContextAggregator: increase bot_interruption_timeout to 5 seconds 2025-03-20 01:20:34 -07:00
Aleix Conchillo Flaqué
7cdcd1c3d1 OpenAITTSService: allow specifying any model name 2025-03-20 01:20:34 -07:00
Aleix Conchillo Flaqué
b6be25ab84 SegmentedSTTService: use VAD events to detect valid audio 2025-03-20 00:31:49 -07:00
Aleix Conchillo Flaqué
e18d9f6a11 PipelineTask: automatically cancel tasks if pipeline is idle 2025-03-19 23:30:46 -07:00
Mark Backman
3a73346a41 Merge pull request #1408 from pipecat-ai/mb/claude-models-example
Update to Claude 3.7 Sonnet latest in examples
2025-03-20 01:44:59 -04:00
Aleix Conchillo Flaqué
8d58d1c8bb Merge pull request #1404 from pipecat-ai/aleix/gemini-push-frame-fixes
GeminiMultimodalLiveLLMService: fix duplicated messages in context
2025-03-19 21:51:39 -07:00
Mark Backman
07a77e066f Update to Claude 3.7 Sonnet latest in examples 2025-03-19 23:18:30 -04:00
Aleix Conchillo Flaqué
3024896d3d Merge pull request #1405 from pipecat-ai/aleix/tts-services-fallback
WebsocketTTSService: add `on_connection_error` and `reconnect_on_error`
2025-03-19 19:39:51 -07:00
Aleix Conchillo Flaqué
a3b5e4413a WebsocketTTSService: add on_connection_error and reconnect_on_error 2025-03-19 19:38:08 -07:00
Aleix Conchillo Flaqué
f31e77c4f6 pyproject: added empty tavus dependencies 2025-03-19 18:43:07 -07:00
Aleix Conchillo Flaqué
8942c2e053 GeminiMultimodalLiveLLMService: fix duplicated messages in context
Fixes #1384
2025-03-19 15:33:54 -07:00
Aleix Conchillo Flaqué
afb26be0ad Merge pull request #1396 from pipecat-ai/aleix/stt-service-audio-passthrough
SegmentedSTTService: allow audio to pass-through downstream
2025-03-19 11:16:40 -07:00
Aleix Conchillo Flaqué
48d73a2636 SegmentedSTTService: allow audio to pass-through downstream 2025-03-19 11:06:12 -07:00
Aleix Conchillo Flaqué
da531dabfd Merge pull request #1304 from pipecat-ai/aleix/handle-emails-user-email-gathering
add skip tags aggregator to support TTS service spelling out tags
2025-03-19 11:05:10 -07:00
Aleix Conchillo Flaqué
336e2f1579 TTSServices: for now just specify a single text aggregator 2025-03-19 11:02:29 -07:00
Aleix Conchillo Flaqué
fc0f404d26 examples: add new 36-user-email-gathering.py 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
54620133d4 services: add spelling out support to CartesiaTTSService and RimeTTSService 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
e7224473f2 utils(text): add new SkipTagsAggregator 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
1a3a268c9d utils(string): add new function parse_start_end_tags() 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
11984b89b7 utils(string): add support for floating point numbers 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
1dbad2326a utils(string): support email addresses in end of sentence matching 2025-03-19 10:57:27 -07:00
Mark Backman
2e0c6c2bd1 Merge pull request #1397 from pipecat-ai/mb/disconnect-bot
Fix: RTVI message disconnect-bot now pushes EndTaskFrame
2025-03-19 10:45:24 -04:00
Nico
5f28834588 feature: add custom headers to AsyncOpenAI 2025-03-19 14:49:51 +01:00
Mark Backman
7f1ccab445 Fix: RTVI message disconnect-bot now pushes EndTaskFrame 2025-03-19 07:07:45 -04:00
Aleix Conchillo Flaqué
7ddac4eb88 Merge pull request #1395 from pipecat-ai/aleix/multiple-text-filters-and-aggregators
TTSService: allow passing multiple text filters and aggregators
2025-03-18 21:25:29 -07:00
Aleix Conchillo Flaqué
514ecda755 TTSService: allow passing multiple text filters and aggregators 2025-03-18 17:31:01 -07:00
balalo
48b6850df4 allow other function names 2025-03-18 20:45:31 +01:00
Aleix Conchillo Flaqué
71a38a120e Merge pull request #1376 from pipecat-ai/aleix/event-handlers-as-tasks
event handlers are now executed in separate tasks
2025-03-18 12:10:34 -07:00
Mark Backman
79616de7a4 Merge pull request #1392 from pipecat-ai/mb/fix-google-stt-timeout
Fix an issue where GoogleSTTService would timeout due to stream inact…
2025-03-18 14:17:44 -04:00
Mark Backman
6368fbe0dd Merge pull request #1318 from Vaibhav159/vl_google_vertex_llm
adding vertex google llm
2025-03-18 14:17:21 -04:00
Mark Backman
5dc8b48fbe Fix an issue where GoogleSTTService would timeout due to stream inactivity 2025-03-18 14:06:32 -04:00
Aleix Conchillo Flaqué
9112ff114f Merge pull request #1359 from lucasrothman/tavus-output-sample-rate
Tavus support for custom output rate
2025-03-18 10:16:34 -07:00
Aleix Conchillo Flaqué
32609b1132 event handlers are now executed in separate tasks 2025-03-18 09:25:39 -07:00
Vaibhav159
4303ed4991 rename service 2025-03-18 20:58:21 +05:30
Mark Backman
4677c34663 Merge pull request #1387 from pipecat-ai/mb/pattern-aggregator
Add PatternPairAggregator
2025-03-18 08:46:42 -04:00
Mark Backman
b28276446d Code review feedback 2025-03-18 07:49:54 -04:00
Mark Backman
2dee882710 Add unit tests 2025-03-18 07:30:37 -04:00
Mark Backman
6ec4052f29 Add CHANGELOG entries 2025-03-18 07:30:36 -04:00
Mark Backman
ddcc1fbb2f Add foundational example 35 2025-03-18 07:30:11 -04:00
Mark Backman
e731a0d41f Add PairPatternAggregator 2025-03-18 07:30:11 -04:00
Mark Backman
4918eab4e8 Merge pull request #1371 from pipecat-ai/mb/openai-realtime-transcription
Add TranscriptProcessor support for OpenAIRealtimeBetaLLMService
2025-03-18 07:28:07 -04:00
Mark Backman
11987765d8 Merge pull request #1381 from pipecat-ai/mb/recording-example-stt
Update the 34-audio-recording.py example to include an STT processor
2025-03-18 07:20:42 -04:00
Mark Backman
6f09ee25b8 Merge pull request #1385 from pipecat-ai/mb/add-neuphonic-readme
Add Google Imagen and Neuphonic TTS to README
2025-03-18 07:20:15 -04:00
Mark Backman
83dda8a759 Merge pull request #1390 from adnansiddiquei/add-neuphonic-languages
Added 5 new languages for Neuphonic: FR, PT, RU, ZH, HI.
2025-03-18 07:18:27 -04:00
Adnan Siddiquei
188677e601 Added 4 new languages: FR, PT, RU, ZH, HI. 2025-03-18 10:35:22 +00:00
balalo
dc5067407d Fix ruff check 2025-03-18 11:12:51 +01:00
balalo
1c19777d5e Fix format 2025-03-18 11:09:40 +01:00
balalo
2e1a18503b Set tool choice from context aggregator 2025-03-18 10:41:43 +01:00
Lucas Rothman
c57fa93a70 Renamed to sample_rate 2025-03-17 16:22:36 -07:00
Mark Backman
6885d07e88 Simplify the TranscriptProcessor _emit_aggregated_text logic 2025-03-17 16:36:03 -04:00
Mark Backman
acd0660f66 Update GeminiMultimodalLiveLLMService to work with the TranscriptProcessor 2025-03-17 16:36:03 -04:00
Mark Backman
3f002f8ffb Remove unnecessary TranscriptProcessor examples 2025-03-17 16:36:02 -04:00
Mark Backman
d5776c27f4 Update 19-openai-realtime-beta 2025-03-17 16:35:35 -04:00
Mark Backman
6e6905405b Update CHANGELOG 2025-03-17 16:35:35 -04:00
Mark Backman
571c10403f tests: Add additional coverage to test_transcript_processor 2025-03-17 16:35:35 -04:00
Mark Backman
5b6b700214 OpenAIRealtimeBetaLLMService outputs a TTSTextFrame 2025-03-17 16:35:35 -04:00
Mark Backman
1ad8e28025 Update TranscriptProcessor to more robustly handle different TTSTextFrame outputs 2025-03-17 16:35:35 -04:00
Mark Backman
3458f1b6de Add Google Imagen to README 2025-03-17 11:43:40 -04:00
Mark Backman
02dbef8f5a Add Neuphonic TTS to README 2025-03-17 11:28:51 -04:00
Zac
1baa52a17e Enhanced whisper.py with MLX Whisper model support and added optional mlx-whisper to pyproject.toml. Added error handling for missing modules and created a new WhisperSTTServiceMLX class for MLX Whisper integration. 2025-03-16 02:18:54 -04:00
Mark Backman
c1382b0691 Update the 34-audio-recording.py example to include an STT processor 2025-03-15 20:30:35 -04:00
Vaibhav159
5f000efc61 adding example 2025-03-15 10:36:26 +05:30
Vaibhav159
fa7da8f5f6 adding vertex llm 2025-03-15 10:21:40 +05:30
Mark Backman
8b86f6991d Merge pull request #1343 from pipecat-ai/mb/pipecat-cloud-example
Add a Pipecat Cloud deployment example
2025-03-14 20:49:45 -04:00
Mark Backman
d3cd1a6c59 Update with latest starter 2025-03-14 20:40:33 -04:00
Mark Backman
24220f38f0 Add a Pipecat Cloud deployment example 2025-03-14 20:40:29 -04:00
Aleix Conchillo Flaqué
1f8752ab03 Merge pull request #1378 from pipecat-ai/aleix/remove-deprecations
removed most deprecations
2025-03-14 14:42:34 -07:00
Aleix Conchillo Flaqué
16d7df1c9f removed most deprecations 2025-03-14 14:37:08 -07:00
Aleix Conchillo Flaqué
2474211291 Merge pull request #1379 from pipecat-ai/aleix/introduce-text-aggregators
introduce text aggregators
2025-03-14 13:03:49 -07:00
Aleix Conchillo Flaqué
b632d71465 TTSService: flush_audio() should be in the base class 2025-03-14 10:48:25 -07:00
Aleix Conchillo Flaqué
f8610a69a5 introduce text aggregators 2025-03-14 10:48:25 -07:00
Aleix Conchillo Flaqué
624a454f8b Merge pull request #1366 from adnansiddiquei/neuphonic-tts-plugin
Add integration for Neuphonic TTS
2025-03-14 10:27:24 -07:00
Aleix Conchillo Flaqué
11ba08b7ba Merge pull request #1377 from pipecat-ai/aleix/task-upstream-downstream-filters
PipelineTask: only call event handlers if a filter is matched
2025-03-14 08:49:24 -07:00
Adnan Siddiquei
11b13d053b Fixed a bug from previous commit. Removed the concept of model from Neuphonic. 2025-03-14 11:17:22 +00:00
Adnan Siddiquei
7dec8431e1 Review comments by aconchillo. 2025-03-14 10:52:13 +00:00
Aleix Conchillo Flaqué
ce3f3b2edb Merge pull request #1372 from pipecat-ai/khk-fix-multimodal-live-example
fix for 26-gemini-multimodal-live.py
2025-03-13 20:22:07 -07:00
Aleix Conchillo Flaqué
1b3b4ee04a PipelineTask: only call event handlers if a filter is matched 2025-03-13 18:44:30 -07:00
Mark Backman
676c5d9ba7 Merge pull request #1374 from pipecat-ai/mb/add-riva-to-readme 2025-03-13 20:41:05 -04:00
Mark Backman
6eb3a8409f README: Add Parakeet and FastPitch 2025-03-13 18:42:19 -04:00
Filipi Fuchter
526f9c2e06 Fixing the voice agent feedback when disconnected. 2025-03-13 18:41:40 -03:00
Kwindla Hultman Kramer
c9a31ea513 fix for 26-gemini-multimodal-live.py 2025-03-13 14:35:47 -07:00
Filipi Fuchter
2770d64a25 Fixing ruff format. 2025-03-13 17:38:11 -03:00
Filipi Fuchter
8a7e305619 Closing the old peer connection 2025-03-13 17:35:47 -03:00
Filipi Fuchter
8f2dadf5a0 Improving the reconnection logic to be able to recreate the peer connection in some cases. 2025-03-13 17:07:32 -03:00
Aleix Conchillo Flaqué
c0c7c5d600 Merge pull request #1370 from pipecat-ai/aleix/minor-ultravox-updates
services(ultravox): CHANGELOG, formatting and minor changes
2025-03-13 12:05:13 -07:00
Aleix Conchillo Flaqué
87004937be services(ultravox): CHANGELOG, formatting and minor changes 2025-03-13 11:49:18 -07:00
Aleix Conchillo Flaqué
b426be3067 Merge pull request #1331 from CerebriumAI/feature/ultravox
Added ultravox service
2025-03-13 10:40:00 -07:00
Aleix Conchillo Flaqué
b71e2b97ff Merge pull request #1368 from pipecat-ai/aleix/pipelinetask-frame-event-handlers
PipelineTask: add on_frame_reached_upstream and on_frame_reached_downstream
2025-03-13 10:31:33 -07:00
Aleix Conchillo Flaqué
25dcf7def6 PipelineTask: add on_frame_reached_upstream/on_frame_reached_downstream 2025-03-13 10:26:11 -07:00
Filipi Fuchter
30432639b4 Creating a keep alive connection 2025-03-13 14:20:25 -03:00
Adnan Siddiquei
1bf964a667 Added two examples on how to use Neuphonic as a TTS (07u). 2025-03-13 14:42:42 +00:00
Adnan Siddiquei
08fb931ef6 Swapped NEUPHONIC_API_TOKEN for NEUPHONIC_API_KEY. 2025-03-13 12:10:03 +00:00
Aleix Conchillo Flaqué
c5aa931096 Merge pull request #1358 from pipecat-ai/aleix/abstractmethod-fixes
ai_services: fix abstractmethod issues
2025-03-12 17:26:48 -07:00
Filipi Fuchter
d33a4b3a11 Implementing reconnection logic. 2025-03-12 18:23:12 -03:00
Filipi Fuchter
9cad8bfcc6 Increasing the time that we are waiting for the frame. 2025-03-12 15:36:40 -03:00
Mark Backman
b084a3e9e7 Merge pull request #1367 from MaCaki/macaki/rime/send_msg_in_flush_audio
[rime client] Sending over trailing space to help indicate end of utt…
2025-03-12 14:25:18 -04:00
macaki
5c9e33bc7a formatting 2025-03-12 12:20:18 -06:00
Filipi Fuchter
93d8ddf4f2 Only showing the timout warning to receive frame if the client is connected. 2025-03-12 15:13:59 -03:00
Adnan Siddiquei
0b9c4b2255 Fixed a couple of small bugs. 2025-03-12 18:04:48 +00:00
macaki
effb5f6cd8 added changelog 2025-03-12 11:57:25 -06:00
Adnan Siddiquei
ead555eb4b Corrected versions on pyproject.toml. 2025-03-12 17:39:04 +00:00
macaki
f843482968 [rime client] Sending over trailing space to help indicate end of utterance after a punctuation. 2025-03-12 11:26:43 -06:00
Adnan Siddiquei
23a4933af9 Initial implementation of Neuphonic service. A TTS provider. 2025-03-12 17:15:31 +00:00
Filipi Fuchter
0d05312071 Supporting renegotiation inside the voice agent server. 2025-03-12 11:55:38 -03:00
Filipi Fuchter
f8e33d8b7b Improving the video transform feedback when we are connecting, and cleaning the pc_id when disconnected. 2025-03-12 11:49:33 -03:00
Filipi Fuchter
f24c5b0aa7 Adding support for renegotiation. 2025-03-12 11:31:18 -03:00
Michael Louis
d9ef19233a Added foundational example for ultravox 2025-03-12 10:30:23 -04:00
Mark Backman
357334e3c9 Merge pull request #1341 from pipecat-ai/mb/fix-google-typo
Add a set_language convenience method for GoogleSTTService
2025-03-12 09:05:52 -04:00
Filipi Fuchter
da25e0c008 Configuring the bot to receive the video live. 2025-03-12 10:00:33 -03:00
Filipi Fuchter
c99d02d8bb Adding support for interruptions when using SmallWebRTCTransport. 2025-03-12 09:09:11 -03:00
Mark Backman
59ea94af86 Merge pull request #1360 from pipecat-ai/mb/update-cartesia-voice
Update Cartesia voice for demos
2025-03-12 08:02:26 -04:00
Mark Backman
4a363bebf0 Add a set_language convenience method for GoogleSTTService 2025-03-12 07:58:29 -04:00
Mark Backman
c196fb5f98 Merge pull request #1342 from pipecat-ai/mb/lmnt-flush-audio 2025-03-11 22:22:38 -04:00
Mark Backman
5f97f6ff94 Add flush_audio() to LmntTTSService 2025-03-11 21:57:54 -04:00
Mark Backman
5860fe5319 Merge pull request #1340 from pipecat-ai/mb/fish-flush
Add flush_audio to FishTTSService
2025-03-11 21:56:44 -04:00
Mark Backman
3522bbb533 tmp 2025-03-11 21:55:18 -04:00
Mark Backman
cfca7269f4 Update the Cartesia voice in all demos with one built for sonic-2 2025-03-11 21:53:03 -04:00
Mark Backman
e6f269a903 Add flush_audio to FishTTSService 2025-03-11 21:48:41 -04:00
Mark Backman
468e936a5f Merge pull request #1356 from pipecat-ai/mb/add-chirp-tts-support
Add support for Chirp voices in GoogleTTSService
2025-03-11 20:12:52 -04:00
Lucas Rothman
ecc4411128 Tavus support for custom output rate 2025-03-11 16:02:33 -07:00
Aleix Conchillo Flaqué
740ba4e759 ai_services: fix abstractmethod issues 2025-03-11 14:29:03 -07:00
Filipi Fuchter
e56c8f881c Full video transformation example using SmallWebRTCTransport. 2025-03-11 11:36:47 -03:00
Filipi Fuchter
a747f08017 Simple voice agent example using SmallWebRTCTransport. 2025-03-11 11:36:23 -03:00
Filipi Fuchter
c6c0b73345 P2P WebRTC transport option to Pipecat: SmallWebRTCTransport. 2025-03-11 11:35:39 -03:00
Filipi Fuchter
fde90ee01d Creating an EventEmitter util class 2025-03-11 11:33:47 -03:00
Filipi Fuchter
689a844aaf Created a new transport param to inform if the camera input should be enabled. 2025-03-11 11:33:24 -03:00
Filipi Fuchter
aab98b61a0 Fixed issue where sending too many images per second caused Gemini to ignore them. 2025-03-11 11:32:38 -03:00
Mark Backman
a62741df94 Add support for Chirp voices in GoogleTTSService 2025-03-11 07:56:27 -04:00
Mark Backman
5bd359ada9 Merge pull request #1354 from pipecat-ai/mb/cartesia-changelog
Changelog entry for Cartesia model update
2025-03-11 07:20:04 -04:00
Mark Backman
40562402a2 Changelog entry for Cartesia model update 2025-03-10 21:10:11 -04:00
Mark Backman
98e5089fbe Merge pull request #1353 from kunal-cai/main
[Cartesia] Update the default alias for Cartesia TTS Service
2025-03-10 21:07:19 -04:00
Kunal Shah
e1c8a09b60 [Cartesia] Update the default alias for Cartesia TTS Service 2025-03-10 14:43:58 -07:00
Filipi da Silva Fuchter
154fe65011 Merge pull request #1336 from pipecat-ai/fixing_function_calling_examples
Pipecat small fixes and refactored function calling examples
2025-03-07 16:10:27 -03:00
Mark Backman
61f534ca34 Merge pull request #1334 from pipecat-ai/aleix/user-and-bot-turn-audio
add support for user and bot turn audio
2025-03-06 18:35:56 -05:00
Mark Backman
a91c26785f Store recording in a folder 2025-03-06 18:31:48 -05:00
Aleix Conchillo Flaqué
d7e93551d2 examples(chatbot-audio-recording): add support for user/bot turn audio 2025-03-06 11:49:01 -08:00
Aleix Conchillo Flaqué
06c742a2ad AudioBufferProcessor: add on_user_turn_audio_data and on_bot_turn_audio_data 2025-03-06 11:49:01 -08:00
Filipi Fuchter
55b0797fd5 Removing the extra examples inside the unified-format-function-calling folder 2025-03-06 12:00:22 -03:00
Filipi Fuchter
21443b9a08 Refactored gemini multimodal example to use the unified format for function calling. 2025-03-06 11:59:08 -03:00
Filipi Fuchter
4b167a3c3d Fixing the ruff format. 2025-03-06 10:38:45 -03:00
Filipi Fuchter
2df77430aa Refactoring the 14 series examples to use the unified format for function calling. 2025-03-06 10:35:26 -03:00
Filipi Fuchter
2d114b15f9 Adding missing flush_audio method to AzureTTSService. 2025-03-06 10:34:25 -03:00
Filipi Fuchter
26000b616d Fixing the base_whisper services to implement set_language. 2025-03-06 10:15:04 -03:00
Aleix Conchillo Flaqué
710eebab09 Merge pull request #1332 from pipecat-ai/aleix/base-object-and-event-handlers
introduce BaseObject class
2025-03-05 13:41:27 -08:00
Dominic Stewart
532423eb4c Updated example to switch pipelines per the original request (#1320) 2025-03-05 13:40:36 -08:00
Aleix Conchillo Flaqué
bb29e50adb introduce BaseObject class 2025-03-05 13:38:53 -08:00
Filipi da Silva Fuchter
4048d6782b Merge pull request #1211 from pipecat-ai/function_calling_unified_format
Unified format for function calling
2025-03-05 18:30:22 -03:00
Filipi Fuchter
76d36a312b Adding the unified format function calling to the changelog. 2025-03-05 14:18:37 -03:00
Filipi Fuchter
2a75373c04 Created examples for unified format function calling. 2025-03-05 14:12:30 -03:00
Filipi Fuchter
a840b0e815 Prevents pytest from collecting TestFrameProcessor. 2025-03-05 14:11:52 -03:00
Filipi Fuchter
ebcde719a6 Integration test for function calling. 2025-03-05 14:11:16 -03:00
Filipi Fuchter
5c912927bb Unit tests for function calling adapters. 2025-03-05 14:11:02 -03:00
Filipi Fuchter
0e55db054e Created script to fix ruff format issues. 2025-03-05 14:10:47 -03:00
Filipi Fuchter
5967ac0d4f Implementing unified format for function calling. 2025-03-05 14:10:32 -03:00
Aleix Conchillo Flaqué
1451483cf7 Merge pull request #1330 from pipecat-ai/aleix/playht-update-0.1.12
pyproject: update pyht to 0.1.12
2025-03-04 18:35:03 -08:00
Michael Louis
3fe7c1d730 Added ultravox service 2025-03-04 13:59:03 -05:00
Aleix Conchillo Flaqué
c14b85c12b pyproject: update pyht to 0.1.12
Fixes #1309
2025-03-04 10:26:11 -08:00
kompfner
9f3c0219d7 Merge pull request #1329 from pipecat-ai/add-permissions-to-daily-meeting-token-properties
Add the `permissions` property to `DailyMeetingTokenProperties`
2025-03-03 14:44:10 -05:00
Aleix Conchillo Flaqué
ec36fef26e updated CHANGELOG and fix GladiaSTTService formatting 2025-03-03 09:53:03 -08:00
allenmylath
5f1848d24b Update gladia.py (#1317)
* Update gladia.py

According to gladia docs 
https://docs.gladia.io/api-reference/v2/live/init
speech threshould value close to 1 enables gladia to better isolate speeech from noise.
2025-03-03 09:51:11 -08:00
Aleix Conchillo Flaqué
d6867bd12f Merge pull request #1321 from pipecat-ai/aleix/allow-setting-context-aggregator-parameters
LLMService: add user/assistant args to create_context_aggregator()
2025-03-03 09:48:31 -08:00
Aleix Conchillo Flaqué
17a1f30572 LLMService: add user/assistant args to create_context_aggregator() 2025-03-03 09:46:37 -08:00
Paul Kompfner
8e0dc1f256 Add the permissions property to DailyMeetingTokenProperties 2025-03-03 10:13:25 -05:00
Kwindla Hultman Kramer
b9100beee3 Merge pull request #1327 from pipecat-ai/azure-realtime-changelog
CHANGELOG.md entry for AzureRealtimeBetaLLMService
2025-03-02 20:30:40 -08:00
Mark Backman
b8bc3d2565 Merge pull request #1326 from pipecat-ai/mb/11labs-speed
Add speed as InputParam to ElevenLabs TTS services
2025-03-02 15:20:01 -05:00
Kwindla Hultman Kramer
3213e85b7d CHANGELOG.md entry for AzureRealtimeBetaLLMService 2025-03-02 12:16:50 -08:00
Kwindla Hultman Kramer
de3bcd64c4 Merge pull request #1324 from pipecat-ai/azure-realtime
Support for Azure OpenAI Realtime API
2025-03-02 12:13:29 -08:00
Mark Backman
ad7f1eec12 Create a function to build voice_settings dictionary 2025-03-02 08:27:29 -05:00
Mark Backman
29310b4e92 Add speed as InputParam to ElevenLabs TTS services 2025-03-02 08:19:44 -05:00
Kwindla Hultman Kramer
2f4d36a146 docstring fixup 2025-03-01 15:44:10 -08:00
Kwindla Hultman Kramer
6c9bb782b1 add __init__.py 2025-03-01 15:42:20 -08:00
Kwindla Hultman Kramer
010d9103d4 support for Azure OpenAI Realtime API 2025-03-01 15:39:19 -08:00
Aleix Conchillo Flaqué
12131eb7c5 Merge pull request #1313 from Vaibhav159/vl_add_automated_formatting
using ruff automated formatting to avoid action failures.
2025-02-28 13:12:31 -08:00
Aleix Conchillo Flaqué
80b830322a Merge pull request #1311 from pipecat-ai/aleix/llm-full-response-aggregator
add new LLMFullResponseAggregator
2025-02-28 13:08:06 -08:00
Aleix Conchillo Flaqué
8db9d16174 add new LLMFullResponseAggregator 2025-02-28 13:05:21 -08:00
Aleix Conchillo Flaqué
1c92fab1fb Merge pull request #1308 from Vaibhav159/vl_google_openai_format
adding GoogleLLMOpenAIBetaService
2025-02-28 12:04:37 -08:00
Vaibhav159
974717d1b9 sync with main 2025-03-01 01:16:21 +05:30
Vaibhav159
59fb631390 fixing function calling and adding example 2025-03-01 01:14:37 +05:30
Vaibhav159
4824220260 adding GoogleLLMOpenAIBetaService 2025-03-01 01:14:26 +05:30
Mark Backman
55a338614d Merge pull request #1312 from pipecat-ai/mb/move-server-message-frame
Rename ServerMessageFrame to RTVIServerMessageFrame and move to rtvi.py
2025-02-28 13:59:31 -05:00
Vaibhav159
f033046963 using ruff automated formatting to avoid repeated failures 2025-02-28 08:25:15 +05:30
Mark Backman
6018fc068c Rename ServerMessageFrame to RTVIServerMessageFrame and move to rtvi.py 2025-02-27 20:07:07 -05:00
Aleix Conchillo Flaqué
d5b634301f Merge pull request #1302 from pipecat-ai/aleix/cleanup-llm-tts-logging
services: minor LLM and TTS logging improvements
2025-02-27 13:51:04 -08:00
Aleix Conchillo Flaqué
a37eb1049d Merge pull request #1310 from Canonical-AI-Inc/without-audio
Optional Recording
2025-02-27 13:37:39 -08:00
Adrian Cowham
803ea9d8bc update the canonical client so that the audio recording is optional as long as there is a transcript 2025-02-27 12:31:02 -08:00
Mark Backman
499bc25217 Merge pull request #1303 from pipecat-ai/mb/add-server-to-client-msg
Add a new generic server to client message and frame type
2025-02-27 12:56:57 -05:00
Mark Backman
53d403af4b Remove the RTVIServerMessage logic from the RTVIProcessor 2025-02-27 12:50:43 -05:00
Aleix Conchillo Flaqué
a0a8ea1641 Merge pull request #1301 from pipecat-ai/aleix/example-22d-fix-llm-aggregator 2025-02-26 22:39:48 -08:00
Mark Backman
26c68ccd7c Add a new generic server to client message and frame type 2025-02-26 18:59:06 -05:00
Aleix Conchillo Flaqué
fa010c8644 services: minor LLM and TTS logging improvements 2025-02-26 15:36:25 -08:00
Aleix Conchillo Flaqué
d58f398bc4 examples: fix for 22d-natural-conversation-gemini-audio.py 2025-02-26 13:15:07 -08:00
Aleix Conchillo Flaqué
11383a86a1 Merge pull request #1300 from pipecat-ai/aleix/prepare-0.0.58
update CHANGELOG for 0.0.58
2025-02-26 11:31:24 -08:00
Aleix Conchillo Flaqué
daa52ff8df update CHANGELOG for 0.0.58 2025-02-26 11:29:04 -08:00
Mark Backman
a5f41e22f7 Merge pull request #1299 from pipecat-ai/mb/add-track-level-recording
Added on_track_audio_data callback to AudioBufferProcessor for track level recording
2025-02-26 13:49:36 -05:00
Mark Backman
530bb5233d example: Added a foundational example (34) for audio recording 2025-02-26 13:44:32 -05:00
Aleix Conchillo Flaqué
4a64e09f6c Merge pull request #1297 from pipecat-ai/aleix/daily-python-0.15.0
pyproject: update daily-python, aiohttp and pydantic
2025-02-26 10:26:59 -08:00
Aleix Conchillo Flaqué
74582bb8d5 pyproject: update daily-python, aiohttp and pydantic 2025-02-26 10:22:34 -08:00
Mark Backman
1ca2101e3a Added on_track_audio_data callback to AudioBufferProcessor for track level recording 2025-02-26 10:48:56 -05:00
Aleix Conchillo Flaqué
e80311c323 Merge pull request #1296 from pipecat-ai/aleix/google-always-send-text-with-audio
GoogleLLMService: always send text with audio
2025-02-26 07:47:56 -08:00
Aleix Conchillo Flaqué
2f24c422b6 Merge pull request #1289 from pipecat-ai/aleix/tts-http-improvements
small TTS http improvements
2025-02-26 07:47:26 -08:00
Mark Backman
0d0b9fddef Merge pull request #1291 from pipecat-ai/mb/playht-http-protocol
PlayHTHttpTTSService now takes a separate protocol input
2025-02-26 08:09:49 -05:00
Mark Backman
1753cc99f4 PlayHTHttpTTSService now takes a separate protocol input 2025-02-26 08:01:54 -05:00
Aleix Conchillo Flaqué
4f8b036abe pyproject: remote httpx old dependency and upgrade anthropic/google-genai 2025-02-25 22:28:21 -08:00
Aleix Conchillo Flaqué
f83c89c202 examples: update google examples 2025-02-25 22:28:02 -08:00
Aleix Conchillo Flaqué
bb89a036e5 google: always send text part when sending inline audio 2025-02-25 22:27:38 -08:00
Aleix Conchillo Flaqué
b994a03466 examples: add more HTTP TTS services examples 2025-02-25 21:40:41 -08:00
Aleix Conchillo Flaqué
27161f8e3b BaseOutputTransport: cleanup audio buffer after bot stops talking 2025-02-25 21:39:47 -08:00
Aleix Conchillo Flaqué
8acf9a488b tts: some small HTTP-based services improvements 2025-02-25 21:39:47 -08:00
Aleix Conchillo Flaqué
96c6aeaada Merge pull request #1295 from pipecat-ai/aleix/pipelinetask-keyword-arguments
PipelineTask: force constructor keyword arguments
2025-02-25 19:00:58 -08:00
Aleix Conchillo Flaqué
6722aae598 PipelineTask: force constructor keyword arguments 2025-02-25 18:58:47 -08:00
Aleix Conchillo Flaqué
66564392a6 Merge pull request #1293 from pipecat-ai/aleix/log-pipecat-version
log pipecat version on application startup
2025-02-25 18:57:52 -08:00
Aleix Conchillo Flaqué
f258f5ab66 Merge pull request #1292 from pipecat-ai/aleix/audiocontext-terminate-nicely
AudioContextWordTTSService: wait for all requested audio
2025-02-25 18:56:41 -08:00
Aleix Conchillo Flaqué
f8f0578c3d log pipecat version on application startup 2025-02-25 18:55:45 -08:00
Aleix Conchillo Flaqué
aa60a413f3 Merge pull request #1294 from pipecat-ai/aleix/improve-test-requirements
improve test-requirements.txt
2025-02-25 18:55:18 -08:00
Aleix Conchillo Flaqué
3e66f2378d improve test-requirements.txt 2025-02-25 17:34:33 -08:00
Aleix Conchillo Flaqué
9a50f33e36 AudioContextWordTTSService: wait for all requested audio 2025-02-25 15:35:47 -08:00
Aleix Conchillo Flaqué
4bd5e9c0a7 Merge pull request #1285 from pipecat-ai/aleix/handle-stop-task-gracefully
handle stop task gracefully
2025-02-25 11:25:38 -08:00
Mark Backman
12092c8715 Merge pull request #1288 from pipecat-ai/mb/clean-up-tts-text-input
TTSService: Remove newlines before sending text to TTS service to gen…
2025-02-25 14:00:43 -05:00
Mark Backman
92cc6d39f2 TTSService: Remove newlines before sending text to TTS service to generate 2025-02-25 13:37:25 -05:00
Aleix Conchillo Flaqué
34a50033cb tk: use TkTransportParams in examples 2025-02-25 10:24:24 -08:00
Aleix Conchillo Flaqué
e60b65228b allow multiple StartFrames 2025-02-25 10:24:04 -08:00
Mark Backman
e74864335b Merge pull request #1287 from pipecat-ai/mb/30-observer-pipeline-task
Example 30: Move observers to PipelineTask
2025-02-25 12:11:23 -05:00
Mark Backman
27a088a457 Merge pull request #1286 from pipecat-ai/mb/update-grok-2
Set grok-2 as default model for GrokLLMSService
2025-02-25 12:11:09 -05:00
Mark Backman
cfe72143b8 Example 30: Move observers to PipelineTask 2025-02-25 10:54:25 -05:00
Mark Backman
36a729cbfe Set grok-2 as default model for GrokLLMSService 2025-02-25 10:00:45 -05:00
Aleix Conchillo Flaqué
d2f006682c introduce new BaseTaskManager 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
fb7fe540f5 tts: don't connect to websocket if already connected 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
1ec68bd071 make sure we don't create tasks if already created 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
4536d03e82 FrameProcessor: cancel input/push tasks on CancelFrame 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
699704732c asyncio: re-raise CancelledError in wait_for_task() 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
376d969a77 task: handle StopFrame and StopTaskFrame gracefully 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
68789dfcf0 frames: add new StopFrame 2025-02-24 21:34:23 -08:00
Aleix Conchillo Flaqué
fe9fc61c4e Merge pull request #1282 from pipecat-ai/aleix/pipelinetask-observers-constructor
PipelineTask: pass observers in contructor parameter
2025-02-24 21:29:46 -08:00
Aleix Conchillo Flaqué
6028f0f23a PipelineTask: pass observers in contructor parameter 2025-02-24 21:29:17 -08:00
Aleix Conchillo Flaqué
e9a0959e28 Merge pull request #1283 from pipecat-ai/aleix/check-dangling-tasks
PipelineTask: add check_dangling_tasks parameter
2025-02-24 21:26:32 -08:00
Dominic Stewart
f66be2cfa7 Dom/gemini system prompt switching (#1260)
* Updated example to use Gemini

* Fixed typo

* Based on feedback, made the gemini file something that can be called separately

* Updated the readme

* Updated the readme

* Changed example to use gemini 2.0 flash lite

* This works

* Improvement

* I think this works

* Updated the code to use the correct prompt broken down into smaller pieces

* Added a few more things to detect in the prompt

* Fixed import ordering

* Updated prompt for non gemini bot to look for more voicemail examples, plus added logic to detect if we're doing dialin or not to avoid a non-fatal dialin related error

* moved terminate call to handlers class

* Simplified logic for dialin

* Forgot to use the same logic for the openai bot

* Starting to add logic for native audio input for flash lite

* Fixed logic

* Fixed some code based on suggestions
2025-02-24 22:29:55 -06:00
Aleix Conchillo Flaqué
f818bed58f Merge pull request #1281 from pipecat-ai/aleix/google-context-aggregator-upgrade-context
google: updgrade OpenAILLMContext to GoogleLLMContext
2025-02-24 17:37:26 -08:00
Aleix Conchillo Flaqué
07b9be5308 PipelineTask: add check_dangling_tasks parameter 2025-02-24 17:33:10 -08:00
Aleix Conchillo Flaqué
40c2452d6e google: updgrade OpenAILLMContext to GoogleLLMContext 2025-02-24 15:35:18 -08:00
Aleix Conchillo Flaqué
30cdd1b71a Merge pull request #1280 from pipecat-ai/aleix/add-completion-timeout
services(llm): add on_completion_timeout event
2025-02-24 15:07:20 -08:00
Aleix Conchillo Flaqué
2110b79507 services(llm): add on_completion_timeout event 2025-02-24 14:55:36 -08:00
Aleix Conchillo Flaqué
fc544fa61c Merge pull request #1272 from pipecat-ai/aleix/tts-websocket-interruptions
services: fix some TTS websocket service interruption handling
2025-02-24 14:54:41 -08:00
Mark Backman
976fe95304 Merge pull request #1279 from pipecat-ai/mb/remove-open-optional-dep
Remove `openai` optional dependency from services as it's now required
2025-02-24 17:42:53 -05:00
Aleix Conchillo Flaqué
408270b647 lmnt: don't send "eof" before closing the socket 2025-02-24 14:37:37 -08:00
Mark Backman
1dfb75bc9d Merge pull request #1278 from pipecat-ai/mb/claude-3-7
Update AnthropicLLMService to use claude-3-7-sonnet-20250219 by default
2025-02-24 15:41:28 -05:00
Mark Backman
cefc2a1088 Fix test-requirements.text ordering 2025-02-24 15:06:13 -05:00
Mark Backman
3b9b9200ea Remove openai optional dependency from services as it's now required 2025-02-24 15:05:42 -05:00
Mark Backman
d6f29a0f4b Update AnthropicLLMService to use claude-3-7-sonnet-20250219 by default 2025-02-24 14:32:00 -05:00
Aleix Conchillo Flaqué
5b762d11ef Merge pull request #1228 from CarlKho-Minerva/main
Missing Cartesia~=1.3.1 → `test-requirements`
2025-02-24 08:47:41 -08:00
Aleix Conchillo Flaqué
2f3e2da6b9 Merge pull request #1259 from pipecat-ai/openai-not-optional
Since the `openai` package is used by pretty much everything in pipec…
2025-02-24 08:45:45 -08:00
allenmylath
45058d4a94 Update audio_buffer_processor.py (#1266) 2025-02-24 08:41:19 -08:00
Aleix Conchillo Flaqué
5b637bd826 services: fix some TTS websocket service interruption handling 2025-02-24 08:37:22 -08:00
Mark Backman
2d4fd7e903 Merge pull request #1274 from pipecat-ai/mb/add-ellipsis-test
Add one additional ellipsis test to test_utils_string
2025-02-23 11:26:20 -05:00
Mark Backman
b5662520aa Add one additional ellipsis test to test_utils_string 2025-02-23 11:04:24 -05:00
Aleix Conchillo Flaqué
af45c170b5 Merge pull request #1264 from pipecat-ai/aleix/add-log-observers
add initial log observers
2025-02-21 15:20:45 -08:00
Aleix Conchillo Flaqué
65f548b2ec examples(30-observer): update to use LLMLogObserver 2025-02-21 15:15:16 -08:00
Aleix Conchillo Flaqué
b29ab8c608 observers: add LLMLogObserver and TranscriptionLogObserver 2025-02-21 15:15:16 -08:00
Aleix Conchillo Flaqué
d6dc37f0b6 Merge pull request #1269 from pipecat-ai/aleix/endofsentence-support-ellipses
utils: add support for ellipses in match_endofsentence()
2025-02-21 15:08:22 -08:00
Aleix Conchillo Flaqué
12bce2e8c0 utils: add support for ellipses in match_endofsentence() 2025-02-21 15:05:50 -08:00
Aleix Conchillo Flaqué
4acf7296e0 Merge pull request #1261 from pipecat-ai/aleix/emualted-frames-being-triggered-prematurely
LLMUserContextAggregator: don't reset timer with interim transcription
2025-02-21 10:15:28 -08:00
Aleix Conchillo Flaqué
98706d429c LLMUserContextAggregator: make sure incoming transcription has text 2025-02-21 10:12:54 -08:00
Aleix Conchillo Flaqué
41720b1a13 LLMUserContextAggregator: don't reset timer with interim transcription
It turns out that in some cases we only get interim transcriptions (e.g. someone
is speaking very very softly or someone is talking in the background). In those
cases we don't want to interrupt the bot because there's really nothing to
interrupt the bot for.

We originally thought we should interrupt the bot right at the time we got an
interim frame, but this is causing too many false positives. It's actually
better to simply wait for a real transcription before interrupting (in case VAD
didn't interrupt).
2025-02-21 09:05:56 -08:00
Aleix Conchillo Flaqué
3ef4245166 Merge pull request #1265 from pipecat-ai/aleix/transport-remove-audio-out-is-live 2025-02-21 06:51:09 -08:00
Filipi da Silva Fuchter
3bb0797922 Merge pull request #1257 from pipecat-ai/fastapi_disconnect_issue
Fixed an issue where FastAPI was not triggering on_client_disconnected.
2025-02-21 09:15:15 -03:00
Filipi Fuchter
7c7b4c52af Fixed an issue where EndTaskFrame was not triggering on_client_disconnected or closing the WebSocket in FastAPI. 2025-02-21 09:11:58 -03:00
Aleix Conchillo Flaqué
01f083b7fc transports: remove TransportParams.audio_out_is_live 2025-02-20 23:33:06 -08:00
Aleix Conchillo Flaqué
91fcaebe25 Merge pull request #1263 from Vaibhav159/vl_fix_deepgram_sample_rate_mismatch
fixing deepgram mismatch
2025-02-20 22:39:06 -08:00
Vaibhav159
9c5fe5c85e fixing deepgram mismatch 2025-02-21 09:32:40 +05:30
Aleix Conchillo Flaqué
7e5e167a4b Merge pull request #1250 from pipecat-ai/aleix/context-aggregation-simulatenous-text-tools
AssistantContextAggregator: append aggregation and tools in the same turn
2025-02-20 17:32:57 -08:00
Aleix Conchillo Flaqué
d04c4b36f3 AssistantContextAggregator: append aggregation and tools in the same turn 2025-02-20 17:29:43 -08:00
Aleix Conchillo Flaqué
a811e53626 Merge pull request #1253 from pipecat-ai/aleix/http-tts-services-stopped-frame
HTTP TTS services stopped frame
2025-02-20 17:28:05 -08:00
Paul Kompfner
df57202a05 Since the openai package is used by pretty much everything in pipecat (due to OpenAILLMContext being the standard context representation), let's make it a non-optional dependency.
This change solves an issue faced by users who aren't intending to use OpenAI getting scary error messages saying that they need the `openai` optional dependency "in order to use OpenAI", along with an instruction to set the OPENAI_API_KEY environment variable.

Note that with this change we could theoretically remove from pyproject.toml a number of defined optional dependencies that list only the `openai` package as a dependency (like `deepseek`, for example), but I didn't want to "break the API" in terms of how users install/consume pipecat and its set of built-in services.

Finally, I removed the `python-deepcompare` dependency from the `openai` optional dependency, since it appears to me like it was added by mistake (my guess is it was used for debugging during development and then never removed).
2025-02-20 15:21:35 -05:00
Aleix Conchillo Flaqué
69e6f3fdb7 rime: pass aiohttp session to constructor 2025-02-20 07:36:24 -08:00
Aleix Conchillo Flaqué
6809254963 tts: fix metrics and TTSStoppedFrame frame in HTTP services
Fixes #1247
2025-02-20 07:36:21 -08:00
Aleix Conchillo Flaqué
81093d3bed Merge pull request #1252 from pipecat-ai/aleix/remove-vad-extra-logging
BaseInputTransport: remove VAD logging
2025-02-20 07:32:20 -08:00
Aleix Conchillo Flaqué
d9a67164f6 Merge pull request #1251 from pipecat-ai/aleix/fish-tts-service-push-stop-frame
FishAudioTTSService should push TTSStoppedFrame
2025-02-20 07:32:05 -08:00
Aleix Conchillo Flaqué
98259af54e update CHANGELOG 2025-02-19 22:05:48 -08:00
Dominic Stewart
039d144c79 examples(phone-bot): updated example to use Gemini (#1233) 2025-02-19 22:03:37 -08:00
Aleix Conchillo Flaqué
d0f67fc189 BaseInputTransport: remove VAD logging
These logs are very verbose. They were added to try to find an issue that
resulted in being because of low CPU/memory resources, but these logs were not
helpful to determine that.
2025-02-19 21:55:11 -08:00
Aleix Conchillo Flaqué
6e3f96aa83 fish: automatically send TTSStoppedFrame after timeout 2025-02-19 21:41:18 -08:00
Aleix Conchillo Flaqué
293677588d tts: make push_stop_frames default to 2.0s 2025-02-19 21:39:00 -08:00
Filipi da Silva Fuchter
77e777b1ce Merge pull request #1249 from pipecat-ai/invoking_call_start_function
Fixed an issue that `start_callback` was not invoked for some LLM services
2025-02-19 18:09:00 -03:00
Filipi Fuchter
7e7926059c Fixed an issue that start_callback was not invoked for some LLM services. 2025-02-19 18:04:20 -03:00
Aleix Conchillo Flaqué
c948754eff Merge pull request #1248 from pipecat-ai/aleix/daily-transport-room-url
daily: add room_url property
2025-02-19 09:46:46 -08:00
Aleix Conchillo Flaqué
83f1a8830d daily: add room_url property 2025-02-19 09:29:53 -08:00
James Hush
80f8e05fcf docs: fix transcripts in translation chatbot example (#1199) 2025-02-19 16:07:22 +08:00
Aleix Conchillo Flaqué
afd1a1e80b Merge pull request #1245 from pipecat-ai/aleix/stt-mute-filter-trace-logging 2025-02-18 21:21:55 -08:00
Aleix Conchillo Flaqué
84ac88cad7 STTMuteFilter: change suppressed logging to trace 2025-02-18 18:03:37 -08:00
Aleix Conchillo Flaqué
211163e5c7 Merge pull request #1241 from pipecat-ai/aleix/deepgram-nova-3
deepgram: use the new nova-3 model as default
2025-02-18 17:53:04 -08:00
Aleix Conchillo Flaqué
1b0bcebef6 deepgram: use the new nova-3 model as default 2025-02-18 17:51:54 -08:00
Aleix Conchillo Flaqué
89736b03c4 Merge pull request #1243 from pipecat-ai/aleix/add-deepgram-addons
deepgram: add ability to provide custom addons
2025-02-18 17:47:48 -08:00
Aleix Conchillo Flaqué
4edda718ed deepgram: add ability to provide custom addons 2025-02-18 17:45:41 -08:00
Aleix Conchillo Flaqué
22a62edc9e Merge pull request #1242 from pipecat-ai/aleix/utils-network-exponential
network: added exponential_backoff_time() function
2025-02-18 17:44:21 -08:00
Aleix Conchillo Flaqué
50b6cc8135 network: added exponential_backoff_time() function 2025-02-18 17:42:43 -08:00
Aleix Conchillo Flaqué
45cf36925a Merge pull request #1240 from pipecat-ai/aleix/handle-deepgram-on-error
deepgram: handle error event and reconnect
2025-02-18 17:41:29 -08:00
Filipi da Silva Fuchter
83a71e1fec Merge pull request #1112 from pipecat-ai/bot-ready-signalling-rn
React Native client for the bot ready example.
2025-02-18 15:17:38 -03:00
Filipi Fuchter
e809c8680e Upgrading to use the latest node stable version 2025-02-18 15:12:44 -03:00
Aleix Conchillo Flaqué
c926063d74 deepgram: handle error event and reconnect 2025-02-18 09:52:18 -08:00
Aleix Conchillo Flaqué
0334550356 Merge pull request #1238 from pipecat-ai/aleix/stt-mute-filter-ignore-input-audio-frames
STTMuteFilter: ignore audio frames so no transcriptions are generated
2025-02-18 09:48:13 -08:00
Aleix Conchillo Flaqué
90b9dce710 STTMuteFilter: ignore audio frames so no transcriptions are generated 2025-02-17 19:59:05 -08:00
Carl Kho
a5cdd5f1b8 Add Cartesia API key to dot-env.template 2025-02-14 21:29:37 -08:00
Carl Kho
5f937b8479 Update test requirements to include Cartesia version 1.3.1 2025-02-14 21:14:32 -08:00
Aleix Conchillo Flaqué
b45f7fee6f Merge pull request #1225 from pipecat-ai/aleix/prepare-0.0.57
update CHANGELOG for 0.0.57
2025-02-14 18:50:08 -08:00
Aleix Conchillo Flaqué
01c06c5cac update CHANGELOG for 0.0.57 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
329e89c1d9 TTSService: push BotStoppedSpeakingFrame 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
883410d8ac FrameProcessor: no need to create an input event every time 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
1f5b790dd0 TTSService: reset processing text during interruptions 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
a107b1cb4b examples(06a): use CartesiaTTSService 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
63950912f0 LLMAssistantContextAggregator: add missing variable initialization 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
2ce9402571 LLMAssistantResponseAggregator: initialize messages 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
f6912c0f9a utils: don't consider colon an end of sentence 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
633a4d4c58 FalImageGenService: load image async to not block the event loop 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
67da745bb3 tts: make frame pausing/resuming optional 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
5126d4de92 tts: handle incoming frames pausing/resuming from base TTSService class 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
426d7ac213 transports: some local audio and tk updates 2025-02-14 18:47:33 -08:00
Mark Backman
9115692c72 Merge pull request #1227 from pipecat-ai/mb/fix-25-error
fix: ensure proper Google message format conversion in transcription …
2025-02-14 21:01:05 -05:00
Mark Backman
c26fe3f277 fix: ensure proper Google message format conversion in transcription filter 2025-02-14 20:28:26 -05:00
Mark Backman
47b059d387 Merge pull request #1226 from pipecat-ai/mb/add-transcript-processor-tests
tests: add tests for TranscriptProcessor
2025-02-14 19:50:38 -05:00
Mark Backman
a49d81e519 tests: add tests for TranscriptProcessor 2025-02-14 17:10:40 -05:00
Aleix Conchillo Flaqué
b3a575c7c7 Merge pull request #1212 from Vaibhav159/vl_fix_incorrect_has_regular_messages_check
fixing google llm service error
2025-02-14 13:16:37 -08:00
Aleix Conchillo Flaqué
790d0c1256 Merge pull request #1224 from M1ngXU/patch-1
Update openai.py
2025-02-14 13:13:00 -08:00
Aleix Conchillo Flaqué
ee7e0dc3f7 Merge pull request #1223 from pipecat-ai/aleix/audio-context-tts-service
audio context tts service and cartesia fixes
2025-02-14 12:12:42 -08:00
Aleix Conchillo Flaqué
f53ee79ddb RimeTTSService: use AudioContextWordTTSService 2025-02-14 11:55:54 -08:00
Aleix Conchillo Flaqué
aeadb40c3f CartesiaTTSService: use AudioContextWordTTSService
By supporting multiple audio requests we fix an issue that was causing audio
overlapping.
2025-02-14 11:55:54 -08:00
Aleix Conchillo Flaqué
cacb07f4c2 introduce AudioContextWordTTSService 2025-02-14 11:55:54 -08:00
M1ngXU
0b91d821fb Update openai.py
d
2025-02-14 20:27:08 +01:00
Aleix Conchillo Flaqué
af66a43056 Merge pull request #1222 from pipecat-ai/aleix/websocket-service-handle-clean-disconnection
WebsocketService: handle clean server disconnection
2025-02-14 10:33:54 -08:00
Aleix Conchillo Flaqué
e006dcf172 WebsocketService: handle clean server disconnection
The websocket async iterator doesn't raise an exception when the server
disconnects cleanly. We should handle that and raise an exception so we can
reconnect.
2025-02-14 10:11:56 -08:00
Filipi da Silva Fuchter
8588f8b0d8 Merge pull request #1220 from pipecat-ai/instant_voice_demo_example
Instant voice example.
2025-02-14 14:24:13 -03:00
Filipi Fuchter
bff54547b0 Instant voice example. 2025-02-14 14:19:17 -03:00
Mark Backman
b2754bf208 Merge pull request #1219 from pipecat-ai/mb/markdown-text-filter-tests
Add MarkdownTextFilter tests
2025-02-13 21:10:52 -05:00
Mark Backman
9a4942b0d0 Merge pull request #1218 from pipecat-ai/mb/user-idle-tests
Add UserIdleProcessor tests
2025-02-13 18:53:22 -05:00
Mark Backman
ed6201910b Add MarkdownTextFilter tests 2025-02-13 18:51:46 -05:00
Mark Backman
ac5ebc587e Add tests for UserIdleProcessor 2025-02-13 18:47:29 -05:00
Aleix Conchillo Flaqué
dff4c54e57 Merge pull request #1209 from pipecat-ai/aleix/reimplement-llm-response-aggregators
reimplement LLM response aggregators
2025-02-13 15:30:40 -08:00
Aleix Conchillo Flaqué
c744409651 SegmentedSTTService: fix process_audio_frame() arguments 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
7578fbeaef update google requirements 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
5909dff423 LLMContextResponseAggregator: add VAD emulation support 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
a6502df72c services: forgot to pass context instead of user aggregator 2025-02-13 13:50:33 -08:00
Aleix Conchillo Flaqué
e0d24d7fc0 update CHANGELOG 2025-02-13 13:21:32 -08:00
Aleix Conchillo Flaqué
99779046a8 services: use push_context_frame() 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
67cdc0063a BaseTransportOutput: allow pushing frames upstream 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
b28f752afa tests: add anthropic and google aggregator tests 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
463078e375 initialize assistant aggregators with context and push upstream instead 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
84510fd521 LLMUserContextAggregator: add space between transcriptions 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
9f6a1c093a LLMUserContextAggregator: reset user speaking time after bot interruption 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
b602e78625 tests: add OpenAI context aggregator tests 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
7c815121ea LLMContextResponseAggregator: add missing reset() implementation 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
16a107948b services: missing kwargs in anthropic/openai user context aggregator 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
839aa7d935 llm_response: add some initial docstrings to LLM aggregators 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
4cbcfe2b0b LLMUserContextAggregator: interrupt the bot if VAD happened a while back 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
91a628d1ba UserResponseAggregator: implement on top of LLMUserResponseAggregator 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
50288eeaaa tests: add LLM response aggregators tests 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
e1f2bbceb3 reimplement LLM response aggregators 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
8bdd7ed0ed tests: implement langchain tests with run_test() 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
1b7dfe8126 tests: add a new SleepFrame
The new SleepFrame allow us to control when system frames are pushed to the
pipeline.
2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
d1ee851a65 tests: rename some variables to make things clearer 2025-02-13 13:20:37 -08:00
Filipi da Silva Fuchter
0358673b46 Merge pull request #1215 from pipecat-ai/instant_voice_demo
Instant voice demo improvements - part 02
2025-02-13 18:14:15 -03:00
Filipi Fuchter
16fe1b10e9 - Added support for the RTVIProcessor to handle buffered audio in base64 format, converting it into InputAudioRawFrame for transport.
- Added support for the `RTVIProcessor` to trigger `start_audio_in_streaming` only after the `client-ready` message.
2025-02-13 18:08:55 -03:00
Filipi Fuchter
f001819df8 - Added a new audio_in_stream_on_start field to TransportParams.
- Added a new method `start_audio_in_streaming` in the `BaseInputTransport`.
- Updated `DailyTransport` to respect the `audio_in_stream_on_start` field, ensuring it only starts receiving the audio input if it is enabled.
2025-02-13 18:08:36 -03:00
Filipi Fuchter
dceec60186 Updated FastAPIWebsocketOutputTransport to send TransportMessageFrame and TransportMessageUrgentFrame to the serializer. 2025-02-13 18:07:33 -03:00
Filipi Fuchter
b96979a4ed Update WebsocketServer to not wrap the message inside a text frame. 2025-02-13 18:07:04 -03:00
Mark Backman
745c40def4 Merge pull request #1214 from pipecat-ai/mb/stt-mute-tests
Improve STTMuteFilter, add tests
2025-02-13 09:50:43 -05:00
Mark Backman
42ab62716d Merge pull request #1198 from pipecat-ai/mb/more-whisper-params
Add prompt and temperature args to OpenAI and Groq hosted Whisper STT…
2025-02-13 09:16:38 -05:00
Mark Backman
16ba2010aa Refactor process_frame to be more consistent 2025-02-13 09:15:29 -05:00
Mark Backman
ec0ca46617 Fix temperature docstrings to reference optional 2025-02-13 09:04:20 -05:00
Mark Backman
6ff1f526ff Merge pull request #1216 from pipecat-ai/mb/google-cloud-speech
Add the google-cloud-speech package to the google dependency
2025-02-13 07:04:34 -05:00
Mark Backman
84143cc80c self._muted now returns from STT process_audio_frames 2025-02-13 07:00:44 -05:00
Mark Backman
229dccedc6 Add the google-cloud-speech package to the google dependency 2025-02-12 23:19:17 -05:00
Aleix Conchillo Flaqué
68aaa1f8f4 Merge pull request #1213 from pipecat-ai/aleix/base-transport-output-bot-vad-stop-secs
BaseOutputTransport: use specific VAD stop secs for the bot
2025-02-12 19:01:56 -08:00
Aleix Conchillo Flaqué
f110a45c85 BaseOutputTransport: use specific VAD stop secs for the bot 2025-02-12 19:01:39 -08:00
Mark Backman
1e8a86de63 Handle starting muted, add tests 2025-02-12 19:01:49 -05:00
Mark Backman
ee93e2a2b1 Reorder frame pushing for STTMuteFilter, update STTMuteFrame to SystemFrame 2025-02-12 15:51:18 -05:00
Mark Backman
2e87a019a8 Merge pull request #1208 from pipecat-ai/mb/stt-mute-first-bot-speech
Add new STTMuteStrategy: MUTE_UNTIL_FIRST_BOT_COMPLETE
2025-02-12 12:21:02 -05:00
Vaibhav159
687b3d9d4c fixing google llm service error 2025-02-12 22:22:04 +05:30
Mark Backman
397768d872 Add new STTMuteStrategy: MUTE_UNTIL_FIRST_BOT_COMPLETE 2025-02-12 10:59:28 -05:00
Mark Backman
24cdcd74e6 Merge pull request #1197 from pipecat-ai/mb/google-stt
Add GoogleSTTService
2025-02-12 10:16:18 -05:00
Mark Backman
5d6370690c Add _reconnect_if_needed to simplify reconnect logic 2025-02-12 10:11:18 -05:00
Mark Backman
9f728aa623 Add reconnect logic to handle Google's 5 min time limit 2025-02-12 10:11:18 -05:00
Mark Backman
32d8f6153f Update InputParams to languages: support str or List of Languages 2025-02-12 10:11:18 -05:00
Mark Backman
8c2071f248 Add ClientOptions for region selection 2025-02-12 10:11:18 -05:00
Mark Backman
a9c2197dc6 Add ability to update options 2025-02-12 10:11:18 -05:00
Mark Backman
ce0358804b Docstrings and cleanup 2025-02-12 10:11:18 -05:00
Mark Backman
66a6a6a295 Enable interim transcriptions, add VAD events option 2025-02-12 10:11:18 -05:00
Mark Backman
9f1732c390 Update CHANGELOG and README 2025-02-12 10:11:17 -05:00
Mark Backman
b44ddf2456 07n uses all Google services 2025-02-12 10:09:36 -05:00
Mark Backman
17420f4d0c Update language support 2025-02-12 10:09:36 -05:00
Mark Backman
6cb55ec2cb Add GoogleSTTService 2025-02-12 10:09:36 -05:00
Filipi da Silva Fuchter
e2b4554a54 Merge pull request #1129 from pipecat-ai/instant_voice_demo
Pipecat improvements for the instant voice demo
2025-02-12 11:53:40 -03:00
Mark Backman
fd68b82e48 Merge pull request #1163 from pipecat-ai/mb/rime-websocket
Add RimeTTSService
2025-02-12 09:51:56 -05:00
Filipi Fuchter
cc90f5ab9f Sending the RTVI messages to the websocket 2025-02-12 11:46:49 -03:00
Filipi Fuchter
08f40d9179 Adding support to DailyTransport receive raw-audio through appMessage 2025-02-12 11:46:37 -03:00
Aleix Conchillo Flaqué
80e1325621 include codecov.yml 2025-02-11 23:46:19 -08:00
Aleix Conchillo Flaqué
ed76a5bfa5 Merge pull request #1202 from pipecat-ai/aleix/fix-simli-audiolayout-error
simli: fix audio layout error
2025-02-11 22:24:22 -08:00
Mark Backman
69b0d9035f Mark end_time as unused 2025-02-11 17:44:52 -05:00
Mark Backman
dcc63dd648 Use the vendor default for temperature 2025-02-11 14:29:33 -05:00
Aleix Conchillo Flaqué
2d08f42870 Merge pull request #1204 from pipecat-ai/aleix/add-coverage-support
github: add coverage support
2025-02-11 11:09:25 -08:00
Mark Backman
0814c0bc82 Merge pull request #1203 from pipecat-ai/expose-update-remote-participants-on-daily-transport
Expose `update_remote_participants()` from `DailyTransport`
2025-02-11 13:57:08 -05:00
Paul Kompfner
28e233b195 Update CHANGELOG to reflect the addition of update_remote_participants() 2025-02-11 13:23:47 -05:00
Aleix Conchillo Flaqué
6e4d2d6ade examples: fix more dependabot warnings 2025-02-11 10:09:33 -08:00
Aleix Conchillo Flaqué
266135ec54 examples: fix dependabot warnings 2025-02-11 10:07:05 -08:00
Aleix Conchillo Flaqué
d81aa48262 test-requirements: update transformers to 4.48.0 2025-02-11 10:04:21 -08:00
Aleix Conchillo Flaqué
8c7752fbc2 github: add coverage support 2025-02-11 09:58:21 -08:00
Julien Le Bourg
77fb63372a fix: incorrectly changed the base type in my last pull request for L… (#1184)
* fix: incorrectly changed the base type in my last pull request for  LocalAudioTransport

* update examples to use the new LocalTransportParams

* add local device select example
2025-02-11 08:35:57 -08:00
Paul Kompfner
5a8279d3c2 Expose update_remote_participants() from DailyTransport 2025-02-11 11:28:03 -05:00
Aleix Conchillo Flaqué
4db620198a simli: fix audio layout error
Fixes #1201
2025-02-11 07:05:35 -08:00
Mark Backman
d35f4c6b99 Add prompt and temperature args to OpenAI and Groq hosted Whisper STT services 2025-02-10 21:06:37 -05:00
Aleix Conchillo Flaqué
0a990b2aaa Merge pull request #1196 from pipecat-ai/aleix/audio-buffer-processor-continuous-intermittent-stream
AudioBufferProcessor: handle continuous and intermittent user audio
2025-02-10 16:07:12 -08:00
Mark Backman
97586b132d Simplify _calculate_word_times 2025-02-10 18:45:49 -05:00
Mark Backman
8020db350e Update RimeHttpTTSService to use mistv2 model by default 2025-02-10 18:45:48 -05:00
Mark Backman
54f64b8dad Code review feedback 2025-02-10 18:45:08 -05:00
Mark Backman
8f8a3ae7f9 Add RimeTTSService 2025-02-10 18:45:06 -05:00
Mark Backman
344aff5681 Merge pull request #1191 from pipecat-ai/mb/azure-tts-error-handling
Improve AzureTTSService error handling
2025-02-10 18:01:39 -05:00
Mark Backman
0d2e90cff1 Merge pull request #1190 from pipecat-ai/mb/languages-hosted-whisper
Add language support to OpenAI and Groq hosted Whisper
2025-02-10 17:49:38 -05:00
Mark Backman
1a8dd6b713 Improve AzureTTSService error handling 2025-02-10 17:48:55 -05:00
Mark Backman
2dc585aee0 Merge pull request #1185 from pipecat-ai/mb/update-readme-hacking
Add missing pip install -e . step to the README, and clarify steps
2025-02-10 17:45:58 -05:00
Mark Backman
a64fa44811 Merge pull request #1186 from pipecat-ai/mb/whisper-multilingual
Add language support to WhisperSTTService
2025-02-10 17:26:10 -05:00
Aleix Conchillo Flaqué
baeb83484d Merge pull request #1194 from Vaibhav159/vl_fix_elevenlabs_disconnect_issue
fixing disconnect issue
2025-02-10 13:41:59 -08:00
Vaibhav159
b0c3f80963 resolve merge conf 2025-02-11 03:03:32 +05:30
Aleix Conchillo Flaqué
eb3c9b1e75 AudioBufferProcessor: handle continuous and intermittent user audio
Fixes #1172
2025-02-10 11:26:31 -08:00
Mark Backman
ad4cbdb1ec Merge pull request #1159 from Canonical-AI-Inc/gemini-rag
Gemini 2.0 Flash Lite RAG example
2025-02-10 13:42:11 -05:00
Aleix Conchillo Flaqué
32baee924b RTVI: fix premature bot-tts-text messages (#1193) 2025-02-10 10:37:54 -08:00
Adrian Cowham
9cc53509d1 PR feedback: renamed file, added docstring, changed file read logic 2025-02-10 09:39:01 -08:00
Vaibhav159
2c62d3bf32 break once ConnectionClosed error 2025-02-10 23:04:05 +05:30
Vaibhav159
b06b16adb7 fixing disconnect issue 2025-02-10 22:55:20 +05:30
Mark Backman
cd52d73027 Add language support to OpenAI and Groq hosted Whisper 2025-02-10 10:18:00 -05:00
Mark Backman
c9d8c572c7 Add language support to WhisperSTTService 2025-02-09 10:51:23 -05:00
Mark Backman
d9439fd398 Add missing pip install -e . step to the README, and clarify steps 2025-02-09 09:15:10 -05:00
Mark Backman
081abcedb3 Merge pull request #1176 from pipecat-ai/mb/stt-mute-deprecate-stt-service
Deprecate stt_service parameter in STTMuteFilter
2025-02-09 08:35:22 -05:00
Mark Backman
1455e24ad1 Add keyword args, collocated warnings import with the deprecation 2025-02-09 08:29:20 -05:00
Mark Backman
4613cf4790 Merge pull request #1181 from pipecat-ai/mb/daily-docstrings
Add docstrings to daily.py
2025-02-09 08:05:59 -05:00
Mark Backman
7aa2e1209d Merge pull request #1177 from pipecat-ai/mb/perplexity
Add PerplexityLLMService
2025-02-09 08:05:46 -05:00
Mark Backman
76daaab6ca Add PerplexityLLMService 2025-02-09 08:00:31 -05:00
Mark Backman
37cfe870cc Merge pull request #1183 from pipecat-ai/mb/add-groq-stt
Add GroqSTTService, BaseWhisperSTTService, and refactor OpenAISTTService
2025-02-09 07:56:35 -05:00
Mark Backman
160167758b Add docstrings to daily.py 2025-02-09 07:53:51 -05:00
Mark Backman
4b634713a5 Merge pull request #1182 from pipecat-ai/mb/28c-optional-db
Update 28c option to output to log line only by default
2025-02-09 07:52:21 -05:00
Mark Backman
72954d5f15 Remove to base_whisper.py 2025-02-09 07:51:30 -05:00
Mark Backman
f2b07271c1 Update GroqLLMService to use llama-3.3-70b-versatile as the default model 2025-02-09 07:51:30 -05:00
Mark Backman
32b9de5f51 Add GroqSTTService, BaseWhisperSTTService, and refactor OpenAISTTService 2025-02-09 07:51:28 -05:00
Mark Backman
71ce8f9bcf Merge pull request #1179 from pipecat-ai/mb/remove-command-dash-badge
Remove CommandDash badge from README
2025-02-09 07:47:32 -05:00
Mark Backman
7d05728e2f Update 28c option to output to log line only by default 2025-02-08 10:00:45 -05:00
Mark Backman
dee5448b57 Merge pull request #1123 from pipecat-ai/cb/sqlite
Add SQLite storage to the Gemini persistent storage example
2025-02-08 09:07:52 -05:00
Mark Backman
d67861925a Merge pull request #1128 from golbin/whisper-api
Add Whisper STT service using OpenAI API
2025-02-08 08:35:26 -05:00
Mark Backman
0180619d44 Merge pull request #1173 from TheCodingLand/local-pyaudio-device-ids
adds configurable device ids for local audio transport
2025-02-08 08:04:00 -05:00
Mark Backman
f07e498612 Remove CommandDash badge from README 2025-02-08 07:59:39 -05:00
TheCodingLand
57964cb929 fix LocalAudioTransport param type 2025-02-08 12:32:20 +01:00
TheCodingLand
6840c77684 apply ruff formatting 2025-02-08 12:03:23 +01:00
Mark Backman
a1b58115ce Deprecate stt_service parameter in STTMuteFilter 2025-02-07 19:24:03 -05:00
chadbailey59
23eb6e3d46 storybot fixes (#1175)
* storybot fixes

* readme cleanup
2025-02-07 13:58:02 -06:00
Mark Backman
74a2c38c6c Merge pull request #1174 from pipecat-ai/mb/bump-google-genai-version
Bump google-genai version to 1.0.0
2025-02-07 14:53:44 -05:00
Mark Backman
90b217fda8 Bump google-genai version to 1.0.0 2025-02-07 14:32:37 -05:00
Aleix Conchillo Flaqué
6855bc0ada Merge pull request #1166 from pipecat-ai/aleix/google-rtvi-observer
rtvi: separate specific google RTVI into a GoogleRTVIObserver
2025-02-08 03:19:02 +08:00
TheCodingLand
a359434307 remove Doc and Annotated imports 2025-02-07 19:42:34 +01:00
TheCodingLand
856c8959c3 enhance doc 2025-02-07 19:38:26 +01:00
TheCodingLand
8da7a42137 adds configurable input and output device ids for local audio 2025-02-07 19:23:18 +01:00
Aleix Conchillo Flaqué
510a0f5ef5 rtvi: deprecate RTVI.observer() 2025-02-07 09:19:43 -08:00
Aleix Conchillo Flaqué
03ac744bcf rtvi: deprecate frame processors 2025-02-07 09:17:29 -08:00
Aleix Conchillo Flaqué
b058461a7d GoogleRTVIObserver: add explicit constructor 2025-02-07 09:15:32 -08:00
Mark Backman
abd9f16b90 Export .rtvi, update new-chatbot example, rename and update foundational 32 2025-02-07 09:15:32 -08:00
Aleix Conchillo Flaqué
d07732f2e8 rtvi: separate specific google RTVI into a GoogleRTVIObserver 2025-02-07 09:15:32 -08:00
Aleix Conchillo Flaqué
4d25582e16 dev-requirements: update pyright and ruff 2025-02-06 21:51:57 -08:00
Aleix Conchillo Flaqué
d4b2160f9c Merge pull request #1161 from pipecat-ai/aleix/prepare-0.0.56
update CHANGELOG for 0.0.56
2025-02-06 13:50:04 -08:00
Aleix Conchillo Flaqué
dd7926aab5 update CHANGELOG for 0.0.56 2025-02-06 13:45:13 -08:00
Aleix Conchillo Flaqué
070bf66980 transports: fix local transports audio cleanup 2025-02-06 13:45:13 -08:00
Aleix Conchillo Flaqué
962fc27dbd Merge pull request #1160 from pipecat-ai/aleix/fix-unit-test-logging
tests: remove logger from tests.utils
2025-02-06 13:26:37 -08:00
Mark Backman
3d4d6132fc Merge pull request #1158 from pipecat-ai/mb/update-22c
Update foundation examples 22b, 22c, and 22d to be ready for function…
2025-02-06 16:25:05 -05:00
Aleix Conchillo Flaqué
a96d9294b7 tests: remove logger from tests.utils 2025-02-06 13:18:28 -08:00
Aleix Conchillo Flaqué
a6e78550d5 Merge pull request #1156 from pipecat-ai/aleix/prefer-optional
prefer Optional over to "| None"
2025-02-06 13:08:48 -08:00
Adrian Cowham
d9f6b7b93c added an example using using Gemini's large context window for RAG 2025-02-06 12:49:29 -08:00
Mark Backman
969de92ad9 Update foundation examples 22b, 22c, and 22d to be ready for function calling 2025-02-06 15:36:16 -05:00
Aleix Conchillo Flaqué
c4dbe92b30 prefer Optional over to "| None" 2025-02-06 11:11:37 -08:00
Aleix Conchillo Flaqué
684764fece Merge pull request #1155 from pipecat-ai/aleix/sentry-fixes-and-example
sentry fixes and example
2025-02-06 11:09:31 -08:00
Aleix Conchillo Flaqué
c4be07693f examples: added sentry-metrics example 2025-02-06 10:46:04 -08:00
Aleix Conchillo Flaqué
c5d5ca8232 SentryMetrics: use transactions and call parent methods 2025-02-06 10:44:38 -08:00
Mark Backman
428e763814 Merge pull request #1149 from pipecat-ai/mb/update-google-default-llm-model
Use gemini-2.0-flash-001 as the default model for GoogleLLMService
2025-02-06 12:41:13 -05:00
Mark Backman
0efa2711ff Merge pull request #1152 from pipecat-ai/mb/docstrings
Add docstrings for PipelineTask and related classes/functions
2025-02-06 12:30:12 -05:00
Mark Backman
4904f52cee Use gemini-2.0-flash-001 as the default model for GoogleLLMService 2025-02-06 12:29:15 -05:00
Aleix Conchillo Flaqué
dbcf14ddb4 Merge pull request #1154 from pipecat-ai/aleix/twilio-telnyx-sample-rates
serializers: don't update twilio/telnyx sample rates
2025-02-06 09:27:42 -08:00
Aleix Conchillo Flaqué
7c13ec10d9 examples: cleanup ElevenLabsTTSService constructor arguments 2025-02-06 09:25:52 -08:00
Aleix Conchillo Flaqué
29b9dccc53 serializers: don't update twilio/telnyx sample rates 2025-02-06 09:25:52 -08:00
Aleix Conchillo Flaqué
e8ce826473 Merge pull request #1151 from pipecat-ai/aleix/base-output-transport-resample
BaseOutputTransport: resample incoming audio if needed
2025-02-06 09:25:07 -08:00
Aleix Conchillo Flaqué
bbb991dfd8 Merge pull request #1153 from pipecat-ai/aleix/base-input-transport-show-vad
BaseInputTransport: show VAD results when interruptions not allowed
2025-02-06 09:24:12 -08:00
Mark Backman
4432e7e4f7 Add docstrings for PipelineTask and related classes/functions 2025-02-06 11:04:54 -05:00
Aleix Conchillo Flaqué
ee9cce64b2 BaseInputTransport: show VAD results when interruptions not allowed 2025-02-06 07:40:03 -08:00
Aleix Conchillo Flaqué
1ae4f0150d BaseOutputTransport: resample incoming audio if needed 2025-02-06 07:37:43 -08:00
Mark Backman
4c77c3ed34 Merge pull request #1148 from pipecat-ai/mb/fix-twilio-serializer
Fix sample rate handling in Twilio and Telnyx serializers
2025-02-06 10:25:13 -05:00
Aleix Conchillo Flaqué
975b97472a Merge pull request #1144 from pipecat-ai/aleix/frame-processor-missing-init-warning
FrameProcessor: add an error about missing super().process_frame(...)
2025-02-06 07:18:35 -08:00
Mark Backman
c8ccf13bc7 fix: Use audio_in_sample_rate to deserialize data for TelnyxFrameSerializer 2025-02-06 09:59:21 -05:00
Mark Backman
ba59736f87 fix: Use audio_in_sample_rate to deserialize data for TwilioFrameSerializer 2025-02-06 09:55:15 -05:00
Jin Kim
5989e1ed16 Merge branch 'main' into whisper-api 2025-02-06 13:14:36 +09:00
Aleix Conchillo Flaqué
bc21a0b817 FrameProcessor: add an error about missing super().process_frame(...) 2025-02-05 18:33:03 -08:00
Aleix Conchillo Flaqué
99d3227ff5 Merge pull request #1126 from pipecat-ai/aleix/prepare-0.0.55
update CHANGELOG for 0.0.55
2025-02-05 11:32:39 -08:00
Aleix Conchillo Flaqué
7730f59635 update CHANGELOG for 0.0.55 2025-02-05 11:30:40 -08:00
Aleix Conchillo Flaqué
ba31546c32 Merge pull request #1139 from pipecat-ai/aleix/task-start-metadata
pipeline task start metadata and unit test improvements
2025-02-05 10:51:51 -08:00
Aleix Conchillo Flaqué
a363d12d1f dev-requirements: fix conflicts because of nvidia-riva-client 2025-02-05 10:34:46 -08:00
Aleix Conchillo Flaqué
feab9c8fa2 tests: run_test() now uses PipelineTask 2025-02-05 10:34:38 -08:00
Aleix Conchillo Flaqué
61f6669926 task: allow passing StartFrame metadata via start_metadata param 2025-02-05 10:34:38 -08:00
Aleix Conchillo Flaqué
3be69908d2 Merge pull request #1131 from pipecat-ai/aleix/global-audio-sample-rates
introduce PipelineParams audio input/output sample rates
2025-02-05 08:11:25 -08:00
Aleix Conchillo Flaqué
fcb80ec330 playht: don't set sample_rate in _settings 2025-02-05 07:46:24 -08:00
Mark Backman
c9f5684e2f OpenAITTSService: Add warning about changing sample_rate 2025-02-05 10:13:46 -05:00
Mark Backman
c257fa1573 AzureTTSService, AzureHttpTTSService: add start() method 2025-02-05 10:05:19 -05:00
Mark Backman
97c55da29f PlayHTHttpTTSService: add start() method to set sample_rate 2025-02-05 09:54:41 -05:00
Aleix Conchillo Flaqué
49426aa9a1 transport(websocket): improve exception logging 2025-02-04 23:50:45 -08:00
Aleix Conchillo Flaqué
0a333c26da services(elevenlabs): warn if sample rate not supported 2025-02-04 23:50:21 -08:00
Aleix Conchillo Flaqué
75a29424ff examples(telnyx-chatbot): use cartesia so we can use 8khz 2025-02-04 23:49:50 -08:00
Filipi da Silva Fuchter
cd1b429308 Merge pull request #1133 from pipecat-ai/fixing_krisp_issue
Fixing the issue in Krisp when trying to create more than one
2025-02-04 20:44:29 -03:00
Filipi Fuchter
7f1ae4b8cc Fixing the issue in Krisp when trying to create more than one filter in the same process. 2025-02-04 20:10:56 -03:00
Aleix Conchillo Flaqué
af9fd811cd examples(moondream-chatbot): fix UserImageRequester 2025-02-04 14:37:53 -08:00
Aleix Conchillo Flaqué
69f5c9b9d3 update anthropic and openpipe versions 2025-02-04 14:37:36 -08:00
Aleix Conchillo Flaqué
ab45e481be introduce PipelineParams audio input/output sample rates 2025-02-04 14:12:56 -08:00
Pedro Moreira
79ac696973 Add support for Piper TTS 2025-02-04 13:51:33 -03:00
Jin Kim
ef1e4277d3 Add an example for Whisper using OpenAI API 2025-02-04 10:32:55 +09:00
Jin Kim
823b763b25 Change OpenAI example file name 2025-02-04 10:28:06 +09:00
Jin Kim
3cb189eb1f Add whisper STT service using OpenAI API 2025-02-04 10:27:28 +09:00
Aleix Conchillo Flaqué
cc54255c41 Merge pull request #1125 from pipecat-ai/aleix/twilio-chatbot-improvements 2025-02-03 11:10:33 -08:00
Aleix Conchillo Flaqué
1cdb66f889 examples(twilio-chatbot): create sample rate variable 2025-02-03 10:58:06 -08:00
Aleix Conchillo Flaqué
51a86a509c examples: multiple twilio-chatbot improvements 2025-02-03 10:36:24 -08:00
Aleix Conchillo Flaqué
824898f7b7 Merge pull request #1121 from pipecat-ai/aleix/audio-resamplers
introduce audio resamplers
2025-02-03 10:32:55 -08:00
Aleix Conchillo Flaqué
57dadb6359 audio(utils): some variable renames 2025-02-03 09:33:04 -08:00
Aleix Conchillo Flaqué
5dcdc68ef5 examples: fix 22 series initial gate state 2025-02-03 09:16:58 -08:00
Aleix Conchillo Flaqué
aafb2db620 GatedOpenAILLMContextAggregator: use keyword argument and add start_open 2025-02-03 09:16:44 -08:00
Aleix Conchillo Flaqué
f3f22cf61c AudioBufferProcessor: add start_recording()/stop_recording() 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
371c2f3704 canonical: do not reset audio buffers 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
1f14f62696 AudioBufferProcessor: fix audio buffer silence computation 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
06449eff2c BaseAudioResampler: make resample() async 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
dcfb86583d serializers: serialize()/deserialize() are now async 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
cda34a1320 AudioBufferProcessor: fix user/bot audio buffers silence padding 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
13611fd8e1 AudioBufferProcessor: call callback on CancelFrame 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
fc89aad469 introduce audio resamplers 2025-02-01 11:06:55 -08:00
Aleix Conchillo Flaqué
6c7474e1a2 frames: add pass to DTMFFrames 2025-01-31 18:37:40 -08:00
Aleix Conchillo Flaqué
95f0dbf3f3 CHANGELOG.md: task.cancel() and EndFrame clarification 2025-01-31 18:35:35 -08:00
Aleix Conchillo Flaqué
11aeb68ddb frames: fix type s/OuputDTMFFrame/OutputDTMFFrame/ 2025-01-31 18:28:38 -08:00
Aleix Conchillo Flaqué
a43c102fc8 Merge pull request #1064 from jcbjoe/jg/additional_dtmf_frames
Added: Additional DTMF frames
2025-01-31 18:25:08 -08:00
Chad Bailey
d236973c0f moved sqlite code back to a single example 2025-01-31 23:18:06 +00:00
Mark Backman
16b49bdce6 Merge pull request #1122 from pipecat-ai/mb/openai-org-id
Add organization and project level auth in OpenAILLMService
2025-01-31 14:35:26 -05:00
Mark Backman
41477c8f78 Add organization and project level auth in OpenAILLMService 2025-01-31 14:27:25 -05:00
Aleix Conchillo Flaqué
bb9a2560c3 Merge pull request #1118 from pipecat-ai/aleix/task-manager
introduce TaskManager
2025-01-31 10:24:52 -08:00
Aleix Conchillo Flaqué
002699f16c rtvi: delay creating tasks until we get StartFrame 2025-01-31 10:06:11 -08:00
chadbailey59
a17243bc1e More Storybot updates (#1116)
* initial changes for gemini storybot

* storybot updates for gemini

* more storybot updates

* interim interruptible commit

* cleanup

* cleanup

* cleanup

* first draft

* wip

* more storybot fixes

* more storybot updates WIP

* committing before changing the image prompting strategy

* wip

* prompt updating

* cleanup

* cleanup

* cleanup

* readme cleanup

* fixup
2025-01-30 20:13:18 -06:00
Aleix Conchillo Flaqué
d95819746a tests: make sure QueuedFrameProcessor push frames 2025-01-30 13:48:44 -08:00
Aleix Conchillo Flaqué
b65f32e8e1 task: start TaskObserver when tasks can be created
We have to start proxy observer tasks once we know the TaskManager has an event
loop.
2025-01-30 13:46:56 -08:00
Aleix Conchillo Flaqué
0131d0a531 examples: make sure unhandled frames are always pushed 2025-01-30 13:15:49 -08:00
Aleix Conchillo Flaqué
642affb2fe add missing super().process_frame() calls 2025-01-30 13:15:17 -08:00
Aleix Conchillo Flaqué
a145005498 SyncParallelPipeline: cleanup source/sink processors 2025-01-30 13:13:02 -08:00
Aleix Conchillo Flaqué
241f241ed9 SyncParallelPipeline: don't add source/sink processors inside pipeline 2025-01-30 13:12:37 -08:00
Aleix Conchillo Flaqué
85e572e2d8 gladia: cleanup receive messages task 2025-01-30 13:10:47 -08:00
Aleix Conchillo Flaqué
10716e8ec1 utils: protect obj_id() and obj_count() with a lock 2025-01-30 13:10:36 -08:00
Aleix Conchillo Flaqué
41d60a14cc introduce TaskManager and PipelineRunner event loop 2025-01-30 13:10:36 -08:00
Aleix Conchillo Flaqué
e69c065a86 update CHANGELOG and fix formatting 2025-01-30 08:55:29 -08:00
Aleix Conchillo Flaqué
f90c17ab30 Merge pull request #1083 from team-telnyx/creating_telnyx_chatbot
Creating telnyx chatbot
2025-01-30 08:49:20 -08:00
Aleix Conchillo Flaqué
bc4fdd587a Merge pull request #1103 from pipecat-ai/aleix/tts-service-push-silence-before-tts-stop-frame
services(tts): allow pushing silence audio before TTSStoppedFrame
2025-01-30 08:48:41 -08:00
Aleix Conchillo Flaqué
665a6017f9 services(tts): allow pushing silence audio before TTSStoppedFrame 2025-01-30 08:46:56 -08:00
Aleix Conchillo Flaqué
4119d7a115 Merge pull request #1104 from pipecat-ai/aleix/twilio-transport-message-frames
serializers(twilio): handle transport message frames
2025-01-30 08:45:55 -08:00
Aleix Conchillo Flaqué
2634b03ffa serializers(twilio): handle transport message frames 2025-01-30 08:30:09 -08:00
Aleix Conchillo Flaqué
6a50759b9f Merge pull request #1105 from pipecat-ai/aleix/websocket-client
added new websocket client transport
2025-01-30 08:28:26 -08:00
Mark Backman
7982faba67 Merge pull request #1115 from pipecat-ai/mb/elevenlabs-language-fixes
Improve ElevenLabs language checking logic
2025-01-30 10:03:22 -05:00
Mark Backman
2b4bf57c04 Improve ElevenLabs language checking logic 2025-01-30 09:52:36 -05:00
Filipi Fuchter
7e3e126730 Migrating the base API URL for the react native example to an .env file. 2025-01-30 10:42:16 -03:00
Filipi Fuchter
75ca0571bb Improving the layout from the bot ready react native demo. 2025-01-30 10:31:04 -03:00
Filipi Fuchter
a48e5d0714 Only sending the message when it is a remote audio track. 2025-01-30 10:14:37 -03:00
Filipi Fuchter
2b6a992207 Sending the app-message to start playing audio once the track has started. 2025-01-30 09:37:33 -03:00
Filipi Fuchter
24cf106ed2 Refactoring the code to ask for the room that it should connect. 2025-01-30 09:14:18 -03:00
Rafal Skorski
b93e4ab9cb Formatting adjusted and the encoding selection moved from TelnyFrameSerilaizer to websocket_endpoint function in server.py 2025-01-30 12:52:30 +01:00
Dominic Stewart
c140c04b9a Merge pull request #1080 from DominicStewart/dom/voicemail-detection-bot
Add voicemail detection example
2025-01-30 09:20:12 +09:00
Dominic
a7c8d2af8e Removed extra space too 2025-01-30 09:18:29 +09:00
Dominic
f3f520a76a Removed formatting that vs code automatically adds to readme file 2025-01-30 09:17:27 +09:00
Mark Backman
5e0f42a3e0 Merge pull request #1111 from pipecat-ai/mb/gemini-restructure-messages
GoogleLLMContext: Allow _restructure_from_openai_messages to handle c…
2025-01-29 19:06:47 -05:00
Filipi Fuchter
95c8346cb5 Starting to create a react native client for the bot ready example. 2025-01-29 19:00:42 -03:00
Mark Backman
220ce9fd0f GoogleLLMContext: Allow _restructure_from_openai_messages to handle context frames that contain function call data and / or messages 2025-01-29 16:01:39 -05:00
Filipi da Silva Fuchter
5d0486a26f Merge pull request #1008 from pipecat-ai/cutting_initial_words
Avoid cutting off the beginning of the audio
2025-01-29 17:02:40 -03:00
Chad Bailey
bc98c2e36c added sqlite storage example 2025-01-29 19:12:15 +00:00
Aleix Conchillo Flaqué
091258f617 improve create_task names 2025-01-29 11:11:40 -08:00
Aleix Conchillo Flaqué
2a1408eb2a transports(websocket server): remove unused variable 2025-01-29 11:11:40 -08:00
Aleix Conchillo Flaqué
6393b41d58 transports(websocket): added WebsocketClientTransport 2025-01-29 11:11:37 -08:00
Filipi Fuchter
2a5728264c Adding missing dependency to openai 2025-01-29 15:52:42 -03:00
Filipi Fuchter
2ef0735462 Adding readme to teach how to use. 2025-01-29 15:45:48 -03:00
Filipi Fuchter
80bbfff4be Merge branch 'main' into cutting_initial_words 2025-01-29 15:36:52 -03:00
Aleix Conchillo Flaqué
4ff68e66b9 Merge pull request #1110 from pipecat-ai/aleix/frame-metadata
frames: added metadata field to Frame class
2025-01-29 10:30:59 -08:00
Aleix Conchillo Flaqué
3a688840fc frames: added metadata field to Frame class 2025-01-29 09:53:21 -08:00
Aleix Conchillo Flaqué
2ca8b95bbf Merge pull request #1106 from Vaibhav159/vl_moving_test_utils_to_pipecat_package
moving test utils inside of package
2025-01-29 09:44:34 -08:00
Mark Backman
2aafc6bd1d Merge pull request #1107 from AngeloGiacco/angelo/increase-ws-connection
fix: elevenlabs tts increase websocket max message size limit to 16MB
2025-01-29 10:04:42 -05:00
Angelo Giacco
0ff9ef8707 fix: add changelog 2025-01-29 14:27:39 +00:00
Angelo Giacco
596cae994d fix: elevenlabs tts increase websocket max message size limit to 16MB 2025-01-29 13:55:27 +00:00
Dominic
9ad9cb1ff8 Cleaned up formatting 2025-01-29 17:36:08 +09:00
Dominic Stewart
60e800e9ba Merge branch 'main' into dom/voicemail-detection-bot 2025-01-29 17:30:56 +09:00
Dominic
1c8f0ed7da Finalised code and added a bit about this example to the README 2025-01-29 17:27:44 +09:00
Vaibhav159
8407a86532 moving test utils inside of package 2025-01-29 12:46:43 +05:30
Dominic
417d661d28 Updated bot_runner and bot_daily with adjustments necessary to run voicemail detection from bot_daily code 2025-01-29 16:11:45 +09:00
Aleix Conchillo Flaqué
8cd23c42fc Merge pull request #1100 from pipecat-ai/aleix/use-task-cancel-on-left-disconnected
use `task.cancel()` when participant leaves/disconnects
2025-01-28 16:02:02 -08:00
Aleix Conchillo Flaqué
0547a15695 task: allow queuing a CancelFrame to cancel the task 2025-01-28 15:59:36 -08:00
Aleix Conchillo Flaqué
3fe2124314 examples: use task.cancel() when participant leaves or disconnects 2025-01-28 15:46:20 -08:00
Aleix Conchillo Flaqué
ba358a4f0a task: cleanup processors after task finishes running 2025-01-28 15:02:25 -08:00
Aleix Conchillo Flaqué
79ef8c947d Merge pull request #1099 from pipecat-ai/aleix/daily-transport-queue-events
transports(daily): queue events until join completes
2025-01-28 14:38:25 -08:00
Aleix Conchillo Flaqué
f024476b08 transports(daily): queue events until join completes 2025-01-28 11:22:42 -08:00
Dominic
73690a13d9 Moved voicemail detection to phone-chatbot and working on that now 2025-01-28 22:31:08 +09:00
Dominic
6ebf06a6fb Removed start_terminate_call function as unnecessary 2025-01-28 10:39:10 +09:00
Dominic
2f4f779c91 Fixed a few things 2025-01-28 10:39:10 +09:00
Dominic
941ee6e5e8 Add voicemail detection example 2025-01-28 10:39:10 +09:00
Aleix Conchillo Flaqué
cd5075ed7a Merge pull request #1097 from pipecat-ai/aleix/pipecat-0.0.57
prepare CHANGELOG for 0.0.54
2025-01-27 14:56:51 -08:00
Aleix Conchillo Flaqué
6f41a667c8 prepare CHANGELOG for 0.0.54 2025-01-27 14:48:56 -08:00
Aleix Conchillo Flaqué
0b222a7eae Merge pull request #1085 from pipecat-ai/aleix/task-creation-and-cancellation
improve task creation and cancellation
2025-01-27 14:47:20 -08:00
Aleix Conchillo Flaqué
f09f4b8fc4 services(tavus): fix EndFrame and CancelFrame processing 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
cca241a2b7 examples(22c): fix cancel_task call 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
1489e44740 gemini(multimodal live): fix model audio queue variable 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
f55f78e70e update CHANGELOG.md 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
10202dc529 transports(websockets): cancel or wait for tasks to finish 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
498805a34c FrameProcessor: add wait_for_task() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
509f143e1b update CHANGELOG.md 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
737e4fa3bd gemini(multimodal live): connect on StartFrame 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
8b5228a105 utils: move task functions to asyncio module 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
6cc01bc5b0 examples: update 14 series with TTSSpeakFrame 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
2a2928d96c gemini: create transcribe tasks only once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
a3a6adbd17 user_idle_processor: add missing parent cleanup() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
bf5ced18b2 fix parallel pipelines cleanup 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
2eccd1b1e9 utils: update some logging levels 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
9374bed878 tests: langchain fixes 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
c03d0352b1 utils/tasks: added new documentation 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
af90b8b4fa utils: add wait_for_task() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
0a9daa2f56 task: avoid canceling tasks more than once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
e48c0e52ef transports(daily): avoid canceling task more than once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
6bca8396d3 utils: error if we try to cancel the same task multiple times 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
c2d8a45a07 runner: warn about remaining dangling tasks 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
80a7f1b1e7 runner: improve signal handler task cancellation 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
aff6e24560 pipeline: fix pipeline cleanup 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
cb93f6b368 utils: store created tasks and add current_tasks() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
ff0bcec33a transports: improve task naming 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
5885fcc230 add id and name properties 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
57b186cde8 base_transport: add name and id fields 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
d1a3f404a5 improve task creation and cancellation
If a FrameProcessor needs to create a task it should use
FrameProcessor.create_task() and FrameProcessor.cancel_task(). This gives
Pipecat more control over all the tasks that are created in Pipecat.

Both functions internally use the utils module: utils.create_task() and
utils.cancel_task() which should also be used outside of FrameProcessors. That
is, unless strictly necessary, we should avoid using asyncio.create_task().
2025-01-27 14:42:23 -08:00
chadbailey59
179ddbea7d Add dialout to the Daily phone example (#998)
* added dialout to daily phone example

* cleanup

* cleanup

* pre-commit hook

* Fix typo

* More explicit README instructions

---------

Co-authored-by: Mark Backman <mark@daily.co>
2025-01-27 12:21:30 -06:00
Mark Backman
86c1e6a3bd Merge pull request #1081 from pipecat-ai/mb/user-idle-add-retry
Added retry functionality and a new callback to the UserIdleProcessor
2025-01-27 10:30:45 -05:00
Mark Backman
9e9822f17d Use inspect.signature to determine which callback to use 2025-01-27 10:24:58 -05:00
Mark Backman
5f9671e2ca Added retry functionality and a new callback to the UserIdleProcessor 2025-01-27 10:24:57 -05:00
Mark Backman
aac8961ae5 Merge pull request #1078 from pipecat-ai/mb/improve-error-handling-truncate-audio
Add better error handling for OpenAIRealtimeBetaLLMService truncate errors
2025-01-27 08:54:39 -05:00
Mark Backman
3e6377346a Merge pull request #1093 from pipecat-ai/mb/update-example-6a 2025-01-26 19:43:39 -05:00
Mark Backman
9d9a622b1a Merge pull request #1094 from pipecat-ai/mb/readme-service-section 2025-01-26 19:43:12 -05:00
Mark Backman
3e9a6b6262 Merge pull request #1095 from pipecat-ai/mb/elevenlabs-lang-codes 2025-01-26 12:21:28 -05:00
Mark Backman
fb3097560f Remove eleven_multilinguagal_v2 from language code list 2025-01-26 07:17:38 -05:00
Mark Backman
ff6368add0 Update README.md
Adding a section so that table can be linked to.
2025-01-25 16:12:53 -05:00
Mark Backman
89fd03d86f Merge pull request #1090 from vengad-arrowhead/main
Adding hindi danda symbol as end of sentence marker
2025-01-25 09:36:19 -05:00
Mark Backman
0672530d6b Fix foundational example 6a to switch images when the bot is speaking 2025-01-25 08:40:42 -05:00
vengadanathan srinivasan
7a0cfc8d3d Adding hindi danda symbol as end of sentence marker 2025-01-25 14:55:51 +05:30
Mark Backman
b881dd57b3 Merge pull request #1086 from pipecat-ai/mb/fix-expiry-time-type-mismatch 2025-01-24 17:31:08 -05:00
Mark Backman
abf0d0d053 Improve token parameter construction using DailyMeetingTokenProperties 2025-01-24 17:22:31 -05:00
Mark Backman
1acdf7aff7 Fix expiry_time type validation in get_token REST API helper 2025-01-24 17:21:50 -05:00
Mark Backman
96b90abda6 Merge pull request #1082 from pipecat-ai/mb/update-function-calling-examples
Update function calling examples to push a TextFrame in the start_cal…
2025-01-24 17:21:13 -05:00
Filipi da Silva Fuchter
202a844eeb Merge pull request #1051 from pipecat-ai/gemini_grounding_metadata_rtvi
Sending Search Response to RTVI
2025-01-24 19:20:50 -03:00
Filipi Fuchter
655d56f634 Fixing pydantic validation when creating meeting token. 2025-01-24 19:15:56 -03:00
Filipi Fuchter
07c84b733b Sending Search Response to RTVI 2025-01-24 18:59:46 -03:00
Filipi da Silva Fuchter
7c52736ff6 Merge pull request #1030 from pipecat-ai/gemini_grounding_metadata
Introduce support for extracting and processing grounding metadata from GoogleLLMService.
2025-01-24 15:41:54 -03:00
Mark Backman
48ce751602 Merge pull request #1075 from Vaibhav159/vl_add_daily_meeting_token_v2
adding models to DailyRestHelper
2025-01-24 13:21:52 -05:00
Vaibhav159
1f1e2dac2b wrapping things up 2025-01-24 23:44:23 +05:30
Vaibhav159
71c2dc3d05 minor typing change 2025-01-24 23:38:44 +05:30
Vaibhav159
ef02ece662 doc string 2025-01-24 22:47:40 +05:30
Vaibhav159
d5818fad5b addressing comments 2025-01-24 22:46:54 +05:30
Rafal Skorski
9c22bd8df1 Improving read me and encoding support 2025-01-24 16:44:11 +01:00
Mark Backman
dbea86baae Update function calling examples to push a TextFrame in the start_callback 2025-01-24 10:21:08 -05:00
Vaibhav159
c5faac1cf8 adding RecordingsBucketConfig 2025-01-24 15:14:20 +05:30
Vaibhav159
e106d7a215 adding line space 2025-01-24 09:12:07 +05:30
Vaibhav159
40c1a8369a updated changelog 2025-01-24 09:11:15 +05:30
Vaibhav159
6ab2404a98 adding more properties to daily room 2025-01-24 09:10:25 +05:30
Mark Backman
e61c996a2e Merge pull request #1079 from ecdeng/patch-1
Update cartesia.py to use the new model pointer `sonic`
2025-01-23 22:15:30 -05:00
Eric Deng
2c81dc1f06 Update cartesia.py to use the new model pointer sonic instead of sonic-english
We are now using `sonic` as a pointer to the latest stable release (https://docs.cartesia.ai/build-with-sonic/models#continuous-updates). sonic-english will forever point to `sonic-2024-10-19`, which is already out of date.
2025-01-23 15:47:07 -08:00
Mark Backman
53251dcb88 Add better error handling for OpenAIRealtimeBetaLLMService truncate errors 2025-01-23 14:25:08 -05:00
Mark Backman
d4e4b12109 Merge pull request #1071 from porcelaincode/patch-1
Update runner.py
2025-01-23 13:19:22 -05:00
Mark Backman
466d26a4f2 Merge pull request #1077 from Vaibhav159/vl_fix_missing_leftover_audio
adding missing audio buffer fix
2025-01-23 13:16:41 -05:00
Vaibhav159
ef511d580d adding missing audio buffer fix 2025-01-23 23:17:49 +05:30
Vaibhav159
5957ddb038 adding missing audio buffer fix 2025-01-23 23:17:18 +05:30
Vaibhav159
799c2d14b8 adding meeting token v2 func 2025-01-23 21:40:42 +05:30
Rafal Skorski
8eef21db6e Adding telnyx serializer 2025-01-23 15:39:46 +01:00
vatsal
dee1224530 Update runner.py 2025-01-23 13:21:49 +05:30
Joe Garlick
b72504f1cb Added: Additional DTMF frames 2025-01-22 13:47:23 +00:00
Rafal Skorski
89b87289e2 elevenlabs key added to env.example 2025-01-21 17:12:27 +01:00
Rafal Skorski
e0e190a1a2 Create telnyx chat bot example application 2025-01-21 17:09:55 +01:00
Filipi Fuchter
9b61633aa0 Introduce support for extracting and processing grounding metadata from Google LLM responses. 2025-01-20 11:28:12 -03:00
Filipi Fuchter
c4c15eff39 Sending a silence frame to prevent the audio from clipping. 2025-01-16 18:30:19 -03:00
Filipi Fuchter
7efd00e0f7 Asking for the bot to send the audio only when the audio element is already on playing state. 2025-01-16 16:00:56 -03:00
Filipi Fuchter
119c0da299 Configuring a proxy so we can test from mobile 2025-01-16 11:02:53 -03:00
Filipi Fuchter
ea1323723d Handling the signalling to play the audio 2025-01-16 10:42:22 -03:00
Filipi Fuchter
d2efe27350 Improving the logs and updating status 2025-01-16 10:36:45 -03:00
Filipi Fuchter
5dc7d2a378 Creating the bot when pressing to connect. 2025-01-16 10:28:39 -03:00
Filipi Fuchter
88c540f9bc Starting to create the example signalling through app message. 2025-01-16 10:14:38 -03:00
824 changed files with 99858 additions and 18520 deletions

87
.github/ISSUE_TEMPLATE/1-bug_report.yml vendored Normal file
View File

@@ -0,0 +1,87 @@
name: Bug report
description: Report a bug or unexpected behavior
type: Bug
body:
- type: markdown
attributes:
value: |
## Bug Report
Thank you for taking the time to fill out this bug report.
- type: markdown
attributes:
value: |
### Environment
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using?
placeholder: e.g., 0.0.63
validations:
required: true
- type: input
id: python-version
attributes:
label: Python version
description: Which Python version are you using?
placeholder: e.g., 3.12.8
validations:
required: true
- type: input
id: os
attributes:
label: Operating System
description: Which OS are you using?
placeholder: e.g., Ubuntu 24.04, Windows 11, macOS 12.5
validations:
required: true
- type: textarea
id: description
attributes:
label: Issue description
description: Provide a clear description of the issue.
validations:
required: true
- type: textarea
id: repro
attributes:
label: Reproduction steps
description: List the steps to reproduce the issue.
placeholder: |
1. Do this...
2. Then do that...
3. Observe the error...
validations:
required: true
- type: textarea
id: expected
attributes:
label: Expected behavior
description: What did you expect to happen?
validations:
required: true
- type: textarea
id: actual
attributes:
label: Actual behavior
description: What actually happened?
validations:
required: true
- type: textarea
id: logs
attributes:
label: Logs
description: If applicable, include any relevant logs or error messages
render: shell
validations:
required: false

67
.github/ISSUE_TEMPLATE/2-question.yml vendored Normal file
View File

@@ -0,0 +1,67 @@
name: Question
description: Ask a question or get help
type: Question
body:
- type: markdown
attributes:
value: |
## Question
Use this form to ask a question about pipecat.
- type: markdown
attributes:
value: |
### Environment (if applicable)
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using? (if applicable)
placeholder: e.g., 0.0.63
validations:
required: false
- type: input
id: python-version
attributes:
label: Python version
description: Which Python version are you using? (if applicable)
placeholder: e.g., 3.12.8
validations:
required: false
- type: input
id: os
attributes:
label: Operating System
description: Which OS are you using? (if applicable)
placeholder: e.g., Ubuntu 24.04, Windows 11, macOS 12.5
validations:
required: false
- type: textarea
id: question
attributes:
label: Question
description: Provide your question in detail here.
validations:
required: true
- type: textarea
id: tried
attributes:
label: What I've tried
description: Describe what you've already tried or research you've done.
placeholder: I've looked at the documentation and tried...
validations:
required: false
- type: textarea
id: context
attributes:
label: Context
description: Any additional context or information that might help others understand your question better.
validations:
required: false

View File

@@ -0,0 +1,52 @@
name: Feature request
description: Suggest an enhancement or new feature
type: Enhancement
body:
- type: markdown
attributes:
value: |
## Feature Request
Thank you for suggesting an enhancement to pipecat.
- type: textarea
id: problem
attributes:
label: Problem Statement
description: A clear description of the problem this feature would solve.
placeholder: I'm always frustrated when...
validations:
required: true
- type: textarea
id: solution
attributes:
label: Proposed Solution
description: A clear and concise description of what you want to happen.
validations:
required: true
- type: textarea
id: alternatives
attributes:
label: Alternative Solutions
description: Any alternative solutions or features you've considered.
validations:
required: false
- type: textarea
id: context
attributes:
label: Additional Context
description: Add any other context, mockups, or screenshots about the feature request here.
placeholder: You can drag and drop images here to include them.
validations:
required: false
- type: checkboxes
id: contribution
attributes:
label: Would you be willing to help implement this feature?
options:
- label: Yes, I'd like to contribute
- label: No, I'm just suggesting

View File

@@ -0,0 +1,82 @@
name: Service Issue
description: An issue with a third-party service
type: Service Issue
body:
- type: markdown
attributes:
value: |
## Service Issue
Use this form to report an issue with a third-party service integration.
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using?
placeholder: e.g., 0.0.63
validations:
required: true
- type: input
id: service-name
attributes:
label: Service Name
description: Which third-party service is having issues?
placeholder: e.g., OpenAI, ElevenLabs, Anthropic
validations:
required: true
- type: input
id: service-version
attributes:
label: Service or model version
description: Which version of the service API or model are you using?
placeholder: e.g., v1, gpt-4.1
validations:
required: false
- type: textarea
id: description
attributes:
label: Issue Description
description: Provide a clear description of the service issue.
validations:
required: true
- type: textarea
id: reproduction
attributes:
label: Reproduction Steps
description: Provide steps to reproduce the issue.
placeholder: |
1. Configure service X
2. Call method Y
3. See error Z
validations:
required: true
- type: textarea
id: expected
attributes:
label: Expected Behavior
description: What did you expect to happen?
validations:
required: true
- type: textarea
id: actual
attributes:
label: Actual Behavior
description: What actually happened?
validations:
required: true
- type: textarea
id: logs
attributes:
label: Error Logs
description: If available, include any error messages or logs.
render: shell
validations:
required: false

View File

@@ -0,0 +1,56 @@
name: New Service
description: Request to support a new third-party service
type: New Service
body:
- type: markdown
attributes:
value: |
## New Service Request
Use this form to request support for a new third-party service in pipecat.
- type: input
id: service-name
attributes:
label: Service Name
description: What is the name of the third-party service?
placeholder: e.g., NewAPI, SomeService
validations:
required: true
- type: input
id: service-website
attributes:
label: Service Website
description: Link to the service's website or documentation
placeholder: e.g., https://newapi.com
validations:
required: true
- type: textarea
id: service-description
attributes:
label: Service Description
description: Briefly describe what this service does and how it works.
validations:
required: true
- type: textarea
id: api-info
attributes:
label: API Information
description: If available, provide details about the service's API.
placeholder: |
- API documentation link
- Authentication method
- Key endpoints you'd like supported
validations:
required: false
- type: checkboxes
id: contribution
attributes:
label: Would you be willing to help implement this service?
options:
- label: Yes, I'd like to contribute
- label: No, I'm just suggesting

74
.github/ISSUE_TEMPLATE/6-dependency.yml vendored Normal file
View File

@@ -0,0 +1,74 @@
name: Dependency Issue
description: An issue with a Pipecat dependency (not a third-party service)
type: Dependency Issue
body:
- type: markdown
attributes:
value: |
## Dependency Issue
Use this form to report an issue with a Pipecat dependency.
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using?
placeholder: e.g., 0.0.63
validations:
required: true
- type: input
id: dependency-name
attributes:
label: Dependency Name
description: Which Pipecat dependency is causing the issue?
placeholder: e.g., openai, anthropic, fastapi
validations:
required: true
- type: input
id: dependency-version
attributes:
label: Dependency Version
description: Which version of the dependency are you using?
placeholder: e.g., 1.2.3
validations:
required: true
- type: textarea
id: description
attributes:
label: Issue Description
description: Provide a clear description of the dependency issue.
validations:
required: true
- type: textarea
id: impact
attributes:
label: Impact
description: How is this dependency issue affecting your usage of pipecat?
validations:
required: true
- type: textarea
id: reproduction
attributes:
label: Reproduction Steps
description: If applicable, provide steps to reproduce the issue.
placeholder: |
1. Install dependency X
2. Run command Y
3. See error Z
validations:
required: false
- type: textarea
id: logs
attributes:
label: Error Logs
description: If applicable, include any relevant error messages or logs.
render: shell
validations:
required: false

View File

@@ -0,0 +1,70 @@
name: Troubleshooting
description: Help with a specific use case
type: Troubleshooting
body:
- type: markdown
attributes:
value: |
## Troubleshooting Request
Use this form to get help with a specific use case or implementation.
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using?
placeholder: e.g., 0.0.63
validations:
required: true
- type: input
id: python-version
attributes:
label: Python version
description: Which version of Python are you using?
placeholder: e.g., 3.12.8
validations:
required: true
- type: input
id: os
attributes:
label: Operating System
description: Which OS are you using?
placeholder: e.g., Ubuntu 24.04, Windows 11, macOS 12.5
validations:
required: true
- type: textarea
id: use-case
attributes:
label: Use Case Description
description: Describe what you're trying to accomplish with pipecat.
validations:
required: true
- type: textarea
id: current-approach
attributes:
label: Current Approach
description: What have you tried so far? Include code snippets if relevant.
render: python
validations:
required: true
- type: textarea
id: errors
attributes:
label: Errors or Unexpected Behavior
description: Describe any errors or unexpected behavior you're encountering.
validations:
required: true
- type: textarea
id: additional-context
attributes:
label: Additional Context
description: Any other information that might help us understand your situation.
validations:
required: false

1
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@@ -0,0 +1 @@
blank_issues_enabled: false

54
.github/workflows/coverage.yaml vendored Normal file
View File

@@ -0,0 +1,54 @@
name: coverage
on:
workflow_dispatch:
push:
branches:
- main
pull_request:
branches:
- "**"
paths-ignore:
- "docs/**"
jobs:
coverage:
name: "Coverage"
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
with:
python-version: "3.10"
- name: Cache virtual environment
uses: actions/cache@v3
with:
# We are hashing dev-requirements.txt and test-requirements.txt which
# contain all dependencies needed to run the tests.
key: venv-${{ runner.os }}-${{ steps.setup_python.outputs.python-version}}-${{ hashFiles('dev-requirements.txt') }}-${{ hashFiles('test-requirements.txt') }}
path: .venv
- name: Install system packages
id: install_system_packages
run: |
sudo apt-get install -y portaudio19-dev
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt -r test-requirements.txt
- name: Run tests with coverage
run: |
source .venv/bin/activate
coverage run
coverage xml
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
slug: pipecat-ai/pipecat

21
.gitignore vendored
View File

@@ -7,7 +7,7 @@ venv
/.idea
#*#
# Distribution / packaging
# Distribution / Packaging
.Python
build/
develop-eggs/
@@ -30,8 +30,23 @@ MANIFEST
.env
fly.toml
# Example files
pipecat/examples/twilio-chatbot/templates/streams.xml
# Examples
examples/telnyx-chatbot/templates/streams.xml
examples/twilio-chatbot/templates/streams.xml
examples/**/node_modules/
examples/**/.expo/
examples/**/dist/
examples/**/npm-debug.*
examples/**/*.jks
examples/**/*.p8
examples/**/*.p12
examples/**/*.key
examples/**/*.mobileprovision
examples/**/*.orig.*
examples/**/web-build/
# macOS
.DS_Store
# Documentation
docs/api/_build/

View File

@@ -1,7 +1,8 @@
repos:
- repo: local
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.9.7
hooks:
- id: ruff-format-hook
name: Check ruff formatting
entry: sh scripts/pre-commit.sh
language: system
- id: ruff
language_version: python3
args: [ --select, I, ]
- id: ruff-format

File diff suppressed because it is too large Load Diff

View File

@@ -26,11 +26,52 @@ git commit -m "Description of your changes"
git push origin your-branch-name
```
9. **Submit a Pull Request (PR)**: Open a PR from your forked repository to the main branch of this repo.
> Important: Describe the changes you've made clearly!
8. **Submit a Pull Request (PR)**: Open a PR from your forked repository to the main branch of this repo.
> Important: Describe the changes you've made clearly!
Our maintainers will review your PR, and once everything is good, your contributions will be merged!
## Code Style and Documentation
### Python Code Style
We use Ruff for code linting and formatting. Please ensure your code passes all linting checks before submitting a PR.
### Docstring Conventions
We follow Google-style docstrings with these specific conventions:
- Class docstrings should fully document all parameters used in `__init__`
- We don't require separate docstrings for `__init__` methods when parameters are documented in the class docstring
- Property methods should have docstrings explaining their purpose and return value
Example of correctly documented class:
```python
class MyClass:
"""Class description.
Additional details about the class.
Args:
param1: Description of first parameter.
param2: Description of second parameter.
"""
def __init__(self, param1, param2):
# No docstring required here as parameters are documented above
self.param1 = param1
self.param2 = param2
@property
def some_property(self) -> str:
"""Get the formatted property value.
Returns:
A string representation of the property.
"""
return f"Property: {self.param1}"
```
# Contributor Covenant Code of Conduct
@@ -51,23 +92,23 @@ diverse, inclusive, and healthy community.
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
- Demonstrating empathy and kindness toward other people
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
- Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
- The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
- Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
@@ -162,4 +203,4 @@ For answers to common questions about this code of conduct, see the FAQ at
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
[translations]: https://www.contributor-covenant.org/translations

236
README.md
View File

@@ -1,43 +1,72 @@
<h1><div align="center">
 <img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
<img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
</div></h1>
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) <a href="https://app.commanddash.io/agent/github_pipecat-ai_pipecat"><img src="https://img.shields.io/badge/AI-Code%20Agent-EB9FDA"></a>
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![codecov](https://codecov.io/gh/pipecat-ai/pipecat/graph/badge.svg?token=LNVUIVO4Y9)](https://codecov.io/gh/pipecat-ai/pipecat) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat)
Pipecat is an open source Python framework for building voice and multimodal conversational agents. It handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions, letting you focus on creating engaging experiences.
# 🎙️ Pipecat: Real-Time Voice & Multimodal AI Agents
## What you can build
**Pipecat** is an open-source Python framework for building real-time voice and multimodal conversational agents. Orchestrate audio and video, AI services, different transports, and conversation pipelines effortlessly—so you can focus on what makes your agent unique.
- **Voice Assistants**: [Natural, real-time conversations with AI](https://demo.dailybots.ai/)
- **Interactive Agents**: Personal coaches and meeting assistants
- **Multimodal Apps**: Combine voice, video, images, and text
- **Creative Tools**: [Story-telling experiences](https://storytelling-chatbot.fly.dev/) and social companions
- **Business Solutions**: [Customer intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0) and support bots
- **Complex conversational flows**: [Refer to Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) to learn more
## 🚀 What You Can Build
## See it in action
- **Voice Assistants** natural, streaming conversations with AI
- **AI Companions** coaches, meeting assistants, characters
- **Multimodal Interfaces** voice, video, images, and more
- **Interactive Storytelling** creative tools with generative media
- **Business Agents** customer intake, support bots, guided flows
- **Complex Dialog Systems** design logic with structured conversations
🧭 Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
## 🧠 Why Pipecat?
- **Voice-first**: Integrates speech recognition, text-to-speech, and conversation handling
- **Pluggable**: Supports many AI services and tools
- **Composable Pipelines**: Build complex behavior from modular components
- **Real-Time**: Ultra-low latency interaction with different transports (e.g. WebSockets or WebRTC)
## 🎬 See it in action
<p float="left">
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/simple-chatbot/image.png" width="280" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/storytelling-chatbot/image.png" width="280" /></a>
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/simple-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/storytelling-chatbot/image.png" width="400" /></a>
<br/>
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/translation-chatbot/image.png" width="280" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/moondream-chatbot/image.png" width="280" /></a>
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/translation-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/moondream-chatbot/image.png" width="400" /></a>
</p>
## Key features
## 📱 Client SDKs
- **Voice-first Design**: Built-in speech recognition, TTS, and conversation handling
- **Flexible Integration**: Works with popular AI services (OpenAI, ElevenLabs, etc.)
- **Pipeline Architecture**: Build complex apps from simple, reusable components
- **Real-time Processing**: Frame-based pipeline architecture for fluid interactions
- **Production Ready**: Enterprise-grade WebRTC and Websocket support
You can connect to Pipecat from any platform using our official SDKs:
💡 Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
| Platform | SDK Repo | Description |
| -------- | ------------------------------------------------------------------------------ | -------------------------------- |
| Web | [pipecat-client-web](https://github.com/pipecat-ai/pipecat-client-web) | JavaScript and React client SDKs |
| iOS | [pipecat-client-ios](https://github.com/pipecat-ai/pipecat-client-ios) | Swift SDK for iOS |
| Android | [pipecat-client-android](https://github.com/pipecat-ai/pipecat-client-android) | Kotlin SDK for Android |
| C++ | [pipecat-client-cxx](https://github.com/pipecat-ai/pipecat-client-cxx) | C++ client SDK |
## Getting started
## 🧩 Available services
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when youre ready. You can also add a 📞 telephone number, 🖼️ image output, 📺 video input, use different LLMs, and more.
| Category | Services |
|---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Parakeet (NVIDIA)](https://docs.pipecat.ai/server/services/stt/parakeet), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [FastPitch (NVIDIA)](https://docs.pipecat.ai/server/services/tts/fastpitch), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) |
| Analytics & Metrics | [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
## ⚡ Getting started
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when youre ready.
```shell
# Install the module
@@ -53,150 +82,71 @@ To keep things lightweight, only the core framework is included by default. If y
pip install "pipecat-ai[option,...]"
```
Available options include:
| Category | Services | Install Command Example |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) | `pip install "pipecat-ai[deepgram]"` |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Together AI](https://docs.pipecat.ai/server/services/llm/together) | `pip install "pipecat-ai[openai]"` |
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) | `pip install "pipecat-ai[cartesia]"` |
| Speech-to-Speech | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) | `pip install "pipecat-ai[openai]"` |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local | `pip install "pipecat-ai[daily]"` |
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) | `pip install "pipecat-ai[tavus,simli]"` |
| Vision & Image | [Moondream](https://docs.pipecat.ai/server/services/vision/moondream), [fal](https://docs.pipecat.ai/server/services/image-generation/fal) | `pip install "pipecat-ai[moondream]"` |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) | `pip install "pipecat-ai[silero]"` |
| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/server/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) | `pip install "pipecat-ai[canonical]"` |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
## Code examples
## 🧪 Code examples
- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
- [Example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development
## A simple voice agent running locally
## 🛠️ Hacking on the framework itself
Here is a very basic Pipecat bot that greets a user when they join a real-time session. We'll use [Daily](https://daily.co) for real-time media transport, and [Cartesia](https://cartesia.ai/) for text-to-speech.
1. Set up a virtual environment before following these instructions. From the root of the repo:
```python
import asyncio
```shell
python3 -m venv venv
source venv/bin/activate
```
from pipecat.frames.frames import EndFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
2. Install the development dependencies:
async def main():
# Use Daily as a real-time media transport (WebRTC)
transport = DailyTransport(
room_url=...,
token="", # leave empty. Note: token is _not_ your api key
bot_name="Bot Name",
params=DailyParams(audio_out_enabled=True))
```shell
pip install -r dev-requirements.txt
```
# Use Cartesia for Text-to-Speech
tts = CartesiaTTSService(
api_key=...,
voice_id=...
)
3. Install the git pre-commit hooks (these help ensure your code follows project rules):
# Simple pipeline that will process text to speech and output the result
pipeline = Pipeline([tts, transport.output()])
```shell
pre-commit install
```
# Create Pipecat processor that can run one or more pipelines tasks
runner = PipelineRunner()
4. Install the `pipecat-ai` package locally in editable mode:
# Assign the task callable to run the pipeline
task = PipelineTask(pipeline)
```shell
pip install -e .
```
# Register an event handler to play audio when a
# participant joins the transport WebRTC session
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
participant_name = participant.get("info", {}).get("userName", "")
# Queue a TextFrame that will get spoken by the TTS service (Cartesia)
await task.queue_frame(TextFrame(f"Hello there, {participant_name}!"))
> The `-e` or `--editable` option allows you to modify the code without reinstalling.
# Register an event handler to exit the application when the user leaves.
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
await task.queue_frame(EndFrame())
5. Include optional dependencies as needed. For example:
# Run the pipeline task
await runner.run(task)
```shell
pip install -e ".[daily,deepgram,cartesia,openai,silero]"
```
if __name__ == "__main__":
asyncio.run(main())
```
6. (Optional) If you want to use this package from another directory:
Run it with:
```shell
python app.py
```
Daily provides a prebuilt WebRTC user interface. While the app is running, you can visit at `https://<yourdomain>.daily.co/<room_url>` and listen to the bot say hello!
## WebRTC for production use
WebSockets are fine for server-to-server communication or for initial development. But for production use, youll need client-server audio to use a protocol designed for real-time media transport. (For an explanation of the difference between WebSockets and WebRTC, see [this post.](https://www.daily.co/blog/how-to-talk-to-an-llm-with-your-voice/#webrtc))
One way to get up and running quickly with WebRTC is to sign up for a Daily developer account. Daily gives you SDKs and global infrastructure for audio (and video) routing. Every account gets 10,000 audio/video/transcription minutes free each month.
Sign up [here](https://dashboard.daily.co/u/signup) and [create a room](https://docs.daily.co/reference/rest-api/rooms) in the developer Dashboard.
## Hacking on the framework itself
_Note that you may need to set up a virtual environment before following the instructions below. For instance, you might need to run the following from the root of the repo:_
```shell
python3 -m venv venv
source venv/bin/activate
```
From the root of this repo, run the following:
```shell
pip install -r dev-requirements.txt
```
This will install the necessary development dependencies. Also, make sure you install the git pre-commit hooks:
```shell
pre-commit install
```
The hooks will just save you time when you submit a PR by making sure your code follows the project rules.
To use the package locally (e.g. to run sample files), run:
```shell
pip install --editable ".[option,...]"
```
The `--editable` option makes sure you don't have to run `pip install` again and you can just edit the project files locally.
If you want to use this package from another directory, you can run:
```shell
pip install "path_to_this_repo[option,...]"
```
```shell
pip install "path_to_this_repo[option,...]"
```
### Running tests
Install the test dependencies:
```shell
pip install -r test-requirements.txt
```
From the root directory, run:
```shell
pytest
```
## Setting up your editor
### Setting up your editor
This project uses strict [PEP 8](https://peps.python.org/pep-0008/) formatting via [Ruff](https://github.com/astral-sh/ruff).
### Emacs
#### Emacs
You can use [use-package](https://github.com/jwiegley/use-package) to install [emacs-lazy-ruff](https://github.com/christophermadsen/emacs-lazy-ruff) package and configure `ruff` arguments:
@@ -218,7 +168,7 @@ You can use [use-package](https://github.com/jwiegley/use-package) to install [e
:hook ((python-mode . pyvenv-auto-run)))
```
### Visual Studio Code
#### Visual Studio Code
Install the
[Ruff](https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff) extension. Then edit the user settings (_Ctrl-Shift-P_ `Open User Settings (JSON)`) and set it as the default Python formatter, and enable formatting on save:
@@ -230,7 +180,7 @@ Install the
}
```
### PyCharm
#### PyCharm
`ruff` was installed in the `venv` environment described before, now to enable autoformatting on save, go to `File` -> `Settings` -> `Tools` -> `File Watchers` and add a new watcher with the following settings:
@@ -240,7 +190,7 @@ Install the
4. **Arguments**: `format $FilePath$`
5. **Program**: `$PyInterpreterDirectory$/ruff`
## Contributing
## 🤝 Contributing
We welcome contributions from the community! Whether you're fixing bugs, improving documentation, or adding new features, here's how you can help:
@@ -253,7 +203,7 @@ Before submitting a pull request, please check existing issues and PRs to avoid
We aim to review all contributions promptly and provide constructive feedback to help get your changes merged.
## Getting help
## 🛟 Getting help
➡️ [Join our Discord](https://discord.gg/pipecat)

11
codecov.yml Normal file
View File

@@ -0,0 +1,11 @@
coverage:
range: 50..90 # coverage lower than 50 is red, higher than 90 green, between color code
status:
project:
default:
target: auto # auto % coverage target
threshold: 5% # allow for 5% reduction of coverage without failing
# do not run coverage on patch nor changes
patch: false

View File

@@ -1,11 +1,13 @@
build~=1.2.2
grpcio-tools~=1.69.0
coverage~=7.6.12
grpcio-tools~=1.67.1
pip-tools~=7.4.1
pre-commit~=4.0.1
pyright~=1.1.392
pyright~=1.1.397
pytest~=8.3.4
pytest-asyncio~=0.25.2
ruff~=0.9.1
setuptools~=75.8.0
pytest-asyncio~=0.25.3
pytest-aiohttp==1.1.0
ruff~=0.11.1
setuptools~=70.0.0
setuptools_scm~=8.1.0
python-dotenv~=1.0.1

View File

@@ -1,22 +0,0 @@
# Description
Is this reporting a bug or feature request?
If reporting a bug, please fill out the following:
### Environment
- pipecat-ai version:
- python version:
- OS:
### Issue description
Provide a clear description of the issue.
### Repro steps
List the steps to reproduce the issue.
### Expected behavior
### Actual behavior
### Logs

View File

@@ -50,6 +50,13 @@ autodoc_mock_imports = [
"pyht.protos",
"pyht.protos.api_pb2",
"pipecat_ai_playht", # PlayHT wrapper
"aiortc",
"aiortc.mediastreams",
"cv2",
"av",
"pyneuphonic",
"mem0",
"mlx_whisper",
"anthropic",
"assemblyai",
"boto3",
@@ -68,7 +75,6 @@ autodoc_mock_imports = [
"openpipe",
"simli",
"soundfile",
# Existing mocks
"pipecat_ai_krisp",
"pyaudio",
"_tkinter",
@@ -79,6 +85,66 @@ autodoc_mock_imports = [
"pydantic.Field",
"pydantic._internal._model_construction",
"pydantic._internal._fields",
# Moondream dependencies
"torch",
"transformers",
"intel_extension_for_pytorch",
# Ultravox dependencies
"huggingface_hub",
"vllm",
"vllm.engine.arg_utils",
"transformers.AutoTokenizer",
# Langchain dependencies
"langchain_core",
"langchain_core.messages",
"langchain_core.runnables",
"langchain_core.messages.AIMessageChunk",
"langchain_core.runnables.Runnable",
# LiveKit dependencies
"livekit",
"livekit.rtc",
"livekit_api",
"livekit_protocol",
"tenacity",
"tenacity.retry",
"tenacity.stop_after_attempt",
"tenacity.wait_exponential",
"rtc",
"rtc.Room",
"rtc.RoomOptions",
"rtc.AudioSource",
"rtc.LocalAudioTrack",
"rtc.TrackPublishOptions",
"rtc.TrackSource",
"rtc.AudioStream",
"rtc.AudioFrameEvent",
"rtc.AudioFrame",
"rtc.Track",
"rtc.TrackKind",
"rtc.RemoteParticipant",
"rtc.RemoteTrackPublication",
"rtc.DataPacket",
# Riva dependencies
"riva",
"riva.client",
"riva.client.Auth",
"riva.client.ASRService",
"riva.client.StreamingRecognitionConfig",
"riva.client.RecognitionConfig",
"riva.client.AudioEncoding",
"riva.client.proto.riva_tts_pb2",
"riva.client.SpeechSynthesisService",
# Local CoreML Smart Turn dependencies
"coremltools",
"coremltools.models",
"coremltools.models.MLModel",
"torch",
"torch.nn",
"torch.nn.functional",
"transformers",
"transformers.AutoFeatureExtractor",
# Also add specific classes that are imported
"AutoFeatureExtractor",
]
# HTML output settings
@@ -110,12 +176,25 @@ def verify_modules():
},
}
# Skip importing modules that are in autodoc_mock_imports
skipped_modules = set(autodoc_mock_imports)
missing = []
for category, modules in required_modules.items():
if isinstance(modules, dict):
# Handle nested structure
for subcategory, submodules in modules.items():
for module in submodules:
# Check if module is in autodoc_mock_imports
if (
f"pipecat.{category}.{subcategory}.{module}" in skipped_modules
or module in skipped_modules
):
logger.info(
f"Skipping import of mocked module: pipecat.{category}.{subcategory}.{module}"
)
continue
try:
__import__(f"pipecat.{category}.{subcategory}.{module}")
logger.info(
@@ -129,6 +208,11 @@ def verify_modules():
else:
# Handle flat structure
for module in modules:
# Check if module is in autodoc_mock_imports
if f"pipecat.{category}.{module}" in skipped_modules or module in skipped_modules:
logger.info(f"Skipping import of mocked module: pipecat.{category}.{module}")
continue
try:
__import__(f"pipecat.{category}.{module}")
logger.info(f"Successfully imported pipecat.{category}.{module}")

View File

@@ -45,8 +45,10 @@ Transport & Serialization
Utilities
~~~~~~~~~
* :mod:`Adapters <pipecat.adapters>`
* :mod:`Clocks <pipecat.clocks>`
* :mod:`Metrics <pipecat.metrics>`
* :mod:`Observers <pipecat.observers>`
* :mod:`Sync <pipecat.sync>`
* :mod:`Transcriptions <pipecat.transcriptions>`
* :mod:`Utils <pipecat.utils>`
@@ -56,10 +58,12 @@ Utilities
:caption: API Reference
:hidden:
Adapters <api/pipecat.adapters>
Audio <api/pipecat.audio>
Clocks <api/pipecat.clocks>
Frames <api/pipecat.frames>
Metrics <api/pipecat.metrics>
Observers <api/pipecat.observers>
Pipeline <api/pipecat.pipeline>
Processors <api/pipecat.processors>
Serializers <api/pipecat.serializers>

View File

@@ -10,31 +10,44 @@ pipecat-ai[anthropic]
pipecat-ai[assemblyai]
pipecat-ai[aws]
pipecat-ai[azure]
pipecat-ai[canonical]
pipecat-ai[cartesia]
pipecat-ai[cerebras]
pipecat-ai[deepseek]
pipecat-ai[daily]
pipecat-ai[deepgram]
pipecat-ai[elevenlabs]
pipecat-ai[fal]
pipecat-ai[fireworks]
pipecat-ai[fish]
pipecat-ai[gladia]
pipecat-ai[google]
pipecat-ai[grok]
pipecat-ai[groq]
# pipecat-ai[krisp] # Mocked instead
pipecat-ai[langchain]
pipecat-ai[livekit]
# pipecat-ai[krisp] # Mocked
pipecat-ai[koala]
# pipecat-ai[langchain] # Mocked
# pipecat-ai[livekit] # Mocked
pipecat-ai[lmnt]
pipecat-ai[local]
pipecat-ai[moondream]
# pipecat-ai[local-smart-turn] # Mocked
# pipecat-ai[mem0] # Mocked
# pipecat-ai[mlx-whisper] # Mocked
# pipecat-ai[moondream] # Mocked
pipecat-ai[nim]
# pipecat-ai[neuphonic] # Mocked
pipecat-ai[noisereduce]
pipecat-ai[openai]
# pipecat-ai[openpipe]
# pipecat-ai[playht] # Mocked due to grpcio conflict with riva
pipecat-ai[riva]
pipecat-ai[qwen]
pipecat-ai[remote-smart-turn]
# pipecat-ai[riva] # Mocked
pipecat-ai[silero]
pipecat-ai[simli]
pipecat-ai[soundfile]
pipecat-ai[tavus]
pipecat-ai[together]
# pipecat-ai[ultravox] # Mocked
# pipecat-ai[webrtc] # Mocked
pipecat-ai[websocket]
pipecat-ai[whisper]

View File

@@ -18,6 +18,9 @@ AZURE_DALLE_API_KEY=...
AZURE_DALLE_ENDPOINT=https://...
AZURE_DALLE_MODEL=...
# Cartesia
CARTESIA_API_KEY=...
# Daily
DAILY_API_KEY=...
DAILY_SAMPLE_ROOM_URL=https://...
@@ -26,6 +29,9 @@ DAILY_SAMPLE_ROOM_URL=https://...
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
# Neuphonic
NEUPHONIC_API_KEY=...
# Fal
FAL_KEY=...
@@ -84,3 +90,14 @@ ASSEMBLYAI_API_KEY=...
# OpenRouter
OPENROUTER_API_KEY=...
# Piper
PIPER_BASE_URL=...
# Smart turn
LOCAL_SMART_TURN_MODEL_PATH=
FAL_SMART_TURN_API_KEY=...
# Twilio
TWILIO_ACCOUNT_SID=
TWILIO_AUTH_TOKEN=

View File

@@ -39,7 +39,7 @@ Next, follow the steps in the README for each demo.
| [Translation Chatbot](translation-chatbot) | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI |
| [Moondream Chatbot](moondream-chatbot) | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU** | Deepgram, ElevenLabs, OpenAI, Moondream, Daily, Daily Prebuilt UI |
| [Patient intake](patient-intake) | A chatbot that can call functions in response to user input. | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI |
| [Dialin Chatbot](dialin-chatbot) | A chatbot that connects to an incoming phone call from Daily or Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [Phone Chatbot](phone-chatbot) | A chatbot that connects to PSTN/SIP phone calls, powered by Daily or Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [Twilio Chatbot](twilio-chatbot) | A chatbot that connects to an incoming phone call from Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [studypal](studypal) | A chatbot to have a conversation about any article on the web | |
| [WebSocket Chatbot Server](websocket-server) | A real-time websocket server that handles audio streaming and bot interactions with speech-to-text and text-to-speech capabilities. | Cartesia, Deepgram, OpenAI, Websockets |

View File

@@ -0,0 +1,45 @@
# Bot ready signaling
A simple Pipecat example demonstrating how to handle signaling between the client and the bot,
ensuring that the bot starts sending audio only when the client is available,
thereby avoiding the risk of cutting off the beginning of the audio.
## Quick Start
### First, start the bot server:
1. Navigate to the server directory:
```bash
cd server
```
2. Create and activate a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install requirements:
```bash
pip install -r requirements.txt
```
4. Copy env.example to .env and configure:
- Add your API keys
5. Start the server:
```bash
python server.py
```
### Next, connect using the client app:
For client-side setup, refer to the [JavaScript Guide](client/javascript/README.md).
## Important Note
Ensure the bot server is running before using any client implementations.
## Requirements
- Python 3.10+
- Node.js 16+ (for JavaScript)
- Daily API key
- Cartesia API key
- Modern web browser with WebRTC support

View File

@@ -0,0 +1,27 @@
# JavaScript Implementation
Basic implementation using the [Pipecat JavaScript SDK](https://docs.pipecat.ai/client/js/introduction).
## Setup
1. Run the bot server. See the [server README](../../README).
2. Navigate to the `client/javascript` directory:
```bash
cd client/javascript
```
3. Install dependencies:
```bash
npm install
```
4. Run the client app:
```
npm run dev
```
5. Visit http://localhost:5173 in your browser.

View File

@@ -0,0 +1,34 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AI Chatbot</title>
</head>
<body>
<div class="container">
<div class="status-bar">
<div class="status">
Status: <span id="connection-status">Disconnected</span>
</div>
<div class="controls">
<button id="connect-btn">Connect</button>
<button id="disconnect-btn" disabled>Disconnect</button>
</div>
</div>
<audio id="bot-audio" autoplay></audio>
<div class="debug-panel">
<h3>Debug Info</h3>
<div id="debug-log"></div>
</div>
</div>
<script type="module" src="/src/app.js"></script>
<link rel="stylesheet" href="/src/style.css">
</body>
</html>

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,20 @@
{
"name": "client",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"devDependencies": {
"vite": "^6.0.9"
},
"dependencies": {
"@daily-co/daily-js": "0.74.0"
}
}

View File

@@ -0,0 +1,216 @@
/**
* Copyright (c) 20242025, Daily
*
* SPDX-License-Identifier: BSD 2-Clause License
*/
import Daily from "@daily-co/daily-js";
/**
* ChatbotClient handles the connection and media management for a real-time
* voice interaction with an AI bot.
*/
class ChatbotClient {
constructor() {
// Initialize client state
this.dailyCallObject = null;
this.setupDOMElements();
this.setupEventListeners();
}
/**
* Set up references to DOM elements and create necessary media elements
*/
setupDOMElements() {
// Get references to UI control elements
this.connectBtn = document.getElementById('connect-btn');
this.disconnectBtn = document.getElementById('disconnect-btn');
this.statusSpan = document.getElementById('connection-status');
this.debugLog = document.getElementById('debug-log');
// Create an audio element for bot's voice output
this.botAudio = document.createElement('audio');
this.botAudio.autoplay = true;
this.botAudio.playsInline = true;
document.body.appendChild(this.botAudio);
}
/**
* Set up event listeners for connect/disconnect buttons
*/
setupEventListeners() {
this.connectBtn.addEventListener('click', () => this.connect());
this.disconnectBtn.addEventListener('click', () => this.disconnect());
}
/**
* Add a timestamped message to the debug log
*/
log(message) {
const entry = document.createElement('div');
entry.textContent = `${new Date().toISOString()} - ${message}`;
// Add styling based on message type
if (message.startsWith('User: ')) {
entry.style.color = '#2196F3'; // blue for user
} else if (message.startsWith('Bot: ')) {
entry.style.color = '#4CAF50'; // green for bot
}
this.debugLog.appendChild(entry);
this.debugLog.scrollTop = this.debugLog.scrollHeight;
console.log(message);
}
/**
* Update the connection status display
*/
updateStatus(status) {
this.statusSpan.textContent = status;
this.log(`Status: ${status}`);
}
handleEventToConsole (evt) {
this.log(`Received event: ${evt.action}`);
};
/**
* Set up listeners for track events (start/stop)
* This handles new tracks being added during the session
*/
setupTrackListeners() {
if (!this.dailyCallObject) return;
this.dailyCallObject.on("joined-meeting", () => {
this.updateStatus('Connected');
this.connectBtn.disabled = true;
this.disconnectBtn.disabled = false;
this.log('Client connected');
});
this.dailyCallObject.on("track-started", (evt) => {
if (evt.track.kind === "audio" && evt.participant.local === false) {
this.log("Audio track started.")
this.setupAudioTrack(evt.track);
}
});
this.dailyCallObject.on("track-stopped", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-joined", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-updated", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-left", () => {
// When the bot leaves, we are also disconnecting from the call
this.disconnect()
});
this.dailyCallObject.on("left-meeting", () => {
this.updateStatus('Disconnected');
this.connectBtn.disabled = false;
this.disconnectBtn.disabled = true;
this.log('Client disconnected');
});
this.dailyCallObject.on("error", this.handleEventToConsole.bind(this));
}
/**
* Set up an audio track for playback
* Handles both initial setup and track updates
*/
setupAudioTrack(track) {
this.log(`Setting up audio track, track state: ${track.readyState}, muted: ${track.muted}`);
// Check if we're already playing this track
if (this.botAudio.srcObject) {
const oldTrack = this.botAudio.srcObject.getAudioTracks()[0];
if (oldTrack?.id === track.id) return;
}
// Create a new MediaStream with the track and set it as the audio source
this.botAudio.srcObject = new MediaStream([track]);
this.botAudio.onplaying = async (event) => {
this.log("onplaying")
this.log("Will send the audio message to play the audio at the next tick")
this.dailyCallObject.sendAppMessage("playable")
}
}
async fetchRoomInfo() {
let connectUrl = '/connect'
let res = await fetch(connectUrl, {
method: "POST",
mode: "cors",
headers: new Headers({
"Content-Type": "application/json"
}),
})
if (res.ok) {
return res.json();
}
}
/**
* Initialize and connect to the bot
* This sets up the RTVI client, initializes devices, and establishes the connection
*/
async connect() {
try {
// Initialize the client
this.dailyCallObject = Daily.createCallObject({
subscribeToTracksAutomatically: true,
});
// Set up listeners for media track events
this.setupTrackListeners();
this.log('Creating the bot...');
let roomInfo = await this.fetchRoomInfo()
// Connect to the bot
this.log('Connecting to bot...');
// Only for making debugger easier
window.callObject = this.dailyCallObject;
await this.dailyCallObject.join({
url: roomInfo.room_url,
});
this.log('Connection complete');
} catch (error) {
// Handle any errors during connection
this.log(`Error connecting: ${error.message}`);
this.log(`Error stack: ${error.stack}`);
this.updateStatus('Error');
// Clean up if there's an error
if (this.dailyCallObject) {
try {
await this.dailyCallObject.leave();
} catch (disconnectError) {
this.log(`Error during disconnect: ${disconnectError.message}`);
}
}
}
}
/**
* Disconnect from the bot and clean up media resources
*/
async disconnect() {
if (this.dailyCallObject) {
try {
// Disconnect the RTVI client
await this.dailyCallObject.leave();
await this.dailyCallObject.destroy();
this.dailyCallObject = null;
// Clean up audio
if (this.botAudio.srcObject) {
this.botAudio.srcObject.getTracks().forEach((track) => track.stop());
this.botAudio.srcObject = null;
}
} catch (error) {
this.log(`Error disconnecting: ${error.message}`);
}
}
}
}
// Initialize the client when the page loads
window.addEventListener('DOMContentLoaded', () => {
new ChatbotClient();
});

View File

@@ -0,0 +1,98 @@
body {
margin: 0;
padding: 20px;
font-family: Arial, sans-serif;
background-color: #f0f0f0;
}
.container {
max-width: 1200px;
margin: 0 auto;
}
.status-bar {
display: flex;
justify-content: space-between;
align-items: center;
padding: 10px;
background-color: #fff;
border-radius: 8px;
margin-bottom: 20px;
}
.controls button {
padding: 8px 16px;
margin-left: 10px;
border: none;
border-radius: 4px;
cursor: pointer;
}
#connect-btn {
background-color: #4caf50;
color: white;
}
#disconnect-btn {
background-color: #f44336;
color: white;
}
button:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.main-content {
background-color: #fff;
border-radius: 8px;
padding: 20px;
margin-bottom: 20px;
}
.bot-container {
display: flex;
flex-direction: column;
align-items: center;
}
#bot-video-container {
width: 640px;
height: 360px;
background-color: #e0e0e0;
border-radius: 8px;
margin: 20px auto;
overflow: hidden;
display: flex;
align-items: center;
justify-content: center;
}
#bot-video-container video {
width: 100%;
height: 100%;
object-fit: cover;
}
.debug-panel {
background-color: #fff;
border-radius: 8px;
padding: 20px;
}
.debug-panel h3 {
margin: 0 0 10px 0;
font-size: 16px;
font-weight: bold;
}
#debug-log {
height: 200px;
overflow-y: auto;
background-color: #f8f8f8;
padding: 10px;
border-radius: 4px;
font-family: monospace;
font-size: 12px;
line-height: 1.4;
}

View File

@@ -0,0 +1,13 @@
import { defineConfig } from 'vite';
export default defineConfig({
server: {
proxy: {
// Proxy /api requests to the backend server
'/connect': {
target: 'http://0.0.0.0:7860', // Replace with your backend URL
changeOrigin: true,
},
},
},
});

View File

@@ -0,0 +1 @@
22.14

View File

@@ -0,0 +1,60 @@
# React Native Implementation
Basic implementation using the [Pipecat React Native SDK](https://docs.pipecat.ai/client/react-native/introduction).
## Usage
### Expo requirements
This project cannot be used with an [Expo Go](https://docs.expo.dev/workflow/expo-go/) app because [it requires custom native code](https://docs.expo.io/workflow/customizing/).
When a project requires custom native code or a config plugin, we need to transition from using [Expo Go](https://docs.expo.dev/workflow/expo-go/)
to a [development build](https://docs.expo.dev/development/introduction/).
More details about the custom native code used by this demo can be found in [rn-daily-js-expo-config-plugin](https://github.com/daily-co/rn-daily-js-expo-config-plugin).
### Building remotely
If you do not have experience with Xcode and Android Studio builds or do not have them installed locally on your computer, you will need to follow [this guide from Expo to use EAS Build](https://docs.expo.dev/development/create-development-builds/#create-and-install-eas-build).
### Building locally
You will need to have installed locally on your computer:
- [Xcode](https://developer.apple.com/xcode/) to build for iOS;
- [Android Studio](https://developer.android.com/studio) to build for Android;
#### Install the demo dependencies
```bash
# Use the version of node specified in .nvmrc
nvm i
# Install dependencies
npm i
# Before a native app can be compiled, the native source code must be generated.
npx expo prebuild
# Configure the environment variable to connect to the local server
cp env.example .env
# edit .env and add your local ip address, for example: http://192.168.1.16:7860
```
#### Running on Android
After plugging in an Android device [configured for debugging](https://developer.android.com/studio/debug/dev-options), run the following command:
```
npm run android
```
#### Running on iOS
Run the following command:
```
npm run ios
```
#### Connect to the server
Use the http://localhost:5173 in your app.

View File

@@ -0,0 +1,75 @@
{
"expo": {
"name": "bot-ready-rn",
"slug": "bot-ready-rn",
"version": "1.0.0",
"orientation": "portrait",
"icon": "./assets/icon.png",
"userInterfaceStyle": "light",
"splash": {
"image": "./assets/splash.png",
"resizeMode": "contain",
"backgroundColor": "#ffffff"
},
"updates": {
"fallbackToCacheTimeout": 0
},
"assetBundlePatterns": [
"**/*"
],
"ios": {
"supportsTablet": true,
"bitcode": false,
"bundleIdentifier": "co.daily.expo.BotReady",
"infoPlist": {
"UIBackgroundModes": [
"voip"
]
},
"appleTeamId": "EEBGKV9N3N"
},
"android": {
"adaptiveIcon": {
"foregroundImage": "./assets/adaptive-icon.png",
"backgroundColor": "#FFFFFF"
},
"package": "co.daily.expo.BotReady",
"permissions": [
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.BLUETOOTH",
"android.permission.CAMERA",
"android.permission.INTERNET",
"android.permission.MODIFY_AUDIO_SETTINGS",
"android.permission.RECORD_AUDIO",
"android.permission.SYSTEM_ALERT_WINDOW",
"android.permission.WAKE_LOCK",
"android.permission.FOREGROUND_SERVICE",
"android.permission.FOREGROUND_SERVICE_CAMERA",
"android.permission.FOREGROUND_SERVICE_MICROPHONE",
"android.permission.FOREGROUND_SERVICE_MEDIA_PROJECTION",
"android.permission.POST_NOTIFICATIONS"
]
},
"web": {
"favicon": "./assets/favicon.png"
},
"plugins": [
"@config-plugins/react-native-webrtc",
"@daily-co/config-plugin-rn-daily-js",
[
"expo-build-properties",
{
"android": {
"minSdkVersion": 24,
"compileSdkVersion": 35,
"targetSdkVersion": 34,
"buildToolsVersion": "35.0.0"
},
"ios": {
"deploymentTarget": "15.1"
}
}
]
]
}
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 17 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.4 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 22 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 46 KiB

View File

@@ -0,0 +1,7 @@
module.exports = function(api) {
api.cache(true);
return {
presets: ['babel-preset-expo'],
plugins: [["module:react-native-dotenv"]],
};
};

View File

@@ -0,0 +1 @@
API_BASE_URL=http://YOUR_LOCAL_IP:7860

View File

@@ -0,0 +1,7 @@
import { registerRootComponent } from "expo";
import App from "./src/App";
// registerRootComponent calls AppRegistry.registerComponent('main', () => App);
// It also ensures that the environment is set up appropriately
registerRootComponent(App);

View File

@@ -0,0 +1,4 @@
// Learn more https://docs.expo.io/guides/customizing-metro
const { getDefaultConfig } = require('expo/metro-config');
module.exports = getDefaultConfig(__dirname);

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,31 @@
{
"name": "bot-ready-rn",
"version": "1.0.0",
"scripts": {
"start": "expo start --dev-client",
"android": "expo run:android --device",
"ios": "expo run:ios --device",
"web": "expo start --web"
},
"dependencies": {
"@config-plugins/react-native-webrtc": "^10.0.0",
"@daily-co/config-plugin-rn-daily-js": "0.0.7",
"@daily-co/react-native-daily-js": "^0.70.0",
"@daily-co/react-native-webrtc": "^118.0.3-daily.2",
"@react-native-async-storage/async-storage": "1.23.1",
"expo": "^52.0.0",
"expo-build-properties": "~0.13.1",
"expo-dev-client": "~5.0.5",
"expo-splash-screen": "~0.29.16",
"expo-status-bar": "~2.0.0",
"react": "18.3.1",
"react-native": "0.76.3",
"react-native-background-timer": "^2.4.1",
"react-native-dotenv": "^3.4.11",
"react-native-get-random-values": "^1.11.0"
},
"devDependencies": {
"@babel/core": "^7.12.9"
},
"private": true
}

View File

@@ -0,0 +1,121 @@
import React, { useState, useEffect } from 'react';
import {SafeAreaView, View, Text, Button, StyleSheet, ScrollView} from 'react-native';
import Daily from "@daily-co/react-native-daily-js";
import { API_BASE_URL } from "@env";
const CallScreen = () => {
const [connectionStatus, setConnectionStatus] = useState('Disconnected');
const [isConnected, setIsConnected] = useState(false);
const [callObject, setCallObject] = useState(null);
const [logs, setLogs] = useState([]);
useEffect(() => {
if (callObject) {
setupTrackListeners(callObject);
}
}, [callObject]);
const log = (message) => {
setLogs((prevLogs) => [...prevLogs, `${new Date().toISOString()} - ${message}`]);
console.log(message);
};
const setupTrackListeners = (callObject) => {
callObject.on("joined-meeting", () => {
setConnectionStatus('Connected');
setIsConnected(true);
log('Client connected');
});
callObject.on("left-meeting", () => {
setConnectionStatus('Disconnected');
setIsConnected(false);
log('Client disconnected');
});
callObject.on("participant-left", () => {
// When the bot leaves, we are also disconnecting from the call
disconnect().catch((err) => {
log(`Failed to disconnect ${err}`);
})
});
// Trigger so the bot can start sending audio
callObject.on("track-started", (evt) => {
if (evt.track.kind === "audio" && evt.participant.local === false) {
handleEventToConsole(evt)
log("Sending the message that will trigger the bot to play the audio.")
callObject.sendAppMessage("playable")
}
});
callObject.on("error", (evt) => log(`Error: ${evt.error}`));
// Other events just for awareness
callObject.on("track-stopped", handleEventToConsole);
callObject.on("participant-joined", handleEventToConsole);
callObject.on("participant-updated", handleEventToConsole);
};
const handleEventToConsole = (evt) => {
log(`Received event: ${evt.action}`);
};
const connect = async () => {
try {
const callObject = Daily.createCallObject({ subscribeToTracksAutomatically: true });
setCallObject(callObject);
const connectionUrl = `${API_BASE_URL}/connect`
const res = await fetch(connectionUrl, { method: "POST", headers: { "Content-Type": "application/json" } });
const roomInfo = await res.json();
await callObject.join({ url: roomInfo.room_url });
} catch (error) {
log(`Error connecting: ${error.message}`);
}
};
const disconnect = async () => {
if (callObject) {
try {
await callObject.leave();
await callObject.destroy();
setCallObject(null);
} catch (error) {
log(`Error disconnecting: ${error.message}`);
}
}
};
return (
<SafeAreaView style={styles.safeArea}>
<View style={styles.container}>
<View style={styles.statusBar}>
<Text>Status: <Text style={styles.status}>{connectionStatus}</Text></Text>
<View style={styles.controls}>
<Button
title={isConnected ? "Disconnect" : "Connect"}
onPress={isConnected ? disconnect : connect}
/>
</View>
</View>
<View style={styles.debugPanel}>
<Text style={styles.debugTitle}>Debug Info</Text>
<ScrollView style={styles.debugLog}>
{logs.map((logEntry, index) => (
<Text key={index} style={styles.logText}>{logEntry}</Text>
))}
</ScrollView>
</View>
</View>
</SafeAreaView>
);
};
const styles = StyleSheet.create({
safeArea: { flex: 1, backgroundColor: '#f0f0f0', padding: 20 },
container: { flex: 1, margin: 20 },
statusBar: { flexDirection: 'row', justifyContent: 'space-between', alignItems: 'center', padding: 10, backgroundColor: '#fff', borderRadius: 8, marginBottom: 20 },
status: { fontWeight: 'bold' },
controls: { flexDirection: 'row', gap: 10 },
debugPanel: { height: '80%', backgroundColor: '#fff', borderRadius: 8, padding: 20},
debugTitle: { fontSize: 16, fontWeight: 'bold' },
debugLog: { height: '100%', overflow: 'scroll', backgroundColor: '#f8f8f8', padding: 10, borderRadius: 4, fontFamily: 'monospace', fontSize: 12, lineHeight: 1.4 },
});
export default CallScreen;

View File

@@ -0,0 +1,50 @@
# Bot ready signaling Server
A FastAPI server that manages bot instances and provide endpoint for Pipecat client connections.
## Endpoints
- `POST /connect` - Pipecat client connection endpoint
## Environment Variables
Copy `env.example` to `.env` and configure:
```ini
# Required API Keys
DAILY_API_KEY= # Your Daily API key
CARTESIA_API_KEY= # Your Cartesia API key
# Optional Configuration
DAILY_API_URL= # Optional: Daily API URL (defaults to https://api.daily.co/v1)
DAILY_SAMPLE_ROOM_URL= # Optional: Fixed room URL for development
HOST= # Optional: Host address (defaults to 0.0.0.0)
FAST_API_PORT= # Optional: Port number (defaults to 7860)
```
## Running the Server
Set up and activate your virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
Install dependencies:
```bash
pip install -r requirements.txt
```
If you want to use the local version of `pipecat` in this repo rather than the last published version, also run:
```bash
pip install --editable "../../../[daily,cartesia,openai]"
```
Run the server:
```bash
python server.py
```

View File

@@ -0,0 +1,3 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=
CARTESIA_API_KEY=

View File

@@ -0,0 +1,4 @@
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,cartesia,openai]

View File

@@ -6,6 +6,7 @@
import argparse
import os
from typing import Optional
import aiohttp
@@ -18,7 +19,7 @@ async def configure(aiohttp_session: aiohttp.ClientSession):
async def configure_with_args(
aiohttp_session: aiohttp.ClientSession, parser: argparse.ArgumentParser | None = None
aiohttp_session: aiohttp.ClientSession, parser: Optional[argparse.ArgumentParser] = None
):
if not parser:
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")

View File

@@ -0,0 +1,147 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
from typing import Any, Dict
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
# Load environment variables from .env file
load_dotenv(override=True)
# Dictionary to track bot processes: {pid: (process, room_url)}
bot_procs = {}
# Store Daily API helpers
daily_helpers = {}
def cleanup():
"""Cleanup function to terminate all bot processes.
Called during server shutdown.
"""
for entry in bot_procs.values():
proc = entry[0]
proc.terminate()
proc.wait()
@asynccontextmanager
async def lifespan(app: FastAPI):
"""FastAPI lifespan manager that handles startup and shutdown tasks.
- Creates aiohttp session
- Initializes Daily API helper
- Cleans up resources on shutdown
"""
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
cleanup()
# Initialize FastAPI app with lifespan manager
app = FastAPI(lifespan=lifespan)
# Configure CORS to allow requests from any origin
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
async def create_room_and_token() -> tuple[str, str]:
"""Helper function to create a Daily room and generate an access token.
Returns:
tuple[str, str]: A tuple containing (room_url, token)
Raises:
HTTPException: If room creation or token generation fails
"""
room = await daily_helpers["rest"].create_room(DailyRoomParams())
if not room.url:
raise HTTPException(status_code=500, detail="Failed to create room")
token = await daily_helpers["rest"].get_token(room.url)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
return room.url, token
@app.post("/connect")
async def bot_connect(request: Request) -> Dict[Any, Any]:
"""Connect endpoint that creates a room and returns connection credentials.
This endpoint is called by client to establish a connection.
Returns:
Dict[Any, Any]: Authentication bundle containing room_url and token
Raises:
HTTPException: If room creation, token generation, or bot startup fails
"""
print("Creating room for RTVI connection")
room_url, token = await create_room_and_token()
print(f"Room URL: {room_url}")
# Start the bot process
try:
bot_file = "signalling_bot"
proc = subprocess.Popen(
[f"python3 -m {bot_file} -u {room_url} -t {token}"],
shell=True,
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
bot_procs[proc.pid] = (proc, room_url)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
# Return the authentication bundle in format expected by DailyTransport
return {"room_url": room_url, "token": token}
if __name__ == "__main__":
import uvicorn
# Parse command line arguments for server configuration
default_host = os.getenv("HOST", "0.0.0.0")
default_port = int(os.getenv("FAST_API_PORT", "7860"))
parser = argparse.ArgumentParser(description="Daily Travel Companion FastAPI server")
parser.add_argument("--host", type=str, default=default_host, help="Host address")
parser.add_argument("--port", type=int, default=default_port, help="Port number")
parser.add_argument("--reload", action="store_true", help="Reload code on change")
config = parser.parse_args()
# Start the FastAPI server
uvicorn.run(
"server:app",
host=config.host,
port=config.port,
reload=config.reload,
)

View File

@@ -0,0 +1,95 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
from dataclasses import dataclass
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.frames.frames import AudioRawFrame, EndFrame, OutputAudioRawFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
@dataclass
class SilenceFrame(OutputAudioRawFrame):
def __init__(
self,
*,
sample_rate: int,
duration: float,
):
# Initialize the parent class with the silent frame's data
super().__init__(
audio=self.create_silent_audio_frame(sample_rate, 1, duration).audio,
sample_rate=sample_rate,
num_channels=1,
)
@staticmethod
def create_silent_audio_frame(
sample_rate: int, num_channels: int, duration: float
) -> AudioRawFrame:
"""Create an AudioRawFrame containing silence."""
frame_size = num_channels * 2 # 2 bytes per sample for 16-bit audio
total_frames = int(sample_rate * duration)
total_bytes = total_frames * frame_size
silent_audio = bytes(total_bytes) # Create a byte array filled with zeros
return AudioRawFrame(audio=silent_audio, sample_rate=sample_rate, num_channels=num_channels)
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url, None, "Say One Thing", DailyParams(audio_out_enabled=True)
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
runner = PipelineRunner()
task = PipelineTask(Pipeline([tts, transport.output()]))
# Register an event handler so we can play the audio when we receive a specific message
@transport.event_handler("on_app_message")
async def on_app_message(transport, message, sender):
logger.debug(f"Received app message: {message} - {sender}")
if "playable" not in message:
return
await task.queue_frames(
[
SilenceFrame(
sample_rate=task.params.audio_out_sample_rate,
duration=0.5,
),
TTSSpeakFrame(f"Hello there, how are you doing today ?"),
EndFrame(),
]
)
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -1,66 +0,0 @@
# Chatbot with canonical-metrics
This project implements a chatbot using a pipeline architecture that integrates audio processing, transcription, and a language model for conversational interactions. The chatbot operates within a daily communication environment, utilizing various services for text-to-speech and language model responses.
## Features
- **Audio Input and Output**: Captures microphone input and plays back audio responses.
- **Voice Activity Detection**: Utilizes Silero VAD to manage audio input intelligently.
- **Text-to-Speech**: Integrates ElevenLabs TTS service to convert text responses into audio.
- **Language Model Interaction**: Uses OpenAI's GPT-4 model to generate responses based on user input.
- **Transcription Services**: Captures and transcribes participant speech for analytics.
- **Metrics Collection**: Sends audio data for analysis via Canonical Metrics Service.
## Requirements
- Python 3.10+
- `python-dotenv`
- Additional libraries from the `pipecat` package.
## Setup
1. Clone the repository.
2. Install the required packages.
3. Set up environment variables for API keys:
- `OPENAI_API_KEY`
- `ELEVENLABS_API_KEY`
- `CANONICAL_API_KEY`
- `CANONICAL_API_URL`
4. Run the script.
## Usage
The chatbot introduces itself and engages in conversations, providing brief and creative responses. Designed for flexibility, it can support multiple languages with appropriate configuration.
## Events
- Participants joining or leaving the call are handled dynamically, adjusting the chatbot's behavior accordingly.
The first time, things might take extra time to get started since VAD (Voice Activity Detection) model needs to be downloaded.
## Get started
```python
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp env.example .env # and add your credentials
```
## Run the server
```bash
python server.py
```
Then, visit `http://localhost:7860/` in your browser to start a chatbot session.
## Build and test the Docker image
```
docker build -t chatbot .
docker run --env-file .env -p 7860:7860 chatbot
```

View File

@@ -1,146 +0,0 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import uuid
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.services.canonical import CanonicalMetricsService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, token) = await configure(session)
transport = DailyTransport(
room_url,
token,
"Chatbot",
DailyParams(
audio_out_enabled=True,
audio_in_enabled=True,
camera_out_enabled=False,
vad_enabled=True,
vad_audio_passthrough=True,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
#
# Spanish
#
# transcription_settings=DailyTranscriptionSettings(
# language="es",
# tier="nova",
# model="2-general"
# )
),
)
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
#
# English
#
voice_id="cgSgspJ2msm6clMCkdW9",
aiohttp_session=session,
#
# Spanish
#
# model="eleven_multilingual_v2",
# voice_id="gD1IexrzCvsXPHUuT0s3",
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
messages = [
{
"role": "system",
#
# English
#
"content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your responses to 12 words or fewer.",
#
# Spanish
#
# "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
"""
CanonicalMetrics uses AudioBufferProcessor under the hood to buffer the audio. On
call completion, CanonicalMetrics will send the audio buffer to Canonical for
analysis. Visit https://voice.canonical.chat to learn more.
"""
audio_buffer_processor = AudioBufferProcessor(num_channels=2)
canonical = CanonicalMetricsService(
audio_buffer_processor=audio_buffer_processor,
aiohttp_session=session,
api_key=os.getenv("CANONICAL_API_KEY"),
call_id=str(uuid.uuid4()),
assistant="pipecat-chatbot",
assistant_speaks_first=True,
context=context,
)
pipeline = Pipeline(
[
transport.input(), # microphone
context_aggregator.user(),
llm,
tts,
transport.output(),
audio_buffer_processor, # captures audio into a buffer
canonical, # uploads audio buffer to Canonical AI for metrics
context_aggregator.assistant(),
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_transcription(participant["id"])
await task.queue_frames([context_aggregator.user().get_context_frame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
print(f"Participant left: {participant}")
await task.queue_frame(EndFrame())
@transport.event_handler("on_call_state_updated")
async def on_call_state_updated(transport, state):
if state == "left":
await task.queue_frame(EndFrame())
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -1,5 +0,0 @@
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,openai,silero,elevenlabs,canonical]

View File

@@ -18,14 +18,13 @@ from loguru import logger
from runner import configure
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
@@ -33,10 +32,16 @@ load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
# Create the recordings directory if it doesn't exist
os.makedirs("recordings", exist_ok=True)
async def save_audio(audio: bytes, sample_rate: int, num_channels: int):
async def save_audio(audio: bytes, sample_rate: int, num_channels: int, name: str):
if len(audio) > 0:
filename = f"conversation_recording{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}.wav"
filename = os.path.join(
"recordings",
f"{name}_conversation_recording{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}.wav",
)
with io.BytesIO() as buffer:
with wave.open(buffer, "wb") as wf:
wf.setsampwidth(2)
@@ -61,9 +66,7 @@ async def main():
DailyParams(
audio_out_enabled=True,
audio_in_enabled=True,
camera_out_enabled=False,
vad_enabled=True,
vad_audio_passthrough=True,
video_out_enabled=False,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
#
@@ -83,7 +86,6 @@ async def main():
# English
#
voice_id="cgSgspJ2msm6clMCkdW9",
aiohttp_session=session,
#
# Spanish
#
@@ -91,7 +93,7 @@ async def main():
# voice_id="gD1IexrzCvsXPHUuT0s3",
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
@@ -110,8 +112,9 @@ async def main():
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
# Save audio every 10 seconds.
audiobuffer = AudioBufferProcessor(buffer_size=480000)
# NOTE: Watch out! This will save all the conversation in memory. You
# can pass `buffer_size` to get periodic callbacks.
audiobuffer = AudioBufferProcessor(enable_turn_audio=True)
pipeline = Pipeline(
[
@@ -125,21 +128,30 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels)
await save_audio(audio, sample_rate, num_channels, "full")
@audiobuffer.event_handler("on_user_turn_audio_data")
async def on_user_turn_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels, "user")
@audiobuffer.event_handler("on_bot_turn_audio_data")
async def on_bot_turn_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels, "bot")
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await audiobuffer.start_recording()
await transport.capture_participant_transcription(participant["id"])
await task.queue_frames([context_aggregator.user().get_context_frame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
print(f"Participant left: {participant}")
await task.queue_frame(EndFrame())
await task.cancel()
runner = PipelineRunner()

View File

@@ -53,4 +53,3 @@ async def configure(aiohttp_session: aiohttp.ClientSession):
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token)
return (url, token)

View File

@@ -0,0 +1,39 @@
# Daily Custom Tracks
This example shows how to send and receive Daily custom tracks. We will run a simple `daily-python` application to send an audio file with a custom track (named "pipecat") to a room. Then, the Pipecat bot will mirror that custom track into another custom track (named "pipecat-mirror") in the same room.
## Get started
```python
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
## Run the bot
Start the bot by giving it a Daily room URL.
```bash
python bot.py -u ROOM_URL
```
The bot will wait for the first participant to join. Then, it will mirror a custom track named "pipecat" into a new custom track named "pipecat-mirror".
## Run the sender
Now, run the custom track sender. This is a simple `daily-python` application that opens and audio file and sends it as a custom track to the same Daily room.
```bash
python custom_track_sender.py -u ROOM_URL -i office-ambience-mono-16000.mp3
```
## Open client
Finally, open the client so you can hear both custom tracks.
```bash
open index.html
```
Once the client is opened, copy the URL of the Daily room and join it. You should be able to select which custom track you want to hear.

View File

@@ -0,0 +1,87 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import sys
import aiohttp
from loguru import logger
from runner import configure
from pipecat.frames.frames import Frame, InputAudioRawFrame, OutputAudioRawFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.transports.services.daily import DailyParams, DailyTransport
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
class CustomTrackMirrorProcessor(FrameProcessor):
def __init__(self, transport_destination: str, **kwargs):
super().__init__(**kwargs)
self._transport_destination = transport_destination
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, InputAudioRawFrame) and frame.transport_source:
output_frame = OutputAudioRawFrame(
audio=frame.audio,
sample_rate=frame.sample_rate,
num_channels=frame.num_channels,
)
output_frame.transport_destination = self._transport_destination
await self.push_frame(output_frame)
else:
await self.push_frame(frame, direction)
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url,
None,
"Custom tracks mirror",
DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
microphone_out_enabled=False, # Disable since we just use custom tracks
audio_out_destinations=["pipecat-mirror"],
),
)
pipeline = Pipeline(
[
transport.input(), # Transport user input
CustomTrackMirrorProcessor("pipecat-mirror"),
transport.output(), # Transport bot output
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_in_sample_rate=16000,
audio_out_sample_rate=16000,
),
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_audio(participant["id"], audio_source="pipecat")
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,74 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import time
from daily import CallClient, CustomAudioSource, Daily
from pydub import AudioSegment
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument("-u", "--url", type=str, required=True, help="URL of the Daily room to join")
parser.add_argument(
"-i", "--input", type=str, required=True, help="Input audio file (needs 16000 sample rate)"
)
args, _ = parser.parse_known_args()
audio = AudioSegment.from_mp3(args.input)
raw_bytes = audio.raw_data
sample_rate = audio.frame_rate
channels = audio.channels
print(f"Length: {len(raw_bytes)} bytes")
print(f"Sample rate: {sample_rate}, Channels: {channels}")
# Initialize the Daily context & create call client
Daily.init()
client = CallClient()
# Join the room and indicate we have a custom track named "pipecat".
client.join(
args.url,
client_settings={
"publishing": {
"camera": False,
"microphone": False,
"customAudio": {"pipecat": True},
},
},
)
# Just sleep for a couple of seconds. To do this well we should really use
# completions.
time.sleep(2)
# Create the custom audio source. This is where we will write our audio.
audio_source = CustomAudioSource(sample_rate, channels)
# Create an audio track and assign it our audio source.
client.add_custom_audio_track("pipecat", audio_source)
# Just sleep for a second. To do this well we should really use completions.
time.sleep(1)
try:
# Just write one second of audio until we have read all the file.
chunk_size = sample_rate * channels * 2
while len(raw_bytes) > 0:
chunk = raw_bytes[:chunk_size]
raw_bytes = raw_bytes[chunk_size:]
audio_source.write_frames(chunk)
except KeyboardInterrupt:
client.leave()
# Just sleep for a second. To do this well we should really use completions.
time.sleep(1)
client.release()

View File

@@ -0,0 +1,173 @@
<html>
<head>
<title>daily custom tracks</title>
</head>
<script crossorigin src="https://unpkg.com/@daily-co/daily-js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.js"></script>
<link
rel="stylesheet"
type="text/css"
href="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.css"
/>
<script>
function enableButton(buttonId, enable) {
const button = document.getElementById(buttonId);
button.disabled = !enable;
}
function enableJoinButton(enable) {
enableButton("join-button", enable);
}
function enableLeaveButton(enable) {
enableButton("leave-button", enable);
}
function destroyPlayers(query) {
const items = document.querySelectorAll(query);
if (items) {
for (const item of items) {
item.remove();
}
}
}
function destroyParticipantPlayers(participantId) {
destroyPlayers(`audio[data-participant-id="${participantId}"]`);
destroyPlayers(`button[data-participant-id="${participantId}"]`);
}
async function startPlayer(player, track) {
player.muted = false;
player.autoplay = true;
if (track != null) {
player.srcObject = new MediaStream([track]);
}
}
async function buildAudioPlayer(track, participantId) {
const audioContainer = document.getElementById("audio-container");
const player = document.createElement("audio");
player.dataset.participantId = participantId;
// Create a new button for controlling audio
const audioControlButton = document.createElement("button");
audioControlButton.className = "ui primary green button"
audioControlButton.innerText = track._mediaTag == "cam-audio" ? "english" : track._mediaTag;
audioControlButton.dataset.participantId = participantId;
audioControlButton.onclick = () => {
if (player.paused) {
player.play();
audioControlButton.className = "ui primary red button"
} else {
player.pause();
audioControlButton.className = "ui primary green button"
}
};
audioContainer.appendChild(player);
audioContainer.appendChild(audioControlButton);
await startPlayer(player, track);
player.pause()
return player;
}
function subscribeToTracks(participantId) {
console.log(`subscribing to track`);
if (participantId === "local") {
return;
}
callObject.updateParticipant(participantId, {
setSubscribedTracks: {
audio: true,
video: false,
custom: true,
},
});
}
function startDaily() {
enableJoinButton(true);
enableLeaveButton(false);
window.callObject = window.DailyIframe.createCallObject({});
callObject.on("participant-joined", (e) => {
if (!e.participant.local) {
console.log("participant-joined", e.participant);
subscribeToTracks(e.participant.session_id);
}
});
callObject.on("participant-left", (e) => {
console.log("participant-left", e.participant.session_id);
destroyParticipantPlayers(e.participant.session_id);
});
callObject.on("track-started", async (e) => {
console.log("track-started", e.track);
if (e.track.kind === "audio") {
await buildAudioPlayer(e.track, e.participant.session_id);
}
});
}
async function joinRoom() {
enableJoinButton(false);
enableLeaveButton(true);
const meetingUrl = document.getElementById("meeting-url").value;
callObject.join({
url: meetingUrl,
startVideoOff: true,
startAudioOff: true,
subscribeToTracksAutomatically: false,
receiveSettings: {
base: { video: { layer: 0 } },
},
});
}
async function leaveRoom() {
enableJoinButton(true);
enableLeaveButton(false);
callObject.leave();
const audioContainer = document.getElementById("audio-container");
audioContainer.replaceChildren();
}
</script>
<body onload="startDaily()">
<div class="ui centered page grid" style="margin-top: 30px">
<div class="ten wide column">
<div class="ui form" style="margin-top: 30px">
<div class="field">
<label>Meeting URL</label>
<input id="meeting-url" value="" />
</div>
</div>
</div>
</div>
<div class="ui centered aligned header" style="margin-top: 30px">
<button id="join-button" class="ui primary button" onclick="joinRoom()">
Join
</button>
<button id="leave-button" class="ui button" onclick="leaveRoom()">
Leave
</button>
</div>
<div id="tile" class="ui container" style="margin-top: 30px">
<div id="tile" class="ui center aligned grid">
<div id="audio-container"></div><br/>
</div>
</div>
</body>
</html>

View File

@@ -0,0 +1,2 @@
pydub
pipecat-ai[daily]

View File

@@ -53,4 +53,3 @@ async def configure(aiohttp_session: aiohttp.ClientSession):
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token)
return (url, token)

View File

@@ -1,7 +1,12 @@
FROM python:3.10-bullseye
RUN mkdir /app
RUN mkdir /app/assets
RUN mkdir /app/utils
COPY *.py /app/
COPY requirements.txt /app/
WORKDIR /app
RUN pip3 install -r requirements.txt

View File

@@ -0,0 +1,39 @@
# Daily Multi Translation
This example shows how to use Daily to stream multiple simultaneous translations using a single transport. Daily provides custom tracks and in this example we will simultaneously translate incoming audio in English to Spanish, French and German, each of them being sent to a custom track.
## Get started
```python
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp env.example .env # and add your credentials
```
## Run the server
```bash
python server.py
```
Then, visit `http://localhost:7860/` in your browser. This will open a Daily Prebuilt room where you will speak in English (make sure you are not muted).
## Open client
Next, you need to open the client that will listen to the translations.
```bash
open index.html
```
Once the client is opened, copy the URL of the Daily room created above and join it. You should be able to select which translation you want to hear.
## Build and test the Docker image
```
docker build -t daily-multi-translation .
docker run --env-file .env -p 7860:7860 daily-multi-translation
```

View File

@@ -0,0 +1,165 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.audio.mixers.soundfile_mixer import SoundfileMixer
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.observers.loggers.transcription_log_observer import TranscriptionLogObserver
from pipecat.pipeline.parallel_pipeline import ParallelPipeline
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
BACKGROUND_SOUND_FILE = "office-ambience-mono-16000.mp3"
async def main():
async with aiohttp.ClientSession() as session:
(room_url, token) = await configure(session)
transport = DailyTransport(
room_url,
token,
"Multi translation bot",
DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
audio_out_mixer={
"spanish": SoundfileMixer(
sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
),
"french": SoundfileMixer(
sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
),
"german": SoundfileMixer(
sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
),
},
audio_out_destinations=["spanish", "french", "german"],
microphone_out_enabled=False, # Disable since we just use custom tracks
vad_analyzer=SileroVADAnalyzer(),
),
)
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts_spanish = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="cefcb124-080b-4655-b31f-932f3ee743de",
transport_destination="spanish",
)
tts_french = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="8832a0b5-47b2-4751-bb22-6a8e2149303d",
transport_destination="french",
)
tts_german = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="38aabb6a-f52b-4fb0-a3d1-988518f4dc06",
transport_destination="german",
)
messages_spanish = [
{
"role": "system",
"content": "You will be provided with a sentence in English, and your task is to only translate it into Spanish.",
},
]
messages_french = [
{
"role": "system",
"content": "You will be provided with a sentence in English, and your task is to only translate it into French.",
},
]
messages_german = [
{
"role": "system",
"content": "You will be provided with a sentence in English, and your task is to only translate it into German.",
},
]
llm_spanish = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
llm_french = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
llm_german = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
context_spanish = OpenAILLMContext(messages_spanish)
context_aggregator_spanish = llm_spanish.create_context_aggregator(context_spanish)
context_french = OpenAILLMContext(messages_french)
context_aggregator_french = llm_french.create_context_aggregator(context_french)
context_german = OpenAILLMContext(messages_german)
context_aggregator_german = llm_german.create_context_aggregator(context_german)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
ParallelPipeline(
# Spanish pipeline.
[
context_aggregator_spanish.user(),
llm_spanish,
tts_spanish,
context_aggregator_spanish.assistant(),
],
# French pipeline.
[
context_aggregator_french.user(),
llm_french,
tts_french,
context_aggregator_french.assistant(),
],
# German pipeline.
[
context_aggregator_german.user(),
llm_german,
tts_german,
context_aggregator_german.assistant(),
],
),
transport.output(), # Transport bot output
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_in_sample_rate=16000,
audio_out_sample_rate=16000,
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
report_only_initial_ttfb=True,
),
observers=[TranscriptionLogObserver()],
)
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -1,6 +1,5 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=7df...
OPENAI_API_KEY=sk-PL...
ELEVENLABS_API_KEY=aeb...
CANONICAL_API_KEY=can...
CANONICAL_API_URL=
DEEPGRAM_API_KEY=efb...
CARTESIA_API_KEY=aeb...

View File

@@ -0,0 +1,202 @@
<html>
<head>
<title>daily multi translation</title>
</head>
<script crossorigin src="https://unpkg.com/@daily-co/daily-js"></script>
<script
src="https://code.jquery.com/jquery-3.1.1.min.js"
integrity="sha256-hVVnYaiADRTO2PzUGmuLJr8BLUSjGIZsDYGmIJLv2b8="
crossorigin="anonymous"
></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.js"></script>
<link
rel="stylesheet"
type="text/css"
href="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.css"
/>
<script>
function enableButton(buttonId, enable) {
const button = document.getElementById(buttonId);
button.disabled = !enable;
}
function enableJoinButton(enable) {
enableButton("join-button", enable);
}
function enableLeaveButton(enable) {
enableButton("leave-button", enable);
}
function destroyPlayers(query) {
const items = document.querySelectorAll(query);
if (items) {
for (const item of items) {
item.remove();
}
}
}
function destroyParticipantPlayers(participantId) {
destroyPlayers(`video[data-participant-id="${participantId}"]`);
destroyPlayers(`audio[data-participant-id="${participantId}"]`);
destroyPlayers(`button[data-participant-id="${participantId}"]`);
}
async function startPlayer(player, track) {
player.muted = false;
player.autoplay = true;
if (track != null) {
player.srcObject = new MediaStream([track]);
}
}
async function buildVideoPlayer(track, participantId) {
const videoContainer = document.getElementById("video-container");
const player = document.createElement("video");
player.dataset.participantId = participantId;
videoContainer.appendChild(player);
await startPlayer(player, track);
await player.play();
return player;
}
async function buildAudioPlayer(track, participantId) {
const audioContainer = document.getElementById("audio-container");
const player = document.createElement("audio");
player.dataset.participantId = participantId;
// Create a new button for controlling audio
const audioControlButton = document.createElement("button");
audioControlButton.className = "ui primary green button"
audioControlButton.innerText = track._mediaTag == "cam-audio" ? "english" : track._mediaTag;
audioControlButton.dataset.participantId = participantId;
audioControlButton.onclick = () => {
if (player.paused) {
player.play();
audioControlButton.className = "ui primary red button"
} else {
player.pause();
audioControlButton.className = "ui primary green button"
}
};
audioContainer.appendChild(player);
audioContainer.appendChild(audioControlButton);
await startPlayer(player, track);
player.pause()
return player;
}
function subscribeToTracks(participantId) {
console.log(`subscribing to track`);
if (participantId === "local") {
return;
}
callObject.updateParticipant(participantId, {
setSubscribedTracks: {
audio: true,
video: true,
custom: true,
},
});
}
function startDaily() {
enableJoinButton(true);
enableLeaveButton(false);
window.callObject = window.DailyIframe.createCallObject({});
callObject.on("participant-joined", (e) => {
if (!e.participant.local) {
console.log("participant-joined", e.participant);
subscribeToTracks(e.participant.session_id);
}
});
callObject.on("participant-left", (e) => {
console.log("participant-left", e.participant.session_id);
destroyParticipantPlayers(e.participant.session_id);
});
callObject.on("track-started", async (e) => {
console.log("track-started", e.track);
if (e.track.kind === "video") {
await buildVideoPlayer(e.track, e.participant.session_id);
} else if (e.track.kind === "audio") {
await buildAudioPlayer(e.track, e.participant.session_id);
}
});
}
async function joinRoom() {
enableJoinButton(false);
enableLeaveButton(true);
const meetingUrl = document.getElementById("meeting-url").value;
callObject.join({
url: meetingUrl,
startVideoOff: true,
startAudioOff: true,
subscribeToTracksAutomatically: false,
receiveSettings: {
base: { video: { layer: 0 } },
},
});
}
async function leaveRoom() {
enableJoinButton(true);
enableLeaveButton(false);
callObject.leave();
const videoContainer = document.getElementById("video-container");
videoContainer.replaceChildren();
const audioContainer = document.getElementById("audio-container");
audioContainer.replaceChildren();
}
</script>
<body onload="startDaily()">
<div class="ui centered page grid" style="margin-top: 30px">
<div class="ten wide column">
<div class="ui form" style="margin-top: 30px">
<div class="field">
<label>Meeting URL</label>
<input id="meeting-url" value="" />
</div>
</div>
</div>
</div>
<div class="ui centered aligned header" style="margin-top: 30px">
<button id="join-button" class="ui primary button" onclick="joinRoom()">
Join
</button>
<button id="leave-button" class="ui button" onclick="leaveRoom()">
Leave
</button>
</div>
<div id="tile" class="ui container" style="margin-top: 30px">
<div id="tile" class="ui center aligned grid">
<div id="audio-container"></div><br/>
</div>
</div>
<div id="tile" class="ui container" style="margin-top: 30px">
<div id="tile" class="ui center aligned grid">
<div id="video-container" class="ui segment"></div>
</div>
</div>
</body>
</html>

View File

@@ -0,0 +1,5 @@
aiofiles
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,deepgram,openai,silero,cartesia]

View File

@@ -0,0 +1,55 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument(
"-u", "--url", type=str, required=False, help="URL of the Daily room to join"
)
parser.add_argument(
"-k",
"--apikey",
type=str,
required=False,
help="Daily API Key (needed to create an owner token for the room)",
)
args, unknown = parser.parse_known_args()
url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
key = args.apikey or os.getenv("DAILY_API_KEY")
if not url:
raise Exception(
"No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
)
if not key:
raise Exception(
"No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
)
daily_rest_helper = DailyRESTHelper(
daily_api_key=key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
# Create a meeting token for the given room with an expiration 1 hour in
# the future.
expiry_time: float = 60 * 60
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token)

View File

@@ -1,3 +1,9 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import asyncio
import os
@@ -12,8 +18,8 @@ from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
@@ -35,8 +41,7 @@ async def main(room_url: str, token: str):
api_key=daily_api_key,
audio_in_enabled=True,
audio_out_enabled=True,
camera_out_enabled=False,
vad_enabled=True,
video_out_enabled=False,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
),
@@ -47,7 +52,7 @@ async def main(room_url: str, token: str):
voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
@@ -70,7 +75,7 @@ async def main(room_url: str, token: str):
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
@@ -79,11 +84,13 @@ async def main(room_url: str, token: str):
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
await task.queue_frame(EndFrame())
await task.cancel()
@transport.event_handler("on_call_state_updated")
async def on_call_state_updated(transport, state):
if state == "left":
# Here we don't want to cancel, we just want to finish sending
# whatever is queued, so we use an EndFrame().
await task.queue_frame(EndFrame())
runner = PipelineRunner()

View File

@@ -1,3 +1,9 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp

View File

@@ -1,3 +1,9 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
@@ -5,6 +11,15 @@ import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
@@ -12,33 +27,23 @@ logger.add(sys.stderr, level="DEBUG")
async def main(room_url: str, token: str):
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
transport = DailyTransport(
room_url,
token,
"bot",
DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
transcription_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY", ""), voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22"
api_key=os.getenv("CARTESIA_API_KEY", ""), voice_id="71a7ad14-091c-4e8e-a314-022ece01c121"
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
@@ -63,7 +68,7 @@ async def main(room_url: str, token: str):
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
@@ -79,7 +84,7 @@ async def main(room_url: str, token: str):
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
await task.queue_frame(EndFrame())
await task.cancel()
runner = PipelineRunner()

View File

@@ -1,5 +1,4 @@
python-dotenv==1.0.1
modal==0.71.3
pipecat-ai[daily,silero,cartesia,openai]==0.0.52
pipecat-ai[daily,silero,cartesia,openai]
fastapi==0.115.6
aiohttp==3.11.11

View File

@@ -0,0 +1,178 @@
# Handling PSTN/SIP Dial-in on Pipecat Cloud
This repository contains two server implementations for handling
the pinless dial-in workflow in Pipecat Cloud. This is the companion to the
Pipecat Cloud [pstn_sip starter image](https://github.com/daily-co/pipecat-cloud-images/tree/main/pipecat-starters/pstn_sip).
In addition you can use `/api/dial` to trigger dial-out, and
eventually, call-transfers.
1. [FastAPI Server](fastapi-webhook-server/README.md) -
A FastAPI implementation that handles PSTN (Public Switched Telephone
Network) and SIP (Session Initiation Protocol) calls using the Daily API.
2. [Next.js Serverless](nextjs-webhook-server/README.md) -
A Next.js API implementation designed for deployment on Vercel's
serverless platform.
Both implementations provide:
- HMAC signature validation for pinless webhook
- Structured logging
- Support for dial-in and dial-out settings
- Voicemail detection and call transfer functionality (coming soon)
- Test request handling
## Choosing an Implementation
- Use the **FastAPI Server** if you:
- Need a standalone server
- Prefer Python and FastAPI
- Want to deploy to traditional hosting platforms
- Use the **Next.js Serverless** implementation if you:
- Want serverless deployment
- Prefer JavaScript/TypeScript
- Already use Next.js and Vercel for other projects
- Need quick scaling and zero maintenance
## Prerequisites
### Environment Variables
Both implementations require similar environment variables:
- `PIPECAT_CLOUD_API_KEY`: Pipecat Cloud API Key, begins with pk\_\*
- `AGENT_NAME`: Your Daily agent name
- `PINLESS_HMAC_SECRET`: Your HMAC secret for request verification
- `LOG_LEVEL`: (Optional) Logging level (defaults to 'info')
See the individual README files in each implementation directory for
specific setup instructions.
### Phone number setup
You can buy a phone number through the Pipecat Cloud Dashboard:
1. Go to `Settings` > `Telephony`
2. Follow the UI to purchase a phone number
3. Configure the webhook URL to receive incoming calls (e.g. `https://my-webhook-url.com/api/dial`)
Or purchase the number using Daily's
[PhoneNumbers API](https://docs.daily.co/reference/rest-api/phone-numbers).
```bash
curl --request POST \
--url https://api.daily.co/v1/domain-dialin-config \
--header 'Authorization: Bearer $TOKEN' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "pinless_dialin",
"name_prefix": "Customer1",
"phone_number": "+1PURCHASED_NUM",
"room_creation_api": "https://example.com/api/dial",
"hold_music_url": "https://example.com/static/ringtone.mp3",
"timeout_config": {
"message": "No agent is available right now"
}
}'
```
The API will return a static SIP URI (`sip_uri`) that can be called
from other SIP services.
### `room_creation_api`
To make and receive calls currently you have to host a server that
handles incoming calls. In the coming weeks, incoming calls will be
directly handled within Daily and we will expose an endpoint similar
to `{service}/start` that will manage this for you.
In the meantime, the server described below serves as the webhook
handler for the `room_creation_api`. Configure your pinless phone
number or SIP interconnect to the `ngrok` tunnel or
the actual server URL, append `/api/dial` to the webhook URL.
## Example curl commands
Note: Replace `http://localhost:3000` with your actual server URL and
phone numbers with valid values for your use case.
### Dialin Request
The server will receive a request when a call is received from Daily.
### Dialout Request
Dial a number, will use any purchased number
```bash
curl -X POST http://localhost:3000/api/dial \
-H "Content-Type: application/json" \
-d '{
"dialout_settings": [
{
"phoneNumber": "+1234567890",
}
]
}'
```
Dial a number with callerId, which is the UUID of a purchased number.
```bash
curl -X POST http://localhost:3000/api/dial \
-H "Content-Type: application/json" \
-d '{
"dialout_settings": [
{
"phoneNumber": "+1234567890",
"callerId": "purchased_phone_uuid"
}
]
}'
```
Dial a number
```bash
curl -X POST http://localhost:3000/api/dial \
-H "Content-Type: application/json" \
-d '{
"dialout_settings": [
{
"phoneNumber": "+1234567890",
"callerId": "purchased_phone_uuid"
}
]
}'
```
### Advanced Request with Voicemail Detection
```bash
curl -X POST http://localhost:3000/api/dial \
-H "Content-Type: application/json" \
-d '{
"To": "+1234567890",
"From": "+1987654321",
"callId": "call-uuid-123",
"callDomain": "domain-uuid-456",
"dialout_settings": [
{
"phoneNumber": "+1234567890",
"callerId": "purchased_phone_uuid"
}
],
"voicemail_detection": {
"testInPrebuilt": true
},
"call_transfer": {
"mode": "dialout",
"speakSummary": true,
"storeSummary": true,
"operatorNumber": "+1234567890",
"testInPrebuilt": true
}
}'
```

View File

@@ -0,0 +1,98 @@
# FastAPI server for handling Daily PSTN/SIP Webhook
A FastAPI server that handles PSTN (Public Switched Telephone Network) and SIP (Session Initiation Protocol) calls using the Daily API.
## Setup
1. Clone the repository
2. Navigate to the `fastapi-webhook-server` directory:
```bash
cd fastapi-webhook-server
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Copy `env.example` to `.env`:
```bash
cp env.example .env
```
5. Update `.env` with your credentials:
- `AGENT_NAME`: Your Daily agent name
- `PIPECAT_CLOUD_API_KEY`: Your Daily API key
- `PINLESS_HMAC_SECRET`: Your HMAC secret for request verification
## Running the Server
Start the server:
```bash
python server.py
```
The server will run on `http://localhost:7860` and you can expose it via ngrok for testing:
```bash
`ngrok http 7860`
```
> Tip: Use a subdomain for a consistent URL (e.g. `ngrok http -subdomain=mydomain http://localhost:7860`)
## API Endpoints
### GET /
Health check endpoint that returns a "Hello, World!" message.
### POST /api/dial
Initiates a PSTN/SIP call with the following request body format:
```json
{
"To": "+14152251493",
"From": "+14158483432",
"callId": "string-contains-uuid",
"callDomain": "string-contains-uuid",
"dialout_settings": [
{
"phoneNumber": "+14158483432",
"callerId": "+14152251493"
}
],
"voicemail_detection": {
"testInPrebuilt": true
},
"call_transfer": {
"mode": "dialout",
"speakSummary": true,
"storeSummary": true,
"operatorNumber": "+14152250006",
"testInPrebuilt": true
}
}
```
#### Response
Returns a JSON object containing:
- `status`: Success/failure status
- `data`: Response from Daily API
- `room_properties`: Properties of the created Daily room
## Error Handling
- 401: Invalid signature
- 400: Invalid authorization header (e.g. missing Daily API key in bot.py)
- 405: Method not allowed (e.g. incorrect route on the webhook URL)
- 500: Server errors (missing API key, network issues)
- Other status codes are passed through from the Daily API

View File

@@ -0,0 +1,3 @@
AGENT_NAME="your-agent-name"
PIPECAT_CLOUD_API_KEY="your-daily-api-key"
PINLESS_HMAC_SECRET="hmac-secret-pinless-dialin"

View File

@@ -0,0 +1,6 @@
fastapi
uvicorn
python-dotenv
requests
pydantic
loguru

View File

@@ -0,0 +1,202 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
# server.py
import base64 # for calculating hmac signature
import hmac
import os # for accessing environment variables
import time # for setting expiration time
from typing import Any, Dict, List, Optional
import requests
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from loguru import logger
from pydantic import BaseModel, Field
load_dotenv(override=True)
app = FastAPI()
class RoomRequest(BaseModel):
test: Optional[str] = Field(None, alias="Test", description="Test field")
To: Optional[str] = Field(None, alias="to", description="Destination phone number")
From: Optional[str] = Field(None, alias="from", description="Source phone number")
callId: Optional[str] = Field(None, alias="call_id", description="Unique call identifier")
callDomain: Optional[str] = Field(
None, alias="call_domain", description="Call domain identifier"
)
dialout_settings: Optional[List[Dict[str, Any]]] = Field(
None, description="An array of phone numbers or SIP URIs to dialout to"
)
voicemail_detection: Optional[Dict[str, Any]] = Field(
None, description="A flag to perform voicemail or answeing-machine detection"
)
call_transfer: Optional[Dict[str, Any]] = Field(None, description="to initiate a call transfer")
class Config:
populate_by_name = True
alias_generator = None
"""
body can contain any fields, but for handling PSTN/SIP,
we recommend sending the following custom values:
dialin, dialout, voicemail detection, and call transfer
"To": "+14152251493",
"From": "+14158483432",
"callId": "string-contains-uuid",
"callDomain": "string-contains-uuid"
These need to be remapped to dialin_settings
"dialout_settings": [
{"phoneNumber": "+14158483432", "callerId": "+14152251493"},
{"sipUri": "sip:username@sip.hostname"}
],
},
voicemail_detection:{
testInPrebuilt: true
},
"call_transfer": {
"mode": "dialout",
"speakSummary": true,
"storeSummary": true,
"operatorNumber": "+14152250006",
"testInPrebuilt": true
}
"""
@app.get("/")
async def read_root():
return {"message": "Hello, World!"}
@app.post("/api/dial")
async def dial(request: RoomRequest, raw_request: Request):
logger.info("Incoming request to /dial:")
logger.info(f"Headers: {dict(raw_request.headers)}")
raw_body = await raw_request.body()
raw_body_str = raw_body.decode()
logger.info(f"Raw body: {raw_body_str}")
logger.info(f"Parsed body: {request.dict()}")
# calculate signature and compare/verify
hmac_secret = os.getenv("PINLESS_HMAC_SECRET")
timestamp = raw_request.headers.get("x-pinless-timestamp")
signature = raw_request.headers.get("x-pinless-signature")
if not hmac_secret:
logger.debug("Skipping HMAC validation - PINLESS_HMAC_SECRET not set")
elif timestamp and signature:
message = timestamp + "." + raw_body_str
base64_decoded_secret = base64.b64decode(hmac_secret)
computed_signature = base64.b64encode(
hmac.new(base64_decoded_secret, message.encode(), "sha256").digest()
).decode()
if computed_signature != signature:
logger.error(f"Invalid signature. Expected {signature}, got {computed_signature}")
raise HTTPException(status_code=401, detail="Invalid signature")
else:
logger.debug("Skipping HMAC validation - no signature headers present")
if request.test == "test":
logger.debug("Test request received")
return {"status": "success", "message": "Test request received"}
dialin_settings = None
# these fields are camelCase in the request
required_fields = ["To", "From", "callId", "callDomain"]
if all(
field in request.dict() and request.dict()[field] is not None for field in required_fields
):
# transform from camelCase to snake_case because daily-python expects snake_case
dialin_settings = {
"From": request.From,
"To": request.To,
"call_id": request.callId,
"call_domain": request.callDomain,
# transform from camelCase to snake_case
}
logger.debug(f"Populated dialin_settings from request: {dialin_settings}")
daily_room_properties = {
"enable_dialout": request.dialout_settings is not None,
}
if dialin_settings is not None:
sip_config = {
"display_name": request.From,
"sip_mode": "dial-in",
"num_endpoints": 2 if request.call_transfer is not None else 1,
"codecs": {"audio": ["OPUS"]},
}
daily_room_properties["sip"] = sip_config
# Setting default expiry to 5 minutes from now
daily_room_properties["exp"] = int(time.time()) + (5 * 60)
logger.debug(f"Daily room properties: {daily_room_properties}")
payload = {
"createDailyRoom": True,
"dailyRoomProperties": daily_room_properties,
"body": {
"dialin_settings": dialin_settings,
"dialout_settings": request.dialout_settings,
"voicemail_detection": request.voicemail_detection,
"call_transfer": request.call_transfer,
},
}
pcc_api_key = os.getenv("PIPECAT_CLOUD_API_KEY")
agent_name = os.getenv("AGENT_NAME", "my-first-agent")
if not pcc_api_key:
raise HTTPException(status_code=500, detail="DAILY_API_KEY environment variable is not set")
headers = {"Authorization": f"Bearer {pcc_api_key}", "Content-Type": "application/json"}
url = f"https://api.pipecat.daily.co/v1/public/{agent_name}/start"
logger.debug(f"Making API call to Daily: {url} {headers} {payload}")
try:
response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()
response_data = response.json()
logger.debug(f"Response: {response_data}")
return {
"status": "success",
"data": response_data,
"room_properties": daily_room_properties,
}
except requests.exceptions.HTTPError as e:
# Pass through the status code and error details from the Daily API
status_code = e.response.status_code
error_detail = e.response.json() if e.response.content else str(e)
logger.error(f"HTTP error: {error_detail}")
raise HTTPException(status_code=status_code, detail=error_detail)
except requests.exceptions.RequestException as e:
logger.error(f"Request error: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
if __name__ == "__main__":
try:
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=7860)
except KeyboardInterrupt:
logger.info("Server stopped manually")

View File

@@ -0,0 +1,53 @@
# dependencies
/node_modules
/.pnp
.pnp.js
# testing
/coverage
# next.js
/.next/
/out/
# production
/build
# misc
.DS_Store
*.pem
# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.pnpm-debug.log*
# local env files
.env*.local
# vercel
.vercel
# typescript
*.tsbuildinfo
next-env.d.ts
# IDE specific files
.idea/
.vscode/
*.swp
*.swo
# Logs
logs
*.log
# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

View File

@@ -0,0 +1,115 @@
# Next.js server for handling Daily PSTN/SIP Webhook
Next.js API routes for handling Daily PSTN/SIP Pipecat requests.
## Features
- API endpoint for handling Daily PSTN/SIP Pipecat requests
- HMAC signature validation
- Structured logging with Pino
- Support for dial-in and dial-out settings
- Voicemail detection and call transfer functionality
- Test request handling
## Setup
1. Clone the repository
2. Navigate to the `nextjs-webhook-server` directory:
```bash
cd nextjs-webhook-server
```
3. Install dependencies:
```bash
npm install
```
4. Create `.env.local` file with your credentials:
```bash
cp env.local.example .env.local
```
5. Update your `.env` with your secrets:
```bash
PIPECAT_CLOUD_API_KEY=pk_*
AGENT_NAME=my-first-agent
PINLESS_HMAC_SECRET=your_hmac_secret
LOG_LEVEL=info
```
### Running the server
Run the development server:
```bash
npm run dev
```
The server will run on `http://localhost:7860` and you can expose it via ngrok for testing:
```bash
`ngrok http 7860`
```
> Tip: Use a subdomain for a consistent URL (e.g. `ngrok http -subdomain=mydomain http://localhost:7860`)
## API Endpoints
### GET /api
Returns a simple "Hello, World!" message with a cute cat emoji to verify the server is running.
### POST /api/dial
Handles dial-in and dial-out requests for Pipecat Cloud.
#### Test Requests
The endpoint handles test requests when a webhook is configured. Send a request with `"Test": "test"` to verify your setup:
```json
{
"Test": "test"
}
```
#### Production Request Format
```json
{
// for dial-in from webhook
"To": "+14152251493",
"From": "+14158483432",
"callId": "string-contains-uuid",
"callDomain": "string-contains-uuid",
// for making a dial out to a phone or SIP
"dialout_settings": [
{ "phoneNumber": "+14158483432", "callerId": "purchased_phone_uuid" },
{ "sipUri": "sip:username@sip.hostname.com" }
]
}
```
## Deployment
The application is configured for Vercel deployment:
1. Push your code to a Git repository
2. Import your project in Vercel dashboard
3. Configure environment variables:
- `PIPECAT_CLOUD_API_KEY`
- `AGENT_NAME`
- `PINLESS_HMAC_SECRET`
- `LOG_LEVEL` (optional, defaults to 'info')
4. Deploy!
## Security
- HMAC signature validation for request authentication
- Environment variables for sensitive credentials
- Method validation (POST only for /dial)

View File

@@ -0,0 +1,4 @@
AGENT_NAME=my-first-agent
PIPECAT_CLOUD_API_KEY=your_daily_api_key
PINLESS_HMAC_SECRET=your_hmac_secret
LOG_LEVEL="info"

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,22 @@
{
"name": "my-daily-app",
"version": "0.1.0",
"private": true,
"scripts": {
"dev": "next dev -p 7860",
"build": "next build",
"start": "next start -p 7860",
"lint": "next lint"
},
"dependencies": {
"axios": "^1.6.0",
"next": "^14.0.0",
"pino": "^8.15.0",
"react": "^18.2.0",
"react-dom": "^18.2.0"
},
"devDependencies": {
"eslint": "^8.46.0",
"eslint-config-next": "^14.0.0"
}
}

View File

@@ -0,0 +1,176 @@
import { logger } from '../../lib/utils';
import axios from 'axios';
import crypto from 'crypto';
const validateSignature = (body, signature, timestamp, secret) => {
// Skip if any required fields are missing
if (!signature || !timestamp || !secret) {
logger.warn('Missing required fields for HMAC validation');
return true;
}
try {
const decodedSecret = Buffer.from(secret, 'base64');
const hmac = crypto.createHmac('sha256', decodedSecret);
const signatureData = `${timestamp}.${body}`;
const computedSignature = hmac.update(signatureData).digest('base64');
logger.debug('Signature validation:', {
timestamp,
signatureData: signatureData.substring(0, 50) + '...',
computedSignature,
receivedSignature: signature
});
return computedSignature === signature;
} catch (error) {
logger.error('Error validating signature:', error);
return true; // Allow request to proceed on error
}
};
export default async function handler(req, res) {
// Only allow POST requests
if (req.method !== 'POST') {
return res.status(405).json({ error: 'Method not allowed' });
}
try {
logger.info('Incoming request to /api/dial:');
logger.info(`Headers: ${JSON.stringify(req.headers)}`);
const rawBody = JSON.stringify(req.body);
logger.info(`Raw body: ${rawBody}`);
const signature = req.headers['x-pinless-signature'];
const timestamp = req.headers['x-pinless-timestamp'];
if (signature && timestamp) {
logger.info('Validating HMAC signature');
if (!validateSignature(rawBody, signature, timestamp, process.env.PINLESS_HMAC_SECRET)) {
logger.error('Invalid HMAC signature', { signature, timestamp });
return res.status(401).json({
error: 'Invalid signature',
message: 'Invalid HMAC signature'
});
}
} else {
logger.info('Skipping HMAC validation - no signature headers present');
}
// Extract request data
const {
Test: test,
To,
From,
callId,
callDomain,
dialout_settings,
voicemail_detection,
call_transfer
} = req.body;
// Handle test requests when a webhook is configured
if (test === 'test') {
logger.debug('Test request received');
return res.status(200).json({ status: 'success', message: 'Test request received' });
}
// Process dialin settings
let dialin_settings = null;
const requiredFields = ['To', 'From', 'callId', 'callDomain'];
if (requiredFields.every(field => req.body[field] !== undefined && req.body[field] !== null)) {
dialin_settings = {
// snake_case because pipecat expects this format
From,
To,
call_id: callId,
call_domain: callDomain,
};
logger.debug(`Populated dialin_settings from request: ${JSON.stringify(dialin_settings)}`);
}
// Set up Daily room properties
const daily_room_properties = {
enable_dialout: dialout_settings !== undefined && dialout_settings !== null,
exp: Math.floor(Date.now() / 1000) + (5 * 60), // 5 minutes from now
};
// Configure SIP if dialin settings are provided
if (dialin_settings !== null) {
const sip_config = {
display_name: From,
sip_mode: 'dial-in',
num_endpoints: call_transfer !== null ? 2 : 1,
codecs: {"audio": ["OPUS"]},
};
daily_room_properties.sip = sip_config;
}
// Prepare payload for {service}/start API call
const payload = {
createDailyRoom: true,
dailyRoomProperties: daily_room_properties,
body: {
dialin_settings,
dialout_settings,
voicemail_detection,
call_transfer,
},
};
logger.debug(`Daily room properties: ${JSON.stringify(daily_room_properties)}`);
// Get Daily API key and agent name from environment variables
const pccApiKey = process.env.PIPECAT_CLOUD_API_KEY;
const agentName = process.env.AGENT_NAME || 'my-first-agent';
if (!pccApiKey) {
throw new Error('PIPECAT_CLOUD_API_KEY environment variable is not set');
}
// Set up headers for Daily API call
const headers = {
'Authorization': `Bearer ${pccApiKey}`,
'Content-Type': 'application/json',
};
const url = `https://api.pipecat.daily.co/v1/public/${agentName}/start`;
logger.debug(`Making API call to Daily: ${url} ${JSON.stringify(headers)} ${JSON.stringify(payload)}`);
try {
const response = await axios.post(url, payload, { headers });
logger.debug(`Response: ${JSON.stringify(response.data)}`);
return res.status(200).json({
status: 'success',
data: response.data,
room_properties: daily_room_properties,
});
} catch (error) {
if (error.response) {
// Pass through status code and error details from the Daily API
const statusCode = error.response.status;
const errorDetail = error.response.data || error.message;
logger.error(`HTTP error: ${JSON.stringify(errorDetail)}`);
return res.status(statusCode).json(errorDetail);
} else {
logger.error(`Request error: ${error.message}`);
return res.status(500).json({ error: error.message });
}
}
} catch (error) {
logger.error(`Unexpected error: ${error.message}`);
return res.status(500).json({ error: 'Internal server error', message: error.message });
}
}
// Configure body parser to preserve raw body text
export const config = {
api: {
bodyParser: {
sizeLimit: '1mb',
},
},
};

View File

@@ -0,0 +1,6 @@
import { logger } from '../../lib/utils';
export default function handler(req, res) {
logger.info('Received request to /api');
res.status(200).json({ message: 'Hello, World! from ᓚᘏᗢ' });
}

View File

@@ -0,0 +1,6 @@
module.exports = {
version: 2,
buildCommand: "next build",
outputDirectory: ".next",
cleanUrls: true
};

View File

@@ -0,0 +1,94 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
dist/
*.egg-info/
*.egg
.installed.cfg
.eggs/
downloads/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
MANIFEST
# Virtual Environments
venv/
env/
.env
.venv/
ENV/
env.bak/
venv.bak/
# IDE
.idea/
.vscode/
.spyderproject
.spyproject
.ropeproject
# Testing and Coverage
.coverage
.coverage.*
htmlcov/
.pytest_cache/
.tox/
.nox/
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
cover/
# Logs and Databases
*.log
*.db
db.sqlite3
db.sqlite3-journal
pip-log.txt
# System Files
.DS_Store
Thumbs.db
desktop.ini
*.swp
*.swo
*.bak
*.tmp
*~
# Build and Documentation
docs/_build/
.pybuilder/
target/
instance/
.webassets-cache
.pdm.toml
.pdm-python
.pdm-build/
__pypackages__/
# Other
*.mo
*.pot
*.sage.py
.mypy_cache/
.dmypy.json
dmypy.json
.pyre/
.pytype/
cython_debug/
.ipynb_checkpoints
# Pipecat cloud
.pcc-deploy.toml

View File

@@ -0,0 +1,7 @@
FROM dailyco/pipecat-base:latest
COPY ./requirements.txt requirements.txt
RUN pip install --no-cache-dir --upgrade -r requirements.txt
COPY ./bot.py bot.py

View File

@@ -0,0 +1,196 @@
# Pipecat Cloud Starter Project
[![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.daily.co) [![Discord](https://img.shields.io/discord/1217145424381743145)](https://discord.gg/dailyco)
A template voice agent for [Pipecat Cloud](https://www.daily.co/products/pipecat-cloud/) that demonstrates building and deploying a conversational AI agent.
> **For a detailed step-by-step guide, see our [Quickstart Documentation](https://docs.pipecat.daily.co/quickstart).**
## Prerequisites
- Python 3.10+
- Linux, MacOS, or Windows Subsystem for Linux (WSL)
- [Docker](https://www.docker.com) and a Docker repository (e.g., [Docker Hub](https://hub.docker.com))
- A Docker Hub account (or other container registry account)
- [Pipecat Cloud](https://pipecat.daily.co) account
> **Note**: If you haven't installed Docker yet, follow the official installation guides for your platform ([Linux](https://docs.docker.com/engine/install/), [Mac](https://docs.docker.com/desktop/setup/install/mac-install/), [Windows](https://docs.docker.com/desktop/setup/install/windows-install/)). For Docker Hub, [create a free account](https://hub.docker.com/signup) and log in via terminal with `docker login`.
## Get Started
### 1. Get the starter project
Clone the starter project from GitHub:
```bash
git clone https://github.com/daily-co/pipecat-cloud-starter
cd pipecat-cloud-starter
```
### 2. Set up your Python environment
We recommend using a virtual environment to manage your Python dependencies.
```bash
# Create a virtual environment
python -m venv .venv
# Activate it
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install the Pipecat Cloud CLI
pip install pipecatcloud
```
### 3. Authenticate with Pipecat Cloud
```bash
pcc auth login
```
### 4. Acquire required API keys
This starter requires the following API keys:
- **OpenAI API Key**: Get from [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
- **Cartesia API Key**: Get from [play.cartesia.ai/keys](https://play.cartesia.ai/keys)
- **Daily API Key**: Automatically provided through your Pipecat Cloud account
### 5. Configure to run locally (optional)
You can test your agent locally before deploying to Pipecat Cloud:
```bash
# Set environment variables with your API keys
export CARTESIA_API_KEY="your_cartesia_key"
export DAILY_API_KEY="your_daily_key"
export OPENAI_API_KEY="your_openai_key"
```
> Your `DAILY_API_KEY` can be found at [https://pipecat.daily.co](https://pipecat.daily.co) under the `Settings` in the `Daily (WebRTC)` tab.
First install requirements:
```bash
pip install -r requirements.txt
```
Then, launch the bot.py script locally:
```bash
LOCAL_RUN=1 python bot.py
```
## Deploy & Run
### 1. Build and push your Docker image
```bash
# Build the image (targeting ARM architecture for cloud deployment)
docker build --platform=linux/arm64 -t my-first-agent:latest .
# Tag with your Docker username and version
docker tag my-first-agent:latest your-username/my-first-agent:0.1
# Push to Docker Hub
docker push your-username/my-first-agent:0.1
```
### 2. Create a secret set for your API keys
The starter project requires API keys for OpenAI and Cartesia:
```bash
# Copy the example env file
cp env.example .env
# Edit .env to add your API keys:
# CARTESIA_API_KEY=your_cartesia_key
# OPENAI_API_KEY=your_openai_key
# Create a secret set from your .env file
pcc secrets set my-first-agent-secrets --file .env
```
Alternatively, you can create secrets directly via CLI:
```bash
pcc secrets set my-first-agent-secrets \
CARTESIA_API_KEY=your_cartesia_key \
OPENAI_API_KEY=your_openai_key
```
### 3. Deploy to Pipecat Cloud
```bash
pcc deploy my-first-agent your-username/my-first-agent:0.1 --secrets my-first-agent-secrets
```
> **Note (Optional)**: For a more maintainable approach, you can use the included `pcc-deploy.toml` file:
>
> ```toml
> agent_name = "my-first-agent"
> image = "your-username/my-first-agent:0.1"
> secret_set = "my-first-agent-secrets"
>
> [scaling]
> min_instances = 0
> ```
>
> Then simply run `pcc deploy` without additional arguments.
> **Note**: If your repository is private, you'll need to add credentials:
>
> ```bash
> # Create pull secret (youll be prompted for credentials)
> pcc secrets image-pull-secret pull-secret https://index.docker.io/v1/
>
> # Deploy with credentials
> pcc deploy my-first-agent your-username/my-first-agent:0.1 --credentials pull-secret
> ```
### 4. Check deployment and scaling (optional)
By default, your agent will use "scale-to-zero" configuration, which means it may have a cold start of around 10 seconds when first used. By default, idle instances are maintained for 5 minutes before being terminated when using scale-to-zero.
For more responsive testing, you can scale your deployment to keep a minimum of one instance warm:
```bash
# Ensure at least one warm instance is always available
pcc deploy my-first-agent your-username/my-first-agent:0.1 --min-instances 1
# Check the status of your deployment
pcc agent status my-first-agent
```
By default, idle instances are maintained for 5 minutes before being terminated when using scale-to-zero.
### 5. Create an API key
```bash
# Create a public API key for accessing your agent
pcc organizations keys create
# Set it as the default key to use with your agent
pcc organizations keys use
```
### 6. Start your agent
```bash
# Start a session with your agent in a Daily room
pcc agent start my-first-agent --use-daily
```
This will return a URL, which you can use to connect to your running agent.
## Documentation
For more details on Pipecat Cloud and its capabilities:
- [Pipecat Cloud Documentation](https://docs.pipecat.daily.co)
- [Pipecat Project Documentation](https://docs.pipecat.ai)
## Support
Join our [Discord community](https://discord.gg/dailyco) for help and discussions.

View File

@@ -0,0 +1,165 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecatcloud.agent import DailySessionArguments
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMMessagesFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
# Load environment variables
load_dotenv(override=True)
# Check if we're in local development mode
LOCAL_RUN = os.getenv("LOCAL_RUN")
async def main(transport: DailyTransport):
"""Main pipeline setup and execution function.
Args:
transport: The DailyTransport object for the bot
"""
logger.debug("Starting bot")
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"), voice_id="71a7ad14-091c-4e8e-a314-022ece01c121"
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[
transport.input(),
context_aggregator.user(),
llm,
tts,
transport.output(),
context_aggregator.assistant(),
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
report_only_initial_ttfb=True,
),
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
logger.info("First participant joined: {}", participant["id"])
await transport.capture_participant_transcription(participant["id"])
# Kick off the conversation.
messages.append(
{
"role": "system",
"content": "Please start with 'Hello World' and introduce yourself to the user.",
}
)
await task.queue_frames([LLMMessagesFrame(messages)])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
logger.info("Participant left: {}", participant)
await task.cancel()
runner = PipelineRunner()
await runner.run(task)
async def bot(args: DailySessionArguments):
"""Main bot entry point compatible with the FastAPI route handler.
Args:
room_url: The Daily room URL
token: The Daily room token
body: The configuration object from the request body
session_id: The session ID for logging
"""
from pipecat.audio.filters.krisp_filter import KrispFilter
logger.info(f"Bot process initialized {args.room_url} {args.token}")
transport = DailyTransport(
args.room_url,
args.token,
"Pipecat Bot",
DailyParams(
audio_in_enabled=True,
audio_in_filter=None if LOCAL_RUN else KrispFilter(),
audio_out_enabled=True,
transcription_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
)
try:
await main(transport)
logger.info("Bot process completed")
except Exception as e:
logger.exception(f"Error in bot process: {str(e)}")
raise
# Local development functions
async def local_daily():
"""Function for local development testing."""
from local_runner import configure
try:
async with aiohttp.ClientSession() as session:
(room_url, token) = await configure(session)
transport = DailyTransport(
room_url,
token,
"Pipecat Bot",
DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
transcription_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
)
await main(transport)
except Exception as e:
logger.exception(f"Error in local development mode: {e}")
# Local development entry point
if LOCAL_RUN and __name__ == "__main__":
try:
asyncio.run(local_daily())
except Exception as e:
logger.exception(f"Failed to run in local mode: {e}")

View File

@@ -0,0 +1,4 @@
CARTESIA_API_KEY=
OPENAI_API_KEY=
# Local dev only
DAILY_API_KEY=

View File

@@ -0,0 +1,47 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp
from fastapi import HTTPException
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
async def configure(aiohttp_session: aiohttp.ClientSession):
(url, token) = await configure_with_args(aiohttp_session)
return (url, token)
async def configure_with_args(aiohttp_session: aiohttp.ClientSession = None):
key = os.getenv("DAILY_API_KEY")
if not key:
raise Exception(
"No Daily API key specified. set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
)
daily_rest_helper = DailyRESTHelper(
daily_api_key=key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
room = await daily_rest_helper.create_room(
DailyRoomParams(properties={"enable_prejoin_ui": False})
)
if not room.url:
raise HTTPException(status_code=500, detail="Failed to create room")
url = room.url
# Create a meeting token for the given room with an expiration 1 hour in
# the future.
expiry_time: float = 60 * 60
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token)

View File

@@ -0,0 +1,8 @@
agent_name = "my-first-agent"
image = "your-username/my-first-agent:0.1"
image_credentials = "your-dockerhub-creds"
secret_set = "my-first-agent-secrets"
enable_krisp = true
[scaling]
min_instances = 0

View File

@@ -0,0 +1,3 @@
pipecatcloud
pipecat-ai[cartesia,daily,openai,silero]>=0.0.58
python-dotenv~=1.0.1

View File

@@ -1,3 +0,0 @@
**/.DS_Store
.env
.env.*

View File

@@ -1,40 +0,0 @@
FROM python:3.11-bullseye
ARG DEBIAN_FRONTEND=noninteractive
ARG USE_PERSISTENT_DATA
ENV PYTHONUNBUFFERED=1
# Expose FastAPI port
ENV FAST_API_PORT=7860
EXPOSE 7860
# Install system dependencies
RUN apt-get update && apt-get install --no-install-recommends -y \
build-essential \
git \
ffmpeg \
google-perftools \
ca-certificates curl gnupg \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Set up a new user named "user" with user ID 1000
RUN useradd -m -u 1000 user
# Set home to the user's home directory
ENV HOME=/home/user \
PATH=/home/user/.local/bin:$PATH \
PYTHONPATH=$HOME/app \
PYTHONUNBUFFERED=1
# Switch to the "user" user
USER user
# Set the working directory to the user's home directory
WORKDIR $HOME/app
# Install Python dependencies
COPY *.py .
COPY ./requirements.txt requirements.txt
RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
# Start the FastAPI server
CMD python3 bot_runner.py --host "0.0.0.0" --port ${FAST_API_PORT}

View File

@@ -1,85 +0,0 @@
<div align="center">
 <img alt="pipecat" width="300px" height="auto" src="image.png">
</div>
# Dialin example
Example project that demonstrates how to add phone number dialin to your Pipecat bots. We include examples for both Daily (`bot_daily.py`) and Twilio (`bot_twilio.py`), depending on who you want to use as a phone vendor.
- 🔁 Transport: Daily WebRTC
- 💬 Speech-to-Text: Deepgram via Daily transport
- 🤖 LLM: GPT4-o / OpenAI
- 🔉 Text-to-Speech: ElevenLabs
#### Should I use Daily or Twilio as a vendor?
If you're starting from scratch, using Daily to provision phone numbers alongside Daily as a transport offers some convenience (such as automatic call forwarding.)
If you already have Twilio numbers and workflows that you want to connect to your Pipecat bots, there is some additional configuration required (you'll need to create a `on_dialin_ready` and use the Twilio client to trigger the forward.)
You can read more about this, as well as see respective walkthroughs in our docs.
## Setup
```shell
# Install the requirements
pip install -r requirements.txt
# Setup your env
mv env.example .env
```
## Using Daily numbers
Run `bot_runner.py` to handle incoming HTTP requests:
`python bot_runner.py --host localhost`
Then target the following URL:
`POST /daily_start_bot`
For more configuration options, please consult Daily's API documentation.
## Using Twilio numbers
As above, but target the following URL:
`POST /twilio_start_bot`
For more configuration options, please consult Twilio's API documentation.
## Deployment example
A Dockerfile is included in this demo for convenience. Here is an example of how to build and deploy your bot to [fly.io](https://fly.io).
*Please note: This demo spawns agents as subprocesses for convenience / demonstration purposes. You would likely not want to do this in production as it would limit concurrency to available system resources. For more information on how to deploy your bots using VMs, refer to the Pipecat documentation.*
### Build the docker image
`docker build -t tag:project .`
### Launch the fly project
`mv fly.example.toml fly.toml`
`fly launch` (using the included fly.toml)
### Setup your secrets on Fly
Set the necessary secrets (found in `env.example`)
`fly secrets set DAILY_API_KEY=... OPENAI_API_KEY=... ELEVENLABS_API_KEY=... ELEVENLABS_VOICE_ID=...`
If you're using Twilio as a number vendor:
`fly secrets set TWILIO_ACCOUNT_SID=... TWILIO_AUTH_TOKEN=...`
### Deploy!
`fly deploy`
## Need to do something more advanced?
This demo covers the basics of bot telephony. If you want to know more about working with PSTN / SIP, please ping us on [Discord](https://discord.gg/pipecat).

Some files were not shown because too many files have changed in this diff Show More