Compare commits

2317 Commits

Author SHA1 Message Date
Chad Bailey
0369733f9c explicit MetadataFrame 2025-01-31 19:27:01 +00:00
Chad Bailey
74b85a450f wip 2025-01-31 19:27:01 +00:00
Aleix Conchillo Flaqué
bb9a2560c3 Merge pull request #1118 from pipecat-ai/aleix/task-manager
introduce TaskManager
2025-01-31 10:24:52 -08:00
Aleix Conchillo Flaqué
002699f16c rtvi: delay creating tasks until we get StartFrame 2025-01-31 10:06:11 -08:00
chadbailey59
a17243bc1e More Storybot updates (#1116)
* initial changes for gemini storybot

* storybot updates for gemini

* more storybot updates

* interim interruptible commit

* cleanup

* cleanup

* cleanup

* first draft

* wip

* more storybot fixes

* more storybot updates WIP

* committing before changing the image prompting strategy

* wip

* prompt updating

* cleanup

* cleanup

* cleanup

* readme cleanup

* fixup
2025-01-30 20:13:18 -06:00
Aleix Conchillo Flaqué
d95819746a tests: make sure QueuedFrameProcessor push frames 2025-01-30 13:48:44 -08:00
Aleix Conchillo Flaqué
b65f32e8e1 task: start TaskObserver when tasks can be created
We have to start proxy observer tasks once we know the TaskManager has an event
loop.
2025-01-30 13:46:56 -08:00
Aleix Conchillo Flaqué
0131d0a531 examples: make sure unhandled frames are always pushed 2025-01-30 13:15:49 -08:00
Aleix Conchillo Flaqué
642affb2fe add missing super().process_frame() calls 2025-01-30 13:15:17 -08:00
Aleix Conchillo Flaqué
a145005498 SyncParallelPipeline: cleanup source/sink processors 2025-01-30 13:13:02 -08:00
Aleix Conchillo Flaqué
241f241ed9 SyncParallelPipeline: don't add source/sink processors inside pipeline 2025-01-30 13:12:37 -08:00
Aleix Conchillo Flaqué
85e572e2d8 gladia: cleanup receive messages task 2025-01-30 13:10:47 -08:00
Aleix Conchillo Flaqué
10716e8ec1 utils: protect obj_id() and obj_count() with a lock 2025-01-30 13:10:36 -08:00
Aleix Conchillo Flaqué
41d60a14cc introduce TaskManager and PipelineRunner event loop 2025-01-30 13:10:36 -08:00
Aleix Conchillo Flaqué
e69c065a86 update CHANGELOG and fix formatting 2025-01-30 08:55:29 -08:00
Aleix Conchillo Flaqué
f90c17ab30 Merge pull request #1083 from team-telnyx/creating_telnyx_chatbot
Creating telnyx chatbot
2025-01-30 08:49:20 -08:00
Aleix Conchillo Flaqué
bc4fdd587a Merge pull request #1103 from pipecat-ai/aleix/tts-service-push-silence-before-tts-stop-frame
services(tts): allow pushing silence audio before TTSStoppedFrame
2025-01-30 08:48:41 -08:00
Aleix Conchillo Flaqué
665a6017f9 services(tts): allow pushing silence audio before TTSStoppedFrame 2025-01-30 08:46:56 -08:00
Aleix Conchillo Flaqué
4119d7a115 Merge pull request #1104 from pipecat-ai/aleix/twilio-transport-message-frames
serializers(twilio): handle transport message frames
2025-01-30 08:45:55 -08:00
Aleix Conchillo Flaqué
2634b03ffa serializers(twilio): handle transport message frames 2025-01-30 08:30:09 -08:00
Aleix Conchillo Flaqué
6a50759b9f Merge pull request #1105 from pipecat-ai/aleix/websocket-client
added new websocket client transport
2025-01-30 08:28:26 -08:00
Mark Backman
7982faba67 Merge pull request #1115 from pipecat-ai/mb/elevenlabs-language-fixes
Improve ElevenLabs language checking logic
2025-01-30 10:03:22 -05:00
Mark Backman
2b4bf57c04 Improve ElevenLabs language checking logic 2025-01-30 09:52:36 -05:00
Rafal Skorski
b93e4ab9cb Formatting adjusted and the encoding selection moved from TelnyFrameSerilaizer to websocket_endpoint function in server.py 2025-01-30 12:52:30 +01:00
Dominic Stewart
c140c04b9a Merge pull request #1080 from DominicStewart/dom/voicemail-detection-bot
Add voicemail detection example
2025-01-30 09:20:12 +09:00
Dominic
a7c8d2af8e Removed extra space too 2025-01-30 09:18:29 +09:00
Dominic
f3f520a76a Removed formatting that vs code automatically adds to readme file 2025-01-30 09:17:27 +09:00
Mark Backman
5e0f42a3e0 Merge pull request #1111 from pipecat-ai/mb/gemini-restructure-messages
GoogleLLMContext: Allow _restructure_from_openai_messages to handle c…
2025-01-29 19:06:47 -05:00
Mark Backman
220ce9fd0f GoogleLLMContext: Allow _restructure_from_openai_messages to handle context frames that contain function call data and / or messages 2025-01-29 16:01:39 -05:00
Filipi da Silva Fuchter
5d0486a26f Merge pull request #1008 from pipecat-ai/cutting_initial_words
Avoid cutting off the beginning of the audio
2025-01-29 17:02:40 -03:00
Aleix Conchillo Flaqué
091258f617 improve create_task names 2025-01-29 11:11:40 -08:00
Aleix Conchillo Flaqué
2a1408eb2a transports(websocket server): remove unused variable 2025-01-29 11:11:40 -08:00
Aleix Conchillo Flaqué
6393b41d58 transports(websocket): added WebsocketClientTransport 2025-01-29 11:11:37 -08:00
Filipi Fuchter
2a5728264c Adding missing dependency to openai 2025-01-29 15:52:42 -03:00
Filipi Fuchter
2ef0735462 Adding readme to teach how to use. 2025-01-29 15:45:48 -03:00
Filipi Fuchter
80bbfff4be Merge branch 'main' into cutting_initial_words 2025-01-29 15:36:52 -03:00
Aleix Conchillo Flaqué
4ff68e66b9 Merge pull request #1110 from pipecat-ai/aleix/frame-metadata
frames: added metadata field to Frame class
2025-01-29 10:30:59 -08:00
Aleix Conchillo Flaqué
3a688840fc frames: added metadata field to Frame class 2025-01-29 09:53:21 -08:00
Aleix Conchillo Flaqué
2ca8b95bbf Merge pull request #1106 from Vaibhav159/vl_moving_test_utils_to_pipecat_package
moving test utils inside of package
2025-01-29 09:44:34 -08:00
Mark Backman
2aafc6bd1d Merge pull request #1107 from AngeloGiacco/angelo/increase-ws-connection
fix: elevenlabs tts increase websocket max message size limit to 16MB
2025-01-29 10:04:42 -05:00
Angelo Giacco
0ff9ef8707 fix: add changelog 2025-01-29 14:27:39 +00:00
Angelo Giacco
596cae994d fix: elevenlabs tts increase websocket max message size limit to 16MB 2025-01-29 13:55:27 +00:00
Dominic
9ad9cb1ff8 Cleaned up formatting 2025-01-29 17:36:08 +09:00
Dominic Stewart
60e800e9ba Merge branch 'main' into dom/voicemail-detection-bot 2025-01-29 17:30:56 +09:00
Dominic
1c8f0ed7da Finalised code and added a bit about this example to the README 2025-01-29 17:27:44 +09:00
Vaibhav159
8407a86532 moving test utils inside of package 2025-01-29 12:46:43 +05:30
Dominic
417d661d28 Updated bot_runner and bot_daily with adjustments necessary to run voicemail detection from bot_daily code 2025-01-29 16:11:45 +09:00
Aleix Conchillo Flaqué
8cd23c42fc Merge pull request #1100 from pipecat-ai/aleix/use-task-cancel-on-left-disconnected
use `task.cancel()` when participant leaves/disconnects
2025-01-28 16:02:02 -08:00
Aleix Conchillo Flaqué
0547a15695 task: allow queuing a CancelFrame to cancel the task 2025-01-28 15:59:36 -08:00
Aleix Conchillo Flaqué
3fe2124314 examples: use task.cancel() when participant leaves or disconnects 2025-01-28 15:46:20 -08:00
Aleix Conchillo Flaqué
ba358a4f0a task: cleanup processors after task finishes running 2025-01-28 15:02:25 -08:00
Aleix Conchillo Flaqué
79ef8c947d Merge pull request #1099 from pipecat-ai/aleix/daily-transport-queue-events
transports(daily): queue events until join completes
2025-01-28 14:38:25 -08:00
Aleix Conchillo Flaqué
f024476b08 transports(daily): queue events until join completes 2025-01-28 11:22:42 -08:00
Dominic
73690a13d9 Moved voicemail detection to phone-chatbot and working on that now 2025-01-28 22:31:08 +09:00
Dominic
6ebf06a6fb Removed start_terminate_call function as unnecessary 2025-01-28 10:39:10 +09:00
Dominic
2f4f779c91 Fixed a few things 2025-01-28 10:39:10 +09:00
Dominic
941ee6e5e8 Add voicemail detection example 2025-01-28 10:39:10 +09:00
Aleix Conchillo Flaqué
cd5075ed7a Merge pull request #1097 from pipecat-ai/aleix/pipecat-0.0.57
prepare CHANGELOG for 0.0.54
2025-01-27 14:56:51 -08:00
Aleix Conchillo Flaqué
6f41a667c8 prepare CHANGELOG for 0.0.54 2025-01-27 14:48:56 -08:00
Aleix Conchillo Flaqué
0b222a7eae Merge pull request #1085 from pipecat-ai/aleix/task-creation-and-cancellation
improve task creation and cancellation
2025-01-27 14:47:20 -08:00
Aleix Conchillo Flaqué
f09f4b8fc4 services(tavus): fix EndFrame and CancelFrame processing 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
cca241a2b7 examples(22c): fix cancel_task call 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
1489e44740 gemini(multimodal live): fix model audio queue variable 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
f55f78e70e update CHANGELOG.md 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
10202dc529 transports(websockets): cancel or wait for tasks to finish 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
498805a34c FrameProcessor: add wait_for_task() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
509f143e1b update CHANGELOG.md 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
737e4fa3bd gemini(multimodal live): connect on StartFrame 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
8b5228a105 utils: move task functions to asyncio module 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
6cc01bc5b0 examples: update 14 series with TTSSpeakFrame 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
2a2928d96c gemini: create transcribe tasks only once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
a3a6adbd17 user_idle_processor: add missing parent cleanup() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
bf5ced18b2 fix parallel pipelines cleanup 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
2eccd1b1e9 utils: update some logging levels 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
9374bed878 tests: langchain fixes 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
c03d0352b1 utils/tasks: added new documentation 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
af90b8b4fa utils: add wait_for_task() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
0a9daa2f56 task: avoid canceling tasks more than once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
e48c0e52ef transports(daily): avoid canceling task more than once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
6bca8396d3 utils: error if we try to cancel the same task multiple times 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
c2d8a45a07 runner: warn about remaining dangling tasks 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
80a7f1b1e7 runner: improve signal handler task cancellation 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
aff6e24560 pipeline: fix pipeline cleanup 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
cb93f6b368 utils: store created tasks and add current_tasks() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
ff0bcec33a transports: improve task naming 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
5885fcc230 add id and name properties 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
57b186cde8 base_transport: add name and id fields 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
d1a3f404a5 improve task creation and cancellation
If a FrameProcessor needs to create a task it should use
FrameProcessor.create_task() and FrameProcessor.cancel_task(). This gives
Pipecat more control over all the tasks that are created in Pipecat.

Both functions internally use the utils module: utils.create_task() and
utils.cancel_task() which should also be used outside of FrameProcessors. That
is, unless strictly necessary, we should avoid using asyncio.create_task().
2025-01-27 14:42:23 -08:00
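The idea in the commit message above (route task creation and cancellation through one place instead of calling asyncio.create_task() directly) can be illustrated with a minimal sketch. This is not Pipecat's implementation: `TaskRegistry` and its method bodies are hypothetical, written with plain asyncio only to show why central bookkeeping makes dangling tasks detectable.

```python
import asyncio


class TaskRegistry:
    """Hypothetical helper (not Pipecat's API): remembers every task it creates."""

    def __init__(self) -> None:
        self._tasks: set[asyncio.Task] = set()

    def create_task(self, coro, name: str) -> asyncio.Task:
        # Create the task on the running loop and remember it until it finishes.
        task = asyncio.get_running_loop().create_task(coro, name=name)
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return task

    async def cancel_task(self, task: asyncio.Task) -> None:
        # Request cancellation, then wait until the task has actually stopped.
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass

    def current_tasks(self) -> set[asyncio.Task]:
        # Snapshot of tasks that are still alive; useful for "dangling task" warnings.
        return set(self._tasks)


async def demo() -> int:
    registry = TaskRegistry()
    task = registry.create_task(asyncio.sleep(60), name="demo::sleeper")
    await asyncio.sleep(0)  # let the task start
    await registry.cancel_task(task)
    return len(registry.current_tasks())
```

Running `asyncio.run(demo())` should leave no tracked tasks behind, because the done callback removes the task from the registry once the cancellation completes.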
chadbailey59
179ddbea7d Add dialout to the Daily phone example (#998)
* added dialout to daily phone example

* cleanup

* cleanup

* pre-commit hook

* Fix typo

* More explicit README instructions

---------

Co-authored-by: Mark Backman <mark@daily.co>
2025-01-27 12:21:30 -06:00
Mark Backman
86c1e6a3bd Merge pull request #1081 from pipecat-ai/mb/user-idle-add-retry
Added retry functionality and a new callback to the UserIdleProcessor
2025-01-27 10:30:45 -05:00
Mark Backman
9e9822f17d Use inspect.signature to determine which callback to use 2025-01-27 10:24:58 -05:00
Mark Backman
5f9671e2ca Added retry functionality and a new callback to the UserIdleProcessor 2025-01-27 10:24:57 -05:00
Mark Backman
aac8961ae5 Merge pull request #1078 from pipecat-ai/mb/improve-error-handling-truncate-audio
Add better error handling for OpenAIRealtimeBetaLLMService truncate errors
2025-01-27 08:54:39 -05:00
Mark Backman
3e6377346a Merge pull request #1093 from pipecat-ai/mb/update-example-6a 2025-01-26 19:43:39 -05:00
Mark Backman
9d9a622b1a Merge pull request #1094 from pipecat-ai/mb/readme-service-section 2025-01-26 19:43:12 -05:00
Mark Backman
3e9a6b6262 Merge pull request #1095 from pipecat-ai/mb/elevenlabs-lang-codes 2025-01-26 12:21:28 -05:00
Mark Backman
fb3097560f Remove eleven_multilinguagal_v2 from language code list 2025-01-26 07:17:38 -05:00
Mark Backman
ff6368add0 Update README.md
Adding a section so that table can be linked to.
2025-01-25 16:12:53 -05:00
Mark Backman
89fd03d86f Merge pull request #1090 from vengad-arrowhead/main
Adding hindi danda symbol as end of sentence marker
2025-01-25 09:36:19 -05:00
Mark Backman
0672530d6b Fix foundational example 6a to switch images when the bot is speaking 2025-01-25 08:40:42 -05:00
vengadanathan srinivasan
7a0cfc8d3d Adding hindi danda symbol as end of sentence marker 2025-01-25 14:55:51 +05:30
Mark Backman
b881dd57b3 Merge pull request #1086 from pipecat-ai/mb/fix-expiry-time-type-mismatch 2025-01-24 17:31:08 -05:00
Mark Backman
abf0d0d053 Improve token parameter construction using DailyMeetingTokenProperties 2025-01-24 17:22:31 -05:00
Mark Backman
1acdf7aff7 Fix expiry_time type validation in get_token REST API helper 2025-01-24 17:21:50 -05:00
Mark Backman
96b90abda6 Merge pull request #1082 from pipecat-ai/mb/update-function-calling-examples
Update function calling examples to push a TextFrame in the start_cal…
2025-01-24 17:21:13 -05:00
Filipi da Silva Fuchter
202a844eeb Merge pull request #1051 from pipecat-ai/gemini_grounding_metadata_rtvi
Sending Search Response to RTVI
2025-01-24 19:20:50 -03:00
Filipi Fuchter
655d56f634 Fixing pydantic validation when creating meeting token. 2025-01-24 19:15:56 -03:00
Filipi Fuchter
07c84b733b Sending Search Response to RTVI 2025-01-24 18:59:46 -03:00
Filipi da Silva Fuchter
7c52736ff6 Merge pull request #1030 from pipecat-ai/gemini_grounding_metadata
Introduce support for extracting and processing grounding metadata from GoogleLLMService.
2025-01-24 15:41:54 -03:00
Mark Backman
48ce751602 Merge pull request #1075 from Vaibhav159/vl_add_daily_meeting_token_v2
adding models to DailyRestHelper
2025-01-24 13:21:52 -05:00
Vaibhav159
1f1e2dac2b wrapping things up 2025-01-24 23:44:23 +05:30
Vaibhav159
71c2dc3d05 minor typing change 2025-01-24 23:38:44 +05:30
Vaibhav159
ef02ece662 doc string 2025-01-24 22:47:40 +05:30
Vaibhav159
d5818fad5b addressing comments 2025-01-24 22:46:54 +05:30
Rafal Skorski
9c22bd8df1 Improving read me and encoding support 2025-01-24 16:44:11 +01:00
Mark Backman
dbea86baae Update function calling examples to push a TextFrame in the start_callback 2025-01-24 10:21:08 -05:00
Vaibhav159
c5faac1cf8 adding RecordingsBucketConfig 2025-01-24 15:14:20 +05:30
Vaibhav159
e106d7a215 adding line space 2025-01-24 09:12:07 +05:30
Vaibhav159
40c1a8369a updated changelog 2025-01-24 09:11:15 +05:30
Vaibhav159
6ab2404a98 adding more properties to daily room 2025-01-24 09:10:25 +05:30
Mark Backman
e61c996a2e Merge pull request #1079 from ecdeng/patch-1
Update cartesia.py to use the new model pointer `sonic`
2025-01-23 22:15:30 -05:00
Eric Deng
2c81dc1f06 Update cartesia.py to use the new model pointer sonic instead of sonic-english
We are now using `sonic` as a pointer to the latest stable release (https://docs.cartesia.ai/build-with-sonic/models#continuous-updates). sonic-english will forever point to `sonic-2024-10-19`, which is already out of date.
2025-01-23 15:47:07 -08:00
Mark Backman
53251dcb88 Add better error handling for OpenAIRealtimeBetaLLMService truncate errors 2025-01-23 14:25:08 -05:00
Mark Backman
d4e4b12109 Merge pull request #1071 from porcelaincode/patch-1
Update runner.py
2025-01-23 13:19:22 -05:00
Mark Backman
466d26a4f2 Merge pull request #1077 from Vaibhav159/vl_fix_missing_leftover_audio
adding missing audio buffer fix
2025-01-23 13:16:41 -05:00
Vaibhav159
ef511d580d adding missing audio buffer fix 2025-01-23 23:17:49 +05:30
Vaibhav159
5957ddb038 adding missing audio buffer fix 2025-01-23 23:17:18 +05:30
Vaibhav159
799c2d14b8 adding meeting token v2 func 2025-01-23 21:40:42 +05:30
Rafal Skorski
8eef21db6e Adding telnyx serializer 2025-01-23 15:39:46 +01:00
vatsal
dee1224530 Update runner.py 2025-01-23 13:21:49 +05:30
Mark Backman
fc6aa6eae8 Merge pull request #1060 from chhao01/patch-1
[bug]TypeError: object of type 'NoneType' has no len()
2025-01-22 19:14:35 -05:00
Mark Backman
ddd5bf70ab Merge pull request #1061 from Allenmylath/patch-21
Update README.md
2025-01-22 19:13:15 -05:00
allenmylath
aa59744444 Update examples/README.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-01-23 05:38:37 +05:30
chadbailey59
067ddfe505 Storytelling chatbot updates (#1066)
* initial changes for gemini storybot

* storybot updates for gemini

* more storybot updates

* interim interruptible commit

* cleanup

* cleanup

* cleanup

* cleanup
2025-01-22 15:20:21 -06:00
Mark Backman
a64df978e7 Merge pull request #1046 from pipecat-ai/mb/transcript-tts
Modified `TranscriptProcessor` to use `TTSTextFrame`s
2025-01-22 15:46:01 -05:00
Mark Backman
7167719761 Emit a transcription callback when receiving a CancelFrame, update examples accordingly 2025-01-22 14:56:29 -05:00
Mark Backman
e1430be9f9 Code review fixes 2025-01-22 14:56:29 -05:00
Mark Backman
c2fe8e7fdb Updated CHANGELOG 2025-01-22 14:56:28 -05:00
Mark Backman
31c77d8e35 Update examples for the updated TranscriptProcessor 2025-01-22 14:56:00 -05:00
Mark Backman
2a60d54830 Update the AssistantTranscriptProcessor to use TTSTextFrames in place of OpenAILLMContextFrames 2025-01-22 14:56:00 -05:00
Aleix Conchillo Flaqué
b3c99887dc Merge pull request #1068 from Canonical-AI-Inc/import-fix
Fixing missing import
2025-01-22 11:37:49 -08:00
Mark Backman
38ad75cc17 Merge pull request #1065 from pipecat-ai/mb/fix-openai_realtime-function-calling
OpenAIRealtimeBetaLLMService: Fixed an error in function calling
2025-01-22 14:37:01 -05:00
Adrian Cowham
2debac314c fixing missing import 2025-01-22 11:06:53 -08:00
Mark Backman
e0c9a1a1a2 Merge pull request #1041 from Allenmylath/patch-20
Update bot.py
2025-01-22 09:18:19 -05:00
allenmylath
4cdcca588e Update examples/moondream-chatbot/bot.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-01-22 19:40:12 +05:30
allenmylath
a90e81e2eb Update examples/moondream-chatbot/bot.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-01-22 19:38:36 +05:30
Mark Backman
0ba60c9e28 Merge pull request #975 from imsakg/main
fix(gemini): prevent non-audio modality processing
2025-01-22 09:03:18 -05:00
Mark Backman
5ca5fbd825 OpenAIRealtimeBetaLLMService: Fixed an error in function calling 2025-01-22 08:54:03 -05:00
allenmylath
2b52e2c109 Update README.md
Silero-tts changed to VAD, also description regarding session handling added to websocket chatbot
2025-01-22 14:42:35 +05:30
Cheng Hao
7e8fc2e7e2 [bug]TypeError: object of type 'NoneType' has no len()
Sometimes chunk.choices is None, which raises an exception like:
```
TypeError: object of type 'NoneType' has no len()
```
2025-01-22 15:31:27 +08:00
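A defensive check for the failure described above might look like the following sketch. The function name `first_delta_text` and the chunk shape (`choices[0].delta.content`) are assumptions for illustration; the actual fix in the commit is not shown in this log.

```python
from types import SimpleNamespace


def first_delta_text(chunk) -> str:
    """Return the text of the first choice, or "" when there is none.

    Hypothetical helper: the chunk layout here is assumed, not taken from the commit.
    """
    choices = getattr(chunk, "choices", None)
    if not choices:
        # Covers both None and an empty list, so len() is never called on None.
        return ""
    delta = choices[0].delta
    return getattr(delta, "content", None) or ""


# Stand-in chunks built with SimpleNamespace instead of a real streaming response.
empty_chunk = SimpleNamespace(choices=None)
text_chunk = SimpleNamespace(
    choices=[SimpleNamespace(delta=SimpleNamespace(content="hello"))]
)
```

With these stand-ins, `first_delta_text(empty_chunk)` yields an empty string rather than raising a TypeError.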
Aleix Conchillo Flaqué
0d79a9eaa6 update CHANGELOG.md 2025-01-21 18:00:10 -08:00
Aleix Conchillo Flaqué
f89b9ec23f Merge pull request #1057 from pipecat-ai/aleix/replace-resampy-soxr
improve audio resampling by switching from resampy to soxr
2025-01-21 17:52:49 -08:00
Mark Backman
20d5824e56 Merge pull request #1058 from pipecat-ai/mb/fix-trace-log 2025-01-21 20:44:50 -05:00
Aleix Conchillo Flaqué
f23baa78d8 test-requirements: add soxr and remove resampy 2025-01-21 17:40:17 -08:00
Aleix Conchillo Flaqué
cacd6ba3fa improve audio resampling by switching from resampy to soxr 2025-01-21 17:40:17 -08:00
Aleix Conchillo Flaqué
f87ecd3a51 Merge pull request #1048 from pipecat-ai/aleix/add-unittest-utils
tests: add some initial run_test() utilities
2025-01-21 17:39:06 -08:00
Mark Backman
b96a922aa8 Fix trace log line for resume_processing_frames 2025-01-21 18:15:03 -05:00
Aleix Conchillo Flaqué
401d3ff267 tests: added PipelineTask tests 2025-01-21 11:45:43 -08:00
Aleix Conchillo Flaqué
ab4221a4db task: added BaseTask 2025-01-21 11:45:43 -08:00
Aleix Conchillo Flaqué
bd6f82cf94 task: allow specifying heartbeat period 2025-01-21 11:45:43 -08:00
Aleix Conchillo Flaqué
dd21b424d6 pyproject: ignore 'audioop' deprecation warning 2025-01-21 10:27:34 -08:00
Aleix Conchillo Flaqué
76884877dd tests: add pytest-asyncio dependency 2025-01-21 10:23:19 -08:00
Aleix Conchillo Flaqué
0d6c680133 README: add unit tests badge 2025-01-21 10:14:37 -08:00
Aleix Conchillo Flaqué
a27fe4bde2 tests: move test_ai_services to test_utils_string 2025-01-21 10:06:14 -08:00
Aleix Conchillo Flaqué
177cb2ca8b tests: initial pipeline and parallelpipeline tests 2025-01-21 09:57:54 -08:00
Aleix Conchillo Flaqué
3c970a3cee tests: add more filter tests 2025-01-21 09:43:57 -08:00
Aleix Conchillo Flaqué
af02f8f1cd filters(frame_filter): allow more than one frame 2025-01-21 09:43:33 -08:00
Aleix Conchillo Flaqué
2e0fb198bf frame_processor: allow pushing more frames after EndFrame
This can be useful for testing purposes. In real practice, there shouldn't be
any frames after an EndFrame is pushed.
2025-01-21 09:42:15 -08:00
Filipi da Silva Fuchter
4f758c5a3b Merge pull request #1050 from pipecat-ai/fix_rtvi_warning_msg
Ignoring transport messages that are not intended to RTVI.
2025-01-21 13:36:50 -03:00
Rafal Skorski
89b87289e2 elevenlabs key added to env.example 2025-01-21 17:12:27 +01:00
Rafal Skorski
e0e190a1a2 Create telnyx chat bot example application 2025-01-21 17:09:55 +01:00
Filipi Fuchter
3e0836b340 Ignoring transport messages that are not intended to RTVI. 2025-01-21 10:08:14 -03:00
Aleix Conchillo Flaqué
2f23693bf3 tests: fix test_protobuf_serializer.py 2025-01-20 18:39:59 -08:00
Aleix Conchillo Flaqué
b7dd9748cf serializers: fix special fix initialization 2025-01-20 18:39:41 -08:00
Aleix Conchillo Flaqué
d4d9c3b7ae tests: fix test_aggregators.py 2025-01-20 18:16:14 -08:00
Aleix Conchillo Flaqué
090bc81ec5 tests: add some initial run_test() utilities 2025-01-20 17:41:21 -08:00
Filipi Fuchter
9b61633aa0 Introduce support for extracting and processing grounding metadata from Google LLM responses. 2025-01-20 11:28:12 -03:00
Mark Backman
e3d53d3d9a Merge pull request #1044 from pipecat-ai/mb/elevenlabs-http-fix-voice-settings
Fixed a type error when using voice_settings in ElevenLabsHttpTTSService
2025-01-20 08:11:38 -05:00
Mark Backman
262d3a19c9 Fixed a type error when using voice_settings in ElevenLabsHttpTTSService 2025-01-20 07:57:02 -05:00
allenmylath
491feb691c Update bot.py
quiet and talking frames are determined based on BotStartedSpeakingFrame and BotStoppedSpeakingFrame not ttsframe
2025-01-20 14:00:17 +05:30
Aleix Conchillo Flaqué
e4f83b237e update CHANGELOG (remove 07d-interruptible-elevenlabs-http.py) 2025-01-19 11:36:18 -08:00
Aleix Conchillo Flaqué
a169e0cde9 Merge pull request #1035 from pipecat-ai/aleix/prepare-0.0.53
update CHANGELOG for 0.0.53
2025-01-18 14:50:35 -08:00
Aleix Conchillo Flaqué
c6d643d4ec update CHANGELOG for 0.0.53 2025-01-18 14:48:48 -08:00
Aleix Conchillo Flaqué
2abbd4bb27 Merge pull request #1039 from pipecat-ai/aleix/fish-audio-websocket-service
services(fish): FishAudioTTSService to use WebsocketService
2025-01-18 14:48:20 -08:00
Aleix Conchillo Flaqué
e0011a3996 services(fish): FishAudioTTSService to use WebsocketService 2025-01-18 14:29:45 -08:00
Aleix Conchillo Flaqué
ea44c59ddd Merge pull request #1037 from Vaibhav159/fixing_unused_11labs_package
removing unused 11labs package imports
2025-01-17 22:08:04 -08:00
Vaibhav159
a9c7dbbc05 removing unused code 2025-01-18 10:58:07 +05:30
Vaibhav159
8a87e92b2b adding missing 11labs package 2025-01-18 10:48:57 +05:30
Mark Backman
982f2becc6 Merge pull request #1002 from pipecat-ai/mb/add-on-error-callback
Register the on_error handler
2025-01-17 21:58:59 -05:00
Mark Backman
e049ae470d Register the on_error handler 2025-01-17 21:49:42 -05:00
Mark Backman
e159f2dce1 Merge pull request #1024 from pipecat-ai/mb/elevenlabs-http
Add ElevenLabsHttpTTSService
2025-01-17 21:30:31 -05:00
Aleix Conchillo Flaqué
e9162ae467 Merge pull request #1004 from Fluentsai/feature/dtmf_input
Twilio serializer reading dtmf websocket messages
2025-01-17 18:14:46 -08:00
Aleix Conchillo Flaqué
bb65512ff4 Merge pull request #1034 from pipecat-ai/aleix/ulaw-resample-update
ulaw resample update
2025-01-17 17:47:18 -08:00
Mark Backman
b81323d676 Code review fixes + docstrings 2025-01-17 20:12:43 -05:00
Aleix Conchillo Flaqué
65fa77dfa5 audio: use resample_audio to resample ulaw bytes 2025-01-17 15:24:41 -08:00
Aleix Conchillo Flaqué
9ddd9ae27c Merge pull request #1011 from Vaibhav159/vl_deepgram_metrics_without_vad
adding metric generation without deepgram VAD
2025-01-17 14:47:19 -08:00
Aleix Conchillo Flaqué
12fc6e17ef Merge pull request #1033 from pipecat-ai/aleix/observers-performance
task: add TaskObserver and avoid pipeline blocking
2025-01-17 14:43:26 -08:00
Aleix Conchillo Flaqué
3e4020cdba task: add TaskObserver and avoid pipeline blocking
Observers now process frames in separate tasks. This avoids blocking the
pipeline while the observer is processing the frame.
2025-01-17 11:15:52 -08:00
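The pattern described in the commit message above, handing frames to an observer from a separate task so the pipeline is never blocked, can be sketched with a queue. `ObserverProxy` and its methods are hypothetical names for illustration and are not the TaskObserver implementation.

```python
import asyncio


class ObserverProxy:
    """Hypothetical sketch: forwards frames to a slow observer through a queue."""

    def __init__(self, on_frame) -> None:
        self._on_frame = on_frame
        self._queue: asyncio.Queue = asyncio.Queue()
        self._task = None

    def start(self) -> None:
        # The consumer runs in its own task, separate from the caller.
        self._task = asyncio.get_running_loop().create_task(self._run())

    def notify(self, frame) -> None:
        # Called by the pipeline side: enqueue and return immediately.
        self._queue.put_nowait(frame)

    async def stop(self) -> None:
        # Wait for queued frames to be observed, then stop the consumer task.
        await self._queue.join()
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass

    async def _run(self) -> None:
        while True:
            frame = await self._queue.get()
            try:
                await self._on_frame(frame)
            finally:
                self._queue.task_done()


async def demo():
    seen = []

    async def slow_observer(frame):
        await asyncio.sleep(0.01)  # simulate slow processing
        seen.append(frame)

    proxy = ObserverProxy(slow_observer)
    proxy.start()
    for frame in ("a", "b", "c"):
        proxy.notify(frame)
    observed_before_drain = len(seen)  # notify() did not wait for the observer
    await proxy.stop()
    return observed_before_drain, seen
```

The design choice is that `notify()` only enqueues, so the cost of a slow observer is paid in the consumer task rather than in the code that pushes frames.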
Aleix Conchillo Flaqué
4f883ee31f Merge pull request #1023 from pipecat-ai/aleix/introduce-heartbeat-frames
introduce heartbeat frames
2025-01-17 10:31:07 -08:00
Mark Backman
3ff360f042 Merge pull request #1032 from pipecat-ai/mb/user-idle-fixes
Start UserIdleProcessor on speaking frame, fix bug not pushing EndFrame
2025-01-17 13:18:09 -05:00
Aleix Conchillo Flaqué
45cbad5b3e task: add HEARTBEAT_MONITOR_SECONDS 2025-01-17 10:11:28 -08:00
Aleix Conchillo Flaqué
477d0d154b frame_processor: make sure clock is initialized 2025-01-17 10:05:23 -08:00
Aleix Conchillo Flaqué
4b3c776f58 task: don't use push queue to send a heartbeat
This is because we might be waiting for the EndFrame. Currently, if we push an
EndFrame to the task, the task will block until the EndFrame traverses all the
pipeline.
2025-01-17 10:04:24 -08:00
Aleix Conchillo Flaqué
da0c4cfd99 task: increase heartbeat monitoring to 5 seconds 2025-01-17 10:04:05 -08:00
Aleix Conchillo Flaqué
f22a00570d task: start heartbeats task when push task starts 2025-01-17 10:03:13 -08:00
Mark Backman
85f4663a41 Start UserIdleProcessor on speaking frame, fix bug not pushing EndFrame 2025-01-17 12:54:17 -05:00
Aleix Conchillo Flaqué
915e3bb3c7 Merge pull request #1029 from Vaibhav159/vl_fixing_idle_frame_processor_logic
fixing IdleFrameProcessor and UserIdleProcessor init logic
2025-01-17 06:48:13 -08:00
Vaibhav159
80779c48d6 sort fix 2025-01-17 20:07:25 +05:30
Vaibhav159
c444557965 fixing IdleFrameProcessor and UserIdleProcessor init logic 2025-01-17 19:50:53 +05:30
Mark Backman
d51893f61c Refactor for aiohttp, correct use of settings 2025-01-16 23:49:53 -05:00
Mark Backman
740d2743df Add TTFB metrics 2025-01-16 23:05:53 -05:00
Mark Backman
0dd22fb879 Merge pull request #1022 from pipecat-ai/mb/fix-abstractmethod
Remove @abstractmethod from set_model and set_model in TTSService class
2025-01-16 22:59:26 -05:00
Mark Backman
225b65c3d2 Add ElevenLabsHttpTTSService 2025-01-16 22:46:32 -05:00
Aleix Conchillo Flaqué
2503f76107 examples: add 31-heartbeats.py 2025-01-16 19:31:13 -08:00
Aleix Conchillo Flaqué
ff8aa68942 introduce heartbeat frames 2025-01-16 19:31:13 -08:00
Maxim Makatchev
c5edbf4b75 Made InputDTMFFrame a DataFrame and moved up to data frames 2025-01-17 12:27:04 +09:00
Aleix Conchillo Flaqué
799777774b Merge pull request #1018 from pipecat-ai/aleix/streamline-thread-pool-executors
transports: streamline max_workers for ThreadPoolExecutors
2025-01-16 19:05:41 -08:00
Mark Backman
fdef8a97e2 Remove @abstractmethod from set_model and set_model in TTSService class 2025-01-16 21:36:51 -05:00
Mark Backman
0163247410 Merge pull request #1021 from pipecat-ai/mb/improve-30
Add a second observer to the 30-observer.py example
2025-01-16 21:19:35 -05:00
James Hush
221e044046 demo: Update translator bot example (#1005)
* docs: Update translator bot example

Updates the translator bot to do the following:

- Allow you to specify the in and out languages
- Uses TranscriptionProcessor to handle transcriptions

* Simplify the example, improve performance

---------

Co-authored-by: Mark Backman <mark@daily.co>
2025-01-17 10:08:15 +08:00
Mark Backman
532fd31fd7 Add a second observer to the 30-observer.py example 2025-01-16 19:46:18 -05:00
Mark Backman
3e178fd46f Merge pull request #1020 from pipecat-ai/mb/observer-foundational
Add foundational example 30 to show how to use an Observer
2025-01-16 19:28:26 -05:00
Mark Backman
07cb8b7a89 Extend the example to include BotStartedSpeakingFrame and BotStoppedSpeakingFrame 2025-01-16 19:24:01 -05:00
Mark Backman
e805738d4c Merge pull request #1009 from pipecat-ai/mb/tts-ignore-interim-transcripts
TTSService should only process LLMTextFrames
2025-01-16 17:09:24 -05:00
Mark Backman
119bc7e35f Update check to exclude transcription frames 2025-01-16 16:43:46 -05:00
Mark Backman
b9b02845a3 Add foundational example 30 to show how to use an Observer 2025-01-16 16:37:32 -05:00
Aleix Conchillo Flaqué
3714f12edc Merge pull request #1019 from Canonical-AI-Inc/canonical-transcripts
Add transcript to Canonical Metrics Service
2025-01-16 13:36:55 -08:00
Aleix Conchillo Flaqué
d2b8171197 transports: streamline max_workers for ThreadPoolExecutors 2025-01-16 13:34:04 -08:00
Filipi Fuchter
c4c15eff39 Sending a silence frame to prevent the audio from clipping. 2025-01-16 18:30:19 -03:00
Adrian Cowham
d0b48c95bb updated the example to use stereo audio and pass in the context. also updated the service to send the transcripts if they're available 2025-01-16 13:12:38 -08:00
Aleix Conchillo Flaqué
73ed0c1ad7 Merge pull request #1017 from pipecat-ai/aleix/additional-trace-logging
additional trace logging
2025-01-16 12:38:47 -08:00
Vanessa Pyne
c211580fec Merge pull request #1016 from pipecat-ai/vp-1007-nonetype
services(gemini_multimodal_live): set content to [] if not present in messages
2025-01-16 14:14:50 -06:00
Aleix Conchillo Flaqué
359b55a85e additional trace logging 2025-01-16 11:19:42 -08:00
Filipi Fuchter
7efd00e0f7 Asking for the bot to send the audio only when the audio element is already on playing state. 2025-01-16 16:00:56 -03:00
kompfner
8b602a3f62 Merge pull request #1010 from pipecat-ai/ios-simplechatbot-assorted-improvements
iOS SimpleChatbot assorted improvements
2025-01-16 13:59:45 -05:00
kompfner
485c231f69 Merge pull request #1012 from pipecat-ai/simplechatbot-readme-local-pipecat
Add to the SimpleChatbot server README a step for pointing to the loc…
2025-01-16 13:46:19 -05:00
vipyne
8ba3b150eb services(gemini_multimodal_live): set content to [] if not present in messages
... which it will be if the message is a tool call
2025-01-16 11:59:02 -06:00
Paul Kompfner
b5f72b4378 Add to the SimpleChatbot server README a step for pointing to the local version of pipecat 2025-01-16 11:59:44 -05:00
Vaibhav159
85e7d62f94 fixing log text 2025-01-16 21:36:51 +05:30
Vaibhav159
923d33eeff fixing ruff 2025-01-16 21:32:48 +05:30
Vaibhav159
7ee6e7193d adding metric generation without deepgram VAD 2025-01-16 21:23:56 +05:30
Paul Kompfner
156fffe6fc In iOS SimpleChatbot demo, add clarifying note to Audio Settings section header explaining that "(No selection = system default)".
Ideally we could add a row showing that the system default is selected, but this is OK as a short-term fix. Also, the presence of that row might suggest that "system default" is selectable, but it's not: this is currently a limitation in the Pipecat Client.
2025-01-16 10:32:55 -05:00
Paul Kompfner
c9834e2712 In iOS SimpleChatbot demo, remove unused LLMHelperDelegate protocol conformance 2025-01-16 10:31:17 -05:00
Paul Kompfner
1e7e307f69 In iOS SimpleChatbot demo, call release() when disconnecting the voice client, since we're not using it after disconnecting 2025-01-16 10:30:06 -05:00
Mark Backman
67e47a388d TTSService should only process LLMTextFrames 2025-01-16 10:03:24 -05:00
Filipi Fuchter
119c0da299 Configuring a proxy so we can test from mobile 2025-01-16 11:02:53 -03:00
Filipi Fuchter
ea1323723d Handling the signalling to play the audio 2025-01-16 10:42:22 -03:00
Filipi Fuchter
d2efe27350 Improving the logs and updating status 2025-01-16 10:36:45 -03:00
Filipi Fuchter
5dc7d2a378 Creating the bot when pressing to connect. 2025-01-16 10:28:39 -03:00
Filipi Fuchter
88c540f9bc Starting to create the example signalling through app message. 2025-01-16 10:14:38 -03:00
Maxim Makatchev
dcf317f2fa Twilio serializer reading dtmf websocket messages and generating InputDTMFFrame containing the corresponding value of KeypadEntry 2025-01-16 17:43:12 +09:00
Aleix Conchillo Flaqué
b8ffd7b16b Merge pull request #996 from pipecat-ai/aleix/introduce-observers
introduce pipeline frame observers
2025-01-15 18:05:33 -08:00
Aleix Conchillo Flaqué
08f1dda94e observers: add a timestamp to on_push_frame() 2025-01-15 17:45:00 -08:00
Aleix Conchillo Flaqué
45039e7cde update CHANGELOG.md 2025-01-15 17:45:00 -08:00
Aleix Conchillo Flaqué
e50c76d075 examples(simple-chatbot): use RTVIObserver for server-client messages 2025-01-15 17:45:00 -08:00
Aleix Conchillo Flaqué
dd9f9179cc rtvi(RTVIObserver): use observers for RTVI server->client messages 2025-01-15 17:45:00 -08:00
Aleix Conchillo Flaqué
c8da531402 pipeline(task): add support for pipeline frame observers 2025-01-15 17:43:59 -08:00
Aleix Conchillo Flaqué
25bcaf5c7c observers: introduce pipeline observers 2025-01-15 17:43:59 -08:00
Aleix Conchillo Flaqué
2d0f3341c3 frames: add LLMTextFrame and TTSTextFrame
This is to distinguish what type of service has generated the TextFrames.
2025-01-15 17:43:59 -08:00
Aleix Conchillo Flaqué
7626d7b04b Merge pull request #999 from pipecat-ai/aleix/add-pre-commit-hooks
add pre-commit hooks
2025-01-15 17:39:34 -08:00
Aleix Conchillo Flaqué
f78520f7d0 add pre-commit hooks
Fixes #945
2025-01-15 13:44:21 -08:00
Aleix Conchillo Flaqué
bb4766455d Merge pull request #997 from pipecat-ai/aleix/update-dependencies-01-15-25
update dependencies (go back to numpy1)
2025-01-15 13:35:46 -08:00
Aleix Conchillo Flaqué
9dacbbbbf4 fix ruff formatting 2025-01-15 13:02:13 -08:00
Aleix Conchillo Flaqué
4de192fbb0 update dependencies (go back to numpy1)
Fixes #911, #913
2025-01-15 12:04:28 -08:00
kompfner
80b6c28431 Merge pull request #992 from pipecat-ai/live-updates-to-selected-and-available-mics
In the iOS SimpleChatbot demo, wire up live updates to the selected m…
2025-01-15 15:00:14 -05:00
Mark Backman
f471744bca Merge pull request #995 from pipecat-ai/vp-riva-bump
deps(riva): bump to 2.18.0
2025-01-15 14:35:39 -05:00
Mark Backman
d5df4b064b Merge pull request #987 from pipecat-ai/mb/deepseek-typo
Fix error log in DeepSeekLLMService and CerebrasLLMService
2025-01-15 14:31:34 -05:00
Mark Backman
06a0e29920 Merge pull request #991 from pipecat-ai/mb/update-web-simple-chatbot
Update simple-chatbot example to use the latest client SDKs
2025-01-15 13:36:03 -05:00
Aleix Conchillo Flaqué
64eb8e7262 Merge pull request #994 from Vaibhav159/vl_deepgram_with_vad
finalize on DeepgramSTTService on VAD
2025-01-15 10:28:11 -08:00
Filipi da Silva Fuchter
d8386c12dc Merge pull request #990 from pipecat-ai/bumping_ios_example
Using PipecatClient version 0.3.2
2025-01-15 14:29:01 -03:00
vipyne
50e798bcd9 deps(riva): bump to 2.18.0 2025-01-15 10:24:57 -06:00
Vaibhav159
d1ac7751da finalize on DeepgramSTTService 2025-01-15 20:43:23 +05:30
Paul Kompfner
110ce27c91 In the iOS SimpleChatbot demo, wire up live updates to the selected mic and available mics list. This is beneficial for a few reasons:
- Live updates are nice! We can now more easily see what's going on when we connect or disconnect a mic.
- Resolves an issue where the initial selected mic was not shown.
- Lets us see when the Pipecat client automatically switches to a new mic, like when one is connected.
2025-01-15 09:56:27 -05:00
Mark Backman
8b657158ca Update React simple-chatbot client to use latest client SDKs 2025-01-15 09:50:43 -05:00
Mark Backman
cce14fca97 Update JS simple-chatbot client to use latest client SDKs 2025-01-15 09:47:20 -05:00
Filipi Fuchter
7c051516d8 Using PipecatClient version 0.3.2 2025-01-15 09:57:57 -03:00
Mark Backman
5f402ad741 Merge pull request #988 from pipecat-ai/mb/readme-openrouter
Update README.md
2025-01-14 18:38:35 -05:00
Mark Backman
a80b186cea Update README.md
Add OpenRouter to the README
2025-01-14 18:08:14 -05:00
Mark Backman
c65aaf3b2e Merge pull request #967 from sahilsuman933/openrouter-integration
feat(services): Add OpenRouter LLM Service Integration
2025-01-14 18:06:13 -05:00
Mark Backman
e815d7776f Fix error log in DeepSeekLLMService and CerebrasLLMService 2025-01-14 18:03:29 -05:00
sahil suman
11fc08ef24 fix changelog
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:57:09 +05:30
sahil suman
6f3b0fdf73 fix changelog
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:56:16 +05:30
sahil suman
885bc32827 added changes in changelog.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:53:04 +05:30
sahil suman
7339cc7197 Merge remote-tracking branch 'origin/main' into openrouter-integration 2025-01-15 02:52:19 +05:30
sahil suman
62e9e6bc5a changed the file name.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:21:58 +05:30
Sahil Suman
329da50338 Update src/pipecat/services/openrouter.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-01-15 02:20:22 +05:30
sahil suman
4d307d26d8 made the required changes.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:19:05 +05:30
Mark Backman
a74b9354ec Merge pull request #962 from pipecat-ai/mb/improve-tts-reconnection-logic
Improve websocket based TTS service reconnection logic
2025-01-14 14:48:00 -05:00
sahil suman
11381a536f added example for function calling and made the required changes.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 01:00:33 +05:30
Mark Backman
b53bc8a879 _calculate_wait_times as private, add and use WebsocketServiceException 2025-01-14 13:20:13 -05:00
Mark Backman
e3d8910814 Update CHANGELOG 2025-01-14 13:12:40 -05:00
Mark Backman
e60a59434f Refactor LMNTTTSService to make a websocket connection directly, then use the WebsocketService base class 2025-01-14 13:09:58 -05:00
Mark Backman
5e5de618f3 Update PlayHTTTSService to use the WebsocketService base class 2025-01-14 13:09:58 -05:00
Mark Backman
8af92f7923 Update ElevenLabsTTSService to use the WebsocketService base class 2025-01-14 13:09:58 -05:00
Mark Backman
f39e17857e Add a WebsocketService base class to retry, ensure that retries reset after a successful connection, update Cartesia to use the new WebsocketService 2025-01-14 13:09:58 -05:00
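Note on the retry behavior described in f39e17857e: a minimal sketch, assuming a failure counter that is cleared after every successful connection so that a later drop gets a fresh retry budget. `RetryingConnection`, `ensure_connected`, and `flaky_connect` are hypothetical names for illustration and are not the actual `WebsocketService` API.

```python
import asyncio


class RetryingConnection:
    """Hypothetical helper: retry a connect coroutine with exponential backoff."""

    def __init__(self, connect, max_retries=3, base_delay=0.01):
        self._connect = connect          # coroutine function that opens the connection
        self._max_retries = max_retries  # consecutive failures allowed before giving up
        self._base_delay = base_delay    # first backoff delay, in seconds
        self.failures = 0                # consecutive failures since the last success

    async def ensure_connected(self):
        while True:
            try:
                result = await self._connect()
            except ConnectionError:
                self.failures += 1
                if self.failures >= self._max_retries:
                    raise
                # Exponential backoff: base, 2*base, 4*base, ...
                await asyncio.sleep(self._base_delay * 2 ** (self.failures - 1))
            else:
                # Reset after success so the next outage starts from zero.
                self.failures = 0
                return result


def demo():
    # Simulated outcomes: two drops, success, two more drops, success.
    outcomes = iter([False, False, True, False, False, True])

    async def flaky_connect():
        if not next(outcomes):
            raise ConnectionError("simulated drop")
        return "connected"

    async def main():
        conn = RetryingConnection(flaky_connect)
        first = await conn.ensure_connected()
        second = await conn.ensure_connected()
        return first, second, conn.failures

    return asyncio.run(main())
```

Without the reset in the `else` branch, the second call in `demo()` would hit the limit on its first failure, because the counter would still hold the two failures from the first call.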
Aleix Conchillo Flaqué
5b632de04a Merge pull request #982 from pipecat-ai/aleix/pipelinetask-cleanup-sink
pipeline(task): cleanup Sink processor
2025-01-14 09:14:03 -08:00
Mark Backman
6bcc196489 Merge pull request #969 from pipecat-ai/mb/deepseek
Add support for DeepSeek LLM
2025-01-14 09:40:06 -05:00
Mark Backman
66375e9dff Update dot-env.template API keys 2025-01-14 09:34:34 -05:00
Mark Backman
bc839492b6 Add support for DeepSeek LLM 2025-01-14 09:34:33 -05:00
Filipi da Silva Fuchter
4854645637 Merge pull request #960 from pipecat-ai/example_gemini_with_goolge_search
Example with Gemini using google search to retrieve news.
2025-01-14 10:07:15 -03:00
Mark Backman
98e80b7d4a Merge pull request #970 from pipecat-ai/mb/user-controlled-run-llm
Add an override_run_llm option to optionally defer function call completion
2025-01-13 18:48:00 -05:00
Mark Backman
8c0ecb89de Refactor for new on_context_updated callback and new frame properties 2025-01-13 17:20:41 -05:00
Aleix Conchillo Flaqué
4c8fcb2cfc pipeline(task): cleanup Sink processor
Fixes #953
2025-01-13 13:29:44 -08:00
Aleix Conchillo Flaqué
92313d6ce7 Merge pull request #972 from pipecat-ai/aleix/simple-chatbot-android-workflow-update
github: only run android simple-chatbot workflow if android example modified
2025-01-13 13:26:12 -08:00
Mark Backman
1ca6ecc46e Update CHANGELOG 2025-01-13 09:49:09 -05:00
Mark Backman
f1947d7d38 Update Anthropic and Gemini to allow overriding run_llm 2025-01-13 09:48:43 -05:00
Mark Backman
0852570212 Update Grok for function call override 2025-01-13 09:48:43 -05:00
Mark Backman
874b8bb136 Allow for an override of running a completion after a function call completes, OpenAI 2025-01-13 09:48:43 -05:00
Mark Backman
da1878537b Merge pull request #974 from pipecat-ai/mb/26d-example
Align 26d example with foundation norms
2025-01-12 19:44:31 -05:00
Mark Backman
f406d93b0f Align 26d example with foundation norms 2025-01-12 19:19:16 -05:00
Aleix Conchillo Flaqué
3cd2b90177 Merge pull request #971 from pipecat-ai/aleix/update-copyright-keep-original-year
update copyright keeping original year (2024)
2025-01-12 11:37:15 -08:00
Aleix Conchillo Flaqué
c4f0c7bcfd github: only run android simple-chatbot workflow if android example modified 2025-01-12 11:35:34 -08:00
Aleix Conchillo Flaqué
95e69597f3 update copyright keeping original year (2024) 2025-01-12 11:34:00 -08:00
Aleix Conchillo Flaqué
710baa5e17 Merge pull request #973 from pipecat-ai/aleix/simple-chatbot-clients
examples/simple-chatbot: move clients to client directory
2025-01-12 11:28:21 -08:00
Mert Sefa AKGUN
14e5419913 fix(gemini): prevent non-audio modality processing
Add an early return in the _handle_transcribe_model_audio method to
prevent unnecessary processing when the modalities setting is not set
to audio. This change ensures that audio transcription only occurs
when appropriate.
2025-01-12 22:17:10 +03:00
Mark Backman
8c953bac41 Merge pull request #966 from imsakg/main
fix(services): handle TranscriptionFrame separately in TTSService
2025-01-12 11:33:38 -05:00
Mark Backman
4c0861ce39 Some additional links and README changes 2025-01-12 09:27:23 -05:00
Mark Backman
12b1e1db9d Merge pull request #965 from pipecat-ai/mb/aws-add-session-token
Add optional aws_session_token for PollyTTSService
2025-01-12 09:13:03 -05:00
Mark Backman
53bfdfd83f Merge pull request #963 from pipecat-ai/mb/cleanup-examples
Update examples to align with latest best practices
2025-01-12 09:12:34 -05:00
Mark Backman
2a5593afea Merge pull request #968 from pipecat-ai/mb/readme-websocket
Update README.md
2025-01-12 09:12:19 -05:00
Aleix Conchillo Flaqué
a04a920e54 examples/simple-chatbot: move clients to client directory 2025-01-11 19:16:05 -08:00
Aleix Conchillo Flaqué
2ce6d92455 Merge pull request #959 from KevGTL/fix-livekit-transport
fix: push input audio frame only via push_audio_frame()
2025-01-11 19:03:35 -08:00
Mark Backman
1ecd5da219 Update README.md
Add websocket docs links to README.
2025-01-11 08:37:17 -05:00
sahil suman
e04da334d7 add support for openrouter.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-11 17:50:58 +05:30
Mert Sefa AKGUN
7ec351813c style(ai_services): fix import order with ruff 2025-01-11 13:04:26 +03:00
Mert Sefa AKGUN
df6c2fc403 fix(services): handle TranscriptionFrame separately in TTSService
Exclude TranscriptionFrame from text frame processing in TTSService by updating the type check condition. This resolves unintended processing behavior when handling different frame types.
2025-01-11 13:00:38 +03:00
Mark Backman
71e107725c Add optional aws_session_token for PollyTTSService 2025-01-10 19:33:47 -05:00
Mark Backman
4d0c11fcab Update examples to align with latest best practices 2025-01-10 15:07:06 -05:00
Mark Backman
a8ae79831e Merge pull request #921 from pipecat-ai/mb/playht-http
PlayHTHttpTTSService fixes
2025-01-10 13:26:45 -05:00
Mark Backman
86516d2415 PlayHTHttpTTSService fixes 2025-01-10 13:21:27 -05:00
Vanessa Pyne
5cd9dab14b Merge pull request #949 from imsakg/main
fix(examples): correct TTS service import and setup
2025-01-10 10:58:50 -06:00
Kwindla Hultman Kramer
a3e2e06975 Merge pull request #961 from pipecat-ai/khk/tiny-chatbot-readme-fix
fixed 404 in SimpleChatbot iOS example README
2025-01-10 08:45:05 -08:00
Kwindla Hultman Kramer
e7107b99c5 fixed 404 in SimpleChatbot iOS example README 2025-01-10 08:37:13 -08:00
Filipi Fuchter
aa1b8879ee Fixing ruff format 2025-01-10 13:21:51 -03:00
Mark Backman
6802459165 Merge pull request #956 from pipecat-ai/mb/tavus
Update the Tavus example and comment about using the PERSONA_ID
2025-01-10 11:18:05 -05:00
Filipi Fuchter
6719d1fddc Example with Gemini using google search to retrieve news. 2025-01-10 13:13:59 -03:00
kompfner
a798bf18f2 Merge pull request #955 from pipecat-ai/ios-simple-chatbot-mainactor-fixes
iOS SimpleChatbot @MainActor fixes
2025-01-10 09:37:02 -05:00
Kevin Oury
f9d0cca60f fix: push input audio frame only via push_audio_frame() 2025-01-10 15:02:38 +01:00
Mark Backman
cb22de0d13 Update the Tavus example and comment about using the PERSONA_ID 2025-01-10 08:01:00 -05:00
marcus-daily
7d161cc53b Setting target SDK to 35 2025-01-10 09:50:37 +00:00
marcus-daily
255abf46ef Updating Gradle and AGP 2025-01-10 09:50:37 +00:00
marcus-daily
27579bcb70 Fixing imports 2025-01-10 09:50:37 +00:00
marcus-daily
1295b64879 Updating library dependencies 2025-01-10 09:50:37 +00:00
marcus-daily
ca57670f65 Removing unnecessary drawables 2025-01-10 09:50:37 +00:00
marcus-daily
06d0a231b9 Android demo app for simple-chatbot example 2025-01-10 09:50:37 +00:00
Mert Sefa AKGUN
67af4e619b style(examples): fix ruff formatting in Gemini text example
Refactor `CartesiaTTSService` instantiation to comply with line
length requirements from the ruff linter.
2025-01-10 12:32:53 +03:00
Mert Sefa AKGUN
21c274944e Update examples/foundational/26d-gemini-multimodal-live-text.py
Co-authored-by: Vanessa Pyne <vipyne@gmail.com>
2025-01-10 12:28:13 +03:00
Paul Kompfner
3239249feb In the iOS SimpleChatbot, fix @MainActor-related warnings (which would be errors in Swift 6). The delegate methods aren't contractually guaranteed to run on the main thread, so we can't mark them as @MainActor. 2025-01-09 17:35:44 -05:00
Paul Kompfner
216979c377 Bump iOS SimpleChatbot's pipecat-client-ios-daily dependency to version 0.3.1 2025-01-09 16:22:26 -05:00
Filipi da Silva Fuchter
b9db53d3cd Merge pull request #952 from pipecat-ai/fixing_gemini_function_calling
Fixing GeminiMultimodalLiveLLMService function calling to work with pipecat-flows
2025-01-09 17:50:25 -03:00
Filipi Fuchter
58bfcc8370 Fixing GeminiMultimodalLiveLLMService function calling when using with pipecat-flows. 2025-01-09 12:22:37 -03:00
Mert Sefa AKGUN
6664c492ac feat(gemini): enable audio transcription in live text example
Add options to transcribe both user and model audio during the GeminiMultimodalLiveLLMService setup in the 26d-gemini-multimodal-live-text.py example.
2025-01-09 15:38:33 +03:00
Mert Sefa AKGUN
7634058f97 fix(examples): correct TTS service import and setup
- Update import to use CartesiaTTSService instead of CartesiaMultiLingualTTSService.
- Adjust GeminiMultimodalLiveLLMService setup to use set_model_modalities with TEXT modality.
2025-01-09 02:19:08 +03:00
Mark Backman
39c6446bdc Merge pull request #947 from pipecat-ai/mb/add-rime-set-voices
Add setters for model and voice to RimeHttpTTSService
2025-01-08 14:25:24 -05:00
Filipi da Silva Fuchter
2df7dfcc91 Merge pull request #943 from pipecat-ai/simple_chat_bot_ios
SimpleChatbot iOS app.
2025-01-08 16:17:39 -03:00
Mark Backman
c23c9e046c Add setters for model and voice to RimeHttpTTSService 2025-01-08 14:17:32 -05:00
Mark Backman
9dae753e8c Merge pull request #926 from imsakg/main
feat(gemini): add text handling to GeminiMultimodalLive
2025-01-08 13:42:17 -05:00
Mert Sefa AKGUN
40e9ee6d63 fix(examples): correct import order in Gemini example
- Move `CartesiaMultiLingualTTSService` import to maintain proper order.
- Reorganize `enum` import to adhere to styling standards.
2025-01-08 21:14:29 +03:00
Mert Sefa AKGUN
a342fe732e docs: update CHANGELOG with Gemini modalities and examples 2025-01-08 19:34:42 +03:00
Mert Sefa AKGUN
a729834482 refactor(gemini): reposition WebSocket connection code
Move WebSocket connection setup earlier in the function for better
organization and to prepare for subsequent configuration steps.
2025-01-08 19:29:36 +03:00
Mert Sefa AKGUN
94a6f1086e feat(gemini): change default modality to AUDIO
Modify the default modality in the `InputParams` class from TEXT to AUDIO
to better align with the intended use case for GeminiMultimodalLive
service.
2025-01-08 19:29:36 +03:00
Mert Sefa AKGUN
b42d3a8257 feat(gemini): add modality configuration for GeminiMultimodalLive
- Introduce `GeminiMultimodalModalities` enum for modality options.
- Add modality field to `InputParams`, defaulting to text.
- Simplify modality setup with `set_model_modalities` method.
- Refactor WebSocket configuration to support dynamic response modalities.
2025-01-08 19:29:36 +03:00
Mert Sefa AKGUN
12ae980abe feat(gemini): handle full text response in GeminiMultimodalLive
- Add a buffer to store bot text responses.
- Push a `LLMFullResponseStartFrame` when text begins.
- Clear the text buffer and send `LLMFullResponseEndFrame` after processing.
2025-01-08 19:29:36 +03:00
Mert Sefa AKGUN
cdb909958c feat(examples): add Gemini multimodal live text example
Introduce a new example `26d-gemini-multimodal-live-text.py` to
demonstrate the use of GeminiMultimodalLiveLLMService with text-only
responses. This example sets up a pipeline for audio input via DailyTransport,
processing with Gemini, and output via Cartesia TTS.
2025-01-08 19:29:35 +03:00
Mert Sefa AKGUN
c72c3025f6 feat(gemini): add configuration methods for response modalities
- Introduce `set_model_only_audio` and `set_model_only_text` methods
  to toggle between audio-only and text-only response modes in
  `GeminiMultimodalLiveLLMService`.
- Refactor configuration setup to a class attribute for improved
  reusability and maintenance.
- Remove redundant configuration instantiation in the WebSocket
  connection setup process.
2025-01-08 19:29:35 +03:00
Mert Sefa AKGUN
5cbd719780 feat(gemini): add text handling to GeminiMultimodalLive
- Introduce text attribute in Part class for handling string data.
- Incorporate text processing in GeminiMultimodalLiveLLMService to push TextFrame if text is present.
2025-01-08 19:29:35 +03:00
Filipi Fuchter
23d6290672 Removing not used class. 2025-01-08 12:05:04 -03:00
Filipi Fuchter
d4e7e11981 SimpleChatbot iOS app. 2025-01-08 12:00:11 -03:00
Mark Backman
8057fe3fcf Merge pull request #742 from Vaibhav159/vl_feature_websocket_fastapi_timeout
adding session_timeout param
2025-01-08 09:05:41 -05:00
Vaibhav159
3b446234a7 fix hyperlink 2025-01-08 10:54:27 +05:30
Vaibhav159
768487ffb3 final changelog 2025-01-08 10:53:32 +05:30
Vaibhav159
2da5620d10 adding changelog 2025-01-08 10:50:09 +05:30
Vaibhav159
af90d65b3b adding session timeout example in websocket-server example 2025-01-08 10:43:10 +05:30
Vaibhav159
c8569a7b67 Merge remote-tracking branch 'upstream/main' into vl_feature_websocket_fastapi_timeout 2025-01-08 10:21:36 +05:30
Vaibhav159
0ecd98c873 Merge branch 'main' into vl_feature_websocket_fastapi_timeout 2025-01-08 10:20:55 +05:30
Mark Backman
6f863ba2c6 Merge pull request #938 from jcbjoe/jg/optional-authentication-polly
Changed Polly authentication params to be optional
2025-01-07 15:37:23 -05:00
Mark Backman
602ca5ebe6 Merge pull request #939 from Vaibhav159/vl_adding_daily_room_properties
adding more daily room params
2025-01-07 14:33:59 -05:00
Vaibhav159
787ade41f3 adding missing doc string 2025-01-08 00:58:01 +05:30
Joe Garlick
bb767831d5 Added: Changelog entry 2025-01-07 19:05:02 +00:00
Mark Backman
bc25a771dc Merge pull request #935 from pipecat-ai/hush/modalUpdate
docs: update dependencies for modal demo
2025-01-07 13:57:46 -05:00
Vaibhav159
f37626f81d adding more daily room params 2025-01-07 21:38:05 +05:30
Mark Backman
9d54578e65 Merge pull request #934 from pipecat-ai/mb/bump-open-ai-version
Bump openai version to 1.59.0 for realtime and model updates
2025-01-07 08:29:45 -05:00
Joe Garlick
79afe7ec2a Changed: Polly authentication information to be optional 2025-01-07 11:43:57 +00:00
James Hush
2c1fd3c3cc docs: update dependencies for modal demo 2025-01-07 15:45:55 +08:00
Mark Backman
b0dd8e03a6 Bump openai version to 1.59.0 for realtime and model updates 2025-01-06 17:05:22 -05:00
Mark Backman
ee20e48ef8 Merge pull request #931 from pipecat-ai/mb/fix-openai-realtime-
Fix truncation timing of OpenAIRealtimeBetaLLMService
2025-01-06 16:25:09 -05:00
Mark Backman
12b5c5a646 Fix truncation timing of OpenAIRealtimeBetaLLMService 2025-01-06 15:37:58 -05:00
Mark Backman
7a021cc82d Merge pull request #929 from pipecat-ai/mb/add-google-journey-support
Added support for Google Journey TTS voices
2025-01-06 15:13:00 -05:00
Mark Backman
3e1ec4a8ee Added support for Google Journey TTS voices 2025-01-06 14:54:34 -05:00
Mark Backman
a1377b7f1a Merge pull request #924 from xtreme-sameer-vohra/patch-1
Update frames.py
2025-01-06 14:13:10 -05:00
Mark Backman
d6335886e2 Merge pull request #848 from Vaibhav159/vl_add_audio_and_chat_livekit_example
adding example for livekit audio and chat version
2025-01-06 13:27:38 -05:00
Vaibhav159
b3b7a5f023 adding 2025 license 2025-01-06 22:10:46 +05:30
Vaibhav159
5138017b57 ruff changes 2025-01-06 22:07:59 +05:30
Vaibhav159
87670067d7 adding changelog 2025-01-06 22:03:11 +05:30
Vaibhav159
656cd2859e Merge branch 'main' into vl_add_audio_and_chat_livekit_example 2025-01-06 21:57:43 +05:30
Mark Backman
15b2cc210c Merge pull request #927 from pipecat-ai/mb/update-copyright
Update copyright to 2025
2025-01-06 10:33:04 -05:00
Mark Backman
4667624b60 Update copyright to 2025 2025-01-06 10:19:37 -05:00
Sameer Vohra
d07ba80572 Update frames.py
fix minor typo in docs
2025-01-05 22:57:54 -05:00
Aleix Conchillo Flaqué
386ba61483 Merge pull request #909 from pipecat-ai/aleix/pipecat-0.0.52
update CHANGELOG for 0.0.52
2024-12-24 08:16:05 -08:00
Aleix Conchillo Flaqué
e9d275f270 update CHANGELOG for 0.0.52 2024-12-23 19:52:34 -08:00
Aleix Conchillo Flaqué
3a4994370c update README 2024-12-23 19:20:23 -08:00
Aleix Conchillo Flaqué
6125ea882d update README 2024-12-23 19:19:39 -08:00
Aleix Conchillo Flaqué
0a1ce1bb63 update CHANGELOG 2024-12-23 19:13:59 -08:00
Kwindla Hultman Kramer
ab3bcde5f7 Merge pull request #907 from pipecat-ai/khk/gemini-20241221
Gemini unary API fixes and natural conversation demo
2024-12-23 17:34:57 -08:00
Kwindla Hultman Kramer
1368d3db5c revert elevenlabs example changes 2024-12-23 17:33:59 -08:00
Aleix Conchillo Flaqué
cd7dec7391 Merge pull request #906 from pipecat-ai/aleix/fix-duplicate-base-output-frames
transports(base_output): fix duplicate push_frame()
2024-12-23 06:12:31 -08:00
Kwindla Hultman Kramer
a5e985094b remove stray line 2024-12-22 19:45:57 -08:00
Aleix Conchillo Flaqué
c04c69df95 transports(base_output): fix duplicate push_frame() 2024-12-22 14:43:38 -08:00
Aleix Conchillo Flaqué
9c105e25ac Merge pull request #905 from pipecat-ai/aleix/daily-python-0.14.2
pyproject: update daily-python to 0.14.2
2024-12-22 13:03:25 -08:00
Aleix Conchillo Flaqué
6901c4fa57 pyproject: update daily-python to 0.14.2 2024-12-22 12:30:17 -08:00
Mark Backman
469c13c07e Merge pull request #903 from pipecat-ai/mb/send-prebuilt-chat
Add the ability to send_prebuilt_chat_message when using the DailyTra…
2024-12-22 14:33:50 -05:00
Mark Backman
46871ae686 Merge pull request #899 from pipecat-ai/mb/add-fish-audio
Add Fish Audio TTS service
2024-12-22 14:26:59 -05:00
Kwindla Hultman Kramer
ab5df1a236 feature complete gemini audio, transcription, and phrase endpointing demo 2024-12-22 11:19:02 -08:00
Kwindla Hultman Kramer
f5f0de00e4 still some cleanup to do 2024-12-21 23:04:00 -08:00
Kwindla Hultman Kramer
f3dd35bfd9 working but needs cleanup 2024-12-21 22:18:56 -08:00
Kwindla Hultman Kramer
53a5e63990 function calling dead-end 2024-12-21 18:10:25 -08:00
Kwindla Hultman Kramer
d435a6a6d6 fixes to audio buffer 2024-12-21 16:22:53 -08:00
Kwindla Hultman Kramer
59240c7b96 delay gemini multimodal live websocket connect 2024-12-21 14:36:37 -08:00
Mark Backman
6c11753985 Add the ability to send_prebuilt_chat_message when using the DailyTransport 2024-12-21 14:04:46 -05:00
Mark Backman
6fabb7e7d5 Fix metrics calculations 2024-12-21 13:25:43 -05:00
Mark Backman
bce218915e Add Fish to the README 2024-12-21 12:54:07 -05:00
Mark Backman
627c91f4a6 Flush the audio 2024-12-21 12:52:28 -05:00
Mark Backman
dac4468ca1 Add Fish Audio TTS service 2024-12-21 12:42:56 -05:00
Mark Backman
503eddf7d6 Merge pull request #897 from pipecat-ai/mb/update-playht
Update PlayHT to use the latest Websocket connection endpoint
2024-12-20 20:31:41 -05:00
Aleix Conchillo Flaqué
1a0f6f2a21 Merge pull request #898 from pipecat-ai/aleix/reset-input-queue-flag-if-interruption
frame_processor: reset input queue flag with interruptions
2024-12-20 13:58:12 -08:00
Aleix Conchillo Flaqué
43759295cc frame_processor: reset input queue flag with interruptions 2024-12-20 09:33:20 -08:00
Mark Backman
900b95eb92 Update PlayHT to use the latest Websocket connection endpoint 2024-12-20 10:44:47 -05:00
marcus-daily
41d07692ca Fix import order 2024-12-20 14:30:38 +00:00
marcus-daily
dcf6b6e120 Add an RTVIProcessor to the simple-chatbot pipeline 2024-12-20 14:30:38 +00:00
Mark Backman
99dba3b6b9 Merge pull request #893 from pipecat-ai/mb/changelog-11L
Added an `auto_mode` input parameter to `ElevenLabsTTSService`
2024-12-19 21:38:06 -05:00
Aleix Conchillo Flaqué
4547609ffb examples(01a): remove unused import 2024-12-19 17:49:27 -08:00
Mark Backman
9554804a49 Update 11L default model, allow language to be used by more models 2024-12-19 20:33:58 -05:00
Mark Backman
656cbc35e1 Make auto_mode an input parameter for ElevenLabsTTSService; add changelog entry 2024-12-19 20:33:56 -05:00
Aleix Conchillo Flaqué
6f7c4dd998 Merge pull request #894 from pipecat-ai/aleix/daily-python-0.14.0
transports(daily): update to daily-python 0.14.0
2024-12-19 17:14:31 -08:00
Aleix Conchillo Flaqué
8b496f8c6f transports(daily): daily-python 0.14.0 (SIP transfer/refer, DTMF) 2024-12-19 17:08:29 -08:00
Aleix Conchillo Flaqué
15047f5f0a Merge pull request #885 from pipecat-ai/aleix/parallelpipeline-wait-for-slowest-endframe
pipeline(parallel): wait for slowest endframe
2024-12-19 15:18:22 -08:00
Aleix Conchillo Flaqué
e08c24dc41 Merge pull request #883 from pipecat-ai/aleix/base-output-transport-avoid-pushing-endframe
transport(base output): avoid pushing EndFrame twice
2024-12-19 11:26:31 -08:00
Aleix Conchillo Flaqué
5341739ece transport(base output): avoid pushing EndFrame twice 2024-12-19 11:19:49 -08:00
Mark Backman
5b0fc3fa15 Merge pull request #891 from louisjoecodes/louis/flush-shorter-messages-elevenlabs
feat: set auto_mode=true - ElevenLabs tts WSS
2024-12-19 12:08:04 -05:00
Louis Jordan
b7b8e59e9e feat: set auto_mode=true - ElevenLabs tts WSS 2024-12-19 16:57:17 +00:00
Mark Backman
6e0d3aef32 Merge pull request #860 from pipecat-ai/mb/transcription
Add a TranscriptProcessor and new frames
2024-12-19 08:15:53 -05:00
Mark Backman
1ccc84dd7a Merge pull request #888 from pipecat-ai/mb/add-cerebras
Add CerebrasLLMService and foundational example
2024-12-19 08:14:53 -05:00
Mark Backman
c9dd906057 Tailor chat completion inputs to Cerebras API 2024-12-19 08:10:33 -05:00
Mark Backman
4f093f11db Add CerebrasLLMService and foundational example 2024-12-19 08:10:31 -05:00
Mark Backman
887a9170b2 Merge pull request #889 from pipecat-ai/mb/openai-realtime-model
Add model parameter to OpenAI realtime service constructor, update de…
2024-12-19 08:08:52 -05:00
Aleix Conchillo Flaqué
f2e191855a Merge pull request #881 from pipecat-ai/aleix/langchain-updates
pyproject: update langchain to 0.3.12
2024-12-18 19:42:39 -08:00
Aleix Conchillo Flaqué
78b90e9591 Merge pull request #884 from pipecat-ai/aleix/filters-handle-endframe
processors(filters): allow passing EndFrame
2024-12-18 19:35:56 -08:00
Aleix Conchillo Flaqué
17decee788 Merge pull request #882 from pipecat-ai/aleix/stop-transport-parent-first
transports: call parent stop() before disconnecting
2024-12-18 19:35:39 -08:00
Aleix Conchillo Flaqué
f89014d100 pyproject: update langchain to 0.3.12 2024-12-18 19:34:49 -08:00
Mark Backman
3b3e22fe7c Add model parameter to OpenAI realtime service constructor, update default model 2024-12-18 18:12:51 -05:00
Aleix Conchillo Flaqué
0df0194cc1 Merge pull request #886 from pipecat-ai/aleix/koala-noise-suppression
audio(koala): add new audio filter KoalaFilter
2024-12-18 14:02:04 -08:00
Mark Backman
8a7a61914e Code review feedback 2024-12-17 22:35:13 -05:00
Mark Backman
1117c21483 Refactor TranscriptProcessor into user and assistant processors 2024-12-17 22:34:22 -05:00
Mark Backman
4211664a77 TranscriptProcessor to handle simple and list content 2024-12-17 22:34:03 -05:00
Mark Backman
1f8a217cd1 Code review changes 2024-12-17 22:34:02 -05:00
Mark Backman
b5bd662fe1 Add changelog and rename examples 2024-12-17 22:33:39 -05:00
Mark Backman
dd2703317a Add timestamp frames and include timestamps in the transcription event and frame 2024-12-17 22:31:15 -05:00
Mark Backman
77aeda36eb Update OpenAI's from_standard_message to convert back to OpenAI's simple format 2024-12-17 22:31:15 -05:00
Mark Backman
51b235df4b Add docstrings for Google and Anthropic's to_standard_messages and from_standard_message functions 2024-12-17 22:31:15 -05:00
Mark Backman
4f2aee5fba Update OpenAI's to_standard_messages to return the verbose message format 2024-12-17 22:31:15 -05:00
Mark Backman
55879bf365 Add TranscriptionProcessor 2024-12-17 22:31:15 -05:00
Aleix Conchillo Flaqué
7322badbe7 audio(koala): add new audio filter KoalaFilter 2024-12-17 18:45:10 -08:00
Aleix Conchillo Flaqué
42bea578e8 pipeline(parallel): wait for slowest endframe
If we are sending an EndFrame and a ParallelPipeline has multiple pipelines we
want to wait before pushing the EndFrame downstream until the slowest pipeline
is finished. Otherwise, we could be disconnecting from the transport too early.
2024-12-17 17:05:11 -08:00
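Note on 42bea578e8: a minimal sketch, assuming each parallel branch can be modeled as a coroutine that returns once it has finished its own work. The parent forwards the end marker only after every branch, including the slowest, has completed. `branch`, `run_parallel`, and the string `"EndFrame"` are hypothetical stand-ins and not the actual `ParallelPipeline` code.

```python
import asyncio

END = "EndFrame"  # stand-in for the real end-of-stream frame type


async def branch(name, delay, log):
    # Simulate a sub-pipeline that needs `delay` seconds to drain its frames.
    await asyncio.sleep(delay)
    log.append(f"{name} finished")


async def run_parallel(delays):
    log = []
    # gather() returns only when all branches are done, so the slowest
    # branch decides when the end marker may be pushed downstream.
    await asyncio.gather(
        *(branch(f"branch-{i}", d, log) for i, d in enumerate(delays))
    )
    log.append(f"push {END} downstream")
    return log


def demo():
    return asyncio.run(run_parallel([0.03, 0.01, 0.02]))
```

In this sketch the last log entry is always the downstream push, regardless of the order in which the branches finish.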
Aleix Conchillo Flaqué
2dfdceb9e6 processors(filters): allow passing EndFrame 2024-12-17 16:22:19 -08:00
Aleix Conchillo Flaqué
5bfcac1f5c transports: call parent stop() before disconnecting
This rolls back a previous change https://github.com/pipecat-ai/pipecat/pull/855
which was trying to fix an issue in the wrong way.

The reasoning behind this fix is that the parent class might be sending audio or
messages (through the subclass) and if we disconnect before all the data is sent
we will run into incomplete audio or even errors. Therefore, we first make sure
the parent tasks stop and then it will be safe to disconnect.
2024-12-17 16:02:33 -08:00
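Note on 5bfcac1f5c: a minimal sketch of the ordering argued for above, assuming a parent class that drains a queue through the subclass's `write()` method. `BaseOutput`, `Transport`, and `write` are hypothetical names for illustration and are not the actual transport classes.

```python
import asyncio


class BaseOutput:
    """Hypothetical parent: owns a task that sends queued data via write()."""

    def __init__(self):
        self.queue = asyncio.Queue()
        self.task = None

    async def start(self):
        self.task = asyncio.create_task(self._drain())

    async def _drain(self):
        while True:
            item = await self.queue.get()
            if item is None:  # sentinel: nothing more to send
                break
            await self.write(item)

    async def write(self, item):
        raise NotImplementedError

    async def stop(self):
        # Flush everything that is still queued, then end the task.
        await self.queue.put(None)
        await self.task


class Transport(BaseOutput):
    """Hypothetical subclass: owns the connection that the parent writes through."""

    def __init__(self):
        super().__init__()
        self.connected = True
        self.sent = []

    async def write(self, item):
        if not self.connected:
            raise RuntimeError("write after disconnect")
        await asyncio.sleep(0)  # yield, as real network I/O would
        self.sent.append(item)

    async def stop(self):
        await super().stop()    # 1. let the parent finish sending
        self.connected = False  # 2. only then disconnect


def demo():
    async def main():
        transport = Transport()
        await transport.start()
        for chunk in ("a", "b", "c"):
            await transport.queue.put(chunk)
        await transport.stop()
        return transport.sent, transport.connected

    return asyncio.run(main())
```

Swapping the two lines in `Transport.stop()` would make the queued chunks fail with "write after disconnect", which is the incomplete-audio case the commit message describes.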
Aleix Conchillo Flaqué
fb9f72d38b Merge pull request #880 from pipecat-ai/aleix/ruff-check-import-linter
ruff check import linter
2024-12-17 14:14:47 -08:00
Aleix Conchillo Flaqué
146a341a38 Merge pull request #879 from Vaibhav159/vl_add_readme_for_ruff_formatter_in_pycharm
updating readme to support auto-formatting of ruff in pycharm
2024-12-17 11:49:01 -08:00
Aleix Conchillo Flaqué
b9ca667d31 pyproject: use tool.ruff.lint sections 2024-12-17 11:40:43 -08:00
Aleix Conchillo Flaqué
5c57cccea3 github: run ruff check import linter 2024-12-17 11:29:28 -08:00
Aleix Conchillo Flaqué
17162258a2 fix ruff linter import organization 2024-12-17 11:28:58 -08:00
Aleix Conchillo Flaqué
da3fb98101 examples(storytelling-chatbot): update dependencies 2024-12-17 11:24:50 -08:00
Aleix Conchillo Flaqué
6244124d14 README: added Emacs import re-organization with Ruff 2024-12-17 11:20:18 -08:00
Vaibhav159
53049adeea removing --config flag 2024-12-18 00:47:00 +05:30
Vaibhav159
4208d2d7c4 updating readme to support auto-formatting of ruff in pycharm 2024-12-17 23:38:36 +05:30
Mark Backman
9f7f74e4d8 Merge pull request #869 from Vaibhav159/vl_fixing_deepgram_language_bug_#868
fixing [#868] bug where deepgram client fails due to language
2024-12-17 12:50:57 -05:00
Vaibhav159
f14d32d09e fixing ruff issue 2024-12-17 23:11:18 +05:30
Vaibhav159
7351e281e2 ruff change 2024-12-17 22:21:56 +05:30
Vaibhav159
b94b10f7d6 added change log 2024-12-17 22:11:52 +05:30
Vaibhav159
1cc90eb1a3 Merge branch 'main' into vl_fixing_deepgram_language_bug_#868 2024-12-17 22:09:30 +05:30
Vaibhav159
5f7d28bb05 adding type check and value check 2024-12-17 22:07:35 +05:30
Mark Backman
204a08ab8f Merge pull request #877 from pipecat-ai/mb/grok-function-calling-fix
Add custom assistant context aggregator for Grok due to content requi…
2024-12-17 10:51:19 -05:00
Aleix Conchillo Flaqué
141b0a6560 sentry: fix formatting 2024-12-17 07:14:31 -08:00
Mark Backman
ca086a856f Add custom assistant context aggregator for Grok due to content requirement in function calling 2024-12-17 09:11:21 -05:00
Aleix Conchillo Flaqué
fe0a7d07bd update CHANGELOG 2024-12-16 21:02:38 -08:00
Aleix Conchillo Flaqué
79eb29d614 Merge pull request #875 from pipecat-ai/aleix/update-dependencies
update dependencies
2024-12-16 20:58:30 -08:00
Aleix Conchillo Flaqué
da15c83bab fix ruff formatting 2024-12-16 20:52:40 -08:00
Aleix Conchillo Flaqué
d6bac77b3c pyproject: add audioop-lts for python 3.13 2024-12-16 20:50:25 -08:00
Aleix Conchillo Flaqué
7faa4eb295 update dev-requirements 2024-12-16 20:50:25 -08:00
Aleix Conchillo Flaqué
0e31413851 pyproject: update numpy, pydantic, loguru 2024-12-16 19:20:34 -08:00
Aleix Conchillo Flaqué
16948b251d services: fix infinite websocket-based TTS services retries
Fixes #871
2024-12-16 16:36:44 -08:00
Mark Backman
f3112a8638 Merge pull request #866 from pipecat-ai/mb/readme-links
Fix a bunch of README docs links
2024-12-16 10:51:01 -05:00
Mark Backman
0293d40e4e Merge pull request #870 from pipecat-ai/mb/dotenv
Add python-dotenv to dev-requirements.txt
2024-12-16 10:50:46 -05:00
Mark Backman
64038442ed Add python-dotenv to dev-requirements.txt 2024-12-16 09:23:12 -05:00
Vaibhav159
facc280599 fixing [#868] bug where deepgram client fails due to language 2024-12-16 17:47:50 +05:30
Mark Backman
f90cbe8086 Fix a bunch of README docs links 2024-12-15 14:30:20 -05:00
Mark Backman
09a611d44b Merge pull request #856 from pipecat-ai/mb/daily-rest-helpers
Remove default 5 min exp time for created rooms, add docstrings
2024-12-13 12:08:58 -05:00
Mark Backman
16d7fb2c4a Remove default 5 min exp time for created rooms, add docstrings 2024-12-13 12:02:26 -05:00
Aleix Conchillo Flaqué
643160c960 Merge pull request #858 from pipecat-ai/aleix/fastpitch-timeout
riva: make sure we don't block on fastpitch
2024-12-13 08:20:38 -08:00
Aleix Conchillo Flaqué
aac907aadb riva: make sure we don't block on fastpitch 2024-12-13 07:32:51 -08:00
Aleix Conchillo Flaqué
8f24ca4e58 Merge pull request #857 from pipecat-ai/aleix/fix-riva-tts-audio-stuttering
riva: fix FastPitchTTSService audio stuttering
2024-12-12 22:20:00 -08:00
Aleix Conchillo Flaqué
420ce16807 riva: fix FastPitchTTSService audio stuttering 2024-12-12 22:15:44 -08:00
Aleix Conchillo Flaqué
2b8c35c681 Merge pull request #855 from pipecat-ai/aleix/transport-services-disconnect-fixes
transports(services): disconnect client first
2024-12-12 19:40:03 -08:00
Mark Backman
3d96369193 Merge pull request #852 from pipecat-ai/mb/readme-docs-badge
Add docs badge to README
2024-12-12 22:21:41 -05:00
Aleix Conchillo Flaqué
d44b36a07c Merge pull request #854 from pipecat-ai/aleix/aiservice-add-missing-process-frame
AIService: add missing super().process_frame()
2024-12-12 19:10:21 -08:00
Aleix Conchillo Flaqué
ccc96994e9 pyproject: update livekit 2024-12-12 19:09:36 -08:00
Aleix Conchillo Flaqué
337d421338 transports: disconnect client first 2024-12-12 19:09:06 -08:00
Aleix Conchillo Flaqué
752720b4d5 AIService: add missing super().process_frame() 2024-12-12 17:25:38 -08:00
Aleix Conchillo Flaqué
f8e69cfa00 Merge pull request #853 from pipecat-ai/revert-849-aleix/no-need-for-super-process-frame
Revert "no longer necessary to call super().process_frame(frame, direction)"
2024-12-12 17:21:20 -08:00
Aleix Conchillo Flaqué
6d11911d83 Revert "no longer necessary to call super().process_frame(frame, direction)" 2024-12-12 17:03:40 -08:00
Mark Backman
ec6e71c8ea Add docs badge to README 2024-12-12 18:08:24 -05:00
Aleix Conchillo Flaqué
10f854aeba Merge pull request #846 from pipecat-ai/aleix/base-output-transport-audio-sync
transport(output): fix non-audio frames sync after audio frames
2024-12-12 14:29:42 -08:00
Aleix Conchillo Flaqué
d8caf007b0 Merge pull request #849 from pipecat-ai/aleix/no-need-for-super-process-frame
no longer necessary to call super().process_frame(frame, direction)
2024-12-12 14:29:10 -08:00
Mark Backman
26ea64ef12 Merge pull request #850 from pipecat-ai/mb/fix-docs-builds
Fix docs generation build issues
2024-12-12 17:27:00 -05:00
Mark Backman
19c178ebc7 Fix docs generation build issues 2024-12-12 17:18:04 -05:00
Aleix Conchillo Flaqué
3c3fd67d96 no longer necessary to call super().process_frame(frame, direction) 2024-12-12 13:03:41 -08:00
Mark Backman
7bbc0ee8df Merge pull request #845 from pipecat-ai/mb/more-docs-updates
Docs auto-gen improvements
2024-12-12 15:42:34 -05:00
Mark Backman
67804edce6 Remove formats from .readthedocs.yaml 2024-12-12 15:41:11 -05:00
Mark Backman
ec082d0888 Remove deprecated VAD module 2024-12-12 15:32:38 -05:00
Mark Backman
8631d71d5a Fix more missing docs 2024-12-12 15:16:37 -05:00
Vaibhav159
62fc95300b adding livekit audio and chat version 2024-12-13 01:09:47 +05:30
Aleix Conchillo Flaqué
db7eaed980 transport(output): fix non-audio frames sync after audio frames 2024-12-12 10:56:02 -08:00
Mark Backman
44c5220104 Update README 2024-12-12 13:28:05 -05:00
Mark Backman
276fd86ecb More fixes for missing packages 2024-12-12 13:25:13 -05:00
Mark Backman
2de0737056 Merge pull request #844 from pipecat-ai/cb-gemini-example-fix
Update requirements.txt for simple-chatbot
2024-12-12 11:18:58 -05:00
Mark Backman
b5d5a0e923 Add special cases for displaying some names 2024-12-12 11:15:36 -05:00
Mark Backman
f3ed12c30b Clean up module and package display names 2024-12-12 11:11:53 -05:00
Mark Backman
e14399727b Add README and build script for local testing 2024-12-12 11:06:53 -05:00
Mark Backman
414dcf9810 Improve TOC in sidebar, fix missing services 2024-12-12 11:06:09 -05:00
chadbailey59
88d530e840 Update requirements.txt for simple-chatbot
The gemini example doesn't actually work from a fresh install, because the requirements.txt file doesn't include google :)
2024-12-12 09:31:15 -06:00
Aleix Conchillo Flaqué
af821d8e95 Merge pull request #841 from pipecat-ai/aleix/aws-to-polly
polly: renamed AWSTTSService to PollyTTSService
2024-12-11 18:13:02 -08:00
Aleix Conchillo Flaqué
133e1aff6c polly: renamed AWSTTSService to PollyTTSService 2024-12-11 17:56:43 -08:00
Aleix Conchillo Flaqué
def415f476 Merge pull request #840 from pipecat-ai/aleix/11labs-playht-more-languages
tts: support more languages in playht and elevenlabs
2024-12-11 14:58:03 -08:00
Aleix Conchillo Flaqué
a34d16dabe tts: support more languages in playht and elevenlabs 2024-12-11 14:53:24 -08:00
Mark Backman
ec7260b237 Merge pull request #839 from pipecat-ai/mb/bump-versions
Bump openai and aiohttp package versions
2024-12-11 17:06:15 -05:00
Mark Backman
96c6c71d5b Bump openai and aiohttp package versions 2024-12-11 16:48:36 -05:00
Aleix Conchillo Flaqué
8e140b2be6 Merge pull request #838 from pipecat-ai/aleix/prepare-0.0.50
update CHANGELOG for 0.0.50
2024-12-11 11:49:15 -08:00
Aleix Conchillo Flaqué
a70c785b2e update CHANGELOG for 0.0.50 2024-12-11 11:33:13 -08:00
Aleix Conchillo Flaqué
f1d3c5e9ad Merge pull request #837 from pipecat-ai/aleix/update-protobuf-to-5.29.1
pyproject: update protobuf to 5.29.1
2024-12-11 11:31:49 -08:00
Aleix Conchillo Flaqué
346329ba73 pyproject: update protobuf to 5.29.1 2024-12-11 11:29:48 -08:00
Aleix Conchillo Flaqué
6089d4255c Merge pull request #836 from pipecat-ai/aleix/moondream-studypal-fixes
examples: fixes for moondream-chatbot and studypal
2024-12-11 11:16:09 -08:00
Aleix Conchillo Flaqué
cff9bb6068 Merge pull request #835 from pipecat-ai/aleix/even-more-parallel-pipeline-fixes
parallel_pipeline: fix system frames and parallel pipelines again
2024-12-11 11:15:59 -08:00
Aleix Conchillo Flaqué
fdefdc9d68 Merge pull request #834 from pipecat-ai/aleix/transcription-are-text
frames: transcriptions should be TextFrames as before
2024-12-11 11:15:43 -08:00
Aleix Conchillo Flaqué
2dd418a38d parallel_pipeline: fix system frames and parallel pipelines again
The previous fixes didn't take into account that system frames can be generated
inside the internal pipelines.
2024-12-11 10:55:04 -08:00
Aleix Conchillo Flaqué
42f5ec20f6 examples: fixes for moondream-chatbot and studypal 2024-12-11 10:46:38 -08:00
Aleix Conchillo Flaqué
5b5125b74c frames: transcriptions should be TextFrames as before 2024-12-11 10:42:38 -08:00
Mark Backman
be4df5f713 Merge pull request #833 from pipecat-ai/mb/update-changelog-for-gemini
Update the CHANGELOG and README for Gemini Multimodal Live
2024-12-11 11:41:42 -05:00
Mark Backman
5418cdc4d1 Update the CHANGELOG and README for Gemini Multimodal Live 2024-12-11 11:40:16 -05:00
Mark Backman
6c9f5a81dc Merge pull request #832 from pipecat-ai/khk/gemini-live-function-calling
Gemini Multimodal Live function calling example
2024-12-11 11:39:19 -05:00
Mark Backman
027e360436 Fix demo numbering and prompt the bot to say hi in 26b 2024-12-11 11:36:38 -05:00
Kwindla Hultman Kramer
c219172266 Gemini Multimodal Live function calling example 2024-12-11 08:29:09 -08:00
Mark Backman
7b040be209 Merge pull request #830 from pipecat-ai/khk/gemini-multimodal-live
Gemini Multimodal Live API service
2024-12-11 11:25:55 -05:00
Mark Backman
0d74531f36 Minor changes to demos 2024-12-11 11:23:59 -05:00
Mark Backman
3341c4f608 Merge pull request #831 from pipecat-ai/mb/gemini-simple-chatbot
Gemini updates to the simple-chatbot demo
2024-12-11 11:15:15 -05:00
Mark Backman
1e45e55528 Add copyright block to audio_transcriber 2024-12-11 11:06:48 -05:00
Mark Backman
8086a94e49 Renumber foundational demos 2024-12-11 10:56:51 -05:00
Kwindla Hultman Kramer
81895f4a5c Gemini Multimodal Live API service 2024-12-11 07:38:23 -08:00
Mark Backman
2846d6f461 Update READMEs and comment files 2024-12-11 00:06:35 -05:00
Mark Backman
14f309ce2b Add Gemini Live bot file 2024-12-10 22:25:17 -05:00
Aleix Conchillo Flaqué
62ec2f5d1e Merge pull request #814 from pipecat-ai/aleix/simli-updates
minor simli updates
2024-12-10 18:48:29 -08:00
Aleix Conchillo Flaqué
4f9a4ebce2 Merge pull request #820 from pipecat-ai/aleix/more-parallelpipeline-fixes
parallel_pipeline: fix system frames again
2024-12-10 18:43:34 -08:00
Aleix Conchillo Flaqué
5b478a5c7a add SimliVideoService to CHANGELOG 2024-12-10 18:42:26 -08:00
Aleix Conchillo Flaqué
87c1f2bcce services(simli): remove ready flag, events vs sleep, handle CancelledError 2024-12-10 18:42:12 -08:00
Aleix Conchillo Flaqué
b85072637f examples(26-simli-layer): use room returned by configure() 2024-12-10 18:42:12 -08:00
Aleix Conchillo Flaqué
ffe1e023e7 Merge pull request #819 from pipecat-ai/aleix/fix-openaillmcontext-from-image-frame
fix OpenAILLMContext from image frame
2024-12-10 18:39:55 -08:00
Aleix Conchillo Flaqué
9a358b2e86 Merge pull request #824 from pipecat-ai/aleix/openpipe-use-openai-base-service
services(openpipe): use OpenAILLMService to get access to aggregators
2024-12-10 18:34:46 -08:00
Aleix Conchillo Flaqué
b034c6e247 Merge pull request #821 from pipecat-ai/aleix/update-pyproject
pyproject: update onnxruntime, whisper and azure
2024-12-10 18:34:27 -08:00
Aleix Conchillo Flaqué
c7ca0eea0f Merge pull request #823 from pipecat-ai/aleix/fix-15a-switch-languages
examples: fix 15a-switch-languages pipeline
2024-12-10 18:34:13 -08:00
Aleix Conchillo Flaqué
29d931cdcd Merge pull request #822 from pipecat-ai/aleix/fix-11-sound-effects
examples: fix 11-sound-effects
2024-12-10 18:33:53 -08:00
Aleix Conchillo Flaqué
ecf0c61af9 services(openpipe): use OpenAILLMService to get access to aggregators 2024-12-10 18:29:03 -08:00
Aleix Conchillo Flaqué
67e8252d76 examples: fix 15a-switch-languages pipeline 2024-12-10 18:27:49 -08:00
Aleix Conchillo Flaqué
775aa9493e examples: fix 11-sound-effects 2024-12-10 18:25:43 -08:00
Aleix Conchillo Flaqué
c446f91d4a pyproject: update onnxruntime, whisper and azure 2024-12-10 18:16:27 -08:00
Aleix Conchillo Flaqué
7b6bbc29ed parallel_pipeline: fix system frames again 2024-12-10 18:12:33 -08:00
Aleix Conchillo Flaqué
9e7ecccf1e google: fix VisionImageRawFrame context 2024-12-10 17:39:52 -08:00
Aleix Conchillo Flaqué
a618bd3fa6 openai: remove from_image_frame() and use add_image_frame_message() 2024-12-10 17:39:52 -08:00
Aleix Conchillo Flaqué
246c825a82 examples: rename 07p-interruptible-google-audio-in to 07s 2024-12-10 17:07:17 -08:00
Aleix Conchillo Flaqué
9e6fabf110 Merge pull request #818 from pipecat-ai/aleix/fastpitch-rename
riva: rename FastpitchTTSService to FastPitchTTSService
2024-12-10 13:36:38 -08:00
Aleix Conchillo Flaqué
d2dabe4358 riva: rename FastpitchTTSService to FastPitchTTSService 2024-12-10 13:30:43 -08:00
Vanessa Pyne
1db624575f Merge pull request #795 from pipecat-ai/vp-nvidia-riva
[WIP] add nvidia riva
2024-12-10 15:17:26 -06:00
vipyne
a49b4e450b services(riva): check service config before running tts 2024-12-10 15:15:46 -06:00
vipyne
9211a37efc services(riva): convention tweaks 2024-12-10 15:15:46 -06:00
vipyne
3f9d39329c services(riva): model -> function_id 2024-12-10 15:15:46 -06:00
vipyne
5a98ae6380 chore: update test-requirements 2024-12-10 15:15:46 -06:00
vipyne
8caad15e9b examples trivial update 2024-12-10 15:15:46 -06:00
vipyne
9222d9f721 services(riva): cleanup 2024-12-10 15:15:46 -06:00
vipyne
5a467a30a3 add nvidia riva - fastpitch 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
d74e728332 pyproject: update google-cloud-texttospeech to 2.21.1 2024-12-10 15:15:46 -06:00
vipyne
8a9fdaf441 services(riva): cleanup 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
4b55c73fbe services(riva): make FastpitchTTSService asyncio 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
7e407e5548 services(riva): first working version of ParakeetSTTService 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
ce94421c90 pyproject: add riva option and update protobuf and playht 2024-12-10 15:15:46 -06:00
vipyne
49ce3dcb27 add nvidia riva - fastpitch 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
6ba2dea6f0 Merge pull request #812 from zzz-heygen/zzz/fix_serializer_backward_compat
fix: make ProtobufFrameSerializer backwards compatible
2024-12-10 13:11:09 -08:00
Aleix Conchillo Flaqué
9ac34ac371 Merge pull request #816 from pipecat-ai/aleix/rtvi-version-update
rtvi: update protocol version to 0.3.0
2024-12-10 11:52:28 -08:00
Aleix Conchillo Flaqué
a8644d2129 Merge pull request #815 from pipecat-ai/aleix/identity-filter
processors(filters): add IdentityFilter
2024-12-10 11:09:20 -08:00
Aleix Conchillo Flaqué
3bf15476a4 processors(filters): add IdentityFilter 2024-12-10 11:01:59 -08:00
Aleix Conchillo Flaqué
acb3e21432 rtvi: update protocol version to 0.3.0 2024-12-10 10:57:42 -08:00
Mark Backman
8c9c81d84b Merge pull request #810 from pipecat-ai/mb/read-the-docs
Changes for Read the Docs hosting
2024-12-10 12:48:26 -05:00
Aleix Conchillo Flaqué
e51e2f781d Merge pull request #765 from simliai/simli
Add Simli Service
2024-12-10 09:23:06 -08:00
Dan Goodman
af6f5ecc86 customize Anthropic client via kwargs, also bumps default model version (#813)
* customize Anthropic client via kwargs

* bump default model
2024-12-10 09:13:44 -08:00
antonyesk601
81a18633ca Remove duplicate frame push if simli connection isn't ready 2024-12-10 10:18:31 +00:00
antonyesk601
397342d0b9 Initialize simli_client on StartFrame; Follow variable naming scheme; Use logger instead of print statements; 2024-12-10 10:11:07 +00:00
zzz
d6b3a50108 x 2024-12-10 07:50:50 +00:00
Mark Backman
66b08161f1 Changes for Read the Docs hosting 2024-12-10 00:54:21 -05:00
Mark Backman
e7fa1cacce Merge pull request #800 from pipecat-ai/mb/autogen-docs
Auto-generate API reference docs
2024-12-09 22:05:08 -05:00
Mark Backman
2d3864ee09 Move API docs generation to docs/api 2024-12-09 20:44:10 -05:00
Aleix Conchillo Flaqué
0287f06379 Merge pull request #809 from pipecat-ai/aleix/parallel-pipeline-fix-system-frames
fix system frames parallel pipeline
2024-12-09 15:48:27 -08:00
Mark Backman
681c8ffb1d Merge pull request #807 from pipecat-ai/mb/stt-mute-strategy
Add new STT mute strategy, accept a set of strategies
2024-12-09 18:34:30 -05:00
Mark Backman
676643d558 Code review fixes 2024-12-09 18:27:07 -05:00
Mark Backman
0c4cbc2615 Push FunctionCall Frames upstream and downstream; update example 2024-12-09 18:27:07 -05:00
Aleix Conchillo Flaqué
e690c98230 transports(daily): no need for joining flag
This was put back because of an issue in ParallelPipeline but that issue is now
fixed so the joining check is not really necessary.
2024-12-09 09:38:30 -08:00
Aleix Conchillo Flaqué
e0a6c6871c parallel_pipeline: don't queue system frames 2024-12-09 09:38:30 -08:00
Mark Backman
29a042a101 Add changelog entry 2024-12-09 10:52:32 -05:00
Mark Backman
1cc2da571e Add new STT mute strategy, accept a set of strategies 2024-12-09 10:50:08 -05:00
Kwindla Hultman Kramer
c6b401b5d1 Merge pull request #805 from pipecat-ai/khk/parallel-pipeline-fix
Check to avoid double-join in ParallelPipeline case
2024-12-07 21:49:16 -08:00
Kwindla Hultman Kramer
315b7fcc34 check to avoid double-join 2024-12-07 21:22:36 -08:00
Mark Backman
e9f5fe0f37 Merge pull request #802 from Allenmylath/patch-22
Update README.md
2024-12-07 10:14:44 -05:00
allenmylath
64faf2218e Update examples/patient-intake/README.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2024-12-07 19:08:00 +05:30
allenmylath
e77a785a7d Update README.md 2024-12-07 13:36:50 +05:30
Mark Backman
03a269fb87 Merge pull request #801 from pipecat-ai/aleix/rtvi-handle-transport-urgent-frames
rtvi: handle transport urgent frames
2024-12-06 21:33:18 -05:00
Aleix Conchillo Flaqué
d1a55c6063 rtvi: handle transport urgent frames 2024-12-06 17:51:09 -08:00
Mark Backman
61d0fa42f1 Add a workflow to generate the docs 2024-12-06 20:32:33 -05:00
Mark Backman
16de1fca9b Add Read the Docs config 2024-12-06 20:15:17 -05:00
Mark Backman
2ad83f23c8 Initial reference docs commit 2024-12-06 19:44:44 -05:00
Aleix Conchillo Flaqué
422ee98db0 Merge pull request #798 from pipecat-ai/aleix/functioncall-data-frames
frames: FunctionCallResultFrame should be a DataFrame as before
2024-12-06 16:38:23 -08:00
Aleix Conchillo Flaqué
3d4620cf95 frames: FunctionCallResultFrame should be a DataFrame as before 2024-12-06 11:54:50 -08:00
Aleix Conchillo Flaqué
752a6f02b5 Merge pull request #799 from pipecat-ai/aleix/cartesia-interruptions-fix
cartesia: fix broken interruptions
2024-12-06 11:52:22 -08:00
Aleix Conchillo Flaqué
7e41809ec2 cartesia: fix broken interruptions 2024-12-06 11:49:03 -08:00
Aleix Conchillo Flaqué
e344a73d14 Merge pull request #797 from pipecat-ai/aleix/xtts-default-language
services(xtts): default language to Language.EN
2024-12-06 11:00:53 -08:00
Aleix Conchillo Flaqué
d6f480fa50 Merge pull request #791 from pipecat-ai/aleix/fastapi-generic-websocket
FastAPIWebsocketTransport: fix to work with text and binary
2024-12-06 10:46:16 -08:00
Aleix Conchillo Flaqué
423d6485f8 services(xtts): default language to Language.EN 2024-12-06 10:45:20 -08:00
Aleix Conchillo Flaqué
842b3de7f5 FastAPIWebsocketTransport: fix to work with text and binary 2024-12-06 10:31:42 -08:00
Aleix Conchillo Flaqué
3cb7829624 update CHANGELOG 2024-12-06 10:31:11 -08:00
Aleix Conchillo Flaqué
4292507616 Merge pull request #793 from balalofernandez/send-interruption-to-cartesia
fix: Send interruption to cartesia
2024-12-06 10:26:34 -08:00
Aleix Conchillo Flaqué
98c9759f41 Merge pull request #796 from pipecat-ai/aleix/improve-tts-reconnection
services: improve Cartesia, 11Labs, PlayHT and LMNT TTS reconnection
2024-12-06 10:22:54 -08:00
Aleix Conchillo Flaqué
bafb867ffc services: improve Cartesia, 11Labs, PlayHT and LMNT TTS reconnection 2024-12-06 10:11:59 -08:00
Mark Backman
b05809be2e Merge pull request #794 from pipecat-ai/mb/upgrade-anthropic
Upgrade Anthropic to the latest to avoid collision with aiohttp 3.11.9
2024-12-06 12:01:51 -05:00
Mark Backman
57d346ce13 Upgrade Anthropic to the latest to avoid collision with aiohttp 3.11.9 2024-12-06 11:59:19 -05:00
balalo
9001cb17ce Fix interruption frame to avoid issues with sending None 2024-12-06 17:42:46 +01:00
Mark Backman
40cfd9776f Merge pull request #792 from pipecat-ai/mb/cartesia-languages
Add additional languages for Cartesia
2024-12-06 09:57:38 -05:00
Mark Backman
d68b3ad1b2 Add additional languages for Cartesia 2024-12-06 09:22:05 -05:00
Kwindla Hultman Kramer
9b51588b92 Merge pull request #782 from pipecat-ai/khk/flash-transcription
Async Google LLM + Gemini Flash transcription example
2024-12-05 12:50:18 -08:00
Aleix Conchillo Flaqué
9a36a4ca32 Merge pull request #790 from pipecat-ai/aleix/base-output-transport-wait-for-output-tasks
transports(base_output): wait for output tasks on EndFrame
2024-12-05 11:30:55 -08:00
Aleix Conchillo Flaqué
f80a97b545 transports(base_output): wait for output tasks on EndFrame 2024-12-05 11:26:18 -08:00
Mark Backman
274278e229 Merge pull request #789 from pipecat-ai/mb/update-simple-chatbot-demo
Add RTVI transcripts, align styling
2024-12-05 11:56:07 -05:00
Mark Backman
6b94bcac03 Add RTVI transcripts, align styling 2024-12-05 11:12:48 -05:00
Aleix Conchillo Flaqué
969b87dee9 update aiohttp version to 3.11.9 2024-12-05 07:35:21 -08:00
balalo
bc699735a3 Send interruption message to cartesia 2024-12-05 16:23:40 +01:00
Mark Backman
00fd381808 Merge pull request #745 from pipecat-ai/mb/user-idle
Only run the UserIdleProcessor while pipeline is running
2024-12-05 10:12:02 -05:00
Mark Backman
672b1c6d73 Merge pull request #786 from Allenmylath/patch-21
Update README.md
2024-12-05 09:15:24 -05:00
Mark Backman
f455eb171b Merge pull request #784 from pipecat-ai/mb/simple-bot-client
Update the simple-chatbot demo to have JS and React clients
2024-12-05 08:34:33 -05:00
allenmylath
62c8c90e17 Update README.md 2024-12-05 13:23:05 +05:30
Aleix Conchillo Flaqué
28bb448605 Merge pull request #783 from pipecat-ai/aleix/deepgram-vad-event-handlers
deepgram: add VAD event handlers
2024-12-04 19:35:22 -08:00
Aleix Conchillo Flaqué
3d76b30a7c deepgram: add VAD event handlers 2024-12-04 19:31:09 -08:00
Aleix Conchillo Flaqué
0ae8ca0813 Merge pull request #781 from pipecat-ai/aleix/websocket-transports-mixer-fixes
websocket transports mixer fixes
2024-12-04 19:12:20 -08:00
Aleix Conchillo Flaqué
0935d773f5 transport(websockets): fix initial busy loop when using audio mixers 2024-12-04 19:10:39 -08:00
Aleix Conchillo Flaqué
e0f7a8a9f4 audio(mixer): SoundfileMixer doesn't resample files anymore 2024-12-04 19:09:50 -08:00
Aleix Conchillo Flaqué
2a0e01898f Merge pull request #780 from pipecat-ai/aleix/gstreamer-default-sample-rate
gstreamer: update default sample rate to 24000
2024-12-04 19:09:02 -08:00
Aleix Conchillo Flaqué
9d25e325dd Merge pull request #779 from pipecat-ai/aleix/websocket-server-audio-mixins-fix
frames: fix AudioRawFrame mixin
2024-12-04 19:08:41 -08:00
Aleix Conchillo Flaqué
37c21426bf Merge pull request #778 from pipecat-ai/aleix/transports-disconnect-on-last-transport
transports: fix premature input transport closing
2024-12-04 19:08:23 -08:00
Mark Backman
c467ec8ded Merge pull request #772 from pipecat-ai/mb/nim-llm
Add a NIM LLM service
2024-12-04 21:41:09 -05:00
Kwindla Hultman Kramer
a367a038f1 fix for finally clause 2024-12-04 18:31:30 -08:00
Mark Backman
e45a123eab Add image to README 2024-12-04 21:29:22 -05:00
Mark Backman
2ecc0e2b13 Remove node modules 2024-12-04 21:28:17 -05:00
Mark Backman
d532e924cd Add .gitignore 2024-12-04 21:28:17 -05:00
Mark Backman
36208049dc Update changelog 2024-12-04 21:28:17 -05:00
Mark Backman
1d11419691 Update the simple-chatbot demo to have JS and React clients 2024-12-04 21:13:14 -05:00
Mark Backman
05451f882d Merge pull request #777 from pipecat-ai/mb/twilio-example
Improve twilio-chatbot README
2024-12-04 20:26:45 -05:00
Kwindla Hultman Kramer
9c22f5b81b async google llm 2024-12-04 15:52:52 -08:00
Aleix Conchillo Flaqué
891f261191 gstreamer: update default sample rate to 24000 2024-12-04 14:41:44 -08:00
Aleix Conchillo Flaqué
13c27eaa1d frames: fix AudioRawFrame mixin 2024-12-04 13:25:37 -08:00
Mark Backman
c395d1a234 Merge pull request #773 from Allenmylath/patch-20
Update README.md
2024-12-04 14:45:38 -05:00
Mark Backman
49639c8631 Improve the twilio-chatbot README 2024-12-04 14:42:05 -05:00
Mark Backman
695a98a1f7 Remove streams.xml from version control 2024-12-04 14:26:10 -05:00
Mark Backman
5cbc37472c Update .gitignore to exclude streams.xml 2024-12-04 14:25:10 -05:00
Aleix Conchillo Flaqué
5b6d9a1050 transports: fix premature input transport closing 2024-12-04 10:56:57 -08:00
allenmylath
332d36475b Update examples/patient-intake/README.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2024-12-04 23:27:25 +05:30
Mark Backman
29b67578e3 Update README 2024-12-04 12:52:09 -05:00
Mark Backman
9db3743901 Update pyproject.toml with a nim optional dep 2024-12-04 12:52:09 -05:00
Mark Backman
496aded031 Update changelog 2024-12-04 12:38:05 -05:00
Mark Backman
1c1fa0db65 Add a NIM LLM service 2024-12-04 12:35:24 -05:00
Mark Backman
a2ad40d7e0 Merge pull request #775 from pipecat-ai/mb/llm-stubs
Added LLM services for GroqLLMService and GrokLLMService
2024-12-04 12:26:19 -05:00
Mark Backman
2bb3682d88 Update README 2024-12-04 12:24:39 -05:00
Kwindla Hultman Kramer
f33f08d667 partially working audio+transcription parallel pipelines 2024-12-04 08:51:35 -08:00
Mark Backman
d9bc2b618f Update FireworksLLMService to use OpenAILLMService 2024-12-04 11:51:05 -05:00
Mark Backman
d5a50e2cad Update AzureLLMService to use OpenAILLMService 2024-12-04 11:01:56 -05:00
Mark Backman
7013343bf0 Update the changelog 2024-12-04 10:10:55 -05:00
Mark Backman
728acba8a5 Add LLMService stubs for Grok and Groq, add examples 2024-12-04 10:08:28 -05:00
allenmylath
3b2c78747c Update README.md 2024-12-04 10:24:17 +05:30
allenmylath
44a0acffc8 Update README.md 2024-12-04 10:21:17 +05:30
Aleix Conchillo Flaqué
c31d5a4f1a Merge pull request #771 from pipecat-ai/aleix/daily-execute-callbacks-from-task
transports(daily): use a task to execute callbacks
2024-12-03 19:55:38 -08:00
Aleix Conchillo Flaqué
52caaa4afb transports(daily): use a task to execute callbacks
This commit fixes an issue where we were not waiting for
`asyncio.run_coroutine_threadsafe` to complete which can cause a series of
undesired issues (e.g. not actually executing the coroutine).
2024-12-03 18:58:54 -08:00
Aleix Conchillo Flaqué
115e75d808 Merge pull request #770 from pipecat-ai/aleix/system-input-frames-and-audio-buffer-processor
system input frames and audio buffer processor fixes
2024-12-03 18:58:13 -08:00
Mark Backman
897e024dd8 Only run the UserIdleProcessor while pipeline is running 2024-12-03 21:09:03 -05:00
Aleix Conchillo Flaqué
1cf93f1dcb FrameProcessor: ignore other frames during CancelFrame 2024-12-03 16:26:29 -08:00
Aleix Conchillo Flaqué
d278996d5b updated CHANGELOG 2024-12-03 16:12:40 -08:00
Aleix Conchillo Flaqué
322dd0cea1 AudioBufferProcessor: use on_audio_data event handler to retrieve audio 2024-12-03 16:12:40 -08:00
Aleix Conchillo Flaqué
a6a4910931 transports(services): incoming transport messages should be urgent 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
52cefaa9d6 frames: remove AppFrame 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
42658ecd92 frames: use mixins for audio and image data 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
a6606a4040 transports(base_output): remove unused code 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
d6c944cdc1 processors(audio): fix AudioBufferProcessor interruptions 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
a5c7b02a73 frames: input frames are now system frames
Input frames from a transport should be processed fast and there's no need for
them to be queued internally in each element.
2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
6b9223d87e Merge pull request #768 from pipecat-ai/aleix/websocket-server-interruptions
transports(websockets): use frame serializers during interruptions
2024-12-02 19:18:20 -08:00
Aleix Conchillo Flaqué
c2135cbe11 transports(websockets): use frame serializers during interruptions 2024-12-02 19:17:17 -08:00
Aleix Conchillo Flaqué
32495ddd0b Merge pull request #769 from pipecat-ai/aleix/daily-subscribe-video-source
transports(daily): subscribe to the desired video source
2024-12-02 19:16:14 -08:00
Aleix Conchillo Flaqué
4301f0abf7 Merge pull request #767 from pipecat-ai/aleix/warn-transcription-no-token
transports(daily): warn if transcription enabled but no token provided
2024-12-02 15:06:35 -08:00
Aleix Conchillo Flaqué
5e854c4d03 transports(daily): subscribe to the desired video source 2024-12-02 12:13:23 -08:00
Aleix Conchillo Flaqué
bec46a87ae Merge pull request #766 from Allenmylath/patch-20
Update requirements.txt
2024-12-02 10:32:36 -08:00
Aleix Conchillo Flaqué
71cf94e936 transports(daily): warn if transcription enabled but no token provided 2024-12-02 09:55:17 -08:00
allenmylath
acbecf1c4c Update requirements.txt
daily is not used here. The transport is the FastAPI websocket.
2024-12-02 21:36:29 +05:30
Mark Backman
6095fd342e Merge pull request #763 from Allenmylath/patch-19
Update README.md
2024-12-02 09:30:36 -05:00
Waleed
bf40b4936b updated env template; added simli variables 2024-12-02 12:05:55 +01:00
Waleed
c60dd8d4d2 updated environment variable name for cartesia 2024-12-02 12:05:32 +01:00
Waleed
d472aaf391 updated readme. Added simli 2024-12-02 11:50:51 +01:00
Waleed
6cc0b74e6c integrated simli 2024-12-02 11:35:46 +01:00
allenmylath
23316fbcf9 Update README.md 2024-12-02 13:35:44 +05:30
James Hush
5e22ef251d fix: add logging and error handling for issue #721 (#755) 2024-11-29 13:06:45 +08:00
Mark Backman
c5324df807 Merge pull request #752 from pipecat-ai/mb/google-context-message-conversion
Use Google Gemini message format when adding message to the LLM context
2024-11-27 14:13:17 -05:00
Mark Backman
3c19a7ae3d Use Google Gemini message format when adding message to the LLM context 2024-11-27 12:46:51 -05:00
Mark Backman
98c0a6e047 Merge pull request #749 from pipecat-ai/mb/pipecat-flows-standalone
Make Pipecat Flows an independent package
2024-11-25 17:09:11 -05:00
Mark Backman
f599e160de Make Pipecat Flows an independent package 2024-11-25 13:42:08 -05:00
Mark Backman
11c5d822f9 Merge pull request #746 from pipecat-ai/mb/update-flows
Bumping pipecat-ai-flows version
2024-11-22 11:25:03 -05:00
Mark Backman
c3e22f0931 Bumping pipecat-ai-flows version 2024-11-22 11:21:40 -05:00
Kwindla Hultman Kramer
9409546f90 Merge pull request #743 from pipecat-ai/khk/gemini-exp
Empty text content bug fix for Gemini
2024-11-21 14:04:28 -08:00
Kwindla Hultman Kramer
8ddac0ccd8 Testing with gemini-exp-1114. Bug fix 2024-11-21 10:33:12 -08:00
Vaibhav159
6e8e7fa19a adding session_timeout in fastapi 2024-11-21 14:56:42 +05:30
Vaibhav159
7dfa886669 moving logic to WebsocketServerInputTransport 2024-11-21 14:45:24 +05:30
Vaibhav159
da254c5143 correcting _monitor_websocket 2024-11-21 12:36:51 +05:30
Vaibhav159
e11f128110 adding on_session_timeout 2024-11-21 12:34:32 +05:30
Vaibhav-Lodha
3aa89fb13a adding session_timeout param 2024-11-21 12:20:51 +05:30
Mark Backman
f938960d50 Merge pull request #736 from pipecat-ai/mb/language-support
Make language support more robust
2024-11-20 13:03:47 -05:00
Mark Backman
2981d87bc1 Update changelog 2024-11-20 12:56:35 -05:00
Mark Backman
106042bbb2 Make language support more robust 2024-11-20 12:56:11 -05:00
Filipi da Silva Fuchter
d25ddeb962 Merge pull request #739 from pipecat-ai/krisp_v7
bumping krisp to support v7
2024-11-20 11:39:39 -03:00
Filipi Fuchter
c441baa692 bumping krisp to support v7 2024-11-20 11:37:45 -03:00
Mark Backman
676ff14913 Merge pull request #735 from pipecat-ai/vp-internal-push-frame-fix
internal push frame fix
2024-11-20 06:34:40 -05:00
Vanessa Pyne
14893ade92 Update src/pipecat/processors/frame_processor.py
Co-authored-by: Mark Backman <mark@daily.co>
2024-11-19 22:37:58 -06:00
Mark Backman
2a39ff69d6 Merge pull request #720 from pipecat-ai/mb/conversation-flow 2024-11-19 21:46:20 -05:00
Mark Backman
e79289454a Merge pull request #734 from pipecat-ai/mb/fix-cartesia 2024-11-19 21:27:52 -05:00
Mark Backman
25d02da1b2 Merge pull request #738 from pipecat-ai/mb/natural-conversation-demo 2024-11-19 21:27:38 -05:00
Mark Backman
a36fc370fa Improve the 22c foundational example 2024-11-19 15:49:40 -05:00
Mark Backman
e4c2f6d4c2 Update changelog 2024-11-18 21:32:53 -05:00
Mark Backman
97659ca3f0 Use the new pipecat-ai-flows module 2024-11-18 21:29:35 -05:00
vipyne
e00c75ce3f fix: raise exception in internal_push_frame 2024-11-18 16:01:04 -06:00
Mark Backman
cf62167f54 Revert: services(cartesia): generated TTSStoppedFrame after no more audio 2024-11-18 12:25:04 -05:00
Mark Backman
b3dfeb61c4 Add CHANGELOG entry 2024-11-18 12:18:20 -05:00
Mark Backman
bd020320cd Support a list of messages 2024-11-18 12:18:20 -05:00
Mark Backman
7a55d2d7db Add end session handler and update example 2024-11-18 12:18:20 -05:00
Mark Backman
b7308dca5d Fix issue where actions would execute on terminating nodes 2024-11-18 12:18:20 -05:00
Mark Backman
5301f44b3b Add pre- and post-actions 2024-11-18 12:18:20 -05:00
Mark Backman
686165b95a Add ability to register actions 2024-11-18 12:18:20 -05:00
Mark Backman
4e0ecdd673 Class name updates and remove FrameProcessor base class 2024-11-18 12:18:20 -05:00
Mark Backman
1b74560f9d Move function registration into the ConversationFlowProcessor class 2024-11-18 12:18:20 -05:00
Mark Backman
0c1070433f Clean up and commenting 2024-11-18 12:18:20 -05:00
Mark Backman
ece2c08cde debugging 2024-11-18 12:18:20 -05:00
Mark Backman
0b9742da9e Add a conversation flow processor 2024-11-18 12:18:20 -05:00
Aleix Conchillo Flaqué
635aa6eb5b Merge pull request #729 from pipecat-ai/aleix/fastapi-websocket-dont-close
transports(fastapi): don't try to close socket
2024-11-18 16:01:41 +01:00
Mark Backman
1ff17cc2b6 Merge pull request #733 from pipecat-ai/aleix/add-missing-init-files
processors: add missing __init__.py
2024-11-18 09:44:56 -05:00
Mark Backman
41ce9e9087 Merge pull request #697 from pipecat-ai/cst/leave-message
add handler for disconnect-bot message
2024-11-18 09:38:11 -05:00
Mark Backman
4803c54ecf Update CHANGELOG 2024-11-18 09:36:19 -05:00
Christian Stuff
5d7b3f2b38 add handler for disconnect-bot message 2024-11-18 09:33:30 -05:00
Aleix Conchillo Flaqué
23e5b1ec4d processors: add missing __init__.py 2024-11-18 11:32:20 +01:00
Aleix Conchillo Flaqué
7f5a8928b8 transports(fastapi): don't try to close socket
The websocket is passed from outside (in the transport constructor) so we should
not be trying to close it. FastAPI does actually close it later. We didn't see
any issue because these functions were not implemented properly. The value to
check was `application_state` instead of `client_state`. But in any case,
Pipecat should not be responsible for closing things passed from outside.
2024-11-18 01:15:19 +01:00
Aleix Conchillo Flaqué
53f675f5cf Merge pull request #727 from pipecat-ai/aleix/pipecat-0.0.49
update CHANGELOG for 0.0.49
2024-11-18 06:27:12 +08:00
Aleix Conchillo Flaqué
8173e4ce55 update CHANGELOG for 0.0.49 2024-11-17 23:26:09 +01:00
Aleix Conchillo Flaqué
5445bb0363 rtvi: add on_bot_started event 2024-11-17 22:40:00 +01:00
Mark Backman
a2a94724e5 Merge pull request #725 from pipecat-ai/mb/fix-simple-chatbot
Fix simple-chatbot example
2024-11-16 12:10:05 -05:00
Aleix Conchillo Flaqué
a8f9b0635a Merge pull request #722 from pipecat-ai/aleix/more-dailin-events
transports(daily): add more dial-in events
2024-11-17 01:09:01 +08:00
Mark Backman
4273a31fd5 Fix simple-chatbot example 2024-11-16 07:48:42 -05:00
Aleix Conchillo Flaqué
67f975a2c8 transports(daily): add more dial-in events 2024-11-16 01:22:50 +01:00
Mark Backman
d0bca67666 Merge pull request #716 from pipecat-ai/mb/mute-stt-service
Add STTMuteFilter to un/mute the STT
2024-11-14 19:55:00 -05:00
Mark Backman
966974bfc6 Change STTMuteProcessor to STTMuteFilter 2024-11-14 19:47:37 -05:00
Mark Backman
f807f233bd Suppress UserStartedSpeakingFrame and UserStoppedSpeakingFrame when muted 2024-11-14 17:11:51 -05:00
Mark Backman
33108f5798 Code review feedback 2024-11-14 17:05:08 -05:00
Mark Backman
52de825af8 Update CHANGELOG 2024-11-14 13:47:08 -05:00
Mark Backman
5fe679039c Add STTMuteProcessor to un/mute the STT 2024-11-14 13:35:02 -05:00
Kwindla Hultman Kramer
534f710f5d Merge pull request #688 from pipecat-ai/khk/natural-conversation
More work on llm-as-judge phrase endpointing
2024-11-14 09:15:16 -08:00
Mark Backman
53a11744a8 Merge pull request #712 from pipecat-ai/aleix/some-languages-tweaks
some languages tweaks
2024-11-14 09:33:26 -05:00
Mark Backman
72412cc0c4 Code review feedback 2024-11-14 09:31:04 -05:00
Mark Backman
b77ac07bc6 Merge pull request #715 from pipecat-ai/mb/update-readme-2
Add visual divider below Pipecat README image
2024-11-14 08:54:25 -05:00
Mark Backman
eb6926e0ce Add visual divider below Pipecat README image 2024-11-14 08:51:07 -05:00
Mark Backman
3b2c9de944 Merge pull request #713 from pipecat-ai/mb/update-readme
Update README
2024-11-14 08:45:28 -05:00
Mark Backman
27ff868e5a Move CONTRIBUTING to top directory 2024-11-14 08:43:03 -05:00
Mark Backman
57ef525a8e Update README 2024-11-14 08:43:03 -05:00
Aleix Conchillo Flaqué
d1db54d5fe examples(playht): use a 2.0 engine 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
4f88fc0eb8 services(tts): initialize language to the proper language code 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
37d1f4c4e1 services(tts): some language to service language cleanup 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
ef9e86d997 services(playht): make sure we only skip wav header no matter the size 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
2d2ef5a417 services(playht): voice engine is Play3.0-mini 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
c1fff00586 services(playht): fix language codes 2024-11-13 17:19:23 +01:00
Mark Backman
0af2196f50 Merge pull request #708 from pipecat-ai/mb/add-rime-ai
Add RimeTTSService
2024-11-12 18:29:53 -05:00
Mark Backman
cd42320788 Update changelog 2024-11-12 18:28:04 -05:00
Mark Backman
70fce52499 Merge pull request #710 from pipecat-ai/mb/update-readme-krisp
Update Krisp README instructions
2024-11-12 11:15:25 -05:00
Mark Backman
70b60c0593 Update Krisp README instructions 2024-11-12 10:26:12 -05:00
Jon Taylor
2d8aa03f31 Merge pull request #706 from pipecat-ai/jpt/modal-example
barebones modal.com deployment example
2024-11-12 11:41:00 +00:00
Kwindla Hultman Kramer
581ff26704 Merge pull request #707 from pipecat-ai/khk/clean-up
tiny PR to remove old comment lines
2024-11-11 21:14:16 -08:00
Kwindla Hultman Kramer
335178ff06 some gemini audio input examples 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
ee53535f41 gemini audio-in with no transcription 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
91ac40307e small fix and more prompt examples 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
b6c2c1f730 anthropic natural conversation example using claude haiku 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
b56c789ae4 fixes for proposed judge pipeline 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
bd435d9e62 missing commit 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
55a81df84f contributing to llm-as-judge phrase endpointing work 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
87434460f5 temp hacking 2024-11-11 21:04:50 -08:00
Mark Backman
958ec42e8d Add Rime.ai TTS service 2024-11-11 21:58:09 -05:00
Jon Taylor
d1fff60d1d barebones modal.com deployment example 2024-11-11 22:30:07 +00:00
Kwindla Hultman Kramer
1438e5654a remove old comment 2024-11-10 16:08:10 -08:00
Aleix Conchillo Flaqué
1d4be0139a Merge pull request #705 from pipecat-ai/aleix/prepare-0.0.48
update CHANGELOG for 0.0.48
2024-11-10 14:08:33 -08:00
Aleix Conchillo Flaqué
f58c3ee322 update CHANGELOG for 0.0.48 2024-11-10 23:01:03 +01:00
Aleix Conchillo Flaqué
379750df91 Merge pull request #704 from pipecat-ai/aleix/cartesia-tts-stopped-frame
services(cartesia): generated TTSStoppedFrame after no more audio
2024-11-10 05:17:36 -08:00
Aleix Conchillo Flaqué
d125a38737 services(cartesia): generated TTSStoppedFrame after no more audio
The TTSStoppedFrame should be generated when the TTS service stops generating
audio, not when the bot stops speaking.
2024-11-10 09:55:45 +01:00
Mark Backman
446bb0aeaf Merge pull request #702 from pipecat-ai/mb/azure-websocket
Add an Azure TTS websocket service
2024-11-09 17:41:53 -05:00
Aleix Conchillo Flaqué
d839080834 Merge pull request #642 from pipecat-ai/aleix/input-queues-block-frames
introduce frame processor input queues block frames
2024-11-09 14:30:17 -08:00
Mark Backman
9b85d0642b Add a changelog entry 2024-11-09 12:37:29 -05:00
Mark Backman
230b51a117 Add an Azure TTS websocket service 2024-11-09 12:37:29 -05:00
Mark Backman
3a965ca396 Merge pull request #701 from pipecat-ai/khk/anthropic-function-calling-fix
fixes for anthropic function calling
2024-11-09 06:39:34 -05:00
Kwindla Hultman Kramer
33fc5bf990 improved 20c-persistent-context-anthropic.py 2024-11-08 16:42:30 -08:00
Kwindla Hultman Kramer
a54ca08405 fixes for anthropic function calling 2024-11-08 16:33:02 -08:00
Filipi da Silva Fuchter
4379db43ed Merge pull request #689 from pipecat-ai/filipi/krisp
Making pipecat work with Krisp
2024-11-08 16:22:52 -03:00
Filipi Fuchter
e915c676aa Added support for Krisp audio filter 2024-11-08 16:18:10 -03:00
Mark Backman
e0a003afa1 Merge pull request #695 from pipecat-ai/mb/initialize-azure-lang
Initialize the speech_recognition_language for Azure TTS
2024-11-08 06:40:40 -05:00
James Hush
d5666727ce feat: toggle looping with soundfile mixer (#693)
* feat: toggle looping with soundfile mixer

* Implement PR changes
2024-11-07 21:08:37 -08:00
Mark Backman
f6d7402530 Update changelog 2024-11-07 15:16:03 -05:00
Mark Backman
aefe190c9f Initialize the speech_recognition_language for Azure TTS 2024-11-07 15:14:05 -05:00
Vanessa Pyne
29925a8f21 Merge pull request #551 from Allenmylath/patch-3
Frame types and short description: Create Frames.md
2024-11-07 10:05:32 -06:00
Aleix Conchillo Flaqué
beb3271168 services(tts): make sure word timestamp is reset properly 2024-11-06 18:54:12 -08:00
Aleix Conchillo Flaqué
b959ac6e1e Merge pull request #694 from pipecat-ai/aleix/daily-add-on-transcription-message
transports(daily): call on_transcription_message event handler
2024-11-06 15:21:17 -08:00
Aleix Conchillo Flaqué
17f4286942 transports(daily): call on_transcription_message event handler 2024-11-06 15:10:58 -08:00
Aleix Conchillo Flaqué
ce89bbb16e tts(elevenlabs): support pausing and resuming frames while speaking 2024-11-06 14:38:33 -08:00
Aleix Conchillo Flaqué
865768039b processors: remove block_on_frames and add pause_processing_frames() instead 2024-11-06 14:20:25 -08:00
Aleix Conchillo Flaqué
7071482583 try to use queue_frame() instead of process_frame() 2024-11-06 14:18:21 -08:00
Aleix Conchillo Flaqué
5353d13151 update CHANGELOG 2024-11-06 13:16:58 -08:00
Aleix Conchillo Flaqué
a9e565f355 processors: fix input queue interruptions 2024-11-06 13:12:24 -08:00
Aleix Conchillo Flaqué
b6f0c16591 examples: restore EndFrame() on 01 and 02 foundational 2024-11-06 13:05:03 -08:00
Aleix Conchillo Flaqué
49005d02f5 services(tts): use TTSSpeakFrame in say() method 2024-11-06 13:05:03 -08:00
Aleix Conchillo Flaqué
6d8b885071 transports(base_output): push bot started/stopped frames downstream 2024-11-06 13:04:37 -08:00
Aleix Conchillo Flaqué
2eccb33e73 processors: allow passing a callback when queued frame is processed 2024-11-06 13:04:37 -08:00
Aleix Conchillo Flaqué
22ca4c5a02 processors: cancel input task and empty queue with interruptions 2024-11-06 13:04:37 -08:00
Aleix Conchillo Flaqué
84f26ac1ca processors: introduce input queues
Frame processors can now decide if they should continue processing frames or
not, and if so also decide when to continue processing frames. For example,
asynchronous TTS services will stop processing frames until they have generated
all the audio for an LLM response.
2024-11-06 12:13:49 -08:00
Aleix Conchillo Flaqué
74937411e6 Merge pull request #691 from pipecat-ai/aleix/rtvi-manual-bot-ready
rtvi: bot-ready message needs to be sent manually
2024-11-06 10:53:25 -08:00
Aleix Conchillo Flaqué
8aab068ffd rtvi: bot-ready message needs to be sent manually 2024-11-05 10:52:54 -08:00
Aleix Conchillo Flaqué
bd50201ce4 transports(daily): just make it clear we subscribe to camera 2024-11-04 17:32:46 -08:00
Aleix Conchillo Flaqué
6082da284e Merge pull request #611 from pipecat-ai/aleix/audio-filters
introduce audio filters
2024-11-04 16:34:47 -08:00
Aleix Conchillo Flaqué
358c458265 transports(base_input): handle filter control frames 2024-11-04 16:19:52 -08:00
Aleix Conchillo Flaqué
807dbbe326 audio(noisereduce): allow enabling/disabling filter 2024-11-04 16:13:29 -08:00
Aleix Conchillo Flaqué
3c116b291d audio(mixers): some cosmetics 2024-11-04 15:37:08 -08:00
Aleix Conchillo Flaqué
0dd413ee90 audio(filters): add noisereduce filter 2024-11-04 15:37:08 -08:00
Aleix Conchillo Flaqué
abc8ede3d7 introduce audio filters 2024-11-04 15:37:08 -08:00
Aleix Conchillo Flaqué
126324ca1b Merge pull request #687 from pipecat-ai/aleix/transport-audio-mixers
introduce transport audio mixers
2024-11-04 13:14:36 -08:00
Aleix Conchillo Flaqué
602915ae18 examples(websocket-server): allow interruptions 2024-11-04 13:05:02 -08:00
Aleix Conchillo Flaqué
0ac9e2dd3f transports(network): synchronize with time before sending data 2024-11-04 13:04:18 -08:00
Aleix Conchillo Flaqué
a9ef5ca95d examples: add bot background sound example 2024-11-03 11:13:02 -08:00
Aleix Conchillo Flaqué
81c476dd4c introduce output transport audio mixers 2024-11-03 11:13:02 -08:00
Kwindla Hultman Kramer
151242d3a0 Merge pull request #666 from pipecat-ai/khk/realtime-pipecat-vad
Support using Pipecat turn detection instead of OpenAI Realtime API turn detection
2024-11-02 08:36:31 -07:00
Kwindla Hultman Kramer
93c6e5098c added comment explaining config of TurnDetection 2024-11-02 08:24:54 -07:00
Aleix Conchillo Flaqué
4455b2a428 rtvi: create queues before tasks 2024-11-01 23:06:50 -07:00
Aleix Conchillo Flaqué
94062592ef base_output: generate smaller audio frames of the same incoming type 2024-11-01 23:06:50 -07:00
Aleix Conchillo Flaqué
d2401a76c8 base_output: only generate bot speaking with TTS audio frames 2024-11-01 23:06:50 -07:00
Aleix Conchillo Flaqué
e2b1b56e86 examples: don't require room token if using an STT 2024-11-01 23:06:50 -07:00
Mark Backman
84bd767312 Merge pull request #685 from pipecat-ai/mb/add-recording-events
Add recording events and callbacks
2024-11-01 12:02:46 -04:00
Mark Backman
802c29e9e1 Add recording events and callbacks 2024-11-01 10:20:00 -04:00
Aleix Conchillo Flaqué
f83381860c Merge pull request #677 from pipecat-ai/aleix/add-notifier-and-notifier-filters
add notifiers and more frame filters
2024-10-31 15:55:07 -07:00
Aleix Conchillo Flaqué
4dad1bfe49 examples: add foundational/22-natural-conversation.py 2024-10-31 12:10:33 -07:00
marcus-daily
9ee8896b64 Removing unnecessary ruff arguments from README 2024-10-31 18:02:29 +00:00
marcus-daily
5f7a2f66d4 Add .idea to .gitignore 2024-10-31 18:02:29 +00:00
marcus-daily
76e5f1e847 Remove unnecessary ruff params in CI 2024-10-31 15:07:28 +00:00
marcus-daily
6975340d6c Set Ruff config for the project 2024-10-31 15:07:28 +00:00
marcus-daily
0f4cf56418 Load dotenv in simple chatbot server (fixes #415) 2024-10-31 12:08:30 +00:00
Aleix Conchillo Flaqué
018e51e8a3 add notifiers and more frame filters 2024-10-30 16:36:17 -07:00
Vanessa Pyne
b050143952 Merge pull request #676 from RonakAgarwalVani/fix/chunk-choices-delta-none
Fix uncaught exception when accessing 'tool_calls' in NoneType delta in response handling
2024-10-30 14:44:32 -05:00
Mark Backman
98ea1f0791 Merge pull request #675 from pipecat-ai/mb/playht-add-request-id
Add a request_id to each TTS sequence
2024-10-30 13:56:15 -04:00
Mark Backman
8272c35527 Use a request_id in TTS commands for the PlayHT websocket service 2024-10-30 13:54:18 -04:00
Mark Backman
e973e82e05 Merge pull request #672 from pipecat-ai/mb/fix-playht
Fix PlayHT TTFB metrics
2024-10-30 13:53:02 -04:00
RonakAgarwalVani
d1396bf618 Update openai.py 2024-10-30 14:26:49 +05:30
Vanessa Pyne
8186e423de Merge pull request #637 from pipecat-ai/vp-issue-template
docs: add ISSUE_TEMPLATE.md
2024-10-29 15:08:42 -05:00
vipyne
3010addb8b docs: add CONTRIBUTING.md 2024-10-29 15:03:07 -05:00
vipyne
029e0d391e docs: add ISSUE_TEMPLATE.md 2024-10-29 15:03:07 -05:00
Vanessa Pyne
bf31223577 Merge pull request #671 from pipecat-ai/vp-issue-635
docs: small fix
2024-10-29 14:34:13 -05:00
vipyne
42cc79154f docs: small fix 2024-10-29 14:33:57 -05:00
Mark Backman
05b857006a Update changelog 2024-10-28 20:56:29 -04:00
Mark Backman
2e57d21b89 Fix ttfb metrics 2024-10-28 20:27:24 -04:00
Aleix Conchillo Flaqué
fa05ec46be Merge pull request #667 from pipecat-ai/aleix/base-output-bot-speaking-detection
transports(base_output): use audio frames for bot speaking detection
2024-10-28 10:54:54 -07:00
Aleix Conchillo Flaqué
e3ce619284 transports(base_output): use audio frames for bot speaking detection 2024-10-28 10:07:37 -07:00
Vanessa Pyne
fb512dcd74 Merge pull request #630 from MoofSoup/update-readme
docs: simplify readme
2024-10-28 10:26:30 -05:00
Aleix Conchillo Flaqué
ca15d97383 Merge pull request #662 from pipecat-ai/aleix/daily-transport-async-functions
transports(daily): make functions async
2024-10-25 16:14:06 -07:00
Aleix Conchillo Flaqué
b32448e967 transports(daily): make functions async 2024-10-25 15:01:52 -07:00
Aleix Conchillo Flaqué
7e30da6183 Merge pull request #661 from pipecat-ai/aleix/allow-updating-subscritption-before
transports(daily): allow updating subscriptions before join
2024-10-25 15:00:34 -07:00
Aleix Conchillo Flaqué
a6dd2600d2 examples(tavus): await update_subscriptions 2024-10-25 14:56:56 -07:00
Aleix Conchillo Flaqué
b905b57dfc transports(daily): allow updating subscriptions before join 2024-10-25 14:46:17 -07:00
Kwindla Hultman Kramer
e1a7edfb58 make it possible to use Pipecat turn detection instead of OpenAI turn detection 2024-10-25 15:59:48 -05:00
Aleix Conchillo Flaqué
1b30b1fc23 Merge pull request #665 from pipecat-ai/aleix/fix-bot-started-stopped-speaking
transports(base_output): fix constant bot started/stopped speaking fr…
2024-10-25 13:00:38 -07:00
Aleix Conchillo Flaqué
55026898f6 transports(base_output): use vad stop secs for bot stopped speaking 2024-10-25 12:59:15 -07:00
Aleix Conchillo Flaqué
4283557894 audio(vad): expose params property 2024-10-25 12:59:15 -07:00
Aleix Conchillo Flaqué
5ab00e01aa transports(base_output): fix constant bot started/stopped speaking frames 2024-10-25 12:10:24 -07:00
Aleix Conchillo Flaqué
fcfc729e83 Merge pull request #664 from pipecat-ai/aleix/fix-aws-stuttering
services(aws): read stream and resample in a thread
2024-10-25 11:49:28 -07:00
Aleix Conchillo Flaqué
4eacb34fd8 services(aws): read stream and resample in a thread 2024-10-25 11:22:28 -07:00
Aleix Conchillo Flaqué
3a8aacccf7 Merge pull request #663 from pipecat-ai/aleix/audio-resampling-with-resampy
audio: use resampy for audio resampling
2024-10-25 10:16:20 -07:00
roey
54c0bf0c70 Adding TavusVideoService layer (#617)
Co-authored-by: roey <159067767+roey-tavus@users.noreply.github.com>
Co-authored-by: Mert Gerdan <mert@tavus.io>
Co-authored-by: Aleix Conchillo Flaqué <aleix@daily.co>
2024-10-25 09:46:25 -07:00
Aleix Conchillo Flaqué
778b05a252 audio: use resampy for audio resampling 2024-10-25 09:22:22 -07:00
Mark Backman
f16a416c2b Merge pull request #660 from pipecat-ai/mb/add-gemini-inputs
Add input params to Google Gemini
2024-10-24 20:58:19 -04:00
Aleix Conchillo Flaqué
1be63bccb8 Merge pull request #647 from pipecat-ai/aleix/daily-transport-only-transcribe-users
transport(daily): only transcribe users
2024-10-24 17:40:34 -07:00
Mark Backman
37820ac0df Add input params to Google Gemini 2024-10-24 20:12:41 -04:00
Aleix Conchillo Flaqué
8ea80d43f4 transports(daily): only transcribe user audio 2024-10-24 17:06:43 -07:00
Aleix Conchillo Flaqué
e117d70a00 update to daily-python 0.12.0 2024-10-24 16:49:19 -07:00
Aleix Conchillo Flaqué
2ba753272a Merge pull request #658 from pipecat-ai/aleix/default-to-24000-sample-rate
update TTS and transport output sample rate to 24000
2024-10-24 16:48:41 -07:00
Aleix Conchillo Flaqué
60c8c2f6e9 examples(15a): use daily transcription instead of local whisper 2024-10-24 16:47:41 -07:00
Aleix Conchillo Flaqué
cfb48200c2 services(azure): support sample rates 2024-10-24 16:47:35 -07:00
Aleix Conchillo Flaqué
6d317c6e8e audio: don't resample if same sample rate 2024-10-24 16:47:35 -07:00
Aleix Conchillo Flaqué
158d52856f transports(livekit): fix VADAnalyzer import 2024-10-24 16:47:35 -07:00
Aleix Conchillo Flaqué
92a69e404f update TTS and transport output sample rate to 24000 2024-10-24 16:47:35 -07:00
Aleix Conchillo Flaqué
d24c6185d8 Merge pull request #654 from pipecat-ai/aleix/daily-allow-completion-futures
transport(daily): allow completion futures
2024-10-24 14:28:53 -07:00
Mark Backman
1fd21578a6 Merge pull request #657 from pipecat-ai/mb/add-elevenlabs-output-format-type
Add ElevenLabs output format type
2024-10-24 17:07:04 -04:00
Mark Backman
700db87127 Merge pull request #656 from pipecat-ai/mb/add-gemini-metrics
Add Gemini token usage metrics
2024-10-24 17:04:56 -04:00
Mark Backman
6f1310569c Add ElevenLabs output format type 2024-10-24 17:03:45 -04:00
Aleix Conchillo Flaqué
14cedb0be8 Merge pull request #655 from pipecat-ai/aleix/fix-together-params
services(together): fix together AI InputParams
2024-10-24 13:51:38 -07:00
Mark Backman
fae97f9051 Add Gemini token usage metrics 2024-10-24 16:37:21 -04:00
Aleix Conchillo Flaqué
d930a46e64 services(together): fix together AI InputParams 2024-10-24 13:08:35 -07:00
Aleix Conchillo Flaqué
2e6b5d1843 transports(daily): fix aiohttp timeout 2024-10-24 11:44:30 -07:00
Aleix Conchillo Flaqué
88362db034 transports(daily): no more need for an output message queue 2024-10-24 11:44:30 -07:00
Aleix Conchillo Flaqué
f7f0c44c32 transports(daily): don't block event handlers 2024-10-24 11:44:30 -07:00
Mark Backman
33553b71d4 Merge pull request #653 from pipecat-ai/mb/align-tts-constructors
Align TTSService constructors
2024-10-24 13:52:43 -04:00
Mark Backman
be8ca505cd Merge pull request #652 from pipecat-ai/khk/more-gemini
Gemini new context manager and rewrite to use google data structures internally
2024-10-24 13:47:38 -04:00
Mark Backman
e957cce422 Align TTSService constructors 2024-10-24 13:42:06 -04:00
Mark Backman
418a13a4ec Merge pull request #650 from pipecat-ai/mb/assembly-fix
AssemblyAI: don't disconnect on language change
2024-10-24 11:26:56 -04:00
Mark Backman
fc445c0a1f Merge pull request #649 from pipecat-ai/mb/open-ai-max-tokens
Add max_tokens and max_completion_tokens inputs for OpenAI
2024-10-24 11:26:44 -04:00
Mark Backman
f0c65468ed AssemblyAI: don't disconnect on language change 2024-10-24 08:30:48 -04:00
Mark Backman
ce6a2bdcf7 Add max tokens inputs to OpenAI 2024-10-24 07:03:45 -04:00
Mark Backman
673542e235 Merge pull request #646 from pipecat-ai/mb/grok-function-calling
Support function calling for Grok
2024-10-23 21:56:38 -04:00
Kwindla Hultman Kramer
e032b0b70a gemini context aggregators 2024-10-23 18:44:09 -07:00
Mark Backman
e39f7e965b Support function calling for Grok 2024-10-23 17:22:26 -04:00
Mattie Ruth
d26751e968 add missing PipelineParams to enable the metrics (#645) 2024-10-23 16:46:46 -04:00
Aleix Conchillo Flaqué
e0ca4a9c23 Merge pull request #643 from pipecat-ai/aleix/daily-update-subscriptions
transports(daily): add update_subscriptions()
2024-10-22 17:07:07 -07:00
Aleix Conchillo Flaqué
801e52c095 transports(daily): add update_subscriptions() 2024-10-22 15:02:55 -07:00
Aleix Conchillo Flaqué
a46eaa838b Merge pull request #641 from pipecat-ai/aleix/prepare-0.0.47
prepare 0.0.47
2024-10-22 10:30:42 -07:00
Aleix Conchillo Flaqué
7c432499db update CHANGELOG for 0.0.47 2024-10-22 10:02:50 -07:00
Aleix Conchillo Flaqué
8d75fcc9f0 use warnings package to report deprecated code 2024-10-22 10:02:21 -07:00
Aleix Conchillo Flaqué
61d73f81ae Merge pull request #639 from pipecat-ai/aleix/daily-transcription-model
transport(daily): use "nova-2-general" for transcription
2024-10-22 09:40:43 -07:00
Aleix Conchillo Flaqué
951255def9 transport(daily): use "nova-2-general" for transcription 2024-10-22 09:40:03 -07:00
Moof Soup
bf5a7c3562 docs: Clarify README example and token usage
clarified readme example
2024-10-21 19:54:34 -07:00
Mark Backman
e556f34094 Merge pull request #638 from pipecat-ai/mb/fix-silero-vad-import
Fix Silero VAD import issue
2024-10-21 20:48:06 -04:00
Mark Backman
ccc3691620 Fix Silero VAD import issue 2024-10-21 20:39:20 -04:00
Vanessa Pyne
5321affda7 Merge pull request #588 from Allenmylath/patch-11
Update README.md
2024-10-21 11:20:05 -05:00
Mark Backman
e5ad8dc67b Merge pull request #627 from pipecat-ai/mb/upgrade-gladia-to-v2-api
Update GladiaSTTService to use the Gladia V2 API
2024-10-21 12:01:20 -04:00
Mark Backman
46927805bc Update GladiaSTTService to use the Gladia V2 API 2024-10-21 07:10:38 -04:00
Aleix Conchillo Flaqué
b6b1ef0a40 Merge pull request #589 from Allenmylath/patch-12
Update Dockerfile
2024-10-20 10:59:43 -07:00
Mark Backman
e62f762382 Merge pull request #625 from pipecat-ai/mb/add-assemblyai-stt
Add support for AssemblyAI STT
2024-10-20 13:59:33 -04:00
Aleix Conchillo Flaqué
dbfda14342 Merge pull request #587 from Allenmylath/patch-9
Update env.example
2024-10-20 10:58:50 -07:00
Aleix Conchillo Flaqué
fee85418cd Merge pull request #620 from gregschwartz/main
Start agent/call/bot at localhost root
2024-10-20 10:14:10 -07:00
Mark Backman
015faa3dbd Update CHANGELOG and README 2024-10-20 08:57:57 -04:00
Mark Backman
1dbf4ff27d Add AssemblyAI STT service 2024-10-20 08:57:57 -04:00
Aleix Conchillo Flaqué
4f1b2dce9b Merge pull request #624 from pvilchez/fix_enable_usage_metrics
Fixing `enable_usage_metrics` setting.
2024-10-20 01:00:12 -07:00
Paul Vilchez
5640bd9447 Fixing a config mismatch which caused usage stats to only report when enable_metrics was true. 2024-10-20 03:33:13 -04:00
Aleix Conchillo Flaqué
ee5ae0d631 Merge pull request #621 from pipecat-ai/aleix/prepare-0.0.46
update CHANGELOG for 0.0.46
2024-10-19 18:26:05 -07:00
Aleix Conchillo Flaqué
4b8a4b86fe update CHANGELOG for 0.0.46 2024-10-19 18:25:29 -07:00
Aleix Conchillo Flaqué
3556c9ce0f Merge pull request #618 from pipecat-ai/aleix/examples-switch-to-llm-context
examples: use OpenAILLMContext in all the examples
2024-10-19 18:24:39 -07:00
Aleix Conchillo Flaqué
f971dbe027 examples(audio-recording): record audio into a file 2024-10-19 18:24:00 -07:00
Aleix Conchillo Flaqué
3815e9dec3 examples: fix dialin-chatbot python arguments 2024-10-19 18:24:00 -07:00
Aleix Conchillo Flaqué
320f622255 examples: upgrade storytelling frontend packages 2024-10-19 18:24:00 -07:00
Aleix Conchillo Flaqué
be4bdabdf4 examples: use OpenAILLMContext in all the examples 2024-10-19 18:24:00 -07:00
Greg Schwartz
1fa52b62aa Put start agent/call at localhost root. Previously you had to read the docs to learn to go to /start, /start_call, or /start_bot, which isn't mentioned in the console output and is inconsistent, adding friction to learning the codebase 2024-10-19 16:18:43 -07:00
Aleix Conchillo Flaqué
4f66e5d55f Merge pull request #619 from pipecat-ai/aleix/split-vad
move SileroVAD processor to processors package
2024-10-18 23:30:07 -07:00
Aleix Conchillo Flaqué
3502509d3e move SileroVAD processor to processors package 2024-10-18 23:28:29 -07:00
Aleix Conchillo Flaqué
d71ea1c0e0 Merge pull request #615 from DamienDeepgram/patch-1
Update default Deepgram model
2024-10-18 22:47:30 -07:00
Kwindla Hultman Kramer
07712cdb16 gemini function calling and partial implementation of standard context stuff 2024-10-18 17:14:57 -07:00
DamienDeepgram
13f232bafc Update default model 2024-10-18 15:33:50 -07:00
Aleix Conchillo Flaqué
9dd3354b89 Merge pull request #613 from pipecat-ai/aleix/examples-endframe
examples: use EndFrame() when the participant leaves
2024-10-18 11:18:26 -07:00
Aleix Conchillo Flaqué
8c006c24a3 README: update example 2024-10-18 11:18:03 -07:00
Aleix Conchillo Flaqué
4550545528 examples: use EndFrame() when the participant leaves 2024-10-18 11:18:03 -07:00
Aleix Conchillo Flaqué
020f371ecb pyproject: update onnxruntime to support python 3.12 2024-10-18 10:20:28 -07:00
Aleix Conchillo Flaqué
f3c0767c81 Merge pull request #610 from pipecat-ai/aleix/stt-push-audio
allow STT services to passthrough audio frames
2024-10-17 21:02:30 -07:00
Aleix Conchillo Flaqué
c9318ecd5c examples: minor fixes 2024-10-17 16:15:09 -07:00
Aleix Conchillo Flaqué
12eb9437c1 services(stt): allow STT service to passthrough audio 2024-10-17 16:15:09 -07:00
Aleix Conchillo Flaqué
71c8c0dcdb Merge pull request #609 from pipecat-ai/aleix/livekit-force-specifying-vad
livekit force specifying vad
2024-10-17 14:08:55 -07:00
Aleix Conchillo Flaqué
8108423742 transport(livekit): force specifying a vad analyzer
Don't default to SileroVADAnalyzer(). Also, resample to input sample rate.
2024-10-17 14:06:43 -07:00
Aleix Conchillo Flaqué
d67e08be4d Merge pull request #608 from pipecat-ai/aleix/add-audio-utils-and-resample
add audio utils and resample
2024-10-17 14:00:49 -07:00
Aleix Conchillo Flaqué
d3f4ac61b6 move utils.audio to audio.utils and add resample_audio() 2024-10-17 13:59:32 -07:00
Aleix Conchillo Flaqué
c6d28bb0db Merge pull request #607 from pipecat-ai/aleix/pipecat-vad-deprecation
move vad package to audio.vad
2024-10-17 13:51:20 -07:00
Aleix Conchillo Flaqué
2a37b2459a move vad package to audio.vad 2024-10-17 13:49:16 -07:00
Mark Backman
d1000f2fe4 Merge pull request #606 from pipecat-ai/mb/add-playht-options
PlayHT: Add websocket TTS service; rename existing service to PlayHTHttpTTSService, upgrade client, add input params
2024-10-17 16:46:59 -04:00
Mark Backman
e2d7af4b62 Update changelog 2024-10-17 16:16:29 -04:00
Mark Backman
da3810f1a2 Add websocket support for PlayHT 2024-10-17 15:41:33 -04:00
Aleix Conchillo Flaqué
eb21597d1a Merge pull request #603 from pipecat-ai/aleix/silero-vad-processor-fixes
vad: add support for interruption to SileroVAD processor
2024-10-17 10:48:39 -07:00
Aleix Conchillo Flaqué
e3eea0c02f vad: add support for interruption to SileroVAD processor 2024-10-17 10:48:25 -07:00
Mark Backman
45606e177c Add input options to PlayHT, upgrade to latest PlayHT model 2024-10-17 11:56:12 -04:00
Aleix Conchillo Flaqué
197d7b3e2b Merge pull request #604 from natestraub/patch-1
services(livekit) - Stop Sending EndFrame when Participant Disconnects
2024-10-17 08:48:57 -07:00
Nathan Straub
d4ec6827ce services(livekit) - Stop Sending EndFrame when Participant Disconnects
How It Works Now:
A participant disconnecting triggers an EndFrame, invoking stop() on the input and output transports and causing the LiveKit room to disconnect.

Proposal:
Match the Daily implementation and just trigger the callbacks in the LiveKitTransport. Leave it up to the implementor to decide whether to send EndFrames when this happens.
2024-10-16 23:53:31 -07:00
Aleix Conchillo Flaqué
e31d1152db Merge pull request #601 from pipecat-ai/aleix/openai-realtime-misc
services(openai): rename OpenAILLMServiceRealtimeBeta to OpenAIRealti…
2024-10-16 16:20:18 -07:00
Mark Backman
bb48a81103 Merge pull request #602 from pipecat-ai/mb/adjust-logger-levels
Adjust log levels for log messages
2024-10-16 18:00:35 -04:00
Mark Backman
55f1ae2564 Adjust log levels for log messages 2024-10-16 17:30:47 -04:00
Kwindla Hultman Kramer
280691b1b3 explanatory comment in 19-openai-realtime-beta.py 2024-10-16 14:27:48 -07:00
Kwindla Hultman Kramer
93c9e219ce fix for message handling bug on initialization 2024-10-16 12:40:20 -07:00
Aleix Conchillo Flaqué
edd44cc181 services(openai): rename OpenAILLMServiceRealtimeBeta to OpenAIRealtimeBetaLLMService 2024-10-16 10:20:19 -07:00
Aleix Conchillo Flaqué
4075b19f7c Merge pull request #600 from pipecat-ai/aleix/prepare-0.0.45
update CHANGELOG to 0.0.45
2024-10-16 09:18:37 -07:00
Aleix Conchillo Flaqué
bb14918a33 update CHANGELOG to 0.0.45 2024-10-16 09:17:33 -07:00
Mark Backman
2aee8a12f8 Merge pull request #599 from pipecat-ai/mb/remove-metrics-from-transport
Move metrics from transport to rtvi
2024-10-16 11:39:58 -04:00
Mark Backman
5760fadb44 Update changelog 2024-10-16 11:33:56 -04:00
Mark Backman
af5a7e9092 Move metrics from transport to rtvi 2024-10-16 11:33:56 -04:00
Mark Backman
8d9a7486d1 Merge pull request #598 from pipecat-ai/mb/add-daily-metrics-message-frame
Comply with RTVI format for sending metrics data via Daily transport
2024-10-16 10:14:44 -04:00
Mark Backman
00d0f9ae48 Comply with RTVI format for sending metrics data 2024-10-16 09:00:38 -04:00
Aleix Conchillo Flaqué
d255b7d1b2 Merge pull request #596 from pipecat-ai/aleix/prepare-0.0.44
prepare for pipecat 0.0.44
2024-10-15 18:13:07 -07:00
Aleix Conchillo Flaqué
4eb2c95b63 update CHANGELOG for 0.0.44 2024-10-15 17:51:01 -07:00
Aleix Conchillo Flaqué
3910aeb4de transports(daily): don't send messages if not joined 2024-10-15 17:51:01 -07:00
Aleix Conchillo Flaqué
713dcb7a4d transports(daily): cancel messages task when canceling 2024-10-15 17:51:01 -07:00
Aleix Conchillo Flaqué
04da51c7d8 transport(base_output): push EndFrame downstream at the right time 2024-10-15 17:51:01 -07:00
Aleix Conchillo Flaqué
e52d18e42d processors(audiobuffer): make functions public 2024-10-15 15:31:59 -07:00
Aleix Conchillo Flaqué
0c4a513ca2 Merge pull request #595 from pipecat-ai/aleix/bot-speaking-system-frames
bot speaking system frames
2024-10-15 15:30:11 -07:00
Aleix Conchillo Flaqué
4a71eacac3 rtvi: reset bot transcription with interruptions 2024-10-15 14:58:21 -07:00
Aleix Conchillo Flaqué
f0d89e57ad frames: some frames need to be SystemFrames
We want to process user and bot started/stopped speaking frames as fast as
possible. If we queue them they might be processed too late.
2024-10-15 14:37:56 -07:00
Mark Backman
79b52d4301 Merge pull request #594 from pipecat-ai/mb/more-text-filter-massaging
More edge case handling for text filtering
2024-10-15 14:51:43 -04:00
Mark Backman
bb00dbefbc More edge case handling for text filtering 2024-10-15 14:08:27 -04:00
Aleix Conchillo Flaqué
0c250c0603 Merge pull request #583 from pipecat-ai/aleix/add-pts-to-llm-full-response-end-frame
add pts to llm full response end frame
2024-10-15 10:39:50 -07:00
Aleix Conchillo Flaqué
7bbaf4dfe9 rtvi: merge TTS/TTSText and LLM/LLMText processors 2024-10-15 10:24:43 -07:00
Aleix Conchillo Flaqué
3a3bf3fe34 services(cartesia): schedule TTSStoppedFrame after text 2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
616aa54f75 ruff formatting 2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
164f06415c services(cartesia): no need to send LLMFullResponseEndFrame
Interruptions are already handled by context aggregators.
2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
51bc4839d1 transport(base_output): simplify code 2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
6d778e0491 services: add pts to LLMFullResponseEndFrame in WordTTSService 2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
fc4fa2faaa Merge pull request #593 from pipecat-ai/aleix/bot-transcription-processor
rtvi: add RTVIBotTranscriptionProcessor to send `bot-transcription`
2024-10-15 10:03:39 -07:00
Aleix Conchillo Flaqué
90b7f65545 rtvi: add RTVIBotTranscriptionProcessor to send bot-transcription 2024-10-15 10:03:20 -07:00
Kwindla Hultman Kramer
f7b7f0d680 Merge pull request #541 from pipecat-ai/khk/openai-realtime-beta
openai realtime beta
2024-10-14 21:02:06 -07:00
Kwindla Hultman Kramer
5431c44e51 remove two debug lines 2024-10-14 21:01:20 -07:00
Kwindla Hultman Kramer
40b3e50815 fix system, consecutive same role, and empty message parsing for anthropic 2024-10-14 20:56:42 -07:00
allenmylath
ec98a13a08 Update Dockerfile
utils and assets not used in this example hence removed
2024-10-15 08:18:16 +05:30
allenmylath
b999b76f70 Update README.md
The README description still showed the simple-chatbot definition, so it was replaced with a more accurate description.
2024-10-15 08:14:43 +05:30
allenmylath
b64dbe7bb4 Update env.example
canonical api url is also used from env.
2024-10-15 08:10:07 +05:30
Kwindla Hultman Kramer
2f6232fac9 fix for initial-messages with single message, and hoisting system message into instructions 2024-10-14 18:14:35 -07:00
Aleix Conchillo Flaqué
b4f2525c76 Merge pull request #585 from pipecat-ai/aleix/daily-urgent-transport-message-hang
transports(daily): send transport messages in a task
2024-10-14 16:31:10 -07:00
Aleix Conchillo Flaqué
8e956a4e88 Merge pull request #584 from pipecat-ai/aleix/urgent-bot-tts-audio
rtvi: bot-tts-audio messages should also be urgent
2024-10-14 16:25:35 -07:00
Aleix Conchillo Flaqué
7b9712daad transports(daily): send transport messages in a task
We queue transport messages and send them in a task to avoid potential hangs by
sending urgent transport messages from a transport event handler.
2024-10-14 16:19:53 -07:00
Kwindla Hultman Kramer
d4269acd67 user started/stopped speaking frames and interruption frames 2024-10-14 16:07:04 -07:00
Kwindla Hultman Kramer
d2ae82fb38 added back in missing LLMFullResponseStartFrame and LLMFullResponseEndFrame 2024-10-14 15:18:50 -07:00
Lewis Wolfgang
270949e6cd Merge pull request #582 from pipecat-ai/lewis/update_readme_aboutsilerofirstrun
Minor README update about Silero VAD.
2024-10-14 16:26:28 -04:00
Aleix Conchillo Flaqué
cfada94c13 rtvi: bot-tts-audio messages should also be urgent 2024-10-14 12:46:11 -07:00
Lewis Wolfgang
68fd6f7c44 Minor README update about Silero VAD.
We no longer download the model during first run - it's part of the repo.
2024-10-14 13:11:16 -04:00
Mark Backman
96bfcc3dca Merge pull request #571 from pipecat-ai/mb/add-code-filtering
Add code and table filtering option to MarkdownTextFilter
2024-10-14 12:54:16 -04:00
Mark Backman
b0890b1f75 Code review fixes 2024-10-14 12:52:16 -04:00
Aleix Conchillo Flaqué
802b3e42c4 Merge pull request #579 from Allenmylath/patch-16
Update Dockerfile
2024-10-14 08:58:02 -07:00
Aleix Conchillo Flaqué
bd134839ff Merge pull request #578 from Allenmylath/patch-15
Create Dockerfile
2024-10-14 08:57:34 -07:00
Aleix Conchillo Flaqué
428ce63e17 Merge pull request #575 from Allenmylath/patch-12
Update README.md
2024-10-14 08:55:12 -07:00
Aleix Conchillo Flaqué
46d6cde383 Merge pull request #574 from Allenmylath/patch-11
Update requirements.txt
2024-10-14 08:54:44 -07:00
allenmylath
6de82b3c11 Create .env.example (#562)
* Create .env.example

.env.example file with required env variables not added hence adding

* Rename .env.example to env.example

file name corrected as directed
2024-10-14 08:52:46 -07:00
Mark Backman
ec0bc7a057 A few bug fixes 2024-10-14 09:44:20 -04:00
allenmylath
c62156a4c3 Update Dockerfile
assets and utils files not found hence removed
2024-10-14 12:00:29 +05:30
allenmylath
e8618a07d0 Create Dockerfile
There is a Dockerfile in other examples. This Dockerfile assumes that there is a .env file (I added env.example in another pull request).
2024-10-14 11:49:35 +05:30
allenmylath
0ba99514a9 Update README.md
env.example added hence adding copy command will be necessary
2024-10-14 11:22:56 +05:30
allenmylath
837c8dad27 Update requirements.txt
whisper not used but deepgram used hence changed
2024-10-14 11:20:12 +05:30
allenmylath
0e69625a01 Rename frames.md to frame.md
edited again to frame.md
2024-10-14 10:07:47 +05:30
allenmylath
4e0823fced Rename Frames.md to frames.md
file name changed as requested
2024-10-14 10:05:26 +05:30
Kwindla Hultman Kramer
6f2a464451 conversation save/load for openai, openai-realtime, and anthropic 2024-10-13 18:12:03 -07:00
Kwindla Hultman Kramer
ac4c5ab369 response content item truncation when interrupted 2024-10-13 14:38:04 -07:00
Kwindla Hultman Kramer
9e95419301 much cleanup 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
f390ec9608 temp commit; debugging 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
ce8a83efba tools frame support and wip message resetting/loading 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
e5a2bf9564 context management improvements 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
7838018686 fix default response properties getting appended to ResponseCreateEvent 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
31916ed9fd turn on/off openai vad 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
3a2fbc2b19 send user started/stopped speaking event from openai realtime events
2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
43520b44da add 'failed' case to Response event object 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
ab4a8d791a RTVI processors should use TextFrame not TextFrame and all subclasses 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
40dc546b81 function call fix and user transcription frames 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
5426891feb added input audio pause setting. no frame to update that state, yet. 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
1c5ccd3406 fixes for settings updates, context updates, and response creation 2024-10-12 21:58:11 -07:00
Mark Backman
3a745bfa3f Handle self._context of None 2024-10-12 21:58:11 -07:00
Mark Backman
ac4e39991e Update ai_services for OpenAI Realtime param inputs 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
c870832da6 types seem complete; some ws error handling 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
e782016c57 renamed a file 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
00badaf98e more pydantic cleanup 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
7dfac0163b bits of pydantic 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
09a3c2a82d major functionality working (not configurable, occasional timing bugs maybe) 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
c32c65014b definitely broke something in the pipeline 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
f082eb10a2 small cleanup 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
b8898e449e lots of debugging statements. multiple function calls broken 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
d1f6d229ca space exploration prompt 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
4fa0318005 configurability via constructor 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
93ebb9d541 working 19-openai-realtime-beta.py example 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
16101c79c5 beginning of realtime impl 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
c866b3f2c9 Merge pull request #572 from pipecat-ai/khk/fix-deepgram-settings
fix for Deepgram settings not merging properly
2024-10-12 20:07:04 -07:00
Mark Backman
c26a45721f Set inputs as Optional 2024-10-12 21:52:56 -04:00
Mark Backman
d9c900f872 Satisfy minimal text requirements for Cartesia and OpenAI 2024-10-12 21:27:37 -04:00
chadbailey59
73becbad29 fixed parallel async function calls bug (#569) 2024-10-12 17:45:24 -05:00
Aleix Conchillo Flaqué
f1df3de263 Merge pull request #560 from Allenmylath/patch-7
Update requirements.txt aiohttp missing
2024-10-12 14:52:24 -07:00
Aleix Conchillo Flaqué
3bc5c8cda7 Merge pull request #557 from Allenmylath/patch-4
Update env.example wrong tts service in env
2024-10-12 14:51:54 -07:00
Aleix Conchillo Flaqué
7b3b1058b2 Merge pull request #559 from Allenmylath/patch-6
Update server.py
2024-10-12 14:51:24 -07:00
Aleix Conchillo Flaqué
87473f857f Merge pull request #558 from Allenmylath/patch-5
Update env.example wrong tts
2024-10-12 14:50:52 -07:00
Aleix Conchillo Flaqué
a96209185c Merge pull request #546 from Allenmylath/patch-2
Update README.md
2024-10-12 14:46:15 -07:00
Aleix Conchillo Flaqué
34cc2ed1a1 Merge pull request #532 from nmaswood/nmaswood/format-logs
Format and Support Unicode for LLM Message Debug Logs
2024-10-12 14:42:58 -07:00
Aleix Conchillo Flaqué
667aa0c25a Merge pull request #542 from joachimchauvet/main
Update LiveKit audio transport for changes introduced in v0.0.42
2024-10-12 14:13:02 -07:00
Mark Backman
12707f4ff7 _settings needs to be Dict 2024-10-12 12:19:54 -04:00
Kwindla Hultman Kramer
53451899a7 fix for Deepgram settings not merging 2024-10-11 21:07:39 -07:00
Aleix Conchillo Flaqué
dc73b20c0b Merge pull request #451 from Canonical-AI-Inc/recording
Audio recording FrameProcessor
2024-10-11 13:48:19 -07:00
Adrian Cowham
4330374ba4 passing kwargs and forcing keyword-only arguments 2024-10-11 12:01:51 -07:00
Adrian Cowham
79c8aa2c4a ruff formatting 2024-10-11 11:35:02 -07:00
Adrian Cowham
083d221dd2 PR feedback 2024-10-11 11:29:01 -07:00
Mark Backman
74d47b725f Add table filtering 2024-10-11 14:10:47 -04:00
Adrian Cowham
917e482876 Merge branch 'main' into recording 2024-10-11 10:36:04 -07:00
Adrian Cowham
522d931950 better interruption handling by moving the processors after the transport output 2024-10-11 10:33:12 -07:00
Mark Backman
d10c7ac7ce Add Changelog entry 2024-10-11 13:28:34 -04:00
Mark Backman
84705427c5 Add code filtering option to MarkdownTextFilter 2024-10-11 11:11:58 -04:00
Aleix Conchillo Flaqué
66a76af341 Merge pull request #567 from pipecat-ai/aleix/prepare-0.0.43
update CHANGELOG for 0.0.43
2024-10-10 14:09:18 -07:00
Aleix Conchillo Flaqué
d402d91c2f update CHANGELOG for 0.0.43 2024-10-10 14:06:18 -07:00
Mark Backman
b05130a089 Merge pull request #566 from pipecat-ai/mb/make-markdown-modifiable
Mark the Markdown processor a util, and allow it to take inputs
2024-10-10 17:00:19 -04:00
Mark Backman
b3cc0779f0 Update the changelog 2024-10-10 16:49:20 -04:00
Mark Backman
cbecae40a9 Mark the Markdown processor a util, and allow it to take inputs 2024-10-10 16:43:48 -04:00
Mark Backman
5b8753c8b6 Add speak_code input param 2024-10-10 13:17:37 -04:00
Mark Backman
3c5f9457f1 More edge case improvements 2024-10-10 12:07:00 -04:00
Mark Backman
e32e56d0bc Merge pull request #565 from pipecat-ai/mb/add-markdown-remover
Add a new processor which removes markdown and special chars from TTS text
2024-10-10 07:16:42 -04:00
Mark Backman
788aec665b Add a new processor which removes markdown and special chars from TTS text 2024-10-10 07:11:31 -04:00
Mark Backman
3cada03a92 Merge pull request #564 from pipecat-ai/mb/bot-tts-text-urgent
Make bot-tts-text messages urgent
2024-10-08 19:26:46 -04:00
Mark Backman
e21fb520f9 Make bot-tts-text messages urgent 2024-10-08 17:07:08 -04:00
allenmylath
864f4d385f Update requirements.txt aiohttp missing
aiohttp is not included but used in code
2024-10-08 16:39:25 +05:30
allenmylath
26ac2878ae Update server.py
description of fastapi's argumentparser wrongly shown as story teller instead of patient-intake
2024-10-08 15:18:26 +05:30
allenmylath
cac63f5565 Update env.example wrong tts
cartesia used in code but elevenlabs in .env example
2024-10-08 14:24:23 +05:30
allenmylath
aadffd6199 Update env.example wrong tts service in env
cartesia used in code but env got elevenlabs
2024-10-08 14:15:54 +05:30
Aleix Conchillo Flaqué
3403197a90 Merge pull request #552 from pipecat-ai/aleix/rtvi-user-llm-text
rtvi: add RTVIUserLLMTextProcessor
2024-10-07 08:33:29 -07:00
Aleix Conchillo Flaqué
8cdb9ab1ad rtvi: internal transport message should be urgent 2024-10-07 08:04:14 -07:00
Mark Backman
5dbf26d283 Handle cases where text is either a list or a string 2024-10-07 07:21:32 -04:00
Mark Backman
8001bab9b0 Remove another instance of urgent=true 2024-10-07 06:58:32 -04:00
Aleix Conchillo Flaqué
12d0686adc rtvi: rename bot-audio to bot-tts-audio 2024-10-06 16:50:55 -07:00
Aleix Conchillo Flaqué
a28a5e954a add TransportMessageSystemFrame 2024-10-06 16:50:12 -07:00
Aleix Conchillo Flaqué
bb966a89d2 rtvi: add RTVIUserLLMTextProcessor 2024-10-06 01:05:58 -07:00
Aleix Conchillo Flaqué
4a74eb3321 use isinstance tuples 2024-10-06 00:45:27 -07:00
Aleix Conchillo Flaqué
1f54ee6991 pyproject: update deepgram to 3.7.3 2024-10-06 00:40:47 -07:00
Allenmylath
40af3571f0 Create Frames.md
Made a small explanation for different types of frames in pipecat
2024-10-05 22:04:03 +05:30
joachimchauvet
86143f79a1 use new InputAudioRawFrame and OutputAudioRawFrame 2024-10-05 14:17:27 +03:00
joachimchauvet
b373bc82b5 match behavior of Daily's on_first_participant_joined 2024-10-05 14:17:27 +03:00
Mark Backman
ea2a05a04b Merge pull request #545 from pipecat-ai/mb/fix-language-handling
Improve language string handling for TTS services
2024-10-04 10:03:06 -04:00
Mark Backman
5692ca586c Merge pull request #547 from pipecat-ai/mb/update-test-requirements
Update fastapi version in test-requirements.txt
2024-10-04 08:28:05 -04:00
Mark Backman
a11ad81f02 Update fastapi version in test-requirements.txt 2024-10-04 07:35:48 -04:00
Allenmylath
805efdb144 Update README.md
the description provided is that of simple chatbot and also the video of simple chatbot hence changed
2024-10-04 10:19:38 +05:30
Mark Backman
c49b31e6ad Add CHANGELOG entry 2024-10-03 23:13:59 -04:00
Mark Backman
7796a272ce Improve language handling for TTS services 2024-10-03 23:09:27 -04:00
Adrian Cowham
678e87fd31 comment back in some code 2024-10-03 14:12:23 -07:00
Adrian Cowham
4d81a2ebfe nuked the code that marks user audio in favor of InputAudioRawFrame. also moving to stereo instead of mono with the human and bot on their own channel. 2024-10-03 14:10:03 -07:00
Adrian Cowham
2d82702e04 merge from main 2024-10-03 09:42:06 -07:00
Mark Backman
27dcf83f37 Merge pull request #543 from pipecat-ai/mb/fix-deepgram-stt-language
Deepgram: disconnect and reconnect on language change
2024-10-03 12:40:27 -04:00
Mark Backman
72db83528d Update changelog 2024-10-03 12:37:26 -04:00
Mark Backman
45c7d36b2e Deepgram: disconnect and reconnect on language change 2024-10-03 12:31:42 -04:00
Aleix Conchillo Flaqué
65eeb0f1f6 Merge pull request #540 from pipecat-ai/cb/interruption-fix
Fixed RTVI `tts:interrupt` action not interrupting
2024-10-02 13:46:52 -07:00
Aleix Conchillo Flaqué
1d7d0bb1ea Merge pull request #539 from pipecat-ai/aleix/pipecat-0.0.42-fixes
pipecat 0.0.42 fixes
2024-10-02 13:34:28 -07:00
Aleix Conchillo Flaqué
598936bc53 services: apply service language code before using service 2024-10-02 13:30:01 -07:00
Chad Bailey
b1bf6f7733 fixed botinterruptionframe 2024-10-02 19:43:51 +00:00
Aleix Conchillo Flaqué
75d27aeb9f examples(storytelling): update packages 2024-10-02 12:00:00 -07:00
Aleix Conchillo Flaqué
0a37caf4b4 openai: fix image json logging 2024-10-02 11:57:50 -07:00
Aleix Conchillo Flaqué
6db65f4335 cartesia: use model_name instead of model_id 2024-10-02 11:57:36 -07:00
Aleix Conchillo Flaqué
3648874301 gladia: fix languages 2024-10-02 11:57:25 -07:00
Aleix Conchillo Flaqué
8bcb5d7fd2 services: async generators should yield frames 2024-10-02 11:57:08 -07:00
Aleix Conchillo Flaqué
8c01a900cd google: allow using GOOGLE_APPLICATION_CREDENTIALS 2024-10-02 11:56:01 -07:00
Mark Backman
d378e699d2 Merge pull request #538 from Allenmylath/patch-2
Update env.example for wrong tts
2024-10-02 12:53:50 -04:00
Mark Backman
c25c375c41 Merge pull request #537 from pipecat-ai/mb/fix-nested-strings
Fix nested strings issue
2024-10-02 12:39:00 -04:00
Allenmylath
70c3ff31fd Update env.example
elevenlabs is not used in code instead cartesia is used hence changed
2024-10-02 21:59:51 +05:30
Mark Backman
cd2e29f285 Fix nested strings issue 2024-10-02 12:26:30 -04:00
Aleix Conchillo Flaqué
6d4d7d763d Merge pull request #534 from pipecat-ai/aleix/prepare-0.0.42
update CHANGELOG for 0.0.42
2024-10-02 08:36:32 -07:00
Aleix Conchillo Flaqué
6c1851eef8 update CHANGELOG for 0.0.42 2024-10-02 08:36:17 -07:00
Mark Backman
096a15eef6 Merge pull request #527 from pipecat-ai/mb/google-tts-inputs
Further consolidate service update settings into a single ServiceUpdateSettingsFrame class
2024-10-02 11:13:25 -04:00
Mark Backman
3d642df2b0 Revert aligning voice_id name in TTS service constructor 2024-10-02 11:07:48 -04:00
Mark Backman
d75a02dc51 Use Language enum and set languages accordingly 2024-10-01 21:03:01 -04:00
Mark Backman
28643b453d Update to use LLM, STT, TTS subclasses and remove setter methods 2024-10-01 20:30:27 -04:00
Nasr Maswood
d5635de5f6 add new lines and unicode to JSON debug logs 2024-10-01 13:31:58 -04:00
Mark Backman
88cca7bf68 Consolidate service UpdateSettingsFrame into a single ServiceUpdateSettingsFrame 2024-10-01 11:01:04 -04:00
Mark Backman
a397b859fe Add support for gender and google_style inputs to Google TTS 2024-10-01 10:39:45 -04:00
Kwindla Hultman Kramer
8aae4e9856 Merge pull request #531 from pipecat-ai/khk/function-calling-improvements 2024-10-01 07:23:38 -07:00
Kwindla Hultman Kramer
92d8b37229 implement vision for openai 2024-09-30 21:49:29 -07:00
Kwindla Hultman Kramer
0801fc578b Merge pull request #530 from pipecat-ai/khk/tts-say-fix
fix for multi-sentence tts say utterances
2024-09-30 20:59:53 -07:00
Kwindla Hultman Kramer
0d5cb84531 function calling testing and improvements 2024-09-30 20:59:28 -07:00
Kwindla Hultman Kramer
47b943a117 Merge pull request #522 from pipecat-ai/rebase-openai-multi-function-call
Handle parallel function calls for OpenAI LLMs
2024-09-30 16:23:37 -07:00
Kwindla Hultman Kramer
128355add5 fix for multi-sentence tts say utterances 2024-09-30 16:19:31 -07:00
Kwindla Hultman Kramer
0499fe41e4 get rid of some debug log lines used during development 2024-09-30 16:08:33 -07:00
Kwindla Hultman Kramer
6ad3437fd2 throw error if the llm tries to call a function that's not registered 2024-09-30 16:08:33 -07:00
Kwindla Hultman Kramer
a5c73ec829 handle openai multiple function calls 2024-09-30 16:08:30 -07:00
JeevanReddy
def04ac0ce openai can give multiple tool calls, current implementation assumes only one function call at a time. Fixed this to handle multiple function calls. 2024-09-30 16:07:56 -07:00
Kwindla Hultman Kramer
5d63615b1b Merge pull request #528 from pipecat-ai/khk/sentence-splits
TTS sentence aggregation fix
2024-09-30 16:07:21 -07:00
Kwindla Hultman Kramer
90ee284fe0 Merge pull request #520 from pipecat-ai/khk/context-frame-push
pushing context frames from assistant aggregators
2024-09-30 16:06:54 -07:00
Kwindla Hultman Kramer
539e0b66fb small fix as per aleix 2024-09-30 16:05:32 -07:00
Kwindla Hultman Kramer
fef393dcac assistant aggregator switch for space padding or not 2024-09-30 16:05:32 -07:00
Kwindla Hultman Kramer
ed607d5c4b typo fix 2024-09-30 16:05:32 -07:00
Kwindla Hultman Kramer
37da7e44cd whitespace fix 2024-09-30 16:05:32 -07:00
Kwindla Hultman Kramer
69c7edd60c pushing context frames from assistant aggregators 2024-09-30 16:05:28 -07:00
Aleix Conchillo Flaqué
392f210371 Merge pull request #524 from pipecat-ai/aleix/everything-is-async
all frame processors are asynchronous
2024-09-30 15:59:03 -07:00
Mark Backman
9a63df1ea1 Merge pull request #529 from pipecat-ai/mb/daily-python-0-11-0
Update daily-python to 0.11.0
2024-09-30 18:29:27 -04:00
Mark Backman
f8a75cede9 Update daily-python to 0.11.0 2024-09-30 18:22:38 -04:00
Aleix Conchillo Flaqué
4d1e370e02 pipeline(task): since everything is async tasks should wait for EndFrame 2024-09-30 15:11:21 -07:00
Aleix Conchillo Flaqué
d080a31a5c tests: fix langchain tests 2024-09-30 15:11:21 -07:00
Aleix Conchillo Flaqué
a90ebdfe7c syncparallelpipeline: fix now that all frames are asynchronous 2024-09-30 15:11:21 -07:00
Aleix Conchillo Flaqué
c8995b82e5 all frame processors are asynchronous
In this commit we make all frame processors asynchronous, that is, they have an
internal queue and they push frames using a task from that queue.
2024-09-30 15:11:21 -07:00
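The commit above describes processors that each own an internal queue and push frames from a task reading that queue. A minimal sketch of that pattern follows, using plain asyncio. The names `QueuedProcessor`, `queue_frame`, and `run_demo`, and the use of the string "END" as a stop marker, are hypothetical illustrations and not pipecat's actual classes or frames.

```python
import asyncio


class QueuedProcessor:
    """Hypothetical stand-in for a frame processor; not pipecat's real class."""

    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream
        self.seen = []  # frames this processor has handled, in order
        self._queue = asyncio.Queue()
        self._task = None

    def start(self):
        # Create the push task; must be called while an event loop is running.
        self._task = asyncio.create_task(self._push_loop())

    async def queue_frame(self, frame):
        # Callers only enqueue; the actual push happens later in the task.
        await self._queue.put(frame)

    async def _push_loop(self):
        while True:
            frame = await self._queue.get()
            self.seen.append(frame)
            if self.downstream is not None:
                await self.downstream.queue_frame(frame)
            self._queue.task_done()
            if frame == "END":  # assumed stop marker for this sketch
                break

    async def wait(self):
        # Wait until the push task has seen the stop marker and exited.
        await self._task


async def run_demo():
    sink = QueuedProcessor("sink")
    source = QueuedProcessor("source", downstream=sink)
    sink.start()
    source.start()
    for frame in ["hello", "world", "END"]:
        await source.queue_frame(frame)
    await source.wait()
    await sink.wait()
    return sink.seen
```

Calling `asyncio.run(run_demo())` returns the frames in the order the sink handled them. Because each processor forwards from its own task, a caller has to wait for the stop marker to reach the end of the chain before shutting down, which is the concern the neighboring "tasks should wait for EndFrame" commit mentions.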
Kwindla Hultman Kramer
6b7f924af6 tts sentence aggregation fix 2024-09-30 14:33:08 -07:00
Mark Backman
51580e5349 Merge pull request #526 from pipecat-ai/mb/google-tts-lang-update
Set Google TTS default language to en-US
2024-09-30 15:32:43 -04:00
Mark Backman
ed49cebf2c Set Google TTS default language to en-US 2024-09-30 15:16:46 -04:00
Mark Backman
46ac76701e Merge pull request #517 from pipecat-ai/mb/update-settings-frame
Consolidate update frames classes into a single UpdateSettingsFrame class
2024-09-30 12:56:45 -04:00
Mark Backman
1f77863aef Code review feedback 2024-09-30 12:50:40 -04:00
Mark Backman
d7555609fd Add TTS update settings options 2024-09-30 12:50:40 -04:00
Mark Backman
7fe118ce63 Align use of language param across TTS services 2024-09-30 12:50:40 -04:00
Mark Backman
44a349386c Consolidate update frames classes into a single UpdateSettingsFrame class 2024-09-30 12:50:39 -04:00
Mark Backman
97cba92fa5 Merge pull request #516 from pipecat-ai/mb/google-tts
Add Google TTS
2024-09-30 12:25:16 -04:00
Aleix Conchillo Flaqué
d9b16d4f73 services: import cosmetics 2024-09-27 13:32:27 -07:00
Aleix Conchillo Flaqué
50b6580fbb livekit: add license notice 2024-09-27 13:28:33 -07:00
Mark Backman
e7548f9494 Code review feedback 2024-09-27 08:02:44 -04:00
Mark Backman
830d2df671 Add Google TTS 2024-09-27 07:36:20 -04:00
Aleix Conchillo Flaqué
13b50a07db Merge pull request #515 from pipecat-ai/aleix/rtvi-frame-processors
RTVI frame processors
2024-09-27 00:48:09 -07:00
Aleix Conchillo Flaqué
4501dca133 Merge pull request #467 from joachimchauvet/main
Add LiveKit audio transport
2024-09-26 22:58:25 -07:00
Aleix Conchillo Flaqué
2c8e566507 rtvi: update version to 0.2 2024-09-26 22:42:36 -07:00
Aleix Conchillo Flaqué
6e8a202107 rtvi: fix handling transport messages 2024-09-26 22:42:19 -07:00
Aleix Conchillo Flaqué
2a05cd35b0 rtvi: add multiple RTVI frame processors 2024-09-26 22:42:08 -07:00
Mark Backman
55a70cde8f Merge pull request #514 from pipecat-ai/mb/aws-polly-tts
Add AWS Polly TTS support
2024-09-26 22:20:13 -04:00
Mark Backman
706c00d897 Code review feedback 2024-09-26 22:13:37 -04:00
Aleix Conchillo Flaqué
d323ea9e95 async_generator: keep pushing frames downstream 2024-09-26 16:44:49 -07:00
Aleix Conchillo Flaqué
b8ece84c6e services: super should be super() 2024-09-26 10:39:26 -07:00
Mark Backman
a018112a13 Merge pull request #510 from pipecat-ai/mb/deepgram-tts-http
Improve usability of Deepgram TTS: use Deepgram client, remove aiohttp
2024-09-26 13:38:42 -04:00
Mark Backman
d3a477902b Add changelog entry 2024-09-26 13:35:59 -04:00
Mark Backman
298b151486 Add setter methods 2024-09-26 13:35:59 -04:00
Mark Backman
6a6ea251ae Add AWS Polly TTS support 2024-09-26 13:35:59 -04:00
Aleix Conchillo Flaqué
c7c709a0a7 github: cache venv when running tests 2024-09-26 10:32:22 -07:00
Aleix Conchillo Flaqué
6ac57b4854 Merge pull request #494 from badbye/full-width-punctuations
add full-width punctuations as end of the sentence
2024-09-26 10:17:10 -07:00
Aleix Conchillo Flaqué
f5e0b946c7 services(cartesia): fix string formatting 2024-09-26 09:08:37 -07:00
Mark Backman
b1818cc370 Merge pull request #435 from golbin/main
Add speed and emotion options for Cartesia.
2024-09-26 07:14:59 -04:00
Jin Kim
d05717a1bd Apply Ruff formatter 2024-09-26 19:52:25 +09:00
Aleix Conchillo Flaqué
d11daee31a Merge pull request #509 from pipecat-ai/aleix/frameprocessor-event-handlers
frame processor event handlers
2024-09-25 19:50:30 -07:00
Mark Backman
73da8c1910 Improve usability of Deepgram TTS: use Deepgram client, remove aiohttp 2024-09-25 22:43:10 -04:00
Aleix Conchillo Flaqué
f06aa300d0 rtvi: add on_bot_ready event 2024-09-25 16:52:18 -07:00
Aleix Conchillo Flaqué
c4e94e280e processors: add support for event handlers 2024-09-25 16:35:33 -07:00
Kwindla Hultman Kramer
8f2941c575 Merge pull request #492 from pipecat-ai/khk/flush-more-audio
add calls to flush_audio for say() and rtvi action
2024-09-25 12:35:50 -07:00
joachimchauvet
447baad5c3 update send_metrics() to support changes introduced in #474 2024-09-25 21:38:55 +03:00
Mark Backman
2703813e8a Merge pull request #496 from pipecat-ai/mb/azure-tts-inputs
Add Azure TTS input params
2024-09-25 14:38:01 -04:00
Mark Backman
521e152150 Merge pull request #495 from pipecat-ai/mb/elevenlabs-input-lang
Add language_code support for ElevenLabs TTS
2024-09-25 14:37:44 -04:00
Kwindla Hultman Kramer
3d43ad0f4d actually save the file 2024-09-25 10:59:00 -07:00
Kwindla Hultman Kramer
3621fceae2 fixes as noted by aleix 2024-09-25 09:19:58 -07:00
Aleix Conchillo Flaqué
e123f33c03 Merge pull request #506 from pipecat-ai/aleix/async-generator-processor
processors: add AsyncGeneratorProcessor
2024-09-25 00:04:09 -07:00
Aleix Conchillo Flaqué
b8713666c2 processors: add AsyncGeneratorProcessor 2024-09-25 00:01:04 -07:00
Aleix Conchillo Flaqué
cf0ab85e2c Merge pull request #505 from pipecat-ai/aleix/init-task-variables
initialize task variables and add minor description
2024-09-24 23:59:38 -07:00
Aleix Conchillo Flaqué
8502c7c801 Merge pull request #504 from pipecat-ai/aleix/rtvi-handle-frame
rtvi: add RTVIProcessor.handle_message()
2024-09-24 23:59:26 -07:00
Aleix Conchillo Flaqué
e89814dc6b Merge pull request #503 from pipecat-ai/aleix/end-cancel-task-frames
frames: add EndTaskFrame and CancelTaskFrame
2024-09-24 23:59:10 -07:00
Aleix Conchillo Flaqué
9461bacf0d pyproject: update fastapi to 0.115.0 2024-09-24 19:24:37 -07:00
Aleix Conchillo Flaqué
e276dcbab7 initialize task variables and add minor description 2024-09-24 19:19:00 -07:00
Aleix Conchillo Flaqué
1a3de0e819 rtvi: add RTVIProcessor.handle_message() 2024-09-24 19:12:06 -07:00
Aleix Conchillo Flaqué
ee3786fe15 frames: add EndTaskFrame and CancelTaskFrame 2024-09-24 19:10:22 -07:00
Aleix Conchillo Flaqué
31b5667cee frames: log text with [] so we can distinguish spaces better 2024-09-24 13:10:40 -07:00
Aleix Conchillo Flaqué
a483f1a083 rtvi: handle all actions from the action task 2024-09-24 10:48:15 -07:00
Aleix Conchillo Flaqué
2ecec1c9f8 Merge pull request #500 from pipecat-ai/aleix/rtvi-action-frames-task
RTVI action frames task
2024-09-24 10:13:43 -07:00
Aleix Conchillo Flaqué
08ac311971 rtvi: use task to process incoming action frames 2024-09-24 09:36:53 -07:00
Aleix Conchillo Flaqué
cb49b6a0d6 rtvi: add llm-text and tts-text server messages 2024-09-24 09:36:43 -07:00
Aleix Conchillo Flaqué
016da177db Merge pull request #499 from mercuryyy/main
Fix syntax error in deepgram.py
2024-09-24 09:10:05 -07:00
joachimchauvet
ec5998bc36 remove _internal_push_frame from LiveKitInputTransport 2024-09-24 14:54:37 +03:00
mercuryyy
b1e17ee347 Fix syntax error in deepgram.py 2024-09-24 07:45:29 -04:00
joachimchauvet
b6e1d6e6ae format with ruff 2024-09-24 10:21:02 +03:00
joachimchauvet
fa609f1afc adjust output sample rate and create user token 2024-09-24 10:16:54 +03:00
joachimchauvet
470b5eafe7 move tenacity imports inside try block 2024-09-24 10:16:54 +03:00
joachimchauvet
2e5b0c1d6b add tenacity dependency 2024-09-24 10:16:54 +03:00
joachimchauvet
a9390d96a1 add LiveKit audio transport 2024-09-24 10:16:54 +03:00
Mark Backman
8ee9621d66 Add setter functions 2024-09-23 21:12:01 -04:00
Jin Kim
49f2123893 Apply and Fix upstream changes for Cartesia 2024-09-24 07:59:26 +09:00
Jin Kim
cf72129852 Merge remote-tracking branch 'upstream/main' 2024-09-24 07:18:22 +09:00
Mark Backman
8edee8155d Add input params to Azure TTS 2024-09-23 17:52:23 -04:00
chadbailey59
c262b272fa Added RTVIActionFrame (#464)
* added RTVIActionFrame

* server-sent events

* reverted log changes

* fixup
2024-09-23 14:51:17 -05:00
Aleix Conchillo Flaqué
9ef9c1c58a Merge pull request #497 from pipecat-ai/aleix/ruff-formater
introduce Ruff formatting
2024-09-23 10:42:54 -07:00
Aleix Conchillo Flaqué
c7ff79a652 processors: fix formatting string 2024-09-23 09:53:37 -07:00
Aleix Conchillo Flaqué
da81df5284 github: install dev-requirements when running tests 2024-09-23 09:53:37 -07:00
Aleix Conchillo Flaqué
a4420dc88b README: add vscode and emacs ruff instructions 2024-09-23 09:53:37 -07:00
Aleix Conchillo Flaqué
eeb8338dce introduce Ruff formatting 2024-09-23 09:53:37 -07:00
Cyril S.
dfa4ac81fd Implement Sentry instrumentation for performance and error tracking (#470)
* feat: Add Sentry support in FrameProcessor

This update adds optional Sentry integration for performance tracking and error monitoring.

Key changes include:

- Add conditional Sentry import and initialization check
- Implement Sentry spans in FrameProcessorMetrics to measure TTFB (Time To First Byte) and processing time when Sentry is available
- Maintain existing metrics functionality with MetricsFrame regardless of Sentry availability

* feat: Enable metrics in DeepgramSTTService for Sentry

This commit enhances the DeepgramSTTService class to enable metrics generation for use with Sentry.

Key changes include:

1. Enable general metrics generation:
   - Implement `can_generate_metrics` method, returning True when VAD is enabled
   - This allows metrics to be collected and used by both Sentry and the metrics system in frame_processor.py

2. Integrate Sentry-compatible performance tracking:
   - Add start_ttfb_metrics and start_processing_metrics calls in the VAD speech detection handler
   - Implement stop_ttfb_metrics call when receiving transcripts
   - Add stop_processing_metrics for final transcripts

3. Enhance VAD support for metrics:
   - Add `vad_enabled` property to check VAD event availability
   - Implement VAD-based speech detection handler for precise metric timing

These changes enable detailed performance tracking via both Sentry and the general metrics system when VAD is active. This allows for better monitoring and analysis of the speech-to-text process, providing valuable insights through Sentry and any other metrics consumers in the pipeline.

* Update frame_processor.py

* Refactor to support flexible metrics implementation

- Modified the __init__ method to accept a metrics parameter that is either FrameProcessorMetrics or one of its subclasses
- Updated the metrics initialization to create an instance with the processor's name
- Moved all FrameProcessorMetrics-related logic to a new processors/metrics/base.py file

* Implement flexible metrics system with Sentry integration

1. Created a new metrics module in processors/metrics/

2. Implemented FrameProcessorMetrics base class in base.py:

3. Implemented SentryMetrics class in sentry.py:
   - Inherits from FrameProcessorMetrics
   - Integrates with Sentry SDK for advanced metrics tracking
   - Implements Sentry-specific span creation and management for TTFB and processing metrics
   - Handles cases where Sentry is not available or initialized
2024-09-23 08:44:14 -07:00
Lewis Wolfgang
ea16dca8aa Merge pull request #469 from pipecat-ai/lewis/remove_torch_dependency
Remove torch dependency for using silero_vad
2024-09-23 09:59:40 -04:00
Mark Backman
306632b29a Add language_code support for ElevenLabs TTS 2024-09-23 09:01:02 -04:00
duyalei
4533ed014f add full-width punctuations as end of the sentence 2024-09-23 16:35:00 +08:00
Jin Kim
68cc4186ad Merge remote-tracking branch 'upstream/main' 2024-09-23 16:34:31 +09:00
Mark Backman
9a4e749c7c Merge pull request #491 from pipecat-ai/mb/elevenlabs-inputs
Add voice_settings and optimize_streaming_latency to ElevenLabs
2024-09-22 21:54:21 -04:00
Mark Backman
55c645c614 Add voice_settings and optimize_streaming_latency to ElevenLabs 2024-09-22 13:58:50 -04:00
Mark Backman
a1024bb365 Merge pull request #490 from pipecat-ai/mb/llm-rtvi-service-option
Add control frames for LLM param updates
2024-09-21 20:10:17 -04:00
Mark Backman
dfc82c3ba4 Merge pull request #486 from pipecat-ai/mb/llm-extra-params
Add extra input param to LLMs
2024-09-21 18:25:47 -04:00
Mark Backman
9e27a8aad0 Add control frames for LLM param updates 2024-09-21 00:02:58 -04:00
Mark Backman
c73111afea Add extra input param to LLMs 2024-09-21 00:01:25 -04:00
Kwindla Hultman Kramer
26a64afd8d Merge pull request #485 from pipecat-ai/khk/metrics-model-exclude-none
fixup for serialization issue
2024-09-20 18:24:19 -07:00
Kwindla Hultman Kramer
78a3f081de fixup for serialization issue 2024-09-20 18:21:06 -07:00
Mark Backman
e8f8a49646 Merge pull request #484 from pipecat-ai/mb/llm-input-params
Add input params for OpenAI, Anthropic, Together AI LLMs
2024-09-20 20:35:49 -04:00
Mark Backman
219304c5ee Added Changelog entries 2024-09-20 20:31:42 -04:00
Mark Backman
f3fd312b83 Add Together AI interruptible example 2024-09-20 20:21:19 -04:00
Mark Backman
357e66d64d Input params for Together AI LLM 2024-09-20 20:21:19 -04:00
Mark Backman
4fa1ea8c4b Input params for Anthropic LLM 2024-09-20 20:21:19 -04:00
Mark Backman
3b81cd462d Input params to OpenAI LLM 2024-09-20 20:21:19 -04:00
Aleix Conchillo Flaqué
14acf05a26 Merge pull request #480 from pipecat-ai/aleix/input-output-frames
introduce input/output audio and image frames
2024-09-20 14:44:37 -07:00
Mattie Ruth
58d9c84bc9 Merge pull request #474 from pipecat-ai/ruthless/improve-metrics-types-2
Ruthless/improve metrics types 2
2024-09-20 09:47:24 -04:00
Aleix Conchillo Flaqué
7e39d9ad3d introduce input/output audio and image frames
We now distinguish between input and output audio and image frames. We introduce
`InputAudioRawFrame`, `OutputAudioRawFrame`, `InputImageRawFrame` and
`OutputImageRawFrame` (and other subclasses of those). The input frames usually
come from an input transport and are meant to be processed inside the pipeline
to generate new frames. However, the input frames will not be sent through an
output transport. The output frames can also be processed by any frame processor
in the pipeline and they are allowed to be sent by the output transport.
2024-09-19 23:11:03 -07:00
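The input/output split described in commit 7e39d9ad3d can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class names `InputAudioRawFrame` and `OutputAudioRawFrame` come from the commit message, while the base class fields and the `frames_to_send` helper are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AudioRawFrame:
    # Field names are illustrative assumptions, not the real frame layout.
    audio: bytes
    sample_rate: int = 16000


@dataclass
class InputAudioRawFrame(AudioRawFrame):
    """Audio that arrived from an input transport."""


@dataclass
class OutputAudioRawFrame(AudioRawFrame):
    """Audio that is allowed to leave through an output transport."""


def frames_to_send(frames: Iterable[AudioRawFrame]) -> List[AudioRawFrame]:
    # Hypothetical output-transport filter: input frames can be processed
    # in the pipeline, but only output frames are passed on for sending.
    return [f for f in frames if isinstance(f, OutputAudioRawFrame)]
```

With this shape, a processor that wants microphone audio to be played back would have to create a new output frame from the input frame's data rather than forward the input frame unchanged.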
mattie ruth backman
a4edb3dab1 Cleanup on aisle METRICS. Note: See below, this is a breaking change
1. Fleshed out MetricsFrames and broke it into a proper set of types
2. Add model_name as a property to the AIService so that it can be
   automatically included in metrics and also remove that
   overhead from all the various services themselves

Breaking change!

Because of the types improvements, the MetricsFrame type has
changed. Each frame will have a list of metrics similar to before
except each item in the list will only contain one type of metric:
"ttfb", "tokens", "characters", or "processing". Previously these
fields would be in every entry but set to None if they didn't apply.

While this changes internal handling of the MetricsFrame, it does NOT
break the RTVI/daily messaging of metrics. That format remains the same.

Also, remember to use model_name for accessing a service's current
model and set_model_name for setting it.
2024-09-19 21:30:34 -04:00
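The breaking change in commit a4edb3dab1 (one metric type per list entry instead of every field on every entry) might look something like the sketch below. The four kinds "ttfb", "tokens", "characters" and "processing" come from the commit message; all class and field names here are hypothetical and may not match the real types.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class TTFBMetric:
    processor: str
    value: float


@dataclass
class ProcessingMetric:
    processor: str
    value: float


@dataclass
class TokensMetric:
    processor: str
    prompt_tokens: int
    completion_tokens: int


@dataclass
class CharactersMetric:
    processor: str
    characters: int


Metric = Union[TTFBMetric, ProcessingMetric, TokensMetric, CharactersMetric]

# Each entry carries exactly one kind of metric, identified by its type.
_KINDS = {
    TTFBMetric: "ttfb",
    ProcessingMetric: "processing",
    TokensMetric: "tokens",
    CharactersMetric: "characters",
}


def summarize(metrics: List[Metric]) -> List[Tuple[str, str]]:
    # Consumers dispatch on the entry type instead of checking for None fields.
    return [(_KINDS[type(m)], m.processor) for m in metrics]
```

Code that previously read every field and skipped the `None` ones would instead branch on the type of each entry.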
Mattie Ruth
ed409d0460 Merge pull request #478 from pipecat-ai/ruthless/get-tests-running
Ruthless/get tests running
2024-09-19 21:01:27 -04:00
mattie ruth backman
50b45ac2da get the test infrastructure running again
disable broken tests for now
2024-09-19 20:58:17 -04:00
Kwindla Hultman Kramer
29bcbc68c5 Merge pull request #479 from pipecat-ai/khk/small-fixes
fix small issues that crept into main
2024-09-19 17:25:27 -07:00
Kwindla Hultman Kramer
affbe9ac7d fix small issues that crept into main 2024-09-19 17:17:33 -07:00
Aleix Conchillo Flaqué
1790fa452f Merge pull request #436 from pipecat-ai/aleix/frameprocessor-single-task
introduce synchronous and asynchronous frame processors
2024-09-19 11:22:56 -07:00
Aleix Conchillo Flaqué
607a246572 updated CHANGELOG with sync/async frame processors 2024-09-19 01:32:17 -07:00
Aleix Conchillo Flaqué
4f1b06e6b2 pipeline: renamed ParallelTask to SyncParallelPipeline 2024-09-19 01:32:17 -07:00
Aleix Conchillo Flaqué
62e9a33a70 examples: use CartesiaHttpTTSService to synchronize frames 2024-09-19 01:32:17 -07:00
Aleix Conchillo Flaqué
3298f935ef services(fal,moondream): add missing **kwargs 2024-09-19 01:32:17 -07:00
Aleix Conchillo Flaqué
0e8f56c752 services: move TTSService push_stop_frames to AsyncTTSService 2024-09-19 01:32:15 -07:00
Aleix Conchillo Flaqué
8224538372 services(cartesia): added CartesiaHttpTTSService 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
fbf6eef68f transports(base_output): wait for sink tasks before canceling audio/video tasks 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
f078d156de frames: StartFrame is now a SystemFrame 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
23d6eed5ea transports: input()/output() return subclass instead of base class 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
0ed3d118d6 services(moondream): update revision to 2024-08-26 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
337f048864 introduce synchronous and asynchronous frame processors
Pipecat has a pipeline-based architecture. The pipeline consists of frame
processors linked to each other. The elements travelling across the pipeline are
called frames.

To get deterministic behavior, the frames travelling through the pipeline
should always be ordered, except system frames, which are out-of-band frames. To
achieve that, each frame processor should only output frames from a single task.

There are synchronous and asynchronous frame processors. Synchronous
processors push output frames from the same task in which they receive input
frames, and therefore only push frames from one task. Asynchronous frame
processors can have internal tasks to perform things asynchronously (e.g.
receiving data from a websocket), but they also push frames from a single task.
2024-09-19 01:31:10 -07:00
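The "single push task" rule from commit 337f048864 can be illustrated with plain `asyncio`. This is a sketch under assumptions, not Pipecat's implementation: `AsyncProcessorSketch` and its methods are hypothetical names, and strings stand in for frames.

```python
import asyncio
from typing import List, Optional


class AsyncProcessorSketch:
    """Hypothetical processor: internal work in one task, pushing in another."""

    def __init__(self) -> None:
        self.pushed: List[str] = []
        self._queue: Optional[asyncio.Queue] = None

    async def _receive(self, items: List[str]) -> None:
        # Stand-in for an internal task, e.g. reading from a websocket.
        for item in items:
            await asyncio.sleep(0)
            await self._queue.put(item)
        await self._queue.put(None)  # sentinel: no more frames

    async def _push(self) -> None:
        # The only task that emits frames downstream, so order is preserved.
        while True:
            frame = await self._queue.get()
            if frame is None:
                break
            self.pushed.append(frame)

    async def run(self, items: List[str]) -> List[str]:
        self._queue = asyncio.Queue()  # created inside the running loop
        receiver = asyncio.create_task(self._receive(items))
        pusher = asyncio.create_task(self._push())
        await asyncio.gather(receiver, pusher)
        return self.pushed
```

Because every frame passes through one queue and one consumer, downstream processors see frames in the order they were enqueued, regardless of how many internal tasks produce them.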
Mark Backman
6f3c421621 Merge pull request #475 from pipecat-ai/mb/tts-sample-rate
Add sample_rate setting to TTS services
2024-09-18 14:59:09 -04:00
Mark Backman
eadd68d40b Add sample_rate setting to TTS services 2024-09-18 14:50:20 -04:00
Lewis Wolfgang
71202e3cd5 Remove torch dependency for using silero_vad 2024-09-17 16:48:52 -04:00
Jin Kim
75008d8f11 Add speed and emotion setting method to Cartesia TTS service 2024-09-18 00:51:45 +09:00
Jin Kim
2da0ecbe3c Revert "model_id" as a main argument 2024-09-18 00:38:12 +09:00
Jin Kim
c7f814b2dc Merge remote-tracking branch 'upstream/main' 2024-09-18 00:33:29 +09:00
Aleix Conchillo Flaqué
13a4a05388 Merge pull request #466 from pipecat-ai/aleix/elevenlabs-cartesia-close-websocket-first
services(cartesia,elevenlabs): close websocket before the receiving task
2024-09-16 23:55:28 -07:00
Aleix Conchillo Flaqué
20c019ae16 services(cartesia,elevenlabs): close websocket before the receiving task 2024-09-16 23:54:21 -07:00
Adrian Cowham
387a36dd8a missed a debug print statement 2024-09-16 17:43:42 -07:00
Aleix Conchillo Flaqué
d9d6571c73 Merge pull request #465 from kunal-cai/ks--fix-ws
[Cartesia] Fix streaming truncation bug with Twilio Fast API WS
2024-09-16 17:17:13 -07:00
Kunal Shah
540cad4844 Undo sorting 2024-09-16 16:07:19 -07:00
Kunal Shah
0a26b650c0 Undo sorting 2024-09-16 16:06:25 -07:00
Kunal Shah
adaac003e5 [Cartesia] Fix streaming truncation bug with Twilio Fast API WS 2024-09-16 15:59:06 -07:00
Adrian Cowham
2e02ab740d PR feedback 2024-09-15 20:59:17 -07:00
Aleix Conchillo Flaqué
3d4f125071 Merge pull request #454 from pipecat-ai/aleix/initial-pipeline-clock-support
initial pipeline clock support
2024-09-13 13:51:04 -07:00
Aleix Conchillo Flaqué
bce87f8717 update CHANGELOG.md 2024-09-13 13:50:03 -07:00
Aleix Conchillo Flaqué
1fe940bd6b services(cartesia,elevenlabs): use word start times instead 2024-09-13 13:10:44 -07:00
Aleix Conchillo Flaqué
cb36a71381 fix some linting 2024-09-13 09:56:12 -07:00
Aleix Conchillo Flaqué
5acc4928fe examples: add 07d-interruptible-elevenlabs.py 2024-09-13 09:43:18 -07:00
Aleix Conchillo Flaqué
434493b8aa services(elevenlabs): implement word-by-word support through websockets 2024-09-13 09:31:35 -07:00
Aleix Conchillo Flaqué
f08b25dbb2 examples: assistant aggregator should always go after transport 2024-09-12 00:37:34 -07:00
Aleix Conchillo Flaqué
3665734972 transports(output): initial sink clock synchronization 2024-09-12 00:37:34 -07:00
Aleix Conchillo Flaqué
a98d78cdea services(lmnt): change to subclass of AsyncTTSService 2024-09-12 00:37:34 -07:00
Aleix Conchillo Flaqué
80f6d74e80 services(cartesia): change to subclass of AsyncWordTTSService 2024-09-12 00:37:34 -07:00
Aleix Conchillo Flaqué
02d926e9bd services: create AsyncTTSService and AsyncWordTTSService 2024-09-12 00:31:48 -07:00
Aleix Conchillo Flaqué
7749692f72 processors: get pipeline clock from StartFrame 2024-09-12 00:31:48 -07:00
Aleix Conchillo Flaqué
7807cbeb39 pipeline(task): add a clock to the pipeline task 2024-09-12 00:31:48 -07:00
Aleix Conchillo Flaqué
72f231b327 frames: add a presentation timestamp (pts) to each frame 2024-09-12 00:31:48 -07:00
Aleix Conchillo Flaqué
3cbe97d346 clocks: added new BaseClock and SystemClock 2024-09-12 00:31:48 -07:00
Adrian Cowham
b4eff2028f Merge branch 'main' into recording 2024-09-10 10:18:57 -07:00
Adrian Cowham
f411bf33fd adding a frame processor with the ability to save a conversation to a buffer and another frame processor to upload audio to Canonical for evaluation and metrics collection. Examples included 2024-09-10 10:15:48 -07:00
Kwindla Hultman Kramer
b880e1a60e Merge pull request #448 from pipecat-ai/khk/aggregation-leading-space
fix for leading space in context aggregator strings
2024-09-10 09:57:35 -07:00
Aleix Conchillo Flaqué
886046e696 Merge pull request #445 from dleybz/patch-1
Update requirements.txt
2024-09-09 17:54:33 -07:00
Aleix Conchillo Flaqué
9106a5f8ae Merge pull request #449 from pipecat-ai/aleix/audio-out-bitrate
transports(daily): allow setting audio output bitrate (default 96kbps)
2024-09-09 08:39:06 -07:00
Aleix Conchillo Flaqué
98286336bf transports(daily): allow setting audio output bitrate (default 96kbps)
Fixes #388
2024-09-08 19:39:17 -07:00
Jin Kim
fa0deededa Add voice options and make to use InputParams for Cartesia. 2024-09-09 10:53:23 +09:00
Kwindla Hultman Kramer
081b001c8b fix for leading space in context aggregator strings 2024-09-07 16:42:52 -07:00
Danny D. Leybzon
c92531a02f Update requirements.txt
request.form() throws an error if you don't have python-multipart installed
2024-09-06 20:22:18 +02:00
Aleix Conchillo Flaqué
748a7af602 update CHANGELOG.md 2024-09-05 19:05:29 -07:00
Aleix Conchillo Flaqué
f4a0de6327 Merge pull request #444 from pipecat-ai/aleix/elevenlabs-streaming
services(elevenlabs): add elevenlabs package and use streaming
2024-09-05 11:24:12 -07:00
Aleix Conchillo Flaqué
e405d7af9f services(elevenlabs): add elevenlabs package and use streaming 2024-09-05 11:20:01 -07:00
Aashraya
51cd7fd285 twiliohandle interruption (#422)
* add interruption handler in twilio serializer

* fix autopep8

* revert ruff autoformatting

* address pr comments

* change interruption frame to user started frame in serializer

* remove overridden handle interrupt

* remove unused import

* change userstarted to interruption frame
2024-09-02 11:06:38 -07:00
Aleix Conchillo Flaqué
aba5f89174 Merge pull request #437 from soof-golan/soof-obj-id-generation
Generate ids with itertools.count
2024-09-02 10:53:48 -07:00
Soof Golan
5c0f5a1613 Generate ids with itertools.count
Avoids the critical section with threading.Lock in favor of itertools.count.

`count` objects are threadsafe, and their critical section is implemented in C and provides better performance than Python-level locking.
2024-09-02 15:39:58 +02:00
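The approach in commit 5c0f5a1613 can be sketched as below. The helper names `obj_id` and `obj_count` and the per-class counter layout are assumptions for illustration; the thread-safety claim comes from the commit message and relies on CPython's C implementation of `itertools.count`, which this sketch does not itself verify.

```python
import itertools
from collections import defaultdict

# One global counter for ids, and one lazily created counter per class name.
_ID_COUNTER = itertools.count()
_CLASS_COUNTERS = defaultdict(itertools.count)


def obj_id() -> int:
    """Return the next id; no explicit threading.Lock is taken."""
    return next(_ID_COUNTER)


def obj_count(obj: object) -> int:
    """Return a running count of calls made for obj's class (hypothetical helper)."""
    return next(_CLASS_COUNTERS[obj.__class__.__name__])
```

Each call to `next()` advances the counter in a single C-level step, which is what lets the Python-level lock be dropped in this design.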
Aleix Conchillo Flaqué
7c342f7ba2 Merge pull request #433 from pipecat-ai/aleix/process-all-startframes
StartFrame should be the first frame every processor receives
2024-08-30 14:17:38 -07:00
Aleix Conchillo Flaqué
37e2388758 StartFrame should be the first frame every processor receives
Fixes #427
2024-08-29 22:43:44 -07:00
Aleix Conchillo Flaqué
05f0492a8d Merge pull request #421 from pipecat-ai/aleix/improve-multi-lingual-support
improve multi lingual support
2024-08-29 13:19:40 -07:00
Aleix Conchillo Flaqué
c0ac5c6ae8 services(lmnt): fix example and update README and CHANGELOG 2024-08-29 11:11:24 -07:00
Aleix Conchillo Flaqué
be923687fb processors(rtvi): user decides if bot interrupts on update config 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
5f32fb125d updated CHANGELOG.md 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
ae6fbb3146 services: just set model, voice, language independently 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
864768635a services: add voice and language to set_model() 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
d7c9679977 services: allow TTSModelUpdateFrame to also update language and voice 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
fedfc366f6 services(deepgram): fix strenum values 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
b3b39626e1 services: allow switching STT language and model at the same time 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
4e0ece17b6 services: added support for setting STT model and language 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
fd3fdacdee transcriptions: added more languages 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
a253606d50 services(daily): on_joined now returns all data not only participant 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
568d9dc0a3 services(whisper): inherit from SegmentedSTTService 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
6629b853c5 services(deepgram): inherit from STTService instead of AsyncAIService 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
3931cb3235 services(cartesia): allow setting language and language voices 2024-08-29 11:00:01 -07:00
Aleix Conchillo Flaqué
38cd86ad52 services: added language to transcription frames 2024-08-29 10:59:02 -07:00
Aleix Conchillo Flaqué
c0cdabf61d frames: added TTSLanguageUpdateFrame and TTSLanguageVoicesUpdateFrame 2024-08-29 10:59:02 -07:00
Aleix Conchillo Flaqué
51270a96c5 frames: add language to transcription frames 2024-08-29 10:59:02 -07:00
Kwindla Hultman Kramer
84d72c0d5c Merge pull request #425 from pipecat-ai/khk/rtvi-together-function-calling
fixup type mismatches between rtvi data structures and together.py
2024-08-28 13:11:52 -07:00
Aleix Conchillo Flaqué
79aca8169a Merge pull request #391 from sharvil/pr/add-lmnt
LMNT TTS
2024-08-27 21:40:46 -07:00
Kwindla Hultman Kramer
b9d362bd62 fixup type mismatches between rtvi data structures and together.py 2024-08-27 17:39:21 -07:00
Sharvil Nanavati
87c4a1bee1 Move stop frame task creation into TTSService.start 2024-08-27 04:45:21 +00:00
Sharvil Nanavati
c979762b70 Handle cancellation, stopping, and restarting 2024-08-27 01:24:00 +00:00
Sharvil Nanavati
1d92fc3199 Merge branch 'main' into pr/add-lmnt 2024-08-24 10:07:52 -07:00
Sharvil Nanavati
8ac7fb1a67 Use a single long-lived Task to push TTSStoppedFrame 2024-08-24 16:18:07 +00:00
Sharvil Nanavati
60c3d33def Default LMNT to 24kHz, add example 2024-08-24 15:40:29 +00:00
Sharvil Nanavati
8a39d3f4eb services: add a generic mechanism to produce TTSStoppedFrames 2024-08-24 15:40:12 +00:00
Aleix Conchillo Flaqué
e038767b6f Merge pull request #413 from pipecat-ai/aleix/pipecat-0.0.41
prepare pipecat 0.0.41
2024-08-22 17:01:43 -07:00
Aleix Conchillo Flaqué
0c46b3e481 prepare pipecat 0.0.41 2024-08-22 11:50:20 -07:00
Aleix Conchillo Flaqué
d42f072ff5 examples: fix studypal errors and update requirements 2024-08-22 11:50:05 -07:00
Aleix Conchillo Flaqué
9b6f29c24a Merge pull request #414 from pipecat-ai/aleix/add-livekit-dependency
added livekit dependency
2024-08-22 10:55:43 -07:00
Aleix Conchillo Flaqué
873d5dc23f added livekit dependency 2024-08-22 10:54:18 -07:00
Aleix Conchillo Flaqué
6d141fd47f Merge pull request #396 from nulyang/feat/livekit-serializers
Add livekit audio serializers
2024-08-22 10:44:24 -07:00
Aleix Conchillo Flaqué
c6f6cb2947 Merge pull request #412 from pipecat-ai/aleix/fastapi-variable-clash
transports(fastapi): fix variable name clash
2024-08-22 09:50:23 -07:00
Aleix Conchillo Flaqué
0eb189ce7f transports(fastapi): fix variable name clash 2024-08-22 08:50:03 -07:00
Sharvil Nanavati
f4fd7b7028 LMNT TTS 2024-08-22 00:47:41 +00:00
Aleix Conchillo Flaqué
21de8e0a35 transport(out): log bot started/stopped speaking 2024-08-21 17:23:44 -07:00
Aleix Conchillo Flaqué
6f55d494bd frames: use VADParams type in VADParamsUpdateFrame 2024-08-21 17:23:12 -07:00
Aleix Conchillo Flaqué
d216edc567 Merge pull request #409 from aashsach/anthropic-empty-tool-argument
handle empty parameters for anthropic function calling
2024-08-21 16:14:51 -07:00
Aashraya
ec6063ecc4 system is not a list, it is handled and assigned as string 2024-08-21 16:31:50 +05:30
Aashraya
40fe4ce6fb handle empty parameters for anthropic function calling 2024-08-21 15:49:36 +05:30
Aleix Conchillo Flaqué
31d87a4048 update CHANGELOG.md for 0.0.40 2024-08-20 11:48:40 -07:00
Aleix Conchillo Flaqué
ac8b171fa9 Merge pull request #406 from pipecat-ai/hush/cartesiaDocs
Hush/cartesia docs
2024-08-20 11:17:52 -07:00
James Hush
1f06d78213 github: remove *requirements.txt from tests.yaml 2024-08-20 11:16:25 -07:00
James Hush
28eba17df8 docs: update Cartesia references 2024-08-20 11:13:13 -07:00
Aleix Conchillo Flaqué
dfc2e62339 Merge pull request #405 from pipecat-ai/aleix/revert-dailysettings-aliases
Revert "transports(daily): use aliases in DailyDialinSettings"
2024-08-20 08:53:31 -07:00
Aleix Conchillo Flaqué
80c89a39c9 processors(rtvi): add support for client-ready message (fix) 2024-08-20 07:54:11 -07:00
Aleix Conchillo Flaqué
9d1c16e996 Revert "transports(daily): use aliases in DailyDialinSettings"
This reverts commit 47d375309d.
2024-08-20 07:52:35 -07:00
Aleix Conchillo Flaqué
86604c2353 examples(studypal): use aiohttp instead of requests 2024-08-19 18:11:30 -07:00
Aleix Conchillo Flaqué
8f31a02938 Merge pull request #403 from yashn35/studypal-demo
Add studypal
2024-08-19 17:39:19 -07:00
Aleix Conchillo Flaqué
47d375309d transports(daily): use aliases in DailyDialinSettings 2024-08-19 17:27:43 -07:00
Yash Narayan
980265ca97 Add studypal 2024-08-19 16:58:29 -07:00
Aleix Conchillo Flaqué
90479fff95 processors(rtvi): add set_client_ready() 2024-08-19 16:41:43 -07:00
Aleix Conchillo Flaqué
1ce1fcb0ce Merge pull request #401 from pipecat-ai/aleix/use-cartesia-in-more-examples
examples: use Cartesia TTS in most examples
2024-08-19 16:07:35 -07:00
Aleix Conchillo Flaqué
1a662376fc examples: use Cartesia TTS in most examples 2024-08-19 15:31:34 -07:00
Aleix Conchillo Flaqué
1d24f926ec Merge pull request #400 from pipecat-ai/aleix/rtvi-client-ready
processors(rtvi): add support for client-ready message
2024-08-19 10:53:49 -07:00
Aleix Conchillo Flaqué
4f2c37c940 processors(rtvi): add support for client-ready message 2024-08-19 10:33:18 -07:00
Aleix Conchillo Flaqué
042115a6bb processors(rtvi): update initial config when sending bot-ready message 2024-08-19 09:32:27 -07:00
Aleix Conchillo Flaqué
c9f1469b41 transports(daily/helpers): add server error message to the logs 2024-08-19 08:44:05 -07:00
Aleix Conchillo Flaqué
54c9f604c9 updated CHANGELOG with VADParamsUpdateFrame 2024-08-18 21:20:40 -07:00
Kwindla Hultman Kramer
56fbcd6562 Merge pull request #397 from pipecat-ai/khk/rtvi-vad-params
VADParamsUpdateFrame and handling thereof
2024-08-18 21:14:58 -07:00
Kwindla Hultman Kramer
e6b0500568 make VADAnalyzer:set_params() public 2024-08-18 21:11:18 -07:00
Aleix Conchillo Flaqué
41038b6673 Merge pull request #394 from pipecat-ai/aleix/fix-function-calling-examples
fix function calling examples
2024-08-18 20:55:29 -07:00
Aleix Conchillo Flaqué
26d03f26c9 services(openai, anthropic): a None result should not run inference 2024-08-18 20:48:43 -07:00
Aleix Conchillo Flaqué
f3a4e54996 function calling: start callback should have function name first 2024-08-18 20:48:20 -07:00
Kwindla Hultman Kramer
925e80bb20 VADParamsUpdateFrame and handling thereof 2024-08-18 13:34:46 -07:00
nulyang
9bda09b1a8 serializers(livekit): Add audio serializers 2024-08-18 23:40:32 +08:00
Aleix Conchillo Flaqué
ef0d0531fa services: moved request_image_frame() to LLMService 2024-08-17 23:59:38 -07:00
Aleix Conchillo Flaqué
6520f20ffe fix function calling examples 2024-08-17 23:32:39 -07:00
Aleix Conchillo Flaqué
ebc4e0924b Merge pull request #387 from pipecat-ai/aleix/update-reqs-081624
update pyproject.toml and remove requirements files
2024-08-17 23:29:47 -07:00
Aleix Conchillo Flaqué
9e7c0e6033 Merge pull request #390 from sharvil/pr/websocket-fix
transports(websocket): fix `_audio_buffer` being accidentally overwritten
2024-08-17 23:26:35 -07:00
Aleix Conchillo Flaqué
cf5720f316 update CHANGELOG.md 2024-08-17 21:00:32 -07:00
Kwindla Hultman Kramer
655b468269 Merge pull request #393 from pipecat-ai/khk/anthropic-tools-ordering
fix for out-of-order image messages in anthropic context
2024-08-17 15:07:27 -07:00
Kwindla Hultman Kramer
17f8c93e44 fix for out-of-order image messages in anthropic context 2024-08-17 14:47:29 -07:00
Aleix Conchillo Flaqué
5b4061b0d5 processors(rtvi): fix send_error() 2024-08-16 23:46:57 -07:00
Aleix Conchillo Flaqué
6ce0227e98 processors(rtvi): error-response should always include an error 2024-08-16 23:23:55 -07:00
Aleix Conchillo Flaqué
a583a28850 processors(rtvi): error message should use error field 2024-08-16 23:22:27 -07:00
Aleix Conchillo Flaqué
32daf65adc processors(rtvi): send to the client if errors are fatal 2024-08-16 23:17:55 -07:00
Aleix Conchillo Flaqué
e22c80610e frames: add new FatalErrorFrame 2024-08-16 23:17:31 -07:00
Sharvil Nanavati
374f1e7e01 transports(websocket): fix _audio_buffer being accidentally overwritten
`BaseOutputTransport` declares an `_audio_buffer` instance variable.
`WebsocketServerOutputTransport` accidentally reuses that variable
internally assuming it's class-local and not inherited.

This PR renames the variable in `WebsocketServerOutputTransport`
to avoid the name collision.
2024-08-17 05:28:05 +00:00
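The name collision described in commit 374f1e7e01 is easy to reproduce in isolation. The classes below are simplified stand-ins rather than the real transports, and `_websocket_audio_buffer` is a hypothetical replacement name, since the commit message does not state the name actually chosen.

```python
class BaseOutputSketch:
    """Stand-in for a base transport that owns an `_audio_buffer`."""

    def __init__(self) -> None:
        self._audio_buffer = bytearray()

    def queue_audio(self, data: bytes) -> int:
        self._audio_buffer.extend(data)
        return len(self._audio_buffer)


class CollidingOutput(BaseOutputSketch):
    def __init__(self) -> None:
        super().__init__()
        # Meant as subclass-local state, but this rebinds the inherited attribute.
        self._audio_buffer = b""


class RenamedOutput(BaseOutputSketch):
    def __init__(self) -> None:
        super().__init__()
        # Hypothetical distinct name, so the base-class buffer is left alone.
        self._websocket_audio_buffer = b""


def base_buffer_works(transport: BaseOutputSketch) -> bool:
    try:
        transport.queue_audio(b"\x00\x01")
        return True
    except AttributeError:
        # bytes has no extend(), so the base-class method breaks.
        return False
```

Python instance attributes share one namespace across the whole class hierarchy, so a leading underscore signals intent but does not prevent a subclass from overwriting base-class state.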
Aleix Conchillo Flaqué
d2dfa93bf1 processors(rtvi): send bot-ready when participant joins 2024-08-16 13:58:21 -07:00
Aleix Conchillo Flaqué
fa8c6712c6 transports(daily): fix multiple DailyTransport initialization 2024-08-16 13:32:34 -07:00
Aleix Conchillo Flaqué
4c2b84cb4d update pyproject.toml and remove requirements files 2024-08-16 09:28:46 -07:00
Aleix Conchillo Flaqué
b57c9d569b Merge pull request #352 from pipecat-ai/aleix/rtvi-0.1
processors(rtvi): rtvi 0.1 message protocol
2024-08-15 17:35:50 -07:00
Aleix Conchillo Flaqué
f0e50ba000 Merge pull request #336 from nulyang/fix/azure-transcriptionframe
services(azure): fix TranscriptionFrame parameter type
2024-08-15 17:08:56 -07:00
Mattie Ruth
4a6638f749 Merge pull request #385 from pipecat-ai/mrkb/anthropic-beta-caching
Mrkb/anthropic beta caching
2024-08-15 18:26:51 -04:00
Aleix Conchillo Flaqué
31577252f3 processors(rtvi): handle ErrorFrames 2024-08-15 15:23:31 -07:00
Aleix Conchillo Flaqué
5d71c50080 transports(daily): make sure audio_in_task exists before canceling 2024-08-15 15:23:07 -07:00
Aleix Conchillo Flaqué
981269d594 pipeline(task): process ErrorFrame in same task and stop pipeline task 2024-08-15 15:22:40 -07:00
mattie ruth backman
848db985fc bump anthropic in 3.10 requirements 2024-08-15 16:51:48 -04:00
mattie ruth backman
d5d8e31447 add cache tokens to metrics event 2024-08-15 16:51:48 -04:00
Aleix Conchillo Flaqué
66670a2370 Merge pull request #384 from pipecat-ai/aleix/enable-prompt-caching-frames
services(anthropic): allow setting enable prompt caching via frame
2024-08-15 13:26:39 -07:00
Aleix Conchillo Flaqué
5637f349c6 services(anthropic): allow setting enable prompt caching via frame 2024-08-15 12:43:29 -07:00
Aleix Conchillo Flaqué
93248e1d00 Merge pull request #382 from pipecat-ai/khk/anthropic-beta-caching
Support for Anthropic prompt caching beta
2024-08-15 12:34:54 -07:00
Kwindla Hultman Kramer
187769357f update version number of anthropic dependency 2024-08-15 12:28:41 -07:00
Aleix Conchillo Flaqué
5be6422cc8 Revert "processors(rtvi): process options in the order they are defined"
This reverts commit 61ac83e2d9.
2024-08-15 11:51:00 -07:00
Aleix Conchillo Flaqué
8670b2d994 utils: add match_endofsentence and use it in processors 2024-08-15 11:26:25 -07:00
Aleix Conchillo Flaqué
0bc6db428d processors(rtvi): implement bot-started-speaking and bot-stopped-speaking 2024-08-15 11:05:10 -07:00
Aleix Conchillo Flaqué
67d565930e services: send TTSStartFrame/TTSStopFrame when really needed 2024-08-15 11:05:10 -07:00
Aleix Conchillo Flaqué
b2a7ff6fd3 processors(rtvi): all transport messages should be urgent 2024-08-15 11:05:10 -07:00
Aleix Conchillo Flaqué
425a730d7c transports(base_output): send urgent transport messages immediately 2024-08-15 11:05:10 -07:00
Aleix Conchillo Flaqué
84c5709722 frames: add urgent field to TransportMessageFrame 2024-08-15 11:05:10 -07:00
Kwindla Hultman Kramer
94deec01c9 okay, both files now 2024-08-15 00:57:10 -07:00
Kwindla Hultman Kramer
6e0dd4a779 Anthropic beta prompt caching 2024-08-15 00:54:43 -07:00
Kwindla Hultman Kramer
14bde340dd Merge pull request #381 from pipecat-ai/khk/anthropic-fixup-0814.2
Fixup anthropic context set_messages
2024-08-14 23:34:31 -07:00
Kwindla Hultman Kramer
253765c611 and fixing anthropic demos 2024-08-14 23:14:20 -07:00
Kwindla Hultman Kramer
2b26d7182f replaces 379 2024-08-14 22:40:09 -07:00
Aleix Conchillo Flaqué
61ac83e2d9 processors(rtvi): process options in the order they are defined 2024-08-14 22:26:49 -07:00
Aleix Conchillo Flaqué
d5c7b28cad Merge pull request #380 from pipecat-ai/aleix/rtvi-0.1-context-aggregators-updates
processors(aggregators): multiple LLM aggregators updates
2024-08-14 20:43:50 -07:00
Aleix Conchillo Flaqué
959580a708 processors(logger): fix linting 2024-08-14 20:39:24 -07:00
Aleix Conchillo Flaqué
3a5cd17ea3 processors(aggregators): multiple LLM aggregators updates 2024-08-14 20:23:18 -07:00
Kwindla Hultman Kramer
b78981bb9d Merge pull request #374 from pipecat-ai/khk/together
Together.ai service implementation with Llama 3.1 function calling
2024-08-14 17:29:07 -07:00
Kwindla Hultman Kramer
a6d90b0a00 linting fixes to anthropic.py 2024-08-14 17:27:00 -07:00
Aleix Conchillo Flaqué
67016492f2 transports(daily/helpers): add delete_room_from_url() 2024-08-14 17:14:02 -07:00
Aleix Conchillo Flaqué
2c38089527 processors(rtvi): handle incoming messages in a separate task 2024-08-14 15:34:02 -07:00
Kwindla Hultman Kramer
48f68ba6dc Service for together.ai, including Llama 3.1 function calling support 2024-08-13 15:01:54 -07:00
Aleix Conchillo Flaqué
574df4ba3d processors(rtvi): make sure to send bot-ready when transport is joined 2024-08-13 13:25:15 -07:00
Aleix Conchillo Flaqué
49ca16d125 pipeline(task): only send initial metrics frames if metrics enabled 2024-08-13 12:22:37 -07:00
Aleix Conchillo Flaqué
87525b085e processors(rtvi): linting and make send_error() public 2024-08-13 11:21:51 -07:00
Aleix Conchillo Flaqué
6b53c6add3 transports(daily): DailyTransport default DailyParams 2024-08-13 11:13:18 -07:00
Kwindla Hultman Kramer
29ca1b7855 Anthropic tool use core Pipecat pieces refactored (#369)
* processors(rtvi): rtvi 0.1 message protocol

* added a single function call handler

* wip - function calling

* fixup

* fixup

* fixup

* processors(rtvi): no need for configure_on_start()

* processors(rtvi): add new option values if they haven't been set yet

* Add the model name to the LLM usage metrics

* wip - anthropic tool calling

* still wip - anthropic tool use and vision

* anthropic tools and vision working

* anthropic tool calling and vision

* Cartesia error handling

* Anthropic tool use core Pipecat pieces refactored as per plan

* aleix has good ideas

* Usage metrics for Anthropic LLMs

* fix function call result state not getting cleared bug

* Pass **kwargs through from AnthropicLLMService constructor

* about to tinker with anthropic

* added openai function calling

* openai function calling

* fixup

---------

Co-authored-by: Aleix Conchillo Flaqué <aleix@daily.co>
Co-authored-by: Chad Bailey <chadbailey@gmail.com>
Co-authored-by: mattie ruth backman <mattieruth@gmail.com>
Co-authored-by: chadbailey59 <chadbailey59@users.noreply.github.com>
2024-08-13 13:01:24 -05:00
Aleix Conchillo Flaqué
a42d0c9907 processors(rtvi): add interrupt_bot() 2024-08-13 09:22:43 -07:00
marcus-daily
8bc6ceaa3d Fixing pep8 2024-08-13 15:32:23 +01:00
marcus-daily
0b8a1ab5d1 Handle describe-actions message 2024-08-13 15:32:23 +01:00
Brian Hill
358c287db2 chore: Enable build without git 2024-08-12 11:38:41 -04:00
Brian Hill
2e68453655 Merge pull request #371 from pipecat-ai/cbrianhill/allow-build-without-git
chore: Enable build without git
2024-08-12 10:15:55 -04:00
Brian Hill
89b8a9de7d chore: Enable build without git 2024-08-12 09:36:25 -04:00
Aleix Conchillo Flaqué
c4c2058df9 processors(rtvi): handle frames pushed from outside in order 2024-08-11 23:09:11 -07:00
Aleix Conchillo Flaqué
0d85c0085f processors(rtvi): interrupt the bot if a new config is received 2024-08-11 23:09:11 -07:00
Mattie Ruth
6fa8a8f84f Merge pull request #365 from pipecat-ai/ruthless/metrics 2024-08-11 20:35:05 -04:00
mattie ruth backman
a97775bff3 Add the model name to the LLM usage metrics 2024-08-11 12:08:46 -04:00
Aleix Conchillo Flaqué
32640e054d processors(rtvi): add new option values if they haven't been set yet 2024-08-10 21:25:39 -07:00
Aleix Conchillo Flaqué
aa42da5658 processors(rtvi): no need for configure_on_start() 2024-08-10 21:25:21 -07:00
nulyang
900a94a825 services(azure): fix TranscriptionFrame parameter type 2024-08-10 13:00:03 +08:00
Aleix Conchillo Flaqué
c37552de70 processors(rtvi): add support for action responses 2024-08-09 18:12:37 -07:00
Aleix Conchillo Flaqué
916b37926c processors(rtvi): rtvi 0.1 message protocol 2024-08-09 17:24:38 -07:00
Aleix Conchillo Flaqué
2b76c3c15a update macos-py3.10-requirements 2024-08-09 17:18:30 -07:00
Aleix Conchillo Flaqué
cedd7dde18 update linux-py3.10-requirements.txt 2024-08-09 17:14:46 -07:00
Lewis Wolfgang
d088608d8e Merge pull request #340 from pipecat-ai/lewis/silero-vad-via-pip
Install Silero VAD via pip
2024-08-09 13:27:29 -04:00
Aleix Conchillo Flaqué
06ee29bb8b Merge pull request #359 from pipecat-ai/aleix/twilio-elevenlabs-sample-rates
twilio and elevenlabs sample rates
2024-08-09 09:38:35 -07:00
Aleix Conchillo Flaqué
d255e954d6 services(elevenlabs): allow specifying output_format 2024-08-09 09:38:20 -07:00
Aleix Conchillo Flaqué
6a7ab6b8ac serializers(twilio): allow specifying input and output sample rates 2024-08-09 09:37:51 -07:00
Aleix Conchillo Flaqué
45b18cc0b1 Merge pull request #358 from pipecat-ai/aleix/daily-create-room-exp-fixes
transports(daily): fixed create_room expirations
2024-08-09 09:37:01 -07:00
Aleix Conchillo Flaqué
0479431f0a Merge pull request #357 from pipecat-ai/aleix/daily-on-participant-updated
transports(daily): added on_participant_updated event
2024-08-09 09:36:46 -07:00
Aleix Conchillo Flaqué
ec58dbd791 transports(daily): added on_participant_updated event
Fixes #353
2024-08-09 09:36:24 -07:00
Aleix Conchillo Flaqué
91de68aab3 Merge pull request #355 from pipecat-ai/aleix/usage-metrics-update
processors(base): add start_llm_usage_metrics and start_tts_usage_met…
2024-08-09 09:35:36 -07:00
Aleix Conchillo Flaqué
85efc30145 Merge pull request #356 from pipecat-ai/aleix/eleven_turbo_v2_5
services(elevenlabs): update default model to eleven_turbo_v2_5
2024-08-09 09:34:47 -07:00
Aleix Conchillo Flaqué
0032594f21 transports(daily): fixed create_room expirations
Fixes #348
2024-08-08 22:04:22 -07:00
Aleix Conchillo Flaqué
829fdc5679 services(elevenlabs): update default model to eleven_turbo_v2_5
Fixes #349
2024-08-08 21:38:18 -07:00
Aleix Conchillo Flaqué
22e176e329 processors(base): add start_llm_usage_metrics and start_tts_usage_metrics 2024-08-08 16:46:56 -07:00
Lewis Wolfgang
826a70a137 Merge pull request #354 from pipecat-ai/lewis/delete_room_by_name
Add delete_room_by_name to DailyRESTHelper
2024-08-08 17:09:21 -04:00
Lewis Wolfgang
dd0ea674af Treat 404 (room not found) as a success for deletion 2024-08-08 16:57:58 -04:00
Lewis Wolfgang
a4761b8921 Add delete_room_by_name to DailyRESTHelper 2024-08-08 16:31:01 -04:00
chadbailey59
3958bb7903 Additional LLM and TTS metrics (#343)
* added llm and tts usage metrics

* Metrics debug logging

* cleanup
2024-08-07 08:55:51 -05:00
Aleix Conchillo Flaqué
83a037a7ce Merge pull request #345 from pipecat-ai/aleix/base-output-render-time-fixes
transports(base_output): improve render sleep computation
2024-08-06 17:30:47 -07:00
Aleix Conchillo Flaqué
a3eb8337a6 Merge pull request #342 from pipecat-ai/aleix/base-output-transport-push-audio
transport(base_output): push audio downstream
2024-08-06 17:30:32 -07:00
Aleix Conchillo Flaqué
541072f8e0 transports(base_output): improve render sleep computation 2024-08-06 17:20:41 -07:00
Aleix Conchillo Flaqué
881248cbd6 transport(base_output): push audio downstream 2024-08-05 14:00:09 -07:00
Aleix Conchillo Flaqué
d4979f5e64 Merge pull request #337 from pipecat-ai/aleix/audio-video-sync-and-gstreamer
audio/video sync and gstreamer
2024-08-05 09:28:11 -07:00
Aleix Conchillo Flaqué
4133cd03bb processors(gstreamer): add clock_sync property 2024-08-05 09:23:25 -07:00
Lewis Wolfgang
9f07c3ca27 Fly.io example: remove step to cache silero models.
No longer necessary.
2024-08-05 10:12:35 -04:00
Lewis Wolfgang
b20bacb9ed Remove no longer needed code 2024-08-05 10:10:39 -04:00
Lewis Wolfgang
97cfbfee1d Install silero via pip 2024-08-05 10:01:27 -04:00
Aleix Conchillo Flaqué
fa7c941792 examples(gstreamer): add new GStreamer examples 2024-08-04 12:29:36 -07:00
Aleix Conchillo Flaqué
4738879f32 processors(gstreamer): add new GStreamerPipelineSource 2024-08-04 12:29:34 -07:00
Aleix Conchillo Flaqué
d5d88f756a transport(output): improve audio and image handling for video use cases 2024-08-04 12:29:08 -07:00
Aleix Conchillo Flaqué
65b136bf15 Merge pull request #334 from pipecat-ai/aleix/cleanup-examples-remove-requests
cleanup examples and remove requests
2024-08-01 22:05:01 -07:00
Aleix Conchillo Flaqué
bee0b238e4 examples(storytelling-chatbot): include package-lock.json 2024-08-01 18:23:30 -07:00
Aleix Conchillo Flaqué
c891168ffb services: revert optional aiohttp.ClientSession 2024-08-01 18:22:56 -07:00
Aleix Conchillo Flaqué
6376c2f6aa transport(websocket): fix cancel 2024-08-01 18:09:16 -07:00
Aleix Conchillo Flaqué
4d9b7cdd61 DailyRESTHelper now receives an aiohttp client session 2024-08-01 18:08:57 -07:00
Aleix Conchillo Flaqué
8263d1dd6f update CHANGELOG with latest changes 2024-07-31 23:44:07 -07:00
Aleix Conchillo Flaqué
faf41c0b36 services: ignore yielded None values 2024-07-31 23:41:03 -07:00
Aleix Conchillo Flaqué
27a09c0b2c cleanup examples and remove requests library 2024-07-31 23:39:51 -07:00
Aleix Conchillo Flaqué
3db7f6a284 Merge pull request #333 from pipecat-ai/aleix/allow-internal-http-sessions-rebased
services: allow internal http sessions if none is given
2024-07-31 21:57:00 -07:00
Aleix Conchillo Flaqué
3bfeb5b5ef services: allow internal http sessions if none is given 2024-07-31 21:56:19 -07:00
Aleix Conchillo Flaqué
62a7a555b5 Merge pull request #330 from pipecat-ai/aleix/stop-and-cancel-are-different
EndFrame tries to end gracefully; CancelFrame cancels tasks
2024-07-31 15:51:29 -07:00
Aleix Conchillo Flaqué
d60e99a043 examples(06a-image-sync): make sure frames go downstream 2024-07-30 11:41:58 -07:00
Aleix Conchillo Flaqué
77723b34c7 EndFrame tries to end gracefully; CancelFrame cancels tasks 2024-07-30 11:41:19 -07:00
Aleix Conchillo Flaqué
c466d34a06 Merge pull request #328 from pipecat-ai/aleix/rtvi-towards-custom-pipelines
processors(rtvi): refactor to allow future custom pipelines
2024-07-29 15:07:57 -07:00
Aleix Conchillo Flaqué
f816897833 Merge pull request #327 from pipecat-ai/aleix/bot-start-stop-speaking-frames
bot start stop speaking frames
2024-07-27 17:21:23 -07:00
Aleix Conchillo Flaqué
c1e8a5e522 processors(rtvi): refactor to allow future custom pipelines 2024-07-26 10:26:36 -07:00
Aleix Conchillo Flaqué
76aca32f2e transport(output): emit new bot start|stop speaking frames 2024-07-25 14:50:33 -07:00
Aleix Conchillo Flaqué
7e31b2a795 processors(user_idle): use user speaking instead of interruption frames 2024-07-25 14:47:56 -07:00
Aleix Conchillo Flaqué
028e38a86b Merge pull request #326 from pipecat-ai/aleix/rtvi-bot-ready-fixes
rtvi: send bot-ready when pipeline is ready and first participant joins
2024-07-25 11:39:14 -07:00
Aleix Conchillo Flaqué
8cf7649855 processors(rtvi): send bot-ready when pipeline AND first participant joins 2024-07-25 11:25:51 -07:00
Aleix Conchillo Flaqué
64f5119b08 transports(base): allow registering event handlers without decorators 2024-07-25 11:24:24 -07:00
Aleix Conchillo Flaqué
4d606aefb3 update CHANGELOG 2024-07-25 09:57:01 -07:00
Ankur Duggal
4bafdaa04d Deepgram Adjustments (#313) 2024-07-25 09:51:51 -07:00
Aleix Conchillo Flaqué
5afe1abf82 Merge pull request #323 from pipecat-ai/aleix/base-input-handle-incoming-interruptions
transports(inputs): handle start/stop interruption frames
2024-07-24 15:16:18 -07:00
Aleix Conchillo Flaqué
f066d50b98 transports(inputs): handle start/stop interruption frames 2024-07-24 15:15:09 -07:00
Aleix Conchillo Flaqué
91103e21cc github(publish_test): download tags and depth to 100 2024-07-24 14:49:09 -07:00
Aleix Conchillo Flaqué
f44dabcd65 Merge pull request #322 from pipecat-ai/aleix/base-input-transport-system-frames-fix
transports(inputs): don't queue incoming system frames
2024-07-24 14:44:18 -07:00
Aleix Conchillo Flaqué
0fd2fca231 frames: StartFrame is now a control frame 2024-07-24 14:42:59 -07:00
Aleix Conchillo Flaqué
5bb64098e7 transports(inputs): don't queue incoming system frames 2024-07-24 14:35:00 -07:00
Aleix Conchillo Flaqué
3fc85e75e0 Merge pull request #320 from pipecat-ai/aleix/req-updates-072324
update project requirements and dependencies
2024-07-23 17:45:18 -07:00
Aleix Conchillo Flaqué
3f61ea16b7 update project requirements and dependencies 2024-07-23 17:35:47 -07:00
Aleix Conchillo Flaqué
4b393092b5 Merge pull request #319 from pipecat-ai/aleix/daily-completion-callbacks-0.0.39-fix
transports(daily): fix completion callbacks handling
2024-07-23 15:27:26 -07:00
Aleix Conchillo Flaqué
b583f5162b transports(daily): fix completion callbacks handling 2024-07-23 15:25:59 -07:00
Aleix Conchillo Flaqué
060a22f395 github: only run publish_test manually
We need to run this manually to avoid test.pypi.org project size limits.
2024-07-23 14:19:24 -07:00
Aleix Conchillo Flaqué
d3e85355f1 Merge pull request #318 from pipecat-ai/aleix/prepare-0.0.38
update CHANGELOG for 0.0.38
2024-07-23 14:12:01 -07:00
Aleix Conchillo Flaqué
83e730b768 update CHANGELOG for 0.0.38 2024-07-23 14:10:10 -07:00
Aleix Conchillo Flaqué
5fcc96446c Merge pull request #317 from pipecat-ai/aleix/silero-repo-params
vad(silero): expose cache and repo parameters
2024-07-23 12:13:20 -07:00
Aleix Conchillo Flaqué
ad88925154 vad(silero): expose cache and repo parameters 2024-07-23 12:12:28 -07:00
Aleix Conchillo Flaqué
0a6ddbf15c Merge pull request #316 from pipecat-ai/aleix/metrics-improvements
metrics improvements
2024-07-23 11:23:57 -07:00
Aleix Conchillo Flaqué
08e0722d97 fix initial metrics format 2024-07-23 11:23:03 -07:00
Aleix Conchillo Flaqué
05d4fba551 processors(rtvi): send initial empty metrics 2024-07-23 11:22:41 -07:00
Aleix Conchillo Flaqué
f41c2b3c9f transports(daily): don't send empty metrics 2024-07-23 11:22:41 -07:00
Aleix Conchillo Flaqué
69f64899fe pipeline: add send_initial_empty_metrics flag 2024-07-23 11:22:41 -07:00
Aleix Conchillo Flaqué
33f0865430 Merge pull request #315 from pipecat-ai/aleix/stop-transcription-error
transports(daily): wait until start|stop_transcription are finished
2024-07-23 11:18:59 -07:00
Aleix Conchillo Flaqué
ad5b9202ab transports(daily): wait until start|stop_transcription are finished
Fixes #305
2024-07-22 22:59:30 -07:00
Aleix Conchillo Flaqué
1676693091 Merge pull request #314 from pipecat-ai/aleix/transcription-timestamps
services: transcription timestamp should use ISO8601 format
2024-07-22 22:43:01 -07:00
Aleix Conchillo Flaqué
0852b50b8f services: transcription timestamp should use ISO8601 format 2024-07-22 22:40:28 -07:00
Aleix Conchillo Flaqué
eb998aa502 Merge pull request #312 from pipecat-ai/aleix/rtvi-support
RTVI support
2024-07-22 16:58:40 -07:00
Aleix Conchillo Flaqué
6dab0e9de7 update CHANGELOG for 0.0.37 2024-07-22 16:00:30 -07:00
Aleix Conchillo Flaqué
95ff1d141c update CHANGELOG with RTVIProcessor 2024-07-22 16:00:26 -07:00
Aleix Conchillo Flaqué
87bc8a9da6 examples: remove RTVI since there are full demos elsewhere 2024-07-22 15:53:39 -07:00
Aleix Conchillo Flaqué
087fe9a537 services(cartesia): fix TTFB 2024-07-22 15:30:16 -07:00
Aleix Conchillo Flaqué
c1170260b5 processors(rtvi): use generic LLM and TTS names 2024-07-22 15:27:33 -07:00
Aleix Conchillo Flaqué
65cdf50774 processors(rtvi): fix task cleanup 2024-07-22 15:01:45 -07:00
Aleix Conchillo Flaqué
9233bb490c processors(rtvi): add support for "tts-text" messages 2024-07-22 11:40:17 -07:00
Aleix Conchillo Flaqué
43932220f7 processors(rtvi): use only user-transcription 2024-07-22 09:40:16 -07:00
Aleix Conchillo Flaqué
cea4d1894e processors(rtvi): change voice before LLM updates 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
80baa0358d processors(rtvi): label is now rtvi 2024-07-22 09:32:18 -07:00
Chad Bailey
5d73db53a0 initial pseudo function calling 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
302ea90dce processors(rtvi): messages now require an id 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
37b04ed283 processors(rtvi): send a type=response as command responses 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
be6995cfdf processors(rtvi): renamed realtime-ai to rtvi 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
dfbc11300c processors(realtime-ai): use label instead of tag 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
82d539d174 processors(realtime-ai): add support for interrupting the bot 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
6e00f31014 updated CHANGELOG with new frames and realtime-ai changes 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
a46ac3cc92 examples: moved 18-realtime-ai.py to examples/realtime-ai 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
6fbf98d8e2 processors(realtime-ai): llm-context now uses a data field 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
f094c42728 processors(realtime-ai): add transcription messages 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
13827e1282 processors(realtime-ai): send a successful response for every command 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
32170b47d9 processors(realtime-ai): add user-[start|stopped]-speaking messages 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
09c05354c2 processors(realtime-ai): fix voice initialization 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
b0b1475563 processors(realtime-ai): add support for making TTS speak 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
b85dd7283a processors(realtime-ai): add support for appending to the LLM context 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
846ae765e5 services(TTSService): fix sentence cleanup 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
4c629e538e processors(realtime-ai): add assistant before output transport
Cartesia can do word-to-word output instead of full sentences. This means that
for properly adding things into the context we need to add it before the
transport, otherwise some words might be lost.
2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
f6e22bb3b9 processors(realtime-ai): add silero vad to the transport 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
46a048d7f6 processors(realtime-ai): allow default setup to be None 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
bd9f4eea06 processors(realtime-ai): provide default values 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
0a672e61e2 processors(realtime-ai): update it to use groq by default 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
29a8530221 processors(realtime-ai): add support for updating config (model, voice...) 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
3e738642a7 processors(realtime-ai): add support for getting/updating LLM context 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
f551f55f03 examples: add new foundational/18-realtime-ai.py 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
9f012c8002 processors: add new RealtimeAIProcessor 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
0a69a9e5ef transport(daily): also accept TransportMessageFrame 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
194790183a processor: add support for setting a processor parent 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
2227721173 update CHANGELOG with StatelessTextTransformer fix (update) 2024-07-22 09:30:45 -07:00
Aleix Conchillo Flaqué
77a53da5f5 update CHANGELOG with StatelessTextTransformer fix 2024-07-22 09:28:38 -07:00
Aleix Conchillo Flaqué
ab63ff275d Merge pull request #310 from weedge/fix/StatelessTextTransformer
fix: push_frame use TextFrame
2024-07-22 09:25:27 -07:00
weedge
e5363f65f0 fix: push_frame use TextFrame
Signed-off-by: weedge <weege007@gmail.com>
2024-07-22 17:29:06 +08:00
Lewis Wolfgang
ffc157de65 Merge pull request #307 from pipecat-ai/lewis/increase_openai_keepalive_expiry
Allow openai http connections to remain open in the pool indefinitely.
2024-07-19 07:09:17 -04:00
Lewis Wolfgang
f9fdadb4c0 Allow openai http connections to remain open in the pool indefinitely.
Rather than expiring in 5 seconds.
2024-07-18 11:18:21 -04:00
Aleix Conchillo Flaqué
4efccb79f2 Merge pull request #306 from pipecat-ai/aleix/remove-llm-response-start-end-frame
remove LLMResponseStartFrame and LLMResponseEndFrame
2024-07-17 21:51:02 -07:00
Aleix Conchillo Flaqué
337968199a update CHANGELOG with CartesiaTTSService and TTSService updates 2024-07-17 20:58:10 -07:00
Aleix Conchillo Flaqué
37027f68cb remove LLMResponseStartFrame and LLMResponseEndFrame
This was added in the past to properly handle interruptions for the
LLMAssistantContextAggregator. But this is not necessary anymore since we can
handle interruptions by just processing the StartInterruptionFrame, so there's
no need for these extra frames.
2024-07-17 20:53:35 -07:00
Kwindla Hultman Kramer
d1b62c5495 Merge pull request #304 from pipecat-ai/khk/cartesia-continue
Cartesia streaming (WebSocket) and word-level timestamps support
2024-07-17 20:29:15 -07:00
Kwindla Hultman Kramer
355fe01cb7 fixed forgotten renames 2024-07-17 20:28:27 -07:00
Kwindla Hultman Kramer
9d050a16c7 committing an uncommitted file 2024-07-17 20:23:41 -07:00
Kwindla Hultman Kramer
fa53c67606 comments re fixes 2024-07-17 18:30:45 -07:00
Kwindla Hultman Kramer
5006376fe6 undo changes to 02-llm-say-one-thing.py 2024-07-17 15:18:47 -07:00
Kwindla Hultman Kramer
2204b8e205 cartesia streaming and context management via word-level timestamps 2024-07-17 15:17:00 -07:00
Kwindla Hultman Kramer
270007b17c wip - using cartesia word timestamps for context management 2024-07-17 14:13:52 -07:00
Kwindla Hultman Kramer
568eb2ef4c cartesia websockets and streaming 2024-07-17 14:13:52 -07:00
Kwindla Hultman Kramer
73ca9184a8 wip cartesia continuation (not working yet) 2024-07-17 14:13:52 -07:00
Aleix Conchillo Flaqué
5e8e11e16e pyproject: require python >= 3.10 2024-07-17 09:52:42 -07:00
Aleix Conchillo Flaqué
029bbc16f2 Merge pull request #286 from TomTom101/feat/regex_endofsentence
fix: No longer falsely detect a sentence end on "U.S.A", "3:00 a.m."
2024-07-17 09:49:21 -07:00
Aleix Conchillo Flaqué
9e3d87e4f6 Merge pull request #291 from adidoit/main
Fix error with readme example - SyntaxError: positional argument follows keyword argument
2024-07-15 13:10:17 -04:00
Aleix Conchillo Flaqué
f1410a1127 Merge pull request #297 from wtlow003/main
fix: minor typo
2024-07-15 13:08:23 -04:00
wtlow003
2b980d16c3 fix: minor typo 2024-07-12 18:27:57 +08:00
Adi Pradhan
b2b97aafb8 fix error with readme example - SyntaxError: positional argument follows keyword argument 2024-07-10 09:50:20 -04:00
TomTom101
da2082b025 chore: Combined combinable lookaheads 2024-07-06 11:11:40 +02:00
TomTom101
327ea9d547 chore: Make it a const 2024-07-06 11:08:51 +02:00
TomTom101
b23db4a202 chore: commented regex 2024-07-06 11:06:52 +02:00
TomTom101
d1a36004ab fix: No longer falsely detect a sentence end on "U.S.A", "3:00 a.m." and more 2024-07-06 11:01:32 +02:00
Jon Taylor
6071920c45 Merge pull request #284 from pipecat-ai/jpt/storybot-load-balance
Update storybot demo
2024-07-03 19:48:32 +01:00
Jon Taylor
5f539e1fba fixed teardown 2024-07-03 17:02:54 +01:00
Jon Taylor
8e1539c360 virtualized deployment and added room-based balancing 2024-07-03 16:48:14 +01:00
Aleix Conchillo Flaqué
065cfb2aca Merge pull request #280 from pipecat-ai/aleix/library-updates-070224
library updates 070224 and pipecat 0.0.36
2024-07-02 10:14:03 -07:00
Aleix Conchillo Flaqué
3147534e86 update CHANGELOG for 0.0.36 2024-07-02 10:13:26 -07:00
Aleix Conchillo Flaqué
be5603bf16 examples: fix 06a-image-sync.py 2024-07-02 10:11:50 -07:00
Aleix Conchillo Flaqué
b9b0bcdcbd services(azure): close the audio stream on exit 2024-07-02 10:11:35 -07:00
Aleix Conchillo Flaqué
5bcece56f3 services(cartesia): make sure we close the client on exit 2024-07-02 10:11:16 -07:00
Aleix Conchillo Flaqué
d67faef88c pyproject: multiple library updates 2024-07-02 09:05:37 -07:00
Aleix Conchillo Flaqué
8f6db5e905 Merge pull request #279 from pipecat-ai/aleix/gladia-stt-support
add Gladia STT support
2024-07-02 08:07:35 -07:00
Aleix Conchillo Flaqué
82e93a0560 use exclude_none=True when dumping BaseModels 2024-07-02 08:03:31 -07:00
Aleix Conchillo Flaqué
a9a82c083b services: add GladiaSTTService support 2024-07-02 08:03:29 -07:00
Aleix Conchillo Flaqué
974d9c33ed Merge pull request #278 from pipecat-ai/aleix/detect-user-idle
add support for detecting user idle
2024-07-02 08:01:27 -07:00
Jon Taylor
c1957ab694 Merge pull request #274 from pipecat-ai/jpt/deployment-examples
Example deployment pattern for fly.io
2024-07-02 10:17:13 +01:00
Jon Taylor
b20a10a4bc fixed double fly 2024-07-02 10:17:01 +01:00
Aleix Conchillo Flaqué
be14ce465d transports(daily): make sure we don't send data if client is closed 2024-07-01 18:26:13 -07:00
Aleix Conchillo Flaqué
d1ca0c5614 examples: added new 17-detect-user-idle.py 2024-07-01 18:17:43 -07:00
Aleix Conchillo Flaqué
535514f506 processors: added new UserIdleProcessor 2024-07-01 18:17:43 -07:00
Aleix Conchillo Flaqué
933b63cf13 processors: added new IdleFrameProcessor 2024-07-01 14:57:42 -07:00
Aleix Conchillo Flaqué
d7c3e380a5 added BotSpeakingFrame 2024-07-01 14:57:18 -07:00
Aleix Conchillo Flaqué
c5298f78cb add more missing keyword-only arguments 2024-07-01 12:34:53 -07:00
Jon Taylor
4f8f7b8d1d added on_call_state event to prevent idle vms 2024-07-01 19:21:16 +01:00
Aleix Conchillo Flaqué
d7d46919ac update macos-py3.10-requirements.txt 2024-07-01 11:00:59 -07:00
Aleix Conchillo Flaqué
e5d73d2e2e update linux-py3.10-requirements.txt 2024-07-01 10:58:49 -07:00
Aleix Conchillo Flaqué
b145e8ec90 update README with XTTS 2024-07-01 10:49:43 -07:00
Aleix Conchillo Flaqué
97ff4a1fb8 Merge pull request #275 from pipecat-ai/aleix/add-missing-keyword-separators
add missing keyword separators
2024-07-01 10:45:31 -07:00
Aleix Conchillo Flaqué
5018a552c1 services(xtts): no need for the WAV header 2024-07-01 10:44:32 -07:00
Aleix Conchillo Flaqué
7f9fd9ffce examples: added 07i-interruptible-xtts 2024-07-01 10:41:34 -07:00
Aleix Conchillo Flaqué
ddd0ca6a8f update CHANGELOG 2024-07-01 10:27:26 -07:00
Aleix Conchillo Flaqué
06f817c7e3 transport(websocket): don't send if serializer returns None 2024-07-01 10:27:26 -07:00
Aleix Conchillo Flaqué
df4c3e56c4 services: add missing * keyword separator 2024-07-01 10:27:26 -07:00
Aleix Conchillo Flaqué
9d5c2b9656 Merge pull request #276 from eddieoz/feature/xtts
Added service XTTS
2024-07-01 10:26:53 -07:00
eddieoz
7ce59c5e2e added service xtts 2024-07-01 20:17:19 +03:00
Aleix Conchillo Flaqué
1c9631fc78 Merge pull request #271 from pipecat-ai/aleix/silero-vad-version
vad(silero): allow specifying a Silero VAD version
2024-07-01 09:39:59 -07:00
Aleix Conchillo Flaqué
efbe7297f7 vad(silero): allow specifying a Silero VAD version 2024-07-01 09:38:43 -07:00
Aleix Conchillo Flaqué
1b45946a61 Merge pull request #270 from pipecat-ai/aleix/async-frame-processor
add new AsyncFrameProcessor and AsyncAIService
2024-07-01 09:37:51 -07:00
Aleix Conchillo Flaqué
cbf5a6362c add new AsyncFrameProcessor and AsyncAIService 2024-07-01 09:37:02 -07:00
Aleix Conchillo Flaqué
583b96c341 Merge pull request #269 from pipecat-ai/aleix/improve-error-handling
improve error handling and don't swallow exceptions
2024-07-01 09:36:00 -07:00
Aleix Conchillo Flaqué
fc0920504d improve error handling and don't swallow exceptions 2024-07-01 09:35:45 -07:00
Aleix Conchillo Flaqué
abd65a93b2 Merge pull request #268 from pipecat-ai/aleix/websocket-dont-send-if-closed
transports(websocket): don't send data if websocket closed
2024-07-01 09:33:45 -07:00
Aleix Conchillo Flaqué
c3244fdd7a transports(websocket): don't send data if websocket closed 2024-07-01 09:31:58 -07:00
Aleix Conchillo Flaqué
e8f58938b0 Merge pull request #267 from pipecat-ai/aleix/processing-metrics
add support for processing metrics
2024-07-01 09:31:05 -07:00
Jon Taylor
602b4f34b1 added example fly.toml 2024-07-01 16:50:53 +01:00
Jon Taylor
0399c84dfa added flyio deployment example 2024-07-01 16:46:38 +01:00
Aleix Conchillo Flaqué
fd5d879bf5 add support for processing metrics
Processing metrics indicate how much time a processor takes to generate all of
its output.
2024-06-28 14:26:57 -07:00
Aleix Conchillo Flaqué
8dff460307 Merge pull request #266 from pipecat-ai/aleix/silero-num-frames-fixes
vad: fix Silero VAD required number of frames
2024-06-28 11:25:55 -07:00
Aleix Conchillo Flaqué
cce1ddb183 vad: fix Silero VAD required number of frames 2024-06-28 10:45:48 -07:00
Aleix Conchillo Flaqué
8691d14289 Merge pull request #255 from Viking5274/main
Fix twilio error
2024-06-26 10:17:03 -07:00
daniil5701133
dd402da9e5 added handling streamSid after first wss connect
fix name
2024-06-26 18:56:30 +03:00
Aleix Conchillo Flaqué
2fd04248f1 examples(storytelling-chatbot): upgrade npm vulnerabilities 2024-06-25 22:04:55 -07:00
Aleix Conchillo Flaqué
0ac42006f8 Merge pull request #260 from pipecat-ai/aleix/more-interruption-fixes
more interruption fixes
2024-06-25 21:52:02 -07:00
Aleix Conchillo Flaqué
66e331248d update CHANGELOG for 0.0.34 2024-06-25 21:43:23 -07:00
Aleix Conchillo Flaqué
4be3e8c87d aggregators: revert using intermediate results 2024-06-25 21:33:17 -07:00
Aleix Conchillo Flaqué
dac033fe61 services(azure): allow transcriptions during interruptions
If the user interrupts we can't just discard transcriptions because the user is
actually interrupting and talking.
2024-06-25 21:33:06 -07:00
Aleix Conchillo Flaqué
d302cbb114 services(deepgram): allow transcriptions during interruptions
If the user interrupts we can't just discard transcriptions because the user is
actually interrupting and talking.
2024-06-25 21:32:21 -07:00
Aleix Conchillo Flaqué
e3b407db28 Merge pull request #259 from pipecat-ai/aleix/prepare-0.0.33
update CHANGELOG for 0.0.33
2024-06-25 12:05:07 -07:00
Aleix Conchillo Flaqué
4ef623f09e update CHANGELOG for 0.0.33 2024-06-25 11:53:07 -07:00
Aleix Conchillo Flaqué
253530a63d Merge pull request #258 from pipecat-ai/aleix/upgrade-cartesia-1.0.0
services(cartesia): upgrade to new cartesia 1.0.0
2024-06-25 11:52:04 -07:00
Aleix Conchillo Flaqué
4f38d989f5 services(cartesia): upgrade to new cartesia 1.0.0 2024-06-25 11:51:34 -07:00
Aleix Conchillo Flaqué
84074e90ee Merge pull request #257 from pipecat-ai/aleix/cancel-all-tasks-when-interrutpted
cancel all tasks when interrupted
2024-06-25 11:16:00 -07:00
Aleix Conchillo Flaqué
38aee7d8f2 services(azure): cancel tasks when interrupted and ignore incoming transcriptions 2024-06-25 11:15:26 -07:00
Aleix Conchillo Flaqué
64198313c6 services(deepgram): cancel tasks when interrupted and ignore incoming transcriptions 2024-06-25 11:15:07 -07:00
Aleix Conchillo Flaqué
d61b6c301c transports(base_input): create push tasks after pushing interruption 2024-06-25 11:15:07 -07:00
Aleix Conchillo Flaqué
83d1931266 Merge pull request #256 from pipecat-ai/aleix/tts-cleanup-when-interrupted
services(tts): strip before TTS and cleanup when interrupted
2024-06-25 11:14:32 -07:00
Aleix Conchillo Flaqué
c31f2ab285 services(tts): strip before TTS and cleanup when interrupted 2024-06-25 11:13:19 -07:00
Aleix Conchillo Flaqué
0ddc5721b4 Merge pull request #252 from pipecat-ai/aleix/daily-check-size-read-audio-frames
transports(daily): always check size of read audio frames
2024-06-25 09:45:05 -07:00
Aleix Conchillo Flaqué
98bd183bc4 pyproject: fix cartesia version and update requirements files 2024-06-25 09:43:54 -07:00
Aleix Conchillo Flaqué
aaa154524c Merge pull request #253 from pipecat-ai/aleix/llm-response-use-intermediate-results
aggregators: uses intermediate results for LLMAssistantResponseAggreg…
2024-06-24 19:21:14 -07:00
Aleix Conchillo Flaqué
beced68337 aggregators: uses intermediate results for LLMAssistantResponseAggregator 2024-06-24 17:33:45 -07:00
Aleix Conchillo Flaqué
94823ab952 transports(daily): always check size of read audio frames 2024-06-24 14:56:24 -07:00
Kwindla Hultman Kramer
0b6a19802f Merge pull request #250 from pipecat-ai/lewis/flush-tts-on-llm-response-end
Flush output from TTSService on LLMFullResponseEndFrame
2024-06-22 20:37:45 -04:00
Lewis Wolfgang
c4a2d2197c Flush output from TTSService on LLMFullResponseEndFrame
To cover cases when the LLM response does not end in punctuation.
2024-06-22 14:57:44 -04:00
Aleix Conchillo Flaqué
269d06aa15 Merge pull request #249 from pipecat-ai/aleix/pipecat-0.0.32
update CHANGELOG.md for 0.0.32
2024-06-22 09:21:21 -07:00
Aleix Conchillo Flaqué
dfef1f2c54 update CHANGELOG.md for 0.0.32 2024-06-22 09:19:22 -07:00
Aleix Conchillo Flaqué
b62beaba0b Merge pull request #248 from pipecat-ai/aleix/deepgramstt-url
services(deepgram): add url to DeepgramSTTService
2024-06-21 22:26:23 -07:00
Aleix Conchillo Flaqué
adf414e40f services(deepgram): add url to DeepgramSTTService 2024-06-21 16:52:28 -07:00
Aleix Conchillo Flaqué
dc64e57f63 Merge pull request #241 from pipecat-ai/aleix/transports-async
transports: fully use asyncio in all read/write operations
2024-06-21 16:00:08 -07:00
Aleix Conchillo Flaqué
d3e410b2ac transports: fully use asyncio in all read/write operations 2024-06-21 15:55:15 -07:00
Aleix Conchillo Flaqué
c544b2474b update linux-py3.10-requirements with fastapi and new daily-python 2024-06-21 15:44:01 -07:00
Aleix Conchillo Flaqué
18243de358 add fastapi and update macos-py3.10-requirements.txt 2024-06-21 13:16:47 -07:00
Aleix Conchillo Flaqué
6625895d1f update macos-py3.10-requirements.txt 2024-06-21 13:13:02 -07:00
Aleix Conchillo Flaqué
f9ecce739e Merge pull request #247 from pipecat-ai/aleix/twilio-updates
some twilio updates
2024-06-21 10:14:40 -07:00
Aleix Conchillo Flaqué
0075dd8386 update linux/macos-py3.10-requirements.txt 2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué
eef1cde816 updated CHANGELOG.md with fastapi and twilio updates 2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué
8d867c30c6 transports(websocket): verify websockets module 2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué
42c668b7ae examples(twilio-chatbot): update instructions and renames 2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué
b62227b4ae serializers(twilio): formatting and allow str | bytes | None 2024-06-21 09:47:17 -07:00
Aleix Conchillo Flaqué
25ef0cb87b serializers: allow str | bytes | None 2024-06-21 09:42:43 -07:00
Aleix Conchillo Flaqué
e195941aa5 Merge pull request #246 from pipecat-ai/aleix/daily-dialout-answered
transports(daily): added dialout_answered event
2024-06-20 18:37:24 -07:00
Aleix Conchillo Flaqué
e09eef1dd7 Merge pull request #243 from Viking5274/main
Add twilio_websocket_service with example
2024-06-20 14:09:48 -07:00
Aleix Conchillo Flaqué
7c13663a4e transports(daily): added dialout_answered event 2024-06-20 13:01:25 -07:00
daniil5701133
5753869e5e add twilio-chatbot example with README.md info how to start app
created twilio_websocket_service.py, TwilioFrameSerializer.py

moved pcm_16000_to_ulaw_8000 and ulaw_8000_to_pcm_16000 to src/pipecat/utils/audio.py
fixed callback on disconnect
2024-06-20 23:00:01 +03:00
chadbailey59
ba878a19f4 fixed "Dr." interruption (#245) 2024-06-19 20:53:04 -05:00
Aleix Conchillo Flaqué
55a9de78cd Merge pull request #239 from pipecat-ai/aleix/azure-stt
azure stt support
2024-06-14 14:07:07 +08:00
Aleix Conchillo Flaqué
ff51fc9091 updated CHANGELOG and README 2024-06-13 17:03:49 -07:00
Aleix Conchillo Flaqué
a4f857ee34 examples: use new AzureSTTService in 07f-interruptible-azure 2024-06-13 17:03:49 -07:00
Aleix Conchillo Flaqué
3250d74bef services(azure): new AzureSTTService 2024-06-13 17:03:49 -07:00
Aleix Conchillo Flaqué
c086160239 examples: cleanup some 07 interruptible examples 2024-06-13 16:36:10 -07:00
Aleix Conchillo Flaqué
6cdccaff53 Merge pull request #238 from pipecat-ai/aleix/pipecat-0.0.31
pipecat 0.0.31
2024-06-14 06:31:41 +08:00
Aleix Conchillo Flaqué
a9ab8de25d update CHANGELOG for 0.0.31 2024-06-13 15:31:03 -07:00
Aleix Conchillo Flaqué
2a29cb18a5 transports(base_output): chunk audio into 20ms instead of 10ms 2024-06-13 15:30:41 -07:00
Aleix Conchillo Flaqué
4193a4f415 Merge pull request #237 from pipecat-ai/aleix/pipecat-0.0.30
update CHANGELOG for 0.0.30
2024-06-14 05:28:14 +08:00
Aleix Conchillo Flaqué
0226ec450a update CHANGELOG for 0.0.30 2024-06-13 14:27:37 -07:00
Aleix Conchillo Flaqué
020b8ebb35 Merge pull request #236 from pipecat-ai/aleix/report-only-initial-ttfb
report only initial ttfb
2024-06-14 05:24:52 +08:00
Aleix Conchillo Flaqué
1170b30c1b aggregator(user_response): also handle small VADParams.stop_secs 2024-06-13 13:30:31 -07:00
Aleix Conchillo Flaqué
0004d4a906 vad: reduce smoothing factor and increase confidence 2024-06-13 13:30:11 -07:00
Aleix Conchillo Flaqué
cb27e86266 metrics: allow sending only initial TTFB metrics 2024-06-13 13:30:00 -07:00
Aleix Conchillo Flaqué
77a3b2ea5c Merge pull request #235 from pipecat-ai/aleix/openpipe-refactoring
openpipe refactoring
2024-06-14 01:28:50 +08:00
Aleix Conchillo Flaqué
099e65f3b6 report processor name in error logs 2024-06-13 10:20:45 -07:00
Aleix Conchillo Flaqué
befb8db120 update pyproject and requirements 2024-06-13 10:20:45 -07:00
Aleix Conchillo Flaqué
9992d826b1 examples: renamed 06b-listen... to 07h-inte... 2024-06-13 10:18:20 -07:00
Aleix Conchillo Flaqué
18604e1a39 re-add removed CHANGELOG lines 2024-06-13 10:18:20 -07:00
Aleix Conchillo Flaqué
312c569182 services(openpipe): refactored so it's based on BaseOpenAILLMService 2024-06-13 09:30:50 -07:00
Aleix Conchillo Flaqué
b43e0ed130 Merge pull request #233 from KwalAI/openpipe-integration
OpenPipe Integration
2024-06-13 22:41:57 +08:00
Aleix Conchillo Flaqué
289debea34 Merge pull request #234 from pipecat-ai/aleix/fix-daily-room-properties-exp
transports(helpers): fix DailyRoomProperties.exp
2024-06-13 22:38:41 +08:00
Aleix Conchillo Flaqué
ccd6af7016 transports(helpers): fix DailyRoomProperties.exp 2024-06-12 23:15:22 -07:00
Ankur Duggal
effc69e4e4 formatting 2024-06-12 15:01:19 -07:00
Ankur Duggal
c7a0d0db64 OpenPipe Integration 2024-06-12 14:23:56 -07:00
Aleix Conchillo Flaqué
50d69a1ca4 Merge pull request #231 from pipecat-ai/aleix/websocket-deserializer-none
serializer: allow deserialize() to return None
2024-06-13 04:36:03 +08:00
Aleix Conchillo Flaqué
8a6b8fe70a Merge pull request #232 from pipecat-ai/aleix/pyproject-deepgram
pyproject: add deepgram-sdk
2024-06-13 03:53:08 +08:00
Aleix Conchillo Flaqué
c4e53aea71 update macos-py3.10-requirements with deepgram 2024-06-12 12:52:20 -07:00
Aleix Conchillo Flaqué
ad5125e93f pyproject: add deepgram-sdk 2024-06-12 12:50:18 -07:00
Aleix Conchillo Flaqué
8d92cbac93 Merge pull request #230 from pipecat-ai/aleix/processor-names
processor names
2024-06-13 03:16:07 +08:00
Aleix Conchillo Flaqué
0225443ec8 transports(base): always send MetricsFrame 2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué
71e1d0a334 pipeline: send initial TTFB metrics from PipelineTask 2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué
83f69e02fd allow specifying frame processor names 2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué
e1b2da1ff0 serializer: allow deserialize() to return None 2024-06-12 12:11:36 -07:00
Kwindla Hultman Kramer
5eb1b90a4b Merge pull request #229 from pipecat-ai/khk-deepgram-url-configurable
Deepgram TTS service improvements
2024-06-12 14:52:04 -04:00
Kwindla Hultman Kramer
9c4ee74b91 bot to test for demo 2024-06-12 10:41:49 -07:00
Aleix Conchillo Flaqué
f65f566829 re-add transports/services/helpers/__init__.py 2024-06-12 10:37:28 -07:00
Aleix Conchillo Flaqué
c8ad3123b7 Merge pull request #207 from pipecat-ai/dialin-example
New example: Dialin bot (call your Pipecat via phone)
2024-06-13 01:36:00 +08:00
Jon Taylor
8cefce28cf added example fly toml 2024-06-12 10:35:03 -07:00
Jon Taylor
a834d26885 removed https from daily boy 2024-06-12 10:35:03 -07:00
Jon Taylor
810e3cd551 added fly.example.toml due to gitignore 2024-06-12 10:35:03 -07:00
Jon Taylor
f258fa96cd added env to dockerignore 2024-06-12 10:35:03 -07:00
Jon Taylor
757ec61f14 added deepgram to readme 2024-06-12 10:35:03 -07:00
Jon Taylor
2c933f43d8 linting errors and removed unused sip url 2024-06-12 10:35:03 -07:00
Jon Taylor
cc5bfa8af8 removed helps and fixed linting 2024-06-12 10:35:03 -07:00
Jon Taylor
de9f3e55f1 new example: dialin 2024-06-12 10:35:03 -07:00
Aleix Conchillo Flaqué
ed0c986218 Merge pull request #228 from pipecat-ai/aleix/websocket-fixes
websocket fixes
2024-06-13 01:30:21 +08:00
Aleix Conchillo Flaqué
72c27215b6 transports(websocket): use push_audio_frame() 2024-06-12 10:29:39 -07:00
Aleix Conchillo Flaqué
c23b14f768 examples: use DeepgramSTTService in websocket-server 2024-06-12 10:29:22 -07:00
Aleix Conchillo Flaqué
81282f9c4d services(deepgram): keep connection alive 2024-06-12 10:29:22 -07:00
Aleix Conchillo Flaqué
2b324f6f81 Merge pull request #227 from pipecat-ai/aleix/daily-room-properties-extra
transports(daily): DailyRoomProperties now allow extra unknown parame…
2024-06-13 00:25:07 +08:00
Kwindla Hultman Kramer
049f110344 PipelineTask should not exit when Deepgram TTS returns a Bad Request "unutterable" 2024-06-12 09:24:09 -07:00
Kwindla Hultman Kramer
448a0307a8 rebasing 2024-06-12 07:54:18 -07:00
Aleix Conchillo Flaqué
7390e42f5c transports(daily): DailyRoomProperties now allow extra unknown parameters 2024-06-11 22:31:32 -07:00
Aleix Conchillo Flaqué
ee880d229f Merge pull request #223 from pipecat-ai/aleix/fix-lower-vad-stop-secs
processors: fix LLMResponseAggregator with lower VAD values
2024-06-12 13:30:34 +08:00
Aleix Conchillo Flaqué
9cd07d81f8 processors: fix LLMResponseAggregator with lower VAD values 2024-06-11 22:30:06 -07:00
Aleix Conchillo Flaqué
b453d089c3 Merge pull request #226 from pipecat-ai/aleix/chunk-audio-output
transport: chunk longer audio frames
2024-06-12 13:28:28 +08:00
Aleix Conchillo Flaqué
7410fe1d1e transport: chunk longer audio frames 2024-06-11 17:50:51 -07:00
Aleix Conchillo Flaqué
6323a77431 Merge pull request #224 from pipecat-ai/aleix/deepgram-stt-simple
deepgram stt simple
2024-06-12 08:48:19 +08:00
Aleix Conchillo Flaqué
0aedaa8553 services(deepgram): abstract StartFrame/EndFrame/CancelFrame 2024-06-10 21:18:42 -07:00
Aleix Conchillo Flaqué
6554479d39 transports: don't queue system frames 2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué
ce2ebd3198 examples: updated 07c-interruptible-deepgram to use DeepgramSTTService 2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué
13ea1efc96 examples: add new 13b-deepgram-transcription 2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué
ef380321cf services: added new DeepgramSTTService 2024-06-10 21:00:01 -07:00
Kwindla Hultman Kramer
294b037730 configurable deepgram base url 2024-06-08 09:38:48 -04:00
Aleix Conchillo Flaqué
7603996612 Merge pull request #220 from pipecat-ai/aleix/pipecat-0.0.29
update CHANGELOG for 0.0.29
2024-06-08 04:43:52 +08:00
Aleix Conchillo Flaqué
3048d2b0b1 update CHANGELOG for 0.0.29 2024-06-07 13:43:00 -07:00
Aleix Conchillo Flaqué
0bb47a09d2 Merge pull request #218 from pipecat-ai/aleix/send-inital-metrics-mapping
send initial metrics mapping
2024-06-08 04:41:59 +08:00
Aleix Conchillo Flaqué
1afe6901d9 processors: add processors_with_metrics() and can_generate_metrics() 2024-06-07 13:38:21 -07:00
Aleix Conchillo Flaqué
3e019fb512 services(openai): remove unused _chat_completions 2024-06-07 13:18:11 -07:00
Aleix Conchillo Flaqué
e069aa9608 updated CHANGELOG with BasePipeline 2024-06-07 13:18:09 -07:00
Aleix Conchillo Flaqué
0b32e42d25 transports(daily): fix extra super().process_frame() 2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué
8d18be5069 services(anthropic): fix metrics 2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué
e715d99d0c pipeline: send initial ttfb metrics mapping 2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué
dc28590247 moved ParallelTask to pipecat.pipeline.parallel_task 2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué
139f158ea1 Merge pull request #219 from pipecat-ai/aleix/switch-voices
switch voices and languages
2024-06-08 04:13:25 +08:00
Aleix Conchillo Flaqué
4b2a18837f services(whisper): add text logging 2024-06-07 13:12:51 -07:00
Aleix Conchillo Flaqué
b4340d0185 services(whisper): increase no speech probability to 0.4 2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué
90d11398e6 examples: add 15a-switch-languages 2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué
bf8c73b25b examples: add 15-switch-voices 2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué
21cd21de1b processors(filters): add FunctionFilter 2024-06-07 13:12:18 -07:00
Aleix Conchillo Flaqué
c25f6e56e7 Merge pull request #217 from pipecat-ai/khk-tts-timings
Added TTFB timings for all TTS services
2024-06-07 05:42:52 +08:00
Aleix Conchillo Flaqué
a1f1d1995c transports: allow sending metrics 2024-06-06 14:35:34 -07:00
Aleix Conchillo Flaqué
390582d7f3 services: use start/stop_ttfb_metrics to report TTFB metrics 2024-06-06 14:00:10 -07:00
Aleix Conchillo Flaqué
e765a29ca2 processors: implement base process_frame(). all subclasses should call it 2024-06-06 10:54:21 -07:00
Kwindla Hultman Kramer
cf5c244487 Merge branch 'main' into khk-tts-timings 2024-06-06 13:05:42 -04:00
Kwindla Hultman Kramer
a5eb30a93d changelog 2024-06-06 11:49:05 -04:00
Kwindla Hultman Kramer
ac7bc35944 azure tts ttfb 2024-06-06 11:45:48 -04:00
Kwindla Hultman Kramer
ddfd721f6e openai tts ttfb 2024-06-06 11:32:47 -04:00
Kwindla Hultman Kramer
aee3916cd1 cartesia async fixed 2024-06-06 11:24:26 -04:00
Kwindla Hultman Kramer
3eff1e559b pipecat async working, but maybe needs a threaded implementation 2024-06-06 11:11:06 -04:00
Kwindla Hultman Kramer
1a542c91fa temp commit, working on playht 2024-06-06 10:48:22 -04:00
Aleix Conchillo Flaqué
cd60a84f8a Merge pull request #215 from pipecat-ai/aleix/silero-vad-memory-fix
vad(silero): fix memory issue
2024-06-06 05:50:47 +08:00
Aleix Conchillo Flaqué
3dd4bac6e6 vad(silero): fix memory issue 2024-06-05 14:50:28 -07:00
Kwindla Hultman Kramer
06ff9cfede added timing logs for cartesia, deepgram, elevenlabs 2024-06-05 16:12:10 -04:00
Aleix Conchillo Flaqué
2d1ed9a304 Merge pull request #214 from pipecat-ai/aleix/pipecat-0.0.27
transports(daily): added participants() and participant_counts()
2024-06-06 03:15:34 +08:00
Aleix Conchillo Flaqué
50b51c05f6 transports(daily): added participants() and participant_counts() 2024-06-05 12:14:00 -07:00
Aleix Conchillo Flaqué
5ce4b8dd5b update CHANGELOG with OpenAITTSService 2024-06-05 11:44:24 -07:00
Aleix Conchillo Flaqué
2f4467b5a5 Merge pull request #213 from pipecat-ai/aleix/pipecat-0.0.26
update CHANGELOG for 0.0.26
2024-06-06 01:10:01 +08:00
Aleix Conchillo Flaqué
e91ab54a69 update CHANGELOG for 0.0.26 2024-06-05 10:07:45 -07:00
Aleix Conchillo Flaqué
6a33432c82 Merge pull request #212 from pipecat-ai/aleix/make-pinlesscallupdate-public
transports(daily): move pinlessCallUpdate to public api
2024-06-05 23:14:14 +08:00
Aleix Conchillo Flaqué
135654a080 transports(daily): move pinlessCallUpdate to public api 2024-06-05 08:08:56 -07:00
Aleix Conchillo Flaqué
7b708a2bee Merge pull request #211 from pipecat-ai/aleix/base-transport-async
various fixes and improvements
2024-06-05 22:57:35 +08:00
Aleix Conchillo Flaqué
b515c28417 services(cartesia): allow output_format and model_id 2024-06-04 19:24:33 -07:00
Aleix Conchillo Flaqué
854ffb0323 update CHANGELOG for DailyRESTHelper 2024-06-04 15:45:17 -07:00
Aleix Conchillo Flaqué
891b7b22ea transports: push EndFrame/CancelFrame before stopping push task 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
c8d37a7227 pipeline(runner): add support for SIGTERM 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
489060881d update macos-py3.10-requirements 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
d56a4cce1b update CHANGELOG with latest changes 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
7eb9dfde38 pyproject: include langchain-community and langchain-openai 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
571e10f83e services(anthropic): fix interruptions with anthropic 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
af202d4fe5 pipeline(task): introduce has_finished() 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
4057fbbcfd transports(tk): fix pyaudio output stream cleanup 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
5cdb8a79a1 examples: use camera_out_is_live for live video 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
a674b43243 transport: remove redundant camera thread and switch audio pull for push 2024-06-04 15:43:54 -07:00
Jon Taylor
ac41f13b7c Merge pull request #205 from pipecat-ai/daily_rest_helpers
Created REST helpers for Daily covering commonly used methods for running / deployment
2024-06-04 22:26:39 +02:00
Jon Taylor
003b9887b1 made sip and sipuri optional and None 2024-06-04 19:03:58 +02:00
Jon Taylor
ba45c2ab5b addressed review (urllib import and linting) 2024-06-04 18:39:35 +02:00
Aleix Conchillo Flaqué
9d36a48a80 Merge pull request #208 from pipecat-ai/aleix/cartesia-voice-load-startup
services(cartesia): load voices on startup
2024-06-04 22:54:25 +08:00
Aleix Conchillo Flaqué
20a525635e Merge pull request #201 from TomTom101/TomTom101/openai_tts
Added OpenAI TTS (#196)
2024-06-04 22:53:56 +08:00
Aleix Conchillo Flaqué
659eceea95 services(cartesia): load voices on startup 2024-06-03 14:08:04 -07:00
TomTom101
d462c03d00 chore: Review comments 2024-06-03 20:13:15 +02:00
Jon Taylor
6591e07eb4 removed hardcoded 'https' from API url 2024-06-03 19:32:14 +02:00
Aleix Conchillo Flaqué
fe71825954 Merge pull request #206 from pipecat-ai/aleix/fix-deepgram-tts
services(deepgram): fixed DeepgramTTSService
2024-06-04 00:28:53 +08:00
Aleix Conchillo Flaqué
43516f84fe services(deepgram): fixed DeepgramTTSService 2024-06-03 07:53:46 -07:00
Jon Taylor
0849edb00b added Daily REST helpers file for common methods used in Pipecat bots 2024-06-03 16:38:13 +02:00
Aleix Conchillo Flaqué
dd3b4083eb Merge pull request #204 from TomTom101/TomTom101/langchain
fix: Fixed imports, support new PipelineParams
2024-06-03 03:16:30 +08:00
TomTom101
89673a4040 test(langchain): Use new PipelineParams in test 2024-06-02 20:19:55 +02:00
TomTom101
410dbd3dfc fix: Fixed imports, support new PipelineParams 2024-06-02 20:16:11 +02:00
TomTom101
7085b1ea3f doc(openai): Added hint re the 24kHz sample rate 2024-06-01 20:35:46 +02:00
TomTom101
8683cae719 feat: OpenAITTS 2024-06-01 10:13:28 +02:00
Aleix Conchillo Flaqué
0197efa524 Merge pull request #200 from pipecat-ai/aleix/changelog-0.0.25
update CHANGELOG.md for version 0.0.25
2024-06-01 07:48:42 +08:00
Aleix Conchillo Flaqué
16e76caa33 update CHANGELOG.md for version 0.0.25 2024-05-31 16:48:03 -07:00
Aleix Conchillo Flaqué
1f5240694d Merge pull request #199 from pipecat-ai/aleix/langchain-changelog
move LangchainProcessor to processors/frameworks and update CHANGELOG
2024-06-01 07:46:51 +08:00
Aleix Conchillo Flaqué
f087151db7 move LangchainProcessor to processors/frameworks and update CHANGELOG 2024-05-31 16:45:39 -07:00
Aleix Conchillo Flaqué
0b691ff597 Merge pull request #198 from pipecat-ai/aleix/websocket-transport
websocket transport support
2024-06-01 04:40:39 +08:00
TomTom101
ae049961b7 wip: untested 2024-05-31 22:30:52 +02:00
Aleix Conchillo Flaqué
0d6eee705f Merge pull request #190 from TomTom101/TomTom101/langchain
Langchain service
2024-06-01 04:21:12 +08:00
Aleix Conchillo Flaqué
58d20ec9dc transport(websocket-server): add on_client_disconnected 2024-05-31 12:52:43 -07:00
Aleix Conchillo Flaqué
38befe1dc1 examples(websocket): rename server.py to bot.py 2024-05-31 12:09:54 -07:00
Aleix Conchillo Flaqué
2f335100a5 remove storage folder 2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué
3fef818843 examples(websocket-server): use VAD analyzer from transport 2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué
428c8af77e transports(websocket): base class from BaseInputTransport 2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué
54fccd2e25 pipeline: cleanup processors one by one 2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué
66c6a5dc0f transports(websocket): base class from BaseOutputTransport 2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué
92561ae19d some event loop parameter updates 2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué
b85e93410b transports(daily): fix event handlers callback 2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué
593993ba97 transports(base_input): remove unnecessary task 2024-05-31 11:37:41 -07:00
Aleix Conchillo Flaqué
7b8b606278 update CHANGELOG and create websocket-server instructions 2024-05-31 11:37:19 -07:00
Aleix Conchillo Flaqué
7116ad0607 examples: fix websocket-client audio playback 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
c507044277 examples: use gpt-4o model by default 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
5f45a9d90f examples: websocket-server updates 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
e31e87aabd transport(websocket): update audio_frame_size 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
2957416d90 serializers(protobuf): support id and name fields 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
b9b761b67a added sample_rate and num_channels to protobuf AudioRawFrame 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
a7539e9317 transports: simplify and fix async and nested decorators 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
75575c0c68 use get_event_loop() and move event handlers to BaseTransport 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
77b3e08214 examples: add and update websocket examples 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
956b783c1a transports: added new WebsocketServerTransport 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
e90c080470 serializers: added BaseSerializer 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
37aabaa03a frames: generate protobuf pb2 file for pipecat package 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
3e289a7bef pyproject: add protobuf dependency 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
6dd5e3fdf5 dev-requirements: add grpcio-tools 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
e60df3c7c0 Merge pull request #195 from pipecat-ai/aleix/function-calling-move-to-llmservice
function calling move to LLMService
2024-06-01 02:36:29 +08:00
Aleix Conchillo Flaqué
42f772beed examples: some function calling examples cleanup 2024-05-31 11:36:04 -07:00
Aleix Conchillo Flaqué
3655c4a0fc services: move function calling registration to LLMService 2024-05-31 11:36:04 -07:00
Aleix Conchillo Flaqué
012dbffd94 update CHANGELOG.md for function calling 2024-05-31 11:36:03 -07:00
TomTom101
4b39efeee3 fix(langchain): try/catch langchain import in service; Only langchain is installed with the [langchain] extra (#190) 2024-05-31 10:19:27 +02:00
Kwindla Hultman Kramer
19caf750fd Merge pull request #194 from pipecat-ai/khk-cartesia-changelog
Added cartesia line to CHANGELOG.md
2024-05-30 14:18:41 -07:00
Kwindla Hultman Kramer
296611714f added cartesia line to CHANGELOG.md 2024-05-30 10:41:00 -07:00
chadbailey59
4c3d19cc8b Function calling (#175)
* added function calling code back

* removed old llm_context file

* added integration testing for openai

* added function calling example

* added function callbacks

* added function start callback

* fixup

* fixup

* added different return type support for function calling

* intake example working

* added frame loggers

* cleanup

* fixup

* Update openai.py

* removed function call frame types

* fixup

* re-added example

* renumbered wake phrase

* fixup for autopep8

* remove unused imports
2024-05-30 12:25:39 -05:00
Aleix Conchillo Flaqué
a3ba07c7a3 Merge pull request #193 from pipecat-ai/aleix/fix-camera-out-enabled-cpu
transport(output): fix high CPU usage with camera_out_enabled and no …
2024-05-31 01:25:06 +08:00
Kwindla Hultman Kramer
a1579808b2 Merge pull request #189 from pipecat-ai/khk-cartesia-etc
Cartesia TTS
2024-05-30 10:24:45 -07:00
Aleix Conchillo Flaqué
aecb9f5816 transport(output): fix high CPU usage with camera_out_enabled and no images 2024-05-30 10:18:43 -07:00
Aleix Conchillo Flaqué
a5d42a526c Merge pull request #191 from pipecat-ai/aleix/fix-silero-vad
vad: fix silero vad frame processor
2024-05-30 23:25:52 +08:00
Aleix Conchillo Flaqué
a9472f8116 vad: fix silero vad frame processor 2024-05-30 07:50:58 -07:00
TomTom101
b19243ab75 fix: corrected hint to install Langchain libs 2024-05-30 10:53:42 +02:00
TomTom101
2bf094b950 test(langchain): Rewrite to unittest, make it meaningful 2024-05-30 10:43:33 +02:00
Kwindla Hultman Kramer
d5f106ae19 pr fixes 2024-05-29 23:41:35 -07:00
Kwindla Hultman Kramer
920745345a cartesia tts support 2024-05-29 23:35:35 -07:00
TomTom101
143033d7db fix: install langchain-community with the langchain extra 2024-05-30 03:15:14 +02:00
TomTom101
335990c145 wip: hint to install langchain_community 2024-05-30 03:15:14 +02:00
TomTom101
6d24e836b0 wip: Example using LC message history 2024-05-30 03:15:14 +02:00
TomTom101
278a2fed56 wip: First stab at langchain support
Is this a service or processor?
How to deal with conversation history? LC has sophisticated means of this, but might get in the way of `LLMResponseAggregator`
2024-05-30 03:15:14 +02:00
Aleix Conchillo Flaqué
c444004eec Merge pull request #186 from pipecat-ai/aleix/update-changelog-0.0.24
update CHANGELOG.md 0.0.24
2024-05-29 23:23:06 +08:00
Aleix Conchillo Flaqué
72cf7896d7 update CHANGELOG.md 0.0.24 2024-05-29 08:22:33 -07:00
Aleix Conchillo Flaqué
31af5f8177 Merge pull request #182 from pipecat-ai/aleix/expo-se-dialin-ready
transports(daily): expose dialin-ready and handle timeouts
2024-05-29 23:05:47 +08:00
Aleix Conchillo Flaqué
6a68d9a57e pyproject: update daily-python to 0.9.0 2024-05-28 18:30:43 -07:00
Aleix Conchillo Flaqué
39f41ab25e transports(daily): expose dialin-ready and handle timeouts 2024-05-28 18:00:09 -07:00
Aleix Conchillo Flaqué
624cc1e987 Merge pull request #185 from pipecat-ai/aleix/add-start-recording
transport(daily): add start_recording, stop_recording and stop_dialout
2024-05-29 08:24:59 +08:00
Aleix Conchillo Flaqué
08a15e5cdd transports(daily): expose on_app_message 2024-05-28 17:23:34 -07:00
Aleix Conchillo Flaqué
4cd4787e4d transports(daily): added on_call_state_updated 2024-05-28 17:23:34 -07:00
Aleix Conchillo Flaqué
65afee2808 transport(daily): add start_recording, stop_recording and stop_dialout 2024-05-28 17:16:39 -07:00
Aleix Conchillo Flaqué
00ece864ec Merge pull request #184 from pipecat-ai/aleix/introduce-pipelineparams
introduce PipelineParams
2024-05-29 08:14:58 +08:00
Aleix Conchillo Flaqué
6d6d9bea5a introduce PipelineParams 2024-05-28 17:14:14 -07:00
Kwindla Hultman Kramer
7c213f8533 Merge pull request #183 from pipecat-ai/khk-deepgram-fix
moving Deepgram TTS base_url from beta to prod
2024-05-28 17:04:03 -07:00
Kwindla Hultman Kramer
3685c19b2d moving Deepgram TTS base_url from beta to prod 2024-05-28 15:59:26 -07:00
Aleix Conchillo Flaqué
650a2b4da4 Merge pull request #174 from pipecat-ai/fix-azure-llm-service
services(azure): fix AzureLLMService
2024-05-25 00:27:51 +08:00
Aleix Conchillo Flaqué
afea6f38f6 examples: no need to define tts twice 2024-05-24 09:23:00 -07:00
Aleix Conchillo Flaqué
c45d428551 services(google): make api_key argument mandatory 2024-05-24 09:23:00 -07:00
Aleix Conchillo Flaqué
4e594aa9b0 services: BaseOpenAILLMService.create_client() now returns the client 2024-05-24 09:04:15 -07:00
Aleix Conchillo Flaqué
32f91c5f31 services(azure): fix AzureLLMService
Fixes #160
2024-05-23 16:51:04 -07:00
Aleix Conchillo Flaqué
a32ece897a Merge pull request #179 from pipecat-ai/aleix/aiohttp-response-text
fix aiohttp response text
2024-05-24 07:42:05 +08:00
Aleix Conchillo Flaqué
88f6436aaa fix aiohttp response text 2024-05-23 15:51:00 -07:00
Aleix Conchillo Flaqué
fac43cea06 Merge pull request #178 from pipecat-ai/aleix/daily-python-0.8.0-deps
update linux/macos requirements
2024-05-24 05:50:10 +08:00
Aleix Conchillo Flaqué
a9e6aeed54 update linux/macos requirements 2024-05-23 14:49:34 -07:00
Aleix Conchillo Flaqué
fa9f49f5bb Merge pull request #177 from pipecat-ai/aleix/dialin-ready-missing-sipuri
transports(daily): fix dialin-ready event handling
2024-05-24 05:39:31 +08:00
Aleix Conchillo Flaqué
2a6183aba5 transports(daily): fix dialin-ready event handling 2024-05-23 14:38:37 -07:00
Aleix Conchillo Flaqué
b1a622971b Merge pull request #176 from pipecat-ai/aleix/handle-dialin-ready
transport(daily): add support for dial-in use cases
2024-05-24 04:58:10 +08:00
Aleix Conchillo Flaqué
5b72faccb4 update CHANGELOG.md for release 0.0.22 2024-05-23 13:57:28 -07:00
Aleix Conchillo Flaqué
c8732544c7 transport(daily): add support for dial-in use cases 2024-05-23 13:56:50 -07:00
Aleix Conchillo Flaqué
d4219b16b8 Merge pull request #170 from pipecat-ai/add-daily-transport-dialout-support
transport(daily): add dialout support
2024-05-24 04:19:51 +08:00
Aleix Conchillo Flaqué
0c33432f64 transport(daily): update CHANGELOG.md with dialout/dialin updates 2024-05-23 13:14:34 -07:00
Aleix Conchillo Flaqué
95bd58cced pyproject: depend on daily-python 0.8.0 2024-05-23 13:10:48 -07:00
Aleix Conchillo Flaqué
8d7d1a7e24 transport(daily): add dialin-ready event 2024-05-23 07:12:31 -07:00
Aleix Conchillo Flaqué
3768cb2f2c transport(daily): add dialout support 2024-05-22 22:44:01 -07:00
Aleix Conchillo Flaqué
d4b2741608 Merge pull request #169 from pipecat-ai/update-changelog-0.0.21
update CHANGELOG.md for 0.0.21
2024-05-23 12:42:41 +08:00
Aleix Conchillo Flaqué
aef2152dcc update CHANGELOG.md for 0.0.21 2024-05-22 21:40:29 -07:00
Aleix Conchillo Flaqué
d0b0221b97 Merge pull request #167 from pipecat-ai/khk-bump-anthropic
add new response frame types and vision support for anthropic
2024-05-23 12:16:55 +08:00
Kwindla Hultman Kramer
b4758cd989 update CHANGELOG.md 2024-05-22 21:14:11 -07:00
Kwindla Hultman Kramer
681250f114 add new response frame types and vision support for anthropic 2024-05-22 21:12:30 -07:00
Aleix Conchillo Flaqué
fd13d3c50e Merge pull request #168 from pipecat-ai/transcription-logging
transports(daily): add transcription logging
2024-05-23 11:42:51 +08:00
Aleix Conchillo Flaqué
674b8bb0cd transports(daily): add transcription logging 2024-05-22 20:41:34 -07:00
Aleix Conchillo Flaqué
5d9a962146 Merge pull request #166 from pipecat-ai/fix-llm-response-wake-check
fix llm response wake check
2024-05-23 11:35:11 +08:00
Aleix Conchillo Flaqué
e130aada72 filters(WakeCheckFilter): increase timeout to 3 2024-05-22 19:41:14 -07:00
Aleix Conchillo Flaqué
76709a9a39 enclose text between brackets when logging 2024-05-22 19:05:18 -07:00
Aleix Conchillo Flaqué
acd2d55b84 examples(14): remove commented code 2024-05-22 19:05:18 -07:00
Aleix Conchillo Flaqué
fcec0eb812 transports(base): log when user is speaking 2024-05-22 19:05:18 -07:00
Aleix Conchillo Flaqué
e9965347b5 processors(WakeCheckFilter): log what frame we are pushing 2024-05-22 19:05:18 -07:00
Aleix Conchillo Flaqué
5a83f75e0d processors: fix user response processors 2024-05-22 19:05:18 -07:00
Aleix Conchillo Flaqué
91c706a201 Merge pull request #165 from pipecat-ai/clear-audio-output-buffer-when-interrupted
transport(base): clear audio output buffer if interrupted
2024-05-23 07:31:33 +08:00
Aleix Conchillo Flaqué
34384881bc transport(base): clear audio output buffer if interrupted 2024-05-22 16:30:43 -07:00
Aleix Conchillo Flaqué
71ba28753e Merge pull request #157 from pipecat-ai/khk-improved-wake-word
Improved wake word filter
2024-05-23 06:47:59 +08:00
Aleix Conchillo Flaqué
32d2f0db66 update CHANGELOG.md with filters updates 2024-05-22 15:46:13 -07:00
Aleix Conchillo Flaqué
e1169a4e82 processors(WakeCheckFilter): push error 2024-05-22 15:44:44 -07:00
Aleix Conchillo Flaqué
0e5711e62d examples: update 10-wake-work.py to use WakeCheckFilter 2024-05-22 15:44:44 -07:00
Aleix Conchillo Flaqué
0ddfa3de5b move WakeCheckFilter to processors/filters 2024-05-22 15:44:43 -07:00
Kwindla Hultman Kramer
661aa79b7c fix user_id str field name in TranscriptionFrame 2024-05-22 15:44:43 -07:00
Kwindla Hultman Kramer
2c32cc2f27 improved wake word filter 2024-05-22 15:44:43 -07:00
Aleix Conchillo Flaqué
d7bb0bc5cb Merge pull request #164 from pipecat-ai/readd-vad-exp-smoothing
vad: re-add volume exponential smoothing
2024-05-23 06:44:27 +08:00
Aleix Conchillo Flaqué
d5644c3ab9 vad: re-add volume exponential smoothing 2024-05-22 15:26:32 -07:00
Aleix Conchillo Flaqué
09ab8e3efd Merge pull request #163 from pipecat-ai/update-0.0.20-deps
update requirements files
2024-05-23 05:40:12 +08:00
Aleix Conchillo Flaqué
2f683529ec update requirements files 2024-05-22 14:39:26 -07:00
Aleix Conchillo Flaqué
6ac012a82b Merge pull request #158 from pipecat-ai/use-pyloudnorm-loudness
interruptions: introduce pyloudnorm to compute loudness
2024-05-23 05:24:38 +08:00
Aleix Conchillo Flaqué
075194cb54 update CHANGELOG for 0.0.20 2024-05-22 14:21:13 -07:00
Aleix Conchillo Flaqué
269f070051 audio: no need for compute_rms 2024-05-22 14:09:24 -07:00
Aleix Conchillo Flaqué
3342c9d7c2 services(stt): use calculate_audio_volume 2024-05-22 13:05:20 -07:00
Aleix Conchillo Flaqué
b468b2f926 audio: clamp normalized volume 2024-05-22 13:04:09 -07:00
Aleix Conchillo Flaqué
af1c7d0023 interruptions: introduce pyloudnorm to compute loudness
https://github.com/csteinmetz1/pyloudnorm
2024-05-22 11:52:07 -07:00
Aleix Conchillo Flaqué
34670eef79 Merge pull request #162 from pipecat-ai/reset-before-pushing
processors: reset aggregator before pushing
2024-05-23 02:51:55 +08:00
Aleix Conchillo Flaqué
979739c1b7 processors: reset aggregator before pushing 2024-05-22 11:26:08 -07:00
Aleix Conchillo Flaqué
83ed6870b9 Merge pull request #161 from pipecat-ai/only-interrupt-assistant
processors: only interrupt assistant
2024-05-23 02:02:43 +08:00
Aleix Conchillo Flaqué
57a568986a processors: only interrupt assistant
We were pushing interruption frames in the audio task. This was causing the
LLMUserResponseAggregator to push the accumulated text and then causing the LLM
to respond.
2024-05-22 10:15:35 -07:00
Aleix Conchillo Flaqué
e828e26b5b Merge pull request #159 from pipecat-ai/create-pool-executor
transports: run threads in their own ThreadPoolExecutor
2024-05-22 15:49:03 +08:00
Aleix Conchillo Flaqué
825738440e transports: run threads in their own ThreadPoolExecutor 2024-05-21 18:52:27 -07:00
Aleix Conchillo Flaqué
147bd1a075 Merge pull request #156 from pipecat-ai/pipecat-0.0.19
update CHANGELOG.md for 0.0.19
2024-05-21 12:36:48 +08:00
Aleix Conchillo Flaqué
209e97f372 update CHANGELOG.md for 0.0.19 2024-05-20 21:33:15 -07:00
Aleix Conchillo Flaqué
47f8627432 Merge pull request #155 from pipecat-ai/llm-accumlate-full-response
aggregators: accumulate full responses and take interruptions into ac…
2024-05-21 11:34:39 +08:00
Aleix Conchillo Flaqué
cc6713837a github: publish test to pypi again. simply always use PRs 2024-05-20 12:19:39 -07:00
Aleix Conchillo Flaqué
728fe0ad88 github: don't publish to test pypi twice 2024-05-20 12:15:54 -07:00
Aleix Conchillo Flaqué
dbba45349f github: don't run publish_test on main branch 2024-05-20 12:14:00 -07:00
Aleix Conchillo Flaqué
40ccf46b4b aggregators: accumulate full responses and take interruptions into account 2024-05-20 11:40:57 -07:00
Aleix Conchillo Flaqué
077bb9f20a Merge pull request #153 from pipecat-ai/expose-llm-messages
aggregators: expose LLM messages
2024-05-21 02:40:26 +08:00
Aleix Conchillo Flaqué
e4c990c677 aggregators: expose LLM messages 2024-05-20 10:51:37 -07:00
Aleix Conchillo Flaqué
1c8b9d813a examples: minor updates to storytelling-chatbot instructions 2024-05-20 10:31:33 -07:00
Aleix Conchillo Flaqué
83812f2671 transports(daily): implement DailyOutputTransport.send_message 2024-05-20 10:30:59 -07:00
Aleix Conchillo Flaqué
4053c33899 update CHANGELOG for 0.0.17 2024-05-19 19:27:20 -07:00
Aleix Conchillo Flaqué
03978b63bc update linux-py3.10-requirements.txt 2024-05-19 19:27:04 -07:00
Aleix Conchillo Flaqué
bf036be6b8 Merge pull request #150 from pipecat-ai/khk-gemini
Initial commit of Google Gemini LLM service.
2024-05-20 10:24:31 +08:00
Kwindla Hultman Kramer
7ffb10d7f5 add to CHANGELOG.md 2024-05-19 12:44:45 -07:00
Kwindla Hultman Kramer
66377954cb fix up openai vision and gemini implementation 2024-05-19 12:33:57 -07:00
Kwindla Hultman Kramer
e507686cef oops, fix openai.py 2024-05-19 11:13:39 -07:00
Kwindla Hultman Kramer
e5ddaf14f4 add google and deepgram to README.md 2024-05-19 11:09:30 -07:00
Kwindla Hultman Kramer
cf597a2f6b add back in debug log line in openai.py 2024-05-19 11:08:38 -07:00
Kwindla Hultman Kramer
d83f0aabca generate macos-py3.10-requirements.txt with Python 3.10 2024-05-19 10:53:50 -07:00
Kwindla Hultman Kramer
b337e984b3 Initial commit of Google Gemini LLM service.
Gemini text input works. We translate from OpenAILLMContext format
on the fly in the GoogleLLMService implementation. This commit also
implements image input (vision) in both the GoogleLLMService and in
the OpenAILLMService. Image input is a hack and needs to be revisited.
OpenAI expects images to be uploaded as base64-encoded JPEGs. Google
does not require the base64 encoding. Other than for images, we use
the OpenAI format as our standard, but base64-encoding the images
and then decoding them in the GoogleLLMService feels wasteful.
2024-05-19 10:35:20 -07:00
Aleix Conchillo Flaqué
6366ee072e Merge pull request #144 from pipecat-ai/initial-interruptions
initial basic interruptions support
2024-05-20 01:33:15 +08:00
Aleix Conchillo Flaqué
c3bfcbd562 aggregators: clear accumulated responses if interruption happens 2024-05-19 10:21:45 -07:00
Aleix Conchillo Flaqué
c0d5054798 examples: some prompt tweaking 2024-05-19 09:41:36 -07:00
Aleix Conchillo Flaqué
810dc30d3d examples: fix examples to use LLMFullResponseEndFrame 2024-05-19 09:39:34 -07:00
Aleix Conchillo Flaqué
36dd4933e9 example: add assistant responses to simple chatbot 2024-05-18 10:01:46 -07:00
Aleix Conchillo Flaqué
435fffe1b0 add LLMFullResponseStartFrame/LLMFullResponseEndFrame 2024-05-18 09:49:38 -07:00
Aleix Conchillo Flaqué
2b8f1c4cda services(openai): send LLMResponseStartFrame for each completion 2024-05-17 17:47:33 -07:00
Aleix Conchillo Flaqué
0e8c7a9b28 transports(output): create a downstream push frame task 2024-05-17 17:47:24 -07:00
Aleix Conchillo Flaqué
3e13678f23 vad: use exponential smoothed volume to improve speech detection 2024-05-17 17:13:31 -07:00
Aleix Conchillo Flaqué
455ec4f1fd services(tts): always send received TextFrame downstream 2024-05-17 17:11:11 -07:00
Aleix Conchillo Flaqué
8dc81042c3 examples: use DailyTranscriptionSettings in translation-chatbot 2024-05-17 15:37:29 -07:00
Aleix Conchillo Flaqué
c77db79447 examples: pipelines readability and add LLM assistants after transport 2024-05-17 14:52:51 -07:00
Aleix Conchillo Flaqué
de65028061 vad: reduce default confidence back to 0.5 2024-05-17 14:39:40 -07:00
Aleix Conchillo Flaqué
d66a795413 examples: use SileroVADAnalyzer instead of SileroVAD 2024-05-17 14:18:55 -07:00
Aleix Conchillo Flaqué
34762bf604 transports: allow updating allow_interruptions when receiving StartFrame 2024-05-17 14:15:37 -07:00
Aleix Conchillo Flaqué
57121338b1 pipeline(task): cleanup processors only if we need to 2024-05-17 13:53:33 -07:00
Aleix Conchillo Flaqué
a5d246ec0c vad: use exponential smoothing to avoid sudden changes 2024-05-17 13:53:33 -07:00
Aleix Conchillo Flaqué
f2cefeeedc utils: move exp_smoothing to utils module 2024-05-17 13:52:18 -07:00
Aleix Conchillo Flaqué
537e72a05f vad: introduce VADParams so you can tweak things 2024-05-17 13:52:18 -07:00
Aleix Conchillo Flaqué
efa5a061d7 silero: simplify int16 -> float32 conversion 2024-05-17 13:51:06 -07:00
Aleix Conchillo Flaqué
0bef44c2ff introduce StartInterruptionFrame and StopInterruptionFrame 2024-05-17 13:51:06 -07:00
Aleix Conchillo Flaqué
f62fe059b1 fix issues with Ctrl-C tasks cancellation 2024-05-17 13:51:04 -07:00
Aleix Conchillo Flaqué
f432e2b17e transports: allow adding a vad analyzer to BaseInputTransport 2024-05-17 13:50:48 -07:00
Aleix Conchillo Flaqué
8c877d7d8e examples: update 07-interruptible 2024-05-17 13:50:48 -07:00
Aleix Conchillo Flaqué
dc9377fb92 add missing queue task_done() 2024-05-17 13:50:48 -07:00
Aleix Conchillo Flaqué
7384b63b1d initial interruptions support 2024-05-17 13:50:45 -07:00
Aleix Conchillo Flaqué
ba6ecf541f update CHANGELOG.md for 0.0.16 2024-05-16 18:15:07 -07:00
Aleix Conchillo Flaqué
94e5709d58 Merge pull request #149 from pipecat-ai/transports-push-task
transport: create input transports push frame task
2024-05-17 09:14:35 +08:00
Aleix Conchillo Flaqué
add8d3cbaf transport: create input transports push frame task 2024-05-16 16:54:39 -07:00
Aleix Conchillo Flaqué
1a42188bce Merge pull request #146 from pipecat-ai/daily-dont-send-tracks-if-not-enabled
transports(daily): don't send camera/audio tracks if not enabled
2024-05-17 01:24:39 +08:00
Aleix Conchillo Flaqué
0da427e127 transports(daily): don't send camera/audio tracks if not enabled 2024-05-16 08:16:39 -07:00
Aleix Conchillo Flaqué
9447b32f3e transports(daily): on_app_message doesn't need to be event handler 2024-05-15 17:06:47 -07:00
Aleix Conchillo Flaqué
af10adb7fe some minor event loop updates 2024-05-15 17:00:43 -07:00
Aleix Conchillo Flaqué
129acf886f transports(daily): hot fix for receiving transport messages 2024-05-15 17:00:04 -07:00
Aleix Conchillo Flaqué
9af3e1efac update CHANGELOG.md for 0.0.14 2024-05-15 15:59:38 -07:00
Aleix Conchillo Flaqué
9e22a8b4ff transports(daily): add receiving transport messages 2024-05-15 15:59:08 -07:00
Aleix Conchillo Flaqué
28da747f19 transports(daily): fix on_participant_left event 2024-05-15 15:40:31 -07:00
Aleix Conchillo Flaqué
3d6783ddb0 transports: resize output image if it doesn't match camera 2024-05-15 15:36:20 -07:00
Aleix Conchillo Flaqué
349fc526d7 transports(daily): avoid locking if no participant has joined yet 2024-05-15 15:24:58 -07:00
Aleix Conchillo Flaqué
acf6dc0a30 transports: more start and stop fixes 2024-05-15 15:23:03 -07:00
Aleix Conchillo Flaqué
3563e66ff6 transports(daily): add on_participant_left event 2024-05-15 15:20:37 -07:00
Aleix Conchillo Flaqué
8965ff27ec examples: use DEBUG in 09-mirror.py 2024-05-14 19:25:31 -07:00
Aleix Conchillo Flaqué
86feb1e104 services: fix DailyTransport stop/cleanup ordering 2024-05-14 19:24:55 -07:00
Aleix Conchillo Flaqué
f6257a86d3 examples: re-enable audio in 09-mirror.py 2024-05-14 19:23:35 -07:00
Aleix Conchillo Flaqué
bd04ea8aca examples: simplify 09-mirror.py 2024-05-14 19:07:19 -07:00
Aleix Conchillo Flaqué
754c1c6775 services: fixed DailyTransport output camera and audio 2024-05-14 19:07:19 -07:00
Aleix Conchillo Flaqué
0b01eb5a11 services: pass **kwargs to TTSService 2024-05-14 18:46:03 -07:00
Aleix Conchillo Flaqué
6247b9df39 services: fix STTService and WhisperSTTService 2024-05-14 18:45:40 -07:00
Aleix Conchillo Flaqué
bd5344c892 services: MoondreamService model_id argument is now model 2024-05-14 18:34:10 -07:00
Aleix Conchillo Flaqué
e4fe54cd7f vad: rename VADAnalyzer arguments 2024-05-14 18:33:17 -07:00
Aleix Conchillo Flaqué
97f9e9b042 examples: update simple-chatbot prompt 2024-05-14 15:30:31 -07:00
Aleix Conchillo Flaqué
3668eb1606 update CHANGELOG for 0.0.12 2024-05-14 14:52:08 -07:00
Aleix Conchillo Flaqué
e23addcc02 examples: update simple-chatbot with Spanish 2024-05-14 14:51:44 -07:00
Aleix Conchillo Flaqué
5147f4086e transports(daily): add DailyTranscriptionSettings to update settings easier 2024-05-14 14:49:30 -07:00
Aleix Conchillo Flaqué
fb3c2de83f Merge pull request #141 from pipecat-ai/add-changelog
add CHANGELOG.md
2024-05-15 04:47:45 +08:00
Aleix Conchillo Flaqué
107817317c add CHANGELOG.md 2024-05-14 13:45:01 -07:00
Aleix Conchillo Flaqué
663ff3417c examples: add missing requirements 2024-05-14 08:03:51 -07:00
Aleix Conchillo Flaqué
2b19d6bbac examples: remove commented out silero from storytelling 2024-05-14 00:57:21 -07:00
Aleix Conchillo Flaqué
7c41246e55 examples: fix storytelling example 2024-05-14 00:32:37 -07:00
Aleix Conchillo Flaqué
11aa9dc803 pipeline: allow stopping tasks with StopTaskFrame 2024-05-14 00:30:32 -07:00
Aleix Conchillo Flaqué
922cdefee5 services: run_* now return async generators 2024-05-14 00:30:07 -07:00
Aleix Conchillo Flaqué
e018d5b47a transports(daily): always allow capturing transcriptions 2024-05-14 00:29:02 -07:00
Aleix Conchillo Flaqué
20c679988c transports: allow base transports to be reused 2024-05-14 00:28:43 -07:00
Aleix Conchillo Flaqué
a344101cff README.md: s/Twitter/X/ 2024-05-13 18:24:06 -07:00
Aleix Conchillo Flaqué
2cefc40a77 README.md: use http urls for images 2024-05-13 18:20:57 -07:00
Aleix Conchillo Flaqué
68f0da26b6 examples: more translation-chatbot fixes 2024-05-13 17:57:11 -07:00
Aleix Conchillo Flaqué
9aea8e951c aggregators/sentence: ignore interim transcriptions 2024-05-13 17:56:19 -07:00
Aleix Conchillo Flaqué
12ff6d08fe examples: fix translation-chatbot 2024-05-13 16:22:11 -07:00
Aleix Conchillo Flaqué
1b21867a6f transports: add support for sending transport messages 2024-05-13 16:22:11 -07:00
Aleix Conchillo Flaqué
d28d0fa218 processors: add FrameProcessor.push_error 2024-05-13 16:12:35 -07:00
Aleix Conchillo Flaqué
01381f6dcd frames: add TransportMessageFrame 2024-05-13 16:12:30 -07:00
Aleix Conchillo Flaqué
c111fff0f7 services: update azure services 2024-05-13 16:12:26 -07:00
Aleix Conchillo Flaqué
50677e6085 Merge pull request #138 from pipecat-ai/moondream-chatbot-fixes
examples: fix moondream-chatbot
2024-05-14 06:29:13 +08:00
Aleix Conchillo Flaqué
22cd1ac5f2 examples: fix moondream-chatbot 2024-05-13 15:28:11 -07:00
Kwindla Hultman Kramer
fdfcfd1d5e Merge pull request #137 from rahulunair/intel_gpu
(feat): adding intel gpus support
2024-05-13 14:52:34 -07:00
Aleix Conchillo Flaqué
b6385be6c6 Merge pull request #136 from pipecat-ai/simple-chatbot-fixes
examples: fix simple-chatbot
2024-05-14 05:41:52 +08:00
rahulunair
6be88fa81b (feat): adding intel gpus support 2024-05-13 21:21:05 +00:00
Aleix Conchillo Flaqué
ed31c7924e examples: fix simple-chatbot 2024-05-13 13:19:11 -07:00
Jon Taylor
4898084645 Update LICENSE 2024-05-13 20:49:51 +01:00
chadbailey59
6be0751a52 Delete CNAME 2024-05-13 14:42:46 -05:00
Aleix Conchillo Flaqué
7ce1206ed4 Create CNAME 2024-05-13 12:05:08 -07:00
Jon Taylor
1b5130694a Update README.md 2024-05-13 19:36:39 +01:00
Jon Taylor
7c6199e93e Merge pull request #135 from pipecat-ai/jpt/devrel-edits-2
Jpt/devrel edits 2
2024-05-13 18:19:33 +01:00
Jon Taylor
3be742479d removed space 2024-05-13 18:17:00 +01:00
Aleix Conchillo Flaqué
d380b02a44 README: improve code reading 2024-05-13 10:12:19 -07:00
Aleix Conchillo Flaqué
5600fc49f1 README: fix code indentation 2024-05-13 10:08:09 -07:00
Jon Taylor
5f0d8b8d9f removed docs badge 2024-05-13 17:42:01 +01:00
Jon Taylor
8204e5c2d4 removed images 2024-05-13 17:41:03 +01:00
Jon Taylor
29b98c0326 removed images from examples readme 2024-05-13 17:40:07 +01:00
Jon Taylor
3502ef4745 Merge pull request #134 from pipecat-ai/jpt/devrel-edits
Added example apps to repo
2024-05-13 17:37:31 +01:00
Jon Taylor
0d28e84c59 addressed nitpicks 2024-05-13 17:37:01 +01:00
Jon Taylor
062fbf4ce3 fixed header for VAD 2024-05-13 17:20:50 +01:00
Jon Taylor
af8471b370 changed daily_url to daily_room 2024-05-13 17:20:10 +01:00
Jon Taylor
f756027333 updated text for simple example 2024-05-13 17:17:41 +01:00
Jon Taylor
65c4c0b21f fixed typo in readme 2024-05-13 17:14:17 +01:00
Jon Taylor
f1c02f8554 added examples back 2024-05-13 17:09:46 +01:00
Jon Taylor
27ba50cbbf updated README with sample code 2024-05-13 14:51:10 +01:00
Aleix Conchillo Flaqué
b254525d3c go back to using @dataclass since they can be inspected 2024-05-12 22:35:43 -07:00
Aleix Conchillo Flaqué
6c06fb8169 README: update pypi badge 2024-05-12 19:28:00 -07:00
Aleix Conchillo Flaqué
721cd11d62 Merge pull request #133 from pipecat-ai/aleix/readme
rebased jpt/readme branch
2024-05-13 10:26:45 +08:00
Aleix Conchillo Flaqué
bfbcb9d531 fix autopep8 linting 2024-05-12 19:25:17 -07:00
Aleix Conchillo Flaqué
724e78c5be renamed image.png to pipecat.png 2024-05-12 17:44:10 -07:00
Jon Taylor
d3c3d78855 added discord badge 2024-05-12 17:41:36 -07:00
Jon Taylor
8fa9fdcd5a Reworked readme to have more pipes and cats 2024-05-12 17:41:30 -07:00
Aleix Conchillo Flaqué
7856d20a38 Merge pull request #132 from pipecat-ai/pypi-repo-change
change pypi repo to pipecat-ai
2024-05-13 03:14:40 +08:00
Aleix Conchillo Flaqué
6d10027f2d change pypi repo to pipecat-ai 2024-05-12 12:08:43 -07:00
Aleix Conchillo Flaqué
bea31215dc Merge pull request #129 from daily-co/wip-proposal
pipecat proposal
2024-05-13 01:13:18 +08:00
Aleix Conchillo Flaqué
083480ca1e update macos-py3.10-requirements.txt 2024-05-12 10:10:35 -07:00
Aleix Conchillo Flaqué
65846330cf update linux-py3.10-requirements.txt 2024-05-12 10:09:04 -07:00
Aleix Conchillo Flaqué
29f48266f7 README: install dev-requirements.txt first 2024-05-12 10:07:54 -07:00
Aleix Conchillo Flaqué
bfd583211c examples: use LocalAudioTransport 2024-05-12 10:07:54 -07:00
Aleix Conchillo Flaqué
b026915d19 initial commit for new pipecat architecture 2024-05-12 10:07:25 -07:00
Aleix Conchillo Flaqué
4a0836dc8f Merge pull request #130 from daily-co/dependabot-05-06-24
dependabot: update packages 05-06-24
2024-05-07 08:14:38 +08:00
Aleix Conchillo Flaqué
2729c6bf5b dependabot: update packages 05-06-24 2024-05-06 15:33:33 -07:00
Aleix Conchillo Flaqué
712a889121 Merge pull request #128 from daily-co/pillow-security-fixes
pyproject: pillow security fixes
2024-04-23 01:51:49 +08:00
Aleix Conchillo Flaqué
2f341e4fb0 pyproject: pillow security fixes 2024-04-22 10:28:42 -07:00
Kwindla Hultman Kramer
24198ecf45 Merge pull request #126 from daily-co/jptaylor-patch-3
Update README.md
2024-04-12 23:10:30 -07:00
Jon Taylor
7e4fefe958 Update README.md 2024-04-12 22:45:30 -07:00
Jon Taylor
e9af39b85f Merge pull request #125 from daily-co/jptaylor-patch-2
Update README.md
2024-04-12 22:44:14 -07:00
Jon Taylor
38aa3cebb4 Update README.md 2024-04-12 22:42:11 -07:00
Jon Taylor
72724365a0 Merge pull request #124 from daily-co/jptaylor-patch-1
Update README.md
2024-04-12 22:40:29 -07:00
Jon Taylor
5368462e41 Update README.md 2024-04-12 22:28:40 -07:00
Jon Taylor
1b2b29dd18 Merge pull request #123 from daily-co/jpt/pypi-badge
added pypi badge
2024-04-12 07:33:26 -07:00
Kwindla Hultman Kramer
d2b2b6f619 Merge pull request #122 from daily-co/kwindla-patch-1
Update README.md
2024-04-11 21:34:37 -07:00
Jon Taylor
54bcb52129 added pypi badge 2024-04-11 21:34:27 -07:00
Kwindla Hultman Kramer
3dc7438bc8 Update README.md 2024-04-11 21:05:27 -07:00
Aleix Conchillo Flaqué
523bb9f2a2 Merge pull request #120 from daily-co/small-fireworks-fixes
minor fireworks updates
2024-04-12 06:35:57 +08:00
Aleix Conchillo Flaqué
0c2b3f8b65 minor fireworks updates 2024-04-11 15:34:23 -07:00
chadbailey59
0b7578056d added fireworks adapter (#118) 2024-04-11 17:15:02 -05:00
Aleix Conchillo Flaqué
f1b6b9f8e5 Merge pull request #119 from daily-co/use-new-fal-client-library
services: FalImageGenService now uses fal-client library
2024-04-12 05:59:58 +08:00
Aleix Conchillo Flaqué
cbc51babbe services: use asyncio to_thread in moondreamservice 2024-04-11 14:22:44 -07:00
Aleix Conchillo Flaqué
b0faafc184 update macos-py3.10 requirements 2024-04-11 14:16:19 -07:00
Aleix Conchillo Flaqué
103092dbb2 update linux-py3.10 requirements 2024-04-11 14:13:59 -07:00
Aleix Conchillo Flaqué
7b49c9ade3 services: FalImageGenService now uses fal-client library 2024-04-11 14:09:01 -07:00
Aleix Conchillo Flaqué
1e83a405c0 Merge pull request #117 from daily-co/llm-use-aggregator-pass-through-fix
aggregators: fix LLMUserResponseAggregator pass-through
2024-04-12 04:24:56 +08:00
Aleix Conchillo Flaqué
7336866a1c examples: rely on new daily default transcription settings 2024-04-11 11:22:58 -07:00
Aleix Conchillo Flaqué
0f23282e30 transport: enable interim results in daily transport 2024-04-11 11:22:05 -07:00
Aleix Conchillo Flaqué
eb3bf117b1 use InterimTranscriptionFrame in LLMUserResponseAggregator 2024-04-11 11:21:42 -07:00
Aleix Conchillo Flaqué
e288aa047b examples: use LLMUserResponseAggregator with VAD 2024-04-11 08:10:56 -07:00
Aleix Conchillo Flaqué
9a9df35d7b aggregators: allow TranscriptionFrame after an end frame threshold 2024-04-10 23:35:31 -07:00
Aleix Conchillo Flaqué
af8663e95d aggregators: fix LLMUserResponseAggregator pass-through 2024-04-10 21:46:16 -07:00
Aleix Conchillo Flaqué
db05a9b29b Merge pull request #116 from daily-co/moondream-use-cpu
moondream: allow passing use_cpu
2024-04-11 09:08:11 +08:00
Aleix Conchillo Flaqué
130e418800 moondream: allow passing use_cpu 2024-04-10 17:43:44 -07:00
Aleix Conchillo Flaqué
1a0a66e503 Merge pull request #114 from daily-co/jpt/fal-updates
Updated Fal.ai service to take a params model and allow for model string param
2024-04-11 00:47:33 +08:00
Aleix Conchillo Flaqué
e22babbae2 examples: update with new FalImageGenService parameters 2024-04-10 09:45:08 -07:00
Aleix Conchillo Flaqué
bfe2e0f36e services: don't use image_size in ImageGenService 2024-04-10 09:44:42 -07:00
Aleix Conchillo Flaqué
26d401e5de Merge pull request #115 from daily-co/add-vision-and-moondream-service
add vision and moondream service
2024-04-11 00:22:26 +08:00
Aleix Conchillo Flaqué
3c20f9153d added VisionImageFrame and VisionImageFrameAggregator 2024-04-10 09:19:34 -07:00
Aleix Conchillo Flaqué
2f9899af5a update macos-py3.10 requirements 2024-04-09 22:39:04 -07:00
Aleix Conchillo Flaqué
5ef5cf30f4 update linux-py3.10 requirements 2024-04-09 22:36:35 -07:00
Aleix Conchillo Flaqué
34a6c5691b examples: added 12-describe-video 2024-04-09 22:36:35 -07:00
Aleix Conchillo Flaqué
18bf09c704 services: added MoondreamService 2024-04-09 22:36:35 -07:00
Aleix Conchillo Flaqué
84cfa7cc95 services: added VisionService 2024-04-09 22:36:35 -07:00
Aleix Conchillo Flaqué
a5eba0106b transport: allow requesting a user frame 2024-04-09 22:36:35 -07:00
Aleix Conchillo Flaqué
b117a185e3 frames: added UserImageRequestFrame 2024-04-09 22:14:54 -07:00
Aleix Conchillo Flaqué
0219230827 Merge pull request #113 from daily-co/aleix/only-subcribe-to-participant
only subscribe to participant
2024-04-10 10:47:29 +08:00
Aleix Conchillo Flaqué
9fcbb36997 examples: add 14a-local-render-remote-participant 2024-04-09 19:46:10 -07:00
Aleix Conchillo Flaqué
0bf15fd6eb daily: only subscribe to participant video source 2024-04-09 19:46:10 -07:00
Aleix Conchillo Flaqué
989252bb52 daily: always check camera/mic/speaker enabled 2024-04-09 19:46:10 -07:00
Jon Taylor
7b44a79a5b added params and model attribute to fal service 2024-04-09 17:43:27 -07:00
Aleix Conchillo Flaqué
4bd29b0080 Merge pull request #110 from daily-co/compatible-versions
pyproject: use compatible version
2024-04-10 00:41:22 +08:00
Aleix Conchillo Flaqué
ebb76fdae9 update macos-py3.10 requirements 2024-04-09 08:52:37 -07:00
Aleix Conchillo Flaqué
5d52def0fe update linux-py3.10 requirements 2024-04-09 08:49:41 -07:00
Aleix Conchillo Flaqué
9ada56d0b0 pyproject: use compatible version 2024-04-09 08:41:54 -07:00
Aleix Conchillo Flaqué
8d73cdb2ee Merge pull request #111 from daily-co/user-transcription-aggregator
pipeline: add UserTranscriptionAggregator
2024-04-09 23:34:52 +08:00
Aleix Conchillo Flaqué
4f04b10202 Merge pull request #112 from daily-co/user-image-frame
user image frames and other updates
2024-04-09 23:34:32 +08:00
Aleix Conchillo Flaqué
97b923e37e llm user and assistant aggregator renames 2024-04-09 08:31:48 -07:00
Aleix Conchillo Flaqué
57aabea0a3 examples: added 14-render-remote-participant 2024-04-09 08:01:14 -07:00
Aleix Conchillo Flaqué
319b8e7816 updated ImageFrame and added URLImageFrame and UserImageFrame 2024-04-08 23:23:33 -07:00
Aleix Conchillo Flaqué
96950ca6df daily: on_first_other_participant_joined now gets the participant 2024-04-08 23:23:33 -07:00
Aleix Conchillo Flaqué
d7b2e67c35 pipeline: add UserTranscriptionAggregator 2024-04-08 17:15:14 -07:00
Aleix Conchillo Flaqué
53930b47a5 github: just some rewording 2024-04-06 18:03:53 -07:00
Aleix Conchillo Flaqué
86c8ab02cc github: also publish stable releases to test pypi 2024-04-06 17:58:13 -07:00
Aleix Conchillo Flaqué
b678097f6d Merge pull request #109 from daily-co/only-use-fps
transport: only use fps to set maxFramerate
2024-04-07 07:02:44 +08:00
Aleix Conchillo Flaqué
eb455043c4 transport: use camera_bitrate and camera_framerate 2024-04-06 12:27:05 -07:00
Aleix Conchillo Flaqué
dd696be04c Merge pull request #108 from daily-co/add-camera-max-framerate
transport: add camera_max_framerate argument
2024-04-06 11:18:42 +08:00
Aleix Conchillo Flaqué
96b2337183 transport: add camera_max_framerate argument 2024-04-05 20:16:03 -07:00
Aleix Conchillo Flaqué
ea52e73f57 Merge pull request #107 from daily-co/increase-max-framerate
transport: increase daily maxFramerate to 30
2024-04-06 11:08:21 +08:00
Aleix Conchillo Flaqué
88404e4739 Merge pull request #106 from daily-co/updated-to-be-updated-examples
examples: updated to_be_updated examples
2024-04-06 11:06:30 +08:00
Aleix Conchillo Flaqué
0fd323714e transport: add camera_max_bitrate argument 2024-04-05 20:05:58 -07:00
Aleix Conchillo Flaqué
a362ca4d3d transport: increase daily maxFramerate to 30 2024-04-05 19:44:25 -07:00
Aleix Conchillo Flaqué
02b5c3dd5f update dot-env.template 2024-04-05 16:16:56 -07:00
Aleix Conchillo Flaqué
497a09cbc8 examples: updated to_be_updated examples 2024-04-05 16:01:23 -07:00
Aleix Conchillo Flaqué
172a14245d Merge pull request #104 from daily-co/threaded-transport-allow-sink-override
examples: fix whisper examples
2024-04-06 04:46:12 +08:00
Aleix Conchillo Flaqué
302246399b Merge pull request #105 from daily-co/local-tranport-read-audio-frames
transports: fix local transport read_audio_frames
2024-04-06 04:44:37 +08:00
Aleix Conchillo Flaqué
9590cc2fbc examples: fix whisper examples 2024-04-05 13:43:51 -07:00
Aleix Conchillo Flaqué
09e4044c72 transports: fix local transport read_audio_frames 2024-04-05 13:34:01 -07:00
Aleix Conchillo Flaqué
efdfb74dc3 github: increase fetch-depth to 100 for test publish 2024-04-05 08:32:29 -07:00
Aleix Conchillo Flaqué
158de6f20b github: fetch-tags and increase fetch-depth for test publish 2024-04-05 08:25:37 -07:00
Aleix Conchillo Flaqué
47f68b742d pyproject: use proper environment for test pypi 2024-04-05 08:02:45 -07:00
Aleix Conchillo Flaqué
2654ca1f62 pyproject: don't use local version for test pypi 2024-04-05 07:51:52 -07:00
Aleix Conchillo Flaqué
4263827ee8 README: use double-quotes with optional dependencies 2024-04-04 17:47:16 -07:00
Aleix Conchillo Flaqué
97fe529b0e github: update test publish workflow 2024-04-04 17:41:31 -07:00
Aleix Conchillo Flaqué
86025723e7 github: one more publish workflow fix 2024-04-04 17:36:20 -07:00
Aleix Conchillo Flaqué
6f4270a552 github: avoid caching in publish workflow 2024-04-04 17:32:50 -07:00
Aleix Conchillo Flaqué
31f050c02b github: more publish workflows fixes 2024-04-04 17:31:59 -07:00
Aleix Conchillo Flaqué
a0fe57721b github: fix publish workflows 2024-04-04 17:17:15 -07:00
Aleix Conchillo Flaqué
abf5e57319 Merge pull request #103 from daily-co/aleix/fix-github-cache-name
github: fix github cache name
2024-04-05 08:03:15 +08:00
Aleix Conchillo Flaqué
44de9007c3 Merge pull request #102 from daily-co/examples-cleanup
examples cleanup
2024-04-05 08:02:57 +08:00
Aleix Conchillo Flaqué
46d265514e pyproject: update github url 2024-04-04 15:52:28 -07:00
Aleix Conchillo Flaqué
9e64de8606 Merge pull request #101 from daily-co/cb/bot-exit
Allow transport exit to end a running pipeline
2024-04-05 06:51:06 +08:00
Aleix Conchillo Flaqué
1ea503c1e6 examples: fix 03a-image-local 2024-04-04 15:35:58 -07:00
Aleix Conchillo Flaqué
d0aeeccb68 github: fix github cache name 2024-04-04 14:36:04 -07:00
Aleix Conchillo Flaqué
d687c8cdeb transports: updated silero vad not found message 2024-04-04 14:05:40 -07:00
Aleix Conchillo Flaqué
951f20c788 transports: don't write/read if microphone/speaker not enabled 2024-04-04 14:05:15 -07:00
Aleix Conchillo Flaqué
982c0a0749 examples: move non-working examples to to_be_updated 2024-04-04 14:04:53 -07:00
Chad Bailey
27cef7cd70 add endframe to transport receive queue 2024-04-04 20:45:23 +00:00
chadbailey59
03ea208361 VAD fallback (#97)
* Silero VAD preferred with webrtc fallback

* webrtc VAD needs a different sample size

* fixup

* fixup
2024-04-04 13:31:07 -05:00
Aleix Conchillo Flaqué
385b51ac83 Merge pull request #98 from daily-co/use-pip-features
use pip optional dependencies
2024-04-05 01:00:21 +08:00
Aleix Conchillo Flaqué
a37e4fabad github: only run publish-test on main 2024-04-04 09:58:42 -07:00
Aleix Conchillo Flaqué
8bc3c03a69 add a requirements.txt per platform 2024-04-03 21:39:10 -07:00
Aleix Conchillo Flaqué
1fc800754b github: no need to install dependencies when building/deploying 2024-04-03 16:26:58 -07:00
Aleix Conchillo Flaqué
18c4bccc13 github: rename deploy to publish 2024-04-03 16:22:23 -07:00
Aleix Conchillo Flaqué
d57d473c13 pyproject.toml: use setuptools_scm to auto manage versions 2024-04-03 16:13:07 -07:00
Aleix Conchillo Flaqué
48bb3c6955 github: add publish to pypi workflows 2024-04-03 15:57:59 -07:00
Aleix Conchillo Flaqué
e3ee3f9cc6 github(lint): use requirements-dev.txt 2024-04-03 15:27:31 -07:00
Aleix Conchillo Flaqué
3528f5d735 use conditional imports and show help errors if modules not found 2024-04-03 15:27:31 -07:00
Aleix Conchillo Flaqué
23735cb3a3 dot-env.example: cleanup and add missing environment variables 2024-04-03 15:27:31 -07:00
Aleix Conchillo Flaqué
6918dc69f0 github: separate build and test workflows 2024-04-03 15:27:31 -07:00
Aleix Conchillo Flaqué
128d350abc pyproject.toml: use project optional dependencies and pin them 2024-04-03 15:27:31 -07:00
chadbailey59
2f59e38a7a Modularize tricky dependencies (#95)
* removed pyaudio from threaded transport

* modularized torch and torchaudio

* modularized local transport

* Working Dockerfile as well

* docker updates for fly.io
2024-04-03 10:48:11 -05:00
chadbailey59
c21014860f Added app messages to translator example (#94) 2024-04-01 14:25:20 -05:00
chadbailey59
d4e3e1710f Server updates (#90)
* updated server readme

* fixup

* Refactored server

* fixup
2024-03-28 15:03:08 -05:00
Moishe Lettvin
e7f9296b5a Merge pull request #93 from daily-co/frame-name-cleanup
Cleanup the last few badly-named Frame types
2024-03-28 14:25:59 -04:00
Moishe Lettvin
27322108b7 Cleanup the last few badly-named Frame types 2024-03-28 12:36:24 -04:00
Moishe Lettvin
22bbedec93 Merge pull request #92 from daily-co/remove-bad-print
Remove mistakenly-added print statement
2024-03-28 11:54:49 -04:00
Moishe Lettvin
ed91bc0f66 Remove mistakenly-added print statement 2024-03-28 11:47:11 -04:00
Moishe Lettvin
565acfa9c9 Merge pull request #86 from daily-co/transport-refactor
Starting refactor of transports into their own directory
2024-03-28 11:17:32 -04:00
Moishe Lettvin
a2295b6b1d Merge pull request #91 from daily-co/pipeline-logging
Add logging for pipeline
2024-03-28 11:16:26 -04:00
Moishe Lettvin
fef1366c84 Merge pull request #88 from daily-co/frame-progress-diagram
Frame progress diagram
2024-03-28 11:13:21 -04:00
Moishe Lettvin
5c0ba1b6f0 Fix off by one errors, add tests and comment 2024-03-28 08:34:34 -04:00
Moishe Lettvin
05c77bce25 Add logging for pipeline 2024-03-27 18:48:30 -04:00
Moishe Lettvin
4ce140bf84 Move some things to AbstractTransport class 2024-03-27 12:59:08 -04:00
James Hush
a3293c6d7a fix: force overriding environment variables from .env files (#89) 2024-03-27 23:38:55 +08:00
Moishe Lettvin
ce04d4a54a Add text to md 2024-03-27 08:10:14 -04:00
Moishe Lettvin
758ed2d895 Frame progress images 2024-03-26 20:40:10 -04:00
Moishe Lettvin
85cd795b2b fix image 2024-03-26 20:36:18 -04:00
Moishe Lettvin
6c36d5f686 Testing 2024-03-26 20:33:09 -04:00
Moishe Lettvin
b2425d6dcd Testing 2024-03-26 20:32:32 -04:00
Moishe Lettvin
e8a6560ac1 Merge forgotten files 2024-03-26 16:24:47 -04:00
Moishe Lettvin
78c80d8941 some more renames 2024-03-26 15:57:19 -04:00
Moishe Lettvin
2fc5de6afe Starting refactor of transports into their own directory 2024-03-26 08:35:04 -04:00
Moishe Lettvin
24fb7c5a05 Merge pull request #81 from daily-co/websocket-transport
Websocket transport
2024-03-25 14:40:34 -04:00
Moishe Lettvin
5761e23af1 remove unnecessary checks 2024-03-25 14:00:08 -04:00
Moishe Lettvin
960c659d5a Remove duplicated constant 2024-03-25 13:59:03 -04:00
Moishe Lettvin
2bda4c3307 Websocket transport 2024-03-25 13:54:34 -04:00
Aleix Conchillo Flaqué
2c5628a621 Merge pull request #85 from daily-co/minor-readme-update
README: minor fixes
2024-03-22 04:33:42 +08:00
Aleix Conchillo Flaqué
9b4cfd9a6c README: minor fixes 2024-03-21 13:16:50 -07:00
Aleix Conchillo Flaqué
8f9aeb0751 Merge pull request #82 from daily-co/remove-unused-imports
remove unused imports
2024-03-22 03:02:07 +08:00
Aleix Conchillo Flaqué
e8a9d43287 Merge pull request #84 from daily-co/use-openai-api-key
use OPENAI_API_KEY instead of OPENAI_CHATGPT_API_KEY
2024-03-21 21:57:40 +08:00
Aleix Conchillo Flaqué
cf5d516d51 use OPENAI_API_KEY instead of OPENAI_CHATGPT_API_KEY
Fixes #77
2024-03-20 15:26:32 -07:00
Aleix Conchillo Flaqué
0666dd1194 remove unused imports 2024-03-20 14:52:19 -07:00
Aleix Conchillo Flaqué
42e25ccd13 create missing __init__.py 2024-03-20 14:41:39 -07:00
Aleix Conchillo Flaqué
520cee273f Merge pull request #80 from daily-co/move-src-daily-tests-to-tests
move src/dailyai/tests to tests
2024-03-21 00:27:07 +08:00
Aleix Conchillo Flaqué
a189e2618f github: source venv in every step 2024-03-19 15:31:03 -07:00
Aleix Conchillo Flaqué
ae2dcf88ed github: use virtual environment 2024-03-19 15:23:09 -07:00
Aleix Conchillo Flaqué
5cdb82ad3c README: one more autopep8 emacs update 2024-03-19 15:18:29 -07:00
Aleix Conchillo Flaqué
593513c84a github: add venv caching 2024-03-19 15:17:48 -07:00
Aleix Conchillo Flaqué
16257f8ec0 move src/dailyai/tests to tests 2024-03-19 14:59:48 -07:00
Aleix Conchillo Flaqué
5fc21a7508 Merge pull request #73 from daily-co/github-unittests-workflow
github: add workflow for unit tests
2024-03-20 03:01:03 +08:00
Aleix Conchillo Flaqué
cc05429135 github: add workflow for unit tests 2024-03-19 11:51:14 -07:00
Aleix Conchillo Flaqué
85e66dddbe Merge pull request #79 from daily-co/readme-emacs-autopep8-update
README: emacs autopep8 update
2024-03-20 02:17:44 +08:00
Aleix Conchillo Flaqué
03ea559839 README: emacs autopep8 update 2024-03-19 10:28:11 -07:00
Aleix Conchillo Flaqué
b6c9859e34 Merge pull request #78 from daily-co/readme-editor-setup
README: add editor setup
2024-03-20 01:10:57 +08:00
Aleix Conchillo Flaqué
bc47c909a3 README: add editor setup 2024-03-19 10:10:14 -07:00
Aleix Conchillo Flaqué
428659730d Merge pull request #70 from daily-co/move-src-example-to-examples
move src/examples to examples
2024-03-20 01:09:13 +08:00
Aleix Conchillo Flaqué
a573277a10 examples: copy runner.py and auth.py where needed 2024-03-18 17:10:23 -07:00
Aleix Conchillo Flaqué
69c2637a25 README.md: update examples 2024-03-18 14:53:53 -07:00
Aleix Conchillo Flaqué
90c34d278f move src/examples to examples 2024-03-18 11:51:38 -07:00
Aleix Conchillo Flaqué
2f4e31d1b2 Merge pull request #69 from daily-co/add-github-linting-workflow
github: add linting workflow
2024-03-19 02:46:50 +08:00
Aleix Conchillo Flaqué
9385270775 autopep8 formatting 2024-03-18 11:28:32 -07:00
Aleix Conchillo Flaqué
2914e43350 github: add linting workflow 2024-03-18 11:28:06 -07:00
chadbailey59
78638d2dba Live translation (#61)
* added translator

* fixup
2024-03-18 13:26:05 -05:00
Aleix Conchillo Flaqué
141a5bb548 Merge pull request #68 from daily-co/log-transcription-errors
daily: log transcription errors
2024-03-19 01:53:40 +08:00
Aleix Conchillo Flaqué
3957813202 Merge pull request #67 from daily-co/add-dot-env-template
add dot-env.template
2024-03-19 01:49:21 +08:00
Aleix Conchillo Flaqué
549862ef99 daily: log transcription errors 2024-03-18 10:47:20 -07:00
Aleix Conchillo Flaqué
1000ca5b55 add dot-env.template 2024-03-18 10:43:57 -07:00
Moishe Lettvin
91dbfef4c3 Merge pull request #64 from daily-co/docs
Some docs
2024-03-18 13:38:32 -04:00
Moishe Lettvin
3b61d0b41a fix typos 2024-03-18 13:38:00 -04:00
Moishe Lettvin
bf3ae091b9 Merge pull request #62 from daily-co/anthropic-support
Anthropic LLM service
2024-03-18 13:36:39 -04:00
Aleix Conchillo Flaqué
34ac796607 Merge pull request #66 from daily-co/daily-transport-release-client
services: release daily client after leave
2024-03-19 01:36:22 +08:00
Aleix Conchillo Flaqué
e0551e9d85 services: release daily client after leave 2024-03-18 10:32:46 -07:00
Moishe Lettvin
b1ab6f91b9 Merge pull request #65 from daily-co/app-messages
Support for app messages
2024-03-18 11:37:10 -04:00
Moishe Lettvin
58726dc20d clean up imports 2024-03-18 10:14:51 -04:00
Moishe Lettvin
8e61fe8e36 Support for app messages 2024-03-18 10:08:41 -04:00
Moishe Lettvin
99b836c227 added docstrings to frames. 2024-03-18 09:08:12 -04:00
Moishe Lettvin
1c27f77f1a drafty architecture doc 2024-03-18 08:39:50 -04:00
Moishe Lettvin
c91fa39a99 Remove testing code 2024-03-15 19:42:46 -04:00
Moishe Lettvin
eacaea7db4 Anthropic LLM service 2024-03-15 19:40:37 -04:00
Moishe Lettvin
c6dfcb6f7a Merge pull request #60 from daily-co/remove-ai-service-methods
Remove run_to_queue and run from AIService class
2024-03-15 15:28:28 -04:00
Moishe Lettvin
18bf26de14 Update apps 2024-03-15 13:39:33 -04:00
Moishe Lettvin
b8b35db89c Remove run_to_queue and run from AIService class 2024-03-15 11:04:22 -04:00
Moishe Lettvin
358166f347 Merge pull request #59 from daily-co/remove-requirements
Remove unused requirements file
2024-03-13 16:23:42 -04:00
Moishe Lettvin
c006c123b2 Remove unused requirements file 2024-03-13 16:19:03 -04:00
chadbailey59
cf302fb765 Storybot and Chatbot examples (#58)
* storybot

* storybot

* added pipeline.queue_frames

* fixup
2024-03-13 15:12:59 -05:00
Moishe Lettvin
e33820fe36 Merge pull request #56 from daily-co/fal-redux
Use other model in FAL
2024-03-12 15:14:57 -04:00
Moishe Lettvin
b84b3d59f3 Use other model in FAL 2024-03-12 14:47:00 -04:00
Moishe Lettvin
7b5b88b99b Merge pull request #55 from daily-co/fix-fal
set FAL param correctly
2024-03-12 14:12:16 -04:00
Moishe Lettvin
e87196cce7 set FAL param correctly 2024-03-12 14:03:43 -04:00
chadbailey59
bbfc9e703b intake cleanup (#54) 2024-03-12 13:01:39 -05:00
Moishe Lettvin
c21a63d48b Merge pull request #49 from daily-co/openai-base-llm
Base OpenAI LLM service
2024-03-12 12:58:31 -04:00
Moishe Lettvin
f546bb32da Make 08- work again 2024-03-12 10:34:52 -04:00
Moishe Lettvin
d9378e23ba Base OpenAI LLM service 2024-03-11 16:52:41 -04:00
Moishe Lettvin
c75a3fb0d0 Merge pull request #53 from daily-co/fix_other_joined_event
Don't do time-consuming processing in `on_other_joined_event`
2024-03-11 13:27:13 -04:00
Moishe Lettvin
f8ae264957 remove unnecessary print 2024-03-11 13:20:28 -04:00
Moishe Lettvin
977c12d530 undo fal change 2024-03-11 13:19:47 -04:00
Moishe Lettvin
61c55d2f47 Fix up other examples 2024-03-11 13:17:31 -04:00
Moishe Lettvin
fd2fa23e9c Fix example 2 2024-03-11 13:00:29 -04:00
Moishe Lettvin
de026ccc8a Merge pull request #50 from daily-co/khk/launch-samples
Khk/launch samples
2024-03-11 12:50:38 -04:00
Moishe Lettvin
c5bb0e14ab Merge pull request #51 from daily-co/khk/readme
updated README
2024-03-11 12:50:22 -04:00
chadbailey59
a4f3c51184 the smallest commit in history 2024-03-11 09:47:00 -05:00
Moishe Lettvin
7786e685cc Merge pull request #52 from daily-co/pypi-updates
updates to pyproject.toml
2024-03-11 10:34:35 -04:00
Moishe Lettvin
33793ca9f8 update description 2024-03-11 07:31:39 -04:00
Moishe Lettvin
d26aede667 updates to pyproject.toml 2024-03-11 07:25:20 -04:00
Moishe Lettvin
ad993056d8 rename to dailyai 2024-03-11 07:16:20 -04:00
Kwindla Hultman Kramer
5b1f26aacb updated README 2024-03-10 22:06:23 -07:00
Kwindla Hultman Kramer
4e16e514dd attempting to change tts to deepgram in example 04 2024-03-10 19:43:06 -07:00
Kwindla Hultman Kramer
959ffa9d36 small streamlining of example 03 2024-03-10 19:42:19 -07:00
Kwindla Hultman Kramer
4396b1018a small streamlining of example 02 2024-03-10 19:41:32 -07:00
Kwindla Hultman Kramer
37e904ce68 changed fal to a maybe slightly faster model 2024-03-10 19:40:51 -07:00
Kwindla Hultman Kramer
ef39d842a5 custom processor in example 05 2024-03-10 19:18:37 -07:00
Kwindla Hultman Kramer
72f631a066 working on foundational examples 2024-03-10 17:21:46 -07:00
chadbailey59
5d46302b9e changed default services (#47) 2024-03-08 15:36:30 -06:00
chadbailey59
8241dc0bed cleaned up example logging (#46) 2024-03-08 15:25:17 -06:00
Moishe Lettvin
95a1efbe75 Merge pull request #45 from daily-co/exception_handling_callbacks
Wait for the callback's result, so exceptions get raised
2024-03-08 15:04:15 -05:00
Moishe Lettvin
e59df8476e Wait for the callback's result, so exceptions get raised 2024-03-08 15:02:15 -05:00
750 changed files with 72268 additions and 4950 deletions

30
.dockerignore Normal file

@@ -0,0 +1,30 @@
# flyctl launch added from .gitignore
**/.vscode
**/env
**/__pycache__
**/*~
**/venv
#*#
# Distribution / packaging
**/.Python
**/build
**/develop-eggs
**/dist
**/downloads
**/eggs
**/.eggs
**/lib
**/lib64
**/parts
**/sdist
**/var
**/wheels
**/share/python-wheels
**/*.egg-info
**/.installed.cfg
**/*.egg
**/MANIFEST
**/.DS_Store
**/.env
fly.toml

48
.github/workflows/android.yaml vendored Normal file

@@ -0,0 +1,48 @@
name: android
on:
  push:
    branches:
      - main
    paths:
      - "examples/simple-chatbot/client/android/**"
  pull_request:
    branches:
      - "**"
    paths:
      - "examples/simple-chatbot/client/android/**"
  workflow_dispatch:
    inputs:
      sdk_git_ref:
        type: string
        description: "Which git ref of the app to build"
concurrency:
  group: build-android-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  sdk:
    name: "Simple chatbot demo"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.sdk_git_ref || github.ref }}
      - name: "Install Java"
        uses: actions/setup-java@v4
        with:
          distribution: 'temurin'
          java-version: '17'
      - name: Build demo app
        working-directory: examples/simple-chatbot/client/android
        run: ./gradlew :simple-chatbot-client:assembleDebug
      - name: Upload demo APK
        uses: actions/upload-artifact@v4
        with:
          name: Simple Chatbot Android Client
          path: examples/simple-chatbot/client/android/simple-chatbot-client/build/outputs/apk/debug/simple-chatbot-client-debug.apk

44
.github/workflows/build.yaml vendored Normal file

@@ -0,0 +1,44 @@
name: build
on:
  workflow_dispatch:
  push:
    branches:
      - main
  pull_request:
    branches:
      - "**"
    paths-ignore:
      - "docs/**"
concurrency:
  group: build-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  build:
    name: "Build and Install"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        id: setup_python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Setup virtual environment
        run: |
          python -m venv .venv
      - name: Install basic Python dependencies
        run: |
          source .venv/bin/activate
          python -m pip install --upgrade pip
          pip install -r dev-requirements.txt
      - name: Build project
        run: |
          source .venv/bin/activate
          python -m build
      - name: Install project and other Python dependencies
        run: |
          source .venv/bin/activate
          pip install --editable .

46
.github/workflows/format.yaml vendored Normal file

@@ -0,0 +1,46 @@
name: format
on:
  workflow_dispatch:
  push:
    branches:
      - main
  pull_request:
    branches:
      - "**"
    paths-ignore:
      - "docs/**"
concurrency:
  group: build-format-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  ruff-format:
    name: "Formatting checker"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Setup virtual environment
        run: |
          python -m venv .venv
      - name: Install development Python dependencies
        run: |
          source .venv/bin/activate
          python -m pip install --upgrade pip
          pip install -r dev-requirements.txt
      - name: Ruff formatter
        id: ruff-format
        run: |
          source .venv/bin/activate
          ruff format --diff
      - name: Ruff import linter
        id: ruff-check
        run: |
          source .venv/bin/activate
          ruff check --select I

84
.github/workflows/publish.yaml vendored Normal file

@@ -0,0 +1,84 @@
name: publish
on:
  workflow_dispatch:
    inputs:
      gitref:
        type: string
        description: "what git ref to build"
        required: true
jobs:
  build:
    name: "Build and upload wheels"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.gitref }}
      - name: Set up Python
        id: setup_python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Setup virtual environment
        run: |
          python -m venv .venv
      - name: Install basic Python dependencies
        run: |
          source .venv/bin/activate
          python -m pip install --upgrade pip
          pip install -r dev-requirements.txt
      - name: Build project
        run: |
          source .venv/bin/activate
          python -m build
      - name: Upload wheels
        uses: actions/upload-artifact@v4
        with:
          name: wheels
          path: ./dist
  publish-to-pypi:
    name: "Publish to PyPI"
    runs-on: ubuntu-latest
    needs: [ build ]
    environment:
      name: pypi
      url: https://pypi.org/p/pipecat-ai
    permissions:
      id-token: write
    steps:
      - name: Download wheels
        uses: actions/download-artifact@v4
        with:
          name: wheels
          path: ./dist
      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          verbose: true
          print-hash: true
  publish-to-test-pypi:
    name: "Publish to Test PyPI"
    runs-on: ubuntu-latest
    needs: [ build ]
    environment:
      name: testpypi
      url: https://pypi.org/p/pipecat-ai
    permissions:
      id-token: write
    steps:
      - name: Download wheels
        uses: actions/download-artifact@v4
        with:
          name: wheels
          path: ./dist
      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          verbose: true
          print-hash: true
          repository-url: https://test.pypi.org/legacy/

58
.github/workflows/publish_test.yaml vendored Normal file

@@ -0,0 +1,58 @@
name: publish-test
on: workflow_dispatch
jobs:
  build:
    name: "Build and upload wheels"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
        with:
          fetch-tags: true
          fetch-depth: 100
      - name: Set up Python
        id: setup_python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Setup virtual environment
        run: |
          python -m venv .venv
      - name: Install basic Python dependencies
        run: |
          source .venv/bin/activate
          python -m pip install --upgrade pip
          pip install -r dev-requirements.txt
      - name: Build project
        run: |
          source .venv/bin/activate
          python -m build
      - name: Upload wheels
        uses: actions/upload-artifact@v4
        with:
          name: wheels
          path: ./dist
  publish-to-test-pypi:
    name: "Publish to Test PyPI"
    runs-on: ubuntu-latest
    needs: [ build ]
    environment:
      name: testpypi
      url: https://pypi.org/p/pipecat-ai
    permissions:
      id-token: write
    steps:
      - name: Download wheels
        uses: actions/download-artifact@v4
        with:
          name: wheels
          path: ./dist
      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          verbose: true
          print-hash: true
          repository-url: https://test.pypi.org/legacy/

52
.github/workflows/tests.yaml vendored Normal file

@@ -0,0 +1,52 @@
name: tests
on:
  workflow_dispatch:
  push:
    branches:
      - main
  pull_request:
    branches:
      - "**"
    paths-ignore:
      - "docs/**"
concurrency:
  group: build-test-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  test:
    name: "Unit and Integration Tests"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
      - name: Set up Python
        id: setup_python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Cache virtual environment
        uses: actions/cache@v3
        with:
          # We are hashing dev-requirements.txt and test-requirements.txt which
          # contain all dependencies needed to run the tests.
          key: venv-${{ runner.os }}-${{ steps.setup_python.outputs.python-version}}-${{ hashFiles('dev-requirements.txt') }}-${{ hashFiles('test-requirements.txt') }}
          path: .venv
      - name: Install system packages
        id: install_system_packages
        run: |
          sudo apt-get install -y portaudio19-dev
      - name: Setup virtual environment
        run: |
          python -m venv .venv
      - name: Install basic Python dependencies
        run: |
          source .venv/bin/activate
          python -m pip install --upgrade pip
          pip install -r dev-requirements.txt -r test-requirements.txt
      - name: Test with pytest
        run: |
          source .venv/bin/activate
          pytest

10
.gitignore vendored

@@ -3,6 +3,8 @@ env/
__pycache__/
*~
venv
.venv
/.idea
#*#
# Distribution / packaging
@@ -26,3 +28,11 @@ share/python-wheels/
MANIFEST
.DS_Store
.env
fly.toml
# Example files
pipecat/examples/twilio-chatbot/templates/streams.xml
# Documentation
docs/api/_build/
docs/api/api

7
.pre-commit-config.yaml Normal file

@@ -0,0 +1,7 @@
repos:
  - repo: local
    hooks:
      - id: ruff-format-hook
        name: Check ruff formatting
        entry: sh scripts/pre-commit.sh
        language: system

36
.readthedocs.yaml Normal file

@@ -0,0 +1,36 @@
version: 2
build:
  os: ubuntu-22.04
  tools:
    python: '3.12'
  apt_packages:
    - portaudio19-dev
    - python3-dev
    - libasound2-dev
  jobs:
    pre_build:
      - python -m pip install --upgrade pip
      - pip install wheel setuptools
    post_build:
      - echo "Build completed"
sphinx:
  configuration: docs/api/conf.py
  fail_on_warning: false
python:
  install:
    - requirements: docs/api/requirements.txt
    - method: pip
      path: .
search:
  ranking:
    api/*: 5
    getting-started/*: 4
    guides/*: 3
submodules:
  include: all
  recursive: true

1843
CHANGELOG.md Normal file

File diff suppressed because it is too large

62
CHANGELOG.md.template Normal file

@@ -0,0 +1,62 @@
# Changelog
All notable changes to the **&lt;project name&gt;** SDK will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
Please make sure to add your changes to the appropriate categories:
## [Unreleased]
### Added
<!-- for new functionality -->
- n/a
### Changed
<!-- for changed functionality -->
- n/a
### Deprecated
<!-- for soon-to-be removed functionality -->
- n/a
### Removed
<!-- for removed functionality -->
- n/a
### Fixed
<!-- for fixed bugs -->
- n/a
### Performance
<!-- for performance-relevant changes -->
- n/a
### Security
<!-- for security-relevant changes -->
- n/a
### Other
<!-- for everything else -->
- n/a
## [0.1.0] - YYYY-MM-DD
Initial release.

165
CONTRIBUTING.md Normal file

@@ -0,0 +1,165 @@
## Contributing to Pipecat
We welcome contributions of all kinds! Your help is appreciated. Follow these steps to get involved:
1. **Fork this repository**: Start by forking the Pipecat repository to your GitHub account.
2. **Clone the repository**: Clone your forked repository to your local machine.
```bash
git clone https://github.com/your-username/pipecat
```
3. **Create a branch**: For your contribution, create a new branch.
```bash
git checkout -b your-branch-name
```
4. **Make your changes**: Edit or add files as necessary.
5. **Test your changes**: Ensure that your changes look correct and follow the style set in the codebase.
6. **Commit your changes**: Once you're satisfied with your changes, commit them with a meaningful message.
```bash
git commit -m "Description of your changes"
```
7. **Push your changes**: Push your branch to your forked repository.
```bash
git push origin your-branch-name
```
8. **Submit a Pull Request (PR)**: Open a PR from your forked repository to the main branch of this repo.
> Important: Describe the changes you've made clearly!
Our maintainers will review your PR, and once everything is good, your contributions will be merged!
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at pipecat-ai@daily.co.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations

View File

@@ -7,13 +7,14 @@ COPY *.py /app
COPY pyproject.toml /app
COPY src/ /app/src/
COPY examples/ /app/examples/
WORKDIR /app
RUN ls --recursive /app/
RUN pip3 install --upgrade -r requirements.txt
RUN python -m build .
RUN pip3 install .
RUN pip3 install gunicorn
# If running on Ubuntu, Azure TTS requires some extra config
# https://learn.microsoft.com/en-us/azure/ai-services/speech-service/quickstarts/setup-platform?pivots=programming-language-python&tabs=linux%2Cubuntu%2Cdotnetcli%2Cdotnet%2Cjre%2Cmaven%2Cnodejs%2Cmac%2Cpypi
@@ -36,4 +37,4 @@ WORKDIR /app
EXPOSE 8000
# run
CMD ["gunicorn", "--workers=2", "--log-level", "debug", "--capture-output", "daily-bot-manager:app", "--bind=0.0.0.0:8000"]
CMD ["gunicorn", "--workers=2", "--log-level", "debug", "--chdir", "examples/server", "--capture-output", "daily-bot-manager:app", "--bind=0.0.0.0:8000"]

View File

@@ -1,6 +1,6 @@
BSD 2-Clause License
Copyright (c) 2024, Daily
Copyright (c) 2024–2025, Daily
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

335
README.md

@@ -1,159 +1,262 @@
# Daily AI SDK
<h1><div align="center">
 <img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
</div></h1>
Build conversational, multi-modal AI apps with real-time voice and video, like this:
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) <a href="https://app.commanddash.io/agent/github_pipecat-ai_pipecat"><img src="https://img.shields.io/badge/AI-Code%20Agent-EB9FDA"></a>
_Demo Video to come_
Pipecat is an open source Python framework for building voice and multimodal conversational agents. It handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions, letting you focus on creating engaging experiences.
With built-in support for many of the best AI platforms (or [add your own](/docs)):
## What you can build
- Azure - DALL-E, ChatGPT, and Azure AI Text-to-Speech
- Deepgram - Speech-to-text, and Aura text-to-speech
- Eleven Labs text-to-speech
- Fal.ai image generation
- OpenAI DALL-E and ChatGPT
- Whisper local speech-to-text
- **Voice Assistants**: [Natural, real-time conversations with AI](https://demo.dailybots.ai/)
- **Interactive Agents**: Personal coaches and meeting assistants
- **Multimodal Apps**: Combine voice, video, images, and text
- **Creative Tools**: [Story-telling experiences](https://storytelling-chatbot.fly.dev/) and social companions
- **Business Solutions**: [Customer intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0) and support bots
- **Complex conversational flows**: [Refer to Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) to learn more
## Step 1: Get Started
## See it in action
## Build/Install
<p float="left">
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/simple-chatbot/image.png" width="280" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/storytelling-chatbot/image.png" width="280" /></a>
<br/>
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/translation-chatbot/image.png" width="280" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/moondream-chatbot/image.png" width="280" /></a>
</p>
## Key features
- **Voice-first Design**: Built-in speech recognition, TTS, and conversation handling
- **Flexible Integration**: Works with popular AI services (OpenAI, ElevenLabs, etc.)
- **Pipeline Architecture**: Build complex apps from simple, reusable components
- **Real-time Processing**: Frame-based pipeline architecture for fluid interactions
- **Production Ready**: Enterprise-grade WebRTC and Websocket support
💡 Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
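The pipeline idea from the feature list above can be pictured with a toy sketch: small processors each transform a frame and hand it to the next one. This is only an illustration using the standard library. The class names `Uppercase` and `Exclaim`, the helper `run_pipeline`, and this stand-in `TextFrame` are hypothetical and are not Pipecat's classes or API.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TextFrame:
    """Toy frame carrying a piece of text (hypothetical, not Pipecat's TextFrame)."""

    text: str


class Uppercase:
    """Toy processor: returns a new frame with upper-cased text."""

    async def process(self, frame):
        return TextFrame(frame.text.upper())


class Exclaim:
    """Toy processor: returns a new frame with an exclamation mark appended."""

    async def process(self, frame):
        return TextFrame(frame.text + "!")


async def run_pipeline(processors, frames):
    """Pass each frame through every processor in order and collect the results."""
    results = []
    for frame in frames:
        for processor in processors:
            frame = await processor.process(frame)
        results.append(frame)
    return results


if __name__ == "__main__":
    out = asyncio.run(run_pipeline([Uppercase(), Exclaim()], [TextFrame("hello")]))
    print(out[0].text)
```

Because each processor only sees frames, stages can be reordered or swapped without touching the others, which is the property the "reusable components" bullet describes.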
## Getting started
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you're ready. You can also add a 📞 telephone number, 🖼️ image output, 📺 video input, use different LLMs, and more.
```shell
# Install the module
pip install pipecat-ai
# Set up your environment
cp dot-env.template .env
```
To keep things lightweight, only the core framework is included by default. If you need support for third-party AI services, you can add the necessary dependencies with:
```shell
pip install "pipecat-ai[option,...]"
```
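Since third-party integrations are installed separately, an application may want to check at startup that the modules it relies on are importable before it builds a pipeline. A minimal sketch using only the standard library follows. The helper name `missing_modules` and the module names in the usage lines are illustrative assumptions, not part of Pipecat.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level modules from module_names that cannot be imported."""
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical module names; substitute the ones your chosen extras provide.
    missing = missing_modules(["deepgram", "openai"])
    if missing:
        print(f"Missing optional dependencies: {', '.join(missing)}")
```

Checking up front gives a single clear message instead of an import error partway through startup.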
### Available services
| Category | Services | Install Command Example |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) | `pip install "pipecat-ai[deepgram]"` |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Together AI](https://docs.pipecat.ai/server/services/llm/together) | `pip install "pipecat-ai[openai]"` |
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) | `pip install "pipecat-ai[cartesia]"` |
| Speech-to-Speech | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) | `pip install "pipecat-ai[openai]"` |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local | `pip install "pipecat-ai[daily]"` |
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) | `pip install "pipecat-ai[tavus,simli]"` |
| Vision & Image | [Moondream](https://docs.pipecat.ai/server/services/vision/moondream), [fal](https://docs.pipecat.ai/server/services/image-generation/fal) | `pip install "pipecat-ai[moondream]"` |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) | `pip install "pipecat-ai[silero]"` |
| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/server/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) | `pip install "pipecat-ai[canonical]"` |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
## Code examples
- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
- [Example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development
## A simple voice agent running locally
Here is a very basic Pipecat bot that greets a user when they join a real-time session. We'll use [Daily](https://daily.co) for real-time media transport, and [Cartesia](https://cartesia.ai/) for text-to-speech.
```python
import asyncio

from pipecat.frames.frames import TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport

async def main():
    # Use Daily as a real-time media transport (WebRTC)
    transport = DailyTransport(
        room_url=...,
        token="", # leave empty. Note: token is _not_ your api key
        bot_name="Bot Name",
        params=DailyParams(audio_out_enabled=True))

    # Use Cartesia for Text-to-Speech
    tts = CartesiaTTSService(
        api_key=...,
        voice_id=...
    )

    # Simple pipeline that will process text to speech and output the result
    pipeline = Pipeline([tts, transport.output()])

    # Create Pipecat processor that can run one or more pipeline tasks
    runner = PipelineRunner()

    # Assign the task callable to run the pipeline
    task = PipelineTask(pipeline)

    # Register an event handler to play audio when a
    # participant joins the transport WebRTC session
    @transport.event_handler("on_first_participant_joined")
    async def on_first_participant_joined(transport, participant):
        participant_name = participant.get("info", {}).get("userName", "")
        # Queue a TextFrame that will get spoken by the TTS service (Cartesia)
        await task.queue_frame(TextFrame(f"Hello there, {participant_name}!"))

    # Register an event handler to exit the application when the user leaves.
    @transport.event_handler("on_participant_left")
    async def on_participant_left(transport, participant, reason):
        await task.cancel()

    # Run the pipeline task
    await runner.run(task)

if __name__ == "__main__":
    asyncio.run(main())
```
Run it with:
```shell
python app.py
```
Daily provides a prebuilt WebRTC user interface. While the app is running, you can visit `https://<yourdomain>.daily.co/<room_url>` and listen to the bot say hello!
## WebRTC for production use
WebSockets are fine for server-to-server communication or for initial development. But for production use, you'll need client-server audio to use a protocol designed for real-time media transport. (For an explanation of the difference between WebSockets and WebRTC, see [this post.](https://www.daily.co/blog/how-to-talk-to-an-llm-with-your-voice/#webrtc))
One way to get up and running quickly with WebRTC is to sign up for a Daily developer account. Daily gives you SDKs and global infrastructure for audio (and video) routing. Every account gets 10,000 audio/video/transcription minutes free each month.
Sign up [here](https://dashboard.daily.co/u/signup) and [create a room](https://docs.daily.co/reference/rest-api/rooms) in the developer Dashboard.
## Hacking on the framework itself
_Note that you may need to set up a virtual environment before following the instructions below. For instance, you might need to run the following from the root of the repo:_
```
python3 -m venv env
source env/bin/activate
```shell
python3 -m venv venv
source venv/bin/activate
```
From the root of this repo, run the following:
```
pip install -r requirements.txt
python -m build
```shell
pip install -r dev-requirements.txt
```
This builds the package. To use the package locally (eg to run sample files), run
This will install the necessary development dependencies. Also, make sure you install the git pre-commit hooks:
```shell
pre-commit install
```
pip install --editable .
The hooks will just save you time when you submit a PR by making sure your code follows the project rules.
To use the package locally (e.g. to run sample files), run:
```shell
pip install --editable ".[option,...]"
```
The `--editable` option makes sure you don't have to run `pip install` again and you can just edit the project files locally.
If you want to use this package from another directory, you can run:
```
pip install path_to_this_repo
```shell
pip install "path_to_this_repo[option,...]"
```
## Running the samples
### Running tests
You can run the simple sample like so:
From the root directory, run:
```
python src/examples/theoretical-to-real/01-say-one-thing.py -u <url of your Daily meeting> -k <your Daily API Key>
```
## Overview
The Daily AI SDK allows you to build applications that can participate in WebRTC sessions and interact with AI Services. Some examples of what you can build with this:
- conversational bots that interact 1:1 with a user, using voice recognition and text-to-speech
- assistant bots that aggregate transcriptions from multiple participants in a meeting and provide realtime summaries or other AI-generated output.
- image-recognition bots
- etc
## Concepts
### Transport Service
The SDK provides one “transport service”, which is a wrapper around Daily's `daily-python` client (tk add link). You can use this service to listen for events related to a WebRTC session, such as “a participant joined the meeting”.
The transport service also exposes a send queue, and a receive queue. You can use the send queue to send audio and video to the WebRTC session, and you can listen to the receive queue to see audio, video and transcription data from the WebRTC session.
### AI Services
The AI Service classes provide wrappers around various AI providers, and allow you to query LLMs, convert text to speech and make images from text. The audio and images can then be placed on the transport service's send queue, where they'll be sent to the WebRTC session.
### Queue Frames
Communication between the transport service and AI services, and between various AI services, takes place in Queue Frames. These frames contain an indication of the type of data as well as the data itself.
## Using Transports, AI Services and Frames
AI Services all define a `.run` method. This method consumes and generates `QueueFrame` frames. The kinds of frames that can be consumed and generated depend on the kind of service. For instance, an LLM AI Service consumes `LLM_MESSAGE` frames (which define a history of interaction with an LLM) and emits `TEXT` frames (the response from the LLM).
The `.run` method is an `AsyncIterable`, and it takes an `iterable`, `AsyncIterable` or `asyncio.Queue` that produces QueueFrames as a parameter. This makes it easy to chain AI Services, and to consume input from the Transport's `receive_queue`.
AI Services also have a `.run_to_queue` method. This method is not an AsyncIterable, but instead sends processed QueueFrames to a queue. This makes it easy to send the output of an AI Service to the Transport's `send_queue`.
AI Services also define convenience functions that let you bypass creating QueueFrames for some simple cases (e.g. using the TTS service to convert a string to audio output and send that audio to the transport's `send_queue`). See below for examples.
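The chaining described above can be sketched with plain `asyncio`. This is only an illustration of the idea, not the SDK's implementation: `to_async_iter`, `upper_service`, `exclaim_service`, `collect` and the `END` sentinel are hypothetical names, and frames are reduced to plain strings.

```python
import asyncio
from collections.abc import AsyncIterable, AsyncIterator, Iterable
from typing import Union

# Hypothetical sentinel that marks the end of a queue-backed stream.
END = object()

FrameSource = Union[Iterable, AsyncIterable, asyncio.Queue]


async def to_async_iter(source: FrameSource) -> AsyncIterator:
    """Yield frames from a sync iterable, an async iterable, or a queue."""
    if isinstance(source, asyncio.Queue):
        while True:
            frame = await source.get()
            if frame is END:
                return
            yield frame
    elif isinstance(source, AsyncIterable):
        async for frame in source:
            yield frame
    else:
        for frame in source:
            yield frame


async def upper_service(source: FrameSource) -> AsyncIterator[str]:
    """Toy service: upper-case every text frame it consumes."""
    async for frame in to_async_iter(source):
        yield frame.upper()


async def exclaim_service(source: FrameSource) -> AsyncIterator[str]:
    """Toy service: append an exclamation mark to every text frame."""
    async for frame in to_async_iter(source):
        yield frame + "!"


async def collect(source: FrameSource) -> list:
    """Drain any supported source into a list."""
    return [frame async for frame in to_async_iter(source)]
```

Because every toy service accepts the same three kinds of input and is itself an async iterable, the output of one can be passed straight to the next, for example `collect(exclaim_service(upper_service(["hello", "world"])))`.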
## Examples
### Say Something
The base TTS AI service exposes a `.say` method. After creating a transport and TTS service, you can use this method like so:
```
transport = DailyTransportService(...)
tts = AzureTTSService()
await tts.say("hello world", transport.send_queue)
```shell
pytest
```
This will call the TTS service to render the text to audio frames, then put the audio frames on the transport's send queue. The transport will then send those frames along to the WebRTC session.
## Setting up your editor
### Speak an LLM response
This project uses strict [PEP 8](https://peps.python.org/pep-0008/) formatting via [Ruff](https://github.com/astral-sh/ruff).
Given a system prompt contained in a `messages` array, you can emit the LLM's response as audio with a chain like this:
### Emacs
```
transport = DailyTransportService(...) # setup parameters omitted
tts = AzureTTSService()
llm = AzureLLMService()
messages = [...] # system prompt omitted for brevity
You can use [use-package](https://github.com/jwiegley/use-package) to install [emacs-lazy-ruff](https://github.com/christophermadsen/emacs-lazy-ruff) package and configure `ruff` arguments:
await tts.run_to_queue(
    transport.send_queue,
    llm.run([QueueFrame.LLM_MESSAGES, messages])
)
```elisp
(use-package lazy-ruff
:ensure t
:hook ((python-mode . lazy-ruff-mode))
:config
(setq lazy-ruff-format-command "ruff format")
(setq lazy-ruff-check-command "ruff check --select I"))
```
In this code, the LLM service object sends the messages to Azure's OpenAI implementation, which streams chunks back asynchronously. Those chunks are aggregated by the TTS Service to ensure the best audio response (TTS works best when it gets complete sentences, so it can inflect correctly), then sent to Azure's TTS service, converted to audio frames, and sent to the WebRTC session via the Daily transport.
`ruff` was installed in the `venv` environment described before, so you should be able to use [pyvenv-auto](https://github.com/ryotaro612/pyvenv-auto) to automatically load that environment inside Emacs.
### Pre-cache an LLM response
Sometimes LLMs can be slower than we'd like for natural-feeling communication. Here's an example where we take advantage of the time it takes to speak some pre-defined text to get a head start on the LLM response:
(TK link to 04- sample)
In this sample, we set up a buffer queue to receive the audio frames from the LLM response while we are joining the call, and start an asynchronous task to fill this buffer:
```
buffer_queue = asyncio.Queue()
llm_response_task = asyncio.create_task(
    elevenlabs_tts.run_to_queue(
        buffer_queue,
        llm.run([QueueFrame(FrameType.LLM_MESSAGE, messages)]),
        True,
    )
)
```elisp
(use-package pyvenv-auto
:ensure t
:defer t
:hook ((python-mode . pyvenv-auto-run)))
```
Then, when we've joined the call, we speak the static text:
### Visual Studio Code
```
await azure_tts.say("My friend...", transport.send_queue)
Install the
[Ruff](https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff) extension. Then edit the user settings (_Ctrl-Shift-P_ `Open User Settings (JSON)`) and set it as the default Python formatter, and enable formatting on save:
```json
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true
}
```
As that text is being spoken, the asynchronous LLM task continues in the background. When the text is done, we pull the frames off the buffer queue and put them in the transport's `send_queue`:
### PyCharm
```
async def buffer_to_send_queue():
    while True:
        frame = await buffer_queue.get()
        await transport.send_queue.put(frame)
        buffer_queue.task_done()
        if frame.frame_type == FrameType.END_STREAM:
            break
`ruff` was installed in the `venv` environment described before, now to enable autoformatting on save, go to `File` -> `Settings` -> `Tools` -> `File Watchers` and add a new watcher with the following settings:
await asyncio.gather(llm_response_task, buffer_to_send_queue())
1. **Name**: `Ruff formatter`
2. **File type**: `Python`
3. **Working directory**: `$ContentRoot$`
4. **Arguments**: `format $FilePath$`
5. **Program**: `$PyInterpreterDirectory$/ruff`
```
## Contributing
One thing to note here is the last parameter to `run_to_queue` in the first code block above: it causes the `run_to_queue` method to send an `END_STREAM` frame when it's done rendering. This lets us know when to stop our `buffer_to_send_queue` task above.
We welcome contributions from the community! Whether you're fixing bugs, improving documentation, or adding new features, here's how you can help:
- **Found a bug?** Open an [issue](https://github.com/pipecat-ai/pipecat/issues)
- **Have a feature idea?** Start a [discussion](https://discord.gg/pipecat)
- **Want to contribute code?** Check our [CONTRIBUTING.md](CONTRIBUTING.md) guide
- **Documentation improvements?** [Docs](https://github.com/pipecat-ai/docs) PRs are always welcome
Before submitting a pull request, please check existing issues and PRs to avoid duplicates.
We aim to review all contributions promptly and provide constructive feedback to help get your changes merged.
## Getting help
➡️ [Join our Discord](https://discord.gg/pipecat)
➡️ [Read the docs](https://docs.pipecat.ai)
➡️ [Reach us on X](https://x.com/pipecat_ai)

11
dev-requirements.txt Normal file

@@ -0,0 +1,11 @@
build~=1.2.2
grpcio-tools~=1.69.0
pip-tools~=7.4.1
pre-commit~=4.0.1
pyright~=1.1.392
pytest~=8.3.4
pytest-asyncio~=0.25.2
ruff~=0.9.1
setuptools~=75.8.0
setuptools_scm~=8.1.0
python-dotenv~=1.0.1

22
docs/ISSUE_TEMPLATE.md Normal file

@@ -0,0 +1,22 @@
# Description
Is this reporting a bug or feature request?
If reporting a bug, please fill out the following:
### Environment
- pipecat-ai version:
- python version:
- OS:
### Issue description
Provide a clear description of the issue.
### Repro steps
List the steps to reproduce the issue.
### Expected behavior
### Actual behavior
### Logs


@@ -0,0 +1 @@
#### Please describe the changes in your PR. If it is addressing an issue, please reference that as well.


@@ -1,13 +1,10 @@
# Daily AI SDK Docs
# Pipecat Docs
## [Architecture Overview](architecture.md)
Learn about the thinking behind the SDK's design.
Learn about the thinking behind the framework's design.
## [Example Code](examples/)
## [A Frame's Progress](frame-progress.md)
The repo includes several example apps in the `src/examples` directory. The docs explain how they work.
See how a Frame is processed through a Transport, a Pipeline, and a series of Frame Processors.
## [API Reference](api/)
Complete documentation of the available classes and methods in the SDK.

20
docs/api/Makefile Normal file

@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

109
docs/api/README.md Normal file

@@ -0,0 +1,109 @@
# Pipecat Documentation
This directory contains the source files for auto-generating Pipecat's server API reference documentation.
## Setup
1. Install documentation dependencies:
```bash
pip install -r requirements.txt
```
2. Make the build scripts executable:
```bash
chmod +x build-docs.sh rtd-test.sh
```
## Building Documentation
From this directory, you can build the documentation in several ways:
### Local Build
```bash
# Using the build script (automatically opens docs when done)
./build-docs.sh
# Or directly with sphinx-build
sphinx-build -b html . _build/html -W --keep-going
```
### ReadTheDocs Test Build
To test the documentation build process exactly as it would run on ReadTheDocs:
```bash
./rtd-test.sh
```
This script:
- Creates a fresh virtual environment
- Installs all dependencies as specified in requirements files
- Handles conflicting dependencies (like grpcio versions for Riva and PlayHT)
- Builds the documentation in an isolated environment
- Provides detailed logging of the build process
Use this script to verify your documentation will build correctly on ReadTheDocs before pushing changes.
## Viewing Documentation
The built documentation will be available at `_build/html/index.html`. To open:
```bash
# On MacOS
open _build/html/index.html
# On Linux
xdg-open _build/html/index.html
# On Windows
start _build/html/index.html
```
## Directory Structure
```
.
├── api/ # Auto-generated API documentation
├── _build/ # Built documentation
├── _static/ # Static files (images, css, etc.)
├── conf.py # Sphinx configuration
├── index.rst # Main documentation entry point
├── requirements-base.txt # Base documentation dependencies
├── requirements-riva.txt # Riva-specific dependencies
├── requirements-playht.txt # PlayHT-specific dependencies
├── build-docs.sh # Local build script
└── rtd-test.sh # ReadTheDocs test build script
```
## Notes
- Documentation is auto-generated from Python docstrings
- Service modules are automatically detected and included
- The build process matches our ReadTheDocs configuration
- Warnings are treated as errors (-W flag) to maintain consistency
- The --keep-going flag ensures all errors are reported
- Dependencies are split into multiple requirements files to handle version conflicts
## Troubleshooting
If you encounter missing service modules:
1. Verify the service is installed with its extras: `pip install "pipecat-ai[service-name]"`
2. Check the build logs for import errors
3. Ensure the service module is properly initialized in the package
4. Run `./rtd-test.sh` to test in an isolated environment matching ReadTheDocs
For dependency conflicts:
1. Check the requirements files for version specifications
2. Use `rtd-test.sh` to verify dependency resolution
3. Consider adding service-specific requirements files if needed
For more information:
- [ReadTheDocs Configuration](.readthedocs.yaml)
- [Sphinx Documentation](https://www.sphinx-doc.org/)

10
docs/api/build-docs.sh Executable file

@@ -0,0 +1,10 @@
#!/bin/bash
# Clean previous build
rm -rf _build
# Build docs matching ReadTheDocs configuration
sphinx-build -b html -d _build/doctrees . _build/html -W --keep-going
# Open docs (MacOS)
open _build/html/index.html

252
docs/api/conf.py Normal file

@@ -0,0 +1,252 @@
import logging
import sys
from pathlib import Path
# Configure logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger("sphinx-build")
# Add source directory to path
docs_dir = Path(__file__).parent
project_root = docs_dir.parent.parent
sys.path.insert(0, str(project_root / "src"))
# Project information
project = "pipecat-ai"
copyright = "2024, Daily"
author = "Daily"
# General configuration
extensions = [
"sphinx.ext.autodoc",
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinx.ext.intersphinx",
]
# Napoleon settings
napoleon_google_docstring = True
napoleon_numpy_docstring = False
napoleon_include_init_with_doc = True
# AutoDoc settings
autodoc_default_options = {
"members": True,
"member-order": "bysource",
"special-members": "__init__",
"undoc-members": True,
"exclude-members": "__weakref__",
"no-index": True,
"show-inheritance": True,
}
# Mock imports for optional dependencies
autodoc_mock_imports = [
"riva",
"livekit",
"pyht", # Base PlayHT package
"pyht.async_client", # PlayHT specific imports
"pyht.client",
"pyht.protos",
"pyht.protos.api_pb2",
"pipecat_ai_playht", # PlayHT wrapper
"anthropic",
"assemblyai",
"boto3",
"azure",
"cartesia",
"deepgram",
"elevenlabs",
"fal",
"gladia",
"google",
"krisp",
"langchain",
"lmnt",
"noisereduce",
"openai",
"openpipe",
"simli",
"soundfile",
# Existing mocks
"pipecat_ai_krisp",
"pyaudio",
"_tkinter",
"tkinter",
"daily",
"daily_python",
"pydantic.BaseModel",
"pydantic.Field",
"pydantic._internal._model_construction",
"pydantic._internal._fields",
]
# HTML output settings
html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"]
autodoc_typehints = "description"
html_show_sphinx = False
def verify_modules():
    """Verify that required modules are available."""
    required_modules = {
        "services": [
            "assemblyai",
            "aws",
            "cartesia",
            "deepgram",
            "google",
            "lmnt",
            "riva",
            "simli",
        ],
        "serializers": ["livekit"],
        "vad": ["silero", "vad_analyzer"],
        "transports": {
            "services": ["daily", "livekit"],
            "local": ["audio", "tk"],
            "network": ["fastapi_websocket", "websocket_server"],
        },
    }

    missing = []

    for category, modules in required_modules.items():
        if isinstance(modules, dict):
            # Handle nested structure
            for subcategory, submodules in modules.items():
                for module in submodules:
                    try:
                        __import__(f"pipecat.{category}.{subcategory}.{module}")
                        logger.info(
                            f"Successfully imported pipecat.{category}.{subcategory}.{module}"
                        )
                    except (ImportError, TypeError, NameError) as e:
                        missing.append(f"pipecat.{category}.{subcategory}.{module}")
                        logger.warning(
                            f"Optional module not available: pipecat.{category}.{subcategory}.{module} - {str(e)}"
                        )
        else:
            # Handle flat structure
            for module in modules:
                try:
                    __import__(f"pipecat.{category}.{module}")
                    logger.info(f"Successfully imported pipecat.{category}.{module}")
                except (ImportError, TypeError, NameError) as e:
                    missing.append(f"pipecat.{category}.{module}")
                    logger.warning(
                        f"Optional module not available: pipecat.{category}.{module} - {str(e)}"
                    )

    if missing:
        logger.warning(f"Some optional modules are not available: {missing}")
def clean_title(title: str) -> str:
    """Automatically clean module titles."""
    # Remove everything after space (like 'module', 'processor', etc.)
    title = title.split(" ")[0]

    # Get the last part of the dot-separated path
    parts = title.split(".")
    title = parts[-1]

    # Special cases for service names and common acronyms
    special_cases = {
        "ai": "AI",
        "aws": "AWS",
        "api": "API",
        "vad": "VAD",
        "assemblyai": "AssemblyAI",
        "deepgram": "Deepgram",
        "elevenlabs": "ElevenLabs",
        "openai": "OpenAI",
        "openpipe": "OpenPipe",
        "playht": "PlayHT",
        "xtts": "XTTS",
        "lmnt": "LMNT",
    }

    # Check if the entire title is a special case
    if title.lower() in special_cases:
        return special_cases[title.lower()]

    # Otherwise, capitalize each word
    words = title.split("_")
    cleaned_words = []
    for word in words:
        if word.lower() in special_cases:
            cleaned_words.append(special_cases[word.lower()])
        else:
            cleaned_words.append(word.capitalize())

    return " ".join(cleaned_words)
def setup(app):
    """Generate API documentation during Sphinx build."""
    from sphinx.ext.apidoc import main

    docs_dir = Path(__file__).parent
    project_root = docs_dir.parent.parent
    output_dir = str(docs_dir / "api")
    source_dir = str(project_root / "src" / "pipecat")

    # Clean existing files
    if Path(output_dir).exists():
        import shutil

        shutil.rmtree(output_dir)
        logger.info(f"Cleaned existing documentation in {output_dir}")

    logger.info("Generating API documentation...")
    logger.info(f"Output directory: {output_dir}")
    logger.info(f"Source directory: {source_dir}")

    excludes = [
        str(project_root / "src/pipecat/pipeline/to_be_updated"),
        str(project_root / "src/pipecat/processors/gstreamer"),
        str(project_root / "src/pipecat/services/to_be_updated"),
        str(project_root / "src/pipecat/vad"),  # deprecated
        "**/test_*.py",
        "**/tests/*.py",
    ]

    try:
        main(
            [
                "-f",  # Force overwriting
                "-e",  # Short form of --separate: one page per module
                "-M",  # Short form of --module-first
                "--no-toc",  # Don't create a table of contents file
                "--separate",  # Put documentation for each module in its own page
                "--module-first",  # Module documentation before submodule documentation
                "--implicit-namespaces",  # Added: Handle implicit namespace packages
                "-o",
                output_dir,
                source_dir,
            ]
            + excludes
        )

        logger.info("API documentation generated successfully!")

        # Process generated RST files to update titles
        for rst_file in Path(output_dir).glob("**/*.rst"):  # Changed to recursive glob
            content = rst_file.read_text()
            lines = content.split("\n")

            # Find and clean up the title: the first line is the title and
            # the second line is its "=" underline. Guard against one-line files.
            if len(lines) > 1 and "=" in lines[1]:
                old_title = lines[0]
                new_title = clean_title(old_title)
                content = content.replace(old_title, new_title)
                rst_file.write_text(content)
                logger.info(f"Updated title: {old_title} -> {new_title}")

    except Exception as e:
        logger.error(f"Error generating API documentation: {e}", exc_info=True)

    # Run module verification
    verify_modules()

77
docs/api/index.rst Normal file

@@ -0,0 +1,77 @@
Pipecat API Reference Docs
==========================
Welcome to Pipecat's API reference documentation!
Pipecat is an open source framework for building voice and multimodal assistants.
It provides a flexible pipeline architecture for connecting various AI services,
audio processing, and transport layers.
Quick Links
-----------
* `GitHub Repository <https://github.com/pipecat-ai/pipecat>`_
* `Website <https://pipecat.ai>`_
API Reference
-------------
Core Components
~~~~~~~~~~~~~~~
* :mod:`Frames <pipecat.frames>`
* :mod:`Processors <pipecat.processors>`
* :mod:`Pipeline <pipecat.pipeline>`
Audio Processing
~~~~~~~~~~~~~~~~
* :mod:`Audio <pipecat.audio>`
Services
~~~~~~~~
* :mod:`Services <pipecat.services>`
Transport & Serialization
~~~~~~~~~~~~~~~~~~~~~~~~~
* :mod:`Transports <pipecat.transports>`
* :mod:`Local <pipecat.transports.local>`
* :mod:`Network <pipecat.transports.network>`
* :mod:`Services <pipecat.transports.services>`
* :mod:`Serializers <pipecat.serializers>`
Utilities
~~~~~~~~~
* :mod:`Clocks <pipecat.clocks>`
* :mod:`Metrics <pipecat.metrics>`
* :mod:`Sync <pipecat.sync>`
* :mod:`Transcriptions <pipecat.transcriptions>`
* :mod:`Utils <pipecat.utils>`
.. toctree::
:maxdepth: 3
:caption: API Reference
:hidden:
Audio <api/pipecat.audio>
Clocks <api/pipecat.clocks>
Frames <api/pipecat.frames>
Metrics <api/pipecat.metrics>
Pipeline <api/pipecat.pipeline>
Processors <api/pipecat.processors>
Serializers <api/pipecat.serializers>
Services <api/pipecat.services>
Sync <api/pipecat.sync>
Transcriptions <api/pipecat.transcriptions>
Transports <api/pipecat.transports>
Utils <api/pipecat.utils>
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

35
docs/api/make.bat Normal file

@@ -0,0 +1,35 @@
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)
if "%1" == "" goto help
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
:end
popd

40
docs/api/requirements.txt Normal file

@@ -0,0 +1,40 @@
# Sphinx dependencies
sphinx>=8.1.3
sphinx-rtd-theme
sphinx-markdown-builder
sphinx-autodoc-typehints
toml
# Install all extras individually to ensure they're properly resolved
pipecat-ai[anthropic]
pipecat-ai[assemblyai]
pipecat-ai[aws]
pipecat-ai[azure]
pipecat-ai[canonical]
pipecat-ai[cartesia]
pipecat-ai[daily]
pipecat-ai[deepgram]
pipecat-ai[elevenlabs]
pipecat-ai[fal]
pipecat-ai[fireworks]
pipecat-ai[gladia]
pipecat-ai[google]
pipecat-ai[grok]
pipecat-ai[groq]
# pipecat-ai[krisp] # Mocked instead
pipecat-ai[langchain]
pipecat-ai[livekit]
pipecat-ai[lmnt]
pipecat-ai[local]
pipecat-ai[moondream]
pipecat-ai[nim]
pipecat-ai[noisereduce]
pipecat-ai[openai]
# pipecat-ai[openpipe]
# pipecat-ai[playht] # Mocked due to grpcio conflict with riva
pipecat-ai[riva]
pipecat-ai[silero]
pipecat-ai[simli]
pipecat-ai[soundfile]
pipecat-ai[websocket]
pipecat-ai[whisper]

38
docs/api/rtd-test.sh Executable file

@@ -0,0 +1,38 @@
#!/bin/bash
set -e
# Configuration
DOCS_DIR=$(pwd)
PROJECT_ROOT=$(cd ../../ && pwd)
TEST_DIR="/tmp/rtd-test-$(date +%Y%m%d_%H%M%S)"
echo "Creating test directory: $TEST_DIR"
mkdir -p "$TEST_DIR"
cd "$TEST_DIR"
# Create virtual environment
python -m venv venv
source venv/bin/activate
echo "Installing build dependencies..."
pip install --upgrade pip wheel setuptools
echo "Installing documentation dependencies..."
pip install -r "$DOCS_DIR/requirements.txt"
echo "Building documentation..."
cd "$DOCS_DIR"
sphinx-build -b html . "_build/html"
echo "Build complete. Check _build/html directory for output."
# Print summary
echo -e "\n=== Build Summary ==="
echo "Documentation: $DOCS_DIR/_build/html"
echo "Test environment: $TEST_DIR"
echo -e "\nTo view the documentation:"
echo "open $DOCS_DIR/_build/html/index.html"
# Print installed packages for verification
echo -e "\n=== Installed Packages ==="
pip freeze | grep -E "sphinx|pipecat"


@@ -1,2 +1,17 @@
# Daily AI SDK Architecture Guide
# Pipecat architecture guide
## Frames
Frames can represent discrete chunks of data, for instance a chunk of text, a chunk of audio, or an image. They can also be used for control flow, for instance a frame that indicates that there is no more data available, or that a user started or stopped talking. They can also represent more complex data structures, such as a message array used for an LLM completion.
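One way to picture the three kinds of frames is as small dataclasses. The sketch below is illustrative only: the class names (`TextChunk`, `AudioChunk`, `EndOfData`, `MessagesFrame`) and the `describe` helper are hypothetical and are not Pipecat's own frame classes.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical base class for everything that flows through a pipeline."""


@dataclass
class TextChunk(Frame):
    """Data frame: a chunk of text."""

    text: str


@dataclass
class AudioChunk(Frame):
    """Data frame: a chunk of raw audio."""

    audio: bytes
    sample_rate: int


@dataclass
class EndOfData(Frame):
    """Control frame: signals that no more data is available."""


@dataclass
class MessagesFrame(Frame):
    """Complex data frame: a message array for an LLM completion."""

    messages: list


def describe(frame: Frame) -> str:
    """Return a short human-readable summary of a frame."""
    if isinstance(frame, TextChunk):
        return f"text: {frame.text!r}"
    if isinstance(frame, AudioChunk):
        return f"audio: {len(frame.audio)} bytes at {frame.sample_rate} Hz"
    if isinstance(frame, MessagesFrame):
        return f"messages: {len(frame.messages)} entries"
    if isinstance(frame, EndOfData):
        return "control: end of data"
    return "unknown frame"
```

Dispatching on the frame's type, as `describe` does, is the same decision a frame processor makes when it chooses which frames to handle and which to pass along.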
## FrameProcessors
Frame processors operate on frames. Every frame processor implements a `process_frame` method that consumes one frame and produces zero or more frames. Frame processors can do simple transforms, such as concatenating text fragments into sentences, or they can treat frames as input for an AI Service, and emit chat completions based on message arrays or transform text into audio or images.
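As a rough sketch of the "one frame in, zero or more frames out" contract, here is a toy processor that concatenates text fragments into sentences. It is a simplification under stated assumptions: `SentenceAggregator` is a hypothetical class, frames are plain strings, `process_frame` is synchronous and returns a list rather than pushing frames to a peer, and the sentence detection is deliberately naive (it would split on the period in `3.14`).

```python
import re
from typing import List


class SentenceAggregator:
    """Buffers text fragments and emits complete sentences (hypothetical sketch)."""

    def __init__(self) -> None:
        self._buffer = ""

    def process_frame(self, fragment: str) -> List[str]:
        """Consume one text fragment and return zero or more full sentences."""
        self._buffer += fragment
        sentences = []
        while True:
            # A sentence ends at ".", "!" or "?" plus any trailing whitespace.
            match = re.search(r"[.!?]\s*", self._buffer)
            if match is None:
                break
            sentences.append(self._buffer[: match.end()].strip())
            self._buffer = self._buffer[match.end():]
        return sentences
```

Feeding it `"Hello the"` produces nothing, while the following fragment `"re. How are"` completes the first sentence and leaves `"How are"` in the buffer for later.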
## Pipelines
Pipelines are lists of frame processors linked together. Frame processors can push frames upstream or downstream to their peers. A very simple pipeline might chain an LLM frame processor to a text-to-speech frame processor, with a transport as an output.
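The linking can be sketched in a few lines of `asyncio`. This is a minimal illustration, not Pipecat's `Pipeline` class: `Processor`, `FakeLLM`, `FakeTTS`, `Sink` and `build_pipeline` are hypothetical names, frames are plain strings, and only the downstream direction is modeled.

```python
import asyncio
from typing import List, Optional


class Processor:
    """Hypothetical base processor that forwards frames to its downstream peer."""

    def __init__(self) -> None:
        self._next: Optional["Processor"] = None

    def link(self, nxt: "Processor") -> None:
        self._next = nxt

    async def push_frame(self, frame: str) -> None:
        if self._next is not None:
            await self._next.process_frame(frame)

    async def process_frame(self, frame: str) -> None:
        # Default behavior: pass the frame through unchanged.
        await self.push_frame(frame)


class FakeLLM(Processor):
    async def process_frame(self, frame: str) -> None:
        await self.push_frame(f"reply to {frame}")


class FakeTTS(Processor):
    async def process_frame(self, frame: str) -> None:
        await self.push_frame(f"<audio:{frame}>")


class Sink(Processor):
    """Stands in for a transport output: records whatever reaches it."""

    def __init__(self) -> None:
        super().__init__()
        self.received: List[str] = []

    async def process_frame(self, frame: str) -> None:
        self.received.append(frame)


def build_pipeline(processors: List[Processor]) -> Processor:
    """Link each processor to the one after it and return the first."""
    for current, nxt in zip(processors, processors[1:]):
        current.link(nxt)
    return processors[0]
```

Building `[FakeLLM(), FakeTTS(), sink]` and sending `"hi"` into the head leaves `"<audio:reply to hi>"` in `sink.received`, which mirrors the LLM-to-TTS-to-transport chain described above.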
## Transports
Transports provide input and output frame processors to receive or send frames respectively. For example, the `DailyTransport` does this with a WebRTC session joined to a Daily.co room.


@@ -1,119 +0,0 @@
# 01: Say One Thing
_video here - youtube?_
This example uses a text-to-speech (TTS) service to say one predefined sentence. But first, a quick overview of the general structure of these examples.
## Running the demos
All of the demos have something like this at the bottom of the file:
```python
if __name__ == "__main__":
    (url, token) = configure()
    asyncio.run(main(url, token))
```
### `configure()`
The `configure()` function comes from `src/examples/foundational/support/runner.py`, and it allows you to configure the examples from the command line directly, or using environment variables:
```bash
python 01-say-one-thing.py -u https://YOUR_DOMAIN.daily.co/YOUR_ROOM -k YOUR_API_KEY
# or
DAILY_ROOM_URL=https://YOUR_DOMAIN.daily.co/YOUR_ROOM DAILY_API_KEY=YOUR_API_KEY python 01-say-one-thing.py
# or set DAILY_ROOM_URL and DAILY_API_KEY in a .env file
python 01-say-one-thing.py
```
You'll need a Daily account to run these demos. You can sign up for free at [daily.co](https://daily.co). Once you've signed up you can create a room from the [Dashboard](https://dashboard.daily.co/rooms), and grab [your API key](https://dashboard.daily.co/developers) while you're there.
Some functionality (such as transcription) requires the bot to have owner privileges in the room. `runner.py` uses the Daily REST API to create a meeting token with owner privileges. You can learn more about meeting tokens in the [Daily docs](https://docs.daily.co/reference/rest-api/meeting-tokens).
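The command-line and environment-variable behavior described above can be approximated with `argparse`. This is a sketch under assumptions, not the code in `runner.py`: `configure_sketch` is a hypothetical name, the long option names `--url` and `--apikey` are invented for illustration, and the sketch returns the API key directly instead of loading a `.env` file or creating a meeting token through the Daily REST API.

```python
import argparse
import os
from typing import List, Optional, Tuple


def configure_sketch(argv: Optional[List[str]] = None) -> Tuple[str, str]:
    """Read the room URL and API key from flags, falling back to the environment."""
    parser = argparse.ArgumentParser(description="Example bot configuration")
    parser.add_argument(
        "-u", "--url", default=os.getenv("DAILY_ROOM_URL"), help="Daily room URL"
    )
    parser.add_argument(
        "-k", "--apikey", default=os.getenv("DAILY_API_KEY"), help="Daily API key"
    )
    args = parser.parse_args(argv)

    if not args.url or not args.apikey:
        raise ValueError(
            "Provide -u and -k, or set DAILY_ROOM_URL and DAILY_API_KEY."
        )
    return args.url, args.apikey
```

Command-line flags win over environment variables here because the environment is only used as the default value for each argument.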
### `asyncio.run()`
The AI SDK makes heavy use of Python's `asyncio` module. [This is a reasonable intro to the topic](https://builtin.com/data-science/asyncio) if you haven't worked with `asyncio` and coroutines before.
You can learn a bit more about the specifics of how the Daily AI SDK uses coroutines in the [Architecture Guide](../architecture.md).
## The `main()` function
All of the examples have a `main()` function with a similar structure:
- Configure the transport
- Configure the AI service(s) used in the demo
- Configure any event listeners
- Define a processing pipeline
- Run the example's coroutine(s)
### Configuring the transport
The first section of the `main()` function configures the transport object:
```python
meeting_duration_minutes = 5
transport = DailyTransportService(
    room_url,
    None,
    "Say One Thing",
    meeting_duration_minutes,
)
transport.mic_enabled = True
The [Architecture Guide](../architecture.md) explains the transport object in more detail. In this case, we're configuring a Daily transport object and enabling the virtual microphone, so our bot can play audio.
### Configuring the services
As described in the [Architecture Guide](../architecture.md), a 'Service' is a class that processes 'Frames' as part of a 'Pipeline'. In this demo app, we'll only need one service: a text-to-speech generator. We can create an instance of the `ElevenLabsTTSService` class with this line of code:
```python
tts = ElevenLabsTTSService(aiohttp_session=session, api_key=os.getenv("ELEVENLABS_API_KEY"), voice_id=os.getenv("ELEVENLABS_VOICE_ID"))
```
You'll need to set those environment variables before running the example. The easiest way to do that is to copy the `example.env` file in the repo, rename the copy to `.env`, and add your credentials to that file. `runner.py` loads the `python-dotenv` module and initializes it, making the values in that file available in the environment.
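If a variable is missing, `os.getenv()` quietly returns `None`, and the failure only shows up later inside the service. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper written for this illustration, not something the SDK provides.

```python
import os


def require_env(*names: str) -> dict[str, str]:
    """Return the named environment variables, raising if any are unset or empty."""
    missing = [name for name in names if not os.getenv(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: os.environ[name] for name in names}
```

Calling `require_env("ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID")` at the top of `main()` would surface a missing credential before any service is created.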
### Configuring event listeners
This part isn't strictly necessary for an app like this. You could include the contents of the `greet_user()` handler shown below directly in the body of the `main()` function, and it would run as soon as you started the script from the command line.
Instead, we can use an event handler to wait to run that code until someone else joins the meeting. We'll define a function called `greet_user()`, and use the `@transport.event_handler("on_participant_joined")` decorator to tell the SDK that we want to run that function whenever a user joins the room.
```python
@transport.event_handler("on_participant_joined")
async def greet_user(transport, participant):
    if participant["info"]["isLocal"]:
        return

    await tts.say(
        "Hello there, " + participant["info"]["userName"] + "!",
        transport.send_queue,
    )

    # wait for the output queue to be empty, then leave the meeting
    await transport.stop_when_done()
```
### Defining a processing pipeline
In this example, we don't actually have much of a processing pipeline! In fact, we're doing the whole thing inside the `greet_user()` function already.
Pipelines usually look like a series of nested calls to the `run()` or `run_to_queue()` functions of different Services. In this example, we're using the `say()` function from the TTS service. This is effectively a convenience wrapper around the `run_to_queue()` function, which we'll discuss more later. It's important to `await` this call so that the speech frames are queued for playback before the next line runs, because `stop_when_done()` is called immediately afterward.
The output of the `say()` function goes to the transport's `send_queue`. This queue is the all-important connection between the world of the Services pipeline that's generating frames asynchronously and the ordered playback of audio and visual media in the WebRTC call.
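The ordering guarantee described above can be sketched with a plain `asyncio.Queue`. Everything in this block is a stand-in written for illustration: `say()` and `play()` are hypothetical functions, not the SDK's, and `queue.join()` only approximates what `stop_when_done()` is described as doing.

```python
import asyncio


async def say(text: str, queue: asyncio.Queue) -> None:
    """Stand-in for a TTS call: queue one 'audio' item per word."""
    for word in text.split():
        await queue.put(f"audio({word})")


async def play(queue: asyncio.Queue, played: list) -> None:
    """Stand-in for the transport: consume queued items in order."""
    while True:
        item = await queue.get()
        played.append(item)
        queue.task_done()


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    played: list = []
    player = asyncio.create_task(play(queue, played))

    # Awaiting say() means every item is queued before we wait for the drain.
    await say("Hello there, friend!", queue)
    await queue.join()  # returns once every queued item has been played

    player.cancel()
    await asyncio.gather(player, return_exceptions=True)
    return played
```

If `say()` were started without `await`, the `join()` could return before anything had been queued, which is the failure mode the paragraph above warns about.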
### Running the coroutines
In this example, we don't actually have any separate processing pipelines—everything happens as a result of an event from the transport. So we only need to run the transport's coroutine, and await its completion:
```python
await transport.run()
```
In future examples, we'll run more processes in parallel. For now, this script can run until the transport exits—which will happen based on calling `stop_when_done()` in the `greet_user()` function.
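Running several coroutines side by side is typically done with `asyncio.gather()`. The sketch below shows only the general pattern; `run_transport` and `run_pipeline` are hypothetical placeholders, not SDK functions.

```python
import asyncio


async def run_transport(log: list) -> None:
    """Placeholder for transport.run(): stays alive for a short while."""
    log.append("transport started")
    await asyncio.sleep(0.01)
    log.append("transport finished")


async def run_pipeline(log: list) -> None:
    """Placeholder for a processing pipeline running alongside the transport."""
    log.append("pipeline started")
    await asyncio.sleep(0)
    log.append("pipeline finished")


async def main() -> list:
    log: list = []
    # Both coroutines run concurrently; gather() returns when both are done.
    await asyncio.gather(run_transport(log), run_pipeline(log))
    return log
```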
## Next Steps
Next, we'll start connecting multiple AI services together by building a service pipeline.
## [02 - LLM Say One Thing »](02-llm-say-one-thing.md)

View File

@@ -1,5 +0,0 @@
# Daily AI SDK Examples
The docs in this folder pair with the example apps located in `src/examples/foundational`. They are designed to serve as quick references for building different kinds of AI apps. But the examples also build on one another, so it can be really helpful to walk through them in order.
To start, you can learn about the overall structure of the examples in [01 - Say One Thing](01-say-one-thing.md).

46
docs/frame-progress.md Normal file
View File

@@ -0,0 +1,46 @@
# A Frame's Progress
1. A user says “Hello, LLM” and the cloud transcription service delivers a transcription to the Transport.
![A transcript frame arrives](images/frame-progress-01.png)
2. The Transport places a Transcription frame in the Pipeline's source queue.
![Frame in source queue](images/frame-progress-02.png)
3. The Pipeline passes the Transcription frame to the first Frame Processor in its list, the LLM User Message Aggregator.
![To UMA](images/frame-progress-03.png)
4. The LLM User Message Aggregator updates the LLM Context with a `{“user”: “Hello LLM”}` message.
![Update context](images/frame-progress-04.png)
5. The LLM User Message Aggregator yields an LLM Message Frame, containing the updated LLM Context. The Pipeline passes this frame to the LLM Frame Processor.
![Update context](images/frame-progress-05.png)
6. The LLM Frame Processor creates a streaming chat completion based on the LLM context and yields the first chunk of a response, a Text Frame with the value “Hi, ”. The Pipeline passes this frame to the TTS Frame Processor. The TTS Frame Processor aggregates this response but doesn't yield anything yet, because it's waiting for a full sentence.
![LLM yields Text](images/frame-progress-06.png)
7. The LLM Frame Processor yields another Text Frame with the value “there.”. The Pipeline passes this frame to the TTS Frame Processor.
![LLM yields more Text](images/frame-progress-07.png)
8. The TTS Frame Processor now has a full sentence, so it starts streaming audio based on “Hi, there.” It yields the first chunk of streaming audio as an Audio frame, which the Pipeline passes to the LLM Assistant Message Aggregator.
![TTS yields Audio](images/frame-progress-08.png)
9. The LLM Assistant Message Aggregator doesn't do anything with Audio frames, so it immediately yields the frame, unchanged. This is the convention for all Frame Processors: frames that the processor doesn't process should be immediately yielded.
![pass-through](images/frame-progress-09.png)
10. The Pipeline places the first Audio frame in its sink queue, which is being watched by the Transport. Since the frame is now in a queue, the Pipeline can continue processing other frames. Note that the source and sink queues form a sort of “boundary of concurrent processing” between a Pipeline and the outside world. In a Pipeline, Frames are processed sequentially; once a Frame is on a queue it can be processed in parallel with the frames being processed by the Pipeline. TODO: link to a more in-depth section about this.
![sink queue](images/frame-progress-10.png)
11. The TTS Frame Processor yields another Audio frame as the Transport transmits the first Audio frame.
![parallel audio](images/frame-progress-11.png)
12. As before, the LLM Assistant Message Aggregator immediately yields the Audio frame and the Pipeline places the Audio frame in the sink queue.
![sink queue 2](images/frame-progress-12.png)
13. The TTS Frame Processor has no more frames to yield. The LLM Frame Processor emits an LLM Response End Frame, which the Pipeline passes to the TTS Frame Processor.
![response end](images/frame-progress-13.png)
14. The TTS Frame Processor immediately yields the LLM Response End Frame, so the Pipeline passes it along to the LLM Assistant Message Aggregator. The LLM Assistant Message Aggregator updates the LLM Context with the full response from the LLM. TODO: The TTS Frame Processor also yields the Text frames that the LLM emitted, so that the LLM Assistant Message Aggregator can accumulate them; the steps above do not show this yet.
![response end](images/frame-progress-14.png)
15. The system is quiet, and waiting for the next message from the Transport.
![response end](images/frame-progress-15.png)
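The walkthrough above can be condensed into a small sketch. All class and function names here are hypothetical stand-ins rather than the framework's real classes, and the TTS stand-in is simplified: it swallows Text frames instead of also forwarding them, as the TODO in step 14 notes the real one does.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Frame:
    pass


@dataclass
class TextFrame(Frame):
    text: str


@dataclass
class AudioFrame(Frame):
    data: bytes


class SentenceTTS:
    """Buffers text until a sentence ends, then yields one audio frame."""

    def __init__(self) -> None:
        self.buffer = ""

    async def process_frame(self, frame: Frame):
        if isinstance(frame, TextFrame):
            self.buffer += frame.text
            if self.buffer.rstrip().endswith((".", "!", "?")):
                sentence, self.buffer = self.buffer, ""
                yield AudioFrame(sentence.encode())
        else:
            yield frame  # convention: unhandled frames pass through unchanged


class PassThrough:
    """A processor that handles nothing and yields every frame unchanged."""

    async def process_frame(self, frame: Frame):
        yield frame


async def run_pipeline(processors, source: asyncio.Queue, sink: asyncio.Queue) -> None:
    """Move frames from source to sink, through each processor in order."""
    while not source.empty():
        frames = [source.get_nowait()]
        for processor in processors:
            next_frames = []
            for frame in frames:
                async for out in processor.process_frame(frame):
                    next_frames.append(out)
            frames = next_frames
        for frame in frames:
            await sink.put(frame)
```

Feeding `TextFrame("Hi, ")` and then `TextFrame("there.")` through `[SentenceTTS(), PassThrough()]` leaves a single audio frame for the whole sentence in the sink queue, mirroring steps 6 to 10.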

110
docs/frame.md Normal file
View File

@@ -0,0 +1,110 @@
# Understanding Different Frame Types in the Pipecat System
In the Pipecat system, frames are used to represent different types of data and control signals that flow through the pipeline. Understanding these frame types is crucial for working with the system effectively. This tutorial will cover the main categories of frames and their specific uses.
## 1. Base Frame Classes
### Frame
The `Frame` class is the base class for all frames. It includes:
- `id`: A unique identifier
- `name`: A descriptive name
- `pts`: Presentation timestamp (optional)
### DataFrame
`DataFrame` is a subclass of `Frame` and serves as a base for most data-carrying frames.
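To make the hierarchy concrete, here is a minimal sketch of how such base classes could be modelled with dataclasses. It is an illustration under assumptions, not the Pipecat source: the ID counter, the naming scheme, and the integer type of `pts` are all choices made for this example.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_frame_ids = itertools.count()  # assumed ID scheme: a simple global counter


@dataclass
class Frame:
    id: int = field(init=False)
    name: str = field(init=False)
    pts: Optional[int] = field(default=None, init=False)

    def __post_init__(self) -> None:
        self.id = next(_frame_ids)
        self.name = f"{self.__class__.__name__}#{self.id}"


@dataclass
class DataFrame(Frame):
    """Base for frames that carry data."""


@dataclass
class TextFrame(DataFrame):
    text: str
```

Because `id`, `name`, and `pts` are excluded from `__init__`, a subclass only declares its own payload fields, so `TextFrame("hello")` is all a caller has to write.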
## 2. Audio Frames
### AudioRawFrame
Represents a chunk of audio with properties:
- `audio`: Raw audio data
- `sample_rate`: Audio sample rate
- `num_channels`: Number of audio channels
Subclasses include:
- `InputAudioRawFrame`: For audio from input sources
- `OutputAudioRawFrame`: For audio to be played by output devices
- `TTSAudioRawFrame`: For audio generated by Text-to-Speech services
## 3. Image Frames
### ImageRawFrame
Represents an image with properties:
- `image`: Raw image data
- `size`: Image dimensions
- `format`: Image format (e.g., JPEG, PNG)
Subclasses include:
- `InputImageRawFrame`: For images from input sources
- `OutputImageRawFrame`: For images to be displayed
- `UserImageRawFrame`: For images associated with a specific user
- `VisionImageRawFrame`: For images with associated text for description
- `URLImageRawFrame`: For images with an associated URL
### SpriteFrame
Represents an animated sprite, containing a list of `ImageRawFrame` objects.
## 4. Text and Transcription Frames
### TextFrame
Represents a chunk of text, used for various purposes in the pipeline.
### TranscriptionFrame
A specialized `TextFrame` for speech transcriptions, including:
- `user_id`: ID of the speaking user
- `timestamp`: When the transcription was generated
- `language`: Detected language of the speech
### InterimTranscriptionFrame
Similar to `TranscriptionFrame`, but for interim (not final) transcriptions.
## 5. LLM (Language Model) Frames
### LLMMessagesFrame
Contains a list of messages for an LLM service to process.
### LLMMessagesAppendFrame and LLMMessagesUpdateFrame
Used to modify the current context of LLM messages.
### LLMSetToolsFrame
Specifies tools (functions) available for the LLM to use.
### LLMEnablePromptCachingFrame
Controls prompt caching in certain LLMs.
## 6. System and Control Frames
### SystemFrame
Base class for system-level frames.
Important system frames include:
- `StartFrame`: Initiates a pipeline
- `CancelFrame`: Stops a pipeline immediately
- `ErrorFrame`: Notifies of errors (with `FatalErrorFrame` for unrecoverable errors)
- `EndTaskFrame` and `CancelTaskFrame`: Control pipeline tasks
- `StartInterruptionFrame` and `StopInterruptionFrame`: Indicate user speech for interruptions
### ControlFrame
Base class for control-flow frames.
Notable control frames:
- `EndFrame`: Signals the end of a pipeline
- `LLMFullResponseStartFrame` and `LLMFullResponseEndFrame`: Bracket LLM responses
- `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame`: Indicate user speech activity
- `BotStartedSpeakingFrame` and `BotStoppedSpeakingFrame`: Indicate bot speech activity
- `TTSStartedFrame` and `TTSStoppedFrame`: Bracket Text-to-Speech responses
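The "bracketing" role of start and end frames can be sketched as follows. The classes below are bare stand-ins that only share names with the frames listed above, and `collect_responses` is a hypothetical helper; real code would import the frame classes from Pipecat instead.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class TextFrame:
    text: str


class LLMFullResponseStartFrame:
    pass


class LLMFullResponseEndFrame:
    pass


def collect_responses(frames: Iterable[object]) -> List[str]:
    """Join the text frames found between each start/end pair."""
    responses: List[str] = []
    current: Optional[List[str]] = None  # None means we are outside a response
    for frame in frames:
        if isinstance(frame, LLMFullResponseStartFrame):
            current = []
        elif isinstance(frame, LLMFullResponseEndFrame):
            if current is not None:
                responses.append("".join(current))
            current = None
        elif isinstance(frame, TextFrame) and current is not None:
            current.append(frame.text)
    return responses
```

Text that arrives outside a start/end pair is ignored by this sketch, which is one possible policy rather than a documented behaviour.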
## 7. Special Purpose Frames
### MetricsFrame
Contains performance metrics data.
### FunctionCallInProgressFrame and FunctionCallResultFrame
Used for handling LLM function (tool) calls.
### ServiceUpdateSettingsFrame
Base class for updating service settings, with specific subclasses for LLM, TTS, and STT services.
## Conclusion
Understanding these frame types is essential for working with the Pipecat system. Each frame type serves a specific purpose in the pipeline, whether it's carrying data (like audio or images), controlling the flow of the pipeline, or managing system-level operations. By using the appropriate frame types, you can effectively process and transmit various kinds of information through your pipeline.

86
dot-env.template Normal file
View File

@@ -0,0 +1,86 @@
# Anthropic
ANTHROPIC_API_KEY=...
# AWS
AWS_SECRET_ACCESS_KEY=...
AWS_ACCESS_KEY_ID=...
AWS_REGION=...
# Azure
AZURE_SPEECH_REGION=...
AZURE_SPEECH_API_KEY=...
AZURE_CHATGPT_API_KEY=...
AZURE_CHATGPT_ENDPOINT=https://...
AZURE_CHATGPT_MODEL=...
AZURE_DALLE_API_KEY=...
AZURE_DALLE_ENDPOINT=https://...
AZURE_DALLE_MODEL=...
# Daily
DAILY_API_KEY=...
DAILY_SAMPLE_ROOM_URL=https://...
# ElevenLabs
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
# Fal
FAL_KEY=...
# Fireworks
FIREWORKS_API_KEY=...
# Gladia
GLADIA_API_KEY=...
# LMNT
LMNT_API_KEY=...
LMNT_VOICE_ID=...
# PlayHT
PLAY_HT_USER_ID=...
PLAY_HT_API_KEY=...
# OpenAI
OPENAI_API_KEY=...
# OpenPipe
OPENPIPE_API_KEY=...
# Tavus
TAVUS_API_KEY=...
TAVUS_REPLICA_ID=...
TAVUS_PERSONA_ID=...
# Simli
SIMLI_API_KEY=...
SIMLI_FACE_ID=...
# Krisp
KRISP_MODEL_PATH=...
# DeepSeek
DEEPSEEK_API_KEY=...
# Groq
GROQ_API_KEY=...
# Grok
GROK_API_KEY=...
# Together.ai
TOGETHER_API_KEY=...
# Cerebras
CEREBRAS_API_KEY=...
# Fish Audio
FISH_API_KEY=...
# Assembly AI
ASSEMBLYAI_API_KEY=...
# OpenRouter
OPENROUTER_API_KEY=...

88
examples/README.md Normal file
View File

@@ -0,0 +1,88 @@
# Pipecat &mdash; Examples
## Foundational snippets
Small snippets that build on each other, introducing one or two concepts at a time.
➡️ [Take a look](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational)
## Chatbot examples
Collection of self-contained real-time voice and video AI demo applications built with Pipecat.
### Quickstart
Each project has its own set of dependencies and configuration variables. They intentionally avoid shared code across projects, so you can grab whichever demo folder you want to work with as a starting point.
We recommend you start with a virtual environment:
```shell
cd pipecat-ai/examples/simple-chatbot
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Next, follow the steps in the README for each demo.
Make sure you `pip install -r requirements.txt` for each demo project, so you can be sure to have the necessary service dependencies that extend the functionality of Pipecat. You can read more about the framework architecture [here](https://github.com/pipecat-ai/pipecat/tree/main/docs).
## Projects:
| Project | Description | Services |
|----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------|
| [Simple Chatbot](simple-chatbot) | Basic voice-driven conversational bot. A good starting point for learning the flow of the framework. | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI |
| [Storytelling Chatbot](storytelling-chatbot) | Stitches together multiple third-party services to create a collaborative storytime experience. | Deepgram, ElevenLabs, OpenAI, Fal, Daily, Custom UI |
| [Translation Chatbot](translation-chatbot) | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI |
| [Moondream Chatbot](moondream-chatbot) | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU** | Deepgram, ElevenLabs, OpenAI, Moondream, Daily, Daily Prebuilt UI |
| [Patient intake](patient-intake) | A chatbot that can call functions in response to user input. | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI |
| [Phone Chatbot](phone-chatbot) | A chatbot that connects to PSTN/SIP phone calls, powered by Daily or Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [Twilio Chatbot](twilio-chatbot) | A chatbot that connects to an incoming phone call from Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [studypal](studypal) | A chatbot to have a conversation about any article on the web | |
| [WebSocket Chatbot Server](websocket-server) | A real-time websocket server that handles audio streaming and bot interactions with speech-to-text and text-to-speech capabilities. | Cartesia, Deepgram, OpenAI, Websockets |
> [!IMPORTANT]
> These example projects use Daily as a WebRTC transport and can be joined using their hosted Prebuilt UI.
> It provides a quick way to join a real-time session with your bot and test your ideas without building any frontend code. If you'd like to see an example of a custom UI, try Storybot.
## FAQ
### Deployment
For each of these demos we've included a `Dockerfile`. Out of the box, this should provide everything needed to get the respective demo running on a VM:
```shell
docker build -t username/app:tag .
docker run -p 7860:7860 --env-file ./.env username/app:tag
docker push ...
```
### SSL
If you're working with a custom UI (such as with the Storytelling Chatbot), it's important to ensure your deployment platform supports HTTPS, as accessing user devices such as mics and webcams requires SSL.
If you try to run a custom UI without SSL, you may see an error in the console telling you that `navigator` is undefined, or no devices are available.
### Are these examples production ready?
Yes, kind of.
These demos attempt to keep things simple and are unopinionated regarding environment or scalability.
We're using FastAPI to spawn a subprocess for the bots / agents &mdash; useful for small tests, but not so great for production grade apps with many concurrent users. You can see how this works in each project's `start` endpoint in `server.py`.
Creating virtualized worker pools and on-demand instances is out of scope for these examples, but we hope to add some examples to this repo soon!
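The subprocess-per-bot approach boils down to something like the sketch below. It is a simplified stand-in for what a `start` endpoint might do: `start_bot` is a hypothetical helper, and the child command is a placeholder that just sleeps, where a real server would launch the bot script with the room URL and token.

```python
import subprocess
import sys

# Track running bots: {pid: (process, room_url)}
bot_procs: dict = {}


def start_bot(room_url: str) -> int:
    """Spawn one bot process for a room and remember it by PID."""
    # Placeholder command; a real server would run the bot script here.
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    bot_procs[proc.pid] = (proc, room_url)
    return proc.pid


def cleanup() -> None:
    """Terminate every tracked bot process, e.g. on server shutdown."""
    for proc, _room_url in bot_procs.values():
        proc.terminate()
        proc.wait()
    bot_procs.clear()
```

One process per session is easy to reason about, but each bot holds a full interpreter, which is why this pattern does not scale to many concurrent users.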
For projects that have CUDA as a requirement, such as Moondream Chatbot, be sure to deploy to a GPU-powered platform (such as [fly.io](https://fly.io) or [Runpod](https://runpod.io)).
## Getting help
➡️ [Join our Discord](https://discord.gg/pipecat)
➡️ [Reach us on Twitter](https://x.com/pipecat_ai)

View File

@@ -0,0 +1,45 @@
# Bot ready signaling
A simple Pipecat example demonstrating how to handle signaling between the client and the bot,
ensuring that the bot starts sending audio only when the client is available,
thereby avoiding the risk of cutting off the beginning of the audio.
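The idea can be sketched on the bot side with an `asyncio.Event`: audio is held back until the client reports that playback is possible. `ReadyGate` is a hypothetical class written for this illustration; the `"playable"` message matches what the JavaScript client in this example sends, but the way the bot actually consumes it may differ.

```python
import asyncio


class ReadyGate:
    """Hold outgoing audio until the client has signalled it can play it."""

    def __init__(self) -> None:
        self._client_ready = asyncio.Event()
        self.sent: list = []

    def on_app_message(self, message: str) -> None:
        # The client sends "playable" once its audio element starts playing.
        if message == "playable":
            self._client_ready.set()

    async def send_audio(self, chunk: str) -> None:
        await self._client_ready.wait()  # blocks until the client is ready
        self.sent.append(chunk)


async def demo():
    gate = ReadyGate()
    sender = asyncio.create_task(gate.send_audio("hello"))
    await asyncio.sleep(0)  # let the sender run; it should be waiting
    before = list(gate.sent)
    gate.on_app_message("playable")
    await sender
    return before, gate.sent
```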
## Quick Start
### First, start the bot server:
1. Navigate to the server directory:
```bash
cd server
```
2. Create and activate a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install requirements:
```bash
pip install -r requirements.txt
```
4. Copy env.example to .env and configure:
- Add your API keys
5. Start the server:
```bash
python server.py
```
### Next, connect using the client app:
For client-side setup, refer to the [JavaScript Guide](client/javascript/README.md).
## Important Note
Ensure the bot server is running before using any client implementations.
## Requirements
- Python 3.10+
- Node.js 16+ (for JavaScript)
- Daily API key
- Cartesia API key
- Modern web browser with WebRTC support

View File

@@ -0,0 +1,27 @@
# JavaScript Implementation
Basic implementation using the [Pipecat JavaScript SDK](https://docs.pipecat.ai/client/js/introduction).
## Setup
1. Run the bot server. See the [server README](../../README).
2. Navigate to the `client/javascript` directory:
```bash
cd client/javascript
```
3. Install dependencies:
```bash
npm install
```
4. Run the client app:
```bash
npm run dev
```
5. Visit http://localhost:5173 in your browser.

View File

@@ -0,0 +1,34 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AI Chatbot</title>
</head>
<body>
<div class="container">
<div class="status-bar">
<div class="status">
Status: <span id="connection-status">Disconnected</span>
</div>
<div class="controls">
<button id="connect-btn">Connect</button>
<button id="disconnect-btn" disabled>Disconnect</button>
</div>
</div>
<audio id="bot-audio" autoplay></audio>
<div class="debug-panel">
<h3>Debug Info</h3>
<div id="debug-log"></div>
</div>
</div>
<script type="module" src="/src/app.js"></script>
<link rel="stylesheet" href="/src/style.css">
</body>
</html>

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,20 @@
{
"name": "client",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"devDependencies": {
"vite": "^6.0.2"
},
"dependencies": {
"@daily-co/daily-js": "0.74.0"
}
}

View File

@@ -0,0 +1,216 @@
/**
* Copyright (c) 2024–2025, Daily
*
* SPDX-License-Identifier: BSD 2-Clause License
*/
import Daily from "@daily-co/daily-js";
/**
* ChatbotClient handles the connection and media management for a real-time
* voice interaction with an AI bot.
*/
class ChatbotClient {
constructor() {
// Initialize client state
this.dailyCallObject = null;
this.setupDOMElements();
this.setupEventListeners();
}
/**
* Set up references to DOM elements and create necessary media elements
*/
setupDOMElements() {
// Get references to UI control elements
this.connectBtn = document.getElementById('connect-btn');
this.disconnectBtn = document.getElementById('disconnect-btn');
this.statusSpan = document.getElementById('connection-status');
this.debugLog = document.getElementById('debug-log');
// Create an audio element for bot's voice output
this.botAudio = document.createElement('audio');
this.botAudio.autoplay = true;
this.botAudio.playsInline = true;
document.body.appendChild(this.botAudio);
}
/**
* Set up event listeners for connect/disconnect buttons
*/
setupEventListeners() {
this.connectBtn.addEventListener('click', () => this.connect());
this.disconnectBtn.addEventListener('click', () => this.disconnect());
}
/**
* Add a timestamped message to the debug log
*/
log(message) {
const entry = document.createElement('div');
entry.textContent = `${new Date().toISOString()} - ${message}`;
// Add styling based on message type
if (message.startsWith('User: ')) {
entry.style.color = '#2196F3'; // blue for user
} else if (message.startsWith('Bot: ')) {
entry.style.color = '#4CAF50'; // green for bot
}
this.debugLog.appendChild(entry);
this.debugLog.scrollTop = this.debugLog.scrollHeight;
console.log(message);
}
/**
* Update the connection status display
*/
updateStatus(status) {
this.statusSpan.textContent = status;
this.log(`Status: ${status}`);
}
handleEventToConsole (evt) {
this.log(`Received event: ${evt.action}`);
};
/**
* Set up listeners for track events (start/stop)
* This handles new tracks being added during the session
*/
setupTrackListeners() {
if (!this.dailyCallObject) return;
this.dailyCallObject.on("joined-meeting", () => {
this.updateStatus('Connected');
this.connectBtn.disabled = true;
this.disconnectBtn.disabled = false;
this.log('Client connected');
});
this.dailyCallObject.on("track-started", (evt) => {
if (evt.track.kind === "audio" && evt.participant.local === false) {
this.log("Audio track started.")
this.setupAudioTrack(evt.track);
}
});
this.dailyCallObject.on("track-stopped", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-joined", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-updated", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-left", () => {
// When the bot leaves, we are also disconnecting from the call
this.disconnect()
});
this.dailyCallObject.on("left-meeting", () => {
this.updateStatus('Disconnected');
this.connectBtn.disabled = false;
this.disconnectBtn.disabled = true;
this.log('Client disconnected');
});
this.dailyCallObject.on("error", this.handleEventToConsole.bind(this));
}
/**
* Set up an audio track for playback
* Handles both initial setup and track updates
*/
setupAudioTrack(track) {
this.log(`Setting up audio track, track state: ${track.readyState}, muted: ${track.muted}`);
// Check if we're already playing this track
if (this.botAudio.srcObject) {
const oldTrack = this.botAudio.srcObject.getAudioTracks()[0];
if (oldTrack?.id === track.id) return;
}
// Create a new MediaStream with the track and set it as the audio source
this.botAudio.srcObject = new MediaStream([track]);
this.botAudio.onplaying = async (event) => {
this.log("onplaying")
this.log("Will send the audio message to play the audio at the next tick")
this.dailyCallObject.sendAppMessage("playable")
}
}
async fetchRoomInfo() {
let connectUrl = '/connect'
let res = await fetch(connectUrl, {
method: "POST",
mode: "cors",
headers: new Headers({
"Content-Type": "application/json"
}),
})
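// Fail loudly so connect() reports the real cause instead of a TypeError
if (!res.ok) {
throw new Error(`Failed to fetch room info: HTTP ${res.status}`);
}
return res.json();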
}
/**
* Initialize and connect to the bot
* This creates the Daily call object, asks the server for a room, and joins it
*/
async connect() {
try {
// Initialize the client
this.dailyCallObject = Daily.createCallObject({
subscribeToTracksAutomatically: true,
});
// Set up listeners for media track events
this.setupTrackListeners();
this.log('Creating the bot...');
let roomInfo = await this.fetchRoomInfo()
// Connect to the bot
this.log('Connecting to bot...');
// Expose the call object globally, only to make debugging easier
window.callObject = this.dailyCallObject;
await this.dailyCallObject.join({
url: roomInfo.room_url,
});
this.log('Connection complete');
} catch (error) {
// Handle any errors during connection
this.log(`Error connecting: ${error.message}`);
this.log(`Error stack: ${error.stack}`);
this.updateStatus('Error');
// Clean up if there's an error
if (this.dailyCallObject) {
try {
await this.dailyCallObject.leave();
} catch (disconnectError) {
this.log(`Error during disconnect: ${disconnectError.message}`);
}
}
}
}
/**
* Disconnect from the bot and clean up media resources
*/
async disconnect() {
if (this.dailyCallObject) {
try {
// Leave the call and destroy the Daily call object
await this.dailyCallObject.leave();
await this.dailyCallObject.destroy();
this.dailyCallObject = null;
// Clean up audio
if (this.botAudio.srcObject) {
this.botAudio.srcObject.getTracks().forEach((track) => track.stop());
this.botAudio.srcObject = null;
}
} catch (error) {
this.log(`Error disconnecting: ${error.message}`);
}
}
}
}
// Initialize the client when the page loads
window.addEventListener('DOMContentLoaded', () => {
new ChatbotClient();
});

View File

@@ -0,0 +1,98 @@
body {
margin: 0;
padding: 20px;
font-family: Arial, sans-serif;
background-color: #f0f0f0;
}
.container {
max-width: 1200px;
margin: 0 auto;
}
.status-bar {
display: flex;
justify-content: space-between;
align-items: center;
padding: 10px;
background-color: #fff;
border-radius: 8px;
margin-bottom: 20px;
}
.controls button {
padding: 8px 16px;
margin-left: 10px;
border: none;
border-radius: 4px;
cursor: pointer;
}
#connect-btn {
background-color: #4caf50;
color: white;
}
#disconnect-btn {
background-color: #f44336;
color: white;
}
button:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.main-content {
background-color: #fff;
border-radius: 8px;
padding: 20px;
margin-bottom: 20px;
}
.bot-container {
display: flex;
flex-direction: column;
align-items: center;
}
#bot-video-container {
width: 640px;
height: 360px;
background-color: #e0e0e0;
border-radius: 8px;
margin: 20px auto;
overflow: hidden;
display: flex;
align-items: center;
justify-content: center;
}
#bot-video-container video {
width: 100%;
height: 100%;
object-fit: cover;
}
.debug-panel {
background-color: #fff;
border-radius: 8px;
padding: 20px;
}
.debug-panel h3 {
margin: 0 0 10px 0;
font-size: 16px;
font-weight: bold;
}
#debug-log {
height: 200px;
overflow-y: auto;
background-color: #f8f8f8;
padding: 10px;
border-radius: 4px;
font-family: monospace;
font-size: 12px;
line-height: 1.4;
}

View File

@@ -0,0 +1,13 @@
import { defineConfig } from 'vite';
export default defineConfig({
server: {
proxy: {
// Proxy /connect requests to the backend server
'/connect': {
target: 'http://0.0.0.0:7860', // Replace with your backend URL
changeOrigin: true,
},
},
},
});

View File

@@ -0,0 +1,50 @@
# Bot ready signaling Server
A FastAPI server that manages bot instances and provides an endpoint for Pipecat client connections.
## Endpoints
- `POST /connect` - Pipecat client connection endpoint
## Environment Variables
Copy `env.example` to `.env` and configure:
```ini
# Required API Keys
DAILY_API_KEY= # Your Daily API key
CARTESIA_API_KEY= # Your Cartesia API key
# Optional Configuration
DAILY_API_URL= # Optional: Daily API URL (defaults to https://api.daily.co/v1)
DAILY_SAMPLE_ROOM_URL= # Optional: Fixed room URL for development
HOST= # Optional: Host address (defaults to 0.0.0.0)
FAST_API_PORT= # Optional: Port number (defaults to 7860)
```
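As a sketch of how the optional values and their documented defaults could be resolved, consider the helper below. `server_config` is hypothetical and not part of `server.py`; it treats an empty value, such as a bare `HOST=` line in `.env`, the same as an unset one.

```python
import os
from typing import Mapping, Tuple


def server_config(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Return (host, port), falling back to the documented defaults."""
    host = env.get("HOST") or "0.0.0.0"
    port = int(env.get("FAST_API_PORT") or "7860")
    return host, port
```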
## Running the Server
Set up and activate your virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
Install dependencies:
```bash
pip install -r requirements.txt
```
If you want to use the local version of `pipecat` in this repo rather than the last published version, also run:
```bash
pip install --editable "../../../[daily,cartesia,openai]"
```
Run the server:
```bash
python server.py
```

View File

@@ -0,0 +1,3 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=
CARTESIA_API_KEY=

View File

@@ -0,0 +1,4 @@
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,cartesia,openai]

View File

@@ -0,0 +1,63 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
    (url, token, _) = await configure_with_args(aiohttp_session)
    return (url, token)


async def configure_with_args(
    aiohttp_session: aiohttp.ClientSession, parser: argparse.ArgumentParser | None = None
):
    if not parser:
        parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
    parser.add_argument(
        "-u", "--url", type=str, required=False, help="URL of the Daily room to join"
    )
    parser.add_argument(
        "-k",
        "--apikey",
        type=str,
        required=False,
        help="Daily API Key (needed to create an owner token for the room)",
    )

    args, unknown = parser.parse_known_args()

    url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
    key = args.apikey or os.getenv("DAILY_API_KEY")

    if not url:
        raise Exception(
            "No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
        )

    if not key:
        raise Exception(
            "No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
        )

    daily_rest_helper = DailyRESTHelper(
        daily_api_key=key,
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )

    # Create a meeting token for the given room with an expiration 1 hour in
    # the future.
    expiry_time: float = 60 * 60

    token = await daily_rest_helper.get_token(url, expiry_time)

    return (url, token, args)

View File

@@ -0,0 +1,147 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
from typing import Any, Dict

import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse

from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams

# Load environment variables from .env file
load_dotenv(override=True)

# Dictionary to track bot processes: {pid: (process, room_url)}
bot_procs = {}

# Store Daily API helpers
daily_helpers = {}


def cleanup():
    """Cleanup function to terminate all bot processes.

    Called during server shutdown.
    """
    for entry in bot_procs.values():
        proc = entry[0]
        proc.terminate()
        proc.wait()


@asynccontextmanager
async def lifespan(app: FastAPI):
    """FastAPI lifespan manager that handles startup and shutdown tasks.

    - Creates aiohttp session
    - Initializes Daily API helper
    - Cleans up resources on shutdown
    """
    aiohttp_session = aiohttp.ClientSession()
    daily_helpers["rest"] = DailyRESTHelper(
        daily_api_key=os.getenv("DAILY_API_KEY", ""),
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )
    yield
    await aiohttp_session.close()
    cleanup()


# Initialize FastAPI app with lifespan manager
app = FastAPI(lifespan=lifespan)

# Configure CORS to allow requests from any origin
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


async def create_room_and_token() -> tuple[str, str]:
    """Helper function to create a Daily room and generate an access token.

    Returns:
        tuple[str, str]: A tuple containing (room_url, token)

    Raises:
        HTTPException: If room creation or token generation fails
    """
    room = await daily_helpers["rest"].create_room(DailyRoomParams())
    if not room.url:
        raise HTTPException(status_code=500, detail="Failed to create room")

    token = await daily_helpers["rest"].get_token(room.url)
    if not token:
        raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")

    return room.url, token


@app.post("/connect")
async def bot_connect(request: Request) -> Dict[Any, Any]:
    """Connect endpoint that creates a room and returns connection credentials.

    This endpoint is called by client to establish a connection.

    Returns:
        Dict[Any, Any]: Authentication bundle containing room_url and token

    Raises:
        HTTPException: If room creation, token generation, or bot startup fails
    """
    print("Creating room for RTVI connection")
    room_url, token = await create_room_and_token()
    print(f"Room URL: {room_url}")

    # Start the bot process
    try:
        bot_file = "signalling_bot"
        proc = subprocess.Popen(
            [f"python3 -m {bot_file} -u {room_url} -t {token}"],
            shell=True,
            bufsize=1,
            cwd=os.path.dirname(os.path.abspath(__file__)),
        )
        bot_procs[proc.pid] = (proc, room_url)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")

    # Return the authentication bundle in format expected by DailyTransport
    return {"room_url": room_url, "token": token}


if __name__ == "__main__":
    import uvicorn

    # Parse command line arguments for server configuration
    default_host = os.getenv("HOST", "0.0.0.0")
    default_port = int(os.getenv("FAST_API_PORT", "7860"))

    parser = argparse.ArgumentParser(description="Daily Travel Companion FastAPI server")
    parser.add_argument("--host", type=str, default=default_host, help="Host address")
    parser.add_argument("--port", type=int, default=default_port, help="Port number")
    parser.add_argument("--reload", action="store_true", help="Reload code on change")

    config = parser.parse_args()

    # Start the FastAPI server
    uvicorn.run(
        "server:app",
        host=config.host,
        port=config.port,
        reload=config.reload,
    )


@@ -0,0 +1,93 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
from dataclasses import dataclass

import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure

from pipecat.frames.frames import AudioRawFrame, EndFrame, OutputAudioRawFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport

load_dotenv(override=True)

logger.remove(0)
logger.add(sys.stderr, level="DEBUG")


@dataclass
class SilenceFrame(OutputAudioRawFrame):
    def __init__(
        self,
        audio: bytes = None,
        sample_rate: int = 16000,
        num_channels: int = 1,
        duration: float = 0.1,
    ):
        # Initialize the parent class with the silent frame's data
        super().__init__(
            audio=self.create_silent_audio_frame(sample_rate, num_channels, duration).audio,
            sample_rate=sample_rate,
            num_channels=num_channels,
        )

    @staticmethod
    def create_silent_audio_frame(
        sample_rate: int, num_channels: int, duration: float
    ) -> AudioRawFrame:
        """Create an AudioRawFrame containing silence."""
        frame_size = num_channels * 2  # 2 bytes per sample for 16-bit audio
        total_frames = int(sample_rate * duration)
        total_bytes = total_frames * frame_size
        silent_audio = bytes(total_bytes)  # Create a bytes object filled with zeros
        return AudioRawFrame(audio=silent_audio, sample_rate=sample_rate, num_channels=num_channels)


async def main():
    async with aiohttp.ClientSession() as session:
        (room_url, _) = await configure(session)

        transport = DailyTransport(
            room_url, None, "Say One Thing", DailyParams(audio_out_enabled=True)
        )

        tts = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
            voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22",  # British Lady
        )

        runner = PipelineRunner()

        task = PipelineTask(Pipeline([tts, transport.output()]))

        # Register an event handler so we can play the audio when we receive a specific message
        @transport.event_handler("on_app_message")
        async def on_app_message(transport, message, sender):
            logger.debug(f"Received app message: {message} - {sender}")
            if "playable" not in message:
                return
            await task.queue_frames(
                [
                    SilenceFrame(duration=0.5),
                    TTSSpeakFrame("Hello there, how are you doing today?"),
                    EndFrame(),
                ]
            )

        await runner.run(task)


if __name__ == "__main__":
    asyncio.run(main())

examples/canonical-metrics/.gitignore

@@ -0,0 +1,161 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
recordings/
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
runpod.toml


@@ -0,0 +1,10 @@
FROM python:3.10-bullseye
RUN mkdir /app
COPY *.py /app/
COPY requirements.txt /app/
WORKDIR /app
RUN pip3 install -r requirements.txt
EXPOSE 7860
CMD ["python3", "server.py"]


@@ -0,0 +1,66 @@
# Chatbot with canonical-metrics
This project implements a chatbot as a pipeline that combines audio processing, transcription, and a language model for conversational interactions. The chatbot runs in a Daily room and uses external services for text-to-speech and language model responses.
## Features
- **Audio Input and Output**: Captures microphone input and plays back audio responses.
- **Voice Activity Detection**: Utilizes Silero VAD to manage audio input intelligently.
- **Text-to-Speech**: Integrates ElevenLabs TTS service to convert text responses into audio.
- **Language Model Interaction**: Uses OpenAI's GPT-4o model (`gpt-4o`) to generate responses based on user input.
- **Transcription Services**: Captures and transcribes participant speech for analytics.
- **Metrics Collection**: Sends audio data for analysis via Canonical Metrics Service.
## Requirements
- Python 3.10+
- `python-dotenv`
- Additional libraries from the `pipecat` package.
## Setup
1. Clone the repository.
2. Install the required packages.
3. Set up environment variables for API keys (see `env.example`):
   - `DAILY_API_KEY`
   - `OPENAI_API_KEY`
   - `ELEVENLABS_API_KEY`
   - `CANONICAL_API_KEY`
   - `CANONICAL_API_URL`
4. Run the script.
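
Before running the script, it can help to confirm that the keys from step 3 are actually set. The following is a minimal sketch and not part of this project: `missing_env_vars` is a hypothetical helper, and treating an empty value as missing is an assumption. It reads the process environment only, so variables kept in `.env` need to be exported or loaded (for example with `python-dotenv`) first.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    # Key names taken from the setup step and env.example.
    # CANONICAL_API_URL is left out on purpose because env.example leaves it empty.
    required = [
        "DAILY_API_KEY",
        "OPENAI_API_KEY",
        "ELEVENLABS_API_KEY",
        "CANONICAL_API_KEY",
    ]
    missing = missing_env_vars(required)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary as `environ` makes the helper easy to try without touching the real environment.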
## Usage
The chatbot introduces itself and engages in conversations, providing brief and creative responses. Designed for flexibility, it can support multiple languages with appropriate configuration.
## Events
- Participants joining or leaving the call are handled dynamically, adjusting the chatbot's behavior accordingly.
The first run may take longer to start because the VAD (Voice Activity Detection) model needs to be downloaded.
## Get started
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp env.example .env # and add your credentials
```
## Run the server
```bash
python server.py
```
Then, visit `http://localhost:7860/` in your browser to start a chatbot session.
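
The `server.py` in this example also exposes a `GET /status/{pid}` route that reports whether a bot subprocess is still running. The sketch below shows one way to query it with the standard library. `status_url`, `parse_status`, and `fetch_status` are hypothetical helper names, and the process id has to be obtained separately, since the server does not return it to the browser.

```python
import json
import urllib.request


def status_url(base_url: str, pid: int) -> str:
    """Build the URL of the status route for a bot process id."""
    return f"{base_url.rstrip('/')}/status/{pid}"


def parse_status(body: bytes) -> str:
    """Extract the "status" field from the JSON body returned by the route."""
    return json.loads(body)["status"]


def fetch_status(base_url: str, pid: int, timeout: float = 5.0) -> str:
    """Query a running server; an unknown pid raises urllib.error.HTTPError (404)."""
    with urllib.request.urlopen(status_url(base_url, pid), timeout=timeout) as resp:
        return parse_status(resp.read())
```

With the server running locally, `fetch_status("http://localhost:7860", pid)` should return either `"running"` or `"finished"`, the two values set in `get_status`.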
## Build and test the Docker image
```bash
docker build -t chatbot .
docker run --env-file .env -p 7860:7860 chatbot
```


@@ -0,0 +1,148 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import uuid

import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.services.canonical import CanonicalMetricsService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport

load_dotenv(override=True)

logger.remove(0)
logger.add(sys.stderr, level="DEBUG")


async def main():
    async with aiohttp.ClientSession() as session:
        (room_url, token) = await configure(session)

        transport = DailyTransport(
            room_url,
            token,
            "Chatbot",
            DailyParams(
                audio_out_enabled=True,
                audio_in_enabled=True,
                camera_out_enabled=False,
                vad_enabled=True,
                vad_audio_passthrough=True,
                vad_analyzer=SileroVADAnalyzer(),
                transcription_enabled=True,
                #
                # Spanish
                #
                # transcription_settings=DailyTranscriptionSettings(
                #     language="es",
                #     tier="nova",
                #     model="2-general"
                # )
            ),
        )

        tts = ElevenLabsTTSService(
            api_key=os.getenv("ELEVENLABS_API_KEY"),
            #
            # English
            #
            voice_id="cgSgspJ2msm6clMCkdW9",
            aiohttp_session=session,
            #
            # Spanish
            #
            # model="eleven_multilingual_v2",
            # voice_id="gD1IexrzCvsXPHUuT0s3",
        )

        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")

        messages = [
            {
                "role": "system",
                #
                # English
                #
                "content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your responses to 12 words or fewer.",
                #
                # Spanish
                #
                # "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
            },
        ]

        context = OpenAILLMContext(messages)
        context_aggregator = llm.create_context_aggregator(context)

        """
        CanonicalMetrics uses AudioBufferProcessor under the hood to buffer the audio. On
        call completion, CanonicalMetrics will send the audio buffer to Canonical for
        analysis. Visit https://voice.canonical.chat to learn more.
        """
        audio_buffer_processor = AudioBufferProcessor(num_channels=2)
        canonical = CanonicalMetricsService(
            audio_buffer_processor=audio_buffer_processor,
            aiohttp_session=session,
            api_key=os.getenv("CANONICAL_API_KEY"),
            call_id=str(uuid.uuid4()),
            assistant="pipecat-chatbot",
            assistant_speaks_first=True,
            context=context,
        )

        pipeline = Pipeline(
            [
                transport.input(),  # microphone
                context_aggregator.user(),
                llm,
                tts,
                transport.output(),
                audio_buffer_processor,  # captures audio into a buffer
                canonical,  # uploads audio buffer to Canonical AI for metrics
                context_aggregator.assistant(),
            ]
        )

        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))

        @transport.event_handler("on_first_participant_joined")
        async def on_first_participant_joined(transport, participant):
            await transport.capture_participant_transcription(participant["id"])
            await task.queue_frames([context_aggregator.user().get_context_frame()])

        @transport.event_handler("on_participant_left")
        async def on_participant_left(transport, participant, reason):
            print(f"Participant left: {participant}")
            await task.cancel()

        @transport.event_handler("on_call_state_updated")
        async def on_call_state_updated(transport, state):
            if state == "left":
                # Here we don't want to cancel, we just want to finish sending
                # whatever is queued, so we use an EndFrame().
                await task.queue_frame(EndFrame())

        runner = PipelineRunner()

        await runner.run(task)


if __name__ == "__main__":
    asyncio.run(main())


@@ -0,0 +1,6 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=7df...
OPENAI_API_KEY=sk-PL...
ELEVENLABS_API_KEY=aeb...
CANONICAL_API_KEY=can...
CANONICAL_API_URL=


@@ -0,0 +1,5 @@
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,openai,silero,elevenlabs,canonical]


@@ -0,0 +1,55 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os

import aiohttp

from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper


async def configure(aiohttp_session: aiohttp.ClientSession):
    parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
    parser.add_argument(
        "-u", "--url", type=str, required=False, help="URL of the Daily room to join"
    )
    parser.add_argument(
        "-k",
        "--apikey",
        type=str,
        required=False,
        help="Daily API Key (needed to create an owner token for the room)",
    )

    args, unknown = parser.parse_known_args()

    url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
    key = args.apikey or os.getenv("DAILY_API_KEY")

    if not url:
        raise Exception(
            "No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
        )

    if not key:
        raise Exception(
            "No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
        )

    daily_rest_helper = DailyRESTHelper(
        daily_api_key=key,
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )

    # Create a meeting token for the given room with an expiration 1 hour in
    # the future.
    expiry_time: float = 60 * 60

    token = await daily_rest_helper.get_token(url, expiry_time)

    return (url, token)


@@ -0,0 +1,139 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager

import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse, RedirectResponse

from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams

MAX_BOTS_PER_ROOM = 1

# Bot sub-process dict for status reporting and concurrency control
bot_procs = {}

daily_helpers = {}

load_dotenv(override=True)


def cleanup():
    # Clean up function, just to be extra safe
    for entry in bot_procs.values():
        proc = entry[0]
        proc.terminate()
        proc.wait()


@asynccontextmanager
async def lifespan(app: FastAPI):
    aiohttp_session = aiohttp.ClientSession()
    daily_helpers["rest"] = DailyRESTHelper(
        daily_api_key=os.getenv("DAILY_API_KEY", ""),
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )
    yield
    await aiohttp_session.close()
    cleanup()


app = FastAPI(lifespan=lifespan)

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


@app.get("/")
async def start_agent(request: Request):
    print("!!! Creating room")
    room = await daily_helpers["rest"].create_room(DailyRoomParams())
    print(f"!!! Room URL: {room.url}")
    # Ensure the room property is present
    if not room.url:
        raise HTTPException(
            status_code=500,
            detail="Missing 'room' property in request data. Cannot start agent without a target room!",
        )

    # Check if there is already an existing process running in this room
    num_bots_in_room = sum(
        1 for proc in bot_procs.values() if proc[1] == room.url and proc[0].poll() is None
    )
    if num_bots_in_room >= MAX_BOTS_PER_ROOM:
        raise HTTPException(status_code=500, detail=f"Max bot limit reached for room: {room.url}")

    # Get the token for the room
    token = await daily_helpers["rest"].get_token(room.url)

    if not token:
        raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")

    # Spawn a new agent, and join the user session
    # Note: this is mostly for demonstration purposes (refer to 'deployment' in README)
    try:
        proc = subprocess.Popen(
            [f"python3 -m bot -u {room.url} -t {token}"],
            shell=True,
            bufsize=1,
            cwd=os.path.dirname(os.path.abspath(__file__)),
        )
        bot_procs[proc.pid] = (proc, room.url)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")

    return RedirectResponse(room.url)


@app.get("/status/{pid}")
def get_status(pid: int):
    # Look up the subprocess
    proc = bot_procs.get(pid)

    # If the subprocess doesn't exist, return an error
    if not proc:
        raise HTTPException(status_code=404, detail=f"Bot with process id: {pid} not found")

    # Check the status of the subprocess
    if proc[0].poll() is None:
        status = "running"
    else:
        status = "finished"

    return JSONResponse({"bot_id": pid, "status": status})


if __name__ == "__main__":
    import uvicorn

    default_host = os.getenv("HOST", "0.0.0.0")
    default_port = int(os.getenv("FAST_API_PORT", "7860"))

    parser = argparse.ArgumentParser(description="Daily Storyteller FastAPI server")
    parser.add_argument("--host", type=str, default=default_host, help="Host address")
    parser.add_argument("--port", type=int, default=default_port, help="Port number")
    parser.add_argument("--reload", action="store_true", help="Reload code on change")

    config = parser.parse_args()

    uvicorn.run(
        "server:app",
        host=config.host,
        port=config.port,
        reload=config.reload,
    )


@@ -0,0 +1,161 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
runpod.toml


@@ -0,0 +1,15 @@
FROM python:3.10-bullseye
RUN mkdir /app
RUN mkdir /app/assets
RUN mkdir /app/utils
COPY *.py /app/
COPY requirements.txt /app/
WORKDIR /app
RUN pip3 install -r requirements.txt
EXPOSE 7860
CMD ["python3", "server.py"]


@@ -0,0 +1,37 @@
# Simple Chatbot
<img src="image.png" width="420px">
This app connects you to a chatbot powered by GPT-4, complete with animations generated by Stable Video Diffusion.
See a video of it in action: https://x.com/kwindla/status/1778628911817183509
And a quick video walkthrough of the code: https://www.loom.com/share/13df1967161f4d24ade054e7f8753416
The first run may take longer to start because the VAD (Voice Activity Detection) model needs to be downloaded.
## Get started
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp env.example .env # and add your credentials
```
## Run the server
```bash
python server.py
```
Then, visit `http://localhost:7860/` in your browser to start a chatbot session.
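
During a session, the `bot.py` in this example writes `conversation_recording<timestamp>.wav` files through its `save_audio` function. A quick way to check such a file is to read its header and compute the duration. This is a minimal sketch using only the standard `wave` module. `wav_duration_seconds` and `make_silent_wav` are hypothetical helpers, and the 16-bit sample width mirrors the `setsampwidth(2)` call in `save_audio`.

```python
import io
import wave


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file, given a path or a binary file object."""
    with wave.open(source, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def make_silent_wav(sample_rate: int, num_channels: int, seconds: float) -> bytes:
    """Build an in-memory 16-bit WAV of silence, useful for trying the helper."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wf:
        wf.setsampwidth(2)  # 2 bytes per sample, as in save_audio
        wf.setnchannels(num_channels)
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(int(sample_rate * seconds) * num_channels * 2))
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_silent_wav(sample_rate=16000, num_channels=1, seconds=0.5)
    print(wav_duration_seconds(io.BytesIO(sample)))
```

For a real recording, pass the file path instead of the in-memory buffer, for example `wav_duration_seconds("conversation_recording<timestamp>.wav")` with the actual file name.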
## Build and test the Docker image
```bash
docker build -t chatbot .
docker run --env-file .env -p 7860:7860 chatbot
```


@@ -0,0 +1,149 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import datetime
import io
import os
import sys
import wave

import aiofiles
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport

load_dotenv(override=True)

logger.remove(0)
logger.add(sys.stderr, level="DEBUG")


async def save_audio(audio: bytes, sample_rate: int, num_channels: int):
    if len(audio) > 0:
        filename = f"conversation_recording{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}.wav"
        with io.BytesIO() as buffer:
            with wave.open(buffer, "wb") as wf:
                wf.setsampwidth(2)
                wf.setnchannels(num_channels)
                wf.setframerate(sample_rate)
                wf.writeframes(audio)
            async with aiofiles.open(filename, "wb") as file:
                await file.write(buffer.getvalue())
        print(f"Merged audio saved to {filename}")
    else:
        print("No audio data to save")


async def main():
    async with aiohttp.ClientSession() as session:
        (room_url, token) = await configure(session)

        transport = DailyTransport(
            room_url,
            token,
            "Chatbot",
            DailyParams(
                audio_out_enabled=True,
                audio_in_enabled=True,
                camera_out_enabled=False,
                vad_enabled=True,
                vad_audio_passthrough=True,
                vad_analyzer=SileroVADAnalyzer(),
                transcription_enabled=True,
                #
                # Spanish
                #
                # transcription_settings=DailyTranscriptionSettings(
                #     language="es",
                #     tier="nova",
                #     model="2-general"
                # )
            ),
        )

        tts = ElevenLabsTTSService(
            api_key=os.getenv("ELEVENLABS_API_KEY"),
            #
            # English
            #
            voice_id="cgSgspJ2msm6clMCkdW9",
            aiohttp_session=session,
            #
            # Spanish
            #
            # model="eleven_multilingual_v2",
            # voice_id="gD1IexrzCvsXPHUuT0s3",
        )

        llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")

        messages = [
            {
                "role": "system",
                #
                # English
                #
                "content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your responses to 12 words or fewer.",
                #
                # Spanish
                #
                # "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
            },
        ]

        context = OpenAILLMContext(messages)
        context_aggregator = llm.create_context_aggregator(context)

        # Save audio every 10 seconds.
        audiobuffer = AudioBufferProcessor(buffer_size=480000)

        pipeline = Pipeline(
            [
                transport.input(),  # microphone
                context_aggregator.user(),
                llm,
                tts,
                transport.output(),
                audiobuffer,  # used to buffer the audio in the pipeline
                context_aggregator.assistant(),
            ]
        )

        task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))

        @audiobuffer.event_handler("on_audio_data")
        async def on_audio_data(buffer, audio, sample_rate, num_channels):
            await save_audio(audio, sample_rate, num_channels)

        @transport.event_handler("on_first_participant_joined")
        async def on_first_participant_joined(transport, participant):
            await transport.capture_participant_transcription(participant["id"])
            await task.queue_frames([context_aggregator.user().get_context_frame()])

        @transport.event_handler("on_participant_left")
        async def on_participant_left(transport, participant, reason):
            print(f"Participant left: {participant}")
            await task.cancel()

        runner = PipelineRunner()

        await runner.run(task)


if __name__ == "__main__":
    asyncio.run(main())


@@ -0,0 +1,4 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=7df...
OPENAI_API_KEY=sk-PL...
ELEVENLABS_API_KEY=aeb...


@@ -0,0 +1,5 @@
aiofiles
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,openai,silero,elevenlabs]


@@ -0,0 +1,56 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os

import aiohttp

from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper


async def configure(aiohttp_session: aiohttp.ClientSession):
    parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
    parser.add_argument(
        "-u", "--url", type=str, required=False, help="URL of the Daily room to join"
    )
    parser.add_argument(
        "-k",
        "--apikey",
        type=str,
        required=False,
        help="Daily API Key (needed to create an owner token for the room)",
    )

    args, unknown = parser.parse_known_args()

    url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
    key = args.apikey or os.getenv("DAILY_API_KEY")

    if not url:
        raise Exception(
            "No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
        )

    if not key:
        raise Exception(
            "No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
        )

    daily_rest_helper = DailyRESTHelper(
        daily_api_key=key,
        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
        aiohttp_session=aiohttp_session,
    )

    # Create a meeting token for the given room with an expiration 1 hour in
    # the future.
    expiry_time: float = 60 * 60

    token = await daily_rest_helper.get_token(url, expiry_time)

    return (url, token)


@@ -0,0 +1,139 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse, RedirectResponse
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
MAX_BOTS_PER_ROOM = 1
# Bot sub-process dict for status reporting and concurrency control
bot_procs = {}
daily_helpers = {}
load_dotenv(override=True)
def cleanup():
# Clean up function, just to be extra safe
for entry in bot_procs.values():
proc = entry[0]
proc.terminate()
proc.wait()
@asynccontextmanager
async def lifespan(app: FastAPI):
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
cleanup()
app = FastAPI(lifespan=lifespan)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
@app.get("/")
async def start_agent(request: Request):
print("!!! Creating room")
room = await daily_helpers["rest"].create_room(DailyRoomParams())
print(f"!!! Room URL: {room.url}")
# Ensure the room property is present
if not room.url:
raise HTTPException(
status_code=500,
detail="Room creation did not return a URL. Cannot start agent without a target room!",
)
# Check if there is already an existing process running in this room
num_bots_in_room = sum(
1 for proc in bot_procs.values() if proc[1] == room.url and proc[0].poll() is None
)
if num_bots_in_room >= MAX_BOTS_PER_ROOM:
raise HTTPException(status_code=500, detail=f"Max bot limit reached for room: {room.url}")
# Get the token for the room
token = await daily_helpers["rest"].get_token(room.url)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
# Spawn a new agent, and join the user session
# Note: this is mostly for demonstration purposes (refer to 'deployment' in README)
try:
proc = subprocess.Popen(
[f"python3 -m bot -u {room.url} -t {token}"],
shell=True,
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
bot_procs[proc.pid] = (proc, room.url)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
return RedirectResponse(room.url)
@app.get("/status/{pid}")
def get_status(pid: int):
# Look up the subprocess
proc = bot_procs.get(pid)
# If the subprocess doesn't exist, return an error
if not proc:
raise HTTPException(status_code=404, detail=f"Bot with process id: {pid} not found")
# Check the status of the subprocess
if proc[0].poll() is None:
status = "running"
else:
status = "finished"
return JSONResponse({"bot_id": pid, "status": status})
if __name__ == "__main__":
import uvicorn
default_host = os.getenv("HOST", "0.0.0.0")
default_port = int(os.getenv("FAST_API_PORT", "7860"))
parser = argparse.ArgumentParser(description="Daily Storyteller FastAPI server")
parser.add_argument("--host", type=str, default=default_host, help="Host address")
parser.add_argument("--port", type=int, default=default_port, help="Port number")
parser.add_argument("--reload", action="store_true", help="Reload code on change")
config = parser.parse_args()
uvicorn.run(
"server:app",
host=config.host,
port=config.port,
reload=config.reload,
)


@@ -0,0 +1,13 @@
FROM python:3.11-bullseye
# Open port 7860 for http service
ENV FAST_API_PORT=7860
EXPOSE 7860
# Install Python dependencies
COPY *.py .
COPY ./requirements.txt requirements.txt
RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
# Start the FastAPI server
CMD python3 bot_runner.py --port ${FAST_API_PORT}


@@ -0,0 +1,39 @@
# Fly.io deployment example
This project modifies the `bot_runner.py` server to launch a new machine for each user session. This is the recommended approach for production: a deployment that spawns a shell process per session will quickly run out of system resources under load.
For this example, we are using Daily as a WebRTC transport and provisioning a new room and token for each session. You can use another transport, such as WebSockets, by modifying the `bot.py` and `bot_runner.py` files accordingly.
## Setting up your fly.io deployment
### Create your fly.toml file
You can copy the `example-fly.toml` as a reference. Be sure to change the app name to something unique.
### Create your .env file
Copy the base `env.example` to `.env` and enter the necessary API keys.
`FLY_APP_NAME` should match the app name in the `fly.toml` file.
### Launch a new fly.io project
`fly launch` or `fly launch --org your-org-name`
### Set the necessary app secrets from your .env
Note: you can do this manually via the fly.io dashboard under the "secrets" sub-section of your deployment (e.g. "https://fly.io/apps/fly-app-name/secrets") or run the following terminal command:
`cat .env | tr '\n' ' ' | xargs flyctl secrets set`
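The one-liner above passes every whitespace-separated word in `.env` to `flyctl`, so inline comments and empty values (both present in `env.example`) may end up in the command. A minimal Python sketch of a stricter alternative follows, assuming simple unquoted `KEY=VALUE` lines whose values contain no `#`; `env_to_secret_args` and `set_fly_secrets` are hypothetical helper names, not part of this project.

```python
import subprocess


def env_to_secret_args(text: str) -> list[str]:
    """Turn .env text into KEY=VALUE arguments, skipping comments and empty values."""
    args = []
    for raw in text.splitlines():
        # Drop full-line and trailing comments (assumes values contain no '#').
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key and value:
            args.append(f"{key}={value}")
    return args


def set_fly_secrets(path: str = ".env") -> None:
    """Run `flyctl secrets set KEY=VALUE ...` with the entries found in `path`."""
    with open(path) as f:
        secrets = env_to_secret_args(f.read())
    if secrets:
        subprocess.run(["flyctl", "secrets", "set", *secrets], check=True)
```

Calling `set_fly_secrets()` from the project directory would then set only the keys that actually have a value.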
### Deploy your machine
`fly deploy`
## Connecting to your bot
Send a POST request to your running fly.io instance:
`curl --location --request POST 'https://YOUR_FLY_APP_NAME/'`
This request waits until the machine enters the `started` state, then returns a room URL and token to join.
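The same request can be made from Python. The sketch below uses only the standard library and assumes the JSON response has the `room_url` and `token` keys returned by `bot_runner.py`; the helper names and the 60-second timeout are illustrative choices, not part of this project.

```python
import json
import urllib.request


def parse_start_response(body: bytes) -> tuple[str, str]:
    """Extract the room URL and user token from the runner's JSON response."""
    data = json.loads(body)
    return data["room_url"], data["token"]


def start_bot(base_url: str, timeout: float = 60.0) -> tuple[str, str]:
    """POST to the bot runner and return (room_url, token)."""
    request = urllib.request.Request(
        base_url,
        data=b"{}",  # empty JSON body; the runner treats the body as optional
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    # The runner responds only after the machine has started, so allow some time.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_start_response(response.read())
```

`start_bot` takes the same URL you would pass to `curl`; the returned pair is what a client needs to join the Daily room.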


@@ -0,0 +1,102 @@
import argparse
import asyncio
import os
import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
daily_api_key = os.getenv("DAILY_API_KEY", "")
daily_api_url = os.getenv("DAILY_API_URL", "https://api.daily.co/v1")
async def main(room_url: str, token: str):
transport = DailyTransport(
room_url,
token,
"Chatbot",
DailyParams(
api_url=daily_api_url,
api_key=daily_api_key,
audio_in_enabled=True,
audio_out_enabled=True,
camera_out_enabled=False,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
),
)
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY", ""),
voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
messages = [
{
"role": "system",
"content": "You are Chatbot, a friendly, helpful robot. Your output will be converted to audio so don't include special characters other than '!' or '?' in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by saying hello.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[
transport.input(),
context_aggregator.user(),
llm,
tts,
transport.output(),
context_aggregator.assistant(),
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_transcription(participant["id"])
await task.queue_frames([context_aggregator.user().get_context_frame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
await task.cancel()
@transport.event_handler("on_call_state_updated")
async def on_call_state_updated(transport, state):
if state == "left":
# Here we don't want to cancel, we just want to finish sending
# whatever is queued, so we use an EndFrame().
await task.queue_frame(EndFrame())
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Pipecat Bot")
parser.add_argument("-u", type=str, help="Room URL")
parser.add_argument("-t", type=str, help="Token")
config = parser.parse_args()
asyncio.run(main(config.u, config.t))


@@ -0,0 +1,209 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from pipecat.transports.services.helpers.daily_rest import (
DailyRESTHelper,
DailyRoomObject,
DailyRoomParams,
DailyRoomProperties,
)
load_dotenv(override=True)
# ------------ Configuration ------------ #
MAX_SESSION_TIME = 5 * 60 # 5 minutes
REQUIRED_ENV_VARS = [
"DAILY_API_KEY",
"OPENAI_API_KEY",
"ELEVENLABS_API_KEY",
"ELEVENLABS_VOICE_ID",
"FLY_API_KEY",
"FLY_APP_NAME",
]
FLY_API_HOST = os.getenv("FLY_API_HOST", "https://api.machines.dev/v1")
FLY_APP_NAME = os.getenv("FLY_APP_NAME", "pipecat-fly-example")
FLY_API_KEY = os.getenv("FLY_API_KEY", "")
FLY_HEADERS = {"Authorization": f"Bearer {FLY_API_KEY}", "Content-Type": "application/json"}
daily_helpers = {}
# ----------------- API ----------------- #
@asynccontextmanager
async def lifespan(app: FastAPI):
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
app = FastAPI(lifespan=lifespan)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# ----------------- Main ----------------- #
async def spawn_fly_machine(room_url: str, token: str):
async with aiohttp.ClientSession() as session:
# Use the same image as the bot runner
async with session.get(
f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines", headers=FLY_HEADERS
) as r:
if r.status != 200:
text = await r.text()
raise Exception(f"Unable to get machine info from Fly: {text}")
data = await r.json()
image = data[0]["config"]["image"]
# Machine configuration
cmd = f"python3 bot.py -u {room_url} -t {token}"
cmd = cmd.split()
worker_props = {
"config": {
"image": image,
"auto_destroy": True,
"init": {"cmd": cmd},
"restart": {"policy": "no"},
"guest": {"cpu_kind": "shared", "cpus": 1, "memory_mb": 1024},
},
}
# Spawn a new machine instance
async with session.post(
f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines", headers=FLY_HEADERS, json=worker_props
) as r:
if r.status != 200:
text = await r.text()
raise Exception(f"Problem starting a bot worker: {text}")
data = await r.json()
# Wait for the machine to enter the started state
vm_id = data["id"]
async with session.get(
f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines/{vm_id}/wait?state=started",
headers=FLY_HEADERS,
) as r:
if r.status != 200:
text = await r.text()
raise Exception(f"Bot was unable to enter started state: {text}")
print(f"Machine joined room: {room_url}")
@app.post("/")
async def start_bot(request: Request) -> JSONResponse:
try:
data = await request.json()
# Is this a webhook creation request?
if "test" in data:
return JSONResponse({"test": True})
except Exception:
# The request body is optional; ignore missing or invalid JSON.
pass
# Use specified room URL, or create a new one if not specified
room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", "")
if not room_url:
params = DailyRoomParams(properties=DailyRoomProperties())
try:
room: DailyRoomObject = await daily_helpers["rest"].create_room(params=params)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Unable to provision room: {e}")
else:
# Check passed room URL exists, we should assume that it already has a sip set up
try:
room: DailyRoomObject = await daily_helpers["rest"].get_room_from_url(room_url)
except Exception:
raise HTTPException(status_code=500, detail=f"Room not found: {room_url}")
# Give the agent a token to join the session
token = await daily_helpers["rest"].get_token(room.url, MAX_SESSION_TIME)
if not room or not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room_url}")
# Launch a new fly.io machine, or run as a shell process (not recommended)
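# Treat only explicit truthy values as enabled; a plain os.getenv() check would also accept "false" or "0"
run_as_process = os.getenv("RUN_AS_PROCESS", "").strip().lower() in ("1", "true", "yes")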
if run_as_process:
try:
subprocess.Popen(
[f"python3 -m bot -u {room.url} -t {token}"],
shell=True,
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
else:
try:
await spawn_fly_machine(room.url, token)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to spawn VM: {e}")
# Grab a token for the user to join with
user_token = await daily_helpers["rest"].get_token(room.url, MAX_SESSION_TIME)
return JSONResponse(
{
"room_url": room.url,
"token": user_token,
}
)
if __name__ == "__main__":
# Check environment variables
for env_var in REQUIRED_ENV_VARS:
if env_var not in os.environ:
raise Exception(f"Missing environment variable: {env_var}.")
parser = argparse.ArgumentParser(description="Pipecat Bot Runner")
parser.add_argument(
"--host", type=str, default=os.getenv("HOST", "0.0.0.0"), help="Host address"
)
parser.add_argument("--port", type=int, default=os.getenv("PORT", 7860), help="Port number")
parser.add_argument(
"--reload", action="store_true", default=False, help="Reload code on change"
)
config = parser.parse_args()
try:
import uvicorn
uvicorn.run("bot_runner:app", host=config.host, port=config.port, reload=config.reload)
except KeyboardInterrupt:
print("Pipecat runner shutting down...")


@@ -0,0 +1,8 @@
DAILY_API_KEY=
DAILY_SAMPLE_ROOM_URL= # Enter a Daily room URL to use a set room URL each time (useful for local testing)
OPENAI_API_KEY=
ELEVENLABS_API_KEY=
ELEVENLABS_VOICE_ID=
FLY_API_KEY=
FLY_APP_NAME=
RUN_AS_PROCESS= # Spawn fly.io machine for each session or run as local process


@@ -0,0 +1,25 @@
# fly.toml app configuration file generated for pipecat-fly-example on 2024-07-01T15:04:53+01:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#
app = 'pipecat-fly-example'
primary_region = 'sjc'
[build]
[env]
FLY_APP_NAME = 'pipecat-fly-example'
[http_service]
internal_port = 7860
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 0
processes = ['app']
[[vm]]
memory = 512
cpu_kind = 'shared'
cpus = 1


@@ -0,0 +1,5 @@
pipecat-ai[daily,elevenlabs,openai,silero]
fastapi
uvicorn
python-dotenv
loguru


@@ -0,0 +1,91 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
dist/
*.egg-info/
*.egg
.installed.cfg
.eggs/
downloads/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
MANIFEST
# Virtual Environments
venv/
env/
.env
.venv/
ENV/
env.bak/
venv.bak/
# IDE
.idea/
.vscode/
.spyderproject
.spyproject
.ropeproject
# Testing and Coverage
.coverage
.coverage.*
htmlcov/
.pytest_cache/
.tox/
.nox/
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
cover/
# Logs and Databases
*.log
*.db
db.sqlite3
db.sqlite3-journal
pip-log.txt
# System Files
.DS_Store
Thumbs.db
desktop.ini
*.swp
*.swo
*.bak
*.tmp
*~
# Build and Documentation
docs/_build/
.pybuilder/
target/
instance/
.webassets-cache
.pdm.toml
.pdm-python
.pdm-build/
__pypackages__/
# Other
*.mo
*.pot
*.sage.py
.mypy_cache/
.dmypy.json
dmypy.json
.pyre/
.pytype/
cython_debug/
.ipynb_checkpoints


@@ -0,0 +1,37 @@
# Deploying Pipecat to Modal.com
Barebones deployment example for [modal.com](https://www.modal.com)
1. Install dependencies
```bash
python -m venv venv
source venv/bin/activate # or OS equivalent
pip install -r requirements.txt
```
2. Set up .env
```bash
cp env.example .env
```
Alternatively, you can configure your Modal app to use [secrets](https://modal.com/docs/guide/secrets).
3. Test the app locally
```bash
modal serve app.py
```
4. Deploy to production
```bash
modal deploy app.py
```
## Configuration options
This app sets some sensible defaults for reducing cold starts, such as `keep_warm=1`, which keeps at least one warm instance ready for your bot function.
It also sets `max_inputs=1`, so an instance handles a single request and is not reused, as each user requires their own running function.


@@ -0,0 +1,74 @@
import os
import aiohttp
import modal
from bot import _voice_bot_process
from fastapi import HTTPException
from fastapi.responses import JSONResponse
from loguru import logger
MAX_SESSION_TIME = 15 * 60 # 15 minutes
app = modal.App("pipecat-modal")
image = modal.Image.debian_slim(python_version="3.12").pip_install_from_requirements(
"requirements.txt"
)
@app.function(
image=image,
cpu=1.0,
secrets=[modal.Secret.from_dotenv()],
keep_warm=1,
enable_memory_snapshot=True,
max_inputs=1, # Do not reuse instances across requests
retries=0,
)
def launch_bot_process(room_url: str, token: str):
_voice_bot_process(room_url, token)
@app.function(
image=image,
secrets=[modal.Secret.from_dotenv()],
)
@modal.web_endpoint(method="POST")
async def start():
from pipecat.transports.services.helpers.daily_rest import (
DailyRESTHelper,
DailyRoomParams,
)
logger.info("Request received")
async with aiohttp.ClientSession() as session:
daily_rest_helper = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=session,
)
# Create new Daily room
room = await daily_rest_helper.create_room(DailyRoomParams())
if not room.url:
raise HTTPException(
status_code=500,
detail="Unable to create room",
)
logger.info(f"Created room: {room.url}")
# Create bot token for room
token = await daily_rest_helper.get_token(room.url, MAX_SESSION_TIME)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
logger.info(f"Bot token created: {token}")
# Spawn a new bot process
launch_bot_process.spawn(room_url=room.url, token=token)
# Return room URL to the user to join
# Note: in production, you would want to return a token to the user
return JSONResponse(content={"room_url": room.url, "token": token})


@@ -0,0 +1,89 @@
import asyncio
import os
import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main(room_url: str, token: str):
transport = DailyTransport(
room_url,
token,
"bot",
DailyParams(
audio_out_enabled=True,
transcription_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY", ""), voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22"
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[
transport.input(),
context_aggregator.user(),
llm,
tts,
transport.output(),
context_aggregator.assistant(),
]
)
task = PipelineTask(
pipeline,
PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
report_only_initial_ttfb=True,
),
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_transcription(participant["id"])
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([context_aggregator.user().get_context_frame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
await task.cancel()
runner = PipelineRunner()
await runner.run(task)
def _voice_bot_process(room_url: str, token: str):
asyncio.run(main(room_url, token))


@@ -0,0 +1,3 @@
DAILY_API_KEY=
OPENAI_API_KEY=
CARTESIA_API_KEY=


@@ -0,0 +1,5 @@
python-dotenv==1.0.1
modal==0.71.3
pipecat-ai[daily,silero,cartesia,openai]==0.0.52
fastapi==0.115.6
aiohttp==3.11.11


@@ -0,0 +1,59 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url, None, "Say One Thing", DailyParams(audio_out_enabled=True)
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22", # British Lady
)
runner = PipelineRunner()
task = PipelineTask(Pipeline([tts, transport.output()]))
# Register an event handler so we can play the audio when the
# participant joins.
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
participant_name = participant.get("info", {}).get("userName", "")
await task.queue_frames(
[TTSSpeakFrame(f"Hello there, {participant_name}!"), EndFrame()]
)
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())


@@ -0,0 +1,50 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.audio import LocalAudioTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
transport = LocalAudioTransport(TransportParams(audio_out_enabled=True))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22", # British Lady
)
pipeline = Pipeline([tts, transport.output()])
task = PipelineTask(pipeline)
async def say_something():
await asyncio.sleep(1)
await task.queue_frames([TTSSpeakFrame("Hello there, how is it going!"), EndFrame()])
runner = PipelineRunner()
await asyncio.gather(runner.run(task), say_something())
if __name__ == "__main__":
asyncio.run(main())


@@ -0,0 +1,108 @@
import argparse
import asyncio
import os
import sys
import aiohttp
from dotenv import load_dotenv
from livekit import api
from loguru import logger
from pipecat.frames.frames import TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.transports.services.livekit import LiveKitParams, LiveKitTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
def generate_token(room_name: str, participant_name: str, api_key: str, api_secret: str) -> str:
token = api.AccessToken(api_key, api_secret)
token.with_identity(participant_name).with_name(participant_name).with_grants(
api.VideoGrants(
room_join=True,
room=room_name,
)
)
return token.to_jwt()
async def configure_livekit():
parser = argparse.ArgumentParser(description="LiveKit AI SDK Bot Sample")
parser.add_argument(
"-r", "--room", type=str, required=False, help="Name of the LiveKit room to join"
)
parser.add_argument("-u", "--url", type=str, required=False, help="URL of the LiveKit server")
args, unknown = parser.parse_known_args()
room_name = args.room or os.getenv("LIVEKIT_ROOM_NAME")
url = args.url or os.getenv("LIVEKIT_URL")
api_key = os.getenv("LIVEKIT_API_KEY")
api_secret = os.getenv("LIVEKIT_API_SECRET")
if not room_name:
raise Exception(
"No LiveKit room specified. Use the -r/--room option from the command line, or set LIVEKIT_ROOM_NAME in your environment."
)
if not url:
raise Exception(
"No LiveKit server URL specified. Use the -u/--url option from the command line, or set LIVEKIT_URL in your environment."
)
if not api_key or not api_secret:
raise Exception(
"LIVEKIT_API_KEY and LIVEKIT_API_SECRET must be set in environment variables."
)
token = generate_token(room_name, "Say One Thing", api_key, api_secret)
user_token = generate_token(room_name, "User", api_key, api_secret)
logger.info(f"User token: {user_token}")
return (url, token, room_name)
async def main():
async with aiohttp.ClientSession() as session:
(url, token, room_name) = await configure_livekit()
transport = LiveKitTransport(
url=url,
token=token,
room_name=room_name,
params=LiveKitParams(audio_out_enabled=True),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22", # British Lady
)
runner = PipelineRunner()
task = PipelineTask(Pipeline([tts, transport.output()]))
# Register an event handler so we can play the audio when the
# participant joins.
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant_id):
await asyncio.sleep(1)
await task.queue_frame(
TextFrame(
"Hello there! How are you doing today? Would you like to talk about the weather?"
)
)
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())


@@ -0,0 +1,54 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.riva import FastPitchTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url, None, "Say One Thing", DailyParams(audio_out_enabled=True)
)
tts = FastPitchTTSService(api_key=os.getenv("NVIDIA_API_KEY"))
runner = PipelineRunner()
task = PipelineTask(Pipeline([tts, transport.output()]))
# Register an event handler so we can play the audio when the
# participant joins.
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
participant_name = participant.get("info", {}).get("userName", "")
await task.queue_frames([TTSSpeakFrame(f"Aloha, {participant_name}!"), EndFrame()])
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())


@@ -0,0 +1,64 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.frames.frames import EndFrame, LLMMessagesFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url, None, "Say One Thing From an LLM", DailyParams(audio_out_enabled=True)
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22", # British Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
messages = [
{
"role": "system",
"content": "You are an LLM in a WebRTC session, and this is a 'hello world' demo. Say hello to the world.",
}
]
runner = PipelineRunner()
task = PipelineTask(Pipeline([llm, tts, transport.output()]))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await task.queue_frames([LLMMessagesFrame(messages), EndFrame()])
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

Some files were not shown because too many files have changed in this diff