Compare commits

...

5147 Commits

Author SHA1 Message Date
James Hush
a39f8b4882 Remove extra code 2025-10-01 14:48:12 +08:00
James Hush
76fc36f621 Pre recorded message example 2025-10-01 14:43:36 +08:00
James Hush
c0878c5e09 Save progress 2025-10-01 13:58:32 +08:00
James Hush
c6a1013051 Pre-recorded message example 2025-10-01 13:52:58 +08:00
Aleix Conchillo Flaqué
feae3b6d2d Merge pull request #2742 from pipecat-ai/aleix/deprecate-daily-update-remote-participants-frame
DailyTransport: deprecated DailyUpdateRemoteParticipantsFrame
2025-09-30 16:27:34 -07:00
Aleix Conchillo Flaqué
92d3be8975 DailyTransport: deprecated DailyUpdateRemoteParticipantsFrame 2025-09-30 16:26:48 -07:00
Aleix Conchillo Flaqué
0f53e1db2c Merge pull request #2759 from pipecat-ai/aleix/dont-cancel-if-finished
PipelineTask: avoid cancellation if application is finished
2025-09-30 16:21:16 -07:00
Aleix Conchillo Flaqué
d398e8cc10 Merge pull request #2761 from pipecat-ai/aleix/rtvi-tail-updates
RTVI updates: audio levels and system logs
2025-09-30 13:55:17 -07:00
Aleix Conchillo Flaqué
e5f263d380 update CHANGELOG 2025-09-30 13:51:35 -07:00
Aleix Conchillo Flaqué
3a4c303c54 RTVIParams: add errors_enabled deprecation warnings 2025-09-30 13:49:51 -07:00
Mark Backman
54a1ef47d0 Merge pull request #2758 from pipecat-ai/mb/claude-sonnet-4.5
Update AnthropicLLMService to use claude-sonnet-4-5-20250929
2025-09-30 16:42:47 -04:00
Aleix Conchillo Flaqué
149ffa4f3c RTVIObserver: add support system logs 2025-09-30 13:42:40 -07:00
Aleix Conchillo Flaqué
e5465034d9 RTVIObserver: add support for user/bot audio levels 2025-09-30 13:41:26 -07:00
Aleix Conchillo Flaqué
568c7c782d rtvi: allow None RTVIProcessor and rename to send_rtvi_message() 2025-09-30 13:35:27 -07:00
Aleix Conchillo Flaqué
9851334221 rtvi: deprecate errors_enabled and always send errors 2025-09-30 13:31:30 -07:00
Aleix Conchillo Flaqué
e79c4fc99d PipelineTask: avoid cancellation if application is finished 2025-09-30 13:18:25 -07:00
Aleix Conchillo Flaqué
55c321f4ff Merge pull request #2751 from pipecat-ai/aleix/nova-sonic-disconnect-fix
AWSNovaSonicLLMService: add missing await
2025-09-30 13:12:22 -07:00
kompfner
a14a53a005 Merge pull request #2735 from pipecat-ai/pk/remove-openaillmcontext-usage
Remove remaining usage of `OpenAILLMContext` throughout the codebase …
2025-09-30 10:09:25 -04:00
Mark Backman
a71f937e8f Update AnthropicLLMService to use claude-sonnet-4-5-20250929 2025-09-30 08:49:30 -04:00
Mark Backman
d0178edad0 Merge pull request #2753 from pipecat-ai/mb/quickstart-0.0.86
Quickstart: Update to 0.0.86, removing pytorch requirements
2025-09-29 09:43:33 -04:00
Mark Backman
795c5e55d9 Quickstart: Update to 0.0.86, removing pytorch requirements 2025-09-27 08:30:37 -04:00
Aleix Conchillo Flaqué
8f8d8ae0d8 AWSNovaSonicLLMService: add missing await 2025-09-26 15:58:05 -07:00
Vanessa Pyne
741f192d04 Merge pull request #2096 from pipecat-ai/vp-mcp-ex-nit
mcp examples: check for env vars needed for examples
2025-09-26 10:21:22 -05:00
Aleix Conchillo Flaqué
ee00ee5c57 Merge pull request #2744 from pipecat-ai/aleix/vad-analyzer-thread-executor
BaseInputTransport: create VAD thread in VADAnalyzer
2025-09-25 13:43:34 -07:00
Aleix Conchillo Flaqué
f53fd880dc BaseInputTransport: create VAD thread in VADAnalyzer
We move the thread creation to the VADAnalyzer instead of the input
transport. This can potentially be useful if we need to analyze multiple audio
streams.
2025-09-25 13:41:20 -07:00
Aleix Conchillo Flaqué
de3461e4cc Merge pull request #2743 from pipecat-ai/aleix/turn-analyzer-fixes
turn analyzer fixes
2025-09-25 13:40:43 -07:00
Aleix Conchillo Flaqué
7bafc3a1bb BaseSmartTurn: process speech in a separate thread 2025-09-25 13:37:28 -07:00
Aleix Conchillo Flaqué
22ef61fe8d BaseTurnAnalyzer: add BaseTurnParams base class for parameters 2025-09-25 13:37:09 -07:00
Aleix Conchillo Flaqué
7078fb53bd Merge pull request #2738 from pipecat-ai/aleix/openai-cached-tokens-metrics
BaseOpenAILLMService: include cached tokens to metrics frame
2025-09-25 13:36:03 -07:00
Aleix Conchillo Flaqué
33447ad6f2 BaseOpenAILLMService: include cached tokens to metrics frame 2025-09-24 19:32:16 -07:00
Paul Kompfner
6faa50ae5b Remove remaining usage of OpenAILLMContext throughout the codebase in favor of LLMContext, except for:
- Usage in classes that are already deprecated
- Usage related to realtime LLMs, which don't yet support `LLMContext`
- Usage in (soon-to-be-deprecated) code paths related to `OpenAILLMContext` itself and associated machinery
2025-09-24 16:35:03 -04:00
Aleix Conchillo Flaqué
3797f41c8c Merge pull request #2734 from pipecat-ai/aleix/pipecat-0.0.86
update CHANGELOG for 0.0.86
2025-09-24 12:19:16 -07:00
Aleix Conchillo Flaqué
ff919b8c15 update CHANGELOG for 0.0.86 2025-09-24 11:28:14 -07:00
Aleix Conchillo Flaqué
cb048d6c7e Merge pull request #2733 from pipecat-ai/aleix/tavus-missing-daily-callback
TavusTransport: add missing on_before_leave callback
2025-09-24 11:10:32 -07:00
Aleix Conchillo Flaqué
6c2c43ade0 Merge pull request #2724 from pipecat-ai/pk/update-natural-conversation-examples-with-universal-context
Update natural conversation examples with universal context
2025-09-24 11:07:50 -07:00
Aleix Conchillo Flaqué
f899c15b03 Merge pull request #2731 from pipecat-ai/pk/update-example-25-to-use-universal-context
Update example 25 to use universal `LLMContext`
2025-09-24 11:05:36 -07:00
Aleix Conchillo Flaqué
d10ef08775 Merge pull request #2727 from pipecat-ai/pk/strands-agents-needs-to-support-openaillmcontextframe-for-now
`StrandsAgentsProcessor` should still support `OpenAILLMContextFrame`…
2025-09-24 11:05:07 -07:00
Aleix Conchillo Flaqué
27a5af6fa1 Merge pull request #2728 from pipecat-ai/pk/fix-playht-in-env-example
Fix PlayHT env variable names in env.example
2025-09-24 11:04:55 -07:00
Aleix Conchillo Flaqué
4bff0a7c49 Merge pull request #2732 from pipecat-ai/pk/update-voicemail-detector-to-use-llm-context
Update `VoicemailDetector` to use universal `LLMContext`
2025-09-24 11:04:42 -07:00
Aleix Conchillo Flaqué
508f7d203d Merge pull request #2729 from pipecat-ai/aleix/frame-processor-cancel-default-timeout
FrameProcessor: timeout when cancelling tasks
2025-09-24 10:55:52 -07:00
Filipi da Silva Fuchter
0f87d5342c Merge pull request #2266 from pipecat-ai/filipi/hey_gen_transport
HeyGen implementation for Pipecat - HeyGenTransport
2025-09-24 14:35:50 -03:00
Filipi Fuchter
f6164e3bde HeyGen implementation for Pipecat - HeyGenTransport. 2025-09-24 14:33:56 -03:00
Aleix Conchillo Flaqué
1a0fb55d0f TavusTransport: add missing on_before_leave callback 2025-09-24 10:18:56 -07:00
Paul Kompfner
6d0beef944 Update VoicemailDetector to use universal LLMContext 2025-09-24 12:58:42 -04:00
Paul Kompfner
b9fd6b873b Update comment in example 07b to reference LLMContext rather than OpenAILLMContext 2025-09-24 12:49:34 -04:00
Paul Kompfner
dea0f1791f Update OpenAILLMAdapter.get_messages_for_logging() to truncate "input_audio" message data 2025-09-24 12:41:41 -04:00
Paul Kompfner
da66c38795 Update example 25 to use universal LLMContext 2025-09-24 12:37:29 -04:00
Paul Kompfner
912f8b96f0 Fix PlayHT env variable names in env.example 2025-09-24 11:24:35 -04:00
Aleix Conchillo Flaqué
f9eb447d82 FrameProcessor: timeout when cancelling tasks 2025-09-24 08:24:28 -07:00
Paul Kompfner
65f5fe8588 StrandsAgentsProcessor should still support OpenAILLMContextFrame until that frame has been deprecated 2025-09-24 11:05:27 -04:00
Paul Kompfner
817c77f3fe Update SmallWebRTCTransport to pass a sender ID in the "on_app_message" event 2025-09-24 10:24:59 -04:00
Paul Kompfner
8896179b00 Update another "natural conversation" example to use universal LLMContext. Note that this one had to also be fixed in various ways, as it wasn't working. 2025-09-24 09:55:11 -04:00
Filipi da Silva Fuchter
463752360b Merge pull request #2726 from pipecat-ai/mb/twilio-serializer-cleanup
Fixup for TwilioFrameSerializer
2025-09-24 10:45:38 -03:00
Paul Kompfner
66b7977a62 Make SmallWebRTCTransport adhere to the expected "on_app_message" event signature 2025-09-24 09:34:28 -04:00
Mark Backman
468de68aec Fixup for TwilioFrameSerializer 2025-09-24 09:32:46 -04:00
Mark Backman
c4762c1a92 Merge pull request #2627 from jessieweiyi/main
Support TwilioFrameSerializer region/edge settings. Close #2625.
2025-09-24 09:13:10 -04:00
Aleix Conchillo Flaqué
7f4d3a2f02 pyproject: updated sentry to 2.38.0 2025-09-23 19:12:03 -07:00
Jessie Wei
88614b312f Merge branch 'pipecat-ai:main' into main 2025-09-24 10:23:52 +10:00
Jessie Wei
5b4655f45a chore: Update per review comments 2025-09-24 00:22:56 +00:00
Aleix Conchillo Flaqué
d7c8f8df53 update CHANGELOG with AudioBufferProcessor fixes 2025-09-23 15:42:01 -07:00
Aleix Conchillo Flaqué
2571cb2e69 tests: fix formatting 2025-09-23 15:28:07 -07:00
Aleix Conchillo Flaqué
15782be27c Merge pull request #2676 from golbin/main
Fix audio buffer flush and silence handling
2025-09-23 15:27:31 -07:00
Aleix Conchillo Flaqué
997e4b66c6 Merge pull request #2722 from pipecat-ai/aleix/strands-agents-update-and-evals
examples: update Strands Agents with universal context and add evals
2025-09-23 15:21:41 -07:00
Paul Kompfner
6ccbfd9b57 Update "natural conversation" examples to use universal LLMContext 2025-09-23 16:20:16 -04:00
Paul Kompfner
677f69971c Add filters in 22b and 22c examples to prevent function call results triggering the "statement judge" LLM from running unnecessarily, and with the wrong system prompt, which would result in garbled output statements comprised of both LLMs outputs combined 2025-09-23 16:14:58 -04:00
Paul Kompfner
678dd22b8e Add missing sender argument to a few "on_app_message" handlers in examples 2025-09-23 15:29:20 -04:00
kompfner
0bba02028d Merge pull request #2721 from pipecat-ai/pk/update-persistent-storage-examples-to-use-universal-llmcontext
Update persistent conversation storage examples to use universal `LLM…
2025-09-23 15:18:21 -04:00
Aleix Conchillo Flaqué
620b1f785c examples: update Strands Agents with universal context and add evals 2025-09-23 11:37:57 -07:00
Paul Kompfner
780e91eb91 Update persistent conversation storage examples to use universal LLMContext.
Note that `LLMContext` doesn't have a `get_messages_for_persistent_storage()`; the messages are already in the "standard" format so they can be used directly for storage.
2025-09-23 14:16:35 -04:00
Aleix Conchillo Flaqué
667569ef47 Merge pull request #2708 from pipecat-ai/aleix/base-output-transport-only-push-if-send-successful
BaseOutputTransport: only push downstream if transport write successful
2025-09-23 11:10:29 -07:00
Aleix Conchillo Flaqué
17ea0afa6f StrandsAgentsProcessor: more formatting fixes 2025-09-23 11:05:14 -07:00
Aleix Conchillo Flaqué
3fc5214c15 BaseOutputTransport: only push downstream if transport write successful
Fixes #2589
2025-09-23 11:04:37 -07:00
kompfner
1636c48ab9 Merge pull request #2720 from pipecat-ai/pk/update-changelog-with-additional-llmcontext-support
Update CHANGELOG with additional `LLMContext` support
2025-09-23 13:46:20 -04:00
Paul Kompfner
c3a2fa100c Update CHANGELOG with additional LLMContext support 2025-09-23 13:45:49 -04:00
kompfner
8649368337 Merge pull request #2719 from pipecat-ai/pk/update-more-examples-to-use-universal-llmcontext
Update more examples to use universal `LLMContext`. Specifically, upd…
2025-09-23 13:44:20 -04:00
Aleix Conchillo Flaqué
781366627c updated CHANGELOG with Strands Agents 2025-09-23 10:41:35 -07:00
Aleix Conchillo Flaqué
f6b4db42ef pyproject: add strands-agents 2025-09-23 10:39:06 -07:00
Aleix Conchillo Flaqué
ed64716219 StrandsAgentsProcessor: fix formatting 2025-09-23 10:38:31 -07:00
Aleix Conchillo Flaqué
5c22b2e1de Merge pull request #2610 from adithyaxx/add-strands-processor
Add native Strands Agents support to Pipecat
2025-09-23 10:33:22 -07:00
Paul Kompfner
d4b1e1ab41 Update more examples to use universal LLMContext. Specifically, update examples we didn't update before because they weren't using ToolsSchema for their tool definitions, which is a requirement for using LLMContext.
NOTE: oops! Turns out some of these files had *already* been updated to use universal `LLMContext` even though they weren't yet using `ToolsSchema`. This commit should fix those examples.
2025-09-23 12:41:35 -04:00
Mark Backman
fafe0cc4a3 Merge pull request #2718 from lshaun/fix-openai-deprecation-warning
update imports to avoid deprecated module
2025-09-23 12:29:16 -04:00
Mark Backman
40c82a8530 Merge pull request #2716 from pipecat-ai/mb/add-11labs-stt
Add ElevenLabsSTTService
2025-09-23 12:21:08 -04:00
lshaun
98d3686861 update imports to avoid deprecated module 2025-09-23 15:58:09 +00:00
kompfner
88337fc21f Merge pull request #2717 from pipecat-ai/pk/mem0-support-univeral-context
Add support for universal `LLMContext` to `Mem0MemoryService`
2025-09-23 11:54:55 -04:00
kompfner
928c0ef1b4 Merge pull request #2715 from pipecat-ai/pk/langchain-processor-support-univeral-context
Add support for universal `LLMContext` to `LangchainProcessor`
2025-09-23 11:54:44 -04:00
kompfner
1f005e7075 Merge pull request #2714 from pipecat-ai/pk/gated-openai-llm-context-aggregator-support-univeral-context
Add support for universal `LLMContext` to `GatedOpenAILLMContextAggre…
2025-09-23 11:54:26 -04:00
kompfner
2cee6229ae Merge pull request #2713 from pipecat-ai/pk/log-observer-support-univeral-context
Add support for universal `LLMContext` to `LLMLogObserver`
2025-09-23 11:54:11 -04:00
Aleix Conchillo Flaqué
10e9371f49 Merge pull request #2712 from pipecat-ai/aleix/pipeline-runner-signals-windows
PipelineRunner: use signal.signal() on Windows
2025-09-23 08:28:26 -07:00
Paul Kompfner
e21ab89509 Add support for universal LLMContext to Mem0MemoryService 2025-09-23 10:35:54 -04:00
Mark Backman
cbce2075eb Add ElevenLabsSTTService 2025-09-23 10:27:31 -04:00
Paul Kompfner
97868175e6 Add support for universal LLMContext to LangchainProcessor 2025-09-23 10:00:48 -04:00
Paul Kompfner
99731ca40a Add support for universal LLMContext to GatedOpenAILLMContextAggregator, renaming it to GatedLLMContextAggregator in the process 2025-09-23 09:52:13 -04:00
Paul Kompfner
f96cbcce22 Add support for universal LLMContext to LLMLogObserver 2025-09-23 09:30:14 -04:00
kompfner
a19b9f70c0 Merge pull request #2706 from pipecat-ai/pk/update-examples-to-use-universal-llm-context
Update examples, wherever possible, to use `LLMContext` and associate…
2025-09-23 09:20:39 -04:00
Filipi da Silva Fuchter
9b4f1bdf39 Merge pull request #2705 from tzookb/tzookb/latency-logging
update UserBotLatencyLogObserver to have logging in functions that can be overidden
2025-09-23 09:46:20 -03:00
Filipi Fuchter
6b2bf8de64 Fixing the ruff format and making the methods sync. 2025-09-23 09:41:50 -03:00
Filipi da Silva Fuchter
33481c6614 Merge pull request #2672 from pipecat-ai/filipi/pcc_small_webrtc_2
Monitoring the peer connection while it is in the *connecting* state.
2025-09-23 08:32:16 -03:00
Filipi Fuchter
a5776b20ad Monitoring the peer connection while it is in the *connecting* state. 2025-09-23 08:30:17 -03:00
Filipi da Silva Fuchter
e286e015cf Merge pull request #2687 from pipecat-ai/memory_leak
Improving memory cleanup
2025-09-23 08:13:05 -03:00
Filipi Fuchter
a7bfac8d68 Mentioning the memory cleanups in the changelog. 2025-09-23 08:09:31 -03:00
Filipi Fuchter
1647b5b665 Created a new example using the video processor to make it easier to investigate memory leaks. 2025-09-23 08:05:28 -03:00
Filipi Fuchter
eaecefe675 Refactoring how we are reading the image bytes inside the base llm. 2025-09-23 08:05:15 -03:00
Filipi Fuchter
7c569b3863 Calling task_done when reading the audio from the queue. 2025-09-23 08:04:06 -03:00
Filipi Fuchter
8bf6a4c66f Improving memory cleanup inside the SmallWebRTCTransport. 2025-09-23 08:03:55 -03:00
Filipi Fuchter
1df3660186 Not storing anymore the last frames received to display them in the idle processor. 2025-09-23 08:03:35 -03:00
Aleix Conchillo Flaqué
75c0b089e0 PipelineRunner: use signal.signal() on Windows 2025-09-22 22:38:12 -07:00
Aleix Conchillo Flaqué
d8f3d4dd32 Merge pull request #1874 from nischalj10/patch-1
added deepwiki badge for weekly repo refresh
2025-09-22 17:44:08 -07:00
Aleix Conchillo Flaqué
c5e53bb84f Merge pull request #2707 from pipecat-ai/aleix/daily-transport-deprecated-mistake
DailyTransport: remove deprecated note and double registration
2025-09-22 17:01:09 -07:00
Aleix Conchillo Flaqué
b04e494373 DailyTransport: remove deprecated note and double registration 2025-09-22 16:45:58 -07:00
Jessie Wei
392293d55f Merge branch 'pipecat-ai:main' into main 2025-09-23 07:48:45 +10:00
Paul Kompfner
272532a3ea Update examples, wherever possible, to use LLMContext and associated machinery instead of OpenAILLMContext and associated machinery.
With all these examples updated, we no longer need dedicated examples illustrating `LLMContext`, so they're removed.

Here’s where we *don’t* yet use `LLMContext` and associated machinery:
- Realtime services: OpenAI Realtime, Gemini Live, and AWS Nova Sonic (support coming soon)
- `GoogleLLMOpenAIBetaService` (it’s deprecated, so we didn’t bother adding support)
- `LLMLogObserver` (support coming soon)
- `GatedOpenAILLMContextAggregator` (support coming soon)
- `LangchainProcessor` (support coming soon)
- `Mem0MemoryService` (support coming soon)
- Examples that use LLM-specific tools definitions as opposed to `ToolsSchema` (these will be updated soon)
- Examples that rely `GoogleLLMContext.upgrade_to_google` (TBD what to do with these)

Examples that use `LLMLogObserver`:
- 30-

Examples that use `GatedOpenAILLMContextAggregator`:
- 22-

Examples that use `LangchainProcessor`:
- 07b-

Examples that use `Mem0MemoryService`:
- 37-

Examples that need updating to use `ToolsSchema`:
- 15-
- 15a-
- 20a-
- 20c-
- 20d-
- 22b-
- 22c-
- 33-
- 36-

Examples that use `GoogleLLMContext.upgrade_to_google`:
- 22d-
- 25-
2025-09-22 16:21:35 -04:00
Tzook Bar Noy
3d04f565ec update latency observer logging to be in unique funcs, so others could extend and overwrite 2025-09-22 15:38:29 -04:00
Filipi da Silva Fuchter
d0477edb6a Merge pull request #2696 from pipecat-ai/filipi/inworld_default_temperature
Changing InworldTTSService default temperature to 1.1
2025-09-22 09:33:52 -03:00
Filipi Fuchter
326bfe4239 Removing the temperature from InworldTTSService example. 2025-09-22 09:30:53 -03:00
Aleix Conchillo Flaqué
3cb78d839d examples(foundational): update comment in 45-before-and-afet-events.py 2025-09-20 10:33:52 -07:00
Aleix Conchillo Flaqué
9129e44c05 Merge pull request #2697 from pipecat-ai/aleix/frame-processor-before-after-events
FrameProcessor: add before/after events for processed/pushed frames
2025-09-20 10:26:37 -07:00
Aleix Conchillo Flaqué
ec664e2d33 examples(foundational): added 45-before-and-afet-events.py 2025-09-20 10:23:49 -07:00
Aleix Conchillo Flaqué
3d88b42e0b FrameProcessor: add before/after events for processed/pushed frames 2025-09-19 20:47:21 -07:00
Aleix Conchillo Flaqué
2289409b4c Merge pull request #2699 from pipecat-ai/aleix/daily-on-before-leave
DailyTransport: rename on_before_leave to on_before_disconnect
2025-09-19 20:38:46 -07:00
Aleix Conchillo Flaqué
b1551b0d6b DailyTransport: rename on_before_leave to on_before_disconnect 2025-09-19 19:21:31 -07:00
Mark Backman
7cb5c951f4 Merge pull request #2689 from pipecat-ai/mb/foundational-smart-turn
Update quickstart and foundational examples to use smart-turn v3
2025-09-19 15:39:14 -07:00
mattie ruth backman
a2cb5ab8e1 Add Changelong entry 2025-09-19 17:46:29 -04:00
mattie ruth backman
cad3104d56 Fix _handle_send_text to use default options if not provided 2025-09-19 17:46:29 -04:00
mattie ruth backman
d8a2a917a2 Deprecate append-to-text handling in favor of new send-text with support for toggling skip-tts while handling the text 2025-09-19 17:46:29 -04:00
chadbailey59
3984cb58a2 cleared input and process events when pausing (#2698) 2025-09-19 16:44:43 -05:00
chadbailey59
9027a96a07 Add remote participant updates to DailyTransport (#2694)
* add remote participant updates to DailyTransport

* cleanup

* cleanup

* ruff cleanup again
2025-09-19 16:03:34 -05:00
Aleix Conchillo Flaqué
6abac3e3e5 Merge pull request #2673 from pipecat-ai/aleix/sync-event-handlers
introduce synchronous event handlers
2025-09-19 13:40:43 -07:00
Aleix Conchillo Flaqué
077b949bb2 BaseObject: run each handler for the same event in a separate task 2025-09-19 13:32:39 -07:00
Aleix Conchillo Flaqué
3a6c9786e8 LiveKitTransport: added synchronous before_disconnect event 2025-09-19 13:32:39 -07:00
Aleix Conchillo Flaqué
492da16cc9 DailyTransport: added synchronous on_before_disconnect event 2025-09-19 13:32:39 -07:00
Aleix Conchillo Flaqué
a698c4064b BaseObject: allow synchronous event handlers 2025-09-19 13:32:39 -07:00
Filipi Fuchter
9e098b5f79 Changing InworldTTSService default temperature to 1.1 2025-09-19 17:03:53 -03:00
Mark Backman
7df7395dd1 Merge pull request #2692 from pipecat-ai/mb/lazy-load-smallwebrtc-request
Lazy load SmallWebRTCRequest, SmallWebRTCRequestHandler in runner
2025-09-19 10:43:43 -07:00
Mark Backman
0885bc9cdf Lazy load SmallWebRTCRequest, SmallWebRTCRequestHandler in runner 2025-09-19 13:28:01 -04:00
vipyne
889dc19a27 mcp examples: check for env vars needed for examples 2025-09-19 12:09:50 -05:00
Mark Backman
4bc41466b7 Bump quickstart pipecat-ai version to require smart-turn v3, add local smart-turn dep, update quickstart 2025-09-19 08:03:06 -04:00
Mark Backman
9ab8ddee79 Update quickstart and foundational examples to use smart-turn v3 2025-09-18 23:54:18 -04:00
Aleix Conchillo Flaqué
0204f6a95d Merge pull request #2686 from pipecat-ai/aleix/silero-vad-v6
audio(vad): update Silero VAD model to v6
2025-09-18 20:31:10 -07:00
Mark Backman
b0bf653f04 Merge pull request #2679 from pipecat-ai/mb/gladia-remove-confidence
GladiaSTTService: deprecate confidence arg
2025-09-18 17:41:33 -07:00
Mark Backman
e8a676eb36 GladiaSTTService: deprecate confidence arg 2025-09-18 20:38:53 -04:00
Mark Backman
ca96eef1f3 Merge pull request #2680 from pipecat-ai/mb/dial-in-session-id
DailyTransport sip_call_transfer now automatically receives session_id
2025-09-18 17:36:51 -07:00
Mark Backman
8e1637d6c7 DailyTransport sip_call_transfer now automatically receives session_id 2025-09-18 20:34:14 -04:00
Filipi da Silva Fuchter
367200c0ad Merge pull request #2682 from pipecat-ai/filipi/smallwebrtc_leak
Smallwebrtc memory leak
2025-09-18 18:56:08 -03:00
Filipi Fuchter
766e1948a6 Mentioning the fix in the changelog. 2025-09-18 18:43:33 -03:00
Aleix Conchillo Flaqué
f369683b8b audio(vad): update Silero VAD model to v6 2025-09-18 14:06:37 -07:00
Aleix Conchillo Flaqué
461025d1cc Merge pull request #2684 from pipecat-ai/aleix/readme-whisker
README: add whisker debugger
2025-09-18 13:27:35 -07:00
Aleix Conchillo Flaqué
ac88706f38 README: add whisker debugger 2025-09-18 13:22:54 -07:00
Filipi Fuchter
93a89449b8 Adding warnings in case queue grows. 2025-09-18 16:43:57 -03:00
Filipi Fuchter
199bf72945 Preventing memory growth if we are not consuming the track. 2025-09-18 16:16:10 -03:00
Filipi Fuchter
d20e4125f6 Updating aiortc to the latest version. 2025-09-18 15:22:46 -03:00
Filipi Fuchter
c1baed642e Script to monitor memory usage. 2025-09-18 14:43:42 -03:00
Mark Backman
33ef68573f Merge pull request #2662 from pelguetat/fix-vertex-ai-global-location-support
feat: add support for global location in Vertex AI base URL
2025-09-18 10:25:10 -07:00
Pablo Elgueta
3c1b41df13 docs: add changelog entry for global location support
- Document the new global location support in GoogleVertexLLMService
- Explain the difference between regional and global API hosts
- Follow Keep a Changelog format
2025-09-18 17:39:03 +01:00
Jin Kim
58f70e7e0d Add tests for audio buffer processor flush alignment 2025-09-18 22:19:32 +09:00
kompfner
fca4ecc73c Merge pull request #2675 from pipecat-ai/pk/service-switcher-logic-simplification
Simplify a bit of logic in `ServiceSwitcher`
2025-09-18 09:17:22 -04:00
Jin Kim
d0b573e44f Fix audio buffer flush and silence handling 2025-09-18 19:40:45 +09:00
Paul Kompfner
cfa333508b Simplify a bit of logic in ServiceSwitcher 2025-09-17 21:03:38 -04:00
Mark Backman
9e7260393a Merge pull request #2671 from pipecat-ai/mb/fix-asyncai-ttstextframe
fix: AsyncAITTSService wasn't pushing TTSTextFrames
2025-09-17 14:06:41 -07:00
Mark Backman
073b585c52 fix: AsyncAITTSService wasn't pushing TTSTextFrames 2025-09-17 16:54:18 -04:00
Aleix Conchillo Flaqué
81c2e51bec Merge pull request #2669 from pipecat-ai/aleix/interruption-task-frame-wait-fixes
interruption task frame wait fixes
2025-09-17 13:47:57 -07:00
Aleix Conchillo Flaqué
42344125b1 tests: add unit tests for push_interruption_task_frame_and_wait() 2025-09-17 13:38:22 -07:00
Aleix Conchillo Flaqué
db5bcfaa51 FrameProcessor: fix push_interruption_task_frame_and_wait() 2025-09-17 13:38:21 -07:00
kompfner
615239b7d2 Merge pull request #2646 from pipecat-ai/pk/service-switcher-unit-tests
`ServiceSwitcher` unit tests (ended up being much more than that)
2025-09-17 16:30:18 -04:00
Paul Kompfner
27f1e9dd69 Update CHANGELOG with a description of the recently-fixed ServiceSwitcher bugs 2025-09-17 16:27:12 -04:00
Paul Kompfner
bd760deff2 Update comment with more detail for posterity 2025-09-17 16:19:31 -04:00
Paul Kompfner
8bc3c89140 Fix a bug preventing usage of multiple ServiceSwitchers in a pipeline 2025-09-17 16:09:18 -04:00
Paul Kompfner
2cd2567a37 Add a unit tests validating that multiple ServiceSwitchers can be used in the same pipeline (currently failing) 2025-09-17 16:04:30 -04:00
Paul Kompfner
5b55988846 Denote a couple of variables are private with a leading underscore 2025-09-17 15:38:28 -04:00
Paul Kompfner
a12392182c Simplify, undoing the change allowing controlling ServiceSwitcher with immediate frames (SystemFrames). Service switcher frames are ControlFrames, which are easier to reason about. We can always build the immediate option later if needed (i.e. if there's sufficient user pull for it) 2025-09-17 15:35:02 -04:00
Paul Kompfner
b814b70e1e Allow controlling ServiceSwitcher with either immediate frames (SystemFrames) or in-order frames (ControlFrames).
Immediate is the "default", i.e. has the more obvious name (e.g. `ManuallySwitchServiceFrame` v `ManuallySwitchServiceControlFrame`), since that's *probably* what users will want to reach for. Also, the immediate frames are more likely to behave like what we had before the last few commits, where the service switch would always "jump the queue" by having an immediate effect once it hit the `ServiceSwitcher` in the pipeline, jumping ahead of frames in front of it destined for the service.
2025-09-17 15:35:02 -04:00
Paul Kompfner
a1f84e1b50 Remove extraneous unit tests 2025-09-17 15:35:02 -04:00
Paul Kompfner
0839b48da8 Fix an issue where the upstream ServiceSwitcherFilter wouldn't get updated with the current active service 2025-09-17 15:35:02 -04:00
Paul Kompfner
de51637b77 Update ServiceSwitcher so that ServiceSwitcherFrames (which might update the currently active service) are processed and have an effect at the expected time. We should be able to, for example, queue:
- A text frame
- A `ManuallySwitchServiceFrame` (which is a `ServiceSwitcherFrame`)
- Another text frame

And expect that the first text frame be handled by the initially active service and the second text frame be handled by the newly active one.

Previously, the `ManuallySwitchServiceFrame` would have an effect too early, causing both text frames to be handled by the newly active service. Why? Because the frame filtering condition was being updated *directly* by the `ServiceSwitcher`, which is upstream from the services it's switching between. It could therefore update the filters *before* the services received the prior frames.
2025-09-17 15:35:02 -04:00
Paul Kompfner
e1b1dc16ec Add unit tests for ServiceSwitcher 2025-09-17 15:35:02 -04:00
Mark Backman
1fe27eb0a2 Merge pull request #2660 from pipecat-ai/mb/fix-user-idle-processor-cancel-task
fix: clean up how UserIdleProcessor handles return False
2025-09-16 14:48:59 -07:00
Mark Backman
d7e1389497 fix: clean up how UserIdleProcessor handles return False 2025-09-16 17:44:06 -04:00
Aleix Conchillo Flaqué
8c7230aa8f Merge pull request #2668 from pipecat-ai/aleix/livekit-update
livekit package update
2025-09-16 14:43:18 -07:00
Aleix Conchillo Flaqué
2cf71239b0 examples(01b): use TTSSpeakFrame instead of TextFrame 2025-09-16 17:18:45 -04:00
Aleix Conchillo Flaqué
ec2c62e32b pyproject: update to livekit 1.0.13
Fixes #2643
2025-09-16 17:18:44 -04:00
Mark Backman
38ce85e9a0 Merge pull request #2667 from zytegalaxy/mcp-serverparameters-typefix
fix: replace `Tuple` type with `TypeAlias` for server params in MCP client
2025-09-16 14:14:59 -07:00
Mark Backman
2279e5a899 Merge pull request #2663 from pipecat-ai/mb/websockets-15
Add support for websockets 15.0
2025-09-16 14:08:36 -07:00
Mark Backman
cce6eb5d87 Merge pull request #2666 from pipecat-ai/mb/update-38b-local-turn-model
38b: Update bundled ONNX smart-turn model
2025-09-16 14:05:12 -07:00
mehrdad
c2b98ae557 fix(lint): fix space format issue 2025-09-16 13:44:15 -07:00
Filipi da Silva Fuchter
727eb12b16 Merge pull request #2648 from pipecat-ai/filipi/pcc_small_webrtc
Creating SmallWebRTCRequestHandler for managing peer connections.
2025-09-16 16:37:04 -03:00
mehrdad
ba96bd05d3 fix: replace Tuple type with TypeAlias for server params in MCP client 2025-09-16 11:44:25 -07:00
Mark Backman
8ead309f8d 38b: Update bundled ONNX smart-turn model 2025-09-16 13:17:14 -04:00
Mark Backman
fad0e55c64 Add websockets-base optional dependency and use for DRY pyproject.toml 2025-09-16 11:24:38 -04:00
Mark Backman
74b1af56a0 Update uv.lock 2025-09-16 11:21:49 -04:00
Mark Backman
6924850ec4 Add support for websockets 15.0 2025-09-16 11:21:49 -04:00
marcus-daily
dfe7815dc5 Smart Turn v3: removing torch and torchaudio deps 2025-09-16 16:02:41 +01:00
Pablo Elgueta
69f0a75882 feat: add support for global location in Vertex AI base URL
- Update _get_base_url method to handle 'global' location case
- Use 'aiplatform.googleapis.com' for global locations
- Use '{location}-aiplatform.googleapis.com' for regional locations
- Maintains backward compatibility with existing regional endpoints
2025-09-16 10:28:22 -03:00
Mark Backman
cca90791c4 Merge pull request #2652 from pipecat-ai/mb/fix-audio-buffer-processor-has-audio
fix: AudioBufferProcessor has_audio returns based on user or bot audi…
2025-09-15 18:43:59 -07:00
Mark Backman
f2a5d408de fix: AudioBufferProcessor has_audio returns based on user or bot audio existing 2025-09-15 21:35:35 -04:00
Aleix Conchillo Flaqué
044c6eba46 Merge pull request #2655 from pipecat-ai/aleix/add-on-pipeline-finalized
PipelineTask: add on_pipeline_finished event
2025-09-15 15:32:04 -07:00
Aleix Conchillo Flaqué
db71089f5e PipelineTask: add on_pipeline_finished event
This deprecates `on_pipeline_stopped`, `on_pipeline_ended` and
`on_pipeline_cancelled`.
2025-09-15 15:28:33 -07:00
Aleix Conchillo Flaqué
f861f5066f Merge pull request #2654 from pipecat-ai/aleix/unify-on-client-disconnected
transports: on_client_disconnected only if remote client disconnects
2025-09-15 15:18:57 -07:00
kompfner
81cede2c60 Merge pull request #2653 from pipecat-ai/pk/llm-context-adapting-tests
`LLMContext`-adapting unit tests
2025-09-15 16:38:46 -04:00
kompfner
7603203230 Merge pull request #2644 from pipecat-ai/pk/run-inference-unit-tests
`run_inference` unit tests
2025-09-15 16:26:10 -04:00
Aleix Conchillo Flaqué
8569b61598 transports: on_client_disconnected only if remote client disconnects 2025-09-15 11:35:40 -07:00
Paul Kompfner
fe42187dc1 Implement LLMService.create_llm_specific_message() so that users don't need to just know what value of llm to provide to the LLMSpecificMessage constructor 2025-09-15 14:15:22 -04:00
Paul Kompfner
999e88c942 Add unit tests for AWSBedrockLLMAdapter.get_llm_invocation_params(), focusing on messages specifically 2025-09-15 12:08:21 -04:00
Paul Kompfner
c04df2f28b Add unit tests for AnthropicLLMAdapter.get_llm_invocation_params(), focusing on messages specifically 2025-09-15 11:55:48 -04:00
Paul Kompfner
100ef0ab5c Add unit tests for GeminiLLMAdapter.get_llm_invocation_params(), focusing on messages specifically 2025-09-15 11:38:23 -04:00
Paul Kompfner
42886d7105 Add unit tests for OpenAILLMAdapter.get_llm_invocation_params(), focusing on messages specifically. Also, fix a bug in OpenAILLMAdapter (found thanks to the unit tests) where it wasn't "unwrapping" LLMSpecificMessages. 2025-09-15 11:17:11 -04:00
Mark Backman
22cbba002a Merge pull request #2651 from pipecat-ai/mb/heygen-bot-speaking-frame
fix: push BotStartedSpeakingFrame in HeyGenVideoService
2025-09-15 08:02:25 -07:00
Filipi Fuchter
0a043154f2 Removing not used import. 2025-09-15 10:46:43 -03:00
Filipi Fuchter
5e322eba9e Supporting both single and multiple connection modes. 2025-09-15 10:43:46 -03:00
Filipi Fuchter
11d0c3d46d Refactoring SmallWebRTCRequestHandler. 2025-09-15 09:58:44 -03:00
Mark Backman
c873798ce5 fix: push BotStartedSpeakingFrame in HeyGenVideoService 2025-09-14 08:12:44 -04:00
Filipi Fuchter
95f72f6dce Creating SmallWebRTCRequestHandler for managing peer connections. 2025-09-12 18:15:24 -03:00
Aleix Conchillo Flaqué
d8cd28bb8b Merge pull request #2640 from pipecat-ai/aleix/pipecat-0.0.85
update CHANGELOG for 0.0.85
2025-09-12 11:06:41 -07:00
Aleix Conchillo Flaqué
c2df6c8aee update CHANGELOG for 0.0.85 2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué
82478be861 scripts(evals): add 19b-openai-realtime-text 2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué
0f2b7bc01b examples(foundational): fix 19b-openai-realtime-beta-text 2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué
1b2a5df017 Merge pull request #2622 from pipecat-ai/mb/call-data-runner
Add to, from phone info and custom data to the development runner
2025-09-12 10:28:17 -07:00
Mark Backman
2f496ac74f Add optional body parameter to WebsocketRunnerArguments 2025-09-12 11:28:12 -04:00
Mark Backman
22633a63b0 Update changelog 2025-09-12 11:15:03 -04:00
Mark Backman
e5ed0424e4 Remove to/from data from Plivo, as it will rely on body information 2025-09-12 11:10:03 -04:00
Paul Kompfner
786387722a Fix an issue in AWSBedrockLLMService.run_inference—exceptions should propagate, just like with other LLM services 2025-09-12 11:09:32 -04:00
Paul Kompfner
9f82c6b4a4 Add unit tests for run_inference 2025-09-12 11:07:11 -04:00
Mark Backman
99cfcb1d4e Parsed custom data from Plivo extraHeaders 2025-09-12 08:11:30 -04:00
Mark Backman
d595676436 Add custom data handling for Twilio 2025-09-12 08:11:30 -04:00
Aleix Conchillo Flaqué
0190812ee8 Merge pull request #2639 from pipecat-ai/aleix/min-words-interruption-unit-test
MinWordsInterruptionStrategy unit test
2025-09-11 18:52:39 -07:00
Aleix Conchillo Flaqué
2a24061bbb examples(07ad): remove deprecated user_continuous_stream 2025-09-11 18:50:00 -07:00
Aleix Conchillo Flaqué
89f7e7d199 update CHANGELOG with BaseOutputTransport fix 2025-09-11 16:58:44 -07:00
Aleix Conchillo Flaqué
384814e640 Merge pull request #2456 from a6kme/patch-1
Only set last_frame_time when handling OutputAudioRawFrame
2025-09-11 16:56:25 -07:00
Aleix Conchillo Flaqué
ab4364b833 update CHANGELOG and fix formatting 2025-09-11 15:34:47 -07:00
Aleix Conchillo Flaqué
fafdadad3c Merge pull request #2473 from TheNotary/adds-interim-transcription-frame-support
adds support to Azure STT for creating InterimTranscriptFrames
2025-09-11 15:33:38 -07:00
Aleix Conchillo Flaqué
05dc2fa916 updated CHANGELOG.md with GoogleTTSService updates 2025-09-11 14:36:21 -07:00
Aleix Conchillo Flaqué
0c30cc6ea6 Merge pull request #2547 from manishkjs/feat/google-tts-voice-cloning
feat: add voice cloning and speaking rate to GoogleTTSService
2025-09-11 14:32:21 -07:00
Aleix Conchillo Flaqué
c26d336e34 Merge pull request #2545 from pipecat-ai/aleix/aws-nova-sonic-pre-load-cue
AWSNovaSonicLLMService: pre-load audio cue in the constructor
2025-09-11 14:31:26 -07:00
Mark Backman
37b6198787 Merge pull request #2635 from pipecat-ai/mb/openai-tts-speed 2025-09-11 14:22:51 -07:00
kompfner
3c271da94c Merge pull request #2633 from pipecat-ai/pk/uv-readme-updates
Updating the README to reflect that:
2025-09-11 17:07:41 -04:00
kompfner
be28d3f93b Merge pull request #2637 from pipecat-ai/pk/llm-context-evals-and-bug-fix
`LLMContext` evals and bug fix
2025-09-11 17:00:07 -04:00
marcus-daily
d2f210e960 Bundle Smart Turn v3 with Pipecat 2025-09-11 21:37:16 +01:00
Aleix Conchillo Flaqué
57add41971 tests: add unit test for MinWordsInterruptionStrategy 2025-09-11 13:07:30 -07:00
Aleix Conchillo Flaqué
74b38b59d6 tests(utils): allow passing PipelineParams to run_test() 2025-09-11 13:02:21 -07:00
kompfner
dac58deffc Merge pull request #2636 from pipecat-ai/pk/uv-lock-update-for-smart-turn-v3
uv.lock update for Smart Turn v3
2025-09-11 14:35:36 -04:00
Paul Kompfner
aff11f5121 Fix missing import in llm_response_universal.py 2025-09-11 14:33:17 -04:00
Paul Kompfner
a4023d3915 Update evals to include examples that exercise the universal LLMContext 2025-09-11 14:32:56 -04:00
Paul Kompfner
d6543d244d uv.lock update for Smart Turn v3 2025-09-11 14:07:17 -04:00
Mark Backman
fafcd79870 OpenAITTSService: add speed arg 2025-09-11 13:53:52 -04:00
Paul Kompfner
6a717fbbd1 Updating the README to reflect that:
- various dependencies that previously didn't work with Python 3.13 now seem to
- ultravox isn't fully supported on macOS
2025-09-11 12:27:43 -04:00
Aleix Conchillo Flaqué
9b3f6927c2 Merge pull request #2621 from pipecat-ai/aleix/interruption-task-frame
interruption task frame
2025-09-11 09:22:35 -07:00
Aleix Conchillo Flaqué
0b21f8a6bd FrameProcessor: add push_interruption_task_frame_and_wait() 2025-09-11 09:19:44 -07:00
Aleix Conchillo Flaqué
8249b014f0 frames: BotInterruptionFrame is deprecated, use InterruptionTaskFrame 2025-09-11 09:01:54 -07:00
Aleix Conchillo Flaqué
9d9f10ae0e frames: StartInterruptionFrame is deprecated, use InterruptionFrame 2025-09-11 09:01:54 -07:00
Aleix Conchillo Flaqué
e27b23694d frames: add new TaskFrame
TaskFrame is a base class for other frames that are meant to be sent to the
pipeline task.
2025-09-11 09:01:52 -07:00
marcus-daily
66ce5fe6bd Ruff fixes 2025-09-11 16:04:56 +01:00
marcus-daily
a9b53dc800 Update inference session options 2025-09-11 16:04:56 +01:00
marcus-daily
818352a300 Formatting 2025-09-11 16:04:56 +01:00
marcus-daily
3e9fc7be19 Update onnxruntime version 2025-09-11 16:04:56 +01:00
marcus-daily
a2e76bcad8 Smart Turn V3 support 2025-09-11 16:04:56 +01:00
Mark Backman
8e8e42717b Add to and from phone information to the development runner 2025-09-11 10:44:21 -04:00
kompfner
b31322e38e Merge pull request #2619 from pipecat-ai/pk/aws-universal-context
Expand universal `LLMContext` support to AWS Bedrock
2025-09-11 09:33:08 -04:00
Jessie Wei
305108be9a [TwilioFrameSerializer]: Add parameter validation 2025-09-10 23:00:15 +00:00
Aleix Conchillo Flaqué
908325484d Merge pull request #2614 from pipecat-ai/aleix/readme-client-sdks-table
README: update clients' table
2025-09-10 10:21:18 -07:00
Mark Backman
dd6ff789c7 Merge pull request #2628 from pipecat-ai/mb/fix-13-push-frame
fix: 13 foundational examples now push frames from TranscriptionLogger
2025-09-10 09:13:04 -07:00
Mark Backman
f4938e0fad fix: 13 foundational examples now push frames from TranscriptionLogger 2025-09-10 10:40:10 -04:00
Jessie Wei
2e1f397d17 Support TwilioFrameSerializer region/edge settings to that the call is terminated successfully for regions other than default us1 region 2025-09-10 14:30:10 +00:00
James Hush
e8f60c7c6f Handle missing rawResponse in transcription messages (#2623)
* Handle missing rawResponse in transcription messages

- Use message.get('rawResponse', {}) to safely access rawResponse field
- Default is_final to False when rawResponse is missing
- Add proper type annotations for better code clarity
- Minor import formatting cleanup

This prevents KeyError crashes when transcription messages from Daily's API
don't include the rawResponse field in edge cases.

* docs: add changelog line
2025-09-10 15:03:23 +08:00
Paul Kompfner
fedb8a201f Update 12d example to use LLMContext, now that AWS Bedrock supports it 2025-09-09 16:24:13 -04:00
Paul Kompfner
8ccd220a60 Add universal LLMContext support to AWSBedrockLLMService.run_inference() 2025-09-09 16:00:32 -04:00
Paul Kompfner
fe79de8f27 When converting universal LLMContext messages to AWS Bedrock expected format, automatically update non-initial "system"-role messages to "user"-role messages, as we do in other non-OpenAI LLM services 2025-09-09 15:50:03 -04:00
Paul Kompfner
176573c342 Add to CHANGELOG AWS Bedrock's support for universal LLMContext 2025-09-09 15:31:56 -04:00
Paul Kompfner
75f9914f49 Add support for universal LLMContext to AWS Bedrock LLM service 2025-09-09 15:25:04 -04:00
Paul Kompfner
f4d6715e32 Add foundational example using AWS Bedrock with universal LLMContext 2025-09-09 10:49:51 -04:00
kompfner
38f6e33f97 Merge pull request #2598 from pipecat-ai/pk/deprecate-vision-image-raw-frame
Remove `VisionImageRawFrame`, which was previously being handled dire…
2025-09-08 17:13:28 -04:00
Paul Kompfner
1c3e4e34e5 Minor fix to AWS Bedrock console logging to handle image messages in the context 2025-09-08 17:10:11 -04:00
Paul Kompfner
623c660027 Remove debugging comment 2025-09-08 17:01:51 -04:00
Paul Kompfner
a3e65ab3b5 The VisionImageRawFrame removal and corresponding VisionImageFrameAggregator deprecation will now happen in version 0.0.85 2025-09-08 17:01:47 -04:00
Paul Kompfner
f3a4b416df Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.
Removing `VisionImageRawFrame` lets us simplify LLM services' logic, getting us closer to the idealized architecture where all they care about is handling context frames.

This change is in service of getting us closer to ready to deprecate usage of `OpenAILLMContext` and subclasses in favor of the universal `LLMContext`, at least for the traditional text-to-text LLMs.

Why remove `VisionImageRawFrame` rather than deprecate? It's "internal"—only created by `VisionImageFrameAggregator`—and never intended to be used directly by users (it would be difficult to use directly anyway).

Move the logic that was once in `VisionImageFrameAggregator` directly into the examples. Reasoning:
- If `UserImageRequester` is defined in the examples, it makes sense for `UserImageProcessor` to be too, as it’s the flip side of the same coin, so to speak
- The logic is now pretty trivial
- This kind of one-shot, history-less image-describing pipeline shouldn't be common at all; it's ok for it to live in examples rather than as a dedicated class
- In the short term, this enables us to create `LLMContext`s for services that support it and `OpenAILLMContext`s for services that don't yet (AWS)

This commit also adds missing translation from OpenAI-format image context messages to AWS format. Note that this isn't a wasted effort in the face of the upcoming migration to universal `LLMContext`—this work will be reused as it has to be implemented there too.
2025-09-08 17:00:08 -04:00
Aleix Conchillo Flaqué
aa471a4ef5 update CHANGELOG with LiveKitTransport updates 2025-09-08 13:53:21 -07:00
Aleix Conchillo Flaqué
d55133a44f Merge pull request #2604 from alexyzhou/feature/livekit_video_and_bug_fix
Feature: Add support for livekit video stream and minor bug fixes
2025-09-08 13:51:14 -07:00
Aleix Conchillo Flaqué
0f1cf81691 README: update clients' table 2025-09-08 12:08:32 -07:00
kompfner
ac4d335799 Merge pull request #2613 from pipecat-ai/pk/mistral-message-fixups
Apply additional fixups to context messages to meet Mistral-specific …
2025-09-08 13:59:54 -04:00
Paul Kompfner
e65385c151 Tweak the Mistral-specific context messages fixup logic to handle the (mostly academic) possibility of a "tool" message appearing at the end 2025-09-08 13:55:09 -04:00
Paul Kompfner
0bb7df7a6b Remove stray debugging message 2025-09-08 13:38:26 -04:00
Paul Kompfner
daee1ddf3b Apply additional fixups to context messages to meet Mistral-specific requirements 2025-09-08 11:26:58 -04:00
Adithya Suresh
d1f72c1c0b Added usage metrics 2025-09-08 14:44:25 +10:00
Aleix Conchillo Flaqué
1cccb97ccf Merge pull request #2608 from pipecat-ai/aleix/deprecate-noisereducefilter
audio(filters): deprecate NoisereduceFilter
2025-09-07 20:54:09 -07:00
Aleix Conchillo Flaqué
d7794abf21 audio(filters): deprecate NoisereduceFilter 2025-09-07 20:52:17 -07:00
Aleix Conchillo Flaqué
6a6a63a532 Merge pull request #2607 from pipecat-ai/aleix/scripts-evals-improve-eval-prompt
scripts(evals): allow user to talk and only eval when needed
2025-09-07 20:49:43 -07:00
Mark Backman
6edb6fed41 Merge pull request #2606 from pipecat-ai/mb/quickstart-lockfile
Remove uv.lock from quickstart
2025-09-07 06:10:14 -07:00
Mark Backman
a537382816 Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)
* Add OpenAI Realtime module

* Add foundational examples for OpenAI Realtime

* Add deprecation warning to OpenAIRealtimeBetaLLMService

* Add deprecation warning to AzureRealtimeBetaLLMService

* Update Changelog
2025-09-07 09:09:57 -04:00
Aleix Conchillo Flaqué
46deaada70 scripts(evals): allow user to talk and only eval when needed 2025-09-06 19:19:08 -07:00
TheNotary
7366b1aee0 adds missing InterimTranscriptionFrame import 2025-09-06 14:40:19 -05:00
Mark Backman
dbc52bc6b0 Remove uv.lock from quickstart 2025-09-06 11:13:50 -04:00
Alex Zhou
d6432589f6 fix: fix format and lint by ruff 2025-09-06 10:50:47 +08:00
Alex Zhou
13b73d4406 feat: Add support for pipecat video stream; fix the bug of duplicate participants when connecting; fix the bug of RTVI messages sent via livekit messages; 2025-09-06 10:41:33 +08:00
Aleix Conchillo Flaqué
85d8282f7e Merge pull request #2602 from pipecat-ai/aleix/pipecat-0.0.84
update CHANGELOG for 0.0.84
2025-09-05 19:35:26 -07:00
Aleix Conchillo Flaqué
070690ec64 update CHANGELOG for 0.0.84 2025-09-05 18:22:50 -07:00
Aleix Conchillo Flaqué
b9c96fd623 Merge pull request #2601 from pipecat-ai/aleix/daily-python-0.19.9
pyproject: update daily-python to 0.19.9
2025-09-05 18:21:49 -07:00
Aleix Conchillo Flaqué
f8b2ab6331 pyproject: update daily-python to 0.19.9 2025-09-05 18:14:57 -07:00
Mark Backman
ea3f7e3c34 Merge pull request #2600 from pipecat-ai/mb/livekit-dtmf
LiveKitTransport: Add support to send DTMF
2025-09-05 15:25:32 -07:00
Mark Backman
2f44f88b08 LiveKitTransport: Add support to send DTMF 2025-09-05 18:23:04 -04:00
Mark Backman
25747a001b Merge pull request #2599 from pipecat-ai/mb/fix-daily-dtmf
DTMF: Add support for native DTMF implementation where available
2025-09-05 15:20:05 -07:00
Mark Backman
fbe4338440 DTMF: Add support for native DTMF implementation where available 2025-09-05 18:16:56 -04:00
Filipi da Silva Fuchter
64b4c65728 Merge pull request #2595 from pipecat-ai/filipi/heygen_quality
Improving HeyGen example video quality.
2025-09-05 17:19:25 -03:00
kompfner
29442969a9 Merge pull request #2597 from pipecat-ai/pk/fix-anthropic-tool-less-usage
Fix Anthropic tool-less usage
2025-09-05 15:30:29 -04:00
Paul Kompfner
dc2e1d4ad3 Fix Anthropic tool-less usage 2025-09-05 11:47:31 -04:00
Filipi Fuchter
5477dfcbea Improving HeyGen example video quality. 2025-09-05 11:30:01 -03:00
kompfner
516f0e08ab Merge pull request #2590 from pipecat-ai/pk/gemini-multimodal-live-doesnt-support-llm-context
Raise an error when attempting to use Gemini Multimodal Live with uni…
2025-09-05 09:22:33 -04:00
Paul Kompfner
246f9f3325 Raise an error when attempting to use Gemini Multimodal Live with universal LLMContext. This is exactly the same error we already have for the other s2s models, AWS Nova Sonic and OpenAI Realtime, it was just missing from this service. 2025-09-04 16:47:08 -04:00
Manish Kumar
4699ee8d86 docs: add docstring for voice_cloning_key and update CHANGELOG 2025-09-04 22:45:51 +05:30
kompfner
3d850e8cc5 Merge pull request #2574 from pipecat-ai/pk/expand-universal-llm-context-support-to-anthropic
Expand universal `LLMContext` support to Anthropic
2025-09-04 13:09:44 -04:00
Paul Kompfner
6e734a37f9 Fix a bug in AWSBedrockLLMService.run_inference(); it was expecting the wrong format for the system instruction 2025-09-04 13:04:15 -04:00
Paul Kompfner
f72ca2fd7d Remove unnecessary system_instruction argument from run_inference() methods 2025-09-04 13:04:15 -04:00
Paul Kompfner
0826d72f74 Add deprecation warning for using enable_prompt_caching_beta param 2025-09-04 13:04:15 -04:00
Paul Kompfner
ba5ebfa0ec Fixed subtle CHANGELOG conflict after release of 0.0.83: universal LLMContext support for Anthropic didn't make that release. Also, some automatic Prettier fixes. 2025-09-04 13:04:11 -04:00
Paul Kompfner
dc3412b2df Bump a deprecation to 0.0.84, as 0.0.83 just shipped 2025-09-04 13:03:06 -04:00
Paul Kompfner
b2e9fd9341 Rename Anthropic enable_prompt_caching_beta parameter to just enable_prompt_caching 2025-09-04 13:03:06 -04:00
Paul Kompfner
c11b207c97 Add Anthropic to CHANGELOG list of services newly supporting runtime LLM switching 2025-09-04 13:03:06 -04:00
Paul Kompfner
d6205027cf Trivial cleanup 2025-09-04 13:03:06 -04:00
Paul Kompfner
986160c077 Fix a bug where the Anthropic adapter's merge-consecutive-messages-with-the-same-role logic was unintentionally affecting the source LLMContext's messages, resulting in more and more duplication of text with each inference 2025-09-04 13:03:06 -04:00
Paul Kompfner
b56ff86fee Minor refactor of AnthropicLLMAdapter cache-control-marker-adding logic (without really changing its behavior) 2025-09-04 13:03:06 -04:00
Paul Kompfner
5c574eaad9 Add support for universal LLMContext to Anthropic LLM service 2025-09-04 13:03:06 -04:00
Paul Kompfner
2df231143a Add foundational example using Anthropic with universal LLMContext 2025-09-04 13:03:06 -04:00
Aleix Conchillo Flaqué
e3597801d4 AWSNovaSonicLLMService: pre-load audio cue in the constructor 2025-09-04 09:31:39 -07:00
Aleix Conchillo Flaqué
65298ab792 update CHANGELOG with AWSBedrockLLMService fix 2025-09-04 09:24:55 -07:00
Aleix Conchillo Flaqué
b609b02614 Merge pull request #2568 from ezisezis/fix-bedrock-timeouts
fix timeout handling in AWSBedrockLLMService
2025-09-04 09:23:28 -07:00
Aleix Conchillo Flaqué
f2b50c14d2 Merge pull request #2573 from pipecat-ai/vp-minor-fixes-07s
example 07s: minor typo updates
2025-09-04 09:21:32 -07:00
Aleix Conchillo Flaqué
ee3b023986 update CHANGELOG with OpenAIImageGenService fix 2025-09-04 09:20:02 -07:00
Aleix Conchillo Flaqué
0d9e1190d7 Merge pull request #2583 from sassanh/main
fix: openai image generator now initiates URLImageRawFrame with correct order of arguments
2025-09-04 09:17:51 -07:00
Mark Backman
595a7c7fbe Merge pull request #2587 from pipecat-ai/mb/update-quickstart-0.0.83
Update quickstart pyproject to use 0.0.83
2025-09-04 07:42:56 -07:00
Mark Backman
586586f743 Update quickstart pyproject to use 0.0.83 2025-09-04 10:36:58 -04:00
Mark Backman
a1c6ad539d Merge pull request #2585 from ashotbagh/feat/asyncai-multilingual-support
feat(asyncai): add multilingual TTS support
2025-09-04 05:03:45 -07:00
Ashot
daf7fed8b3 feat(asyncai): add multilingual TTS support 2025-09-04 13:58:50 +04:00
Adithya Suresh
446d99d194 Bug fixes 2025-09-04 17:08:16 +10:00
Adithya Suresh
cbdbdee4c0 Initial StrandsAgentsProcessor implementation 2025-09-04 16:58:58 +10:00
Sassan Haradji
a26647c433 fix: openai image generator now initiates URLImageRawFrame with correct order of arguments 2025-09-04 06:09:57 +03:30
Aleix Conchillo Flaqué
0fab56fc13 Merge pull request #2577 from pipecat-ai/aleix/pipecat-0.0.83
update CHANGELOG for 0.0.83
2025-09-03 16:49:24 -07:00
Aleix Conchillo Flaqué
f0baff94b2 update CHANGELOG for 0.0.83 2025-09-03 16:47:43 -07:00
Aleix Conchillo Flaqué
d146170fd6 Merge pull request #2580 from pipecat-ai/mb/add-cerebras-evals
Add 14k (CerebrasLLMService) to release evals
2025-09-03 15:08:28 -07:00
Filipi da Silva Fuchter
001a2d36e5 Merge pull request #2579 from pipecat-ai/filipi/input_message
Creating InputTransportMessageUrgentFrame
2025-09-03 19:01:07 -03:00
Filipi Fuchter
99e237b1e2 Fixed an issue where messages received from the transport were always being resent. 2025-09-03 18:58:34 -03:00
Aleix Conchillo Flaqué
978f644f19 Merge pull request #2578 from pipecat-ai/aleix/user-speaking-frame
add UserSpeakingFrame and UserStartedSpeakingFrame/UserStopeedSpeakingFrame updates
2025-09-03 14:55:45 -07:00
Aleix Conchillo Flaqué
5a4c6b9618 BaseInputTransport: push UserStartedSpeakingFrame/UserStoppedSpeakingFrame upstream 2025-09-03 14:32:32 -07:00
Mark Backman
977a57c8fb Add 14k (CerebrasLLMService) to release evals 2025-09-03 17:11:38 -04:00
Mark Backman
c64bc5a636 Merge pull request #2576 from joyceerhl/joyce/cerebras-default
fix: update default Cerebras model to GPT-OSS-120B
2025-09-03 14:10:28 -07:00
Joyce Er
eba006d39c Fix nits 2025-09-03 14:07:49 -07:00
Joyce Er
a001f6f193 Switch to GPT-OSS-120B 2025-09-03 14:00:27 -07:00
Aleix Conchillo Flaqué
09d6ec1098 introduce and push UserSpeakingFrame upstream/downstream 2025-09-03 13:56:01 -07:00
Aleix Conchillo Flaqué
f56be9315a Merge pull request #2572 from pipecat-ai/aleix/deepgram-disconnect-task
ParallePipeline: wait for CancelFrame in all branches
2025-09-03 13:10:55 -07:00
Mark Backman
8e5880b2e7 Merge pull request #2575 from pipecat-ai/mb/fix-foundational-frame-direction
fix: Specify frame direction in 06a push_frame
2025-09-03 12:46:10 -07:00
Joyce Er
d8ac6f2c1a fix: update default Cerebras model to Qwen 3 32B 2025-09-03 12:23:36 -07:00
Mark Backman
052ffe8712 fix: Specify frame direction in 06a push_frame 2025-09-03 15:07:05 -04:00
Aleix Conchillo Flaqué
b52296450c DeepgramSTTService: remove raising CancelledError 2025-09-03 11:24:02 -07:00
Aleix Conchillo Flaqué
c71cec04d3 ParallelPipeline: wait for CancelFrame in all branches 2025-09-03 11:23:37 -07:00
vipyne
83f64ecd3b example 07s: minor typo updates 2025-09-03 12:11:07 -05:00
Aleix Conchillo Flaqué
d19170d8b1 Merge pull request #2565 from pipecat-ai/aleix/reorganize-transports
transports: reorganize module
2025-09-03 08:52:49 -07:00
Mark Backman
8b95d74193 Merge pull request #2571 from pipecat-ai/mb/fix-docs-0.0.83
Fix docs generation before 0.0.83 release
2025-09-03 08:52:20 -07:00
Mark Backman
3c4694a8f1 Fix docs generation before 0.0.83 release 2025-09-03 11:31:14 -04:00
kompfner
b9748b1228 Merge pull request #2563 from pipecat-ai/pk/expand-universal-llm-context-support-to-more-llms
Expand universal `LLMContext` support to more LLMs
2025-09-03 11:20:26 -04:00
Paul Kompfner
def1cf1548 Update CHANGELOG with entry about expanded support among LLM services for new universal LLMContext 2025-09-03 09:07:57 -04:00
Paul Kompfner
9b216116f1 Remove supports_universal_context gate from OpenAI-library-based LLM services. It's no longer needed, as they all now support universal LLMContext. 2025-09-03 09:07:18 -04:00
Paul Kompfner
7cb372ebb9 Add support for universal LLMContext to Together.ai LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
6838bc1e51 Add a few missing LLMContext type hints 2025-09-03 09:07:18 -04:00
Paul Kompfner
e04f42167e Add support for universal LLMContext to SambaNova LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
91a3f63e28 Add support for universal LLMContext to Qwen LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b24eb76559 Add QWEN_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Paul Kompfner
d9ea02595b Add support for universal LLMContext to Ollama LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
5bc0e49baa Add support for universal LLMContext to Perplexity LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
ec138b97d9 Add support for universal LLMContext to OpenRouter LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
0c32cc29a7 Add support for universal LLMContext to OpenPipe LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
d740bab99e Add support for universal LLMContext to NVIDIA NIM LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
ac62183eb6 Add NVIDIA_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Paul Kompfner
34f823bcac Add support for universal LLMContext to Groq LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b4e1051066 Add support for universal LLMContext to Grok LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
d8882bc381 Add support for universal LLMContext to Google Vertex AI LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
da18d0a562 Add support for universal LLMContext to Fireworks AI LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
f8e13a82cf Fix Fireworks AI function calling example 2025-09-03 09:07:18 -04:00
Paul Kompfner
2b00d37e94 Add support for universal LLMContext to DeepSeek LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
2dbd17da4d Fix Cerebras function calling example 2025-09-03 09:07:18 -04:00
Paul Kompfner
d45fbd5455 Add support for universal LLMContext to Cerebras LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b22bdff6d0 Add support for universal LLMContext to Azure LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
2b286365e0 Add MISTRAL_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Eduards Klavins
0a3e98857e fix timeout handling in AWSBedrockLLMService 2025-09-03 11:52:30 +03:00
Aleix Conchillo Flaqué
aeb9f1ffca transports: reorganize module 2025-09-02 17:31:39 -07:00
Filipi da Silva Fuchter
7f1100bd4c Merge pull request #2513 from pipecat-ai/filipi/whatsapp
Support for the new WhatsApp Cloud API
2025-09-02 18:06:59 -03:00
Filipi Fuchter
8fbd9b5af7 Added support for WhatsApp User-initiated Calls. 2025-09-02 18:05:00 -03:00
Filipi Fuchter
49c1f0bd08 Fixed SmallWebRTCTransport to not use mid to decide if the transceiver should be sendrecv or not. 2025-09-02 18:04:51 -03:00
Aleix Conchillo Flaqué
ce7a0512f9 Merge pull request #2562 from pipecat-ai/aleix/ai-coustics-speech-enhancement
add ai-coustics speech enhancement filter
2025-09-02 13:13:28 -07:00
Aleix Conchillo Flaqué
fdcd14dd21 updated CHANGELOG with AICFilter and fix deprecations 2025-09-02 13:10:10 -07:00
Mark Backman
0386599163 Added Daily SIP room creation utility (#2560)
* Added Daily SIP room creation utility to configure() function

* Add sip_codecs to the DailyRoomSipParams
2025-09-02 14:12:04 -04:00
Corvin Jaedicke
c1ce3d7d2b bumped aic-sdk version to v1.0.1 with minor changes 2025-09-02 11:11:29 -07:00
Corvin Jaedicke
8ecece2d9c Add AIC SDK audio filter 2025-09-02 11:11:29 -07:00
Filipi da Silva Fuchter
0d8ab7abca Merge pull request #2552 from pipecat-ai/filipi/freeze_issues
Fixed an issue where the pipeline could freeze.
2025-09-02 14:43:08 -03:00
Filipi Fuchter
dea7c22020 Fixed an issue where the pipeline could freeze. 2025-09-02 13:58:41 -03:00
Mark Backman
cfe11267f4 Merge pull request #2546 from pipecat-ai/mb/update-changelog-mem0
Add mem0 changelog entry
2025-09-02 07:22:52 -07:00
Mark Backman
d0c97d3602 Add mem0 changelog entry 2025-09-02 10:03:17 -04:00
Mark Backman
37e1551abc Merge pull request #2555 from pipecat-ai/mb/update-uv-lock-quickstart 2025-09-02 04:19:15 -07:00
Mark Backman
e1477e79f0 Merge pull request #2538 from rimelabs/rime-flush-audio-update
Use Rime’s official {"operation": "flush"} command in flush_audio() for proper text buffer flushing
2025-09-01 18:01:20 -07:00
Mark Backman
547b126d98 Update required pipecat-ai version for quickstart 2025-09-01 20:52:44 -04:00
Mark Backman
447e3b28eb Update uv.lock for quickstart 2025-09-01 20:52:12 -04:00
gokuljs
472efa2971 ruff format fix 2025-09-02 04:25:28 +05:30
Mark Backman
64486ef50b Merge pull request #2536 from gladiaio/PLA-38-missing-config-parameters
Gladia - add missing config parameters
2025-09-01 12:42:10 -07:00
gokuljs
5f801743d0 Add changelog entry for RimeTTSService flush_audio API update 2025-09-01 22:16:02 +05:30
Fabrice Lamant
802c5d04f4 update changelog 2025-09-01 10:21:11 +02:00
Aleix Conchillo Flaqué
83b90da53a Merge pull request #2537 from pipecat-ai/aleix/pipeline-task-cleanup-observers
PipelineTask: cleanup observers
2025-08-31 13:44:38 -07:00
Aleix Conchillo Flaqué
1f49de5cdf Merge pull request #2542 from pipecat-ai/aleix/remove-stop-interruption-frame
frames: remove StopInterruptionFrame
2025-08-31 13:44:22 -07:00
Manish Kumar
2ee481d541 feat: add voice cloning and speaking rate to GoogleTTSService 2025-08-30 23:04:59 +05:30
Mark Backman
7cf099eae7 Merge pull request #2541 from parshvadaftari/user/parshva/update_mem0
Update mem0 integration
2025-08-30 05:11:31 -07:00
Mark Backman
93a8ea3cb2 Merge pull request #2543 from pipecat-ai/mb/docs-extensions
Add Extensions to ref docs generation
2025-08-30 04:20:03 -07:00
Aleix Conchillo Flaqué
776aafddfb Merge pull request #2534 from pipecat-ai/aleix/pyright-1.1.404
pyproject: update pyright and ruff
2025-08-29 19:55:54 -07:00
Mark Backman
d56762262a Fix docs build errors 2025-08-29 20:24:35 -04:00
Mark Backman
bbcf35d657 Add Extensions to reference docs generation 2025-08-29 20:17:34 -04:00
Mark Backman
972546b24f Add IVR navigation (#2529) 2025-08-29 20:08:17 -04:00
Aleix Conchillo Flaqué
8b351f5bec pyproject: update pyright and ruff 2025-08-29 17:02:13 -07:00
Aleix Conchillo Flaqué
bd7d9346b7 frames: remove StopInterruptionFrame 2025-08-29 16:40:01 -07:00
Aleix Conchillo Flaqué
81325be4f3 Merge pull request #2540 from pipecat-ai/aleix/dtmf-tones-slower
audio(dtmf): use longer tones and longer gaps
2025-08-29 15:15:01 -07:00
Aleix Conchillo Flaqué
399f8de6ef audio(dtmf): use longer tones and longer gaps 2025-08-29 15:10:20 -07:00
parshvadaftari
60c070e077 update mem0 integration for reduced latency and better performance 2025-08-30 02:27:36 +05:30
gokuljs
e3f2faabf7 Merge branch 'main' of github.com:rimelabs/pipecat into rime-flush-audio-update 2025-08-30 01:18:50 +05:30
Aleix Conchillo Flaqué
b5a644dd6f PipelineTask: cleanup observers 2025-08-29 10:54:36 -07:00
gokuljs
e06bd6049e update flush operation in flush audio function under rime tts 2025-08-29 21:27:14 +05:30
Fabrice Lamant
25b595e125 add suggestions 2025-08-29 14:51:20 +02:00
Fabrice Lamant
edc8cc1e69 remove sample_rate from GladiaInputParams 2025-08-29 14:00:00 +02:00
Fabrice Lamant
633dd69dee feat: add logging for pipecat version and session url 2025-08-29 13:47:16 +02:00
Fabrice Lamant
1a1d5a1081 feat: add missing config params 2025-08-29 13:46:44 +02:00
Aleix Conchillo Flaqué
c1b8d2acab Merge pull request #2532 from pipecat-ai/aleix/universal-dtmf-support
Universal DTMF support
2025-08-28 21:04:13 -07:00
Aleix Conchillo Flaqué
ea368e4c5f scripts(dtmf): added generate_dtmf.sh to generate DTMF wav files 2025-08-28 21:01:41 -07:00
Aleix Conchillo Flaqué
f03deb6ecc DailyTransport: remove send_dtmf() and write_dtmf() 2025-08-28 21:01:41 -07:00
Aleix Conchillo Flaqué
0e01ac8ef6 BaseOutputTransport: implement generic write_dtmf() 2025-08-28 21:01:41 -07:00
Aleix Conchillo Flaqué
5787743ab3 audio(dtmf): added DTMF audio files and load_dtmf_audio() 2025-08-28 21:01:41 -07:00
Aleix Conchillo Flaqué
79be0695dd make sure warnings are always displayed 2025-08-28 17:43:29 -07:00
Aleix Conchillo Flaqué
a5c5e069ba move pipecat.frames.frames.KeypadEntry to pipecat.audio.dtmf.types.KeypadEntry 2025-08-28 17:43:29 -07:00
Aleix Conchillo Flaqué
77c34076f7 Merge pull request #2531 from pipecat-ai/aleix/pipecat-0.0.82
update CHANGELOG for 0.0.82
2025-08-28 13:04:41 -07:00
Aleix Conchillo Flaqué
d67cece356 update CHANGELOG for 0.0.82 2025-08-28 13:02:47 -07:00
Aleix Conchillo Flaqué
275c8b59c5 MistralLLMService: fix build_chat_completion_params() 2025-08-28 12:04:14 -07:00
Aleix Conchillo Flaqué
5ebcea2a3b scripts(eval): change "result" function call parameter 2025-08-28 11:38:59 -07:00
Aleix Conchillo Flaqué
64f2135ddc examples(14f): use default models 2025-08-28 11:38:59 -07:00
kompfner
a74231f036 Merge pull request #2515 from pipecat-ai/pk/llm-run-frame
Add `LLMRunFrame` to trigger an LLM response, replacing `context_aggr…
2025-08-28 10:01:00 -04:00
Paul Kompfner
189749b579 Add LLMRunFrame to trigger an LLM response, replacing context_aggregator.user().get_context_frame() 2025-08-28 09:53:33 -04:00
Aleix Conchillo Flaqué
e384ca949e Merge pull request #2512 from pipecat-ai/aleix/textframe-skip-tts
TextFrame: add skip_tts field
2025-08-27 16:26:03 -07:00
Aleix Conchillo Flaqué
eb248fedc1 add skip_tts to LLMFullResponseStartFrame/LLMFullResponseEndFrame 2025-08-27 16:23:27 -07:00
Aleix Conchillo Flaqué
16f57be72c LLMConfigureOutputFrame: allow configuring LLM output 2025-08-27 16:23:27 -07:00
Aleix Conchillo Flaqué
5803936838 TextFrame: add skip_tts field
This lets a text frame bypass TTS while still being included in the LLM
context. Useful for cases like structured text that isn’t meant to be spoken but
should still contribute to context.
2025-08-27 16:23:27 -07:00
Aleix Conchillo Flaqué
d9837dd1e5 Merge pull request #2527 from pipecat-ai/aleix/daily-python-0.19.8
pyproject: update daily-python to 0.19.8
2025-08-27 16:22:49 -07:00
Aleix Conchillo Flaqué
e48c9fc3e2 pyproject: update daily-python to 0.19.8 2025-08-27 16:00:36 -07:00
Aleix Conchillo Flaqué
3c4454a33e Merge pull request #2526 from pipecat-ai/aleix/pipeline-task-wait-for-startframe
PipelineTask: wait for StartFrame to reach end of pipeline
2025-08-27 15:57:10 -07:00
Aleix Conchillo Flaqué
2a0780e6ef PipelineTask: wait for StartFrame to reach end of pipeline
Fixes #2498
2025-08-27 14:23:09 -07:00
Aleix Conchillo Flaqué
5e121346fb Merge pull request #2516 from pipecat-ai/aleix/rtvi-client-version-check
RTVIProcessor: make check sure client version is set
2025-08-27 14:02:14 -07:00
Aleix Conchillo Flaqué
2bdca8d22c RTVIProcessor: make check sure client version is set 2025-08-27 13:36:11 -07:00
Aleix Conchillo Flaqué
1f5888bcf7 Merge pull request #2517 from pipecat-ai/aleix/unify-get-messages-for-logging
unify get_messages_for_logging()
2025-08-27 12:49:36 -07:00
Mark Backman
3d09f9a2af Merge pull request #2524 from pipecat-ai/mb/cartesia-speed
Cartesia: update speed InputParam
2025-08-27 12:47:29 -07:00
Aleix Conchillo Flaqué
cd3563bb16 unify get_messages_for_logging()
Some implementations were returing a list and some were returning a JSON
string. They should all return a list and the user would decide if it wants to
transform that into JSON.
2025-08-27 12:45:24 -07:00
Aleix Conchillo Flaqué
3e79ef4118 Merge pull request #2525 from pipecat-ai/aleix/daily-fix-send-dtmf
DailyTransport: fix sending DTMF tones
2025-08-27 12:44:27 -07:00
Aleix Conchillo Flaqué
2613da1a1f PipelineTask: increase CANCEL_TIMEOUT_SECS to 20 2025-08-27 11:50:48 -07:00
Aleix Conchillo Flaqué
41d40f9a11 DailyTransport: make sure we have a client before joining/leaving 2025-08-27 11:50:48 -07:00
Aleix Conchillo Flaqué
74af2b6aa4 DailyTransport: fix sending DTMF tones 2025-08-27 11:50:48 -07:00
Mark Backman
f7d9f32b0f Cartesia: update speed InputParam 2025-08-27 13:34:28 -04:00
Mark Backman
6074af60ef Merge pull request #2521 from pipecat-ai/mb/update-quickstart-pcc-docker
Update quickstart to use pcc docker command
2025-08-27 08:13:31 -07:00
Mark Backman
7ef6893c0d Merge pull request #2523 from sam-s10s/fix/connection-none
Speechmatics TTS connection issue
2025-08-27 08:09:46 -07:00
Sam Sykes
cc5557e051 changelog 2025-08-27 16:07:31 +01:00
Sam Sykes
06f7a92c99 fix to finally statement 2025-08-27 14:43:07 +01:00
Mark Backman
61a333ccae Update quickstart to use pcc docker command 2025-08-26 21:29:13 -04:00
Mark Backman
fc3d84dff7 Merge pull request #2501 from pipecat-ai/mb/aws-tts-more-flexible-auth
Support additional authentication mechanisms for AWS services
2025-08-26 18:05:37 -07:00
Mark Backman
86a37d8cea Add changelog entry for SentryMetrics missing import fix 2025-08-26 21:00:16 -04:00
Mark Backman
3f66acf9f1 Merge pull request #2520 from geluso/bugfix-missing-asyncio-import
add missing import asyncio
2025-08-26 17:59:25 -07:00
Mark Backman
facfaa2dd4 AWSBedrockLLMService: Allow setting auth credentials via env vars 2025-08-26 20:59:12 -04:00
Mark Backman
8250c381d1 AWSPollyTTSService: allow setting auth credentials through provider chain 2025-08-26 20:58:02 -04:00
Steve Geluso
32f9e48865 add missing import asyncio 2025-08-26 17:40:11 -07:00
Filipi Fuchter
76eef837b6 Removing watchdog from SarvamTTSService. 2025-08-26 18:44:58 -03:00
Filipi Fuchter
c9aaa463b7 Mentioning the recent SarvamTTSService changes in the changelog. 2025-08-26 18:44:58 -03:00
pratham-sarvam
6d582e41b7 Added Sarvam TTS Websocket Implementation (#2356)
* Added Sarvam TTS Websocket Implementation

* Addressed some of the comments on PR

* added change voice logic

* added changes from main

* pushing text frames and added flush audio

* updated docs string for better docs

* Addressed comments and added some improvements

* pushed optional args down

* removed new line

* made aiohttp session mandatory in http service

* added push frame and removed unused function

* removed pong message

* added disconnecting logic

---------

Co-authored-by: vinayak-sarvam <vinayak@sarvam.ai>
2025-08-26 18:10:26 -03:00
kompfner
ca29f62bff Merge pull request #2510 from pipecat-ai/pk/fix-set-tools-types
Update types for tools in `LLMSetToolsFrame` and `LLMContextAggregato…
2025-08-26 14:12:21 -04:00
Aleix Conchillo Flaqué
0dced68c3c Merge pull request #2511 from pipecat-ai/aleix/end-of-pipline-warning
PipelineTask: warn if CancelFrame doesn't reach the end
2025-08-26 11:02:26 -07:00
Aleix Conchillo Flaqué
8ab81d289a PipelineTask: warn if CancelFrame doesn't reach the end 2025-08-26 10:36:33 -07:00
Paul Kompfner
f457d00760 Update types for tools in LLMSetToolsFrame and LLMContextAggregator.set_tools(), for two reason:
1. `ToolsSchema` has been supported in `LLMSetToolsFrame` for a while but wasn't properly reflected in these type hints
2. The new universal `LLMContext` expects tools to be either a `ToolsSchema` or `NOT_GIVEN`.
2025-08-26 11:32:21 -04:00
kompfner
f5118c4412 Merge pull request #2440 from pipecat-ai/pk/prototype-llm-failover-attempt-4
Support for runtime LLM switching
2025-08-26 09:55:03 -04:00
Paul Kompfner
a79fe40162 Fix a typo in the CHANGELOG 2025-08-26 09:51:48 -04:00
Paul Kompfner
dcb4949e20 Move ServiceSwitcherFrame and ManuallySwitchServiceFrame to frames.py 2025-08-26 09:47:37 -04:00
Paul Kompfner
8b543e558d Add CHANGELOG entry describing LLMService.run_inference() 2025-08-26 09:47:32 -04:00
Paul Kompfner
8181962236 Add CHANGELOG entry describing LLM switcher 2025-08-26 09:46:51 -04:00
Paul Kompfner
98dc891640 Move CHANGELOG log entry from 0.0.81 to Unreleased 2025-08-26 09:45:49 -04:00
Paul Kompfner
71de0da570 ServiceSwitchers are now controlled using frames rather than with direct method calls 2025-08-26 09:44:15 -04:00
Paul Kompfner
b40c8bb81d Refactor LLMSwitcher into a base ServiceSwitcher and an LLMSwitcher that subclasses it 2025-08-26 09:44:15 -04:00
Paul Kompfner
43f1b59b86 Convert LLM generate_summary() methods to the more generic run_inference() 2025-08-26 09:44:15 -04:00
Paul Kompfner
a0a2bb3aa4 In GeminiLLMAdapter, when translating from the universal LLMContext format, only pull out the first "system" message as the system instruction, and convert subsequent ones into "user" messages. This is a more correct thing to do than simply drop subsequent "system" messages, especially when potentially sharing a context between multiple LLMs. 2025-08-26 09:44:15 -04:00
Paul Kompfner
04a50df3d5 Add LLMSwitcher, with LLMSwitcherStrategyManual as the first supported switching strategy 2025-08-26 09:44:15 -04:00
Paul Kompfner
8c0edffaff Fix bug in AWS Bedrock conversation summarization. It was using an out-of-date pattern (the _client property no longer exists) 2025-08-26 09:44:15 -04:00
Paul Kompfner
fe6063fdbe Introduce an affordance to LLMService for generating a summary of a conversation directly (i.e. without going through the pipeline).
This abstraction will allow us to update Pipecat Flows to avoid reaching into LLM service internals to generate summaries.

In addition to being a helpful refactor to remove a fragile part of Pipecat Flows, this change helps set the stage for supporting the upcoming `LLMSwitcher`, where the “active” LLM will only be determined at runtime—today, Pipecat Flows needs to know ahead of time what type of LLM it’s working with, to load an LLM-specific “adapter” that does the work of generating summaries, among other things.
2025-08-26 09:44:15 -04:00
Paul Kompfner
195146adb2 Bump deprecation warning version, as this commit is not expected to ship until version 0.0.82. 2025-08-26 09:44:15 -04:00
Paul Kompfner
cab9e18cc9 Port recent change to LLMAssistantContextAggregator to universal LLMAssistantAggregator 2025-08-26 09:44:15 -04:00
Paul Kompfner
baef688e4e Port recent changes to LLMUserContextAggregator to universal LLMUserAggregator 2025-08-26 09:44:15 -04:00
Paul Kompfner
f1f43fe500 After a rebase, rename foundational examples showing usage of universal context to avoid naming conflict with a recently-added example. 2025-08-26 09:44:15 -04:00
Paul Kompfner
73b63f8d35 Remove unnecessary import 2025-08-26 09:44:15 -04:00
Paul Kompfner
0c14b33e92 Deprecate GoogleLLMOpenAIBetaService 2025-08-26 09:44:15 -04:00
Paul Kompfner
09beaccaf0 Assorted minor improvements after code review 2025-08-26 09:44:15 -04:00
Paul Kompfner
40557a1aae Remove TODO comment 2025-08-26 09:44:15 -04:00
Paul Kompfner
ecc4cc4a79 Add support for universal LLMContext to RTVIObserver 2025-08-26 09:44:15 -04:00
Paul Kompfner
37be8805f4 ruff 2025-08-26 09:44:15 -04:00
Paul Kompfner
93c7e64995 Add missing PERPLEXITY_API_KEY in env.example 2025-08-26 09:44:15 -04:00
Paul Kompfner
9de2bd61a9 Add supports_universal_context for OpenAILLMService subclasses so that we can gradually roll out support for universal LLMContext in a controlled manner.
Also update `get_chat_completions()` implementations with the new argument type.
2025-08-26 09:44:15 -04:00
Paul Kompfner
566af71862 Add CHANGELOG entry for the universal LLMContext machinery 2025-08-26 09:44:15 -04:00
Paul Kompfner
12064bd6e6 Add a bit of helpful info in an error message 2025-08-26 09:44:15 -04:00
Paul Kompfner
a962459151 Change LLMContextAggregatorPair.create(context) to LLMContextAggregatorPair(context) 2025-08-26 09:44:15 -04:00
Paul Kompfner
8fc76a29bc Raise errors when trying to use universal LLMContext with LLM services that don't yet support it 2025-08-26 09:44:15 -04:00
Paul Kompfner
e3019261a5 Fix classes that subclass BaseLLMAdapter by adding placeholder stuff until support for universal LLMContext machinery comes to all LLM services 2025-08-26 09:44:15 -04:00
Paul Kompfner
fa1f6f1c51 In LLMContext, normalize an empty provided ToolsSchema to NOT_GIVEN 2025-08-26 09:44:15 -04:00
Paul Kompfner
337f00c16c Minor fix: add a type annotation 2025-08-26 09:44:15 -04:00
Paul Kompfner
d50922cdcd Update Google adapter to handle possibility of system message in standard format being provided as a list of text parts rather than just a string. 2025-08-26 09:44:15 -04:00
Paul Kompfner
47f5ca6265 Update Gemini adapter to be able to handle LLMSpecificMessages containing Google-formatted messages 2025-08-26 09:44:15 -04:00
Paul Kompfner
2eddb6ffda [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Remove outdated comment
2025-08-26 09:44:15 -04:00
Paul Kompfner
560a6f2247 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Make `LLMContext.add_audio_frames_message()` respect the OpenAI standard format
2025-08-26 09:44:15 -04:00
Paul Kompfner
59ecb19000 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add support for LLM-specific messages in the universal `LLMContext`, to enable using LLM-specific functionality while still using the universal LLM context
2025-08-26 09:44:15 -04:00
Paul Kompfner
cfb094b3c8 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Make it so that tools in `LLMContext` are guaranteed to be either a `ToolsSchema` or `NOT_GIVEN`
2025-08-26 09:44:15 -04:00
Paul Kompfner
1f7e8e001b [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Update some types to also allow for universal `LLMContext`
2025-08-26 09:44:15 -04:00
Paul Kompfner
688b136141 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add to Google LLM service support for universal LLM context
2025-08-26 09:44:15 -04:00
Paul Kompfner
809c4c1bc5 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add to OpenAI LLM service support for universal LLM context
2025-08-26 09:44:15 -04:00
Paul Kompfner
81ca5e6601 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Formatting fix + dead import cleanup
2025-08-26 09:44:15 -04:00
Paul Kompfner
ebc49d2252 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add a "universal" alias for `OpenAILLMContextAssistantTimestampFrame`: `LLMContextAssistantTimestampFrame`
2025-08-26 09:44:15 -04:00
Paul Kompfner
ff8d158e18 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Added universal `LLMContext` and associated context aggregators.
2025-08-26 09:44:15 -04:00
Aleix Conchillo Flaqué
37980b0854 Merge pull request #2504 from pipecat-ai/aleix/cartesia-fix-timeout-reconnection
CartesiaTTSService: reconnect on Cartesia's timeout
2025-08-25 15:24:31 -07:00
Aleix Conchillo Flaqué
39ebc2c9c1 CartesiaTTSService: reconnect on Cartesia's timeout 2025-08-25 14:09:03 -07:00
Aleix Conchillo Flaqué
ab61d09ec1 Merge pull request #2502 from pipecat-ai/aleix/pipecat-0.0.81
update CHANGELOG for 0.0.81
2025-08-25 09:28:21 -07:00
Aleix Conchillo Flaqué
e4afc0a13c update CHANGELOG for 0.0.81 2025-08-25 08:22:28 -07:00
Mark Backman
dde3d2395b Merge pull request #2491 from pipecat-ai/mb/update-quickstart 2025-08-23 06:34:37 -07:00
Aleix Conchillo Flaqué
30b36c3d6e Merge pull request #2497 from pipecat-ai/aleix/pipeline-task-fix-cancellation
PipelineTask: handle cancellations gracefully
2025-08-22 22:37:12 -07:00
Mark Backman
de4dfc3ed4 Update deployment steps 2025-08-23 00:19:26 -04:00
Aleix Conchillo Flaqué
a0128516ff PipelineTask: handle cancellations gracefully 2025-08-22 19:04:31 -07:00
Aleix Conchillo Flaqué
db3b8c7325 Merge pull request #2496 from pipecat-ai/aleix/release-evals-always-provide-eval-prompt
scripts(evals): always require an eval prompt
2025-08-22 18:11:33 -07:00
Aleix Conchillo Flaqué
9273ec0f25 scripts(evals): always require an eval prompt 2025-08-22 16:57:47 -07:00
Mark Backman
8dfa1187be Merge pull request #2402 from pipecat-ai/mb/voicemail-detection
Add voicemail detection
2025-08-22 14:51:13 -07:00
Mark Backman
e17fd580c6 Update README 2025-08-22 15:56:56 -04:00
mattie ruth backman
3e3d50a855 Fix issue with request images from the camera introduced in smallwebrtctransport 2025-08-22 15:02:33 -04:00
Mark Backman
402661ae03 Prevent user speaking frames from entering the classifier branch after a conversation is detected 2025-08-22 14:09:45 -04:00
Mark Backman
69c6a95b8a Simplify frames in the NotifierGate 2025-08-22 14:09:45 -04:00
Mark Backman
4d49210a73 Rename system_prompt to custom_system_prompt; improve dev ex for classification prompt requirements 2025-08-22 14:09:45 -04:00
Aleix Conchillo Flaqué
5f8a22ef2f Merge pull request #2493 from pipecat-ai/aleix/runner-task-asyncio-cancellation
PipelineRunner/PipelineTask: fix asyncio task cancellation
2025-08-22 09:13:58 -07:00
Aleix Conchillo Flaqué
606ad0826a Merge pull request #2492 from pipecat-ai/aleix/wait-for-task-deprecated
FrameProcessor: wait_for_task is now deprecated
2025-08-22 09:13:34 -07:00
Mark Backman
57028255ee Update changelog, mention text LLMs only 2025-08-22 12:12:17 -04:00
Mark Backman
87ebbab758 Only set/clear voicemail_event when voicemail is detected 2025-08-22 12:12:17 -04:00
Mark Backman
bd401e8d6f Rename TTSBuffer to TTSGate 2025-08-22 12:12:17 -04:00
Mark Backman
f0dfab23e7 Cleanup 2025-08-22 12:12:17 -04:00
Mark Backman
fbc907c371 Change path to extensions 2025-08-22 12:12:17 -04:00
Mark Backman
40b5ef485d Add base NotifierGate class and ClassifierGate, ConversationGate subclasses 2025-08-22 12:12:17 -04:00
Mark Backman
b30af3e155 Tests specify USER_SPEAKS_FIRST or BOT_SPEAKS_FIRST 2025-08-22 12:12:17 -04:00
Mark Backman
446bb5cddf Refactor callback to event 2025-08-22 12:12:17 -04:00
Mark Backman
1c1ee94074 Add 44 to evals, update evals to support user speaking first 2025-08-22 12:12:17 -04:00
Mark Backman
ac30083b45 Add CHANGELOG entry 2025-08-22 12:12:17 -04:00
Mark Backman
ce579d4266 Make on_voicemail_detected callback required, cleanup logging 2025-08-22 12:12:17 -04:00
Mark Backman
5a07b30c7a Class name changes, add TTSStarted/StoppedFrame to the TTSBuffer 2025-08-22 12:12:17 -04:00
Mark Backman
9da33f3897 Handle multiple user inputs from the user when a voicemail is detected; add a configurable timeout to emitting the callback 2025-08-22 12:12:17 -04:00
Mark Backman
5ca82ec61e Final docstrings, comments, and cleanup 2025-08-22 12:12:17 -04:00
Mark Backman
0067c7df47 Add aggregation to classifier LLM output and validate prompt 2025-08-22 12:12:17 -04:00
Mark Backman
ab03db5b0c Updated prompt, add custom system_prompt input 2025-08-22 12:12:17 -04:00
Mark Backman
238d6bf9ab Add buffering logic 2025-08-22 12:12:17 -04:00
Mark Backman
90ae85bab2 More updates—added new voicemail module 2025-08-22 12:12:17 -04:00
Mark Backman
29e09b2053 POC demo in progress 2025-08-22 12:12:17 -04:00
mattie ruth backman
bad9977e8c PR feedback and more explicit about only supporting exporting 1 video 2025-08-22 11:24:22 -04:00
mattie ruth backman
b987579d54 update smallWebRTC screen support to support the utils format for listening to screenshares 2025-08-22 11:24:22 -04:00
mattie ruth backman
40f1f4ff11 Add support to smallWebRTCTransport for receiving screenshare videos 2025-08-22 11:24:22 -04:00
Aleix Conchillo Flaqué
a3ad31d0f6 README: recommended python version is 3.12 2025-08-21 23:50:00 -07:00
Aleix Conchillo Flaqué
8044c4170d PipelineRunner/PipelineTask: fix asyncio task cancellation 2025-08-21 23:50:00 -07:00
Aleix Conchillo Flaqué
bc51e7abc6 FrameProcessor: wait_for_task is now deprecated 2025-08-21 21:17:47 -07:00
Aleix Conchillo Flaqué
256ecf4d71 Merge pull request #2490 from pipecat-ai/aleix/speechmatics-exceptions
Speechmatics exception handling
2025-08-21 19:48:43 -07:00
Aleix Conchillo Flaqué
c16969c4f5 Merge pull request #2489 from pipecat-ai/aleix/daily-python-0.19.7
pyproject: update daily-python to 0.19.7
2025-08-21 19:48:31 -07:00
Mark Backman
8ef64d8c8d Update quickstart, make it deployable 2025-08-21 22:32:34 -04:00
Aleix Conchillo Flaqué
4947d08733 GladiaSTTService: update loggin levels 2025-08-21 18:42:23 -07:00
Aleix Conchillo Flaqué
b61846534d SpeechmaticsSTTService: improve exception handling and loggin 2025-08-21 18:42:23 -07:00
Aleix Conchillo Flaqué
8f01cd220a pyproject: update daily-python to 0.19.7 2025-08-21 18:40:01 -07:00
Aleix Conchillo Flaqué
3abaaf80e0 Merge pull request #2487 from pipecat-ai/aleix/watchdog-timers-removal
remove watchdog timers and specific asyncio implementations
2025-08-21 18:37:35 -07:00
Aleix Conchillo Flaqué
13890fa021 github(tests): use python 3.12 to run unit tests/coverage 2025-08-21 18:09:56 -07:00
Aleix Conchillo Flaqué
802af28888 update pytest-asyncio to 1.1.0 2025-08-21 18:09:56 -07:00
Aleix Conchillo Flaqué
24a628c85e remove watchdog timers and specific asyncio implementations
Watchdog timers have been removed. They were introduced in 0.0.72 to help
diagnose pipeline freezes. Unfortunately, they proved ineffective since they
required developers to use Pipecat-specific queues, iterators, and events to
correctly reset the timer, which limited their usefulness and added friction.
2025-08-21 18:09:56 -07:00
Mark Backman
ddab95835b Merge pull request #2474 from pipecat-ai/mb/add-frames-pipeline-idle
Add UserStarted/StoppedSpeakingFrames to idle_timeout_frames
2025-08-21 03:45:46 -07:00
Mark Backman
cb13f4b4cb Add user speaking and transcription frames to idle_timeout_frames 2025-08-21 06:43:10 -04:00
Aleix Conchillo Flaqué
4793277d34 Merge pull request #2480 from pipecat-ai/aleix/replace-asyncio-waitfor
replace asyncio.wait_for for wait_for2.wait_for
2025-08-20 17:43:32 -07:00
Aleix Conchillo Flaqué
28c729cc36 replace asyncio.wait_for for wait_for2.wait_for 2025-08-20 15:26:57 -07:00
Aleix Conchillo Flaqué
4d07c7b77c Merge pull request #2479 from pipecat-ai/aleix/simplify-dtmf-aggregator
DTMFAggregator: no need for interruption task
2025-08-20 15:15:35 -07:00
Aleix Conchillo Flaqué
4ff0567025 BaseObject: allow keyword arguments 2025-08-20 15:14:31 -07:00
Aleix Conchillo Flaqué
1377dec01b DTMFAggregator: no need for interruption task
Now that system frames are queued there's no need to have an additional task to
push a `BotInterruptionFrame`.
2025-08-20 14:35:04 -07:00
Aleix Conchillo Flaqué
42f4d73a63 Merge pull request #2478 from pipecat-ai/aleix/fix-wait-for2-import
timeout: fix wait_for2 import
2025-08-20 14:29:19 -07:00
Aleix Conchillo Flaqué
f1c1ebf852 timeout: fix wait_for2 import 2025-08-20 14:24:16 -07:00
Aleix Conchillo Flaqué
eb6d43f6cb Merge pull request #2476 from pipecat-ai/aleix/add-asyncio-timeout
implement custom asyncio.wait_for()
2025-08-20 14:20:22 -07:00
Aleix Conchillo Flaqué
f387776985 add custom asyncio.wait_for()
This patch uses `wait_for2` package to implement `asyncio.wait_for()` for
Python < 3.12.

In Python 3.12, `asyncio.wait_for()` is implemented in terms of
`asyncio.timeout()` which fixed a bunch of issues. However, this was never
backported (because of the lack of `async.timeout()`) and there are still many
remainig issues, specially in Python 3.10, in `async.wait_for()`.

See https://github.com/python/cpython/pull/98518
2025-08-20 14:09:05 -07:00
Aleix Conchillo Flaqué
5286591826 Merge pull request #2464 from pipecat-ai/aleix/frame-processor-updates
various frame processor updates
2025-08-20 10:11:49 -07:00
Aleix Conchillo Flaqué
6831e63ec9 PipelineTask: use PipelineSource/PipelineSink and remove tasks 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
12bcb7db64 ParallelPipeline: use PipelineSource/PipelineSink and remove tasks 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
1b48b1d860 Pipeline: allow passing user source and sink processors 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
d161e2767f FrameProcessor: allow pausing/resuming system frames 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
4e3af00b6d tests: try to use default SleepFrame time 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
4015aedb86 tests: fix unit tests 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
75a6ee839b BaseObserver: added new on_process_frame 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
13ce02c896 FrameProcessor: add new entry_processors() method 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
2fd5885dc3 pipeline: implement processors property 2025-08-20 07:40:21 -07:00
Aleix Conchillo Flaqué
d743586bfb BasePipeline: move processors_with_metrics() to FrameProcessor 2025-08-20 07:40:21 -07:00
Aleix Conchillo Flaqué
8051017895 pipeline: wrap with pipelines, use direct mode and reduce tasks 2025-08-20 07:40:21 -07:00
Aleix Conchillo Flaqué
dc7bf98ce5 Pipeline: improve performance by using direct mode 2025-08-20 07:40:21 -07:00
Aleix Conchillo Flaqué
609a43a191 FrameProcessor: added processors/next/previous properties 2025-08-20 07:40:19 -07:00
Aleix Conchillo Flaqué
4fb04422d9 FrameProcessor: remove unused set_parent/get_parent 2025-08-20 07:40:02 -07:00
Mark Backman
2f74a7e674 Merge pull request #2469 from pipecat-ai/mb/11labs-text-normalization
Add apply_text_normalization to ElevenLabs TTS services
2025-08-19 18:21:33 -07:00
Mark Backman
5205f56087 Add apply_text_normalization to ElevenLabs TTS services 2025-08-19 21:19:00 -04:00
Mark Backman
694c792af3 Merge pull request #2470 from pipecat-ai/mb/11labs-settings-reconnect
Update ElevenLabsTTSService: update runtime configuration
2025-08-19 18:18:14 -07:00
TheNotary
48b3ad8f8f adds support for creating InterimTranscriptFrames for Azure speech services 2025-08-19 17:00:42 -05:00
Mark Backman
406e82a842 Merge pull request #2438 from pipecat-ai/mb/delete-old-docs
Remove stale docs
2025-08-19 12:22:54 -07:00
Mark Backman
837de5f893 Merge pull request #2468 from pipecat-ai/mb/fix-mistral-docs-errors
Fix Mistral docstrings build errors
2025-08-19 12:22:26 -07:00
Mark Backman
10b9b1da2f Merge pull request #2471 from pipecat-ai/mb/add-13j
Add foundational 13j for Azure STT
2025-08-19 12:10:03 -07:00
Mark Backman
7854a2ec83 Add foundational 13j for Azure STT 2025-08-19 14:36:31 -04:00
Mark Backman
ac7c69078f Merge pull request #2442 from pipecat-ai/mb/retry-completion
retry_on_timeout: Anthropic, AWS Bedrock
2025-08-19 11:23:43 -07:00
Mark Backman
c9b4356ea6 Update changelog 2025-08-19 14:21:18 -04:00
Mark Backman
b3e4421191 Add retry_on_timeout to AWSBedrockLLMService 2025-08-19 14:20:35 -04:00
Mark Backman
84058c3948 Add retry_on_timeout to AnthropicLLMService 2025-08-19 14:20:35 -04:00
Mark Backman
aebc781419 Update ElevenLabsTTSService to update when voice_settings change 2025-08-19 13:51:10 -04:00
Mark Backman
4160446f4c Update ElevenLabsTTSService: reconnect on model and language changes 2025-08-19 11:32:54 -04:00
Mark Backman
05a14af184 Fix Mistral docstrings build errors 2025-08-19 10:31:03 -04:00
Filipi da Silva Fuchter
89d2ef2bde Merge pull request #2465 from pipecat-ai/filipi/heygen_changing_log_level
Changing heygen log level to trace.
2025-08-19 07:50:11 -03:00
Filipi Fuchter
f550015efb Changing heygen log level to trace. 2025-08-18 18:00:25 -03:00
Abhishek
8bbdc7c8d1 Only set last_frame_time when handling OutputAudioRawFrame
We don't want to set `last_frame_time` on other frames like `HeartBeatFrame`, `LLMGeneratedTextFrame`, `InterruptionFrames` so that we can calculate `diff_time` and compare it against `vad_stop_secs` properly
2025-08-16 16:25:14 +05:30
Mark Backman
8fa44863fb Merge pull request #2455 from pipecat-ai/vp-log-line
log: add Disconnected from ElevenLabs debug log
2025-08-15 14:12:28 -07:00
vipyne
088cb56922 log: add Disconnected from ElevenLabs debug log 2025-08-15 15:05:07 -05:00
Aleix Conchillo Flaqué
a789e5feea Merge pull request #2451 from pipecat-ai/aleix/audio-buffer-processor-overlap
AudioBufferProcessor: fix overlap when buffer size is set
2025-08-14 15:31:50 -07:00
Aleix Conchillo Flaqué
16ca44131c Merge pull request #2452 from pipecat-ai/aleix/runner-daily-direct-handlesigint
Runner: set handle_sigint to True for Daily direct
2025-08-14 15:25:05 -07:00
Mark Backman
418860cf26 Merge pull request #2450 from pipecat-ai/mb/fix-openai-changelog-entry
fix: Move OpenAI retry changelog entry to the correct release
2025-08-14 15:23:00 -07:00
Aleix Conchillo Flaqué
e2fc8b3dce Runner: set handle_sigint to True for Daily direct 2025-08-14 14:55:52 -07:00
Aleix Conchillo Flaqué
8b641089f8 AudioBufferProcessor: fix overlap when buffer size is set 2025-08-14 14:44:08 -07:00
Mark Backman
d36ed755ce fix: Move OpenAI retry changelog entry to the correct release 2025-08-14 17:34:35 -04:00
Mark Backman
7aaf64fe55 Merge pull request #2447 from pipecat-ai/mb/update-foundational-readme
Improve the foundational example README
2025-08-14 09:51:01 -07:00
Mark Backman
5f52008974 Improve the foundational example README 2025-08-14 11:29:04 -04:00
Mark Backman
d520677b23 Merge pull request #2408 from pipecat-ai/mb/add-mistral-llm
Add MistralLLMService
2025-08-14 08:19:18 -07:00
Mark Backman
42bd1e9d40 Add Mistral to README and pyproject.toml 2025-08-14 11:15:52 -04:00
Mark Backman
7f0494aa04 Override build_chat_completion_params for Mistral 2025-08-14 10:32:18 -04:00
Mark Backman
b7ae2989ac Add foundational 14w-function-calling.py 2025-08-14 10:00:46 -04:00
Mark Backman
2b2b0f8121 Add MistralLLMService 2025-08-14 09:57:14 -04:00
Mark Backman
5ca33a2b00 Merge pull request #2445 from pipecat-ai/mb/fix-changelog-asyncai
fix: Changelog for Async AI bugfix
2025-08-14 06:48:08 -07:00
Mark Backman
938dcb613d fix: Changelog for Async AI bugfix 2025-08-14 09:13:03 -04:00
Mark Backman
bc748cf9d0 Merge pull request #2444 from ashotbagh/fix/asyncai-force-flush
fix(asyncai): force flush WS TTS to eliminate stalls
2025-08-14 06:10:16 -07:00
Ashot
3b55d16a49 fix(asyncai): force flush WS TTS to eliminate stalls 2025-08-14 16:34:34 +04:00
Mark Backman
d7f31e0cbd Merge pull request #2387 from pipecat-ai/mb/retry-chat-completion
Retry chat completions for OpenAILLMService and its subclasses
2025-08-13 14:39:40 -07:00
Mark Backman
c662a2d820 Merge pull request #2437 from pipecat-ai/mb/19-english
Foundational 19: Respond in English
2025-08-13 11:57:24 -07:00
Mark Backman
2c220ca54e Remove stale docs 2025-08-13 14:11:41 -04:00
Mark Backman
89f0ff17c0 Merge pull request #2430 from pipecat-ai/aleix/pipecat-0.0.80
update CHANGELOG for 0.0.80
2025-08-13 09:41:43 -07:00
Mark Backman
b5465364fa Foundational 19: Respond in English 2025-08-13 12:37:13 -04:00
Aleix Conchillo Flaqué
c024eb7b8c update CHANGELOG for 0.0.80 2025-08-13 11:46:24 -04:00
Mark Backman
608570e89d Merge pull request #2433 from pipecat-ai/mb/openai-realtime-text-modality
fix: Add text support to OpenAIRealtimeBetaLLMService
2025-08-13 08:41:33 -07:00
Mark Backman
3ad61a8a04 Remove stray - in changelog 2025-08-13 11:39:59 -04:00
Mark Backman
4c4bae2db6 Remove unnessecary messages from 19 and 19b examples 2025-08-13 11:39:59 -04:00
Mark Backman
901b6b5913 Add foundational 19b 2025-08-13 11:37:38 -04:00
Mark Backman
71cd0f1c87 fix: Add text support to OpenAIRealtimeBetaLLMService 2025-08-13 11:37:36 -04:00
Filipi da Silva Fuchter
a2a419e6db Merge pull request #2435 from pipecat-ai/filipi/small_webrtc_end_pipeline
Fixed an issue where `SmallWebRTCTransport` ended before TTS finished.
2025-08-13 11:58:33 -03:00
Filipi Fuchter
bbbbdc459a Fixed an issue where SmallWebRTCTransport ended before TTS finished. 2025-08-13 11:46:51 -03:00
Mark Backman
d203528dad Merge pull request #2333 from yohan-altrium/fix/2277-azure-tts-ssml-reserved-characters
Fixes 2277 - SSML reserved characters causes Azure TTS to fail
2025-08-13 06:27:30 -07:00
Yohan Liyanage
4bcca7956e Refactors the code based on PR comments and adds the relevant changelog entry. 2025-08-13 16:34:33 +05:30
Aleix Conchillo Flaqué
68a4cf4c68 Merge pull request #2427 from pipecat-ai/aleix/base-watchdog-priority-queue
WatchdogPriorityQueue: this is now a base class
2025-08-12 18:25:59 -07:00
Aleix Conchillo Flaqué
0508ddddfb WatchdogPriorityQueue: fix watchdog sentinel insertion
We now force each inserted item in the priority queue to be a tuple and the
actual value to be last in the tuple. All the previous values in the tuple also
need to be numeric.
2025-08-12 17:40:58 -07:00
Mark Backman
8714c9137f Code review fixes 2025-08-12 17:49:13 -04:00
Mark Backman
4c029fcfa7 Update OpenAILLMService subclasses to use the new build_chat_completion_params function 2025-08-12 17:48:51 -04:00
Mark Backman
5c86f8e687 Add timeout/retry logic and refactor parameter building in BaseOpenAILLMService
- Add timeout (default 5.0s) and retry_on_timeout parameters to constructor
- Implement timeout/retry logic in get_chat_completions using asyncio.wait_for
- Extract build_chat_completion_params() as public method for subclass customization
2025-08-12 17:48:51 -04:00
Mark Backman
54a4d8a9f8 Merge pull request #2422 from thsunkid/thu/fix-set-lang-in-base-whisper
Fix: assigns string code instead of Language enum to BaseWhisperSTTService._language
2025-08-12 11:57:46 -07:00
Mark Backman
38af514d95 Merge pull request #2407 from pipecat-ai/mb/add-gemini-tts
Add GeminiTTSService
2025-08-12 11:56:45 -07:00
Aleix Conchillo Flaqué
6aa80c0b8e Merge pull request #2424 from pipecat-ai/aleix/system-frame-queues-fix
FrameProcessor: fix race condition on FrameProcessorQueue
2025-08-12 11:56:00 -07:00
Mark Backman
e720573e60 Added 07n-interruptible-gemini 2025-08-12 14:54:49 -04:00
Mark Backman
541a43905b Add GeminiTTSService 2025-08-12 14:52:20 -04:00
Aleix Conchillo Flaqué
707df913cd FrameProcessor: fix race condition on FrameProcessorQueue
We need to increment the counters before the await otherwise we could go to a
different task that could add an item with the same counter.

Also, we need to handle non-frame items as well.
2025-08-12 11:48:22 -07:00
Aleix Conchillo Flaqué
3f3d757581 tests: added WatchdogQueue and WatchdogPriorityQueue unit tests 2025-08-12 11:48:22 -07:00
Aleix Conchillo Flaqué
7c781ce816 WatchdogPriorityQueue: make WatchdogPriorityCancelSentinel public 2025-08-12 11:34:31 -07:00
Aleix Conchillo Flaqué
f3efc9da00 WatchdogQueue: make WatchdogQueueCancelSentinel public 2025-08-12 11:34:31 -07:00
Mark Backman
827a70104d Merge pull request #2425 from pipecat-ai/mb/runner-add-exotel
Add Exotel support to the development runner
2025-08-12 10:36:54 -07:00
Mark Backman
a40327305c Add Exotel support to the development runner 2025-08-12 13:21:18 -04:00
Thu Nguyen
168af44429 Fix: assigns string code instead of Language enum to _language attr of BaseWhisperSTTService 2025-08-12 20:27:26 +07:00
Mark Backman
5f8433476c Merge pull request #2397 from gladiaio/PLA-37-GladiaSTTService-minor-tweaks
feat: add minor tweaks to GladiaSTTService
2025-08-12 04:59:40 -07:00
Fabrice Lamant
6a6fea74f5 fix: set default region to none 2025-08-12 13:31:51 +02:00
Mark Backman
91b557ecbf Merge pull request #2419 from pipecat-ai/mb/fix-lockfile-workflow 2025-08-12 03:39:54 -07:00
Mark Backman
be85291414 Merge pull request #2420 from pipecat-ai/mb/runner-handle-sigint-default 2025-08-12 03:39:29 -07:00
Fabrice Lamant
09f171b69d fix: only pass region if set 2025-08-12 12:05:38 +02:00
Aleix Conchillo Flaqué
929fd98958 Merge pull request #2416 from pipecat-ai/aleix/release-evals-vision
scripts(evals): add vision support
2025-08-11 20:08:08 -07:00
Aleix Conchillo Flaqué
1cfbfcaf11 scripts(evals): add vision support 2025-08-11 20:06:24 -07:00
Mark Backman
cd5a3c13bd Development runner: handle_sigint defaults to False 2025-08-11 22:06:56 -04:00
Mark Backman
9b871b0cc5 Update uv.lock, remove lockfile workflow, update CONTRIBUTING with dependency guidance 2025-08-11 21:39:25 -04:00
Mark Backman
0d499a8aa3 Merge pull request #2409 from pipecat-ai/mb/refactor-playht-http
Refactor PlayHTHttpTTSService to use aiohttp
2025-08-11 18:20:58 -07:00
Mark Backman
45292ab13d Merge pull request #2411 from pipecat-ai/mb/fix-websocket-service-retry
fix: WebsocketService retry logic incorrectly handling ConnectionClos…
2025-08-11 18:17:50 -07:00
Mark Backman
be6ea0dbf6 Code review feedback 2025-08-11 21:17:04 -04:00
Aleix Conchillo Flaqué
fb18ae174e Merge pull request #2417 from pipecat-ai/aleix/release-evals-15-series
scripts(evals): add multilinguag support and 15 series
2025-08-11 17:14:47 -07:00
Mark Backman
c4506523ab Refactor PlayHTHttpTTSService to use aiohttp 2025-08-11 19:58:25 -04:00
Aleix Conchillo Flaqué
b360cb31dc scripts(evals): add multilinguag support and 15 series 2025-08-11 15:21:14 -07:00
Aleix Conchillo Flaqué
07f104199c Merge pull request #2415 from pipecat-ai/aleix/moondream-2025-01-09
MoondreamService: update to revision 2025-01-09
2025-08-11 15:10:35 -07:00
Aleix Conchillo Flaqué
bc1949b4bf MoondreamService: update to revision 2025-01-09 2025-08-11 14:54:04 -07:00
Aleix Conchillo Flaqué
2035dd8b39 Merge pull request #2403 from pipecat-ai/aleix/system-frame-queue-priority-fix
FrameProcessor: fix system frame higher priorty and use a PriortyQueue
2025-08-11 13:57:57 -07:00
Aleix Conchillo Flaqué
24c8189327 Merge pull request #2405 from pipecat-ai/aleix/frame-processor-direct-mode
FrameProcessor: introduce direct mode
2025-08-11 13:57:34 -07:00
Mark Backman
998ac32627 Merge pull request #2413 from captaincaius/fix-stt-mute-filter-vad-frames-20250810
Add VADUserStartSpeakingFrame VADUserStopSpeakingFrame to STTMuteFilter (fix #2412)
2025-08-11 13:54:34 -07:00
Aleix Conchillo Flaqué
50645c1c4f README: recommend python 3.11-3.12
Python 3.11 has significant performance improvements compared to 3.10 which
makes Pipecat's asyncio heavy use  specially better.
2025-08-11 13:53:08 -07:00
Aleix Conchillo Flaqué
8ce29ee8f2 FrameProcessor: fix system frame higher priorty and use a PriortyQueue 2025-08-11 13:53:08 -07:00
Captain Caius
7b8aeef4cc update changelog 2025-08-11 12:45:54 -07:00
Aleix Conchillo Flaqué
6a24457f0e FrameProcessor: introduce direct mode
Direct mode avoids creating internal queues and tasks and processes frames right
away. This might be useful for some very simple processors.
2025-08-11 09:26:31 -07:00
Aleix Conchillo Flaqué
2c01c2b5b3 Merge pull request #2404 from pipecat-ai/aleix/examples-22-simplify-main-pipeline
examples(foundational): update 22 series with simple main pipelines
2025-08-11 09:14:39 -07:00
Aleix Conchillo Flaqué
1c2e114fa2 examples(foundational): update 22 series with simple main pipelines 2025-08-11 09:13:09 -07:00
Filipi da Silva Fuchter
0f137e36c2 Merge pull request #2399 from pipecat-ai/filipi/heygen_latency
Improving the latency of the `HeyGenVideoService`.
2025-08-11 09:13:10 -03:00
Filipi Fuchter
b7f12a96f1 Improving the latency of the HeyGenVideoService. 2025-08-11 09:11:17 -03:00
Filipi da Silva Fuchter
3331f71e17 Merge pull request #2398 from pipecat-ai/filipi/ttfb_metrics_video_services
Added TTFB metrics for `HeyGenVideoService` and `TavusVideoService`.
2025-08-11 09:09:27 -03:00
Filipi Fuchter
55d200e2d1 Added TTFB metrics for HeyGenVideoService and TavusVideoService. 2025-08-11 09:07:21 -03:00
Captain Caius
3fae00e067 Add VADUserStartSpeakingFrame VADUserStopSpeakingFrame to STTMuteFilter 2025-08-10 19:35:04 -07:00
Mark Backman
78cdefd191 Merge pull request #2410 from smokyabdulrahman/issue-2373
Support endpoint_id for AzureSTTService
2025-08-10 16:43:29 -07:00
Mark Backman
42502a4f3b fix: WebsocketService retry logic incorrectly handling ConnectionClosedOK exception 2025-08-10 19:35:05 -04:00
Abdulrahman Alrahma
fc67cc3302 Support endpoint_id for AzureSTTService 2025-08-10 22:24:47 +01:00
Aleix Conchillo Flaqué
241ab19228 update uv.lock with numba dependency 2025-08-08 15:12:55 -07:00
Mark Backman
c08e8ec8fb Merge pull request #2391 from pipecat-ai/mb/readme-local-dev
Update README with local dev setup for contributors
2025-08-08 11:15:58 -07:00
Mark Backman
eb9bc9644e Merge pull request #2400 from pipecat-ai/mb/pin-numba-0.61.2
fix: pin numba to >=0.61.2
2025-08-08 11:15:22 -07:00
Mark Backman
3a306dae90 fix: pin numba to >=0.61.2 2025-08-08 10:52:47 -04:00
Fabrice Lamant
e503ea7466 feat: add minor tweaks to GladiaSTTService 2025-08-08 10:21:52 +02:00
Mark Backman
c42cc8254f Update README with local dev setup for contributors 2025-08-07 22:07:35 -04:00
Aleix Conchillo Flaqué
a8e21f7d5d Merge pull request #2395 from pipecat-ai/aleix/examples-15-inherit-parallel-pipeline
examples(foundational): move 15/15a logic into its own processor
2025-08-07 17:59:28 -07:00
Aleix Conchillo Flaqué
c6ef8de578 scripts(evals): fix 14v-function-calling-openai.py 2025-08-07 17:57:47 -07:00
Aleix Conchillo Flaqué
fc571fba42 examples(foundational): move 15/15a logic into its own processor 2025-08-07 17:57:47 -07:00
Mark Backman
0502ee2b5a Merge pull request #2394 from pipecat-ai/mb/uv-lock
Update uv.lock
2025-08-07 15:25:38 -07:00
Mark Backman
9ec047094b Update uv.lock 2025-08-07 18:24:47 -04:00
Mark Backman
d991c106c8 Merge pull request #2393 from pipecat-ai/mb/openai-dep
fix: pin openai package upper bound to <=1.99.1
2025-08-07 15:19:05 -07:00
Mark Backman
312fb23c89 fix: pin openai package upper bound to <=1.99.1 2025-08-07 18:00:25 -04:00
Aleix Conchillo Flaqué
4d7f21d44e Merge pull request #2392 from pipecat-ai/aleix/avoid-using-tts-say
deprecate TTSService.say() method
2025-08-07 13:55:49 -07:00
Aleix Conchillo Flaqué
ec25d0a7c9 examples(foundational): fix 20a-persistent-context-openai 2025-08-07 13:48:32 -07:00
Aleix Conchillo Flaqué
2b8218deaa examples(foundational): use TTSSpeakFrame instead of TTSService.say() 2025-08-07 13:48:32 -07:00
Aleix Conchillo Flaqué
11119430cd TTSService: deprecate say() method 2025-08-07 13:48:32 -07:00
kompfner
9ca79232c1 Merge pull request #2380 from pipecat-ai/pk/deprecate-llm-messages-frame
Deprecate `LLMMessagesFrame`, `LLMUserResponseAggregator`, and `LLMAssistantResponseAggregator`
2025-08-07 15:13:01 -04:00
Paul Kompfner
9ea06c33f7 Bump deprecation version of LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator (the deprecation slipped past the 0.0.78 release) 2025-08-07 14:56:50 -04:00
Paul Kompfner
30a1dd202e Move deprecation of LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator into the next release in the changelog 2025-08-07 14:55:11 -04:00
Paul Kompfner
809ab0b7b6 Improve printed deprecation warning 2025-08-07 14:45:35 -04:00
Paul Kompfner
2b5db9c562 Remove redundant deprecation warning in docstring 2025-08-07 14:45:35 -04:00
Paul Kompfner
b4a886b59f Remove redundant deprecation warning in docstring 2025-08-07 14:45:35 -04:00
Paul Kompfner
07eb00722b Fix langchain unit test 2025-08-07 14:45:35 -04:00
Paul Kompfner
96652b8fba Add new deprecations to changelog 2025-08-07 14:45:30 -04:00
Paul Kompfner
df1fcf0c68 Remove unused import 2025-08-07 14:43:37 -04:00
Paul Kompfner
711f740d9e Update UserResponseAggregator to avoid using the now-deprecated LLMUserResponseAggregator 2025-08-07 14:43:37 -04:00
Paul Kompfner
a0bda98c20 Update langchain to avoid using the now-deprecated LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator 2025-08-07 14:43:37 -04:00
Paul Kompfner
1c1bae35ab Mention deprecation in docstring for LLMMessagesFrame 2025-08-07 14:43:37 -04:00
Paul Kompfner
56c52c2cf2 Deprecate LLMUserResponseAggregator and LLMAssistantResponseAggregator, which depend on the now-deprecated LLMMessagesFrame. 2025-08-07 14:43:37 -04:00
Paul Kompfner
740aee1a1a Fix an issue in AnthropicLLMContext where we would never initialize turns_above_cache_threshold if we were upgrading from an OpenAILLMContext.
I noticed this when working on 22c-natural-conversation-mixed-llms.py
2025-08-07 14:43:37 -04:00
Paul Kompfner
f0391c3280 Progress on updating foundational examples to avoid using the newly-deprecated LLMMessagesFrame.
Skipping over 07b-interruptible-langchain.py for now, as it requires deeper changes involving `LLMUserResponseAggregator` and `LLMAssistantResponseAggregator`.
2025-08-07 14:43:37 -04:00
Paul Kompfner
64e48e4660 Deprecate LLMMessagesFrame.
The same functionality can be achieved using either:
- `LLMMessagesUpdateFrame` with the desired messages, with `run_llm` set to `True`
- `OpenAILLMContextFrame` with a new context initialized with the desired messages
2025-08-07 14:43:37 -04:00
Paul Kompfner
b8147bdbbd Add missing Deepgram key to env.example 2025-08-07 14:43:37 -04:00
Aleix Conchillo Flaqué
315e45d41b Merge pull request #2389 from pipecat-ai/aleix/pipecat-0.0.78
update CHANGELOG for 0.0.78
2025-08-07 11:34:27 -07:00
Aleix Conchillo Flaqué
c057139c48 update CHANGELOG for 0.0.78 2025-08-07 11:14:54 -07:00
Mark Backman
c61e07132d Merge pull request #2390 from pipecat-ai/mb/optionally-ignore-emulated-speech
feat: Add option to ignore emulated user speech while the bot is spea…
2025-08-07 11:14:46 -07:00
Mark Backman
a5f5e418a8 feat: Add option to ignore emulated user speech while the bot is speaking 2025-08-07 14:08:11 -04:00
Mark Backman
31acfaa091 Merge pull request #2388 from pipecat-ai/14v-adding-openai-stt-tts-llm-functioncalling
14v adding OpenAI stt tts llm functioncalling
2025-08-07 10:22:35 -07:00
Mark Backman
69541c8835 Linting fix, plus update eval suite with 14v and others, tiny fix for 14m, too 2025-08-07 13:20:45 -04:00
Varun Singh
af94620839 Add OpenAI function calling example with Pipecat
Introduces a new example script demonstrating how to use OpenAI's function calling capabilities within a Pipecat pipeline. The example integrates OpenAI STT, TTS, and LLM services, registers a weather function, and sets up a pipeline for real-time audio interaction over WebRTC.
2025-08-07 13:20:45 -04:00
Filipi da Silva Fuchter
cec8a74293 Merge pull request #2386 from pipecat-ai/filipi/parallel_pipeline
Only push the StartFrame when all parallel pipelines have processed it
2025-08-07 14:20:30 -03:00
Filipi Fuchter
228a55ac1e Only push the StartFrame when all parallel pipelines have processed it. 2025-08-07 14:18:21 -03:00
Vanessa Pyne
ab9831daf0 Merge pull request #2382 from pipecat-ai/vp-trace-ignore-message
log: warning -> trace for elevenlabs tts unavailable context
2025-08-07 09:35:57 -05:00
Vanessa Pyne
e8c3f5dea6 Update src/pipecat/services/elevenlabs/tts.py
Co-authored-by: Mark Backman <mark@daily.co>
2025-08-07 09:23:33 -05:00
Mark Backman
4288b5e780 Merge pull request #2381 from pipecat-ai/aleix/runner-args-pipeline-idle-timeout
allow specifying PipelineTask idle timeout to runner arguments
2025-08-07 04:47:08 -07:00
Mark Backman
23343dd7e7 Remove idle_timeout_secs from quickstart 2025-08-07 07:44:21 -04:00
Mark Backman
88de5dd415 Merge pull request #2383 from pipecat-ai/aleix/riva-stt-iterator-exception
properly handle concurrent.futures.CancelledError
2025-08-07 04:39:56 -07:00
Mark Backman
33f87589d1 Merge pull request #2384 from pipecat-ai/aleix/release-evals-soniox-inworld-asyncai
scripts(evals): added soniox, inworld and asyncai
2025-08-07 04:35:18 -07:00
Aleix Conchillo Flaqué
7ed14ad91f scripts(evals): added soniox, inworld and asyncai 2025-08-06 23:14:50 -07:00
Aleix Conchillo Flaqué
86c6141580 DailyTransport: handle future cancellation 2025-08-06 23:03:20 -07:00
Aleix Conchillo Flaqué
c97643c797 RivaSTTService: always use WatchdogQueue 2025-08-06 23:00:03 -07:00
Aleix Conchillo Flaqué
434d346079 RivaSTTService: handle future cancellation 2025-08-06 22:59:52 -07:00
vipyne
64ae8d2394 log: warning -> trace for elevenlabs tts unavailable context 2025-08-06 22:40:47 -05:00
Aleix Conchillo Flaqué
786f24c9db examples(foundational): use RunnerArgs.pipeline_idle_timeout_secs 2025-08-06 19:38:06 -07:00
Aleix Conchillo Flaqué
38951aab56 scripts(evals): use RunnerArguments.pipeline_idle_timeout_secs 2025-08-06 19:37:29 -07:00
Aleix Conchillo Flaqué
ed8b0655a8 scripts(evals): fix runner eval cancellation
We need to call asyncio.gather() just once, not for every cancelled task.
2025-08-06 19:36:42 -07:00
Aleix Conchillo Flaqué
0b2b9f5f1b RunnerArguments: add pipeline_idle_timeout_secs 2025-08-06 19:35:40 -07:00
Filipi da Silva Fuchter
ad1841b739 Merge pull request #2377 from pipecat-ai/filipi/fast_api_freeze_issue
Fixed an issue in BaseOutputTransport where the loop could consume all CPU.
2025-08-06 14:58:36 -03:00
Mark Backman
b0c002c128 Merge pull request #2378 from pipecat-ai/mb/pyproject-compat-updates
Add new python-compatiblity workflow to check for dependency compatib…
2025-08-06 10:40:29 -07:00
Mark Backman
820176084c Add support for 3.13 by bumping min version for vllm to 0.9.0, adding support for torch and torchaudio up to the next major version 2025-08-06 13:36:01 -04:00
Mark Backman
5b7e31beff README updates for python versions 2025-08-06 13:36:01 -04:00
Mark Backman
41a22d3bf4 Add new python-compatiblity workflow to check for dependency compatibility across supported python versions 2025-08-06 13:36:01 -04:00
Filipi Fuchter
84fecabac5 Removing audio sleep from FastAPI and WebSocket server when they are not connected. 2025-08-06 14:02:51 -03:00
Filipi Fuchter
bbe01d10ef Fixed an issue in BaseOutputTransport where the loop could consume all CPU. 2025-08-06 12:42:58 -03:00
Mark Backman
4364990fd0 Merge pull request #2375 from fabrice404/gladia-region-selection
Gladia region selection
2025-08-06 07:01:24 -07:00
Fabrice Lamant
e576fa481f Add new region feature for GladiaSTTService in CHANGELOG 2025-08-06 15:31:10 +02:00
Mark Backman
ac6b59cae2 Merge pull request #2372 from pipecat-ai/mb/dotenv-dev
Wider package support for python-dotenv dev dep
2025-08-06 06:06:01 -07:00
Mark Backman
12e168e740 Wider package support for python-dotenv dev dep 2025-08-06 09:04:01 -04:00
Mark Backman
ac354f66ed Merge pull request #2371 from pipecat-ai/mb/docs-gen-with-uv
Update docs auto-generation to use uv
2025-08-06 06:02:52 -07:00
Mark Backman
eead793927 Merge pull request #2370 from pipecat-ai/mb/update-workflows-for-uv
Update workflows for uv
2025-08-06 05:54:55 -07:00
Fabrice Lamant
0594a203fc Add new region parameter to Gladia 2025-08-06 14:28:06 +02:00
Mark Backman
2337a2d92d Remove dev-requirements.txt and mentions of it 2025-08-05 21:46:50 -04:00
Mark Backman
b3e2603553 Update workflows for uv 2025-08-05 21:45:48 -04:00
Mark Backman
29229df719 Speed up builds, mocking large packages 2025-08-05 21:34:40 -04:00
Aleix Conchillo Flaqué
61f4dd2ff2 scripts(evals): fix 14e-function-calling-google 2025-08-05 17:44:45 -07:00
Mark Backman
42094fb206 Update docs auto-generation to use uv 2025-08-05 20:37:27 -04:00
Aleix Conchillo Flaqué
58c41f112a DailyRunnerArguments: make body optional (fix) 2025-08-05 16:59:36 -07:00
Aleix Conchillo Flaqué
fa55e2ca9b Merge pull request #2369 from pipecat-ai/aleix/pipeline-task-cancellation-fix
PipelineTask: always try to cancel things
2025-08-05 16:56:23 -07:00
Aleix Conchillo Flaqué
313fdc92a1 DailyRunnerArguments: make body optional 2025-08-05 16:39:18 -07:00
Aleix Conchillo Flaqué
d22d2da03d PipelineTask: always try to cancel things
In a previous commit we only cleanup things if the user run
`task.cancel()`. However, if the task finishes cleanly we were not cancelling
anything.
2025-08-05 16:24:59 -07:00
Aleix Conchillo Flaqué
de2ae9a2ec Merge pull request #2368 from pipecat-ai/aleix/release-evals-runner-args-fix
pass runner arguments to release evals
2025-08-05 16:23:32 -07:00
Aleix Conchillo Flaqué
52a6d8013c scripts(evals): pass runner arguments to run_bot() 2025-08-05 16:13:32 -07:00
Aleix Conchillo Flaqué
f14cbae9b5 DailyRunnerArguments: make token optional
DailyTransport can get a None token value.
2025-08-05 15:46:12 -07:00
Aleix Conchillo Flaqué
8fe906438a Merge pull request #2358 from pipecat-ai/aleix/system-frames-queued
system frames are now queued
2025-08-05 15:09:52 -07:00
Mark Backman
d8f4db8827 Merge pull request #2367 from richtermb/richtermb/fix-errorframe-docstring
Rename 'source' parameter to 'processor' in ErrorFrame class document…
2025-08-05 15:09:18 -07:00
Aleix Conchillo Flaqué
a5ea6e1642 FrameProcessor: system frames are now queued
System frames are now queued. Before, system frames could be generated from any
task and would not guarantee any order which was causing undesired
behavior. Also, it was possible to get into some rare recursion issues because
of the way system frames were executed (they were executed in-place, meaning
calling `push_frame()` would finish after the system frame traversed all the
pipeline). This makes system frames more deterministic.
2025-08-05 15:05:50 -07:00
richtermb
e777e78510 Rename 'source' parameter to 'processor' in ErrorFrame class documentation for clarity. 2025-08-05 15:02:00 -07:00
Aleix Conchillo Flaqué
49a5a1e375 PipelineTask: improve task cancellation 2025-08-05 14:49:23 -07:00
Aleix Conchillo Flaqué
61cb45d61b PipelineTask: also wait on CancelFrame
Before CancelFrames didn't need to be waited for because system frames were
processed in-place and therefore calling push_frame() would finalize after it
traversed all the pipeline. Now, system frames are queued so we need to wait
until CancelFrame reaches the end of the pipeline.
2025-08-05 14:49:23 -07:00
Aleix Conchillo Flaqué
6c6deb4e85 Merge pull request #2366 from pipecat-ai/aleix/run-bot-runner-arguments
add sigint/sigterm to RunnerArguments
2025-08-05 14:46:19 -07:00
Aleix Conchillo Flaqué
66ad29b2b1 example: pass RunnerArguments to run_bot()
This lets us get handle_sigint from RunnerArguments which knows where the
application is running and if SIGINT/SIGTERM should be handled or not.
2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
21e4f0d56d PipelineRunner: argument ordering 2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
627b44bac2 runner: use new RunnerArguments handle_sigint/handle_sigterm
This allow us to control applications behavior from the runner arguments, which
depen on the environment they run.
2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
e2a576beca RunnerArguments: add handle_sigint/handle_sigterm 2025-08-05 14:32:28 -07:00
Mark Backman
2981afb117 Merge pull request #2361 from pipecat-ai/mb/fix-changelog-simli
Fix Simli changelog entry placement
2025-08-05 14:12:38 -07:00
Mark Backman
d422c57b52 Merge pull request #2304 from pipecat-ai/mb/cartesia-cjk-lang-support
CartesiaTTSService: Add CJK lang support for word timestamps
2025-08-05 14:08:53 -07:00
Mark Backman
06d8bbd154 Fix Simli changelog entry placement 2025-08-05 17:07:58 -04:00
Mark Backman
35108afeb8 Merge pull request #2360 from pipecat-ai/mb/add-heygen-readme
Add HeyGen to the README page
2025-08-05 14:05:33 -07:00
Mark Backman
a0e2a2754a Merge pull request #2327 from richtermb/richtermb/push-more-error-frames
Add source parameter to ErrorFrame and set it in FrameProcessor. Upda…
2025-08-05 14:04:52 -07:00
Mark Backman
b8d620c8bb Merge pull request #2362 from pipecat-ai/mb/aws-stt-languages
AWSTranscribeSTTService add support for new languages
2025-08-05 14:00:50 -07:00
Mark Backman
f26bbe4092 Merge pull request #2363 from pipecat-ai/mb/update-14p
Update 14p, add 14p to evals, add Google creds to env.example
2025-08-05 14:00:13 -07:00
Mark Backman
52cb23f8d5 Merge pull request #2364 from pipecat-ai/mb/11labs-default-model
ElevenLabs TTS services: revert to Turbo v2.5 as default model
2025-08-05 13:59:59 -07:00
Filipi da Silva Fuchter
17e7f8a2cd Merge pull request #2352 from pipecat-ai/filipi/webrtc_audio_frame
Implementing if the bot it is speaking or not based on the SpeechOutputAudioRawFrame
2025-08-05 17:26:44 -03:00
richtermb
efddc4732c Refactor ErrorFrame: rename source field to processor for clarity and update related references in FrameProcessor. 2025-08-05 13:25:08 -07:00
richtermb
4476a76ad7 Merge branch 'main' into richtermb/push-more-error-frames 2025-08-05 13:23:24 -07:00
Filipi Fuchter
64592b274b Fixed an issue where BotStartedSpeakingFrame and BotStoppedSpeakingFrame
were not emitted when using `TavusVideoService` or `HeyGenVideoService`.
2025-08-05 17:11:34 -03:00
Aleix Conchillo Flaqué
95c661bdaa Merge pull request #2365 from pipecat-ai/aleix/update-release-evals-for-new-runner
scripts(evals): update to use new runner function
2025-08-05 13:07:57 -07:00
Aleix Conchillo Flaqué
5546c8e01c scripts(evals): update to use new runner function 2025-08-05 11:46:28 -07:00
Mark Backman
14e02c1b08 ElevenLabs TTS services: revert to Turbo v2.5 as default model 2025-08-05 13:44:37 -04:00
Mark Backman
ba5a5c7187 Update 14p, add 14p to evals, add Google creds to env.example 2025-08-05 13:30:36 -04:00
Mark Backman
2378cba155 AWSTranscribeSTTService add support for new languages 2025-08-05 13:01:06 -04:00
Mark Backman
1138c92a00 Merge pull request #2217 from simliai/main
feat: Add Simli Trinity models support to pipecat
2025-08-05 09:01:20 -07:00
Antonyesk601
fb82dc8308 Update CHANGELOG.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-08-05 17:46:01 +02:00
Mark Backman
c8a15f30fa Add HeyGen to the README page 2025-08-05 10:54:49 -04:00
antonyesk601
72168070f1 update changelog 2025-08-05 14:18:41 +00:00
Mark Backman
50083d1144 Merge pull request #2342 from pipecat-ai/mb/runner-connect-request-body
Development runner handles body information in the RTVI connect request
2025-08-05 05:15:55 -07:00
Mark Backman
64732518c6 Development runner handles body information in the RTVI connect request 2025-08-05 07:26:34 -04:00
Mark Backman
c3d8ea210f CartesiaTTSService: Add CJK lang support for word timestamps 2025-08-05 07:17:40 -04:00
Filipi da Silva Fuchter
98ed614f63 Merge pull request #2357 from pipecat-ai/filipi/latency_observer
Added detailed latency logging to UserBotLatencyLogObserver.
2025-08-05 08:11:48 -03:00
Filipi Fuchter
e43bdff31e Added detailed latency logging to UserBotLatencyLogObserver. 2025-08-04 19:36:30 -03:00
Mark Backman
42e48381fe Merge pull request #2355 from pipecat-ai/mb/update-readme-for-uv
Update the README with uv-centric steps
2025-08-04 15:28:07 -07:00
Mark Backman
df7ba64b4a Merge pull request #2354 from pipecat-ai/mb/revert-43-inline-script
Remove inline script from foundational 43a
2025-08-04 15:27:28 -07:00
Mark Backman
ac9b2e67a7 Merge pull request #2349 from pipecat-ai/mb/runner-support-daily-url-arg
daily runner util: remove arg parsing, add auto room, token generation
2025-08-04 13:44:25 -07:00
Mark Backman
c9918607cf Merge pull request #2335 from pipecat-ai/mb/quickstart-runner-improvements
Improve quickstart logging, runner startup message
2025-08-04 13:43:42 -07:00
Mark Backman
cfda410a43 Remove foundational requirements.txt file 2025-08-04 16:38:37 -04:00
Mark Backman
c773ddf83d Update foundational examples README 2025-08-04 16:26:11 -04:00
Mark Backman
54d5ebbc20 Update the README with uv-centric steps 2025-08-04 16:11:38 -04:00
Mark Backman
35002cd727 Remove inline script from foundational 43a 2025-08-04 15:46:18 -04:00
Mark Backman
53d75faa47 Merge pull request #2330 from pipecat-ai/mb/runner-clean-proxy-name
Runner: strip protocol from proxy address
2025-08-04 10:42:16 -07:00
Mark Backman
2901dddc2b Merge pull request #2338 from pipecat-ai/mb/update-release-evals-tavus
Add Tavus, HeyGen, Simli to release-evals
2025-08-04 10:38:27 -07:00
Mark Backman
3a8d809837 Runner: strip protocol from proxy address 2025-08-04 13:38:02 -04:00
Mark Backman
1b3c2bee30 Merge pull request #2331 from pipecat-ai/mb/more-foundational
Updating more foundational examples
2025-08-04 10:37:15 -07:00
Mark Backman
69f049cb63 Merge pull request #2328 from pipecat-ai/mb/04b-example-cleanup
Align 04b livekit example with other foundational examples
2025-08-04 10:36:57 -07:00
Vanessa Pyne
96b1000e52 Merge pull request #2341 from getchannel/realtime-text
Hotfix: Correct Gemini Live API class to fix 1007 payload error.
2025-08-04 11:03:57 -05:00
Filipi da Silva Fuchter
0184a8c231 Merge pull request #2351 from pipecat-ai/filipi/tavus_transport_ready
Changed `TavusVideoService` to send audio or video frames only after the   transport is ready, preventing warning messages at startup.
2025-08-04 11:48:27 -03:00
Filipi Fuchter
c22866ed58 Mentioning the TavusVideoService fix in the changelog. 2025-08-04 11:46:24 -03:00
Filipi Fuchter
0e533d21be Only send audio and video from the tavus video service if the transport is ready. 2025-08-04 10:52:30 -03:00
Mark Backman
6f6f4c3dea Merge pull request #2348 from sam-s10s/speechmatics-stt
Fix for Speechmatics STT
2025-08-04 06:15:39 -07:00
Mark Backman
f609971637 daily runner util: remove arg parsing, add auto room, token generation 2025-08-03 21:50:44 -04:00
Mark Backman
54ff10ae86 Merge pull request #2332 from hankehly/fix-piper-tts-json-payload
Fix PiperTTSService to send TTS input as JSON object
2025-08-03 17:39:04 -07:00
hankehly
77057eb829 Fix ruff formatting 2025-08-04 08:13:16 +09:00
Mark Backman
2b1a7b840d Merge pull request #2346 from adenta/heygen-testing
me trying to get heygen working
2025-08-03 14:11:14 -07:00
Sam Sykes
e07db88bc0 Updated changelog. 2025-08-03 22:11:10 +01:00
Sam Sykes
c2282b0e73 Set user_id to "" (not None) for RTVIProcessor. 2025-08-03 22:08:22 +01:00
Andrew Denta
593bf09d8d update script 2025-08-03 17:01:27 -04:00
Sam Sykes
534ed77ebf Merge branch 'main' into speechmatics-stt 2025-08-03 21:51:35 +01:00
Andrew Denta
193299988d me trying to get heygen working 2025-08-03 13:42:07 -04:00
Aleix Conchillo Flaqué
d589bcb345 Merge pull request #2344 from pipecat-ai/aleix/daily-python-0.19.6
pyproject: update daily-python to 0.19.6
2025-08-03 10:15:15 -07:00
Aleix Conchillo Flaqué
011ebc2801 Merge pull request #2345 from pipecat-ai/aleix/task-observer-signature-performance
TaskObserver: don't inspect on_push_frame signature for every frame
2025-08-03 10:15:00 -07:00
Aleix Conchillo Flaqué
3a72e94d0c LLMService: only do handle function inspection once 2025-08-03 10:09:19 -07:00
Aleix Conchillo Flaqué
d6d39fc873 TaskObserver: don't inspect on_push_frame signature for every frame 2025-08-03 10:09:17 -07:00
Pete
258e83c904 Fix: Correct Gemini Live API text input to prevent 1007 WebSocket errors
- Restore TextInputMessage.realtimeInput structure for correct API format
- Remove invalid turnComplete message from _send_user_text method
- turnComplete is only valid for clientContent, not realtimeInput messages
- realtimeInput text completion is automatically inferred by the API

This fixes WebSocket 1007 errors caused by mixing realtimeInput and
clientContent message types in violation of the Gemini Live API contract.
2025-08-03 10:58:59 -04:00
Mark Backman
061f2086b2 Merge pull request #2343 from pipecat-ai/mb/update-pre-commit-ruff-version
Update pre-commit-config ruff version
2025-08-03 03:38:54 -07:00
Aleix Conchillo Flaqué
a1f3f51168 pyproject: update daily-python to 0.19.6 2025-08-02 20:02:22 -07:00
hankehly
2177a2b805 Remove trailing space 2025-08-03 10:34:27 +09:00
hankehly
68164415ce Format changelog entry 2025-08-03 10:26:37 +09:00
hankehly
7646599b66 Add changelog entry 2025-08-03 10:23:58 +09:00
Mark Backman
e467eaf130 Merge pull request #2334 from Designedforusers/fix/tavus-transport-daily-callbacks
fix: Add missing transcription callbacks to TavusTransport for 0.0.77 compatibility
2025-08-02 16:09:57 -07:00
Designedforusers
9d6d53629e style: Apply ruff formatting to fix line length
Fixed a line length issue in tavus.py where the on_transcription_stopped callback was exceeding the maximum line length. Split the partial() call across multiple lines for better readability and compliance with project style guidelines.
2025-08-02 18:27:09 -04:00
Mark Backman
89596cfec4 Update pre-commit-config ruff version 2025-08-02 18:06:06 -04:00
Designedforusers
5e338ecaf1 refactor: Remove redundant transcription callback methods
As suggested in PR review, removed the _on_transcription_stopped and
_on_transcription_error method definitions. Now using the consistent
partial(self._on_handle_callback, ...) pattern for these callbacks,
matching how all other callbacks are handled.

This simplifies the code while maintaining the same functionality.
2025-08-02 15:02:54 -04:00
Designedforusers
62319021f8 docs: Add changelog entry for TavusVideoService fix
Added changelog entry as requested by maintainers for the fix addressing
missing transcription callbacks in TavusVideoService.
2025-08-02 14:53:44 -04:00
Pete
cccd82a617 Refactor TextInputMessage class to replace realtimeInput with a text attribute.
This was sending a 1007 because it was wrapping RealtimeInput in the json.

- Updated the `TextInputMessage` class to directly store text input as a string.
- Modified the `from_text` class method to create an instance using the new `text` attribute.
2025-08-02 14:34:00 -04:00
Mark Backman
f552ba1f5e Merge pull request #2336 from pipecat-ai/mb/suppress-pydub-warning
Suppress pydub (cartesia dependency) SyntaxWarning
2025-08-02 10:01:05 -07:00
Mark Backman
b9a2a9b729 Add Tavus, HeyGen, Simli to release-evals 2025-08-02 09:35:06 -04:00
Mark Backman
e43b3869c3 Suppress pydub SyntaxWarning from the cartesia module 2025-08-02 08:49:59 -04:00
Mark Backman
55731df999 Improve quickstart logging, runner startup message 2025-08-02 08:40:05 -04:00
Designedforusers
3a7ea25077 fix: Add missing transcription callbacks to TavusTransport for 0.0.77 compatibility
TavusTransport was broken in Pipecat 0.0.77 due to PR #2292 adding required
callbacks (on_transcription_stopped, on_transcription_error) to DailyCallbacks.

This fix adds placeholder implementations of these callbacks to TavusTransportClient,
allowing TavusTransport to initialize properly. These callbacks are not used by
Tavus (which handles avatar video, not transcription) but are required by the
DailyCallbacks validation.

Fixes initialization error:
- 2 validation errors for DailyCallbacks
- on_transcription_stopped: Field required
- on_transcription_error: Field required
2025-08-02 05:56:46 -04:00
Yohan Liyanage
248206e234 Fixes 2277 - SSML reserved characters in LLM generated text causes Azure TTS to fail. 2025-08-02 12:49:29 +05:30
hankehly
694922f627 Fix PiperTTSService to send TTS input as JSON object 2025-08-02 15:29:16 +09:00
Mark Backman
cc9950e72d Updating more foundational examples 2025-08-01 19:58:40 -04:00
richtermb
6814c390ba Update CHANGELOG to reflect the addition of the source field in ErrorFrame for improved error tracking. 2025-08-01 14:47:57 -07:00
Richter Brzeski
c2d05ad23b Merge branch 'pipecat-ai:main' into richtermb/push-more-error-frames 2025-08-01 14:47:08 -07:00
Mark Backman
ee56d8572d Merge pull request #2329 from pipecat-ai/mb/fix-livekit-empty-audio-frames
fix: LiveKitTransport, don't push empty AudioRawFrames
2025-08-01 12:53:05 -07:00
richtermb
91568eeddc Update type hint for source in ErrorFrame to use forward declaration for improved clarity. 2025-08-01 12:52:56 -07:00
richtermb
165d6b4c1d Update CHANGELOG to include new source field in ErrorFrame for error tracking. 2025-08-01 12:25:29 -07:00
Mark Backman
1d8abe3c1c fix: LiveKitTransport, don't push empty AudioRawFrames 2025-08-01 14:57:53 -04:00
Mark Backman
a6e69d6aad Merge pull request #2325 from pipecat-ai/mb/dependency-groups
Move dev to [dependency-groups], update uv.lock
2025-08-01 11:54:21 -07:00
Mark Backman
519da9cc61 Align 04b livekit example with other foundational examples 2025-08-01 14:28:15 -04:00
richtermb
ead4e97ab5 Add source parameter to ErrorFrame and set it in FrameProcessor. Updated error handling in AnthropicLLMService and DeepgramSTTService to include ErrorFrame with source information. 2025-08-01 11:14:50 -07:00
Mark Backman
0c021378b0 Merge pull request #2326 from pipecat-ai/readme-quickstart-link
Update README.md
2025-08-01 10:45:30 -07:00
Mark Backman
e22c7e8ad5 Update README.md 2025-08-01 13:40:03 -04:00
Mark Backman
b71057bf7c Move dev to [dependency-groups], update uv.lock 2025-08-01 09:43:56 -04:00
Mark Backman
0865f6cd7d Merge pull request #2318 from pipecat-ai/mb/add-asyncio-readme
Add AsyncAI TTS to README vendor list
2025-08-01 06:11:13 -07:00
Mark Backman
610b1ab065 Merge pull request #2319 from pipecat-ai/mb/use-new-runner
Update foundational examples to use new runner
2025-08-01 06:11:03 -07:00
Mark Backman
3a2a226668 Merge pull request #2320 from pipecat-ai/mb/uv-lock-init
Add initial uv.lock file
2025-08-01 06:07:53 -07:00
Mark Backman
8e4b7352fd Merge pull request #2321 from pipecat-ai/mb/dev-requirements
Add dev to optional-dependencies
2025-08-01 06:02:58 -07:00
Mark Backman
637d372fe4 Add dev to optional-dependencies 2025-07-31 23:39:23 -04:00
Mark Backman
ac15fe8ae4 Add workflow to update lockfile with pyproject.toml changes 2025-07-31 23:08:21 -04:00
Mark Backman
07239c0b8b Add initial uv.lock file 2025-07-31 22:46:44 -04:00
Mark Backman
367b2fbe3c Update requirements.txt 2025-07-31 22:12:57 -04:00
Mark Backman
f1b1d5b130 Update foundational examples to use the development runner 2025-07-31 22:11:32 -04:00
Mark Backman
ff45b77fdf Remove examples runner 2025-07-31 21:22:04 -04:00
Mark Backman
e522b7ae96 Add AsyncAI TTS to README vendor list 2025-07-31 19:33:37 -04:00
Mark Backman
b8eef4f93b Merge pull request #2314 from pipecat-ai/mb/sync-quickstart-example
Add workflow to sync quickstart to pipecat-quickstart repo
2025-07-31 15:34:26 -07:00
Mark Backman
dcc205996a Merge pull request #2317 from pipecat-ai/mb/release-prep-0.0.77
Changelog update for 0.0.77
2025-07-31 15:34:02 -07:00
Mark Backman
9f61af4d1b Changelog update for 0.0.77 2025-07-31 18:19:05 -04:00
Sam Sykes
e8faf28e6a Doc fix for incorrect argument name. 2025-07-31 22:30:54 +01:00
Filipi da Silva Fuchter
40d53b3d84 Merge pull request #2316 from sam-s10s/speechmatics-stt
Updated to SpeechmaticsSTTService
2025-07-31 18:28:16 -03:00
Sam Sykes
7c223a86c2 Fix to missing deprecated attribute enable_speaker_diarization. 2025-07-31 22:25:46 +01:00
Sam Sykes
2d3f61aa07 Updated Speechmatics Plugin (#2225)
Changes
Split out module attributes to make engine settings clearer
Removed internal audio buffer to use latest Speechmatics python SDK (0.4.0)
Use diarization for improved VAD in multi-speaker situations
Support custom dictionary / vocabulary with attributes
Deprecated attributes superseded by re-organised attributes

Diarization Enhancements
Focus on specific speakers (using speaker labels)
Ignore specific speakers (using speaker labels)
Separate transcription formats for active and inactive speakers
Support for known speakers
2025-07-31 17:51:38 -03:00
Mark Backman
e05a47744d Merge pull request #2311 from pipecat-ai/mb/quickstart-fixups
Set quickstart min pipecat-ai version to 0.0.77, remove non-quickstart examples
2025-07-31 13:42:10 -07:00
Aleix Conchillo Flaqué
6ffaab2b93 CHANGELOG cosmetics 2025-07-31 13:39:37 -07:00
Aleix Conchillo Flaqué
c2d8844903 Merge pull request #2312 from pipecat-ai/aleix/srhinos/main
Enable Interruption Support for LLMUserResponseAggregator
2025-07-31 13:30:57 -07:00
Mark Backman
e8caba7723 Add workflow to sync quickstart to pipecat-quickstart repo 2025-07-31 15:56:18 -04:00
Mark Backman
df96ef7d37 Remove non-quickstart demos 2025-07-31 15:38:54 -04:00
Aleix Conchillo Flaqué
7553f670af fix formatting and update CHANGELOG 2025-07-31 10:41:11 -07:00
Mark Backman
6960f5861b Set example min pipecat-ai version to 0.0.77 due to runner requirement 2025-07-31 12:21:58 -04:00
Mark Backman
b5edbbc0ca Merge pull request #2309 from pipecat-ai/mb/remove-runner-examples
Remove examples/runner-examples
2025-07-31 09:18:22 -07:00
Aleix Conchillo Flaqué
e78d9c2c95 Merge pull request #2293 from azain47/azain47/fix-piper-tts-service
Fix Piper TTS Service
2025-07-31 08:32:17 -07:00
Vanessa Pyne
b25547a98b Merge pull request #2305 from pipecat-ai/vp-changelog-text-input
update changelog
2025-07-31 10:16:47 -05:00
Mark Backman
e80281c3c4 Remove examples/runner-examples 2025-07-31 10:59:06 -04:00
Mark Backman
d692843e5b Merge pull request #2308 from pipecat-ai/mb/change-neuphonic-url
NeuphonicTTSService: change the default url value to the global endpoint
2025-07-31 07:38:57 -07:00
Mark Backman
eaad3c5d55 NeuphonicTTSService: change the default url value to the global endpoint 2025-07-31 10:24:54 -04:00
vipyne
f2a1c66379 update changelog 2025-07-31 08:55:25 -05:00
Vanessa Pyne
af8de227bb Merge pull request #2223 from getchannel/realtime-text
Add text input handling to unify context for realtimeInput stream of GeminiMultimodalLiveService
2025-07-31 08:53:39 -05:00
Mark Backman
7cd78dd286 Merge pull request #2303 from pipecat-ai/mb/add-new-quickstart-demos
Add quickstart demos
2025-07-31 05:56:00 -07:00
Mark Backman
226b516948 Add quickstart demos 2025-07-30 22:14:10 -04:00
Mark Backman
aa85fffa57 New runner module (#2269)
* Adds pipecat.runner.run - FastAPI-based development server with automatic bot discovery

* Adds new RunnerArguments types for different transports

* Adds new runner utils for creating transports and parsing data

* Adds new Daily and LiveKit utils for setup
2025-07-30 22:02:28 -04:00
srhinos
8b97ab70ff Enable Interruption Support for LLMUserResponseAggregator 2025-07-30 20:48:31 -04:00
Filipi da Silva Fuchter
9013b2929a Merge pull request #2300 from pipecat-ai/filipi/fast_api_race_condition
Fixed a race condition in FastAPIWebsocketClient
2025-07-30 18:10:09 -03:00
Filipi Fuchter
0c6e12a9b0 Fixed a race condition in FastAPIWebsocketClient that occurred when attempting to send a message while the client was disconnecting. 2025-07-30 18:07:40 -03:00
Aleix Conchillo Flaqué
efb24071d5 Merge pull request #2301 from pipecat-ai/aleix/daily-python-0.19.5
pyproject: update daily-python to 0.19.5
2025-07-30 14:01:27 -07:00
Filipi da Silva Fuchter
318ebec67e Merge pull request #2298 from pipecat-ai/filipi/google_interruptions
Fixed an issue in GoogleLLMService where interruptions did not work when an interruption strategy was used.
2025-07-30 17:49:07 -03:00
Aleix Conchillo Flaqué
c679227aa8 pyproject: update daily-python to 0.19.5 2025-07-30 13:19:48 -07:00
Filipi Fuchter
392853f5fa Fixed an issue in GoogleLLMService where interruptions did not work when an interruption strategy was used. 2025-07-30 12:10:32 -03:00
Mark Backman
98d27caab3 Merge pull request #2296 from pipecat-ai/mb/switch-rime-voices
Added the ability to switch voices to RimeTTSService
2025-07-30 07:55:52 -07:00
Mark Backman
0fa51968bf Added the ability to switch voices to RimeTTSService 2025-07-30 10:53:14 -04:00
Mark Backman
92aee2634b Merge pull request #2291 from pipecat-ai/mb/remove-on-client-closed 2025-07-30 06:36:32 -07:00
Filipi da Silva Fuchter
bff6a93f31 Merge pull request #2150 from pipecat-ai/filipi/hey_gen
HeyGen implementation for Pipecat - HeyGenVideoService
2025-07-30 09:10:07 -03:00
Filipi Fuchter
6e921cdf45 HeyGen implementation for Pipecat - HeyGenVideoService 2025-07-30 09:07:15 -03:00
Azain.
1e2b066cf3 Fix tts.py
Update Piper TTS Service to work with the newer Piper GPL Version, that uses JSON as its payload.
2025-07-30 13:41:27 +05:30
Pete
2af3b6329d Ruff format debug 2025-07-29 17:48:11 -04:00
Pete
8ca06e5887 Add InputTextRawFrame class for handling raw text input in frames
- Introduced `InputTextRawFrame` to represent raw text input from users or programs.
- Updated `GeminiMultimodalLiveLLMService` to process `InputTextRawFrame` and send user text via the Gemini Live API's realtime input stream.
- Enhanced `_send_user_text` method documentation for clarity on its functionality and usage.
2025-07-29 17:43:14 -04:00
Mark Backman
c145a9ef13 Merge pull request #2288 from pipecat-ai/mb/stt-mute-examples
Update placement of STTMuteFilter in examples to reflect the new reco…
2025-07-29 12:10:36 -07:00
Mark Backman
b523f9a4c6 Merge pull request #2248 from ashotbagh/feat/async-tts
feat(tts): integrate Async TTS engine into pipecat
2025-07-29 12:10:11 -07:00
Mark Backman
7f184422d0 Merge branch 'main' into feat/async-tts 2025-07-29 12:06:56 -07:00
Aleix Conchillo Flaqué
fa4c3ec6bf Merge pull request #2287 from pipecat-ai/aleix/asyncio-trace-logging
utils(asyncio): use trace logging for some cancelling messages
2025-07-29 11:56:25 -07:00
Aleix Conchillo Flaqué
9fafc10844 Merge pull request #2292 from richtermb/transcription-error-callback
Add on_transcription_error callback to DailyCallbacks and handle tran…
2025-07-29 11:56:02 -07:00
richtermb
67107d02ed Refactor callback invocation for on_transcription_stopped in DailyTransportClient for improved readability 2025-07-29 11:53:41 -07:00
richtermb
c1df19982c Add on_transcription_stopped callback to DailyTransport for handling transcription stop events 2025-07-29 11:50:16 -07:00
richtermb
444b1b5b02 Add on_transcription_stopped callback to DailyCallbacks and implement handling in DailyTransport for transcription stop events 2025-07-29 11:49:28 -07:00
Mark Backman
ebfa4f2d5e Push the STTMuteFrame upstream and downstream 2025-07-29 14:37:36 -04:00
Mark Backman
e961c438e7 Update placement of STTMuteFilter in examples to reflect the new recommendation 2025-07-29 14:36:39 -04:00
richtermb
d3d36a89e2 Add _on_transcription_error method to DailyTransport for handling transcription error events 2025-07-29 10:48:50 -07:00
richtermb
fa6e5ce4a7 Add on_transcription_error callback to DailyCallbacks and handle transcription errors in DailyTransportClient 2025-07-29 10:43:18 -07:00
Mark Backman
3ffb261864 Remove use of on_client_closed event in foundational examples 2025-07-29 13:28:33 -04:00
Mark Backman
f69a02b7a7 Merge pull request #2290 from pipecat-ai/mb/remove-examples
Removed most pipecat examples, relocating to pipecat-examples repo
2025-07-29 10:02:24 -07:00
Mark Backman
f1f4aed398 Remove random Dockerfile, update README links 2025-07-29 11:41:27 -04:00
Mark Backman
414c245c92 Remove android.yaml github workflow 2025-07-29 11:34:48 -04:00
Mark Backman
3f57d94c0b Update examples README 2025-07-29 11:28:40 -04:00
Mark Backman
15e3c69ddc Removed most pipecat examples, relocating to pipecat-examples repo 2025-07-29 11:17:01 -04:00
Ashot
39b00f5269 chore: address review comments 2025-07-29 18:20:50 +04:00
Mark Backman
4c368c78c6 Merge pull request #2289 from tomoima525/tomoima525/transcription-bucket-params
Add transcription_bucket param for rest helper
2025-07-29 05:05:08 -07:00
Tomoaki Imai
6eb00a99cb update Changelog for transcription_bucket params addition 2025-07-29 20:38:12 +09:00
Tomoaki Imai
3ae8cf1916 Add transcription_bucket param for rest helper 2025-07-29 18:58:38 +09:00
Aleix Conchillo Flaqué
03e87469df utils(asyncio): use trace logging for some cancelling messages 2025-07-28 17:43:41 -07:00
Mark Backman
70255d3c81 Merge pull request #2274 from pipecat-ai/mb/remove-message-name
Remove message["name"] addition when pushing
2025-07-28 17:29:23 -07:00
Mark Backman
96a72d0647 Remove message[name] addition when pushing 2025-07-28 20:13:50 -04:00
Mark Backman
27d4910694 Merge pull request #2286 from pipecat-ai/mb/fix-transcript-processor-newline
fix: Improve TranscriptProcessor detection for transcript type
2025-07-28 17:07:43 -07:00
Mark Backman
50242f4ad8 fix: Improve TranscriptProcessor detection for transcript type 2025-07-28 19:56:36 -04:00
Mark Backman
c9dda5251c Merge pull request #2284 from pipecat-ai/mb/yank-74-75
Add yanked notices to 0.0.74 and 0.0.75 changelogs
2025-07-28 12:55:47 -07:00
Mark Backman
419cc9ac68 Add yanked notices to 0.0.74 and 0.0.75 changelogs 2025-07-28 13:36:02 -04:00
Ashot
83b4747196 chore: address review comments 2025-07-28 17:52:17 +04:00
Ashot
a13b954415 formatting/cleanup: address Copilot PR review comments 2025-07-28 17:43:17 +04:00
Ashot
f2e9562f1b feat(tts): integrate Async TTS engine into pipecat 2025-07-28 17:42:57 +04:00
Mark Backman
afed9a61f2 Merge pull request #2268 from pipecat-ai/mb/inworld-changelog
Add changelog entry for InworldTTSService
2025-07-28 05:59:57 -07:00
Mark Backman
f0de27b35e Merge pull request #2273 from pipecat-ai/mb/gitignore-plivo-stream
.gitignore: add plivo-chatbot streams.xml, .python-version
2025-07-28 05:59:41 -07:00
Filipi da Silva Fuchter
9d5510ee47 Merge pull request #2265 from pipecat-ai/filipi/small_webrtc_buffer_processor
Fixed an issue in AudioBufferProcessor when using SmallWebRTCTransport
2025-07-28 09:23:58 -03:00
Mark Backman
434c3fc527 Merge pull request #2279 from Allenmylath/patch-26
Update README.md
2025-07-28 04:51:58 -07:00
allenmylath
aba79a9478 Update README.md
Explanatory comment.Eventhought code allows for transport params.It is not clearly given in readme, what all are the options are there.
2025-07-28 11:24:27 +05:30
Mark Backman
fc96e091a9 Update NVIDIA in README 2025-07-26 15:01:52 -04:00
Mark Backman
851a27c082 Add Groq TTS to README 2025-07-26 14:58:07 -04:00
Mark Backman
a72d93dc6d Add Inworld to README 2025-07-26 14:55:19 -04:00
Mark Backman
c971232f20 .gitignore: add plivo-chatbot streams.xml, .python-version 2025-07-26 10:19:50 -04:00
Mark Backman
4b2ba2d69f Merge pull request #2270 from Allenmylath/patch-25
Update README.md
2025-07-26 04:55:59 -07:00
allenmylath
240a698fab Update README.md
recreating examples without activating bedrock models will create errors . Warning and link added
2025-07-26 14:51:19 +05:30
Mark Backman
9aaae01063 Add changelog entry for InworldTTSService 2025-07-25 21:46:02 -04:00
Mark Backman
41c8d22cf3 Merge pull request #2208 from padillamt/mtp/add-inworld-tts
Inworld HTTP TTS Service
2025-07-25 17:13:37 -07:00
padillamt
b68f044ef7 mtpadilla: updated example to reflect parameter placement changes in base Inworld TTS class 2025-07-25 15:13:43 -07:00
padillamt
e140bd6960 mtpadilla: moved model and voice id setting into the class constructor 2025-07-25 14:04:49 -07:00
Filipi Fuchter
e86b55e2b3 Fixed an issue in AudioBufferProcessor when using SmallWebRTCTransport where, if the microphone was muted, track timing was not respected. 2025-07-25 17:01:41 -03:00
padillamt
4a9bec5b35 mtpadilla: stop metrics at result chunk 2025-07-25 11:14:20 -07:00
padillamt
37361391d9 mtpadilla: removed ability to set base_url via constructor, set internally based on streaming variable 2025-07-25 09:16:56 -07:00
Filipi da Silva Fuchter
4b3726eba4 Merge pull request #2260 from pipecat-ai/filipi/audio_resampler
Fixed an issue in `AudioBufferProcessor` that caused garbled audio
2025-07-25 09:27:42 -03:00
padillamt
8e66794759 mtpadilla: switch to Deepgram ASR for lower latency 2025-07-24 22:22:36 -07:00
padillamt
acc5b9f210 inworld: change to function that stops all processing metrics
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 22:07:15 -07:00
padillamt
f982ace4c5 inworld: removal of unnecessary setting of ssampling rate since matches default 2025-07-24 21:56:01 -07:00
padillamt
5fb1899aeb inworld: removal of unnecessary default assignment as already handled 2025-07-24 21:42:42 -07:00
padillamt
7483422bd9 inworld: change set_voice uto use self._settings
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:23:03 -07:00
padillamt
16c20f3a99 inworld: removal of unnecessary default assignment since already done
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:15:34 -07:00
padillamt
d248c102c8 inworld: removal of unnecessary default assignment since already done
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:15:20 -07:00
padillamt
662550cc5e mtpadilla: remove unused imports 2025-07-24 21:05:22 -07:00
padillamt
067f64389b mtpadilla: no longer needed so making empty 2025-07-24 20:44:27 -07:00
padillamt
81048ce43a mtpadilla: rename 07aa-interruptible-inworld-http.py to 07ab-interruptible-inworld-http.py 2025-07-24 20:42:29 -07:00
padillamt
f6440ee6e1 mtpadilla: correct Examples header in comments 2025-07-24 13:36:40 -07:00
padillamt
da8c67114a mtpadilla: make streaming the default for example 2025-07-24 13:35:29 -07:00
Mark Backman
d8ea1311ff Merge pull request #2254 from pipecat-ai/mb/11labs-create-context-id
ElevenLabsTTSService: Only reset the context_id when interruptions ar…
2025-07-24 13:32:37 -07:00
Mark Backman
2be615066c Merge pull request #2261 from pipecat-ai/mb/foundational-requirements
Foundational requirements.txt: add silero, websocket optional dep, re…
2025-07-24 11:06:16 -07:00
Mark Backman
75c2ffc0b5 Check if audio context is already available, create one if not 2025-07-24 13:57:46 -04:00
Mark Backman
2297eb217e ElevenLabsTTSService: Only reset the context_id when interruptions are enabled 2025-07-24 13:53:44 -04:00
Mark Backman
1bb821a07d Foundational requirements.txt: add silero, websocket optional dep, remove fastapi 2025-07-24 13:49:44 -04:00
Filipi Fuchter
970b8044a0 Fixed an issue in AudioBufferProcessor that caused garbled audio when enable_turn_audio was enabled and audio resampling was required. 2025-07-24 13:25:48 -03:00
Filipi da Silva Fuchter
d8bcb81f35 Merge pull request #2259 from pipecat-ai/filipi/eleven_labs_delayed_messages
Play delayed messages from `ElevenLabsTTSService` if they still belong to the current context.
2025-07-24 12:07:06 -03:00
Filipi da Silva Fuchter
3ce0ab8c6d Removing extra space.
Co-authored-by: Mark Backman <mark@daily.co>
2025-07-24 12:05:17 -03:00
Filipi Fuchter
097d786431 Fixing ruff format. 2025-07-24 12:03:17 -03:00
Filipi Fuchter
662f04879c Play delayed messages from ElevenLabsTTSService if they still belong to the current context. 2025-07-24 12:00:14 -03:00
Mark Backman
7a69f57e11 Merge pull request #2255 from pipecat-ai/mb/pyproject-versions-for-uv
pyproject.toml dependency updates to support better cross compatibility
2025-07-24 06:43:35 -07:00
Mark Backman
5b7b4efdc9 Add broader version support for stable core dependencies, up to the next major version 2025-07-24 09:40:52 -04:00
Mark Backman
cfa26524ca Add support for fastapi>=0.115.6,<0.117.0 2025-07-24 09:37:42 -04:00
Mark Backman
3d4ab7158d pyproject.toml dependency updates to support better cross compatibility 2025-07-24 09:37:42 -04:00
Mark Backman
26d1ca3c98 Merge pull request #2256 from pipecat-ai/mb/refactor-neuphonic-http
NeuphonicHttpTTSService: Refactor to use POST API
2025-07-24 06:36:23 -07:00
Mark Backman
083b32887e NeuphonicHttpTTSService: Refactor to use POST API 2025-07-24 01:05:37 -04:00
padillamt
b6367965cb mtpadilla: consolidate streaming and non-streaming options into a single class with common API, with boolean switch variable added (streaming) 2025-07-23 16:50:32 -07:00
padillamt
147bf9cfe8 mtpadilla: addition of non-streaming option with own dedicated class, and related additional non-streaming test option 2025-07-23 15:28:43 -07:00
Mark Backman
3391929127 Merge pull request #2252 from pipecat-ai/mb/example-axios-version-bump
Update axios in daily-pstn-server example due to transitive vulnerabi…
2025-07-23 13:30:58 -07:00
padillamt
a5d353030e mtpadilla: small formatting fix to comments 2025-07-23 12:02:58 -07:00
padillamt
f29024bcc0 mtpadilla: update coments regarding temperature parameter 2025-07-23 11:47:26 -07:00
Mark Backman
ebf9bc2741 Merge pull request #2246 from ydlamba/ydlamba/missing-livekit-event
fix(livekit): emit on_audio_track_subscribed event
2025-07-23 11:27:10 -07:00
Mark Backman
f5edde42f6 Update axios in daily-pstn-server example due to transitive vulnerability with form-data 2025-07-23 14:22:13 -04:00
Filipi da Silva Fuchter
37bb7ef926 Merge pull request #2239 from pipecat-ai/filipi/daily_log
Added `set_log_level` to `DailyTransport`
2025-07-23 14:48:34 -03:00
Filipi Fuchter
a63d1530a4 Added set_log_level to DailyTransport. 2025-07-23 14:43:53 -03:00
Yash Dev Lamba
960bc9df5b chore(changelog): add entry for LiveKitTransport audio subscribed event fix 2025-07-23 22:41:20 +05:30
Mark Backman
e2a153ee01 Merge pull request #2242 from pipecat-ai/mb/websockets-14
Upgrade websockets to support asyncio implementation
2025-07-23 08:58:08 -07:00
Mark Backman
300f19ad23 Port to the websockets asyncio implementation, support for websockets 13 and 14 2025-07-23 11:54:25 -04:00
Mark Backman
7955080da2 Change extra_headers to additional_headers, update websocket version support 2025-07-23 11:53:43 -04:00
Mark Backman
994e82c1ef Merge pull request #2243 from pipecat-ai/mb/word-wrangler-twilio-readme
Update Word Wrangler phone bot README to include deployment info
2025-07-23 07:04:19 -07:00
Mark Backman
b07b947352 Merge pull request #2244 from pipecat-ai/mb/upgrade-deepgram-4.7.0
Deepgram: Update optional dependency to 4.7.0
2025-07-23 07:04:02 -07:00
Filipi da Silva Fuchter
a6527c3856 Merge pull request #2240 from pipecat-ai/filipi/sig_term
Adding support for handle_sigterm
2025-07-23 08:15:50 -03:00
antonyesk601
1cbf7ae480 fix: remove unused variable; fix: remove redundant logic 2025-07-23 08:26:44 +00:00
Yash Dev Lamba
0e6874b605 fix(livekit): emit on_audio_track_subscribed event 2025-07-23 08:23:45 +05:30
Mark Backman
9ba172c49f Merge pull request #2236 from dbtreasure/fix/python-311-compatibility
Fix Python 3.11+ compatibility by pinning numba/llvmlite versions
2025-07-22 18:20:38 -07:00
dbtreasure
f710c94b6e Address code review feedback: remove explicit llvmlite pin
- Remove explicit llvmlite>=0.44.0 pin as numba>=0.61.0 automatically pulls compatible version
- Add changelog entry for Python 3.11+ dependency fix

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-22 18:45:32 -06:00
dbtreasure
6e3a0a2d5d Add explicit numba/llvmlite pins for Python 3.11+ compatibility
Fixes dependency resolution issues where transitive dependencies
through resampy would install incompatible versions:
- numba>=0.61.0 (supports Python 3.10-3.13)
- llvmlite>=0.44.0 (supports Python 3.10-3.13)

Previously, older versions (numba 0.53.1, llvmlite 0.36.0) only
supported Python 3.6-3.9, causing deployment failures on Python 3.11+.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-22 18:45:02 -06:00
Mark Backman
9530b8b842 Merge pull request #2235 from pipecat-ai/mb/nltk-tokenizer
Update match_endofsentence to use NLTK sentence tokenizer
2025-07-22 17:22:23 -07:00
Mark Backman
26c937af87 Update match_endofsentence to use NLTK sentence tokenizer 2025-07-22 20:19:29 -04:00
Mark Backman
976f6168f0 Deepgram: Update optional dependency to 4.7.0 2025-07-22 20:15:30 -04:00
Mark Backman
0be64e0fd9 Update Word Wrangler phone bot README to include deployment info 2025-07-22 20:10:20 -04:00
Filipi Fuchter
7d527c3a6b Mentioning the new field in the changelog. 2025-07-22 19:32:52 -03:00
Filipi Fuchter
c6f6930c27 Adding support for handle_sigterm 2025-07-22 17:24:07 -03:00
Mark Backman
c33dfe8309 Merge pull request #2233 from pipecat-ai/mb/enable-tracing-flag
fix: enable_tracing PipelineParam controls the service class decorators
2025-07-22 08:14:32 -07:00
Mark Backman
769cd1ef06 fix: enable_tracing PipelineParam controls the service class decorators 2025-07-22 11:10:53 -04:00
Mark Backman
6d72f60571 Merge pull request #2234 from pipecat-ai/mb/fix-minimax-pitch
fix: MiniMaxHttpTTSService pitch, add base_url arg
2025-07-22 08:10:01 -07:00
Mark Backman
e8d0712ac1 Merge pull request #2238 from pipecat-ai/mb/patch-form-data
Fix form-data vulnerability in pipecat-cloud-daily-pstn-server
2025-07-22 08:09:49 -07:00
Mark Backman
88b2c817ac Fix form-data vulnerability in pipecat-cloud-daily-pstn-server 2025-07-22 10:08:25 -04:00
Mark Backman
f8f6c9918d Merge pull request #2237 from pipecat-ai/mb/pipecat-cloud-example-pipeline-runner-args
Update Pipecat Cloud example to use handle_sigint=False in PipelineRu…
2025-07-22 06:55:56 -07:00
Mark Backman
8ee608bbfe Update Pipecat Cloud example to use handle_sigint=False in PipelineRunner args 2025-07-22 09:52:57 -04:00
Mark Backman
fad2ba4570 Merge pull request #2204 from yousifa/mcp-FunctionCallParams 2025-07-22 05:01:32 -07:00
Mark Backman
f609f7eb53 fix: MiniMaxHttpTTSService pitch, add base_url arg 2025-07-21 21:16:35 -04:00
Mark Backman
ea09813a2b Merge pull request #2227 from pipecat-ai/mb/fix-11labs-wordtimestamps
fix: Improve ElevenLabsTTSService word/timestamp calcuation accuracy
2025-07-21 16:07:07 -07:00
Mark Backman
53abfc27a7 fix: Improve ElevenLabsTTSService word/timestamp calcuation accuracy 2025-07-21 18:48:38 -04:00
padillamt
1915407ff7 inworld: removed unreferenced is_first_chunk variable 2025-07-21 15:30:48 -07:00
Mark Backman
9c72e96a2c Merge pull request #2230 from pipecat-ai/mb/livekit-tenacity
Livekit: change tenacity supported versions
2025-07-21 15:28:38 -07:00
Mark Backman
f66c67c4ab Merge pull request #2232 from pipecat-ai/mb/fix-ollama-args
Fix: Ollama kwargs error
2025-07-21 15:26:13 -07:00
Mark Backman
b623face03 Add Ollama function calling example 14u 2025-07-21 17:52:16 -04:00
Mark Backman
698d60f3ae fix: OLLamaLLMService pass base_url as kwarg 2025-07-21 17:51:11 -04:00
Mark Backman
c9717a23a5 Livekit: change tenacity supported versions 2025-07-21 17:30:18 -04:00
padillamt
076a675a75 inworld: Fix...Set sample_rate=None in InworldHttpTTSService to match Cartesia pattern 2025-07-21 13:50:36 -07:00
padillamt
0d5292c4ef inworld: typo fix in voice name 2025-07-21 13:48:13 -07:00
padillamt
4853d5d55c inworld: updated InworldHttpTTSService initialization 2025-07-21 13:27:25 -07:00
padillamt
8eda2435a2 inworld: removed explicit references to language since our models currently infer that from the text. 2025-07-21 13:24:10 -07:00
Mark Backman
d981ce6e56 Merge pull request #2226 from pipecat-ai/mb/11labs-speed-docstring
Fix 11Labs speed docstring
2025-07-21 13:21:45 -07:00
padillamt
54ff946976 inworld: largely adjustments for docstring compatibility 2025-07-21 12:07:58 -07:00
Mark Backman
1bbd3bd8ab Fix 11Labs speed docstring 2025-07-21 14:58:12 -04:00
padillamt
aadd088b50 inworld: commented out contents as per Pipecat guidance that this pattern is being retired 2025-07-21 10:52:55 -07:00
padillamt
4250aa6616 inworld: removal of backup copy, no longer needed 2025-07-21 10:11:50 -07:00
Kwindla Hultman Kramer
a20915caa7 Merge pull request #2224 from pipecat-ai/khk/mps
Add MPS backend auto-detection to local smart-turn v2
2025-07-21 09:24:51 -07:00
Vanessa Pyne
28cab5a606 Merge pull request #1932 from getchannel/groundingMetadata
Add groundingMetadata to Gemini Multimodal Live Service
2025-07-21 10:09:26 -05:00
Vanessa Pyne
cfea56064d small merge-main nit fixes - gemini_multimodal_live events.py 2025-07-21 09:54:15 -05:00
Vanessa Pyne
8467d87cfc small main-merge fixes - gemini.py 2025-07-21 09:52:32 -05:00
Kwindla Hultman Kramer
b20d020bea Add MPS backend auto-detection to local smart-turn v2 2025-07-20 20:18:45 -04:00
padillamt
e3711f96a3 inworld: added detailed comments 2025-07-20 17:06:35 -07:00
Pete
948257c66e Merge branch 'main' into groundingMetadata 2025-07-20 19:54:30 -04:00
Pete
b54d1fb7fd Resolve merge conflict and remove duplicate File API initialization
- Remove duplicate file_api initialization lines
- Keep grounding metadata tracking functionality
- Maintain clean code structure
2025-07-20 19:15:40 -04:00
Pete
ec361df0d1 Fix final ruff linting issues
- Remove duplicate import in __init__.py
- Clean up extra blank lines in gemini.py
- Remove extra blank line in _create_single_response method
2025-07-20 18:58:54 -04:00
Pete
b1a5cddde4 Refactor whitespace and formatting in multiple files
- Clean up unnecessary whitespace in `gemini.py`, `events.py`, and `file_api.py`
- Ensure consistent formatting in `26g-gemini-multimodal-live-groundingMetadata.py`
- Improve readability by aligning code and removing trailing spaces
2025-07-20 18:40:12 -04:00
Pete
e165d38277 remove truncated logging from debug 2025-07-20 18:27:21 -04:00
Pete
8ba340a8a5 remove debug logging 2025-07-20 18:21:42 -04:00
Pete
8f74b97591 Refactor _send_user_text method in Gemini multimodal service to streamline event creation for turn completion 2025-07-20 18:08:45 -04:00
Pete
1d69cd1a5e Remove debug logging from _send_user_text method in Gemini multimodal service 2025-07-20 18:04:57 -04:00
Pete
bd7a0f27cc Add text input handling to Gemini multimodal service
- Updated `RealtimeInput` to include an optional `text` parameter.
- Introduced `TextInputMessage` class for encapsulating text input data.
- Implemented `_send_user_text` method to send text input to the Gemini Live API.
- Enhanced message processing to support text input alongside media chunks.
2025-07-20 17:39:31 -04:00
padillamt
5d8c184d99 inworld: commit of original text file and changes that copy openai's with Inworld TTS as only change 2025-07-18 16:30:03 -07:00
padillamt
1bc442e329 inworld: docstring fix 2025-07-18 15:13:19 -07:00
kompfner
d4e33663b2 Merge pull request #2214 from pipecat-ai/pk/fix-google-llm-context
Fixed an issue in `GoogleLLMContext` where it would inject the `syste…
2025-07-18 09:28:28 -04:00
marcus-daily
d7d1b16dad Removing old import 2025-07-18 12:48:06 +01:00
marcus-daily
0bc2ea13f2 Updating changelog 2025-07-18 12:48:06 +01:00
marcus-daily
b5d1301221 Fix linter warnings 2025-07-18 12:48:06 +01:00
marcus-daily
ed8f30ec71 Add support for running smart-turn-v2 locally 2025-07-18 12:48:06 +01:00
antonyesk601
688031efd6 fix: use undeclared variable _preinitialized. fix: double send of start frame 2025-07-18 08:23:04 +00:00
kompfner
a74a935ca0 Merge pull request #1910 from matejmarinko-soniox/main
Add Soniox STT service integration
2025-07-17 09:29:07 -04:00
antonyesk601
0f9e69d3c7 feat: Add Simli Trinity models support to pipecat 2025-07-17 11:55:40 +00:00
padillamt
f3984aec33 inworld: added (empty) requirements for Inworld to be explicit reg dependencies 2025-07-16 13:21:32 -07:00
Paul Kompfner
7cfd56699b Fixed an issue in GoogleLLMContext where it would inject the system_message as a "user" message into cases where it was not meant to; it was only meant to do that when there were no "regular" (non-function-call) messages in the context, to ensure that inference would run properly. 2025-07-16 16:07:53 -04:00
Matej Marinko
cb984237a7 Fix lint error 2025-07-16 16:54:28 +02:00
Matej Marinko
c969fdddb9 Rename and simplify VAD finalization parameter usage 2025-07-16 09:47:34 +02:00
padillamt
2b76823b01 inworld: added comments to track a few things to confirm 2025-07-15 18:17:30 -07:00
padillamt
ca936bd569 inworld: added Inworld to list of needed credentials 2025-07-15 18:11:50 -07:00
padillamt
c67b779b91 inworld: first commit of Inworld example file for TTS 2025-07-15 17:21:16 -07:00
padillamt
913dba3b74 inworld: class name change 2025-07-15 17:15:57 -07:00
padillamt
384838147a inworld: removed unnecessary code from stop() and cancel() 2025-07-15 16:56:18 -07:00
padillamt
7861b911c0 inworld: first commit of __init__ and tts.py files 2025-07-15 16:50:50 -07:00
Mark Backman
9931ad2ce1 Merge pull request #2199 from Dev-Khant/add-host-support-in-Mem0
Add `host` support in Mem0 Memory
2025-07-15 11:41:15 -07:00
Filipi da Silva Fuchter
fd73feb645 Merge pull request #2201 from pipecat-ai/filipi/stt_issue
Only create the EmulateUserStartedSpeakingFrame if we have received a transcription
2025-07-15 13:56:11 -03:00
Yousif Astarabadi
ee78428a2a formatted 2025-07-14 20:38:28 -07:00
Yousif Astarabadi
ae02249255 mcp_tool_wrapper using FunctionCallParams 2025-07-14 20:31:22 -07:00
Filipi Fuchter
727af2e6fb Only create the EmulateUserStartedSpeakingFrame if we have received a transcription. 2025-07-14 17:38:03 -03:00
Mark Backman
8fd5576879 Merge pull request #2198 from Allenmylath/patch-24
Update app.py
2025-07-14 06:37:42 -07:00
kompfner
1f85dcee7c Merge pull request #2171 from pipecat-ai/pk/aws-strands-demo
Minimal AWS Strands demo
2025-07-14 09:32:16 -04:00
Dev Khant
138890bc5c Add support in Mem0 Memory 2025-07-14 18:08:25 +05:30
Filipi da Silva Fuchter
a094efc9e6 Merge pull request #2196 from pipecat-ai/mb/lmnt-model
LmntTTSService: update the default model to blizzard
2025-07-14 09:15:17 -03:00
allenmylath
1f9e2fdecc Update app.py
misleading comment. no endpoints.py
2025-07-14 14:02:35 +05:30
Mark Backman
4a2b4660bc LmntTTSService: update the default model to blizzard 2025-07-13 10:54:43 -07:00
Mark Backman
b3ac90015a Merge pull request #2195 from Trinary-Projects/transformers_ver_patch
Update transformers dep. to >=4.48.0 for Ultravox
2025-07-11 23:31:47 -07:00
Jaideep
2fe06f0a4e Update pyproject.toml 2025-07-12 11:34:45 +05:30
Mark Backman
1836a7484e Merge pull request #2193 from pipecat-ai/mb/changelog-0.0.76
Prepare changelog for 0.0.76 release
2025-07-11 16:15:34 -07:00
Mark Backman
25a5c5aaab Prepare changelog for 0.0.76 release 2025-07-11 16:08:08 -07:00
mattie ruth backman
24694e2558 Changelog entry 2025-07-11 14:30:12 -07:00
mattie ruth backman
2325edd9ba Add a text entry box to the simple-chatbot example 2025-07-11 14:30:12 -07:00
mattie ruth backman
fad5713ade Fix append-to-context function call 2025-07-11 14:30:12 -07:00
Paul Kompfner
fe8573322f AWS Strands demos 2025-07-11 16:42:27 -04:00
Mark Backman
06c1255abe fix: use a different aggregation timeout for emulated user speech (#2185)
* fix: use a different aggregation timeout for emulated user speech

* Add SpeechControlParamsFrame

* Update test_context_aggregator tests
2025-07-11 16:33:44 -04:00
Mark Backman
f108a67635 Merge pull request #2189 from pipecat-ai/mb/numpy-version-bump
Update numpy, transformers to support newer versions
2025-07-11 12:02:02 -07:00
Mark Backman
bf580d061d Update numpy, transformers to support newer versions 2025-07-11 11:58:31 -07:00
Filipi da Silva Fuchter
b005bd7b98 Merge pull request #2184 from pipecat-ai/filipi/twilio_issue
Fixing an issue where Pipecat was not receiving the user's audio
2025-07-11 15:32:28 -03:00
Filipi Fuchter
75f8baab33 Mentioning the fixes in the changelog. 2025-07-11 11:56:16 -03:00
Matej Marinko
5c3fb73cef Rename example 2025-07-11 16:07:24 +02:00
Filipi Fuchter
5c3f4180b9 Refactored VAD analyzer to process multiple audio frames in a single iteration if needed. 2025-07-11 10:59:32 -03:00
Mark Backman
6cd6e7ceed Merge pull request #2186 from pipecat-ai/mb/fix-pre-commit-config
Update .pre-commit-config.yaml to use pyproject.toml linting rules
2025-07-11 06:34:01 -07:00
Filipi Fuchter
1a146c2a64 Not serializing a JSON in case we have no audio. 2025-07-11 10:15:09 -03:00
Filipi Fuchter
eaeb9e6efa Not creating InputAudioRawFrame in case we don't have bytes. Fixed for Pilvo, Exotel and Telnyx. 2025-07-11 09:51:38 -03:00
Matej Marinko
2e84c91748 Remove outdated parameter 2025-07-11 08:52:39 +02:00
Matej Marinko
650d45c1f4 Use single sample rate parameter 2025-07-11 08:27:06 +02:00
Filipi Fuchter
f4f65024ef Refactoring the test client to use the new version of the Pipecat Client SDK. 2025-07-10 21:57:25 -03:00
Filipi Fuchter
1200aa4fb8 Not creating InputAudioRawFrame in case we don't have bytes. 2025-07-10 21:56:34 -03:00
Filipi da Silva Fuchter
6762363685 Merge pull request #2183 from pipecat-ai/filipi/parallel_pipeline_issue
Fixed an issue in ParallelPipeline that caused errors when attempting to drain the queues.
2025-07-10 21:51:04 -03:00
Filipi Fuchter
b2ead325c4 Fixed an issue in ParallelPipeline that caused errors when attempting to drain the queues. 2025-07-10 21:50:35 -03:00
Mark Backman
4e24b915cc Update .pre-commit-config.yaml to use pyproject.toml linting rules 2025-07-10 16:10:27 -07:00
kompfner
b610ee26ba Merge pull request #2181 from pipecat-ai/pk/fix-aws-nova-sonic-pipeline-freeze
Fix a pipeline freeze when using AWS Nova Sonic. The freeze occurs if…
2025-07-10 16:30:55 -04:00
Paul Kompfner
2b867f1613 Fix a pipeline freeze when using AWS Nova Sonic. The freeze occurs if the user starts speaking before we've finished sending the "trigger " audio (AWS Nova Sonic can only start speaking in response to a user utterance, so we have a simulated user utterance to "trigger" the bot speaking without the user having actually spoken first). 2025-07-10 15:57:05 -04:00
Mark Backman
7b8fe565c7 Merge pull request #2182 from pipecat-ai/mb/run-example-usage
run.py: Add example usage to the module docstring
2025-07-10 12:48:29 -07:00
Mark Backman
a246862910 run.py: Add example usage to the module docstring 2025-07-10 11:41:49 -07:00
Filipi da Silva Fuchter
106809f3fd Merge pull request #2166 from carolin-tavus/remove-persona-microphone-check
feat: Remove persona microphone check
2025-07-10 15:28:35 -03:00
carolin-tavus
f0d8499f7e feat: avoid checking microphone enabled 2025-07-10 09:40:27 +00:00
Mark Backman
332ca3d55e Merge pull request #2177 from pipecat-ai/mb/fix-ruff-improvements
Make fix-ruff.sh more flexible, use pyproject rules
2025-07-09 12:33:05 -07:00
Mark Backman
a48f5d5796 Make fix-ruff.sh more flexible, use pyproject rules 2025-07-09 11:48:17 -07:00
Mark Backman
f04f047428 Merge pull request #2176 from pipecat-ai/mb/pre-commit-config
Add docstring checking to .pre-commit-config.yaml
2025-07-09 11:47:25 -07:00
Mark Backman
4e61fd33ea Add docstring checking to .pre-commit-config.yaml 2025-07-09 11:18:40 -07:00
Matej Marinko
61ac77be72 Update docs 2025-07-09 11:59:45 +02:00
Matej Marinko
c093eb5b63 Move config to main file 2025-07-09 10:20:37 +02:00
Matej Marinko
98e24131bd Send raw result 2025-07-09 09:59:04 +02:00
Matej Marinko
7becce9e8c Add transcript tracing 2025-07-09 09:37:58 +02:00
Matej Marinko
3cdaeb719a Update examples to new format 2025-07-09 09:28:43 +02:00
Matej Marinko
8daaea5969 Minor code cleanup 2025-07-09 09:03:02 +02:00
matejmarinko-soniox
dc47516e14 Update src/pipecat/services/soniox/config.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-09 08:04:59 +02:00
Mark Backman
0fcc4f822f Merge pull request #2173 from captaincaius/fix-nextjs-webhook-example-null-check
fix nextjs webhook example num_endpoints null check
2025-07-08 14:10:16 -07:00
Captain Caius
c0ed061ff5 fix nextjs webhook example num_endpoints null check 2025-07-08 13:40:26 -07:00
Mark Backman
d98b6b418d Release prep for 0.0.75 (#2169)
* Update CHANGELOG for 0.0.75 release

* Add RTVI changes to CHANGELOG

* Update CHANGELOG, add deprecation directives to rtvi.py

---------

Co-authored-by: mattie ruth backman <mattieruth@gmail.com>
2025-07-08 15:39:44 -04:00
Mark Backman
deea29b5e8 Merge pull request #2170 from pipecat-ai/mb/update-packages-0.0.75
Package updates to run the release evals
2025-07-08 12:02:22 -07:00
Mark Backman
0bdbc83ed9 Package updates to run the release evals 2025-07-08 11:39:49 -07:00
Mark Backman
6c591f0990 Merge pull request #2167 from pipecat-ai/mb/fix-riva-watchdog
RivaSTTService: reset the watchdog in an async function
2025-07-08 11:29:44 -07:00
Mark Backman
b55b9c257b RivaSTTService: remove reset_watchdog, which is handled in the WatchdogQueue already 2025-07-08 11:19:47 -07:00
Mark Backman
5156c21d14 Merge pull request #2168 from pipecat-ai/mb/fix-neuophonic-tts
Fix: NeuphonicTTSService to use latest websocket API
2025-07-08 11:17:58 -07:00
Mark Backman
a9d824753b Fix: NeuphonicTTSService to use latest websocket API 2025-07-08 11:08:08 -07:00
Filipi da Silva Fuchter
3c6a208101 Merge pull request #2148 from pipecat-ai/filipi/aws_bedrock
Refactoring AWSBedrockLLMService to work async
2025-07-08 12:14:28 -03:00
Mark Backman
b1032a1ca4 Merge pull request #2151 from pipecat-ai/mb/ollama-kwargs
OLLamaLLMService: Pass kwargs
2025-07-08 07:21:09 -07:00
Mark Backman
931f34fccd OLLamaLLMService: Pass kwargs 2025-07-08 07:17:18 -07:00
Mark Backman
f2509adec1 Merge pull request #2162 from pipecat-ai/mb/tts-service-aggregate-sentences
TTS services: Add aggregate_sentences arg
2025-07-08 07:14:58 -07:00
Filipi Fuchter
285b82eb65 Mentioning the AWSBedrockLLMService and AWSPollyTTSService refactors in the changelog. 2025-07-08 07:30:30 -03:00
Filipi Fuchter
74da197304 Refactored AWSBedrockLLMService and AWSPollyTTSService to work asynchronously using aioboto3 instead of the boto3 library. 2025-07-08 07:28:23 -03:00
Matej Marinko
0f727248d2 Merge branch 'main' of github.com:pipecat-ai/pipecat 2025-07-08 08:20:10 +02:00
mattie ruth backman
a6de16f92f Bump all client dependencies to use client-js/react/transports 1.0.0 2025-07-07 15:56:08 -07:00
mattie ruth backman
fc09854d7f fix cam light always on 2025-07-07 15:56:08 -07:00
mattie ruth backman
2959029151 PR Review fixes 2025-07-07 15:56:08 -07:00
mattie ruth backman
e590441b7b Add support for about info in ready messages and add deprecation comments to deprecated types 2025-07-07 15:56:08 -07:00
mattie ruth backman
dc41ec7cb1 Updated all examples with clients to use the new PipecatClient 2025-07-07 15:56:08 -07:00
mattie ruth backman
43049c865c Add support for new RTVI client message protocol: handling and responding 2025-07-07 15:56:08 -07:00
mattie ruth backman
c4a9fc7f88 video-transport typescript formatting 2025-07-07 15:56:08 -07:00
mattie ruth backman
faf4026cf4 Add device controls to the simple chatbot example 2025-07-07 15:56:08 -07:00
Mark Backman
f53f45a6cd TTS services: Add aggregate_sentences arg 2025-07-07 15:38:31 -07:00
Mark Backman
e04e876f44 Merge pull request #2156 from shahrukhx01/add-additional-whisper-models
WhisperSTTService: Add additional whisper model variants
2025-07-07 12:53:06 -07:00
shahrukhx01
a84e7e30da WhisperSTTService: Add additional whisper model variants 2025-07-07 21:43:48 +02:00
Mark Backman
6eed6ff779 Merge pull request #2147 from pipecat-ai/mb/user-idle-long-function-call
UserIdleProcessor: Account for function calls in progress
2025-07-04 14:11:16 -07:00
Mark Backman
1375211610 UserIdleProcessor: Account for function calls in progress 2025-07-04 14:05:05 -07:00
Mark Backman
4e9369a702 Merge pull request #2149 from pipecat-ai/mb/twilio-hang-up-handling 2025-07-04 12:44:17 -07:00
Mark Backman
f9e8748a96 TwilioFrameSerializer: Handle user hanging up before the serializer 2025-07-04 09:42:16 -07:00
Filipi da Silva Fuchter
20d6bf267a Merge pull request #2146 from pipecat-ai/remove_gemini_duplicated_code
Removing duplicated code inside Gemini.
2025-07-04 11:59:10 -03:00
Filipi Fuchter
b573f9dab2 Removing duplicated code inside Gemini. 2025-07-04 10:57:53 -03:00
Pete
7ed4fe50d4 Update gemini.py
-FunctionCallFromLLM
-Delete duplicate Gemini imports
2025-07-03 19:39:44 -04:00
Pete
6f66ec1727 Update gemini.py
tab indentation fix
2025-07-03 18:55:21 -04:00
Pete
c7e758fc36 Merge branch 'main' into groundingMetadata 2025-07-03 18:47:47 -04:00
Pete
14c22234bb Fix parameter name consistency in parse_server_event function
- Change function body to use 'str' parameter consistently
- Matches pattern used in OpenAI Realtime Beta service
- Fixes bug where parameter was named 'str' but body used 'message_str'
- Maintains consistency with existing codebase patterns
2025-07-03 18:02:24 -04:00
Pete
d565e9ae53 Update grounding metadata example with final refinements
- Reorganize imports and transport_params structure
- Remove copyright header for consistency
- Enhance grounding metadata logging with better formatting
- Remove unnecessary PipelineParams configuration
- Update message content formatting

Completes incorporation of draft PR #2121 changes
2025-07-03 17:53:55 -04:00
Pete
4951c97eab Clean up verbose logging in grounding metadata implementation
- Remove debug logging from grounding metadata event handlers
- Simplify logging in _process_grounding_metadata method
- Clean up example file logging for better readability
- Remove verbose event parsing comments

Based on suggestions from draft PR #2121
2025-07-03 17:49:27 -04:00
Pete
9b38f3e2fa Delete examples/foundational/26f-gemini-multimodal-live-files-api.py 2025-07-03 17:15:18 -04:00
Mark Backman
dbc76389d8 Merge pull request #2140 from pipecat-ai/mb/fix-26-imports
Fix: missing import in 26f foundational example
2025-07-03 14:12:54 -07:00
Aleix Conchillo Flaqué
c27f838444 Merge pull request #2124 from pipecat-ai/aleix/frame-processor-no-push-queue
FrameProcessor: remove unnecessary push task
2025-07-03 14:03:05 -07:00
Aleix Conchillo Flaqué
ce84485e26 Merge pull request #2142 from pipecat-ai/aleix/publish-workflow-message
github: update publish message to make it clear
2025-07-03 14:02:51 -07:00
Mark Backman
6cf254e2f9 Fix: missing import in 26f foundational example, update twilio transport_params to FastAPIWebsocketParams 2025-07-03 13:58:18 -07:00
Aleix Conchillo Flaqué
02b63c28a5 FrameProcessor: remove unnecessary push task
When we call `FrameProcessor.push_frame()` we end up calling
`FrameProcessor.queue_frame()` on the next or previous processor which already
uses the input queue and guarantees frame ordering. So, there's no need to have
a two queues next to each other.
2025-07-03 13:57:32 -07:00
Aleix Conchillo Flaqué
57c6ce7ffa github: update publish message to make it clear 2025-07-03 13:55:02 -07:00
Aleix Conchillo Flaqué
2f3272ea2f Merge pull request #2135 from pipecat-ai/aleix/pipecat-0.0.74
update CHANGELOG for 0.0.74
2025-07-03 13:46:00 -07:00
Aleix Conchillo Flaqué
f5c2d57e4b update CHANGELOG for 0.0.74 2025-07-03 13:44:21 -07:00
Aleix Conchillo Flaqué
baa878272d scripts(evals): added 07a-interruptible-speechmatics.py 2025-07-03 13:44:21 -07:00
Aleix Conchillo Flaqué
093285868e scripts(evals): update timeout back to 90 seconds 2025-07-03 13:37:17 -07:00
Filipi da Silva Fuchter
6c9d058ec2 Merge pull request #2139 from pipecat-ai/filipi/changelog_improvements
Mentioning the SpeechmaticsSTTService in the changelog.
2025-07-03 17:36:55 -03:00
Filipi Fuchter
5df7be6892 Mentioning the SpeechmaticsSTTService in the changelog. 2025-07-03 17:35:30 -03:00
Mark Backman
2deca816ae Merge pull request #2137 from pipecat-ai/mb/fish-audio-normalize
FishAudioTTSService: arg cleanup, add new InputParam and arg
2025-07-03 13:29:14 -07:00
Mark Backman
b8d2fceced Merge pull request #2138 from pipecat-ai/mb/fix-google-llm-import-order
GoogleLLMService: Linting fixes
2025-07-03 13:26:32 -07:00
Sam Sykes
7596d71460 Speechmatics STT + multi-speaker conversations (#2036)
* initial config

* skeleton

* Added a README (to be added to).

* Payloads coming from the ASR.

* doc update

* handle the partials and finals

* enable diarization in the example

* support sending messages to pipecat pipeline

* requirements fix in README

* updated example (with amusement)

* updated example to match master

* updated docs

* support for diarization tags

* logic fix for wrapper

* Use an internal SpeechFrame for speaker_id (not user_id).

* only include speaker tags on finalised transcript (as this may skew end of utterance detection)

* updated docs

* correction to docs and updated example

* updated requirement

* Fix for using default EU server.

* Updates from PR comments.

* Refactor based on comments in the original PR.

Primary focus on documentation, naming conventions and how `user_id` is used.

* Check for SMX installed when importing.

* Variable name change

* Comment correction.

* Support for Esporanto and Uyghur

* Impoved language support

* function name change

* Locale fix

* intercept

* interim changes

* pass the pipeline task to the module for adding events to the top of the pipeline

* logging for the pipeline

* Reduce timeout for content aggregator.

* staged update

* testing with Azure

* Updated context (Azure was dropping punctuation) and using better ElevenLabs model.

* Updated to RT 0.3.0 and use OpenAI (not Azure).

* Missing OpenAI import; parameter name change for output locale validation.

* Revert to `0.2.0` of RT SDK.

* fix for assignment of `output_locale_code`.

* update Speechmatics library to 0.3.1

* new transcription example

* updated asyncio task handling

* Updated doc strings

* enable OpenTelemetry logging

* removed import from stt for __init__

* updated examples and default values

* updated examples

* prevent lock up when closing the STT connection
2025-07-03 17:25:13 -03:00
Mark Backman
096067b097 GoogleLLMService: Linting fixes 2025-07-03 13:23:13 -07:00
Mark Backman
ec09505f6b FishAudioTTSService: Add normalize as InputParam, model_id as arg 2025-07-03 13:14:15 -07:00
Mark Backman
251ea756c8 FishTTSService: deprecate model, add reference_id 2025-07-03 12:56:24 -07:00
Aleix Conchillo Flaqué
8f6544efe2 Merge pull request #2133 from pipecat-ai/vp-changelog-fileapi
docs: add changelog line for gemini files api
2025-07-03 11:13:02 -07:00
otaqwawi
6045a8ad8c Add option to change the base URL for Google Generative AI. (#2113)
* Add option to change the base URL for Google Generative AI.
This would be useful to support private instance or gateway of the API

* fix: add proper type hints for http_options in Google LLM service
2025-07-03 11:12:35 -07:00
Aleix Conchillo Flaqué
b184d62634 Merge pull request #2134 from pipecat-ai/aleix/evals-cancel-expired-tasks
cancel expire evals tasks
2025-07-03 10:07:27 -07:00
Aleix Conchillo Flaqué
1a8d512abb scripts(evals): make sure we cancel pending tasks after timeout 2025-07-03 10:01:42 -07:00
vipyne
a62be8ea32 docs: add changelog line for gemini files api 2025-07-03 11:44:34 -05:00
Mark Backman
c230d94ff0 Merge pull request #2125 from pipecat-ai/mb/deprecate-handle-function-call-start
Add docs deprecation for handle_function_call_start
2025-07-03 12:27:17 -04:00
Aleix Conchillo Flaqué
e7b02773f5 Merge pull request #2131 from pipecat-ai/aleix/dtmf-aggregator-dangling-tasks
DtmfAggregator: cancel interruption task to avoid a dangling task
2025-07-03 08:34:50 -07:00
Aleix Conchillo Flaqué
ed83248a6b Merge pull request #2130 from pipecat-ai/aleix/pipeline-task-cancel-queue
PipelineTask: cancel idle queue before cancelling task
2025-07-03 08:32:31 -07:00
Aleix Conchillo Flaqué
af8b4901d4 DtmfAggregator: cancel interruption task to avoid a dangling task 2025-07-03 08:18:48 -07:00
Aleix Conchillo Flaqué
64c8230960 PipelineTask: cancel idle queue before cancelling task 2025-07-03 08:18:21 -07:00
Aleix Conchillo Flaqué
bf664534cc PipelineTask: cancel idle queue before cancelling task 2025-07-03 08:15:31 -07:00
Filipi da Silva Fuchter
274a04e535 Merge pull request #2129 from carolin-tavus/carolin-tavus/add-persona-validation
Add persona validation (check that microphone is enabled)
2025-07-03 11:49:42 -03:00
carolin-tavus
cb81f3d50e format 2025-07-03 14:38:20 +00:00
carolin-tavus
30a3b24287 Add persona validation (check that microphone is enabled) 2025-07-03 14:04:04 +00:00
Filipi da Silva Fuchter
8aacf71956 Merge pull request #1623 from phamtrung0633/victor/azure-tts-interruption-fix
Azure TTS fixed by clearing the audio queue before synthesizing the next text
2025-07-03 10:51:54 -03:00
Victor
72d503d3a3 Azure TTS fixed by clearing the audio queue before synthesizing the next text 2025-07-03 10:48:26 -03:00
Aleix Conchillo Flaqué
453a904290 Merge pull request #2123 from pipecat-ai/aleix/dev-requirements-25-07-02
update dev-requirements (dependabot)
2025-07-02 23:00:40 -07:00
Mark Backman
368bff4fb4 Merge pull request #2101 from pipecat-ai/mb/fix-websocket-example-dir
fix: remove javascript directory from the websocket README
2025-07-02 22:55:47 -04:00
Mark Backman
4ae045d704 Add docs deprecation for handle_function_call_start 2025-07-02 19:53:48 -07:00
Mark Backman
8c71939425 Merge pull request #2122 from pipecat-ai/mb/deprecation-docstrings
Add deprecation directives, add indexing, only autodoc members
2025-07-02 21:31:02 -04:00
Aleix Conchillo Flaqué
a437c2d365 update examples (dependabot) 2025-07-02 16:33:24 -07:00
Aleix Conchillo Flaqué
a1784e3237 update dev-requirements (dependabot) 2025-07-02 16:09:13 -07:00
Mark Backman
abee0f853c Add deprecation directives, add indexing, only autodoc members 2025-07-02 15:44:02 -07:00
Aleix Conchillo Flaqué
e9d358ed17 Merge pull request #2119 from pipecat-ai/aleix/llm-messages-append-update-run-llm
add run_llm to LLMMessagesAppendFrame and LLMMessagesUpdateFrame
2025-07-02 13:53:36 -07:00
Aleix Conchillo Flaqué
c5d54d06bb add run_llm to LLMMessagesAppendFrame and LLMMessagesUpdateFrame 2025-07-02 13:53:13 -07:00
Filipi da Silva Fuchter
c16eed7ca2 Merge pull request #2091 from pipecat-ai/filipi/sample_rate
Creating a new stream resampler which avoids clicks.
2025-07-02 16:22:46 -03:00
Filipi Fuchter
76388a10b5 Deprecating the create_default_resampler and adding the changelog. 2025-07-02 16:20:58 -03:00
Filipi Fuchter
38bcc033a2 Improving the docs about when to use: SOXRAudioResampler x SOXRStreamAudioResampler 2025-07-02 16:20:48 -03:00
Filipi Fuchter
5af563cd91 Configured the services to use create_stream_resampler instead of create_default_resampler 2025-07-02 16:20:34 -03:00
Filipi Fuchter
3de271161c Fixing the ruff script to also try to fix docstrings. 2025-07-02 16:19:57 -03:00
Filipi Fuchter
c19f9bc43a Creating a new stream resampler which avoids clicks. 2025-07-02 16:19:47 -03:00
Mark Backman
ef85d245ed Merge pull request #2120 from haayhappen/patch-1
Update README.md
2025-07-02 15:18:28 -04:00
Fynn Merlevede
25749bd4c0 Update README.md
fix: use correct protocol in READme
2025-07-02 20:57:38 +02:00
Mark Backman
e19c5464fe Merge pull request #2114 from pipecat-ai/mb/bump-google-genai-version
Upgrade google-genai version to 1.24.0
2025-07-02 14:25:29 -04:00
Mark Backman
5c2ea3b804 Upgrade google-genai version to 1.24.0 2025-07-02 11:18:37 -07:00
Aleix Conchillo Flaqué
c27348d470 Merge pull request #2118 from pipecat-ai/aleix/daily-python-0.19.4
pyproject: update daily-python to 0.19.4
2025-07-02 10:38:54 -07:00
Aleix Conchillo Flaqué
de5f9c9217 pyproject: update daily-python to 0.19.4 2025-07-02 09:51:36 -07:00
Aleix Conchillo Flaqué
f9086ee3a2 Merge pull request #2110 from pipecat-ai/aleix/daily-add-virtual-speaker-support
DailyTransport: allow receiving audio in a single track
2025-07-02 09:50:02 -07:00
Vanessa Pyne
43298a9026 Merge pull request #2077 from yousifa/mcp-http-gemini-support
Mcp http gemini support
2025-07-02 11:47:25 -05:00
Vanessa Pyne
d80e228c6f Merge branch 'main' into mcp-http-gemini-support 2025-07-02 11:47:18 -05:00
Mark Backman
2902362886 Merge pull request #2115 from pipecat-ai/mb/docstring-cleanup
Docstring cleanup, fix missing examples imports
2025-07-02 11:35:11 -04:00
Mark Backman
1cd303ad7f Merge pull request #2090 from pipecat-ai/mb/silero-np-error
Remove redundant import and global in SileroOnnxModel
2025-07-02 11:28:11 -04:00
Mark Backman
f590a476e7 Gemini Live fixes, plus additional docstrings 2025-07-02 08:27:23 -07:00
Mark Backman
e71cb3ba68 Docstring cleanup, fix missing examples imports 2025-07-02 08:27:23 -07:00
Filipi da Silva Fuchter
510a9af2e5 Merge pull request #2116 from pipecat-ai/filipi/fix_ios_chatbot_demo
Fixed an issue to disconnect the iOS chatbot demo.
2025-07-02 12:13:51 -03:00
Filipi Fuchter
5328f84df4 Fixed an issue to disconnect the iOS chatbot demo. 2025-07-02 12:06:15 -03:00
Yousif Astarabadi
18817fd81b added docstring in public GeminiFileAPI module 2025-07-02 00:09:48 -07:00
Yousif Astarabadi
4bcc536fd2 added arg description in docstring for gemini live init 2025-07-02 00:03:27 -07:00
Yousif Astarabadi
1ab2ddd317 fix lint error 2025-07-01 23:55:34 -07:00
Yousif
09aa168840 Merge branch 'pipecat-ai:main' into mcp-http-gemini-support 2025-07-01 23:54:42 -07:00
Vanessa Pyne
05753fb207 Merge pull request #1786 from getchannel/main
Add File API to GeminiMultimodalLive
2025-07-01 20:29:12 -05:00
Pete
715e3f8543 Merge branch 'pipecat-ai:main' into main 2025-07-01 20:42:28 -04:00
Pete
9c9d4b35a4 remove audio_transcriber from gemini.py
unecessary import removed.
2025-07-01 20:36:54 -04:00
getchannel
2ee935f784 Update gemini.py 2025-07-01 20:31:58 -04:00
Aleix Conchillo Flaqué
58aedc88a4 DailyTransport: allow receiving audio in a single track 2025-07-01 17:29:10 -07:00
getchannel
0e60385871 add FileAPI to gemini.py 2025-07-01 20:14:31 -04:00
Mark Backman
a4188f7986 Merge pull request #2103 from pipecat-ai/mb/add-user-id-to-transcript
Add user_id to transcription frames
2025-07-01 18:28:12 -04:00
vipyne
c7cbfe7a4f remove grounding metadata commits 2025-07-01 17:21:38 -05:00
vipyne
f1c9f5040b Update examples/foundational/26f-gemini-multimodal-live-files-api.py 2025-07-01 16:27:25 -05:00
vipyne
79e51051c7 New lint rules and remove unused example file 2025-07-01 16:27:25 -05:00
Pete
a63d0da528 Update gemini.py 2025-07-01 16:27:25 -05:00
getchannel
4fd8df208f Add groundingMetadata events.py 2025-07-01 16:27:25 -05:00
getchannel
44d3bd30fa Add groundingMetadata and logging gemini.py 2025-07-01 16:27:25 -05:00
getchannel
6e6e932370 Create 26g-gemini-multimodal-live-groundingMetadata.py 2025-07-01 16:27:25 -05:00
getchannel
baccf50417 update correct upload endpoint file_api.py 2025-07-01 16:27:25 -05:00
getchannel
7b1071b30d Create 26f-gemini-multimodal-live-files-api.py
This is an example to test usage of the Files API integration. Specifically with the Gemini Multimodal Live Service.
2025-07-01 16:27:25 -05:00
getchannel
bd7ca94196 Update gemini.py 2025-07-01 16:27:25 -05:00
getchannel
1ec1aa76e9 Rename file_api to file_api.py
added proper .py to file name.
2025-07-01 16:27:25 -05:00
getchannel
77c369c3c7 add file_api __init__.py 2025-07-01 16:27:25 -05:00
getchannel
9171d4b040 add FileData class events.py 2025-07-01 16:27:25 -05:00
getchannel
e02b95fca5 Create file_api 2025-07-01 16:27:25 -05:00
getchannel
d45a07b5e5 add FileAPI to gemini.py 2025-07-01 16:27:25 -05:00
Mark Backman
0cdcfcee8d Remove redundant import and global in SileroOnnxModel 2025-07-01 13:29:47 -07:00
Mark Backman
324546b4e7 Merge pull request #2098 from StrongMind/aws-session-token
Add support for session token in AWS Nova Sonic service
2025-07-01 16:25:38 -04:00
Filipi da Silva Fuchter
c8ee67a636 Merge pull request #2085 from pipecat-ai/filipi/freeze-test-python-3.10
Fixing pipeline freeze when using Python 3.10
2025-07-01 17:17:38 -03:00
Filipi Fuchter
b87c57c951 Adding missing docstring to the watchdog event 2025-07-01 17:12:18 -03:00
Filipi Fuchter
721f662bbe Making cancel sentinel classes private 2025-07-01 17:09:05 -03:00
Filipi Fuchter
fccd48bfff Fixing pipeline freeze when using Python 3.10 2025-07-01 17:05:18 -03:00
Filipi Fuchter
5310d903ec Adding the requirements and needed variables for the freeze-test example. 2025-07-01 17:04:27 -03:00
Mark Backman
8cbce555e4 Add user_id to stt_traced decorator 2025-07-01 13:01:48 -07:00
Mark Backman
f6112713e8 Add user_id to TranscriptionFrame and InterimTranscriptionFrame pushed by STTServices 2025-07-01 12:59:20 -07:00
Mark Backman
cc637f4dea Clean up docstrings after DirectFunction merge (#2105)
* Add missing import for FunctionCallParams

* Update docstrings in direct_function

* Docstring fixes for run.py

* Remove unused imports in llm_service

* Add missing docstrings to llm_service

* Remove FunctionCallParams import

* Wording improvements

* Type checking for FunctionCallParams
2025-07-01 15:22:30 -04:00
kompfner
7f76a14c54 Merge pull request #2104 from pipecat-ai/pk/changelog-fix
Whoops—fix mistake in CHANGELOG (`FlowsFunctionSchema` -> `FunctionSc…
2025-07-01 15:06:14 -04:00
Yousif Astarabadi
58675f4d5a renamed clean schema to alternate schema 2025-07-01 11:50:12 -07:00
Paul Kompfner
d50e6db312 Whoops—fix mistake in CHANGELOG (FlowsFunctionSchema -> FunctionSchema) 2025-07-01 14:24:27 -04:00
kompfner
de74284a8e Merge pull request #2051 from pipecat-ai/pk/direct-functions
Implement "direct functions", which allow you to bypass specifying a …
2025-07-01 14:19:33 -04:00
Aleix Conchillo Flaqué
4c9a295b28 Merge pull request #2095 from pipecat-ai/aleix/examples-smallwebrtc-sdp-munging
examples: add --esp32 for SDP munging if host name specified
2025-07-01 09:07:42 -07:00
Mark Backman
0968f36d3e fix: remove javascript directory from the websocket README 2025-07-01 09:51:02 -04:00
Mark Backman
fd570b0377 Update the remaining docstrings, update pre-commit hook, add docstring formatting CI, update CONTRIBUTING with formatting guidance (#2089) 2025-07-01 00:37:04 -04:00
Paul Shippy
68ea5ee570 Add to change log 2025-06-30 17:39:42 -07:00
Paul Shippy
f891140a74 Update sample to take in session token 2025-06-30 17:35:50 -07:00
Paul Shippy
5ed2d7ac2b Add session token option for AWS 2025-06-30 17:31:31 -07:00
Pete
a297e4208e Merge branch 'main' into groundingMetadata 2025-06-30 19:48:55 -04:00
Aleix Conchillo Flaqué
b713527da0 examples: add --esp32 for SDP munging if host name specified 2025-06-30 13:27:52 -07:00
Kwindla Hultman Kramer
224d2cedc8 Merge pull request #2088 from pipecat-ai/khk/gemini-thinking-default
Turn off thinking for Gemini models by default
2025-06-30 10:32:54 -07:00
Kwindla Hultman Kramer
55cfea776f Merge branch 'main' into khk/gemini-thinking-default 2025-06-30 10:32:42 -07:00
Paul Kompfner
d7a2078e0b Added CHANGELOG entry describing "direct" functions 2025-06-30 10:59:36 -04:00
Paul Kompfner
a3e540eb32 Rename examples/foundational/14s-function-calling-direct.py to examples/foundational/14t-function-calling-direct.py, since a new "14s" example was added 2025-06-30 10:44:55 -04:00
Paul Kompfner
e01c20be84 Remove unused import and tweak a comment 2025-06-30 10:36:47 -04:00
Paul Kompfner
ce3ca418c2 Unit tests for "direct" functions 2025-06-30 10:36:47 -04:00
Paul Kompfner
15b9a5faf6 Implement "direct functions", which allow you to bypass specifying a function configuration (as a FunctionSchema or in a provider-specific format) and use the Python function directly. Metadata is gathered automatically from the function signature and docstring. 2025-06-30 10:36:42 -04:00
Kwindla Hultman Kramer
3afa30894f Turn off thinking for Gemini models by default 2025-06-28 12:23:35 -07:00
Mark Backman
0ecfa827e6 Improve docstrings for services and processors (#2087) 2025-06-28 13:39:45 -04:00
Aleix Conchillo Flaqué
e1b0db75eb Merge pull request #2086 from pipecat-ai/aleix/watchdog-coroutine-helper
add watchdog coroutine helper
2025-06-27 11:10:10 -07:00
Aleix Conchillo Flaqué
b0c773189f AWSNovaSonicLLMService: fix error with watchdog_coroutine() 2025-06-27 11:09:40 -07:00
Aleix Conchillo Flaqué
3064326834 utils.asyncio: added watchdog_coroutine() 2025-06-27 11:09:40 -07:00
Mark Backman
c67e50fe34 Merge pull request #2084 from pipecat-ai/mb/update-evals-nova-sonic
Add 40-aws-nova-sonic to release evals list
2025-06-27 09:47:59 -04:00
Mark Backman
9d45e3eca1 Merge pull request #2079 from pipecat-ai/mb/fix-42-incorrect-import
fix: example 42 incorrect import
2025-06-27 09:47:47 -04:00
Mark Backman
43a24d15f6 Add 40-aws-nova-sonic to release evals list 2025-06-27 08:34:39 -04:00
Yousif Astarabadi
cafbda1668 remove openai from mcp run http example 2025-06-26 20:21:07 -07:00
Yousif Astarabadi
86c26fd64c moved needs_mcp_clean_schema to LLMService 2025-06-26 20:09:12 -07:00
Yousif Astarabadi
0c20668008 fixed linter errors 2025-06-26 20:08:26 -07:00
Yousif Astarabadi
92df8dc43c fix formatting 2025-06-26 20:08:23 -07:00
Yousif Astarabadi
9d5f5844b8 clean mcp schema for gemini models, update http mcp example to use gemini 2025-06-26 20:07:54 -07:00
Mark Backman
2cf31884d0 fix: example 42 incorrect import 2025-06-26 21:52:14 -04:00
Aleix Conchillo Flaqué
19354c6f2d Merge pull request #2078 from pipecat-ai/aleix/hotfix-0.0.73
just a quick hotfix for 0.0.73
2025-06-26 17:31:40 -07:00
Aleix Conchillo Flaqué
0b2079ad41 update CHANGELOG for 0.0.73 2025-06-26 17:02:12 -07:00
Aleix Conchillo Flaqué
5f18c3af70 OpenAIRealtimeLLMContext: fix circular dependency 2025-06-26 17:01:45 -07:00
Aleix Conchillo Flaqué
0a40285d43 update FrameProcessor.watchdog_timers_enabled references 2025-06-26 16:26:12 -07:00
Vanessa Pyne
5b1c328541 Merge pull request #2075 from pipecat-ai/vp-mcp-lint
mcp_service: lint
2025-06-26 15:25:39 -05:00
vipyne
37929533af mcp_service: lint 2025-06-26 15:00:20 -05:00
Vanessa Pyne
3b92113680 Merge pull request #2030 from yousifa/mcp-streaming-http
MCPClient streamable_http transport support
2025-06-26 14:57:31 -05:00
Yousif
46b52cb9bb Merge branch 'main' into mcp-streaming-http 2025-06-26 12:30:43 -07:00
Mark Backman
f0bcc9d9ba Add MCPClient docstrings. Removed google specific cleanup, changed example to openai 2025-06-26 12:29:45 -07:00
Yousif Astarabadi
1cac028bfe example using http transport for mcp client 2025-06-26 12:16:35 -07:00
Yousif Astarabadi
4956886819 updated error message with StreamableHttpParameters 2025-06-26 12:16:28 -07:00
Yousif Astarabadi
c720cfc7c7 updated streamablehttp to use StreamableHttpParameters type 2025-06-26 12:16:26 -07:00
Yousif Astarabadi
8fcef5628f added streamablehttp support, bumped mcp version, added additional headers and streamable_http params to MCPClient 2025-06-26 12:16:19 -07:00
Aleix Conchillo Flaqué
c4a72802f0 Merge pull request #2074 from pipecat-ai/aleix/pipecat-0.0.72
update CHANGELOG for 0.0.72
2025-06-26 12:10:14 -07:00
Aleix Conchillo Flaqué
917394803c update CHANGELOG for 0.0.72 2025-06-26 11:42:52 -07:00
Mark Backman
01040ddcdd Merge pull request #2071 from pipecat-ai/mb/services-docstrings-update
Add/update docstrings to LLM services
2025-06-26 14:42:32 -04:00
Aleix Conchillo Flaqué
7947497f7e Merge pull request #2073 from a6kme/patch-1
Start HeartBeat when all processors have processed StartFrame
2025-06-26 11:34:46 -07:00
Aleix Conchillo Flaqué
539ca5856f Merge pull request #2072 from pipecat-ai/aleix/utils-watchdog-cleanup
utils(asyncio): simplify watchdog helpers
2025-06-26 11:29:21 -07:00
Abhishek
89c801f82c Start HeartBeat when all processors have processed StartFrame
Some of the processors like STTService and TTSService don't push StartFrame ahead in the pipeline, unless they have connected with their service providers. This delays StartFrame in downstream processors. 

If we receive HeartBeat frame before StartFrame, we will get AttributeError `'Processor' object has no attribute '_FrameProcessor__input_queue'`. 

Idea is to start HeartBeats after StartFrame has been processed by all the Processors in the pipeline.
2025-06-26 23:28:37 +05:30
Aleix Conchillo Flaqué
3de4f22d34 utils(asyncio): simplify watchdog helpers 2025-06-26 09:40:42 -07:00
Mark Backman
0e4d2be98c Update AzureRealtimeBetaLLMService docstrings 2025-06-26 12:12:00 -04:00
Mark Backman
d8ce108ccd Update OpenAIRealtimeBetaLLMService docstrings 2025-06-26 12:06:47 -04:00
Mark Backman
d123cd4b2b Update GeminiMultimodalLiveLLMService docstrings 2025-06-26 11:47:30 -04:00
Aleix Conchillo Flaqué
4d34aa7cd6 Merge pull request #2069 from pipecat-ai/aleix/utils-asyncio-package
move things to new utils.asyncio package
2025-06-26 08:26:47 -07:00
Aleix Conchillo Flaqué
b860e94582 move things to new utils.asyncio package 2025-06-26 08:24:25 -07:00
Aleix Conchillo Flaqué
9d653e3788 Merge pull request #2068 from pipecat-ai/aleix/task-manager-dont-warn-reset-watchdog
TaskManager: don't warn on reset_watchdog()
2025-06-26 08:23:51 -07:00
Mark Backman
9e518cf2ba Update AWSNovaSonicLLMService docstrings 2025-06-26 11:21:18 -04:00
Mark Backman
2856372ad6 Update TogetherLLMService docstrings 2025-06-26 11:01:35 -04:00
Mark Backman
efbf574613 Update SambaNovaLLMService docstrings 2025-06-26 11:00:40 -04:00
Mark Backman
c018eb2f0e Update QwenLLMService docstrings 2025-06-26 10:57:42 -04:00
Mark Backman
d7bfe54b7c Update PerplexityLLMService docstrings 2025-06-26 10:56:48 -04:00
Mark Backman
137282b7a9 Update OpenRouterLLMService docstrings 2025-06-26 10:53:42 -04:00
Mark Backman
769f8c8f34 Update OpenPipeLLMService docstrings 2025-06-26 10:53:05 -04:00
Mark Backman
8b8a37ae7c Update OLLamaLLMService docstrings 2025-06-26 10:48:19 -04:00
Mark Backman
56e2b006f5 Update NimLLMService docstrings 2025-06-26 10:47:26 -04:00
Mark Backman
79cca05e43 Update GroqLLMService docstrings 2025-06-26 10:46:07 -04:00
Mark Backman
166c8e8e82 Update GrokLLMService docstrings 2025-06-26 10:39:46 -04:00
Mark Backman
9b64d2c325 Update GoogleLLMService docstrings 2025-06-26 10:37:22 -04:00
Mark Backman
03e3e9fae9 Update FireworksLLMService docstrings 2025-06-26 10:28:35 -04:00
Mark Backman
65234ae41a Update DeepSeekLLMService docstrings 2025-06-26 10:27:36 -04:00
Mark Backman
3828df8cf9 Update CerebrasLLMService docstrings 2025-06-26 10:26:42 -04:00
Mark Backman
9cbe85bf99 Update AzureLLMService docstrings 2025-06-26 10:25:17 -04:00
Mark Backman
7bf805b829 Update AWSBedrock docstrings 2025-06-26 10:23:40 -04:00
Mark Backman
990ee436e1 Add Anthropic docstrings 2025-06-26 07:42:22 -04:00
Mark Backman
1cd42066a6 Merge pull request #2067 from pipecat-ai/mb/update-docstrings-for-ref-docs
Update base service class docstrings for better docs auto-generation
2025-06-26 07:07:59 -04:00
Filipi da Silva Fuchter
ba43558049 Merge pull request #2066 from pipecat-ai/filipi/sentry_freeze_test
Enabling watchdog and sentry into the freeze-test
2025-06-26 08:01:51 -03:00
Mark Backman
951c8d34da Add special case handling for STT, TTS, LLM 2025-06-26 00:15:09 -04:00
Mark Backman
ac61139243 Add OpenAI LLM docstrings 2025-06-26 00:06:57 -04:00
Mark Backman
5b8f1fe3e3 Add Cartesia TTS docstrings 2025-06-25 23:50:55 -04:00
Mark Backman
0aa197e4a4 Add docstrings to DeepgramSTTService 2025-06-25 23:36:04 -04:00
Mark Backman
f04e058c96 Programmatically set the copyright date in docs 2025-06-25 23:29:37 -04:00
Mark Backman
6ef2ae12b7 Mock mcp imports 2025-06-25 23:29:37 -04:00
Mark Backman
fe6bbdaefe Skip dataclass attributes to remove duplicate entries 2025-06-25 23:29:37 -04:00
Mark Backman
cc66fddca9 Modify docs auto-gen rules to remove duplicate parameters listing 2025-06-25 23:29:37 -04:00
Mark Backman
04b70ddf13 Add MCPClient docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
bb3bb8d9c6 Improve WebsocketService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
f80f62c7d1 Add VisionService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
2007ae4317 Add ImageGenService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
a1e5a1eff4 Add AIService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
691999b402 Add AIServices docstring 2025-06-25 22:38:11 -04:00
Mark Backman
33f3a4cea1 Add TTSService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
ab1d2dbe6a Add STTService docstrings 2025-06-25 22:27:07 -04:00
Mark Backman
f622b281d0 Make call_start_function a private function in llm_service 2025-06-25 22:23:13 -04:00
Mark Backman
fb12bf9b4c Update LLMService docstrings 2025-06-25 22:23:13 -04:00
Aleix Conchillo Flaqué
27af50087e TaskManager: don't warn on reset_watchdog() 2025-06-25 17:29:45 -07:00
Filipi Fuchter
03502bed52 Enabling watchdog and sentry into the freeze-test 2025-06-25 20:53:30 -03:00
Aleix Conchillo Flaqué
27c7e2d150 Merge pull request #2063 from pipecat-ai/aleix/watchdog-timers-remove-start-watchdog
no need to call start_watchdog() only reset_watchdog()
2025-06-25 16:47:44 -07:00
Aleix Conchillo Flaqué
e81d387971 TaskManager: rely on add_done_callback() 2025-06-25 16:44:20 -07:00
Aleix Conchillo Flaqué
ef1ade3a71 allow enabling watchdog timers per frame processor or task 2025-06-25 16:36:19 -07:00
Aleix Conchillo Flaqué
4f032f5b96 update keepalive times depending on watchdog timers 2025-06-25 15:55:16 -07:00
Aleix Conchillo Flaqué
72cb967780 update CHANGELOG with watchdog timers updates 2025-06-25 15:55:16 -07:00
Aleix Conchillo Flaqué
357934a644 watchdog timers are disabled by default use enable_watchdog_timers 2025-06-25 15:55:16 -07:00
Aleix Conchillo Flaqué
327973657f TaskManager: remove wathcdog timer when main task is done 2025-06-25 11:26:21 -07:00
Aleix Conchillo Flaqué
d2730e6741 GooglSTTService: cleanup request queues 2025-06-25 11:12:32 -07:00
Aleix Conchillo Flaqué
eb5ecab104 no need to call start_watchdog() only reset_watchdog() 2025-06-25 11:12:32 -07:00
Mark Backman
202055a9b8 Merge pull request #2065 from pipecat-ai/mb/fix-configdict-openai-realtime
fix: add missing ConfigDict import in openai_realtime_beta/events
2025-06-25 11:40:35 -04:00
Mark Backman
7034a9e3fd fix: add missing ConfigDict import in openai_realtime_beta/events 2025-06-25 11:32:29 -04:00
Pete
1cf0b35ac1 Merge branch 'main' into groundingMetadata 2025-06-24 22:00:16 -04:00
Filipi da Silva Fuchter
8f7ed12262 Merge pull request #2061 from pipecat-ai/not_force_bot_speaking
Not forcing the bot resume speaking in case we receive no transcription.
2025-06-24 20:57:46 -03:00
Aleix Conchillo Flaqué
96b5320ef9 Merge pull request #2055 from pipecat-ai/aleix/fix-sentry-async
SentryMetrics: send metrics to sentry asynchronously
2025-06-24 16:32:01 -07:00
Filipi Fuchter
d5cd742237 Not forcing the bot resume speaking in case we receive no transcription. 2025-06-24 20:12:49 -03:00
Aleix Conchillo Flaqué
1f1da8942d SentryMetrics: send metrics to sentry asynchronously 2025-06-24 15:56:08 -07:00
Mark Backman
7953e1e9d9 Merge pull request #2054 from pipecat-ai/mb/telnyx-catch-hangup-error
fix: Telnyx, catch error when user has hung up the call first
2025-06-24 18:04:19 -04:00
Mark Backman
d6f7ecc0a3 fix: Telnyx, catch error when user has hung up the call first 2025-06-24 17:28:00 -04:00
Mark Backman
3eed316049 Merge pull request #2020 from snova-jorgep/snova-jorgep/sambanova-integration
Add Sambanova LLM and STT integration
2025-06-24 17:04:24 -04:00
Jorge Piedrahita Ortiz
851cf079c3 Merge branch 'main' into snova-jorgep/sambanova-integration 2025-06-24 16:00:28 -05:00
jhpiedrahitao
dfb0da32a9 fmt 2025-06-24 15:59:40 -05:00
Aleix Conchillo Flaqué
f450da57e5 Merge pull request #2056 from pipecat-ai/khk/fix-22d
Update google libraries used in google audio-in examples
2025-06-24 13:47:59 -07:00
Aleix Conchillo Flaqué
2ec6b6c995 Merge pull request #2060 from pipecat-ai/aleix/watchdog-timeout-secs
FrameProcessor: use watchdog_timeout_secs
2025-06-24 13:36:39 -07:00
Aleix Conchillo Flaqué
53b769a8ec FrameProcessor: use watchdog_timeout_secs 2025-06-24 13:33:47 -07:00
Filipi da Silva Fuchter
4f9adc173a Merge pull request #2004 from pipecat-ai/filipi/pipeline_freeze
Pipeline freeze improvements
2025-06-24 17:20:38 -03:00
Filipi Fuchter
dc4a58877e Fixing merge conflict. 2025-06-24 17:12:40 -03:00
Filipi Fuchter
a6243a6fe7 Merge branch 'main' into filipi/pipeline_freeze
# Conflicts:
#	CHANGELOG.md
#	src/pipecat/pipeline/task.py
#	src/pipecat/processors/frame_processor.py
#	src/pipecat/transports/base_input.py
2025-06-24 17:11:21 -03:00
Aleix Conchillo Flaqué
cf5f1b541a Merge pull request #2049 from pipecat-ai/aleix/introduce-watchdog-timers
introduce watchdog timers
2025-06-24 13:00:57 -07:00
Filipi Fuchter
70e6c48233 Mentioning the fixes in the changelog. 2025-06-24 16:56:46 -03:00
Filipi Fuchter
51f7d14d0a Merge branch 'main' into filipi/pipeline_freeze 2025-06-24 16:44:07 -03:00
Filipi Fuchter
4853d5d1fc Handling the case where user stopped speaking but no new aggregation received. 2025-06-24 16:42:10 -03:00
Aleix Conchillo Flaqué
076a8938f0 add start_watchdog/reset_watchdog to tasks 2025-06-24 11:56:20 -07:00
Aleix Conchillo Flaqué
5a3457ba33 introduce task watchdog timers 2025-06-24 11:56:20 -07:00
Aleix Conchillo Flaqué
2fc224384d Merge pull request #2059 from pipecat-ai/aleix/heartbeatframe-control-frames
HeartbeatFrames are now control frames
2025-06-24 11:55:18 -07:00
Aleix Conchillo Flaqué
a4e6ea5a3f HeartbeatFrames are now control frames 2025-06-24 11:27:39 -07:00
Vanessa Pyne
d3c211f293 Merge pull request #2058 from pipecat-ai/vp-mcp-sse-up
follow up to #1887 - proper MCP SSE support
2025-06-24 13:06:01 -05:00
vipyne
20047c369e mcp: update examples to use SseServerParameter 2025-06-24 12:58:39 -05:00
vipyne
dd1ff237a8 lint mcp_service 2025-06-24 12:58:33 -05:00
Vanessa Pyne
39d80d0b0e Merge pull request #1887 from ezun-kim/feat/mcp-sse-params
Fix SSE server connection handling for MCP client
2025-06-24 12:58:05 -05:00
Kwindla Hultman Kramer
7a48316534 update google libraries used in google audio-in examples 2025-06-24 09:52:04 -07:00
Filipi da Silva Fuchter
031a93ac46 Merge pull request #2053 from pipecat-ai/sentry_dsn_environment_variable
Creating an environment variable for sentry dsn.
2025-06-24 12:10:20 -03:00
Mark Backman
ea6cc1aa95 Merge pull request #2052 from pipecat-ai/mb/11labs-keepalive
Send context_id when available in ElevenLabsTTSService keepalive message
2025-06-24 11:07:07 -04:00
Filipi Fuchter
365260ec44 Creating an environment variable for sentry dsn. 2025-06-24 11:57:14 -03:00
Mark Backman
2eb244c80a Send context_id when available in ElevenLabsTTSService keepalive message 2025-06-24 10:52:49 -04:00
Mark Backman
aee3011d61 Merge pull request #2037 from pipecat-ai/mb/11labs-close-context
Fix: Correctly close the context for ElevenLabsTTSService
2025-06-24 07:44:22 -04:00
Aleix Conchillo Flaqué
40496e7b0f Merge pull request #2034 from pipecat-ai/khk/pause-frames
small fix for processor pause/resume frames
2025-06-23 17:08:41 -07:00
Kwindla Hultman Kramer
6b24f89fa7 small fix for processor pause/resume frames 2025-06-23 16:44:32 -07:00
Filipi Fuchter
2097800042 Allowing to clear the turn analyser 2025-06-23 18:50:37 -03:00
Filipi Fuchter
6739318e68 Forcing user stopped speaking due to timeout to receive audio frame! 2025-06-23 18:50:02 -03:00
Filipi Fuchter
d0bd563d42 Logging the BaseException inside the cancel_task. 2025-06-23 18:48:44 -03:00
Filipi Fuchter
74280829fc Fixed an issue with the FastAPIWebsocketClient to disconnect in case the websocket is already closed. 2025-06-23 18:48:03 -03:00
Filipi Fuchter
3fde8880f2 Fixed a couple of places inside the FrameProcessor where we should not raise the exceptions. 2025-06-23 18:47:54 -03:00
Filipi Fuchter
98d39e0d38 Logging the last 10 frames received in case idle timeout is detected. 2025-06-23 18:47:17 -03:00
Filipi Fuchter
c9cebb5ffe Created an example for testing the bot and try to create freezing conditions. 2025-06-23 18:46:58 -03:00
Mark Backman
f52ac6e99c Merge pull request #1998 from pipecat-ai/mb/fix-38-smart-turn-fal 2025-06-23 17:15:29 -04:00
Mark Backman
787a6b1c6a Merge pull request #2038 from pipecat-ai/mb/openai-realtime-model-update
Update OpenAIRealtimeBetaLLMService model to gpt-4o-realtime-preview-…
2025-06-23 16:30:31 -04:00
Mark Backman
d00a91074e Update OpenAIRealtimeBetaLLMService model to gpt-4o-realtime-preview-2025-06-03 2025-06-23 16:26:42 -04:00
Mark Backman
4e11497a38 Merge pull request #2048 from thibaudbrg/patch-1
Fix missing video_in_enabled in vision bot.py for Moondream template
2025-06-23 16:11:50 -04:00
Tibo
0443d5202a Fix missing video_in_enabled in vision bot.py for Moondream template
The parameter video_in_enabled=True was missing in DailyParams, which prevented image capture 
from working. Without this parameter, UserImageRequestFrame would be sent but no actual image data would be captured from participants.

This fix enables the "Let me take a look" functionality to work as 
intended by allowing the transport to capture video frames for vision processing with Moondream.
2025-06-23 21:17:41 +02:00
Mark Backman
633c25cb13 Merge pull request #2039 from pipecat-ai/mb/remove-lang-validation
OpenAIRealtimeBetaLLMService accepts language for all InputAudioTrans…
2025-06-23 14:41:09 -04:00
jhpiedrahitao
d07f45132f update changelog 2025-06-23 12:54:00 -05:00
jhpiedrahitao
a51280afa6 add 13 and 14 type foundational examples for sambanova iontegration 2025-06-23 12:53:32 -05:00
Jorge Piedrahita Ortiz
be14eb2460 Merge branch 'pipecat-ai:main' into snova-jorgep/sambanova-integration 2025-06-23 12:23:00 -05:00
jhpiedrahitao
e26dbffcbe update sambanova init imports 2025-06-23 12:22:08 -05:00
Mark Backman
59992fd24a Merge pull request #2044 from pipecat-ai/mb/daily-rest-docstring
Add missing arg docstring in DailyRESTHelper
2025-06-23 11:24:44 -04:00
Mark Backman
455362ccaf Merge pull request #2022 from pipecat-ai/mb/turn-tracking-end-cancel-frame
TurnTrackingObserver ends turn upon seeing EndFrame, CancelFrame
2025-06-23 11:24:27 -04:00
Mark Backman
16c0e2460b TurnTrackingObserver ends turn upon seeing EndFrame, CancelFrame 2025-06-23 11:08:51 -04:00
Matej Marinko
c54084b7a4 Fix deadlock on STT service stop 2025-06-23 14:18:29 +02:00
Mark Backman
92246f7125 Add missing arg docstring in DailyRESTHelper 2025-06-22 13:44:59 -04:00
Pete
e3fe040017 Update gemini.py 2025-06-21 14:43:15 -04:00
Pete
ae5e3e2dc4 Merge branch 'main' into groundingMetadata 2025-06-21 12:16:32 -04:00
Pete
77378d2779 Merge branch 'pipecat-ai:main' into groundingMetadata 2025-06-21 12:08:49 -04:00
Pete
4106f0dabe Merge branch 'pipecat-ai:main' into main 2025-06-21 10:54:25 -04:00
Mark Backman
7737335ec9 OpenAIRealtimeBetaLLMService accepts language for all InputAudioTranscription models 2025-06-21 10:08:46 -04:00
Mark Backman
5cc9b7e0d1 Fix: Correctly close the context for ElevenLabsTTSService 2025-06-20 15:47:03 -04:00
Mark Backman
8c6a441064 Merge pull request #2035 from smokyabdulrahman/feat/aws-polly-lexicon-names-support
Support AWS Polly Lexicon Names parameter
2025-06-20 10:03:27 -04:00
Alrahma
fddc058ce2 add CHANGELOG entry 2025-06-20 14:15:24 +01:00
Alrahma
89750086c5 Support AWS Polly Lexicon Names parameter
Documentation reference
[AWS Managing
Lexicons](https://docs.aws.amazon.com/polly/latest/dg/managing-lexicons.html)
2025-06-20 09:47:46 +01:00
Aleix Conchillo Flaqué
e69406c7e2 Merge pull request #2032 from pipecat-ai/aleix/aws-nova-sonic-function-calls
AWSNovaSonicLLMService: fix function calling
2025-06-19 14:42:47 -07:00
Aleix Conchillo Flaqué
878ae42d84 AWSNovaSonicLLMService: fix function calling 2025-06-19 14:26:34 -07:00
Aleix Conchillo Flaqué
d34ebfc126 Merge pull request #2027 from pipecat-ai/aleix/task-on-idle-timeout-repeated
PipelineTask: fix repeated on_idle_timeout
2025-06-19 14:13:10 -07:00
Aleix Conchillo Flaqué
028f7b2d65 PipelineTask: fix repeated on_idle_timeout 2025-06-19 09:14:10 -07:00
Mark Backman
0aa3ec50f2 Merge pull request #2023 from pipecat-ai/mb/allow-interruptions-true
allow_interruptions=True
2025-06-19 10:24:53 -04:00
Mark Backman
9146def21b Update examples to use default allow_interruptions, fixes to align examples 2025-06-19 10:07:32 -04:00
Aleix Conchillo Flaqué
ebb23a5a8c Merge pull request #2024 from pipecat-ai/aleix/audio-buffer-processor-sync-issues
AudioBufferProcessor: treat all streams as intermittent
2025-06-18 18:26:38 -07:00
Aleix Conchillo Flaqué
b118082984 AudioBufferProcessor: treat all streams as intermittent
This fixes an issue with STTMuteFilter that prevents user audio to be pushed
downstream.
2025-06-18 18:23:31 -07:00
Mark Backman
b5c0ac5f25 allow_interruptions=True 2025-06-18 20:33:40 -04:00
Filipi da Silva Fuchter
dc78e874af Merge pull request #2021 from pipecat-ai/gladia_stt_improvements_changelog
Adding the GladiaSTTService improvements in the changelog.
2025-06-18 18:25:36 -03:00
Filipi Fuchter
c30bde0a2b Adding the GladiaSTTService improvements in the changelog. 2025-06-18 16:19:58 -03:00
Filipi da Silva Fuchter
171597fbe9 Merge pull request #1952 from jqueguiner/feat/gladia-auto-reconnect
feat: Enhance GladiaSTTService with reconnection and audio buffer management features
2025-06-18 16:14:58 -03:00
jhpiedrahitao
fae2d272d5 fmt 2025-06-18 10:53:06 -05:00
jhpiedrahitao
03a067d3e6 add sambanova llm and stt 2025-06-18 10:50:42 -05:00
Mark Backman
f5d028f3b3 Merge pull request #2017 from pipecat-ai/mb/fix-11labs-voice-settings
fix: ElevenLabsTTSService voice settings not being sent
2025-06-18 09:56:46 -04:00
Mark Backman
e5b7dbba90 fix: ElevenLabsTTSService voice settings not being sent 2025-06-18 09:49:17 -04:00
Filipi da Silva Fuchter
7ffba1e0b3 Merge pull request #1950 from pipecat-ai/filipi/tavus_custom_tracks
Sending audio to Tavus using custom tracks
2025-06-18 07:57:19 -03:00
Filipi Fuchter
72cdbf0b78 Mentioning the Tavus improvements in the changelog. 2025-06-18 07:46:04 -03:00
Filipi Fuchter
8b4a86f629 Ignoring the audio level when creating the custom tracks. 2025-06-18 07:45:54 -03:00
Filipi Fuchter
fa15e64fc9 Test script that mimics the behavior expected to be supported by Tavus. 2025-06-18 07:45:38 -03:00
Filipi Fuchter
564f064c71 Refactoring TavusVideoService to send audio using WebRTC audio tracks instead of app-messages. 2025-06-18 07:44:51 -03:00
Filipi Fuchter
4062c7afa0 Refactoring TavusTransport to send audio using WebRTC audio tracks instead of app-messages. 2025-06-18 07:44:38 -03:00
Jean-Louis Queguiner
8071c4ba1c Merge branch 'main' into feat/gladia-auto-reconnect 2025-06-18 08:57:21 +02:00
jqueguiner
3d0ffbc832 🐛 (stt.py): handle websocket connection closure gracefully and log warnings
♻️ (stt.py): refactor reconnection logic into a separate method for clarity
 (stt.py): implement exponential backoff for reconnection attempts to improve reliability
2025-06-18 08:52:43 +02:00
Filipi da Silva Fuchter
1cac94bf97 Merge pull request #1925 from pipecat-ai/filipi/websocket_transport_example_twilio
Websocket client web app to test Twilio.
2025-06-17 16:24:18 -03:00
Mark Backman
c94c51d44f Fix: 38-smart-turn-fal 2025-06-17 15:10:52 -04:00
Mark Backman
96958933af Merge pull request #2016 from pipecat-ai/aleix/example-params-allow-async-objects
examples: create transport params async
2025-06-17 15:08:37 -04:00
Filipi Fuchter
2300c2632e Refactoring how we are organizing the twilio chatbot examples and improving the readmes 2025-06-17 16:08:35 -03:00
Filipi Fuchter
cbd0529674 Merge branch 'main' into filipi/websocket_transport_example_twilio 2025-06-17 15:54:31 -03:00
Filipi da Silva Fuchter
5614e35ac4 Merge pull request #2015 from pipecat-ai/bumping_pipecat_required_versions
Bumping pipecat-ai-krisp required version
2025-06-17 15:42:20 -03:00
Aleix Conchillo Flaqué
c11172caba examples: create transport params async 2025-06-17 11:37:42 -07:00
Filipi Fuchter
11b6e409bb Bumping pipecat-ai-krisp required version 2025-06-17 15:22:31 -03:00
Aleix Conchillo Flaqué
3dca95aa3c Merge pull request #2014 from pipecat-ai/aleix/daily-python-0.19.3
update daily-python to 0.19.3
2025-06-17 10:10:23 -07:00
Aleix Conchillo Flaqué
7ddc706434 update daily-python to 0.19.3 2025-06-17 09:30:28 -07:00
Aleix Conchillo Flaqué
20eebb08e9 update CHANGELOG with AWSTranscribeSTTService Polish support 2025-06-16 10:34:56 -07:00
Aleix Conchillo Flaqué
4abf41b85a Merge pull request #2011 from wuodar/wuodar/polish-lang-aws-transcribe
Support polish language in Amazon Transcribe
2025-06-16 10:33:55 -07:00
Aleix Conchillo Flaqué
e426f7ee7c Merge pull request #2012 from pipecat-ai/aleix/frame-pause-resume-frames
FrameProcessor: handle new FrameProcessorPauseFrame/FrameProcessorResumeFrame
2025-06-16 10:32:38 -07:00
Aleix Conchillo Flaqué
14dc6a7984 FrameProcessor: handle new FrameProcessorPauseFrame/FrameProcessorResumeFrame 2025-06-16 10:31:33 -07:00
Mark Backman
e0a24a3f07 Merge pull request #2006 from pipecat-ai/mb/expose-function-calls-in-progress
Expose has_function_calls_in_progress property
2025-06-16 12:49:07 -04:00
Mark Backman
d1bee22d73 Expose has_function_calls_in_progress property 2025-06-16 12:45:16 -04:00
Jon Taylor
d73f7908f2 Merge pull request #2008 from pipecat-ai/khk/groq-audio
fix groq wav file header parsing
2025-06-16 14:09:09 +01:00
Aleix Conchillo Flaqué
a4ea0d2b82 dev-requirements: update pyright 1.1.400 and ruff 0.11.13 2025-06-15 21:05:03 -07:00
Kacper Włodarczyk
e2c15169b8 feat: support polish language in Amazon Transcribe 2025-06-15 21:44:06 +02:00
Kwindla Hultman Kramer
fe16ed3c73 added changelog entry 2025-06-15 10:49:40 -07:00
Filipi Fuchter
80ce097f90 Using relative URL for the websocket. 2025-06-15 10:49:25 -03:00
Filipi Fuchter
eceaf8a46b Making the path to the web client relative 2025-06-14 21:07:15 -03:00
Kwindla Hultman Kramer
1e3fa4a9c7 fix groq wav file header parsing 2025-06-14 17:41:44 -04:00
Pete
2ed1ed6821 Merge branch 'pipecat-ai:main' into main 2025-06-14 16:23:27 -04:00
Filipi da Silva Fuchter
dc640a7591 Merge pull request #2001 from pipecat-ai/filipi/google_stt_reconnection_issue
Fixed an issue with `GoogleSTTService` where it was constantly reconnecting
2025-06-13 08:29:18 -03:00
Filipi Fuchter
1f072d182c Merge branch 'main' into filipi/google_stt_reconnection_issue
# Conflicts:
#	CHANGELOG.md
2025-06-13 08:26:00 -03:00
Mark Backman
1d64e04ed5 Merge pull request #2002 from pipecat-ai/mb/google-fix-ttfb
Fix: GoogleLLMService TTFB
2025-06-12 12:10:01 -04:00
Mark Backman
22f4f0b79e Update 14e example name 2025-06-12 11:45:59 -04:00
Mark Backman
69c63293fb fix: GoogleLLMService TTFB value 2025-06-12 11:43:27 -04:00
Filipi Fuchter
c1db13ceeb Fixed an issue with GoogleSTTService where it was constantly reconnecting before starting to receive audio from the user. 2025-06-12 12:07:33 -03:00
Matej Marinko
6d3a38842d Merge branch 'main' of github.com:pipecat-ai/pipecat 2025-06-12 11:32:38 +02:00
Filipi Fuchter
70eadee0aa Bumping the @pipecat-ai/websocket-transport dependency. 2025-06-11 18:30:16 -03:00
Pete
7360f79413 Merge branch 'pipecat-ai:main' into main 2025-06-11 13:16:19 -04:00
Aleix Conchillo Flaqué
228afe01ed Merge pull request #1993 from pipecat-ai/aleix/pipecat-0.0.71
update CHANGELOG for 0.0.71
2025-06-10 14:42:09 -07:00
Aleix Conchillo Flaqué
61a5154e49 update CHANGELOG for 0.0.71 2025-06-10 14:34:30 -07:00
Sunah Suh
d3df75aaa0 Add additional_span_attributes param to PipelineTask for extra otel… (#1992) 2025-06-10 17:32:24 -04:00
Aleix Conchillo Flaqué
c59180dd6e udpate CHANGELOG 2025-06-10 14:23:02 -07:00
Mark Backman
e4c2310632 Merge pull request #1990 from pipecat-ai/mb/more-cartesia-stt
Add Cartesia STT docs link to README, fix set_model error
2025-06-10 17:19:11 -04:00
Aleix Conchillo Flaqué
e1735a2da1 Merge pull request #1991 from pipecat-ai/aleix/pipecat-0.0.70
update CHANGELOG for 0.0.70
2025-06-10 14:08:52 -07:00
Aleix Conchillo Flaqué
c101c9c8e1 update CHANGELOG for 0.0.70 2025-06-10 13:37:28 -07:00
Aleix Conchillo Flaqué
96dc162de5 Merge pull request #1988 from pipecat-ai/aleix/update-examples-22b
examples(22b): remove unnecessary parallel pipeline branch
2025-06-10 12:58:37 -07:00
Mark Backman
257dbe3104 Fix model param error 2025-06-10 15:14:47 -04:00
Mark Backman
cd98657e3c Add Cartesia STT docs link to README 2025-06-10 15:09:13 -04:00
Aleix Conchillo Flaqué
03eb22fe0a examples(22b): remove unnecessary parallel pipeline branch 2025-06-10 09:05:58 -07:00
Pete
8d55e13750 remove audio_transcriber from gemini.py
unecessary import removed.
2025-06-10 11:22:16 -04:00
Pete
737e8e79c9 Merge branch 'main' into groundingMetadata 2025-06-10 11:12:35 -04:00
Pete
4d977fede0 Merge branch 'main' into main 2025-06-10 11:07:59 -04:00
Filipi Fuchter
0073a868d4 Websocket client web app to test Twilio. 2025-06-10 11:34:02 -03:00
Mark Backman
0bb61d72ab Merge pull request #1984 from pipecat-ai/mb/cartesia-stt-cleanup
CartesiaSTTService cleanup
2025-06-10 10:30:18 -04:00
Mark Backman
f758508a82 Merge pull request #1978 from pipecat-ai/mb/rime-languages
Add languages to RimeHttpTTSService, extend lang support to German an…
2025-06-10 10:27:15 -04:00
Mark Backman
69d0218d7e Add languages to RimeHttpTTSService, extend lang support to German and French 2025-06-10 10:20:41 -04:00
Aleix Conchillo Flaqué
eb5e5ab1df update CHANGELOG 2025-06-09 20:22:39 -07:00
Aleix Conchillo Flaqué
093697906c Merge pull request #1954 from WebinarGeek/wg/gladia-informal-translations
Gladia informal translations
2025-06-09 20:21:40 -07:00
Aleix Conchillo Flaqué
efe96b7ed1 Merge pull request #1986 from pipecat-ai/aleix/daily-python0.19.2
pyproject: update daily-python to 0.19.2
2025-06-09 19:46:14 -07:00
Aleix Conchillo Flaqué
7ecdd41ab9 pyproject: update daily-python to 0.19.2 2025-06-09 17:29:07 -07:00
Mark Backman
aec70d61e9 CartesiaSTTService cleanup 2025-06-09 15:20:57 -04:00
Mark Backman
2efac13344 Merge pull request #1983 from pipecat-ai/mb/exotel-resampling
Resample audio in ExotelFrameSerializer
2025-06-09 14:41:08 -04:00
Mark Backman
15aeb11c36 Resample audio in ExotelFrameSerializer 2025-06-09 14:02:25 -04:00
Mark Backman
e705f4d984 Merge pull request #1972 from Vaibhav159/vl_add_exotel_serializer
adding exotel serializer
2025-06-09 13:54:26 -04:00
Shrey Gupta
96fa62fdfe [Add] Support for Cartesia AI STT (#1982) 2025-06-09 14:51:01 -03:00
Mark Backman
845c70797a Merge pull request #1975 from pipecat-ai/mb/11labs-flush-context-reset
fix: ElevenLabsTTSService reset context when flushing audio
2025-06-09 13:21:25 -04:00
kompfner
16048956c3 Merge pull request #1956 from pipecat-ai/pk/make-add-observer-sync
Make `PipelineTask.add_observer()` synchronous. This allows callers t…
2025-06-09 13:19:34 -04:00
Mark Backman
cf2f4b5902 fix: ElevenLabsTTSService reset context when flushing audio 2025-06-09 13:17:55 -04:00
marcus-daily
db46f33f34 Update to Android transports 0.3.7 2025-06-09 17:09:59 +01:00
Aleix Conchillo Flaqué
25d1515daf Merge pull request #1979 from pipecat-ai/aleix/buffer-tts-before-playback
buffer audio from TTS service before pushing frames
2025-06-09 08:43:55 -07:00
Paul Kompfner
a3469cd59f Add CHANGELOG entry describing PipelineTask.add_observer() being made synchronous 2025-06-09 11:37:30 -04:00
Paul Kompfner
513ce26200 Add unit test exercising synchronous usage of PipelineTask.add_observer() right after initializing the PipelineTask (before anything else is done with it) 2025-06-09 11:30:24 -04:00
Paul Kompfner
1cd96f94ff Make PipelineTask.add_observer() synchronous. This allows callers to call it before run()ning the PipelineTask first. Without this change, if they tried to do that, they would get an error because the TaskManager's event loop hadn't been set yet. 2025-06-09 11:30:24 -04:00
Aleix Conchillo Flaqué
901dd041f0 buffer audio from TTS service before pushing frames 2025-06-09 07:29:09 -07:00
Vaibhav159
a2ee94651e removing resampling 2025-06-07 12:53:55 +05:30
Aleix Conchillo Flaqué
abdce063f1 Merge pull request #1973 from pipecat-ai/aleix/assemblyai-yield-none
AssemblyAISTTService: yield None instead of Frame()
2025-06-06 15:12:16 -07:00
Aleix Conchillo Flaqué
a33ce5e4bf AssemblyAISTTService: yield None instead of Frame() 2025-06-06 14:41:01 -07:00
Filipi da Silva Fuchter
c9575eaef9 Merge pull request #1911 from pipecat-ai/filipi/websocket_transport_example
Adding support to WebsocketTransport
2025-06-06 17:25:07 -03:00
Filipi Fuchter
1e74476a71 Refactoring to use the observer inside the pipelinetask, and moving to start the bot inside on_client_ready. 2025-06-06 17:22:50 -03:00
Filipi Fuchter
82935884c4 Mentioning the new websocket example in the changelog. 2025-06-06 17:17:11 -03:00
Filipi Fuchter
d774a23768 Improving the readme to mention that can choose which server websocket to use. 2025-06-06 17:12:05 -03:00
Filipi Fuchter
e9f041e170 Removing the old websocket-server example 2025-06-06 17:09:01 -03:00
Filipi Fuchter
1f51b6e4f1 A Pipecat example demonstrating how to use WebsocketTransport 2025-06-06 17:08:43 -03:00
Filipi Fuchter
028650249c Adding support in ProtobufFrameSerializer to deserialize MessageFrame. 2025-06-06 17:07:39 -03:00
Vaibhav159
534197239f updating changelog 2025-06-07 00:24:54 +05:30
Vaibhav159
d2f4bb574c adding exotel serializer 2025-06-07 00:22:41 +05:30
jqueguiner
25ff8ef37b (config.py): add new configuration options for lip-sync optimization, context adaptation, and additional context to enhance translation accuracy
♻️ (stt.py): increase default max buffer size from 5MB to 20MB to accommodate larger audio data
♻️ (stt.py): simplify audio sending logic by removing chunking and sending the entire buffered audio at once for improved performance
2025-06-05 16:51:29 -07:00
Aleix Conchillo Flaqué
07fb1a2c39 Merge pull request #1967 from counterleft/unused-http-client-session
Remove instantiation of unused http client session from certain examples
2025-06-05 12:59:01 -07:00
Aleix Conchillo Flaqué
581b800c43 Merge pull request #1961 from ken-kuro/patch-1
fix(piper-tts): typo
2025-06-05 12:57:58 -07:00
Aleix Conchillo Flaqué
30ca39287f Merge pull request #1962 from ken-kuro/patch-2
fix(fastapi_websocket): typo
2025-06-05 12:57:22 -07:00
Aleix Conchillo Flaqué
01fa9698de Merge pull request #1960 from pipecat-ai/aleix/disable-uvloop
disable uvloop by default and just let the user set it
2025-06-05 12:12:47 -07:00
Brian Mathiyakom
10bd969636 Remove instantiation of unused http client session
These examples don't make any HTTP requests with `session` so there
doesn't seem be a need to create one in the first place. Probably a
copy-paste from a previous example.
2025-06-05 11:45:13 -07:00
Kendrick Ha
f7761f2b61 fix(fastapi_websocket): typo 2025-06-05 13:55:28 +07:00
Kendrick Ha
49ff38a21f fix(piper-tts): typo 2025-06-05 13:50:56 +07:00
Aleix Conchillo Flaqué
8d161306c7 disable uvloop by default and just let the user set it 2025-06-04 21:25:06 -07:00
Vanessa Pyne
027a82dff1 Merge pull request #1958 from pipecat-ai/vp-livekit-fix
fix: transports/services/livekit.py typo
2025-06-04 12:27:47 -05:00
vipyne
cb409d58e0 fix: transports/services/livekit.py typo 2025-06-04 11:14:21 -05:00
Dan Berg
094e2f8151 Fix formatting 2025-06-03 17:21:51 +02:00
Dan Berg
71d121aeb9 Update CHANGELOG.md explaining informal on Gladia TranslationConfig 2025-06-03 17:15:29 +02:00
Dan Berg
b1a88af43c Add informal to Gladia TranslationConfig 2025-06-03 17:10:52 +02:00
Filipi da Silva Fuchter
f73eb4ebd9 Merge pull request #1949 from pipecat-ai/filipi/transport_destination_issue
Fixed transport destination issue
2025-06-03 08:41:34 -03:00
Filipi Fuchter
31ca9be299 Fixing missing await to self.reset. 2025-06-03 08:37:47 -03:00
jqueguiner
02cc6f3d56 Enhance GladiaSTTService with reconnection and audio buffer management features
- Added parameters for maximum reconnection attempts, reconnection delay, and maximum audio buffer size.
- Implemented automatic reconnection logic with exponential backoff.
- Introduced audio buffer management to handle audio data efficiently, including trimming excess data.
- Updated connection handling to ensure proper cleanup and management of WebSocket connections.
- Enhanced audio sending logic to support buffered audio transmission after reconnections.
2025-06-03 03:16:57 -07:00
Filipi Fuchter
1642c082d1 Describing the fix in the changelog. 2025-06-02 22:28:31 -03:00
Filipi Fuchter
892d213442 Fixing issue to keep the transport_destination. 2025-06-02 22:16:10 -03:00
Filipi Fuchter
fc24267e09 Waiting for the LLM response to reset. 2025-06-02 22:15:53 -03:00
Aleix Conchillo Flaqué
9b71bdc608 Merge pull request #1947 from pipecat-ai/aleix/pipecat-0.0.69
update CHANGELOG for 0.0.69
2025-06-02 12:51:51 -07:00
Aleix Conchillo Flaqué
310be89895 update CHANGELOG for 0.0.69 2025-06-02 12:07:50 -07:00
Aleix Conchillo Flaqué
71fbd57e12 Merge pull request #1938 from pipecat-ai/aleix/custom-interruption-strategies
allow custom interruption strategies
2025-06-02 12:05:50 -07:00
Aleix Conchillo Flaqué
ab4b48c823 examples(04a): fix daily_runner import 2025-06-02 12:01:26 -07:00
Aleix Conchillo Flaqué
532767cfa1 LLMUserContextAggregator: reset strategies when reseting the aggregator 2025-06-02 12:01:26 -07:00
Aleix Conchillo Flaqué
5512de3221 allow custom interruption strategies 2025-06-02 12:01:26 -07:00
Mark Backman
13546d5e8f Merge pull request #1946 from pipecat-ai/mb/fix-11labs-context
fix: Use AudioContextWordTTSService context methods in ElevenLabsTTSS…
2025-06-02 14:55:49 -04:00
Mark Backman
c6f1aa8086 fix: Use AudioContextWordTTSService context methods in ElevenLabsTTSService 2025-06-02 14:49:05 -04:00
Mark Backman
5606c47cb7 Merge pull request #1945 from pipecat-ai/mb/gemini-2.0
Reverting Gemini Live model back to gemini-2.0-flash-live-001
2025-06-02 14:25:30 -04:00
Filipi da Silva Fuchter
7f7cd96211 Merge pull request #1944 from pipecat-ai/fixing_tavus_transport
Adding the direction when pushing the frame.
2025-06-02 15:21:58 -03:00
Filipi Fuchter
b828bfd890 Adding the direction when pushing the BotStartedSpeaking and BotStoppedSpeaking frames. 2025-06-02 15:05:56 -03:00
Mark Backman
31d084eb78 Reverting Gemini Live model back to gemini-2.0-flash-live-001 2025-06-02 13:29:05 -04:00
Mark Backman
ab18b280e9 Merge pull request #1943 from pipecat-ai/mb/add-transcription-19-openai
Add a TranscriptProcessor to 19-openai-realtime-beta.py
2025-06-02 13:01:52 -04:00
Mark Backman
24e89c4081 Merge pull request #1936 from pipecat-ai/mb/fix-01-quickstart
Add daily to the foundational examples requirements.txt
2025-06-02 12:55:36 -04:00
Mark Backman
e129390f56 Add a TranscriptProcessor to 19-openai-realtime-beta.py 2025-06-02 11:38:49 -04:00
Mark Backman
4d7c87bb4c Merge pull request #1941 from pipecat-ai/vp-observer-up
chore: move observers arg in p2p-webrtc example; add deprecated to in…
2025-06-02 11:17:48 -04:00
Mark Backman
dac3f82a75 Merge pull request #1934 from counterleft/use-disconnect-on-small-webrtc-connection-example
Fix type checker error with missing function call in the small WebRTC transport example
2025-06-02 11:17:30 -04:00
Mark Backman
fd860921f1 Add daily to the foundational examples requirements.txt 2025-06-02 11:10:49 -04:00
vipyne
0482ccd48b chore: move observers arg in p2p-webrtc example; add deprecated to in line comments 2025-06-02 09:41:09 -05:00
Dominic Stewart
b8b1990617 Fix example env file (#1939)
* Fixed typo in the example env files
2025-06-02 15:12:43 +09:00
Dominic Stewart
70951b1198 Add simplified pstn examples (#1822)
* Add simplified pstn examples
* Add daily_twilio_sip_dial_out example
2025-06-02 14:50:21 +09:00
Mark Backman
6d24514ace Merge pull request #1937 from pipecat-ai/mb/fix-message-for-logging
fix: correctly display non-roman characters
2025-06-01 12:49:48 -04:00
Aleix Conchillo Flaqué
49915ceb84 Merge pull request #1683 from pipecat-ai/aleix/run-function-calls-sequentially
run function calls sequentially or in parallel
2025-06-01 09:47:35 -07:00
Mark Backman
925b13e337 fix: correctly display non-roman characters 2025-06-01 12:29:26 -04:00
Aleix Conchillo Flaqué
ef3143d558 LLMService: don't run function calls if none are given 2025-05-31 14:01:56 -07:00
Mark Backman
ed84637b55 Add additional function call for testing to 14e, 14r, 19, 19a, 26b 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
897a944478 examples(14,14a): add restaurant recommendation function call 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
d86343c38d examples: update to use on_function_calls_started event 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
297afdd126 LLMService: add new FunctionCallsStartedFrame 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
f0cbdc4e68 LLMService: add on_function_calls_started event 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
40b52cadde LLMService: s/FunctionCallLLM/FunctionCallFromLLM/ s/FunctionCallRunner/FunctionCallRunnerItem/ 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
04bf85ddfe LLMService: allow executing tasks sequentially and in parallel 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
4809684a13 LLMService: cancel function calls on interruptions by default 2025-05-31 13:57:50 -07:00
Aleix Conchillo Flaqué
1eb50ad88f LLMService: pass LLM function calls all at once 2025-05-31 13:57:36 -07:00
Aleix Conchillo Flaqué
52569bcdb2 LLMService: don't allow running functions concurrently for now 2025-05-31 13:57:36 -07:00
Aleix Conchillo Flaqué
a50a407415 LLMService: run function calls sequentially 2025-05-31 13:57:35 -07:00
Aleix Conchillo Flaqué
9f223442c2 Merge pull request #1935 from pipecat-ai/aleix/script-evals
improve evals
2025-05-30 19:14:59 -07:00
Mark Backman
c647114bb9 Merge pull request #1927 from pipecat-ai/mb/gemini-tracing 2025-05-30 22:01:44 -04:00
Mark Backman
43719ec737 Update CHANGELOG 2025-05-30 21:36:39 -04:00
Mark Backman
8602557985 Refactor Gemini tracing to more closely match OpenAI Realtime, add TTFB metrics 2025-05-30 21:36:00 -04:00
Mark Backman
dd1f7d0875 Add tracking to OpenAI Realtime 2025-05-30 21:36:00 -04:00
Mark Backman
ec39e794d3 _handle_transcription 2025-05-30 21:36:00 -04:00
Mark Backman
7b1a937d4c Add tracing for Gemini Live 2025-05-30 21:36:00 -04:00
Mark Backman
0fd38d8115 Merge pull request #1931 from pipecat-ai/mb/num-words
Add support for interruption strategies
2025-05-30 21:14:02 -04:00
Mark Backman
7a4efc6212 Code review feedback 2025-05-30 21:09:15 -04:00
Brian Mathiyakom
2eb2c5a413 Use disconnect() because close() doesn't exist
SmallWebRTCConnection doesn't have a `close()`. There's a `_close()` but I assume that's private due to its naming. The closest function that uses `_close()` is `disconnect()`. I assume then, that the intended resource freeing function call should be to `disconnect()`.
2025-05-30 17:14:53 -07:00
Aleix Conchillo Flaqué
2fcfb0aa9f evals: don't use Deepgram's smart formatting 2025-05-30 16:55:55 -07:00
Aleix Conchillo Flaqué
f1df079512 evals: allow running a single eval 2025-05-30 16:55:55 -07:00
getchannel
8070e156d8 Add groundingMetadata events.py 2025-05-30 18:07:09 -04:00
Aleix Conchillo Flaqué
d77bedbafb evals: move scripts/release to script/evals and add README 2025-05-30 15:04:05 -07:00
getchannel
43c6f1f5cd Add groundingMetadata and logging gemini.py 2025-05-30 18:01:15 -04:00
getchannel
f53f5445ba Create 26g-gemini-multimodal-live-groundingMetadata.py 2025-05-30 17:36:36 -04:00
Mark Backman
b34c593c54 Add changelog entry 2025-05-30 16:48:42 -04:00
Mark Backman
62efbc3342 Add foundational example 42 2025-05-30 16:48:42 -04:00
Mark Backman
2d609a0bde Update LLmUserContextAggregator to conditionally push_aggregation 2025-05-30 16:48:42 -04:00
Mark Backman
6bc4b4a17f Update BaseInputTransport to not push StartInterruptionFrame when InterruptionConfig is set and bot is speaking 2025-05-30 16:48:42 -04:00
Mark Backman
b489e52080 Add InterruptionConfig 2025-05-30 16:15:20 -04:00
getchannel
7263d11ee4 update correct upload endpoint file_api.py 2025-05-30 13:41:55 -04:00
getchannel
f2d5b9ad69 Create 26f-gemini-multimodal-live-files-api.py
This is an example to test usage of the Files API integration. Specifically with the Gemini Multimodal Live Service.
2025-05-30 13:04:52 -04:00
getchannel
40c7e3c52c Update gemini.py 2025-05-30 12:19:40 -04:00
Aleix Conchillo Flaqué
a8aaeec52b Merge pull request #1926 from pipecat-ai/aleix/pause-base-input-transport
handle StopFrame in base input transport and stop pushing frames
2025-05-30 08:27:20 -07:00
Mark Backman
ad7eec181e Merge pull request #1923 from philipp-eisen/philipp/fix-ttfb_ms-implementation
Fix implementation of ttfb_ms metric
2025-05-30 11:03:42 -04:00
Aleix Conchillo Flaqué
b33897ffb9 SmallWebRTCTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
1c3d3f2f4b WebsocketClientTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
9a5a1edb6b WebsocketServerTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
f2eb869b02 FastAPIWebsocketTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
0c7e3cfcb2 LiveKitTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
24e19db29e TavusTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
bc6d7b7bbd examples(phone-chatbot): don't show dangling tasks in first pipeline 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
cad271068e BaseInputTransport: handle StopFrame and pause pushing frames 2025-05-30 07:51:49 -07:00
Philipp Eisen
3425293115 Ensure correct formatting 2025-05-30 15:41:45 +02:00
Mark Backman
20dbfec3a9 Merge pull request #1876 from m-ods/m-ods/assemblyai-universal-streaming
Update AssemblyAI Streaming STT
2025-05-30 08:55:43 -04:00
marcus-daily
170057a75a Updating dependency version 2025-05-30 12:38:32 +01:00
marcus-daily
b86b761e0b Fixing Yaml syntax 2025-05-30 12:38:32 +01:00
marcus-daily
da0d2f0266 Small WebRTC transport demo app for Android 2025-05-30 12:38:32 +01:00
Martin Schweiger
321ea27c34 changelog entry 2025-05-30 17:15:58 +08:00
Philipp Eisen
b712e6b9aa Switch ttfb metric to use seconds instead 2025-05-30 11:13:26 +02:00
Martin Schweiger
b3652e6527 set vad_force_turn_endpoint to True by default 2025-05-30 17:07:54 +08:00
Philipp Eisen
bc97f397ef Switch to using float instead of int, but keep ms 2025-05-30 10:27:20 +02:00
Martin Schweiger
e5da3f6e68 add tracing 2025-05-30 10:55:19 +08:00
Martin Schweiger
8400539acf set formatted_finals true by default 2025-05-30 10:40:01 +08:00
Martin Schweiger
b5eac8dfed add message to TranscriptionFrame result 2025-05-30 10:39:07 +08:00
Martin Schweiger
ba312b5591 Merge branch 'main' into m-ods/assemblyai-universal-streaming 2025-05-30 10:29:49 +08:00
Martin Schweiger
f23572b318 Merge branch 'main' into m-ods/assemblyai-universal-streaming 2025-05-30 10:11:02 +08:00
Martin Schweiger
db838634e7 fix: double finals bug 2025-05-30 10:00:31 +08:00
Martin Schweiger
7f2e848a5c use FrameProcessor class methods 2025-05-30 09:40:46 +08:00
Martin Schweiger
096e854d50 remove .dict() 2025-05-30 09:31:20 +08:00
Martin Schweiger
3ffe8b3155 remove parse_obj 2025-05-30 09:29:51 +08:00
Mark Backman
a471f49b61 Merge pull request #1889 from pipecat-ai/mb/add-dtmf-aggregator
Add DTMFAggregator
2025-05-29 17:13:28 -04:00
Mark Backman
4d2a02f318 Refactor to create task on StartFrame, also cleanup 2025-05-29 17:10:54 -04:00
Mark Backman
0bec7db03b Emit a BotInterruptionFrame when the first keypress of a sequence is received 2025-05-29 17:07:18 -04:00
Mark Backman
74827f983f Add tests, improve frame timing 2025-05-29 17:07:18 -04:00
Mark Backman
0ed46f457e Add DTMFAggregator 2025-05-29 17:07:17 -04:00
Aleix Conchillo Flaqué
36b731be73 Merge pull request #1915 from pipecat-ai/aleix/uvloop-event-loop
use uvloop as the new event loop on Linux and macOS
2025-05-29 14:06:44 -07:00
Aleix Conchillo Flaqué
62fbdd4e81 Merge pull request #1922 from pipecat-ai/aleix/output-transport-cleanup
output transports cleanup
2025-05-29 14:06:17 -07:00
Aleix Conchillo Flaqué
ca7b0650c2 examples: capture camera or screen. allow setting framerate 2025-05-29 13:16:44 -07:00
Aleix Conchillo Flaqué
67dd146038 output transports cleanup 2025-05-29 13:16:44 -07:00
Aleix Conchillo Flaqué
fb66df2efd use uvloop as the new event loop on Linux and macOS 2025-05-29 11:24:21 -07:00
Aleix Conchillo Flaqué
2395ca0057 Merge pull request #1921 from pipecat-ai/aleix/transcription-frame-result
add STT result field to TranscriptionFrame/InterimTranscriptionFrame
2025-05-29 11:22:38 -07:00
Aleix Conchillo Flaqué
d203789490 add STT result field to TranscriptionFrame/InterimTranscriptionFrame 2025-05-29 11:21:44 -07:00
Aleix Conchillo Flaqué
7ea0e31cd4 Merge pull request #1924 from pipecat-ai/aleix/new-examples-package
add new pipecat.examples package and make runner public
2025-05-29 11:18:22 -07:00
Mark Backman
d3bf13a503 Merge pull request #1917 from pipecat-ai/mb/fix-twilio-chatbot-client
fix: update websocket_client to use FrameProcessorSetup.task_manager
2025-05-29 14:05:27 -04:00
Mark Backman
ea91970499 Update the twilio-chatbot client to work with the updated server, which requires call_sid 2025-05-29 14:02:49 -04:00
Mark Backman
803b3f2cc4 fix: update websocket_client to use FrameProcessorSetup 2025-05-29 14:02:47 -04:00
Aleix Conchillo Flaqué
1788ba6c5c add new pipecat.examples package and make runner public 2025-05-29 10:43:12 -07:00
Philipp Eisen
5209bd3d9f Fix implemetation of ttfb_ms metric 2025-05-29 19:24:04 +02:00
Aleix Conchillo Flaqué
cb9178f1ec Merge pull request #1914 from pipecat-ai/aleix/base-output-daily-handle-dtmf-frames
output transport can now handle DTMF keypress
2025-05-29 09:35:01 -07:00
Aleix Conchillo Flaqué
5676920a6a BaseOutputTransport/DailyTransport: allow sending DTMF keypress 2025-05-29 09:33:29 -07:00
Aleix Conchillo Flaqué
513221d9fd added OutputDTMFUrgentFrame 2025-05-29 09:32:57 -07:00
Mark Backman
a33d0b4b53 Merge pull request #1904 from pipecat-ai/mb/aws-bedrock-no-op-tool
Add no_op tool to AWSBedrockLLMService
2025-05-29 10:29:19 -04:00
Mark Backman
bee242b781 Safer check when using the no_operation tool 2025-05-29 10:25:20 -04:00
Mark Backman
fa1c98ff29 Add no_op tool to AWSBedrockLLMService 2025-05-29 10:25:19 -04:00
Mark Backman
ae3a7d9bed Merge pull request #1920 from alexflorensa/alexflorensa/set-deepgram-model-name
fix(deepgram-stt): set model name to Deepgram STT
2025-05-29 09:41:56 -04:00
Matej Marinko
ee5fea4221 Fix auto finalization cycle 2025-05-29 14:58:35 +02:00
Matej Marinko
db7b60cfe9 Auto finalize fix 2025-05-29 13:24:53 +02:00
alexflorensa
0c2efb312c fix(deepgram-stt) Set model name to Deepgram STT 2025-05-29 12:08:16 +02:00
Matej Marinko
51b79bd6a1 Minor code style changes 2025-05-29 10:11:11 +02:00
Matej Marinko
95fe762776 Fix typo 2025-05-29 09:23:37 +02:00
Aleix Conchillo Flaqué
cf8eeaab0b Merge pull request #1909 from pipecat-ai/aleix/daily-expose-transcription-functions
DailyTransport: expose start_transcription/stop_transcription
2025-05-28 17:06:40 -07:00
Vanessa Pyne
2f8cb3ce76 Merge pull request #1804 from pipecat-ai/vp-text-webrtc-chatbot
add text chatbot example using small webrtc transport
2025-05-28 14:25:42 -05:00
vipyne
821da723c0 update SmallWebRTCTransport text examples with new run_example 2025-05-28 13:54:45 -05:00
Filipi Fuchter
575b97ba60 Some improvements and cleanups in the SmallWebRTCTransport text examples. 2025-05-28 13:54:45 -05:00
vipyne
cc0819b709 examples: add text and audio over webrtc transport
update filenames

add high level comments to 41* examples
2025-05-28 13:54:45 -05:00
vipyne
318d6f042b examples: add text chatbot example using small webrtc transport
examples: send small webrtc UI updates in RTVIServerMessageFrame

add explanation about RTVI server messages being specific to
small-webrtc-prebuilt UI
2025-05-28 13:54:45 -05:00
Aleix Conchillo Flaqué
05ae3a1703 DailyTransport: expose start_transcription/stop_transcription 2025-05-28 11:21:26 -07:00
Aleix Conchillo Flaqué
8e54805e62 Merge pull request #1908 from pipecat-ai/aleix/add-manifest-file-to-reduce-sdist
add MANIFEST.in to reduce sdist tarball size
2025-05-28 10:15:38 -07:00
Aleix Conchillo Flaqué
64399a72f3 add MANIFEST.in to reduce sdist tarball size 2025-05-28 10:09:39 -07:00
Aleix Conchillo Flaqué
6c33f0b0bd Merge pull request #1907 from pipecat-ai/aleix/pipecat-0.0.68
update CHANGELOG for pipecat 0.0.68
2025-05-28 09:41:34 -07:00
Aleix Conchillo Flaqué
aca304b395 update CHANGELOG for pipecat 0.0.68 2025-05-28 09:08:08 -07:00
Aleix Conchillo Flaqué
db5c9e67be Merge pull request #1906 from pipecat-ai/aleix/daily-python-0.19.1
pyproject: update daily-python to 0.19.1
2025-05-28 09:05:13 -07:00
Aleix Conchillo Flaqué
2313cec792 pyproject: update daily-python to 0.19.1 2025-05-28 00:36:40 -07:00
Matej Marinko
2968c846ce Add Soniox STT service 2025-05-28 09:35:21 +02:00
Aleix Conchillo Flaqué
83acaf692a Merge pull request #1890 from pipecat-ai/aleix/examples-multi-transport
add support for multiple transports to foundational examples
2025-05-28 00:27:01 -07:00
Aleix Conchillo Flaqué
e9aeb2662b scripts: allow specifying a name for the test run 2025-05-28 00:22:55 -07:00
Aleix Conchillo Flaqué
356f4039e4 scripts: allow storing logs for release evals 2025-05-27 21:10:22 -07:00
Aleix Conchillo Flaqué
736c7f1f30 scripts: allow storing audio for release evals 2025-05-27 18:09:25 -07:00
Aleix Conchillo Flaqué
2994448036 introduce release evals
This is an initial attempt to implement evals for all (or most) of our
foundational examples. Before we release, we want to make sure all of them work
and reply properly. Until now this has been done manually, hopefully this will
be useful to speed up our release process.
2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
d476d9ea05 examples: remove "on_client_closed"
This has been replaced for "on_client_disconnected" in SmallWebRTCTransport to
match other transports and therefore it is not necessary anymore.
2025-05-27 17:42:52 -07:00
Filipi Fuchter
bf31bce440 Updated SmallWebRTCTransport to align with how other transports handle on_client_disconnected 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
6393e89022 examples(foundational): update handle_signint depending on transport 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
884268fce3 examples(foundational): allow running examples with twilio 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
071a9307c9 transport(websocket): do not require a frame serializer 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
2cdfaa0a82 examples(foundational): support multiple transports 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
ecf878e14d DailyTransport: allow requesting video frames with any framerate 2025-05-27 17:42:50 -07:00
Aleix Conchillo Flaqué
4eed335bc7 PipelineTask: check if pipeline has already been cancelled 2025-05-27 17:40:39 -07:00
Aleix Conchillo Flaqué
2e57bb74d2 BaseObject: do not raise exception if event handler not registered 2025-05-27 17:40:35 -07:00
Aleix Conchillo Flaqué
0a39769cd0 Merge pull request #1901 from pipecat-ai/aleix/deepgram-sdk-4.1.0
pyproject: update deepgram-sdk to 4.1.0
2025-05-27 17:20:54 -07:00
Aleix Conchillo Flaqué
bdb6a9e5d1 Merge pull request #1903 from pipecat-ai/aleix/openpipe-4.50.0
pyproject: update openpipe to 4.50.0
2025-05-27 17:20:30 -07:00
Aleix Conchillo Flaqué
f88e0eb96d pyproject: update openpipe to 4.50.0 2025-05-27 15:22:55 -07:00
Aleix Conchillo Flaqué
0099f60d29 deepgram: fix an issue with user provided LiveOptions 2025-05-27 15:19:17 -07:00
Filipi da Silva Fuchter
eaf9f20c56 Merge pull request #1898 from pipecat-ai/tavus_transport_fixes
Fixing TavusTransport with some TTS services.
2025-05-27 16:33:19 -03:00
Aleix Conchillo Flaqué
e987c4741a pyproject: update deepgram-sdk to 4.1.0 2025-05-27 10:32:48 -07:00
Aleix Conchillo Flaqué
6242278abd Merge pull request #1895 from pipecat-ai/aleix/more-avoid-mutable-default-values
more avoiding mutable default constructor values
2025-05-27 10:08:42 -07:00
Mark Backman
30850a431a Merge pull request #1886 from pipecat-ai/mb/add-otel-llm-output
Add LLM response tracing to OTel tracing
2025-05-27 11:58:57 -04:00
Mark Backman
1e7407c042 Merge pull request #1899 from pipecat-ai/mb/genai-to-standard-messages
fix: Update GoogleLLMContext to_standard_messages to be compatible with google-genai
2025-05-27 11:57:44 -04:00
Mark Backman
6d94f31ff2 Update foundational example 33 to work with google-genai 2025-05-27 11:20:21 -04:00
Mark Backman
ebb3d1cfd3 fix: Update GoogleLLMContext to_standard_messages to be compatible with google-genai 2025-05-27 11:20:21 -04:00
Filipi Fuchter
acce9489d7 Creating the silence based on the chunk size. 2025-05-27 11:26:34 -03:00
Filipi Fuchter
3d442620f9 Removing not used imports. 2025-05-27 10:42:06 -03:00
Martin Schweiger
f1d7eb8565 final touches 2025-05-27 21:29:50 +08:00
Filipi Fuchter
798b935ff6 The remaining audio should not be sent as done. 2025-05-27 10:28:40 -03:00
Filipi Fuchter
3039a1444e Refactoring the TavusVideoService to match the same the behavior of the bot started speaking and bot stopped speaking. 2025-05-27 10:26:41 -03:00
Mark Backman
aa7d15beb3 fix: move LLM call outside tracing try block to prevent double execution 2025-05-26 22:06:31 -04:00
Filipi Fuchter
2b3d2cb342 Fixing the issue when using the TavusVideoService with some TTS services. 2025-05-26 22:55:20 -03:00
Filipi Fuchter
5a58357429 Fixing the issue when using the TavusTransport with some TTS services. 2025-05-26 18:34:52 -03:00
Mark Backman
366add2536 Merge pull request #1878 from pipecat-ai/mb/add-plivo-serializer
Add PlivoFrameSerializer
2025-05-26 11:07:13 -04:00
Mark Backman
e13c9fd42e Add PlivoFrameSerializer 2025-05-26 11:00:01 -04:00
Mark Backman
2a6c01f634 Merge pull request #1885 from pipecat-ai/mb/update-langfuse-example 2025-05-26 10:14:02 -04:00
Mark Backman
bf29722e78 Merge pull request #1884 from pipecat-ai/mb/readme-otel-twilio-telnyx 2025-05-26 10:13:23 -04:00
Mark Backman
db227ad15f Merge pull request #1897 from pipecat-ai/mb/fix-websocket-client-example
Fix mismatched html tag in websocket client example
2025-05-26 09:26:53 -04:00
Mark Backman
514716042b Fix mismatched html tag in websocket client example 2025-05-26 08:25:03 -04:00
Aleix Conchillo Flaqué
7a767e680c more avoiding mutable default constructor values 2025-05-25 21:00:35 -07:00
Martin Schweiger
320b52eb1e Merge branch 'm-ods/assemblyai-universal-streaming' of https://github.com/m-ods/pipecat into m-ods/assemblyai-universal-streaming 2025-05-26 09:13:08 +08:00
Martin Schweiger
428cee75c5 Add User-Agent header to AssemblyAI websocket connection 2025-05-26 09:10:55 +08:00
Martin Schweiger
5479a55b2c Add websockets dependency to assemblyai extra 2025-05-26 09:08:56 +08:00
Mark Backman
d1f2a5d04f Merge pull request #1868 from aristid/google-streaming-tts
Add Google streaming TTS as a base TTS service
2025-05-24 12:42:47 -04:00
aristid
09ba319f3e Merge branch 'main' into google-streaming-tts 2025-05-24 17:16:22 +02:00
ezun-kim
3da711ba8b Fix SSE server connection handling for MCP client
### Summary
This PR improves the MCP (Model Context Protocol) client's SSE (Server-Sent Events) server connection handling by replacing the generic string parameter with a proper `SseServerParameters` class.

### Changes
- **Breaking Change**: Changed `server_params` type from `Union[StdioServerParameters, str]` to `Union[StdioServerParameters, SseServerParameters]`
- Added import for `SseServerParameters` from `mcp.client.session_group`
- Updated SSE client connection to use structured parameters instead of a simple URL string
- Fixed error message to correctly reflect the expected parameter types
- Improved logging by changing info-level log to debug-level for consistency

### Details

#### Before
The SSE client connection only accepted a URL string:
```python
async with self._client(self._server_params) as (read, write):
```

#### After
Now properly unpacks SSE server parameters:
```python
async with self._client(
    url=self._server_params.url,
    headers=self._server_params.headers,
    timeout=self._server_params.timeout,
    sse_read_timeout=self._server_params.sse_read_timeout
) as (read, write):
```

### Benefits
- **Type Safety**: Stronger type checking with dedicated `SseServerParameters` class
- **Extended Configuration**: Support for custom headers (authentication), timeouts, and SSE-specific settings
- **Better Error Messages**: Clear type error messages when incorrect parameters are provided
- **Improved Debugging**: Debug logging of SSE server parameters for troubleshooting

### Migration Guide
Users need to update their SSE server initialization:
```python
# Before
client = MCPClient("https://example.com/sse")

# After
from mcp.client.session_group import SseServerParameters
client = MCPClient(SseServerParameters(
    url="https://example.com/sse",
    headers={"Authorization": "Bearer token"},
    timeout=30,
    sse_read_timeout=60
))
```

### Testing
- [ ] Tested with StdioServerParameters (unchanged behavior)
- [ ] Tested with SseServerParameters with various configurations
- [ ] Verified error handling for invalid parameter types

---

This is a necessary change to support production-ready SSE connections with proper authentication and timeout handling.
2025-05-24 22:35:57 +09:00
Mark Backman
6f524fb816 Add LLM response to OTel tracing 2025-05-24 09:15:39 -04:00
unknown
d3e2a9e5c0 Change default voice and fix formatting 2025-05-24 15:14:39 +02:00
Mark Backman
b4cd7d7941 Langfuse OTel env.example improvements 2025-05-24 08:17:10 -04:00
Mark Backman
cd03b91115 Update README: Add OTel, Add serializers 2025-05-24 07:23:59 -04:00
Aleix Conchillo Flaqué
f86d002ceb Merge pull request #1881 from pipecat-ai/aleix/daily-input-audio-and-video-task-fix
daily input audio and video task fix
2025-05-23 19:39:25 -07:00
Aleix Conchillo Flaqué
940926b5ec TavusVideoService: no need to enable audio/video outputs 2025-05-23 19:29:34 -07:00
Aleix Conchillo Flaqué
85c096df0b DailyTransport: create audio/video input tasks when input flag is enabled 2025-05-23 19:28:18 -07:00
Filipi da Silva Fuchter
76d93522ac Merge pull request #1820 from pipecat-ai/tavus_video_service
Tavus improvements
2025-05-23 23:11:00 -03:00
Filipi Fuchter
31492831cc Updating the changelog and readme to reflect the Tavus changes. 2025-05-23 23:04:04 -03:00
Filipi Fuchter
8221dd594e Creating a Tavus example using the DailyTransport. 2025-05-23 23:03:40 -03:00
Filipi Fuchter
6346ca1a84 Creating a Tavus example using the SmallWebRTCTransport. 2025-05-23 23:03:24 -03:00
Filipi Fuchter
4a3404883f Creating a Tavus example using the new TavusTransport. 2025-05-23 23:03:16 -03:00
Filipi Fuchter
1ebca35313 Queuing the app messages if the SmallWebrtcTransport is not connected yet. 2025-05-23 23:03:04 -03:00
Filipi Fuchter
e0d1381f87 Refactoring the TavusVideoService to allow to work with any transport. 2025-05-23 23:02:49 -03:00
Filipi Fuchter
86e6841569 Creating TavusTransport and TavusTransportClient. 2025-05-23 23:02:37 -03:00
Aleix Conchillo Flaqué
28b7a92a00 Merge pull request #1880 from pipecat-ai/aleix/daily-resize-event-loop-fix
BaseOutputTransport: don't block event loop during image resize
2025-05-23 18:32:00 -07:00
Aleix Conchillo Flaqué
4db5b18694 BaseOutputTransport: don't block event loop during image resize 2025-05-23 18:30:28 -07:00
Aleix Conchillo Flaqué
a628e921c0 Merge pull request #1879 from pipecat-ai/aleix/daily-fix-video-task
DailyTransport: fix video task variable
2025-05-23 17:56:08 -07:00
Aleix Conchillo Flaqué
6ca6ff37c9 DailyTransport: fix video task variable 2025-05-23 17:54:25 -07:00
Aleix Conchillo Flaqué
456db3710a Merge pull request #1828 from pipecat-ai/aleix/daily-use-audio-renderers
DailyTransport: replace virtual speaker and microphones
2025-05-23 13:31:51 -07:00
Aleix Conchillo Flaqué
50f024c6f9 LiveKitTransport: use UserAudioRawFrame instead of InputAudioRawFrame 2025-05-23 11:27:53 -07:00
Aleix Conchillo Flaqué
a4de75a8c0 Merge pull request #1867 from pipecat-ai/aleix/user-bot-latency-log-observer
observers: added UserBotLatencyLogObserver
2025-05-23 09:23:03 -07:00
Aleix Conchillo Flaqué
88e8fcdaca observers: added UserBotLatencyLogObserver 2025-05-23 09:17:53 -07:00
unknown
bfe9952c9a Remove sleep(0), add doc string etc. 2025-05-23 12:11:08 +02:00
Martin Schweiger
7f568e3e7e Merge branch 'main' into m-ods/assemblyai-universal-streaming 2025-05-23 17:39:00 +08:00
Martin Schweiger
9b8800ac1d update stt.py 2025-05-23 17:32:31 +08:00
Martin Schweiger
fd53712567 add models for new streaming service 2025-05-23 17:32:12 +08:00
Aleix Conchillo Flaqué
7f74c2465c SileroVADAnalyzer: improve non-matching sample rate log 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
30d67a78eb examples(chatbot-audio-recording): use same sample rate to avoid downsampling 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
c3cfd1f0ce DailyTransport: process audio, video and event callbacks in separate tasks 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
69ac70eed8 DailyTransport: replace virtual microphone with custom microphone track 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
fcf49e79cc DailyTransport: use participant audio renderers instead of virtual speaker 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
8d4894846d pyproject: update to daily-python 0.19.0 2025-05-23 01:47:09 -07:00
Nischal Jain
ffa16dd136 added deepwiki badge for weekly repo refresh 2025-05-22 20:08:48 -07:00
Vanessa Pyne
a809b710c5 Merge pull request #1844 from pipecat-ai/vp-docsinreadme
add docs link at top of readme
2025-05-22 21:52:18 -05:00
vipyne
f6289e9db2 add docs link at top of readme 2025-05-22 21:51:29 -05:00
Mark Backman
26b4c4df22 Merge pull request #1870 from pipecat-ai/mb/gemini-2.5-flash-update
Update GeminiMultimodalLiveLLMService to use Gemini 2.5 Flash Native …
2025-05-22 18:19:55 -04:00
Mark Backman
f3a9844295 Merge pull request #1860 from pipecat-ai/mb/organize-otel-demos
Reorganize OpenTelemetry demos, add top-level README
2025-05-22 18:15:20 -04:00
Mark Backman
692821bdae Merge pull request #1873 from pipecat-ai/mb/readme-sarvam
Add SarvamTTSService to README
2025-05-22 18:14:40 -04:00
Mark Backman
ee143d5b3a Update GeminiMultimodalLiveLLMService to use Gemini 2.5 Flash Native Audio Dialog model 2025-05-22 18:13:41 -04:00
Mark Backman
7e178a634a Merge pull request #1871 from pipecat-ai/mb/claude-sonnet-4-update
Update default model for Anthropic to Claude Sonnet 4
2025-05-22 18:12:47 -04:00
Mark Backman
fe88a3d80b Add SarvamTTSService to README 2025-05-22 18:11:11 -04:00
Mark Backman
a196eac290 Merge pull request #1872 from pipecat-ai/mb/add-sarvam-tts
Add SarvamTTSService
2025-05-22 18:02:36 -04:00
Mark Backman
3c819955a2 Add SarvamTTSService 2025-05-22 16:23:08 -04:00
Mark Backman
ca0d7bbbed Update default model for Anthropic to Claude Sonnet 4 2025-05-22 15:13:33 -04:00
Mark Backman
f93bd1e817 Merge pull request #1864 from pipecat-ai/mb/fix-11lab-set-model-voice
Fix: ElevenLabsTTSService, change voice and model
2025-05-22 14:36:24 -04:00
Mark Backman
415bc6ca0a Fix: ElevenLabsTTSService, change voice and model 2025-05-22 14:28:50 -04:00
Mark Backman
8543c8d11d Merge pull request #1869 from pipecat-ai/mb/update-readme-nova-sonic
Add AWS Nova Sonic to README
2025-05-22 14:07:35 -04:00
Mark Backman
bf5ad64575 Add AWS Nova Sonic to README 2025-05-22 14:03:28 -04:00
unknown
d42d02d809 Add Google streaming TTS as a base TTS service. Rename non-streaming service to GoogleHttpTTSService. 2025-05-22 11:21:06 +02:00
Aleix Conchillo Flaqué
0718f79ff2 Merge pull request #1866 from pipecat-ai/aleix/base-observers-are-base-objects
BaseObserver: inherit from BaseObject so we can have events
2025-05-21 16:07:38 -07:00
Aleix Conchillo Flaqué
9bbce225ce BaseObserver: inherit from BaseObject so we can have events 2025-05-21 16:04:44 -07:00
Mark Backman
fb35fd6d71 Merge pull request #1859 from pipecat-ai/mb/otel-attribute-naming
Update OTel attribute names
2025-05-21 12:10:15 -04:00
Mark Backman
b4fd92aed6 Merge pull request #1862 from marctorsoc/clean-links-in-md-text-filter
Add link cleaning in MD text filter
2025-05-21 09:20:27 -04:00
Mark Backman
36931825b3 Merge pull request #1854 from sklinglernv/fix/elevenlab-tts
fix(elevenlabs tts): message parameter naming
2025-05-21 09:17:29 -04:00
marc.torsoc
ca35299dcd add link cleaning and a test for it 2025-05-21 12:08:53 +02:00
Severin Klingler
e74b900914 revert most of the changes except keyword naming fix 2025-05-21 09:24:03 +02:00
Mark Backman
25115668a7 Reorganize OpenTelemetry demos, add top-level README 2025-05-20 23:30:46 -04:00
Mark Backman
fb94db3e64 Update to use GenAI naming 2025-05-20 22:56:02 -04:00
Mark Backman
c4778e770e Merge pull request #1835 from marcklingen/langfuse-tracing
Add examples/open-telemetry-tracing-langfuse
2025-05-20 18:22:55 -04:00
Mark Backman
3860cdf97b Update OTel attribute names 2025-05-20 18:00:46 -04:00
Aleix Conchillo Flaqué
f3aec0c4ac Merge pull request #1829 from pipecat-ai/aleix/pipeline-task-add-observer
PipelineTask: add add_observer()
2025-05-20 13:18:24 -07:00
Aleix Conchillo Flaqué
d333094149 PipelineTask: add add_observer() and remove_observer() 2025-05-20 13:16:06 -07:00
Aleix Conchillo Flaqué
609ff4e66c Merge pull request #1841 from pipecat-ai/aleix/base-text-aggregator-async
make BaseTextAggregator and BaseTextFilter functions async
2025-05-20 13:13:54 -07:00
Aleix Conchillo Flaqué
cbccbcd9e7 BaseTextFilter: make functions async 2025-05-20 13:11:44 -07:00
Aleix Conchillo Flaqué
54b1d7fcc1 BaseTextAggregator: make functions async 2025-05-20 13:11:42 -07:00
Aleix Conchillo Flaqué
54388c0d9b Merge pull request #1850 from pipecat-ai/aleix/transcription-message-user-id
TranscriptionMessage: add user_id field
2025-05-20 13:10:42 -07:00
Aleix Conchillo Flaqué
228c866aaa Merge pull request #1857 from pipecat-ai/aleix/avoid-mutable-default-values
avoid mutable default constructor values
2025-05-20 13:10:24 -07:00
Aleix Conchillo Flaqué
a09bd648af avoid mutable default constructor values 2025-05-20 11:59:28 -07:00
Vanessa Pyne
3e4ae61c75 Merge pull request #1856 from pipecat-ai/vp-mcp-debug
mcp: fix typo in tool call response
2025-05-20 13:59:11 -05:00
vipyne
7655c432c2 mcp: fix typo in tool call response 2025-05-20 11:16:59 -05:00
Aleix Conchillo Flaqué
25dd651757 TranscriptionMessage: add user_id field 2025-05-19 15:47:54 -07:00
Mark Backman
462aecea3e Merge pull request #1839 from pipecat-ai/mb/cartesia-speed
Add support for Cartesia's speed parameter, update clients and APIs, deprecate emotion
2025-05-19 17:57:25 -04:00
Mark Backman
5f37df790b Merge pull request #1848 from pipecat-ai/mb/fix-word-wrangler-transport-params
Fix: Add audio_in_enabled to Word Wrangler TransportParams
2025-05-19 17:52:05 -04:00
Mark Backman
8e4e03541c Update CHANGELOG 2025-05-19 17:51:27 -04:00
Aleix Conchillo Flaqué
c1252fc7eb Merge pull request #1840 from pipecat-ai/aleix/base-object-dont-create-tasks
BaseObject: don't create event handler tasks if none is registered
2025-05-19 14:12:31 -07:00
Mark Backman
ed1077cc9a Fix: Add audio_in_enabled to Word Wrangler TransportParams 2025-05-19 15:53:29 -04:00
Mark Backman
4c761a7b22 Merge pull request #1847 from pipecat-ai/mb/update-otel
Keep span identifiers in attributes only
2025-05-19 14:37:42 -04:00
Mark Backman
9bc3df7803 Keep span identifiers in attributes only 2025-05-19 12:25:13 -04:00
Aleix Conchillo Flaqué
5e5060a6fe BaseObject: don't create event handler tasks if none is registered 2025-05-19 09:24:56 -07:00
Aleix Conchillo Flaqué
2b66eddaa1 Merge pull request #1830 from pipecat-ai/aleix/pipeline-task-frame-events
PipelineTask: add new started/stopped/ended/cancelled events
2025-05-19 08:32:28 -07:00
Mark Backman
916b9d6c6d Add an example for CartesiaHttpTTSService 2025-05-19 11:31:47 -04:00
Mark Backman
bd09ccd608 Update CartesiaHttpTTSService to work with the new cartesia 2.0 client 2025-05-19 11:31:28 -04:00
Mark Backman
682f8e4d45 Bump the cartesia_version for CartesiaTTSService, and cartesia package for CartesiaHttpTTSService 2025-05-19 11:10:03 -04:00
Mark Backman
c9d0af9ee0 Deprecate emotion, add new speed parameter 2025-05-19 09:43:24 -04:00
Severin Klingler
e1299d59bf fix(elevenlabs tts): Fix message paramter naming and make use of contexts to send out TTSStoppedFrames() 2025-05-19 15:22:13 +02:00
Mark Backman
61da6437ea Merge pull request #1834 from pipecat-ai/mb/gemini-live-tokens
Fix: Make LLMTokenUsage more robust
2025-05-19 09:04:07 -04:00
Marc Klingen
798705469b Update README.md 2025-05-18 21:11:20 +02:00
Marc Klingen
459a753de3 add reference to main otel example 2025-05-18 19:56:12 +02:00
Marc Klingen
1092ce70b3 add video of langfuse trace 2025-05-18 19:46:38 +02:00
Marc Klingen
9511c189bd revert original folder 2025-05-18 19:42:13 +02:00
Marc Klingen
66fea9e2ee create new example folder 2025-05-18 19:41:17 +02:00
Marc Klingen
69ae83516e use http exporter 2025-05-18 19:11:06 +02:00
Mark Backman
144ea36c81 Fix: Make LLMTokenUsage more robust 2025-05-18 07:41:16 -04:00
Mark Backman
7a8ab9a900 Merge pull request #1672 from golbin/main
Use "use_original_timestamps" only for sonic-2 model
2025-05-18 07:24:58 -04:00
Jin Kim
c4b35055b4 Update CHANGELOG.md 2025-05-18 16:54:29 +09:00
Jin Kim
a4c04e7c17 Opt out Sonic models from use_original_timestamps 2025-05-18 16:52:37 +09:00
Jin Kim
a6f7e7fc30 Merge branch 'pipecat-ai:main' into main 2025-05-18 16:48:24 +09:00
Aleix Conchillo Flaqué
d5ebc883b3 PipelineTask: add new started/stopped/ended/cancelled events 2025-05-17 22:46:22 -07:00
Mark Backman
deb43df0a4 Merge pull request #1824 from pipecat-ai/mb/gemini-live-transcribe-user-audio
Update GeminiMultimodalLiveLLMService to use Gemini's user transcription
2025-05-16 22:51:04 -04:00
Mattie Ruth
88e472b3f1 Update Modal Readme (#1825) 2025-05-16 17:40:57 -04:00
Mark Backman
f59fb8167d Merge pull request #1784 from thsunkid/thu/handle-transcript-gpt4o-audio
Handle audio transcript from gpt-4o-audio and clean up logs
2025-05-16 13:20:16 -04:00
Mark Backman
fac6f526f7 Add comments and docstrings 2025-05-16 10:54:50 -04:00
Mark Backman
2f78d74ce6 Change Gemini Live to use Gemini provided usage metrics 2025-05-16 10:53:01 -04:00
Mark Backman
d3942dda52 Gemini Live to transcribe user audio 2025-05-16 10:53:01 -04:00
Mark Backman
c00e9a8d3a Merge pull request #1819 from kaikato/lmnt-model-langs
LmntTTSService: add model param and additional languages
2025-05-16 08:49:55 -04:00
kaikato
c3b95767f3 LmntTTSService: add model param and additional languages 2025-05-16 04:24:57 +00:00
Mark Backman
90f27a3090 Merge pull request #1816 from pipecat-ai/mb/add-minimax-tts
Add MiniMax TTS
2025-05-15 18:05:13 -04:00
Mark Backman
b6f09defc9 Add MiniMax TTS 2025-05-15 18:02:29 -04:00
Aleix Conchillo Flaqué
172813bcfb Merge pull request #1815 from pipecat-ai/aleix/remove-silerovad-processor
remove SileroVAD() frame processor
2025-05-15 13:44:44 -07:00
Aleix Conchillo Flaqué
95c25efab7 remove SileroVAD() frame processor 2025-05-15 11:55:20 -07:00
Aleix Conchillo Flaqué
a51af35024 Merge pull request #1814 from pipecat-ai/aleix/examples-dependabot-05142025
examples: updates for dependabot 05/14/2025
2025-05-15 11:38:45 -07:00
Mark Backman
119fd5ba7d Merge pull request #1025 from fatwang2/main
added hailuo tts service
2025-05-15 14:29:24 -04:00
Aleix Conchillo Flaqué
0718a812bd examples: updates for dependabot 05/14/2025 2025-05-14 22:51:08 -07:00
Mark Backman
3814501b48 Merge pull request #1811 from pipecat-ai/mb/dont-require-tracing-dep
Fix: Resolve an issue where tracing imports were required
2025-05-14 12:35:47 -04:00
Mark Backman
7a5205dbda Fix: Resolve an issue where tracing imports were required 2025-05-14 12:29:08 -04:00
Thu Nguyen
15a5028d23 Revert log changes 2025-05-14 22:28:25 +08:00
Thu Nguyen
fee2648ac0 Handle audio transcript from gpt-4o-audio and clean up logs 2025-05-14 13:02:22 +07:00
Varun Singh
04c02c9a20 Merge pull request #1810 from pipecat-ai/vr000m-receiving-custom-sip-headers
added handling for sipHeaders
2025-05-13 23:02:14 -07:00
getchannel
e27da96cdc Rename file_api to file_api.py
added proper .py to file name.
2025-05-13 22:01:02 -04:00
Varun Singh
0ff7195a83 Update README.md
updating docs
2025-05-13 19:08:43 -04:00
Varun Singh
3b91aa013a added handling for sipHeaders 2025-05-13 16:00:05 -07:00
Mark Backman
50f6235edb Add support for OpenTelemetry tracing (#1729)
* Also added TurnTrackingObserver, TurnTraceObserver, foundational 29, open-telemetry-example
2025-05-13 17:18:11 -04:00
Aleix Conchillo Flaqué
6f4d94f91b Merge pull request #1800 from pipecat-ai/aleix/frame-processors-setup
introduce frame processors setup
2025-05-13 13:18:06 -07:00
Aleix Conchillo Flaqué
83a4c7d443 RTVIProcessor: remove unused code 2025-05-13 11:26:37 -07:00
Aleix Conchillo Flaqué
8171fec925 SmallWebRTCConnection: complain if av package not found 2025-05-13 11:26:37 -07:00
Aleix Conchillo Flaqué
175f352ea7 add FrameProcessor.setup() to setup processors before StartFrame 2025-05-13 11:26:35 -07:00
Filipi da Silva Fuchter
5290161ac4 Merge pull request #1746 from pipecat-ai/simple_chatbot-react-native
Simple chatbot: React Native client
2025-05-13 10:48:09 -03:00
Filipi Fuchter
8762019ed7 Not setting the local audio level when the user stopped speaking. 2025-05-13 10:46:30 -03:00
Filipi Fuchter
61a59fa158 Fixing useNavigation typescript warning. 2025-05-13 10:36:39 -03:00
Filipi Fuchter
55eea20c8e Renaming expo environment variable 2025-05-13 10:32:27 -03:00
kompfner
9a621f0c54 Merge pull request #1805 from pipecat-ai/pk/aws-nova-sonic-aggregate-user-transcription-text
AWS Nova Sonic service - aggregate user transcription text; it was fr…
2025-05-13 09:13:58 -04:00
Paul Kompfner
55fc24e933 AWS Nova Sonic service - aggregate user transcription text; it was fragmented across many conversation history messages before 2025-05-13 09:13:28 -04:00
Filipi da Silva Fuchter
b14608f09b Merge pull request #1799 from pipecat-ai/daily_audio_source
Using audio source for capturing Daily's participant audio
2025-05-13 08:15:10 -03:00
Mark Backman
4a25c57337 Merge pull request #1806 from pipecat-ai/aleix/run-test-observers
tests: allow passing observers to run_test()
2025-05-12 22:10:44 -04:00
Aleix Conchillo Flaqué
f800e35ccb tests: allow passing observers to run_test() 2025-05-12 17:53:02 -07:00
Vanessa Pyne
12d49a9b9d Merge pull request #1801 from pipecat-ai/vp-fix-typo
update examples
2025-05-12 15:33:56 -05:00
vipyne
b25b251a44 update examples 2025-05-12 14:07:17 -05:00
Mattie Ruth
64b2a75a94 Update Modal App: (#1755)
* Update Modal App:

Updated Modal App to include:

1. Latest Modal API usage
2. Ability to launch different Pipecat pipelines, much like the
   simple chatbot example
3. Ability to choose which pipeline is launched via the
   /connect endpoint
4. Added a pipeline option for connecting to a self-hosted LLM
   on Modal
5. Improved READMEs
6. Added a web client for interacting with the Modal deployment

tmp

* Update README
2025-05-12 12:45:43 -05:00
Aleix Conchillo Flaqué
b33a60f3a5 Merge pull request #1793 from pipecat-ai/khk/deepgram-async-fix
Fix Deepgram TTS streaming
2025-05-12 09:59:46 -07:00
Filipi Fuchter
d22dbb1a6d Fixing ruff format. 2025-05-12 10:36:21 -03:00
Filipi Fuchter
983199a6cd New example capturing the audio from the participant using the custom audio source. 2025-05-12 10:18:43 -03:00
Filipi Fuchter
133d7ee33a Fixing the default audio source for capture_participant_audio 2025-05-12 10:16:32 -03:00
Mark Backman
0bd888afc7 Merge pull request #1796 from nikp06/patch-1
Wrong deprecation warning when importing ai_services.py
2025-05-12 09:12:48 -04:00
nikp06
537bd1c58d Update ai_services.py
fix: correct deprecation warning format in ai_services module
2025-05-12 12:01:13 +02:00
Kwindla Hultman Kramer
5ef519fe2c Fix Deepgram TTS to use stream_raw() 2025-05-11 15:40:31 -07:00
Mark Backman
20498fb47f Merge pull request #1790 from AngeloGiacco/angelo/fix-api-key
[elevenlabs tts ] fix api key
2025-05-10 19:16:27 -04:00
Angelo Giacco
b57dfb3b5d fix lint 2025-05-10 16:36:26 +01:00
Angelo Giacco
0355ed4aa1 move api key to ws header 2025-05-10 16:34:01 +01:00
Angelo Giacco
1e76cc7bdc fix: elevenlabs api key 2025-05-10 16:09:20 +01:00
Vanessa Pyne
18c0374126 Merge pull request #1785 from pipecat-ai/vp-small-filenmae-change
39-aws-nova-sonic.py -> 40-aws-nova-sonic.py
2025-05-09 12:19:09 -05:00
Aleix Conchillo Flaqué
7072fba7e7 Merge pull request #1780 from pipecat-ai/aleix/deprecate-google-generativeai
GoogleLLMService: deprecate google-generativeai
2025-05-09 09:18:30 -07:00
Aleix Conchillo Flaqué
3d702a5c39 minor examples cleanup 2025-05-09 09:16:10 -07:00
Aleix Conchillo Flaqué
f31efa42c9 GoogleLLMService: deprecate google-generativeai 2025-05-09 09:14:43 -07:00
getchannel
d86502e79a add file_api __init__.py 2025-05-09 10:53:31 -04:00
getchannel
59c7744590 add FileData class events.py 2025-05-09 10:52:04 -04:00
getchannel
949971dea9 Create file_api 2025-05-09 10:51:24 -04:00
getchannel
cd4a893c65 add FileAPI to gemini.py 2025-05-09 10:50:27 -04:00
vipyne
74b369ff20 39-aws-nova-sonic.py -> 40-aws-nova-sonic.py 2025-05-09 08:30:59 -05:00
Filipi Fuchter
46eed0a59a Bumping to use the latest version of @pipecat-ai/react-native-daily-transport, and removing code not needed. 2025-05-08 18:18:00 -03:00
kompfner
9643296e29 Merge pull request #1779 from pipecat-ai/pk/aws-nova-sonic-missing-params-export
Add missing `Params` export to AWS Nova Sonic module
2025-05-08 16:04:38 -04:00
Paul Kompfner
c83c5b5a34 Add missing Params export to AWS Nova Sonic module 2025-05-08 15:23:25 -04:00
Filipi Fuchter
277e2d7fc0 Merge branch 'main' into simple_chatbot-react-native 2025-05-08 09:03:16 -03:00
Mark Backman
7280e390d9 Merge pull request #1774 from pipecat-ai/mb/moondream-ex-server
Add load_dotenv to moondream example server
2025-05-07 19:02:30 -04:00
Mark Backman
4efc3f0a39 Merge pull request #1775 from pipecat-ai/mb/patient-ex-env
Add load_dotenv to patient-intake server file
2025-05-07 19:02:20 -04:00
Mark Backman
cb7e7a8aa3 Add load_dotenv to patient-intake server file 2025-05-07 18:40:04 -04:00
Mark Backman
9136402846 Add load_dotenv to moondream example server 2025-05-07 18:29:27 -04:00
Aleix Conchillo Flaqué
260fc76137 Merge pull request #1773 from pipecat-ai/aleix/pipecat-0.0.67
update CHANGELOG for 0.0.67
2025-05-07 15:05:55 -07:00
Aleix Conchillo Flaqué
7cfb9a4d15 update CHANGELOG for 0.0.67 2025-05-07 14:59:16 -07:00
Mark Backman
2089e0c974 Merge pull request #1768 from pipecat-ai/mb/update-observers
Add DebugLogObserver
2025-05-07 17:30:49 -04:00
Mark Backman
9e0b4fe5d1 Replace list with tuple 2025-05-07 17:21:09 -04:00
Mark Backman
75ce632f84 Add DebugLogObserver 2025-05-07 17:21:08 -04:00
Mark Backman
efeb96c4e8 Remove unused imports 2025-05-07 17:20:42 -04:00
kompfner
fb5438e9c2 Merge pull request #1770 from pipecat-ai/pk/amazon-nova-sonic-interruption-reliability
AWS Nova Sonic service - make interruption handling more reliable, in…
2025-05-07 17:16:06 -04:00
Mark Backman
7da9f66e1c Merge pull request #1761 from pipecat-ai/mb/elevenlabs-context-id
Update ElevenLabsTTSService to use the new websocket API
2025-05-07 17:12:06 -04:00
Mark Backman
9e16e3d614 Update ElevenLabsTTSService to use the new websocket API 2025-05-07 17:09:52 -04:00
Paul Kompfner
84d040c6d0 AWS Nova Sonic service - make interruption handling more reliable, in terms of:
- not getting the conversation into a "stuck" state
- not losing assistant text that should've made it into the context
2025-05-07 16:34:18 -04:00
Mark Backman
f3e0beb8f1 Merge pull request #1762 from pipecat-ai/iss-1734-rtvi-function-call-breakage
Revert breaking change in RTVI protocol for function calling
2025-05-07 15:25:22 -04:00
Aleix Conchillo Flaqué
e00a1196ef Merge pull request #1767 from pipecat-ai/aleix/daily-python-0.18.2
pyproject: update daily-python to 0.18.2
2025-05-07 12:19:59 -07:00
Aleix Conchillo Flaqué
3867c0f8e7 Merge pull request #1766 from pipecat-ai/aleix/daily-fix-multiple-audio-video-sources
fix multiple audio video sources
2025-05-07 12:19:46 -07:00
Aleix Conchillo Flaqué
cdf0953722 pyproject: update daily-python to 0.18.2 2025-05-07 11:56:36 -07:00
Aleix Conchillo Flaqué
ed00f7d071 add video_source field to UserImageRequestFrame 2025-05-07 11:50:21 -07:00
Aleix Conchillo Flaqué
a3038afa02 DailyTransport: fix multiple audio/video sources 2025-05-07 11:50:00 -07:00
kompfner
f9ca0b8cc6 Merge pull request #1704 from pipecat-ai/pk/amazon-nova-sonic
Amazon Nova Sonic LLM service
2025-05-07 14:45:28 -04:00
Paul Kompfner
2920aa5af4 [WIP] AWS Nova Sonic service - pull AWS Nova Sonic support out of the aws optional dependency in pyproject.toml and into its own aws-nova-sonic optional dependency. That's because it requires Python >= 3.12, a higher version than the base project's 3.10. This change allows anyone using any of the other AWS services (including our own unit tests) to continue using the lower Python version. 2025-05-07 14:32:32 -04:00
Paul Kompfner
93c9cc4a0e [WIP] AWS Nova Sonic service - minor fix 2025-05-07 13:54:06 -04:00
Paul Kompfner
b53f9235e4 [WIP] AWS Nova Sonic service - remove unnecessary _context_available state, instead just relying on the presence of _context 2025-05-07 13:54:06 -04:00
Paul Kompfner
1491462d15 [WIP] AWS Nova Sonic service - remove _handling_bot_stopped_speaking, which no longer seems to be necessary; I'm no longer observing back-to-back BotStoppedSpeaking frames 2025-05-07 13:54:06 -04:00
Paul Kompfner
c78f779800 [WIP] AWS Nova Sonic service - log an error message if you try to use AWS Nova Sonic without the proper dependency (e.g. without having done pip install pipecat-ai[aws]) 2025-05-07 13:54:06 -04:00
Paul Kompfner
b013e375fb [WIP] AWS Nova Sonic service - simplify a bit of logic (and do the same simplification in the OpenAI Realtime service) 2025-05-07 13:54:06 -04:00
Paul Kompfner
52036138c1 [WIP] AWS Nova Sonic service - remove unnecessary (no-op) code 2025-05-07 13:54:06 -04:00
Paul Kompfner
4ba9a42861 [WIP] AWS Nova Sonic service - add more accurate typing 2025-05-07 13:54:06 -04:00
Paul Kompfner
27bff7a759 [WIP] AWS Nova Sonic service - fix comment 2025-05-07 13:54:06 -04:00
Paul Kompfner
896f8d85f7 [WIP] AWS Nova Sonic service - remove out-of-date TODO comment 2025-05-07 13:54:06 -04:00
Paul Kompfner
ed06cdd2c7 [WIP] AWS Nova Sonic service - add CHANGELOG entry 2025-05-07 13:54:02 -04:00
Paul Kompfner
8473647269 [WIP] AWS Nova Sonic service - update persistent-context example to better avoid saving "transitional", as opposed to meaningful, context messages 2025-05-07 13:52:51 -04:00
Paul Kompfner
5579145a06 [WIP] AWS Nova Sonic service - post-rebase, update examples to play nicely with recent pipecat changes 2025-05-07 13:52:51 -04:00
Paul Kompfner
35848d10b3 [WIP] AWS Nova Sonic service - remove various TODO comments 2025-05-07 13:52:51 -04:00
Paul Kompfner
c7e223e85a [WIP] AWS Nova Sonic service - remove print statements in favor of logger 2025-05-07 13:52:51 -04:00
Paul Kompfner
885b2d1d2f [WIP] AWS Nova Sonic service - make parameters configurable 2025-05-07 13:52:51 -04:00
Paul Kompfner
73020be511 [WIP] AWS Nova Sonic service - minor fix: only try to read received JSON if we have it 2025-05-07 13:52:51 -04:00
Paul Kompfner
d388c057c0 [WIP] AWS Nova Sonic service - recover from unwanted disconnection due to an error 2025-05-07 13:52:51 -04:00
Paul Kompfner
c4d0f91a7f [WIP] AWS Nova Sonic service - remove some old code that was accidentally still there, possibly sending a duplicate system instruction 2025-05-07 13:52:51 -04:00
Paul Kompfner
467233be04 [WIP] AWS Nova Sonic service - support multi-line system prompt 2025-05-07 13:52:51 -04:00
Paul Kompfner
2b02d08f4c [WIP] AWS Nova Sonic service - add comments to examples pointing out the us-east-1 is the only supported region so far 2025-05-07 13:52:51 -04:00
Paul Kompfner
9fe265ea64 [WIP] AWS Nova Sonic service - implement ability to persist and load conversations 2025-05-07 13:52:51 -04:00
Paul Kompfner
cc1f4ba81c [WIP] AWS Nova Sonic service - add a hacky way of programmatically triggering an assistant response 2025-05-07 13:52:51 -04:00
Paul Kompfner
3784bdbd27 [WIP] AWS Nova Sonic service - in our hacky direct manipulation of the context, aggregate assistant text rather than recording every chunk as a separate message 2025-05-07 13:52:51 -04:00
Paul Kompfner
4ffdc3b77c [WIP] AWS Nova Sonic service - do hacky direct manipulation of the context for now, since I can't seem to get assistant context aggregation working properly with frames, grr 2025-05-07 13:52:51 -04:00
Paul Kompfner
38c9fa681a [WIP] AWS Nova Sonic service - Protect against back-to-back BotStoppedSpeaking calls, which I've observed 2025-05-07 13:52:51 -04:00
Paul Kompfner
c477039954 [WIP] AWS Nova Sonic service - just for safety, add a short delay after BotStoppedSpeaking before sending LLMFullResponseEndFrame + TTSStoppedFrame, to give a bit of leeway for the LLM to deliver the "FINAL" text block describing what was said 2025-05-07 13:52:51 -04:00
Paul Kompfner
d6ef3d64ac [WIP] AWS Nova Sonic service - fix context problems of double-counting LLM text, and mis-categorizing user text as LLM text 2025-05-07 13:52:51 -04:00
Paul Kompfner
6938152db6 [WIP] AWS Nova Sonic service - fix comment 2025-05-07 13:52:51 -04:00
Paul Kompfner
2154db07f0 [WIP] AWS Nova Sonic service - remove unnecessary error log 2025-05-07 13:52:51 -04:00
Paul Kompfner
5e0803479e [WIP] AWS Nova Sonic service - add send_transcription_frames option 2025-05-07 13:52:51 -04:00
Paul Kompfner
3960c604a4 [WIP] AWS Nova Sonic service - fix empty assistant conversation history item in the context after tool use 2025-05-07 13:52:51 -04:00
Paul Kompfner
394648f1c9 [WIP] AWS Nova Sonic service - fix user utterances not making it into the context 2025-05-07 13:52:51 -04:00
Paul Kompfner
da5c4953d5 [WIP] AWS Nova Sonic service - allow passing in tools into initializer 2025-05-07 13:52:51 -04:00
Paul Kompfner
2b7e1cb5b1 [WIP] AWS Nova Sonic service - add tool calling 2025-05-07 13:52:51 -04:00
Paul Kompfner
f182eafb40 [WIP] AWS Nova Sonic service - add ability to pass in OpenAILLMContext 2025-05-07 13:52:51 -04:00
Paul Kompfner
9f7f42e885 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
9b8bce1914 [WIP] AWS Nova Sonic service - add voice_id 2025-05-07 13:52:51 -04:00
Paul Kompfner
96d05e12fc [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
68c1069548 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
5b64613f65 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
1f9baefba8 [WIP] AWS Nova Sonic service - added stubs for handling interruption and user-started-speaking frames 2025-05-07 13:52:51 -04:00
Paul Kompfner
0c255d2618 [WIP] AWS Nova Sonic service - added TTSTextFrame and reworked/cleaned up some bookkeeping logic 2025-05-07 13:52:51 -04:00
Paul Kompfner
a38206de9c [WIP] AWS Nova Sonic service - added TranscriptionFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
260f7c9b85 [WIP] AWS Nova Sonic service - format 2025-05-07 13:52:51 -04:00
Paul Kompfner
de294caed9 [WIP] AWS Nova Sonic service - added LLMFullResponseStartFrame, LLMTextFrame, and LLMFullResponseEndFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
e40aa4f99a [WIP] AWS Nova Sonic service - added TTSStartedFrame and TTSStoppedFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
b1d413b9be [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
8cbad070ad [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
13569a5a5a [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
d789334a60 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
7668b27fc0 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
6d30f441e8 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
a9e395b366 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
5e5626f04f [WIP] AWS Nova Sonic service 2025-05-07 13:52:47 -04:00
Aleix Conchillo Flaqué
d80aa5b44e Merge pull request #1753 from pipecat-ai/aleix/add-bedrock-support
Add support for Amazon Bedrock LLMs
2025-05-07 09:31:48 -07:00
Aleix Conchillo Flaqué
80ef6dc4de update README with AWS Bedrock and Transcribe 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
458549f7df AWSBedrockLLMService: fix function calling 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
a8405649d0 aws: use AWS prefix for all services 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
ce1a72850b tests: add bedrock context aggregator tests 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
58de381746 AWS: add missing utils 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
bed2e894a2 BedrockLLMService: pull initial system frame from messages 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
b4de98cfb7 AWS: various cleanups (logs, imports...) 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
a4b9db9e07 fix formatting 2025-05-07 09:26:26 -07:00
Adithya Suresh
664111a3c9 Added cache related info to metrics 2025-05-07 09:26:26 -07:00
Adithya Suresh
aa964847f3 System param to be a list 2025-05-07 09:26:26 -07:00
Adithya Suresh
fa5cac7e0a Bug fix in content format 2025-05-07 09:26:26 -07:00
Adithya Suresh
b2b01861b2 Remove model restriction 2025-05-07 09:26:26 -07:00
Adithya Suresh
f014f718eb Restructured STT and enabled prosody tags for generative Polly 2025-05-07 09:26:26 -07:00
Adithya Suresh
05ae8d3ffa Removed OpenAI based context formatting 2025-05-07 09:26:26 -07:00
Adithya Suresh
88c9e08bd8 Updated tools parsing logic 2025-05-07 09:26:26 -07:00
Adithya Suresh
844f61dfea Initial implementation 2025-05-07 09:26:26 -07:00
Tico Ballagas
acb7d597cb Change example to use generative voices 2025-05-07 09:19:49 -07:00
Tico Ballagas
2b18f60261 Initial implementation of AWS Transcribe TTS 2025-05-07 09:19:49 -07:00
mattie ruth backman
5b66133a6c Revert breaking change in RTVI protocol for function calling 2025-05-07 12:08:28 -04:00
Mark Backman
0c5bc6a57a Merge pull request #1760 from WebinarGeek/wg/daily-active-speaker-event
DailyTransport: added on_active_speaker_changed event handler
2025-05-07 11:17:49 -04:00
Mark Backman
7981e00955 Merge pull request #1759 from pipecat-ai/mb/readme-nvidia-riva
Update README with Riva services
2025-05-07 11:13:51 -04:00
Dan Berg
5e39c0cfeb DailyTransport: added on_active_speaker_changed event handler 2025-05-07 15:22:30 +02:00
Mark Backman
a444701929 Update README with Riva services 2025-05-07 09:02:08 -04:00
Mark Backman
f6c1eb5d9d Merge pull request #1757 from pipecat-ai/mb/remove-canonical
Removing CanonicalMetricsService
2025-05-06 21:38:03 -04:00
Mark Backman
a1d46cb26b Removing CanonicalMetricsService 2025-05-06 21:23:23 -04:00
Aleix Conchillo Flaqué
99ab148d88 Merge pull request #1739 from pipecat-ai/aleix/observers-frame-pushed-class
BaseObserver: add FramePushed class and deprecate multiple arguments
2025-05-06 15:29:05 -07:00
Aleix Conchillo Flaqué
d69fa5dba5 update CHANGELOG with UltravoxSTTService fix 2025-05-06 15:26:25 -07:00
Aleix Conchillo Flaqué
0d30b000af BaseObserver: add FramePushed class and deprecated multiple arguments 2025-05-06 15:26:23 -07:00
Mark Backman
e7c0e742d2 Merge pull request #1752 from pipecat-ai/mb/deepgram-tts-aura-2
Update Deepgram TTS default voice to Aura 2 voice
2025-05-06 16:26:26 -04:00
Mark Backman
2aff2dcca3 Merge pull request #1751 from pipecat-ai/mb/11labs-enable_ssml_parsing
Add enable_ssml_parsing and enable_logging to ElevenLabsTTSService
2025-05-06 16:25:20 -04:00
Mark Backman
288f8865c8 Add enable_logging to ElevenLabsTTSService 2025-05-06 12:13:26 -04:00
Mark Backman
8691870bcb Update Deepgram TTS default voice to Aura 2 voice 2025-05-06 11:29:32 -04:00
Mark Backman
e06146c237 Add enable_ssml_parsing to ElevenLabsTTSService 2025-05-06 11:06:57 -04:00
Aleix Conchillo Flaqué
c68e990cda Merge pull request #1748 from pipecat-ai/aleix/task-manager-dictionary
task manager dictionary and cleanup PipelineTask
2025-05-06 07:57:53 -07:00
Aleix Conchillo Flaqué
4583905313 PipelineTask: cleanup if task is cancelled from outside Pipecat 2025-05-05 21:33:21 -07:00
Aleix Conchillo Flaqué
9cc498b1fa TaskManager: use a dictionary instead of a set to store tasks 2025-05-05 21:27:49 -07:00
Mark Backman
b3c5dc4045 Merge pull request #1443 from adithyaxx/anthropic-client-bug-fixes
Handle missing token counts in AsyncAnthropicBedrock client properly
2025-05-05 21:13:11 -04:00
Filipi Fuchter
56ca7360ae Fixing versions 2025-05-05 19:11:59 -03:00
Filipi Fuchter
d5ab3251f0 Bumping the dependencies, updating readme, adding .gitignore. 2025-05-05 18:43:04 -03:00
Filipi Fuchter
915c284420 Fixing readme 2025-05-05 18:32:04 -03:00
Aleix Conchillo Flaqué
3824da7261 Merge pull request #1745 from pipecat-ai/aleix/make-sure-transports-are-ready
only send data to transports after they are really ready
2025-05-05 14:21:36 -07:00
Filipi Fuchter
40154824e8 Creating a RN example for simple-chatbot 2025-05-05 18:17:39 -03:00
Aleix Conchillo Flaqué
855d567b1e only send data to transports after they are really ready 2025-05-05 14:06:58 -07:00
Mark Backman
b323a7bd88 Merge pull request #1742 from pipecat-ai/mb/pcc-krisp-filter
Update pipecat-cloud-example to use Krisp in PCC deployment only
2025-05-05 15:46:12 -04:00
Mark Backman
fa011d0018 Update pipecat-cloud-example to use Krisp in PCC deployment only 2025-05-05 15:09:29 -04:00
Aleix Conchillo Flaqué
e15fa8777a Merge pull request #1737 from CerebriumAI/kyle/fix-ultravox-spacing
[Fix] Ultravox frame spacing issue
2025-05-05 09:34:49 -07:00
Aleix Conchillo Flaqué
2143a6d927 Merge pull request #1732 from pipecat-ai/aleix/daily-remote-custom-tracks
DailyTransport: remove custom tracks before leaving
2025-05-05 08:44:11 -07:00
Aleix Conchillo Flaqué
044e2d3e73 DailyTransport: remove custom tracks before leaving 2025-05-05 08:35:35 -07:00
Kyle Gani
be112ec63f Merge branch 'kyle/fix-ultravox-performance' of github.com:CerebriumAI/pipecat into kyle/fix-ultravox-performance 2025-05-05 17:13:26 +02:00
Kyle Gani
d2f56c4e8f Fix: Spacing issue 2025-05-05 17:13:21 +02:00
Mark Backman
ddc6a9c695 Merge pull request #1670 from pipecat-ai/mb/daily-twilio-sip-example
Add standalone Daily + Twilio SIP example
2025-05-05 10:57:16 -04:00
Mark Backman
2bebdbc371 Merge pull request #1671 from pipecat-ai/khk/rime-arcana
support for rime arcana model
2025-05-05 10:54:50 -04:00
Mark Backman
8b9f1f0608 Add a changelog entry 2025-05-05 10:51:46 -04:00
Kwindla Hultman Kramer
b25f3b2ed2 support for rime arcana model 2025-05-05 10:50:46 -04:00
Mark Backman
a995cf81b6 Merge pull request #1724 from pipecat-ai/mb/demo-fixes
Demo fixes
2025-05-05 08:44:57 -04:00
Aleix Conchillo Flaqué
75d261639f Merge pull request #1726 from pipecat-ai/aleix/pipecat-0.0.66
update CHANGELOG for pipecat 0.0.66
2025-05-02 20:54:57 -07:00
Aleix Conchillo Flaqué
f720d795d0 update CHANGELOG for pipecat 0.0.66 2025-05-02 20:29:51 -07:00
Aleix Conchillo Flaqué
f6fe83e358 Merge pull request #1725 from pipecat-ai/aleix/update-daily-python-0.18.1
update to daily-python 0.18.1
2025-05-02 20:27:50 -07:00
Mark Backman
0513d0b6a8 Update README 2025-05-02 22:44:50 -04:00
Mark Backman
0679bb217d Remove Twilio from phone-chatbot directory 2025-05-02 22:18:50 -04:00
Mark Backman
38bd55e518 Update README 2025-05-02 22:18:50 -04:00
Mark Backman
65c7423280 Add other dial-in event handlers 2025-05-02 22:18:50 -04:00
Mark Backman
f24a85cc94 Add logic to only forward the first on_dialin_ready event 2025-05-02 22:18:50 -04:00
Mark Backman
53887b7c98 Display phone number in WebRTC call 2025-05-02 22:18:50 -04:00
Mark Backman
523c012c38 Use a Twilio asset to ring the phone throughout 2025-05-02 22:18:50 -04:00
Mark Backman
97c28989c1 Add standalone Daily + Twilio SIP example 2025-05-02 22:18:50 -04:00
Mark Backman
c19be6ebb2 Demo fixes 2025-05-02 20:58:10 -04:00
Aleix Conchillo Flaqué
54971a0735 update to daily-python 0.18.1 2025-05-02 17:47:44 -07:00
Mark Backman
4513e81e13 Merge pull request #1723 from pipecat-ai/mb/base-output-bot-speaking-log
Only display the destination in the bot started/stopped speaking log …
2025-05-02 17:32:47 -04:00
Mark Backman
872204b795 Only display the destination in the bot started/stopped speaking log when there is a desintation 2025-05-02 17:29:28 -04:00
Aleix Conchillo Flaqué
a94cbfe6f5 Merge pull request #1722 from pipecat-ai/aleix/base-output-transport-audio-task-fix
BaseOutputTransport: always initialize audio task
2025-05-02 14:26:30 -07:00
Aleix Conchillo Flaqué
7152faafb2 BaseOutputTransport: always initialize audio task
We also use the audio task to also send synchronized images with audio.
2025-05-02 14:23:15 -07:00
Mark Backman
e6aadaccd8 Merge pull request #1721 from pipecat-ai/mb/simli-silent-frames
Fix: SimliVideoService was continuously emitting audio, preventing Bo…
2025-05-02 16:44:39 -04:00
Mark Backman
3a73aa71b8 Merge pull request #1613 from pipecat-ai/mb/improve-storybot-readme
demo: Restructure storytelling-chatbot directory, update README steps…
2025-05-02 16:39:59 -04:00
Mark Backman
814e7509e1 demo: Restructure storytelling-chatbot directory, update README steps, link to vercel demo 2025-05-02 16:37:37 -04:00
Vanessa Pyne
e0cf5ec016 Merge pull request #1705 from pipecat-ai/vp-update-nvidia-models
Riva Service: add magpie-tts-multilingual model
2025-05-02 15:34:23 -05:00
vipyne
667bd32e6a Riva: remove deprecated lines in example 2025-05-02 15:33:10 -05:00
vipyne
b2ecd83706 update CHANGELOG with Riva details 2025-05-02 15:33:10 -05:00
vipyne
b2754117c8 Riva: refactor function_id and model_name 2025-05-02 15:33:10 -05:00
vipyne
6c428c303b update magpie voice 2025-05-02 15:33:10 -05:00
Mark Backman
e7d889a143 Update RivaSTTService to use by default 2025-05-02 15:33:10 -05:00
Mark Backman
da60e7069b Update pyproject.toml to use nvidia-riva-client 2.19.1 2025-05-02 15:33:10 -05:00
Mark Backman
c14406a3b9 Demos use the latest services 2025-05-02 15:33:10 -05:00
Mark Backman
725ab5ec21 Small fixes: No default api_key of None, ParakeetSTTService uses RivaSTTService.InputParams 2025-05-02 15:33:10 -05:00
Mark Backman
daf9d47e58 Update RivaSegmentedSTTService 2025-05-02 15:33:10 -05:00
vipyne
63a65627a2 Riva Service: add magpie-tts-multilingual model 2025-05-02 15:33:10 -05:00
Mark Backman
02c07755b0 Add Changelog entry for PR 1707 2025-05-02 15:33:10 -05:00
Matt Kim
15cbd18acc [Rime] Add phonemizeBetweenBrackets and pauseBetweenBrackets to RimeTTSService (ws)
There is a fix incoming in
2025-05-02 15:33:10 -05:00
Kwindla Hultman Kramer
93c40b87dc small groq updates 2025-05-02 15:33:10 -05:00
Mark Backman
eeaa9f67a1 Fix: SimliVideoService was continuously emitting audio, preventing BotStoppedSpeakingFrame from being sent 2025-05-02 16:32:42 -04:00
Mark Backman
b60691c7b2 Merge pull request #1720 from pipecat-ai/mb/changelog-pr-1707
Add Changelog entry for PR 1707
2025-05-02 16:13:40 -04:00
Mark Backman
2bb1b0b343 Add Changelog entry for PR 1707 2025-05-02 16:09:50 -04:00
Mark Backman
047ef9f86c Merge pull request #1707 from rimelabs/matt/rime/url_param_serialization
[Rime] Add new params to RimeTTSService
2025-05-02 16:08:01 -04:00
Kwindla Hultman Kramer
9a2c603c91 Merge pull request #1711 from pipecat-ai/khk/groq-updates 2025-05-02 12:21:15 -07:00
Filipi da Silva Fuchter
94c4169407 Merge pull request #1717 from pipecat-ai/local_smart_turn_torch
Local smart turn torch
2025-05-02 15:53:30 -03:00
Filipi Fuchter
cb8a551db8 Mentioning the new LocalSmartTurnAnalyzer in the changelog. 2025-05-02 14:32:18 -03:00
Filipi Fuchter
779f09af70 Fixing lint. 2025-05-02 14:22:38 -03:00
Filipi Fuchter
19dc0f2bfb New example using the local smart turn 2025-05-02 14:21:42 -03:00
Filipi Fuchter
f0709e22ba Creating a local smart turn using torch. 2025-05-02 14:21:29 -03:00
Mark Backman
8250736f5e Merge pull request #1708 from pipecat-ai/mb/gemini-user-context
Push GeminiMultimodalLiveLLMService TranscriptionFrame Upstream, remo…
2025-05-02 13:10:27 -04:00
Mark Backman
83348a9f93 Merge pull request #1714 from pipecat-ai/mb/fix-gemini-text-modality
Restore TEXT modalities support to GeminiMultimodalLiveLLMService
2025-05-02 10:41:05 -04:00
Mark Backman
96d40903a9 Only send TTSStoppedFrame from Gemini when in AUDIO mode, only send one LLMFullResponseEndFrame 2025-05-02 10:18:53 -04:00
Aleix Conchillo Flaqué
2560811805 Merge pull request #1697 from pipecat-ai/aleix/daily-custom-audio-tracks
add support for multiple transport destinations
2025-05-02 06:34:09 -07:00
Mark Backman
2b8c44c008 Merge pull request #1710 from pipecat-ai/mb/openai-context-aggregation
fix: OpenAIRealtimeBetaLLMService writes two assistant messages to th…
2025-05-02 07:43:35 -04:00
Mark Backman
38e2d37674 Restore TEXT modalities support to GeminiMultimodalLiveLLMService 2025-05-02 07:36:12 -04:00
Vanessa Pyne
6278561f88 Merge pull request #1709 from pipecat-ai/vp-fix-fastpitch-params-update
Riva TTS: update FastPitch params
2025-05-01 21:23:10 -05:00
Aleix Conchillo Flaqué
750e79c1ce DailyParams: rename to camera/microphone_out_enabled 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
71eb2963c5 examples: added daily-custom-tracks 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
f44e2c86ea BaseOutputTransport: compute sample_rate and audio_chunk_size in main class 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
afe1f0df8c DailyTransport: make sure we can write audio frames to destination 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
458fddfb48 update CHANGELOG with new Daily and Transport features 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
8d915c5ccb DailyParams: allow enabling/disabling camera/microphone tracks 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
304153dd03 TTSService: set transport destination to all TTS frames 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
a6781b7352 rename destination to transport_destination 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
5ad0058303 update CHANGELOG with frame source/destination support 2025-05-01 19:11:13 -07:00
Aleix Conchillo Flaqué
75c039de33 examples: add daily-multi-translation 2025-05-01 19:11:13 -07:00
Aleix Conchillo Flaqué
74e3c3677e DailyTransport: fix audio/video renderers registration 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
dc20327f10 DailyTransport: register audio destination and use custom tracks 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
e738affd29 BaseOutputTransport: allow sending audio/video to multiple destinations 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
ef3d732607 DailyTransport: allow capturing multiple simultaneous audio/video sources 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
6d63cff1bf DailyTransport: custom audio tracks support 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
12f42605a1 pyproject: update daily-python to 0.18.0 2025-05-01 18:58:44 -07:00
Kwindla Hultman Kramer
fac3337927 small groq updates 2025-05-01 17:09:15 -07:00
Mark Backman
76d198151c Push GeminiMultimodalLiveLLMService TranscriptionFrame Upstream, remove direct context addition 2025-05-01 15:41:04 -04:00
Mark Backman
6a907058de fix: OpenAIRealtimeBetaLLMService writes two assistant messages to the context 2025-05-01 15:37:39 -04:00
vipyne
6e1f531f64 Riva TTS: update FastPitch params
91138c3f66 (diff-ece228577b1d233ce600a948243f90cece53e3a9b89554a0b27a48bc4d6e0fdfR45)
2025-05-01 11:14:41 -05:00
Matt Kim
4232cca5b6 [Rime] Add phonemizeBetweenBrackets and pauseBetweenBrackets to RimeTTSService (ws)
There is a fix incoming in
2025-04-30 18:09:22 -07:00
Mark Backman
a6a4d3d71f Merge pull request #1706 from rimelabs/matt/rime/update_url
[Rime] - Update url for Websockets API
2025-04-30 19:14:04 -04:00
Mark Backman
c52de0f5de Merge pull request #1696 from pipecat-ai/mb/fix-gemini-live-context
Fix: GeminiMultimodalLiveLLMService was appending tokens to the context
2025-04-30 19:12:06 -04:00
Mark Backman
a1e1255f16 Strip newlines from generated user transcript 2025-04-30 18:27:46 -04:00
Mark Backman
c4f758725e Ignore TranscriptionFrames too 2025-04-30 18:22:43 -04:00
Aleix Conchillo Flaqué
7bc9a78ce6 udpate CHANGELOG with RTVIObserverParams 2025-04-30 15:13:14 -07:00
Aleix Conchillo Flaqué
f8be71b32c Merge pull request #1688 from pipecat-ai/aleix/add-rtvi-observer-params
RTVIObserver: add RTVIObserverParams to configure what to send
2025-04-30 15:11:18 -07:00
Aleix Conchillo Flaqué
957fa5546d RTVIObserver: add RTVIObserverParams to configure what to send 2025-04-30 15:09:02 -07:00
Aleix Conchillo Flaqué
039cb8fcae Merge pull request #1690 from pipecat-ai/aleix/rtvi-function-call-single-param
RTVIProcessor: use single FunctionCallParams
2025-04-30 15:04:05 -07:00
Mark Backman
8e05f2f1a1 Merge pull request #1702 from pipecat-ai/mb/stt-mute-transcription-frames
Add InterimTranscriptionFrame and TranscriptionFrame to STTMuteFilter…
2025-04-30 17:54:24 -04:00
Matt Kim
8467aa1ed3 [Rime] - Update url for Websockets API
Rime has migrated their Websockets api to the base url `user.rime.ai` along with all other tts endpoints. 

See the [docs](https://docs.rime.ai/api-reference/endpoint/websockets)

`users-ws.rime.ai` is deprecated and will not reflect upgrades to the rime ws api.
2025-04-30 14:20:13 -07:00
Mark Backman
9c5878af3d OpenAI Realtime and Gemini Live push LLMTextFrame again, overwrite the assitant context aggregator for LLMTextFrame 2025-04-30 17:18:20 -04:00
Mark Backman
ef29800fe9 Update the changelog 2025-04-30 16:28:17 -04:00
Mark Backman
7e09933070 OpenAI Realtime should push TTSTextFrame only 2025-04-30 16:28:17 -04:00
Mark Backman
82a9d7f992 Gemini Mulitmodal Live to push TTSTextFrame only 2025-04-30 16:28:17 -04:00
Mark Backman
facbebb15f Transcribe user audio in 26b 2025-04-30 16:28:16 -04:00
Mark Backman
2ba60fc41f Update TranscriptProcessor to handle GeminiMultimodalLiveLLMService changes 2025-04-30 16:28:16 -04:00
Mark Backman
685f951ae2 Fix: GeminiMultimodalLiveLLMService was appending tokens to the context 2025-04-30 16:28:16 -04:00
Mark Backman
27d4c927a8 Merge pull request #1701 from pipecat-ai/mb/gemini-extend-session
Add context_window_compression support to GeminiMultimodalLiveLLMService
2025-04-30 14:35:50 -04:00
Mark Backman
20a59e8c56 Add InterimTranscriptionFrame and TranscriptionFrame to STTMuteFilter frame processing 2025-04-30 10:50:56 -04:00
Mark Backman
d9a0a93667 Add context_window_compression support to GeminiMultimodalLiveLLMService 2025-04-30 09:55:34 -04:00
Mark Backman
154d5d1859 Merge pull request #1699 from pipecat-ai/mb/more-docs-mocks
Additional import mocks to fix docs failure
2025-04-30 08:36:57 -04:00
Mark Backman
a192217256 Additional import mocks to fix docs failure 2025-04-29 21:45:27 -04:00
Mark Backman
10cdc47e05 Merge pull request #1689 from pipecat-ai/mb/handle-http-smart-turn-errors
Handle case where Fal Smart Turn returns a 500 error
2025-04-29 14:21:45 -04:00
Mark Backman
2b4d41a548 Merge pull request #1693 from flixoflax/bump-modal-example-dependencies
Bump modal deployment example dependencies
2025-04-29 14:21:31 -04:00
Filipi da Silva Fuchter
962f8062a5 Merge pull request #1581 from pipecat-ai/voice_agent_ice_servers
Configuring the voice-agent example to wait for all the ice candidates.
2025-04-29 13:30:12 -03:00
Filipi Fuchter
d80d385b2f Adding a section explaining about ice servers. 2025-04-29 12:19:59 -03:00
Filipi Fuchter
b347ca472f Checking the state again to avoid any eventual race condition 2025-04-29 12:04:20 -03:00
Filipi Fuchter
c3c4952abf Reducing the timeout to 2 seconds for gathering the ice candidates. 2025-04-29 11:34:58 -03:00
Filipi Fuchter
f369ab4c1a Printing each new ice candidate. 2025-04-29 11:23:33 -03:00
flixoflax
62b41c6789 removed aiohttp from deps, and version constraint from pipecat-ai 2025-04-29 16:14:51 +02:00
Filipi da Silva Fuchter
2872bc7902 Merge pull request #1587 from pipecat-ai/improving_ice_servers
Updated `SmallWebRTCConnection` to support `ice_servers` with credentials.
2025-04-29 11:07:32 -03:00
Filipi Fuchter
9658b75a10 Configuring the voice-agent example to use ice-servers and wait for all the ice candidates. 2025-04-29 10:52:07 -03:00
flixoflax
63de9039e6 Bumb modal deployment example deps 2025-04-29 15:50:56 +02:00
Filipi Fuchter
9352396d7e Mentioning the new feature in the changelog. 2025-04-29 10:40:09 -03:00
Filipi Fuchter
d1ab1d38b7 Fixing the examples to use the new IceServer structure. 2025-04-29 10:33:19 -03:00
Filipi Fuchter
080f70d91c Allowing to define the username and credential for the ice servers. 2025-04-29 10:32:42 -03:00
Mark Backman
ebed1fc6ea Restructure _send_raw_request to raise errors early, then process the successful response 2025-04-29 09:05:14 -04:00
Aleix Conchillo Flaqué
6821b1cdab RTVIProcessor: use single FunctionCallParams 2025-04-29 05:56:23 -07:00
Mark Backman
144ae9b611 Handle case where Fal Smart Turn returns a 500 error 2025-04-29 08:53:02 -04:00
Aleix Conchillo Flaqué
a2e7331ce2 Merge pull request #1680 from pipecat-ai/aleix/local-input-select-stt-update
examples: update local-input-select-stt
2025-04-28 12:01:12 -07:00
Aleix Conchillo Flaqué
8accd3e387 Merge pull request #1681 from pipecat-ai/aleix/tts-service-llm-full-response-end-fix
TTSService: do not push LLMFullResponseEndFrame if not needed
2025-04-28 12:00:55 -07:00
Mark Backman
3d05a74dc0 Merge pull request #1678 from Dev-Khant/integrate-mem0-oss
Integrate with Mem0 OSS
2025-04-28 14:15:40 -04:00
Aleix Conchillo Flaqué
e3c965f4d5 TTSService: do not push LLMFullResponseEndFrame if not needed 2025-04-28 11:03:22 -07:00
Dev-Khant
5354e5d891 formatting 2025-04-28 22:57:18 +05:30
Aleix Conchillo Flaqué
5784e91cff update .gitignore 2025-04-28 09:34:45 -07:00
Aleix Conchillo Flaqué
bc5f098aaa examples: update local-input-select-stt 2025-04-28 09:28:26 -07:00
Aleix Conchillo Flaqué
93534b4692 Merge pull request #1679 from pipecat-ai/aleix/update-package-lock-apr-28
examples: update package-lock.json
2025-04-28 09:27:01 -07:00
Aleix Conchillo Flaqué
b23d54c609 examples: update package-lock.json 2025-04-28 09:15:14 -07:00
Dev-Khant
aa23a7b1e6 update changelog 2025-04-28 20:03:20 +05:30
Dev-Khant
c0c41789ab Integrate with Mem0 OSS 2025-04-28 17:15:53 +05:30
Jin Kim
cf2f249f8a Use "use_original_timestamps" only for sonic-2 model 2025-04-27 19:33:14 +09:00
Aleix Conchillo Flaqué
029ef4f8c2 Merge pull request #1667 from pipecat-ai/aleix/function-call-single-parameter
function call single parameter
2025-04-25 13:53:10 -07:00
Aleix Conchillo Flaqué
9cad6dfce9 RTVIProcessor: simplify handle_function_call() and depreacted handle_function_call_start() 2025-04-25 13:34:05 -07:00
Aleix Conchillo Flaqué
4df6444832 examples: update with single FunctionCallParams parameter 2025-04-25 13:34:05 -07:00
Aleix Conchillo Flaqué
944bc23135 LLMService: use a single FunctionCallParams parameter for function calls 2025-04-25 13:34:03 -07:00
Aleix Conchillo Flaqué
1d863ee7de Merge pull request #1669 from pipecat-ai/aleix/short-utterances-fixes
short utterances fixes
2025-04-25 13:25:15 -07:00
Vanessa Pyne
9ca775d1ab Merge pull request #1668 from pipecat-ai/vp-update-transport-params-in-ex
Update examples with new transport param names
2025-04-25 15:24:28 -05:00
Aleix Conchillo Flaqué
03002ad685 LLMUserContextAggregator: reduce aggregation_timeout to 0.5 2025-04-25 13:21:00 -07:00
Aleix Conchillo Flaqué
99a4154cbc LLMUserContextAggregator: ignore short uterrances while bot speaking 2025-04-25 13:21:00 -07:00
vipyne
0f68cc182d Update examples with new transport param names 2025-04-25 15:18:56 -05:00
Aleix Conchillo Flaqué
3ac50b9902 Merge pull request #1664 from CerebriumAI/kyle/fix-ultravox-performance
Improved: Ultravox performance
2025-04-25 12:59:25 -07:00
Michael Louis
9557705b53 Changed warmup to internal function 2025-04-25 14:40:19 -04:00
Mark Backman
a7718926e9 Merge pull request #1666 from pipecat-ai/mb/add-vad-start-stop-events
Add VADUserStartedSpeakingFrame and VADUserStoppedSpeakingFrame
2025-04-25 10:49:17 -04:00
Mark Backman
dfa10af6ed Simplify VAD events to be detected and emitted from BaseInputTransport 2025-04-25 10:28:44 -04:00
Mark Backman
8485ea6c5e Merge pull request #1650 from pipecat-ai/mb/fal-smart-turn-readme
Add hosted demo link to Fal smart turn README
2025-04-25 09:39:54 -04:00
Mark Backman
b298376766 Add VADUserStartedSpeakingFrame and VADUserStoppedSpeakingFrame 2025-04-25 09:06:54 -04:00
Mark Backman
7cfefe4f84 Merge pull request #1593 from WebinarGeek/wg/gladia-translations
Push gladia translations as a TranscriptionFrame
2025-04-25 08:35:36 -04:00
Mark Backman
71c7373987 Update CHANGELOG 2025-04-25 08:26:40 -04:00
Mark Backman
d1086914fe Add TranslationFrame and use in GladiaSTTService; add 13c-gladia-translation.py 2025-04-25 08:25:39 -04:00
Dan Berg
2fb85941d3 Add hint installing test requirements 2025-04-25 08:24:51 -04:00
Dan Berg
be8788e4da Push gladia translations as a TranscriptionFrame 2025-04-25 08:24:51 -04:00
Mark Backman
acb6abd761 Merge pull request #1644 from pipecat-ai/mb/add-transport-examples
Add transports examples to foundational examples
2025-04-25 08:22:26 -04:00
Mark Backman
a528aad957 Merge pull request #1642 from pipecat-ai/mb/foundational-requirements
Add deepgram and cartesia to foundational example requirements to mak…
2025-04-25 08:22:08 -04:00
Mark Backman
5c13252801 Merge pull request #1658 from pipecat-ai/mb/update-fal-smart-turn-version
Update daily-transport version in fal-smart-turn demo
2025-04-25 08:21:50 -04:00
Mark Backman
a4422ac6c2 Add transports examples to foundational examples 2025-04-25 08:20:04 -04:00
Mark Backman
985a031353 Merge pull request #1657 from pipecat-ai/mb/word-wrangler-example
Add Word Wrangler demos
2025-04-25 08:16:02 -04:00
Kyle Gani
5c4079b286 Cleaned up: layout 2025-04-25 13:18:03 +02:00
Kyle Gani
5489ac5a73 Merge branch 'main' into kyle/fix-ultravox-performance 2025-04-25 13:16:48 +02:00
Kyle Gani
49fbcc86ac Improved: Ultravox performance 2025-04-25 13:12:08 +02:00
Aleix Conchillo Flaqué
d20c3307b9 Merge pull request #1648 from pipecat-ai/aleix/always-push-audio
input transports now always push audio frames
2025-04-24 18:59:09 -07:00
Aleix Conchillo Flaqué
9fd76923fd STTService: passthrough audio frames by default 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
a753a623d4 examples: allow setting custom program arguments 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
4ee6c4b59e BaseOutputTransport: reword camera with video 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
e79a002e5a examples: update camera_* with video_* 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
420912dd4b transports: deprecate TransportParams.camera_* in favor of video_* 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
de7185e8db examples: remove vad_enabled=True 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
8bfcfe8b1d transports: deprecate TransportParams.vad_enabled 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
26d2ce5926 examples: remove vad_audio_passthrough=True 2025-04-24 17:14:18 -07:00
Aleix Conchillo Flaqué
9ee56bff9e transports: push audio by default. s/vad_audio_passthrough/audio_in_passthrough/ 2025-04-24 17:14:16 -07:00
Vanessa Pyne
2e0d77e4f0 Merge pull request #1627 from pipecat-ai/vp-mcp-take-3
MCP Service
2025-04-24 18:16:19 -05:00
vipyne
4cc8a4312c Revert "update 39* examples as per #1648"
This reverts commit b29ffeef29.
2025-04-24 18:11:35 -05:00
vipyne
cb7cb381aa Add MCPClient changelog line 2025-04-24 18:05:20 -05:00
vipyne
b29ffeef29 update 39* examples as per #1648 2025-04-24 17:19:50 -05:00
vipyne
b7b2a5b7a1 mcp service fix and add multiple mcp example 2025-04-24 17:13:40 -05:00
vipyne
3384598e07 mcp_service: pr notes 2025-04-24 17:13:40 -05:00
vipyne
c420dbe57f MCP Service
WIP getting mcp.run to work

add mcp[cli] to toml and lint

mcp stdio example

mcp sse example

mcp_run POC

ruff formatting
2025-04-24 17:13:40 -05:00
Mark Backman
fa8aafc7a5 Merge pull request #1660 from pipecat-ai/mb/foundational-examples-readme
Update foundational README with ToC
2025-04-24 18:12:27 -04:00
Mark Backman
4b364dda29 Update foundational README with ToC 2025-04-24 18:04:36 -04:00
Mark Backman
6bb765e40f Update daily-transport version in fal-smart-turn demo 2025-04-24 15:06:08 -04:00
Mark Backman
c80d09f66c Add Word Wrangler demos 2025-04-24 14:31:48 -04:00
Mark Backman
f8ff10c5d5 Add hosted demo link to Fal smart turn README 2025-04-23 21:27:53 -04:00
Mark Backman
09ff836ef6 Merge pull request #1640 from pipecat-ai/mb/fal-smart-turn-example
Fal Smart Turn example
2025-04-23 17:27:27 -04:00
Mark Backman
e446ecac14 Merge pull request #1649 from pipecat-ai/mb/fix-smart-turn-metrics
fix: SmartTurnMetricsData was reporting 0 for inference and processin…
2025-04-23 17:23:16 -04:00
Mark Backman
8c0c8a6153 Code review fixes 2025-04-23 17:22:19 -04:00
Mark Backman
70033ae00b fix: SmartTurnMetricsData was reporting 0 for inference and processing time 2025-04-23 17:17:46 -04:00
Mark Backman
ac9dce63ae README updates 2025-04-23 16:55:35 -04:00
Mark Backman
8b2df48fab Fix DailyTransport bot name 2025-04-23 16:43:54 -04:00
Mark Backman
156a5690fc Merge pull request #1647 from mattmatters/fix-typo
Fix wating -> waiting typo
2025-04-23 16:27:00 -04:00
Matt Lewis
d42c618398 Fix wating -> waiting typo 2025-04-23 15:55:14 -04:00
Mark Backman
b23ca5a4a8 Merge pull request #1641 from pipecat-ai/mb/11labs-input-params
ElevenLabs: InputParams can be set individually
2025-04-23 14:31:37 -04:00
Mark Backman
63a6697a90 ElevenLabs: InputParams can be set individually 2025-04-23 14:28:38 -04:00
Aleix Conchillo Flaqué
f1e45d0f02 Merge pull request #1646 from pipecat-ai/aleix/pipecat-0.0.65
update CHANGELOG for 0.0.65
2025-04-23 11:27:25 -07:00
Mark Backman
4ad227ca2d Merge pull request #1643 from pipecat-ai/mb/gladia-keepalive
Add a keepalive task to GladiaSTTService
2025-04-23 14:27:15 -04:00
Aleix Conchillo Flaqué
66cc18194b update CHANGELOG for 0.0.65 2025-04-23 11:25:32 -07:00
Mark Backman
7d65132c93 Add a keepalive task to GladiaSTTService 2025-04-23 14:21:44 -04:00
Mark Backman
db7d7a4204 Merge pull request #1645 from pipecat-ai/mb/telnyx-auto-hang-up
Add auto_hang_up to Telnyx serializer
2025-04-23 14:20:28 -04:00
Mark Backman
7bbac11084 Add docstrings to TelnyxFrameSerializer 2025-04-23 14:13:20 -04:00
Mark Backman
76c8322b57 Make call_sid optional in TwilioFrameSerializer 2025-04-23 14:08:50 -04:00
Mark Backman
7b1cd3523d Twilio: send only one hangup command 2025-04-23 13:41:36 -04:00
Mark Backman
6bd821ac9a Add auto_hang_up to Telnyx serializer 2025-04-23 13:29:54 -04:00
Mark Backman
a6d51c343e Add deepgram and cartesia to foundational example requirements to make quickstart smoother 2025-04-23 08:47:47 -04:00
Mark Backman
1a5cf7a521 Add local run and deployment steps to README 2025-04-22 21:37:35 -04:00
Mark Backman
69491417ec Fal Smart Turn example 2025-04-22 21:16:41 -04:00
Aleix Conchillo Flaqué
b91780ced2 Merge pull request #1638 from pipecat-ai/aleix/pipecat-0.0.64
update CHANGELOG for 0.0.64
2025-04-22 17:35:25 -07:00
Aleix Conchillo Flaqué
8ded666958 update CHANGELOG for 0.0.64 2025-04-22 17:32:06 -07:00
Filipi da Silva Fuchter
2490c804a5 Merge pull request #1631 from pipecat-ai/smart_turn_timeout
Returning the turn as complete if the request don’t return a result within SmartTurnParams stop_secs
2025-04-22 19:51:10 -03:00
Filipi Fuchter
dd8856a673 Merge branch 'main' into smart_turn_timeout
# Conflicts:
#	dot-env.template
2025-04-22 19:49:32 -03:00
Aleix Conchillo Flaqué
e7da08dab1 move smart turn files to audio.turn.smart_turn package 2025-04-22 15:29:31 -07:00
Aleix Conchillo Flaqué
ae60d42016 s/SmartTurnAnalyzer/HttpSmartTurnAnalyzer/ and add FalSmartTurnAnalyzer 2025-04-22 15:13:12 -07:00
Aleix Conchillo Flaqué
50e8d82ece SmartTurn: some linting cleanup 2025-04-22 14:39:02 -07:00
Mark Backman
cc9901a82f Replace httpx with aiohttp 2025-04-22 17:14:19 -04:00
Aleix Conchillo Flaqué
1fd43e8a3f Merge pull request #1636 from pipecat-ai/aleix/examples-logging
examples: always use loguru for logging
2025-04-22 13:06:40 -07:00
Aleix Conchillo Flaqué
fdc508a1a5 examples: always use loguru for logging 2025-04-22 11:51:49 -07:00
Mark Backman
37269db247 Merge pull request #1634 from pipecat-ai/mb/twilio-end-call
Automatically hangup Twilio calls
2025-04-22 14:05:10 -04:00
Mark Backman
51269aabbd Added cancel method to WebsocketServerOutputTransport 2025-04-22 13:58:39 -04:00
Mark Backman
74ecc19e09 Code review feedback 2025-04-22 13:54:12 -04:00
Mark Backman
c6d48c16df Add twilio to pyproject.toml, update demo to use twilio option 2025-04-22 13:01:56 -04:00
Mark Backman
873d84aa09 Twilio serializer to return None 2025-04-22 12:50:11 -04:00
Mark Backman
7360866c97 Add docstrings 2025-04-22 12:49:17 -04:00
Mark Backman
81f4768661 Automatically hangup Twilio calls 2025-04-22 12:45:34 -04:00
Vanessa Pyne
972d65f61b Merge pull request #1628 from pipecat-ai/vp-typo-fixes
typo fixes in phone-chatbot example
2025-04-22 10:05:56 -05:00
Mark Backman
1da9d398e3 Merge pull request #1619 from pipecat-ai/mb/grok-3-beta
GrokLLMService uses grok-3-beta as default model
2025-04-22 10:33:32 -04:00
Filipi Fuchter
7358bc6428 Returning the turn as complete if the request don’t return a result within SmartTurnParams stop_secs 2025-04-22 10:35:14 -03:00
vipyne
a6af499f84 typo fixes in phone-chatbot example 2025-04-21 23:49:13 -05:00
Aleix Conchillo Flaqué
f9d1a53e28 Merge pull request #1609 from pipecat-ai/aleix/pyproject-py-typed
pyproject: fix license fields
2025-04-21 16:14:22 -07:00
Mark Backman
3f3010af79 Add a SmartTurnMetricsData class, emitted by Metrics Frame in response to smart turn responses 2025-04-21 18:56:14 -04:00
Aleix Conchillo Flaqué
a02d47ddbd Merge pull request #1625 from 0xPatryk/patch-1
Fixed AttributeError: object has no attribute '_sample_rate"
2025-04-21 15:40:54 -07:00
Patryk
a649aff3e7 Fixed AttributeError: 'OpenAITTSService' object has no attribute '_sample_rate' 2025-04-21 11:03:45 +02:00
Mark Backman
a9b551d73e GrokLLMService uses grok-3-beta as default model 2025-04-19 08:05:59 -04:00
Mark Backman
747a821943 Merge pull request #1614 from pipecat-ai/mb/changelog-for-1525
Add CHANGELOG entry for PR 1525
2025-04-19 07:10:13 -04:00
Aleix Conchillo Flaqué
010db3ccd5 README: minor update 2025-04-18 20:57:05 -07:00
Aleix Conchillo Flaqué
db773b8b93 Merge pull request #1616 from pipecat-ai/aleix/new-readme
make README more fun
2025-04-18 18:15:35 -07:00
Mark Backman
16b7bf71b4 Additional README changes 2025-04-18 21:00:57 -04:00
Aleix Conchillo Flaqué
82d19508a4 make README more fun 2025-04-18 14:37:28 -07:00
Mark Backman
dc3646f0e7 Merge pull request #1615 from pipecat-ai/mb/issue-template
Add issue templates and move the pull request template to .github
2025-04-18 14:58:09 -04:00
Mark Backman
62e659cd3a Update to .yml templates so that types are used 2025-04-18 13:21:01 -04:00
Mark Backman
b2945f44fd Add issue templates and move the pull request template to .github 2025-04-18 12:17:46 -04:00
Mark Backman
618fbef81c Add CHANGELOG entry for PR 1525 2025-04-18 11:32:34 -04:00
Mark Backman
70c42dfa6e Merge pull request #1525 from shaiyon/google-default-creds
Enable usage of Application Default Credentials in Google services
2025-04-18 11:31:08 -04:00
Mark Backman
9ab374dd1f Merge pull request #1612 from pipecat-ai/mb/07g-stt-model
examples: Fix 07g by changing STT model
2025-04-18 08:04:20 -04:00
Mark Backman
cc6d284417 examples: Fix 07g by changing STT model 2025-04-18 07:13:34 -04:00
Filipi da Silva Fuchter
f77d8f0b6f Merge pull request #1611 from pipecat-ai/smart_turn_changelog
Mentioning the Smart Turn Detection into the changelog.
2025-04-17 23:02:57 -03:00
Varun Singh
9c0beb05cf Merge pull request #1597 from pipecat-ai/vr000m-opus-added
Changing default codec to OPUS for telephony
2025-04-17 18:42:12 -07:00
Aleix Conchillo Flaqué
858981c404 Merge pull request #1610 from pipecat-ai/aleix/add-base-turn-analyzer
audio: add BaseTurnAnalyzer class
2025-04-17 18:38:08 -07:00
Aleix Conchillo Flaqué
9eed225aa2 audio: add BaseTurnAnalyzer class 2025-04-17 18:37:52 -07:00
Filipi Fuchter
9f7371e485 Mentioning the Smart Turn Detection into the changelog. 2025-04-17 22:31:40 -03:00
Aleix Conchillo Flaqué
d77c37ff14 pyproject: add py.typed (PEP 561) 2025-04-17 17:29:04 -07:00
Aleix Conchillo Flaqué
b4916f9dae pyproject: fix license fields 2025-04-17 17:28:14 -07:00
Aleix Conchillo Flaqué
004a920920 Merge pull request #1563 from Bnowako/packaging-type-information
Add marker file for static type checkers
2025-04-17 17:26:15 -07:00
Filipi da Silva Fuchter
203c5a3a60 Merge pull request #1592 from pipecat-ai/smart_turn
Smart turn
2025-04-17 18:21:47 -03:00
Filipi Fuchter
7f6fb1754b Merge remote-tracking branch 'origin/smart_turn' into smart_turn 2025-04-17 17:53:53 -03:00
Filipi Fuchter
a390ce13a4 Removing the UserEndOfTurnFrame 2025-04-17 17:53:31 -03:00
Filipi da Silva Fuchter
61d31d1c40 Restoring stop_secs to default value.
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-17 17:44:47 -03:00
Filipi da Silva Fuchter
e872ff943a Using the default model for OpenAi.
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-17 17:43:39 -03:00
Filipi da Silva Fuchter
c71005e249 Using the default model for OpenAi.
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-17 17:43:23 -03:00
Filipi Fuchter
6e06bf97c0 Preventing emitting the UserStartedSpeaking event multiple times. 2025-04-17 17:21:29 -03:00
Filipi Fuchter
a80dc94e91 Fixing ruff format. 2025-04-17 16:47:17 -03:00
Filipi Fuchter
3ea9cfd251 Keeping the _speech_triggered as true if the state is incomplete. 2025-04-17 16:46:15 -03:00
Filipi Fuchter
a80f82cdb6 Moving the environment variables to inside the demo. 2025-04-17 16:28:50 -03:00
Aleix Conchillo Flaqué
d24bab354f Merge pull request #1607 from pipecat-ai/aleix/fix-websocket-disconnects
services: fix TTS websocket services disconnections
2025-04-17 12:27:52 -07:00
Filipi Fuchter
53ee3fb64c Changing the log levels used in smart_turn 2025-04-17 16:14:13 -03:00
Filipi Fuchter
3599761e4e Changing the default behavior to only use the last vad segment, and increasing the default stop_secs to 3 2025-04-17 16:07:03 -03:00
Aleix Conchillo Flaqué
c0b3fe3985 services: only read from TTS websocket if websocket connection established 2025-04-17 11:54:07 -07:00
Aleix Conchillo Flaqué
497d48b6c8 services: fix TTS websocket services disconnections
Fixes #1467
2025-04-17 11:29:49 -07:00
Filipi Fuchter
e179916c9c Creating a new param use_only_last_vad_segment 2025-04-17 11:49:51 -03:00
Filipi Fuchter
b0b38beb19 Returning the max duration back to 8 seconds. 2025-04-17 11:39:48 -03:00
Filipi Fuchter
8577139d21 Fixing to keep the last max samples. 2025-04-17 11:39:06 -03:00
Filipi Fuchter
e2fbbb4b40 Renaming the smart turn classes. 2025-04-17 10:43:21 -03:00
Filipi Fuchter
88ce117e84 Changing the max duration default value to 16 seconds. 2025-04-17 10:35:13 -03:00
Filipi Fuchter
266537c3f4 Fixing to respect the stop_secs. 2025-04-17 10:07:08 -03:00
Filipi Fuchter
230d2f80fa Merge branch 'main' into smart_turn 2025-04-17 09:36:30 -03:00
Filipi Fuchter
3f0688aefa Testing smart turn using stop_secs as 5 seconds 2025-04-17 09:36:03 -03:00
Filipi da Silva Fuchter
5be3e6979e Merge pull request #1533 from pipecat-ai/daily_small_webrtc
Example interoping between SmallWebRTC and Daily
2025-04-17 09:19:23 -03:00
Mark Backman
9c19cff818 Merge pull request #1585 from ArmanJR/main
Troubleshooting SSL error
2025-04-16 22:46:45 -04:00
Mark Backman
95f3537bde Merge pull request #1598 from pipecat-ai/mb/11labs-http-timestamps
Added word/timestamp pairs to ElevenLabsHttpTTSService
2025-04-16 22:38:26 -04:00
Mark Backman
7ff748defd Merge pull request #1600 from pipecat-ai/mb/11labs-previous-text
Add previous_text context to ElevenLabsHttpTTSService
2025-04-16 22:33:38 -04:00
Mark Backman
2dafbee2aa Code review fixes 2025-04-16 22:29:33 -04:00
Mark Backman
1e0a9d7b06 Add previous_text context to ElevenLabsHttpTTSService 2025-04-16 22:22:08 -04:00
Mark Backman
4a23e138b1 Added word/timestamp pairs to ElevenLabsHttpTTSService 2025-04-16 22:20:51 -04:00
Mark Backman
384f80983f Added word/timestamp pairs to ElevenLabsHttpTTSService 2025-04-16 21:55:00 -04:00
Aleix Conchillo Flaqué
f6f01ea7e4 Merge pull request #1588 from pipecat-ai/aleix/llm-aggregator-params
LLM aggregator params
2025-04-16 15:25:21 -07:00
Aleix Conchillo Flaqué
f385cc0460 pyproject: add websockets as google dependency 2025-04-16 15:19:25 -07:00
Aleix Conchillo Flaqué
e97de43de2 add LLMUserAggregatorParams and LLMAssistantAggregatorParams 2025-04-16 15:19:19 -07:00
Aleix Conchillo Flaqué
8299c96ad4 Merge pull request #1603 from pipecat-ai/aleix/deepgram-tavus-fixes
deepgram/tavus fixes
2025-04-16 14:55:45 -07:00
Aleix Conchillo Flaqué
e9af585edd DeepgramTTSService: re-add base_url to constructor 2025-04-16 14:54:02 -07:00
Aleix Conchillo Flaqué
31f7082d12 DeepgramTTSService: use Deepgram's asyncrest instead of asyncio.to_thread 2025-04-16 14:40:59 -07:00
Aleix Conchillo Flaqué
6cea71270e tts: use smaller audio chunk sizes 2025-04-16 14:40:59 -07:00
Aleix Conchillo Flaqué
d05b2d0e8d TavusVideoService: fix rate limiting and max size 2025-04-16 14:40:59 -07:00
Filipi Fuchter
a458c1e92b Improving the README and fixing the env.example 2025-04-16 18:38:48 -03:00
Filipi Fuchter
5bbf1d0209 Example interoping between SmallWebRTC and Daily. 2025-04-16 17:14:12 -03:00
Mark Backman
235cd9cecc Merge pull request #1586 from rahultayal22/rah_google_vertex_issue
Fixed params issue in Google Vertex ai
2025-04-16 14:56:46 -04:00
Mark Backman
829f3ed2db Merge pull request #1601 from pipecat-ai/mb/eject-at-exp-token
Add eject_at_token_exp to Daily REST helpers, modify default values
2025-04-16 14:54:41 -04:00
Rahul Tayal
ac64f0ba91 Run ruff on code 2025-04-16 23:19:09 +05:30
Rahul Tayal
ce41a7585b Resolved comment to update change log 2025-04-16 22:24:25 +05:30
Mark Backman
ce92dfb5ec Add eject_at_token_exp to Daily REST helpers, modify default values 2025-04-16 12:26:33 -04:00
Mark Backman
ee132a2188 Merge pull request #1596 from pipecat-ai/mb/gpt-4.1
Update services and examples to use gpt-4.1 by default
2025-04-16 08:37:48 -04:00
Mark Backman
5f3bbf9828 Rely on default OpenAI model for examples and tests 2025-04-16 08:33:34 -04:00
Mark Backman
55d1d81430 Merge pull request #1595 from pipecat-ai/mb/rtvi-start-convo
Update client/server demos to kick off conversation in on_client_read…
2025-04-16 08:23:16 -04:00
Filipi Fuchter
8e36bdbed7 Adding some comments to the code. 2025-04-16 09:11:27 -03:00
Filipi Fuchter
cd8bd7f487 Adding some comments to the code. 2025-04-16 08:58:40 -03:00
Filipi Fuchter
5fa47b7a5c Adding the dependencies for the remote smart turn 2025-04-16 08:45:01 -03:00
Filipi Fuchter
616961b487 Stop removing segments from the end 2025-04-16 08:04:38 -03:00
Filipi Fuchter
650d4d9ee2 Changing the start speech time and adding logs. 2025-04-16 07:55:20 -03:00
Filipi Fuchter
2627cb6bf2 Allowing to define SmartTurnParams 2025-04-16 07:13:13 -03:00
Filipi Fuchter
0e4115049b Refactoring to use keep alive sessions. 2025-04-16 06:44:57 -03:00
Filipi Fuchter
3ebef9346f Adding support for RemoteSmartTurn 2025-04-16 06:33:42 -03:00
Filipi Fuchter
3e2d21779f Refactoring the BaseEndOfTurnAnalyzer to include most of the logic 2025-04-16 06:11:56 -03:00
Filipi Fuchter
cfefcac35f Resetting the silence frames when the user speaks. 2025-04-15 20:51:36 -03:00
Filipi Fuchter
57b39c084f Triggering to check if the turn is complete based on the maximum timeout 2025-04-15 20:42:41 -03:00
Filipi Fuchter
11b6de0900 Triggering to check if the turn is complete each time the user stops speaking based on the vad 2025-04-15 17:28:00 -03:00
Varun Singh
824bc9bf16 Update dial.js 2025-04-15 12:48:33 -07:00
Varun Singh
d0ddef6c12 Update server.py 2025-04-15 12:37:33 -07:00
Mark Backman
ad40a0f076 Update OpenAILLMService and OpenPipeLLMService to use gpt-4.1 by default 2025-04-15 15:11:05 -04:00
Filipi Fuchter
e6325a8229 Integrating with the smart turn model to predict 2025-04-15 16:01:09 -03:00
Mark Backman
6d10732889 Update OpenAILLMService examples to use gpt-4.1 2025-04-15 14:59:55 -04:00
Mark Backman
fdb46a0fa9 Update client/server demos to kick off conversation in on_client_ready handler 2025-04-15 14:50:38 -04:00
Filipi Fuchter
3588b06718 Adding missing torch dependency. 2025-04-15 12:28:36 -03:00
Filipi Fuchter
73874f6ec0 Loading the smart turn model. 2025-04-15 12:11:06 -03:00
Filipi Fuchter
6ab9a8ad7f Starting to create a local smart turn 2025-04-15 11:24:39 -03:00
Filipi Fuchter
821e303249 Bringing Aleix initial implementation for the smart turn. 2025-04-15 10:21:40 -03:00
chadbailey59
efae26a5a8 Client connect/disconnect events for DailyTransport (#1544)
* added multi transport example

* added working example

* restructured example and added readme

* removed image

* cleanup

* changed data type of callback signature

* removed pipecat example

* added changelog
2025-04-14 15:56:41 -05:00
Aleix Conchillo Flaqué
d16ace22ac Merge pull request #1583 from pipecat-ai/aleix/soundfilemixer-constructor-updates
SoundfileMixer: add mixing argument and require keywords
2025-04-14 10:59:30 -07:00
Rahul Tayal
001c26b79c Fixed params issue in Google Vertex ai 2025-04-14 23:29:16 +05:30
Arman
8dc4f1cda0 Troubleshooting SSL error 2025-04-14 13:39:53 -04:00
Aleix Conchillo Flaqué
ab6be11a0e SoundfileMixer: add mixing argument and require keywords 2025-04-14 08:30:56 -07:00
Filipi da Silva Fuchter
054158b0ff Merge pull request #1579 from pipecat-ai/fixing_smallwebrtc_issue
Fixed an issue in `SmallWebRTCTransport`
2025-04-14 10:44:22 -03:00
Filipi da Silva Fuchter
174cf13abd Merge pull request #1580 from pipecat-ai/fixing_voice_agent_example
Fixing the voice agent example to always create the video transceiver.
2025-04-14 10:44:07 -03:00
Filipi Fuchter
099d2c02e1 Fixing the voice agent example to always create the video transceiver. 2025-04-14 10:41:39 -03:00
Filipi Fuchter
e1108466f6 Fixed an issue in SmallWebRTCTransport where an error was thrown if the client did not create a video transceiver. 2025-04-14 10:36:25 -03:00
Mark Backman
edd53d425e Merge pull request #1577 from pipecat-ai/hush/trackStoppedSimpleChatbot
docs: Fix TrackStopped typo in SimpleChatbot
2025-04-14 08:32:58 -04:00
James Hush
b160cf34e9 Remove formatting 2025-04-14 15:13:45 +08:00
James Hush
dae3b927e1 docs: Fix TrackStopped typo in SimpleChatbot 2025-04-14 15:12:17 +08:00
Mark Backman
bd3d30111a Merge pull request #1569 from pipecat-ai/pipecat-0.0.63
Update CHANGELOG for 0.0.63
2025-04-11 20:09:58 -04:00
Mark Backman
8c7e16e717 Update CHANGELOG for 0.0.63 2025-04-11 20:04:50 -04:00
Mark Backman
f6accbd510 Updating foundation examples to use SmallWebRTCTransport and pipecat-ai-small-webrtc-prebuilt (#1534)
Co-authored-by: Filipi Fuchter <filipi@daily.co>
2025-04-11 19:44:16 -04:00
Mark Backman
8186219879 Merge pull request #1513 from pipecat-ai/mb/gemini-context-formatting
Fix: GeminiMultimodalLiveLLMService, add spaces between words in assi…
2025-04-11 15:30:51 -04:00
Mark Backman
b9a2ed5b58 Fix: GeminiMultimodalLiveLLMService, add spaces between words in assistant context messages 2025-04-11 15:14:52 -04:00
Mark Backman
7ac12ffc85 Merge pull request #1550 from pipecat-ai/mb/cartesia-spelling-timestamps
Fix: Cartesia's spelling feature adds whole word to context
2025-04-11 15:13:55 -04:00
Filipi da Silva Fuchter
f623cf96f7 Merge pull request #1560 from pipecat-ai/bot_left_signalling
Bot left signalling message
2025-04-11 16:08:01 -03:00
Mark Backman
06be20eb16 Fix: Cartesia's spelling feature adds whole word to context 2025-04-11 15:04:58 -04:00
Filipi Fuchter
816b3a9545 Fixing ruff format 2025-04-11 15:37:16 -03:00
Filipi Fuchter
255666925b Sending a new signalling message peerLeft. 2025-04-11 15:35:50 -03:00
Mark Backman
0df065fda4 Merge pull request #1566 from pipecat-ai/mb/gemini-live-beta
Add Gemini Live support for languages, native model transcriptions, media resolution, and VAD settings
2025-04-11 12:40:04 -04:00
Mark Backman
241a947b8b Add CHANGELOG entries 2025-04-11 11:46:48 -04:00
Mark Backman
e28c199dd1 Add GeminiMultimodalLiveLLMService support for VAD Params 2025-04-11 11:46:48 -04:00
Filipi da Silva Fuchter
6220ee4efb Merge pull request #1565 from pipecat-ai/fixing_video_transform_demo
Fixing the video transform demo to use 20ms audio.
2025-04-11 11:45:29 -03:00
Filipi Fuchter
b650d043bf Fixing the video transform demo to use 20ms audio. 2025-04-11 11:22:41 -03:00
Mark Backman
121e6d2157 Add media resolution support to GeminiMultimodalLiveLLMService 2025-04-11 10:18:29 -04:00
Mark Backman
dbd7869de7 Add model transcription support 2025-04-11 10:02:52 -04:00
Mark Backman
b7d56d5ff0 Add language support for Gemini Live 2025-04-11 09:21:14 -04:00
Bnowako
61cba0136f Add marker file for static type checkers 2025-04-11 11:00:57 +02:00
Filipi da Silva Fuchter
ed743b55d4 Merge pull request #1561 from pipecat-ai/fixing_voice_agent
Fixing voice agent example
2025-04-10 23:33:35 -03:00
Filipi Fuchter
fb074895f5 Fixing ruff format. 2025-04-10 23:19:31 -03:00
Filipi Fuchter
d916865ccc Fixing voice agent example to work with the last released version of pipecat. 2025-04-10 23:10:50 -03:00
Filipi Fuchter
6378a8ccd3 Starting to implement a signalling message to when the bot has left 2025-04-10 23:02:27 -03:00
Aleix Conchillo Flaqué
5dbb5f176b Merge pull request #1551 from pipecat-ai/aleix/daily-python-0.17.0
pyproject: update daily-python to 0.17.0
2025-04-10 09:06:55 -07:00
Filipi da Silva Fuchter
b89f2611f7 Merge pull request #1539 from pipecat-ai/small_wbertc_mute_state
SmallWebRTC mute state
2025-04-10 11:26:53 -03:00
Filipi Fuchter
db0f783c55 Updating the video-transform demo to use the latest version of the SmallWebRTCTransport. 2025-04-10 11:23:28 -03:00
Filipi Fuchter
20ec323647 Refactoring the video-transform demo to be able to enable or disable the cam. 2025-04-10 11:23:05 -03:00
Filipi Fuchter
f71c09a4fd Added support in SmallWebRTCTransport to detect when remote tracks are muted. 2025-04-10 11:22:37 -03:00
Mark Backman
cba4ebfcf9 Merge pull request #1555 from pipecat-ai/mb/gemini-beta-base 2025-04-10 09:01:16 -04:00
Mark Backman
3b9a8946f9 Update GeminiMultimodalLiveLLMService base_url 2025-04-10 08:17:52 -04:00
Mark Backman
db3620c4be Merge pull request #1553 from balaji-atoa/main
feat: change default model name on live api
2025-04-10 08:10:35 -04:00
Mark Backman
11338ea92d Merge pull request #1552 from pipecat-ai/mb/p2p-capture-image
Add image capture to SmallWebRTCTransport
2025-04-10 07:52:13 -04:00
Filipi da Silva Fuchter
90563a4091 Merge pull request #1542 from pipecat-ai/small_webrtc_prebuilt_ui
Using the small-webrtc-prebuilt-ui
2025-04-10 07:39:26 -03:00
Filipi da Silva Fuchter
937f5f7cb7 Update examples/p2p-webrtc/video-transform/server/requirements.txt
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-10 07:37:23 -03:00
Filipi da Silva Fuchter
4f221b817a Update examples/p2p-webrtc/video-transform/README.md
Co-authored-by: Mark Backman <mark@daily.co>
2025-04-10 07:37:07 -03:00
balaji-atoa
c79c1f65fc feat: change default model name on live api 2025-04-10 11:59:11 +05:30
Mark Backman
8ad2ad0e59 Add image capture to SmallWebRTCTransport 2025-04-09 23:01:06 -04:00
Aleix Conchillo Flaqué
499b258bf9 pyproject: update daily-python to 0.17.0 2025-04-09 18:59:10 -07:00
Filipi Fuchter
05b6a5ae4b Improving the video-transform readme 2025-04-09 15:55:13 -03:00
Filipi Fuchter
65fcea28ce Using the small-webrtc-prebuilt-ui 2025-04-09 15:45:30 -03:00
Kwindla Hultman Kramer
005c0b55b6 Merge pull request #1545 from pipecat-ai/khk/gem-live-0408
Gemini Multimodal Live API base_url format fix
2025-04-08 21:46:30 -07:00
Kwindla Hultman Kramer
1828127f41 small fix to wss base_url 2025-04-08 20:22:26 -07:00
Filipi da Silva Fuchter
77ab841cab Merge pull request #1532 from pipecat-ai/p2p_ios_demo
iOS demo for the p2p-webrtc video-transform example
2025-04-07 16:58:06 -03:00
Filipi Fuchter
3bbc75110a Mentioning the iOS client inside the changelog and fixing the readme. 2025-04-07 16:54:26 -03:00
Filipi Fuchter
b2ce1d9378 Merge branch 'main' into p2p_ios_demo 2025-04-07 16:50:58 -03:00
Filipi Fuchter
58714865df Using the public version of pipecat-client-ios-small-webrtc 2025-04-07 16:48:18 -03:00
Mark Backman
03b3635b0a Merge pull request #1521 from pipecat-ai/mb/increase-bot-vad-stop-secs
Increase BOT_VAD_STOP_SECS for services with slower speech patterns
2025-04-07 14:44:31 -04:00
Mark Backman
aaa7b5e626 Merge pull request #1524 from pipecat-ai/mb/tts-generate-with-text
TTS: Skip generation when there is no text
2025-04-07 14:44:18 -04:00
Varun Singh
0b8486ce39 Merge pull request #1418 from pipecat-ai/vr000m-pcc-dialin-webhook-server
Pipecat Cloud: Companion server to handle webhooks for pinless dial-in
2025-04-07 09:00:38 -07:00
Mark Backman
d4ae091ddd Update port in FastAPI README, add run steps to nextjs README 2025-04-07 11:09:43 -04:00
Mark Backman
9e0a57a6de Rename directories 2025-04-07 10:44:41 -04:00
Mark Backman
fc4c1e4110 README updates 2025-04-07 10:33:18 -04:00
Mark Backman
9b740d9e72 Merge pull request #1537 from pipecat-ai/mb/azure-tts-lang
Fix: Set language for Azure TTS services
2025-04-07 09:46:08 -04:00
Mark Backman
b03563765f Fix: Set language for Azure TTS services 2025-04-07 09:24:31 -04:00
Filipi Fuchter
a1578bd67a iOS demo for the p2p-webrtc video-transform example 2025-04-04 16:40:52 -03:00
Filipi da Silva Fuchter
6466573b84 Merge pull request #1498 from pipecat-ai/aiortc_example_ios
Improvements for the SmallWebRTCTransport
2025-04-04 16:39:06 -03:00
Filipi Fuchter
b42dc83696 Improvements for the SmallWebRTCTransport:
- Wait until the pipeline is ready before triggering the `connected` event.
  - Queue messages if the data channel is not ready.
  - Update the aiortc dependency to fix an issue where the 'video/rtx' MIME type
    was incorrectly handled as a codec retransmission.
  - Avoid initial video delays.
2025-04-04 16:33:57 -03:00
Filipi Fuchter
fe5931b884 Updating aiortc to fix an issue where 'video/rtx' MIMEType retransmission incorrectly handled as a codec 2025-04-04 16:28:54 -03:00
Filipi Fuchter
4b438ff7d7 Allowing ngrok connections to the video-transform demo 2025-04-04 16:28:37 -03:00
Filipi da Silva Fuchter
89a8c16676 Merge pull request #1531 from pipecat-ai/fix_chunk_default_value
Fixed SmallWebRTCTransport to support dynamic chunk values.
2025-04-04 16:04:05 -03:00
Filipi Fuchter
c4c92585f9 Fixed SmallWebRTCTransport to support dynamic chunk values. 2025-04-04 15:38:12 -03:00
Prem Adithya
c510870736 Merge branch 'pipecat-ai:main' into anthropic-client-bug-fixes 2025-04-04 16:41:04 +11:00
Shaiyon Hariri
af23200511 Use default google creds as fallback when not provided in llm_vertex,stt, and tts 2025-04-03 16:42:58 -04:00
Mark Backman
63146d6f85 TTS: Skip generation when there is no text 2025-04-03 16:15:58 -04:00
Mattie Ruth
ec00edc893 Update client examples to use latest versions (#1523) 2025-04-03 15:47:03 -04:00
Mark Backman
a21be058e2 Increase BOT_VAD_STOP_SECS for services with slower speech patterns 2025-04-03 15:25:48 -04:00
Mark Backman
c226c20e12 Merge pull request #1519 from pipecat-ai/mb/ref-docs-toc
Docs: Update ToC With Adapters and Observers
2025-04-03 15:19:35 -04:00
Aleix Conchillo Flaqué
78e6669105 Merge pull request #1514 from pipecat-ai/aleix/producer-consumer-processors
processors: add ProducerProcessor and ConsumerProcessor
2025-04-03 12:18:49 -07:00
Aleix Conchillo Flaqué
79f29e14dd processors: add ProducerProcessor and ConsumerProcessor 2025-04-03 09:44:56 -07:00
Mark Backman
d4a00fd080 Merge pull request #1517 from pipecat-ai/mb/update-simple-chatbot-packages
Update client packages for simple-chatbot JS and React
2025-04-03 10:07:40 -04:00
Mark Backman
d4186fa115 Merge pull request #1518 from pipecat-ai/mb/openai-verse
Add verse voice and bump the OpenAI version
2025-04-03 09:48:09 -04:00
Mark Backman
3536cbcd13 Add docstrings to FunctionSchema, update CONTRIBUTING.md with docstrings guidance, ignore __init__ docstrings if a class is sufficiently documented 2025-04-03 09:21:26 -04:00
Mark Backman
e3bcb70b13 Update ToC With Adapters and Observers 2025-04-03 09:02:09 -04:00
Mark Backman
19a82f9522 Add verse voice and bump the OpenAI version 2025-04-03 08:23:59 -04:00
Mark Backman
8c0a847449 Update client packages for simple-chatbot JS and React 2025-04-03 07:43:25 -04:00
Dominic Stewart
e3704cd1a1 Updated imports to work with pipecat 0.62 (#1515) 2025-04-03 15:07:02 +08:00
Dominic Stewart
1ba037865b Call Transfer demo (#1348)
* Updated code to dial out to an operator, keep track of operator conversation while escalated and then return to conversation when finished

* Removed unnecessary imports

* Updated bot runner code, added call routing file and then updated the call transfer and voicemail detection examples

* Updated the bot files

* Made prompt one level higher in the body and an array

* Updated call transfer examples to work correctly

* Updated gemini voicemail detection example to work

* Added twilio bot support back to the bot_runner

* Moved some state management, participant management and other logic to the helper file.

* Updated comments

* Updated env and requirements file

* Ran the examples and made sure code works. Still need to work on the prompts a bit

* Fixed format issue

* Add support to disable summary in call transfer

* Added support for operator transfer mode

* Updated readme file

* Updated readme based on feedback, and handling of various properties in the json to be more flexible for future examples

* Updated number of endpoints

* Updated readme to remove fly deployment text and replaced with Pipecat Cloud

* Starting to tweak function calls and prompts

* Updated examples to more consistently call the functions and say what they need to say

* Updated examples

* Updated examples

* Updated examples to work correctly

* Add simple bot versions of dialin and dialout

* Refactored the bot runner file to make adding future examples easier

* Based on feedback, removed examples for multiple LLMs and also adjusted voicemail detection code to be simpler

* Made sure to only capture the users transcription once

* Updated readme with latest changes

* Forgot to update the order of examples in one place

* Fixed formatting issue

* Adjusted based on james feedback

* Changed default_mode to default_calltransfer_mode
2025-04-03 09:03:23 +09:00
Aleix Conchillo Flaqué
909520f76e Merge pull request #1508 from pipecat-ai/mb/gemini-push-stop-speaking-frame
LLMAssistantContextAggregator should push BotStoppedSpeakingFrames
2025-04-02 16:25:08 -07:00
Mark Backman
d06cfcd597 Merge pull request #1512 from pipecat-ai/mb/fix-gemini-examples
Examples: Fix context_aggregator.assistant() pipeline position
2025-04-02 19:07:09 -04:00
Mark Backman
2579d0cf57 Examples: Fix context_aggregator.assistant() pipeline position 2025-04-02 16:11:03 -04:00
Mark Backman
1ec20b2e74 Merge pull request #1509 from pipecat-ai/mb/openia-voices
Add new voices to OpenAITTSService
2025-04-02 15:50:39 -04:00
Mark Backman
55a6e5aa4c Add new voices to OpenAITTSService 2025-04-02 12:09:36 -04:00
Varun Singh
2229730169 moving to appropriate directory 2025-04-01 23:45:09 -07:00
Varun Singh
24b54c66ee fixes review comments 2025-04-01 23:39:21 -07:00
Varun Singh
a14205415f replaced dailyAPIKey with pccApiKey, also allow handling of messages when hmac is missing 2025-04-01 23:34:24 -07:00
Varun Singh
18b56d4a10 Fix README.md 2025-04-01 23:32:50 -07:00
Mark Backman
b85bd91d08 LLMAssistantContextAggregator should push BotStoppedSpeakingFrames 2025-04-01 23:35:09 -04:00
Aleix Conchillo Flaqué
23f3285a7d Merge pull request #1507 from pipecat-ai/aleix/pipecat-0.0.62
update CHANGELOG for 0.0.62
2025-04-01 19:00:06 -07:00
Aleix Conchillo Flaqué
94f6436619 update CHANGELOG for 0.0.62 2025-04-01 18:55:04 -07:00
Aleix Conchillo Flaqué
480692971c Merge pull request #1506 from pipecat-ai/aleix/websockets-mixer-loop-fixes
transports(websocket): close connection from last transport
2025-04-01 18:52:47 -07:00
Aleix Conchillo Flaqué
5df5f6ae4c transports(websocket): close connection from last transport 2025-04-01 18:32:03 -07:00
Aleix Conchillo Flaqué
6940112ab9 Merge pull request #1504 from pipecat-ai/aleix/base-output-transport-audio-10ms-chunk-update
TransportParams: set audio_out_10ms_chunks to 4
2025-04-01 15:15:24 -07:00
Aleix Conchillo Flaqué
80584e9138 TransportParams: set audio_out_10ms_chunks to 4 2025-04-01 15:13:28 -07:00
Aleix Conchillo Flaqué
1fd01e715d Merge pull request #1503 from pipecat-ai/aleix/function-call-result-system-frame
frames: make FunctionCallResultFrame a SystemFrame
2025-04-01 15:08:26 -07:00
Aleix Conchillo Flaqué
a7a1cd0cde Merge pull request #1502 from pipecat-ai/aleix/test-user-idle-py310
tests: fix test_user_idle_processor for python 3.10
2025-04-01 15:08:10 -07:00
Aleix Conchillo Flaqué
e5a6b9d2b4 Merge pull request #1500 from pipecat-ai/aleix/base-output-transport-optimize-bot-speaking
BaseOutputTransport: optimize BotSpeakingFrames
2025-04-01 14:59:25 -07:00
Aleix Conchillo Flaqué
169b50af61 frames: make FunctionCallResultFrame a SystemFrame 2025-04-01 14:42:22 -07:00
Aleix Conchillo Flaqué
31311d8ac5 tests: fix test_user_idle_processor for python 3.10 2025-04-01 13:54:59 -07:00
Aleix Conchillo Flaqué
bfd06b321d BaseOutputTransport: optimize BotSpeakingFrames 2025-04-01 11:11:49 -07:00
Aleix Conchillo Flaqué
3efbcab39c Merge pull request #1499 from pipecat-ai/aleix/base-output-transport-set-chunks-size
BaseOutputTransport: allow setting 10ms output audio chunks
2025-04-01 11:10:34 -07:00
Aleix Conchillo Flaqué
b40ca391f5 BaseOutputTransport: allow setting 10ms output audio chunks 2025-04-01 10:48:36 -07:00
Aleix Conchillo Flaqué
43008c8c5b Merge pull request #1501 from pipecat-ai/aleix/transcription-processor-interruption
TranscriptProcessor: send TranscriptionUpdateFrame after interruption
2025-04-01 10:46:16 -07:00
Aleix Conchillo Flaqué
3a37b11e56 TranscriptProcessor: send TranscriptionUpdateFrame after interruption 2025-04-01 10:21:21 -07:00
Mark Backman
9ea81bc982 Merge pull request #1497 from pipecat-ai/mb/gladia-languages
Align languages with Gladia's supported languages, remove audio_enhancer option
2025-04-01 11:54:24 -04:00
Mark Backman
98b499e2e9 Remove audio_enhancer option 2025-04-01 10:26:28 -04:00
Mark Backman
72c8f6c8c3 Update GladiaSTTService language list 2025-04-01 10:17:42 -04:00
Mark Backman
ea61256ddc Merge pull request #1496 from pipecat-ai/mb/gladia-model
Update GladiaSTTService default model
2025-04-01 08:52:13 -04:00
Mark Backman
babafadbe4 Merge pull request #1494 from pipecat-ai/mb/p2p-examples-gitignore
Add .gitignore to p2p video-transform example
2025-04-01 07:39:35 -04:00
Mark Backman
a5660f6dc7 Add .gitignore to p2p video-transform example 2025-04-01 07:20:39 -04:00
Aleix Conchillo Flaqué
64ad916c5f Merge pull request #1492 from pipecat-ai/aleix/downgrade-to-aiohttp-3.11.12
pyproject: downgrade to aiohttp 3.11.12
2025-03-31 19:01:04 -07:00
Aleix Conchillo Flaqué
13d0563298 pyproject: downgrade to aiohttp 3.11.12
See https://pypi.org/project/aiohttp/#history
2025-03-31 18:59:41 -07:00
Mark Backman
20a1dd066d Update GladiaSTTService default model 2025-03-31 19:02:28 -04:00
Mark Backman
56f6e3ceb4 Merge pull request #1490 from pipecat-ai/fix_ruff_format
Fixing ruff format.
2025-03-31 18:37:19 -04:00
Mark Backman
3afab63870 Merge pull request #1488 from pipecat-ai/mb/stt-mute-filter-logline
Clarify the mute/unmute log line in STTMuteFilter
2025-03-31 18:35:47 -04:00
Filipi Fuchter
d3b9a0aab0 Fixing ruff format. 2025-03-31 19:17:40 -03:00
Filipi da Silva Fuchter
6b21081a7d Merge pull request #1487 from pipecat-ai/smallwebrtc_ios_support
SmallWebRTCTransport: Improvements to work with mobile
2025-03-31 19:10:03 -03:00
Aleix Conchillo Flaqué
648bdea64c fix formatting 2025-03-31 15:04:45 -07:00
milo157
ed387e876a Merge pull request #1486 from CerebriumAI/feature/ultravox
Feature/ultravox - bug fixes
2025-03-31 15:03:26 -07:00
Aleix Conchillo Flaqué
2fb9aa4d76 Merge pull request #1489 from pipecat-ai/aleix/base-ai-services-restructure
services: restructure base AI services into modules
2025-03-31 15:00:13 -07:00
Aleix Conchillo Flaqué
9eba8f1637 services: restructure base AI services into modules 2025-03-31 13:53:36 -07:00
Mark Backman
43c255f58a Clarify the mute/unmute log line in STTMuteFilter 2025-03-31 16:45:02 -04:00
Filipi Fuchter
121e70a029 Improvements on the video transform example to work on mobile. 2025-03-31 17:11:38 -03:00
Filipi Fuchter
70e28a0547 Adding support to yuvj420p which is the format that we receive from mobile iOS. 2025-03-31 13:12:20 -03:00
Mark Backman
c9a93f2504 Merge pull request #1469 from pipecat-ai/mb/update-gladia
Refactor GladiaSTTService to support addition params
2025-03-31 11:18:32 -04:00
Adithya Suresh
e8783f6a33 Handle cache token counts being none 2025-03-31 15:25:11 +11:00
Mark Backman
8a12470efd Reorganize into a directory 2025-03-30 20:01:40 -04:00
Mark Backman
05d53bc66f Refactor GladiaSTTService; add support for additional params 2025-03-30 19:54:55 -04:00
Aleix Conchillo Flaqué
e763cd7bee Merge pull request #1471 from pipecat-ai/aleix/services-restructure
services: restructure services into folders
2025-03-30 16:23:26 -07:00
Aleix Conchillo Flaqué
94ec5118e6 track already reported deprecated modules (mark's update) 2025-03-30 16:21:00 -07:00
Aleix Conchillo Flaqué
7203ef6885 examples: use new services packages 2025-03-30 16:21:00 -07:00
Aleix Conchillo Flaqué
3074a62bb1 services: restructure services into folders 2025-03-30 16:21:00 -07:00
Mark Backman
31712b84ac Merge pull request #1479 from pipecat-ai/mb/qwen-pyproject-entry 2025-03-29 22:36:40 -04:00
Mark Backman
c99ec0b0b7 Add placeholder entry for qwen to pyproject.toml 2025-03-29 20:20:48 -04:00
Mark Backman
cd7abd2962 Merge pull request #1478 from pipecat-ai/mb/alibaba-cloud-offerings
Add QwenLLMService
2025-03-29 20:13:21 -04:00
Mark Backman
c7544954cf Merge pull request #1476 from pipecat-ai/mb/ref-docs-mem0-mlx-whisper
Update reference docs generation for mem0 and mlx-whisper
2025-03-29 20:12:58 -04:00
Mark Backman
4f390b15a3 Merge pull request #1477 from pipecat-ai/mb/fix-mem0-example-number
Renumber mem0 example, small changelog updates
2025-03-29 20:04:45 -04:00
Mark Backman
f2a05b065d Add QwenLLMService 2025-03-29 19:43:37 -04:00
Mark Backman
5d5041eb2b Renumber mem0 example, small changelog updates 2025-03-29 18:45:39 -04:00
Mark Backman
f4dc66cb13 Update reference docs generation for mem0 and mlx-whisper 2025-03-29 18:42:08 -04:00
Mark Backman
b88744b18d Merge pull request #1475 from pipecat-ai/khk/whisper-mlx-example
Example and CHANGELOG for WhisperSTTServiceMLX service
2025-03-29 18:09:17 -04:00
Kwindla Hultman Kramer
209de2638d WhisperSTTServiceMLX example and CHANGELOG 2025-03-29 18:04:07 -04:00
Mark Backman
5d829fb6a9 Merge pull request #1474 from pipecat-ai/khk/mem0-changelog
Changelog entry for mem0 service
2025-03-29 18:02:32 -04:00
Mark Backman
a978a5cd4a Fix Whisper formatting 2025-03-29 17:57:50 -04:00
Mark Backman
b9ea3f0fd9 Update README, organize pyproject.toml 2025-03-29 17:56:17 -04:00
Kwindla Hultman Kramer
d2f5ee2915 Changelog entry for mem0 service 2025-03-29 17:55:26 -04:00
Mark Backman
acddddc508 Merge pull request #1472 from pipecat-ai/mb/small-webrtc-readme
Add README link for SmallWebRTCTransport
2025-03-29 17:38:15 -04:00
Kwindla Hultman Kramer
0c2c6fa771 Merge pull request #1383 from zboyles/add-mlx-whisper
Added Support for MLX Whisper models on Apple M-Series
2025-03-29 14:25:37 -07:00
Mark Backman
80088c6138 Merge pull request #1473 from pipecat-ai/mb/ref-docs-updates
Update packages for auto-generating docs
2025-03-29 17:20:46 -04:00
Kwindla Hultman Kramer
766639a9a4 Merge pull request #1388 from deshraj/user/dyadav/mem0-integration
Added mem0 service.
2025-03-29 13:12:58 -07:00
Mark Backman
675e2b1498 Update packages for auto-generating docs 2025-03-29 08:21:58 -04:00
Mark Backman
af6c23f7b1 Add README link for SmallWebRTCTransport 2025-03-28 21:29:24 -04:00
Aleix Conchillo Flaqué
d212e88030 Merge pull request #1468 from pipecat-ai/aleix/smallwebrtc-updates
transports(webrtc): some SmallWebRTC updates
2025-03-28 14:41:45 -07:00
Aleix Conchillo Flaqué
d6758bf2ad transports(webrtc): rename appMessage to app-message 2025-03-28 14:35:11 -07:00
Filipi Fuchter
5abfb15300 Registering the event handlers and fixing the examples. 2025-03-28 17:30:06 -03:00
Aleix Conchillo Flaqué
f576254d61 transports(webrtc): some SmallWebRTC updates 2025-03-28 13:19:23 -07:00
Aleix Conchillo Flaqué
a90807a3d2 Merge pull request #1465 from roey-priel/main
Tavus / Deepgram TTS compatibility
2025-03-28 08:43:09 -07:00
roey
a06fc4ce50 yield outside of the loop 2025-03-28 08:41:36 -07:00
roey
80cb4497f0 Merge pull request #1 from roey-priel/deepgram-tts-tavus-compatibility
Update deepgram.py
2025-03-27 17:06:33 -07:00
roey
8aa878c5e9 Update deepgram.py 2025-03-27 17:05:29 -07:00
Filipi da Silva Fuchter
e982b3d919 Merge pull request #1290 from pipecat-ai/aiortc_example
P2P WebRTC transport option to Pipecat
2025-03-27 18:29:44 -03:00
Filipi Fuchter
8945fd1fc6 Starting the server by default as localhost. 2025-03-27 18:27:56 -03:00
Filipi Fuchter
16b97d151b Adding the SmallWebRTCTransport to the changelog. 2025-03-27 17:56:12 -03:00
Filipi Fuchter
f7ac142ad2 Merge branch 'main' into aiortc_example 2025-03-27 17:50:46 -03:00
Filipi da Silva Fuchter
2355067f61 Merge pull request #1441 from pipecat-ai/aiortc_example_small_webrtc_transport
P2P WebRTC transport - example improvements.
2025-03-27 17:49:12 -03:00
Filipi Fuchter
76f9626d35 Using the @pipecat-ai/small-webrtc-transport from npm. 2025-03-27 17:48:32 -03:00
Filipi da Silva Fuchter
f82c2566e8 Merge pull request #1270 from pipecat-ai/improve_protobuf_serializer
Added support to `ProtobufFrameSerializer` to send the transport messages
2025-03-27 17:28:37 -03:00
Filipi Fuchter
b6007bb3d6 Added support to ProtobufFrameSerializer to send the transport messages 2025-03-27 17:26:03 -03:00
Filipi Fuchter
311a5360ad Renaming the example to p2p-webrtc 2025-03-27 16:46:00 -03:00
Filipi Fuchter
62cb0376f2 Changing the file types. 2025-03-27 16:34:40 -03:00
Filipi Fuchter
91a69b7029 Improving the readmes for the webrtc examples. 2025-03-27 16:32:46 -03:00
Mark Backman
1d4d7f28a1 Merge pull request #1463 from pipecat-ai/mb/add-piper-readme
Add Piper to README
2025-03-27 08:52:31 -04:00
Mark Backman
a55a7bbb96 Add Piper to README 2025-03-27 08:03:16 -04:00
Mark Backman
a394b35e85 Merge pull request #1459 from pipecat-ai/mb/issue-1454
Fix: GoogleTTSService was emitting two TTSStoppedFrames
2025-03-27 08:00:16 -04:00
Mark Backman
aa85df4fd6 Fix: GoogleTTSService was emitting two TTSStoppedFrames 2025-03-27 07:55:19 -04:00
Filipi da Silva Fuchter
3bb1f5f7a8 Merge pull request #1130 from pedro-a-n-moreira/piper-tts
Add support for Piper TTS
2025-03-27 08:08:05 -03:00
Filipi Fuchter
7c115f9d59 Merge branch 'main' into piper-tts
# Conflicts:
#	CHANGELOG.md
2025-03-27 08:01:38 -03:00
Filipi Fuchter
a82b847971 Fixing ruff format. 2025-03-27 07:58:53 -03:00
Filipi Fuchter
50515aa842 Adding PiperTTSService to the changelog. 2025-03-27 07:50:47 -03:00
Filipi Fuchter
b348fde32b Refactoring PiperTTSService to match the others TTS services provided by Pipecat and fixing noise issue due to wav header. 2025-03-27 07:46:38 -03:00
Filipi Fuchter
45787520b2 Refactoring the piper test to use run_test provided by Pipecat 2025-03-27 07:45:28 -03:00
Filipi Fuchter
053bf72da2 Adding pytest-aiohttp to the dev requirements. 2025-03-27 07:44:46 -03:00
Filipi Fuchter
ca4893397a Creating a foundational example which uses the piper service. 2025-03-27 07:44:26 -03:00
Filipi Fuchter
c1f6a4e079 Adding PIPER_BASE_URL to the env template. 2025-03-27 07:44:05 -03:00
Aleix Conchillo Flaqué
135ed811f1 Merge pull request #1460 from pipecat-ai/aleix/segmented-tts-ignore-emulated-frames
segmented tts ignore emulated frames
2025-03-26 16:03:59 -07:00
Aleix Conchillo Flaqué
055a3f1c53 LLMAssistantContextAggregator: stop emulations if the user starts speaking 2025-03-26 14:39:12 -07:00
Aleix Conchillo Flaqué
750bb88586 SegmentedSTTService: ignore emulated frames 2025-03-26 14:38:48 -07:00
Aleix Conchillo Flaqué
c4f9171fe1 frames: indicate if UserStartedSpeakingFrame/UserStoppedSpeakingFrame are emulated 2025-03-26 14:37:36 -07:00
Filipi Fuchter
d223201c3f Merge branch 'main' into piper-tts
# Conflicts:
#	test-requirements.txt
2025-03-26 16:47:45 -03:00
Mark Backman
86701fd3c7 Merge pull request #1457 from pipecat-ai/mb/fix-rtvi-observer-gemini
Fix: Resolve an issue where Google LLM context messages were causing …
2025-03-26 14:18:37 -04:00
Mark Backman
b414077a07 Fix: Resolve an issue where Google LLM context messages were causing a TypeError 2025-03-26 13:55:42 -04:00
kompfner
15f23929e9 Merge pull request #1455 from pipecat-ai/prepare-0.0.61
Update CHANGELOG for 0.0.61
2025-03-26 13:50:29 -04:00
Mark Backman
cc9e4047d0 Merge pull request #1447 from nicougou/feat/support_tts_instruct
feature/support instructions in OpenAITTSService
2025-03-26 13:35:41 -04:00
Paul Kompfner
4ef4dcefce Update CHANGELOG for 0.0.61 2025-03-26 13:06:31 -04:00
kompfner
f3caa8cf7a Merge pull request #1452 from pipecat-ai/daily-python-0.16.1
Bump daily-python dependency to 0.16.1 to pick up a bugfix
2025-03-26 13:01:38 -04:00
Mark Backman
e5470fec7a Merge pull request #1453 from pipecat-ai/khk/groq
New GroqTTSService
2025-03-26 12:49:18 -04:00
Mark Backman
887c197bce Add sample_rate to the constructor 2025-03-26 12:29:40 -04:00
Kwindla Hultman Kramer
f5d49fea81 try/catch import of groq SDK 2025-03-26 12:29:40 -04:00
Kwindla Hultman Kramer
e087f6ec5d GroqTTSService added to CHANGELOG.md 2025-03-26 12:29:39 -04:00
Kwindla Hultman Kramer
406f5a395b fix class heirarchy and audio chunking 2025-03-26 12:29:18 -04:00
Kwindla Hultman Kramer
060bb4c26b wip 2025-03-26 12:29:18 -04:00
Nico
499e69846d review: add changelog entries 2025-03-26 17:13:30 +01:00
Paul Kompfner
e6e339a02e Bump daily-python dependency to 0.16.1 to pick up a bugfix 2025-03-26 11:22:23 -04:00
Nico
dc2ee2bf0a review: remove websocket_base_url 2025-03-26 15:41:42 +01:00
Nico
d982fc35d8 fix: formatter 2025-03-26 15:41:42 +01:00
Nico
72d373e565 feature/support instructions in OpenAITTSService 2025-03-26 15:41:42 +01:00
Aleix Conchillo Flaqué
59fdfe697d Merge pull request #1449 from pipecat-ai/aleix/google-assistant-aggregator-function-call-result
GoogleAssistantContextAggregator: allow any value as function call result
2025-03-26 07:25:34 -07:00
Filipi da Silva Fuchter
97c9e0676e Merge pull request #1451 from pipecat-ai/set-tool-choice-from-context-aggregator
Set tool choice from context aggregator
2025-03-26 09:12:26 -03:00
Filipi Fuchter
aeac40312e Added the feature to change dynamically the tool choice to the changelog. 2025-03-26 09:06:29 -03:00
Filipi Fuchter
ce9f75a851 Fixing the tool choice extra type to be a dict instead of string. 2025-03-26 08:17:50 -03:00
Filipi Fuchter
c45d852f6b Merge branch 'main' into set-tool-choice-from-context-aggregator
# Conflicts:
#	src/pipecat/processors/aggregators/llm_response.py
2025-03-26 07:14:57 -03:00
Deshraj Yadav
55cc1fe9f6 Fix import lines 2025-03-25 23:35:47 -07:00
Deshraj Yadav
1ba7e2d6fa Format imports properly 2025-03-25 23:30:01 -07:00
Deshraj Yadav
1b8d326b49 Run ruff 2025-03-25 23:15:35 -07:00
Aleix Conchillo Flaqué
077952b658 GoogleAssistantContextAggregator: allow any value as function call result 2025-03-25 19:11:27 -07:00
Deshraj Yadav
e694971423 Merge pull request #2 from pipecat-ai/khk/mem0
small changes to make 35-mem0.py
2025-03-25 18:10:36 -07:00
Kwindla Hultman Kramer
d00ae492e5 small changes to make 35-mem0.py like the other foundational single-file examples. 2025-03-25 15:51:38 -07:00
Aleix Conchillo Flaqué
9450b07ec5 Merge pull request #1442 from pipecat-ai/aleix/on-context-updated-as-task
LLMAssistantContextAggregator: create a task to run on_context_updated
2025-03-25 15:39:36 -07:00
Aleix Conchillo Flaqué
19b464ba23 tests: add assistant aggregator function call frame handling 2025-03-25 15:37:06 -07:00
Aleix Conchillo Flaqué
8aebf00c2d GoogleAssistantContextAggregator: function call result should be a JSON object 2025-03-25 15:37:06 -07:00
Aleix Conchillo Flaqué
01458895c2 LLMAssistantContextAggregator: create a task to run on_context_updated 2025-03-25 14:37:11 -07:00
kompfner
2082d023ef Merge pull request #1448 from pipecat-ai/daily-python-0.16.0
Bump daily-python dependency to 0.16.0 to pick up support in `DailyTr…
2025-03-25 17:32:38 -04:00
Paul Kompfner
c99436b80e Bump daily-python dependency to 0.16.0 to pick up support in DailyTransport for updating remote participants' canReceive permission via the update_remote_participants() method 2025-03-25 17:29:48 -04:00
Filipi Fuchter
f884c93826 Refactoring the video-transform example to use pipecat client. 2025-03-25 17:32:25 -03:00
Deshraj Yadav
2780c6eed6 Incorporate suggestions 2025-03-25 10:45:08 -07:00
Deshraj Yadav
7ad36eeaf4 Add mem0 as a service integration 2025-03-25 10:44:12 -07:00
Filipi Fuchter
67a93d09c2 Merge branch 'main' into aiortc_example 2025-03-25 10:31:53 -03:00
Aleix Conchillo Flaqué
f3b50bc3c4 Revert "LLMAssistantContextAggregator: create a task to run on_context_updated"
This reverts commit 397bae29f7.
2025-03-24 15:40:26 -07:00
Aleix Conchillo Flaqué
397bae29f7 LLMAssistantContextAggregator: create a task to run on_context_updated 2025-03-24 15:39:35 -07:00
Mark Backman
3b3fdd0da1 Merge pull request #1439 from pipecat-ai/mb/fix-rtvi-bot-speaking-events
Fix: RTVIObserver now outputs a single bot started and stopped speaki…
2025-03-24 11:44:31 -04:00
Mark Backman
a9b1298f3b Fix: RTVIObserver now outputs a single bot started and stopped speaking event per turn 2025-03-24 10:25:31 -04:00
Filipi Fuchter
2fcf4e6d70 Fixing ruff format 2025-03-24 11:23:55 -03:00
Filipi Fuchter
fcb8b9a5b3 Refactoring how we are creating the answer so we don't need to wait for the client gathering all ice candidates. 2025-03-24 11:12:41 -03:00
Filipi Fuchter
fee0409f63 Logging if the remote peer supports trickle ice. 2025-03-24 08:59:21 -03:00
Filipi Fuchter
3be6973e2c Adding support to define ice servers. 2025-03-24 08:57:24 -03:00
Filipi Fuchter
5184d178ef Merge branch 'main' into aiortc_example 2025-03-24 08:37:08 -03:00
Thomas B.
48e8d3968a fix: recognition language correctly set for Azure STT (#1436) 2025-03-23 19:29:52 -07:00
Aleix Conchillo Flaqué
59644a939a Merge pull request #1434 from pipecat-ai/aleix/examples-07-interruptible-local
examples: add foundational 07x-interruptible-local.py
2025-03-23 05:44:40 -07:00
Aleix Conchillo Flaqué
3311afc581 examples: add foundational 07x-interruptible-local.py 2025-03-22 21:58:55 -07:00
Filipi da Silva Fuchter
a3ccbf91f7 Merge pull request #1429 from pipecat-ai/fixing_set_tool_issue
Only checking the length if tools is a list.
2025-03-21 13:56:45 -03:00
Filipi Fuchter
3ed764a769 Only checking the length if tools is a list. 2025-03-21 12:56:05 -03:00
Mark Backman
be8d5a31f5 Merge pull request #1425 from Allenmylath/patch-25
Update env.example
2025-03-21 08:39:03 -04:00
Mark Backman
480bcc1ab1 Merge pull request #1424 from Allenmylath/patch-24
Update requirements.txt
2025-03-21 08:38:54 -04:00
allenmylath
dd81048ddb Update env.example
EXAMPLE USES CARTESI NOT ELEVNE LABS
2025-03-21 10:11:28 +05:30
allenmylath
04d462ff02 Update requirements.txt
example uses cartesia not elevenlabs
2025-03-21 10:09:09 +05:30
Aleix Conchillo Flaqué
7e7aaeddd9 Merge pull request #1423 from pipecat-ai/aleix/elevenlabs-pcm-8000
ElevenLabs: add support for a sample rate of 8000
2025-03-20 19:34:16 -07:00
Aleix Conchillo Flaqué
e77f7c8456 update ruff and pyright versions 2025-03-20 19:19:08 -07:00
Aleix Conchillo Flaqué
442f18d47b ultravox: fix formatting 2025-03-20 19:19:08 -07:00
Aleix Conchillo Flaqué
fc78e6fc5a ElevenLabs: add support for a sample rate of 8000 2025-03-20 19:13:23 -07:00
Aleix Conchillo Flaqué
d71b520153 update CHANGELOG.md and fix formatting 2025-03-20 18:58:06 -07:00
milo157
3b4d91e1c1 Fixed ultravox service bugs (#1420) 2025-03-20 18:55:43 -07:00
Aleix Conchillo Flaqué
09c62d939a Merge pull request #1422 from pipecat-ai/aleix/pipecat-0.0.60
update CHANGELOG for 0.0.60
2025-03-20 16:25:52 -07:00
Aleix Conchillo Flaqué
f2b9789acf update CHANGELOG for 0.0.60 2025-03-20 16:17:34 -07:00
Aleix Conchillo Flaqué
1592703e77 Merge pull request #1421 from pipecat-ai/aleix/rollback-deepgram-to-3.8.0
pyproject: rollback deepgram-sdk to 3.8.0
2025-03-20 16:16:08 -07:00
Aleix Conchillo Flaqué
66e42ae410 pyproject: rollback deepgram-sdk to 3.8.0 2025-03-20 16:15:43 -07:00
Mark Backman
8d6dbbe293 Merge pull request #1417 from pipecat-ai/mb/update-realtime-transcription
Update InputAudioTranscription to use gpt-4o-transcribe model, update…
2025-03-20 18:49:06 -04:00
Mark Backman
2ac8f2ec2d Fix linting 2025-03-20 18:40:16 -04:00
Paul Kompfner
41688205be Provide new settings in OpenAI Realtime example 2025-03-20 18:23:25 -04:00
Mark Backman
541a4b6063 Update InputAudioTranscription to use gpt-4o-transcribe model, update 19 examples to use FunctionSchema 2025-03-20 18:23:24 -04:00
Aleix Conchillo Flaqué
8f6d92ce7d update CHANGELOG with BaseOpenAILLMService default_headers 2025-03-20 13:47:15 -07:00
Aleix Conchillo Flaqué
96fa6c19a8 Merge pull request #1398 from nicougou/feature/openai_custom_headers
feature: add custom headers to AsyncOpenAI
2025-03-20 13:45:57 -07:00
Varun Singh
c9f7882728 initial commit 2025-03-20 12:31:08 -07:00
Aleix Conchillo Flaqué
0fdd577ae7 Merge pull request #1416 from pipecat-ai/aleix/pipecat-0.0.59
update CHANGELOG for 0.0.59
2025-03-20 11:48:14 -07:00
Aleix Conchillo Flaqué
2133152e5b update CHANGELOG for 0.0.59 2025-03-20 11:42:54 -07:00
Aleix Conchillo Flaqué
c3f3f4603d Merge pull request #1413 from pipecat-ai/aleix/llm-user-aggregator-emulate-fixes
LLMUserContextAggregator: fix emulated user started/stopped speaking issues
2025-03-20 11:41:26 -07:00
Aleix Conchillo Flaqué
b20ce7d655 examples: move 07u-interruptible-neuphonic to 07v 2025-03-20 11:38:29 -07:00
Aleix Conchillo Flaqué
66ba1116a4 pyproject: rollback azure to 1.42.0 2025-03-20 11:23:40 -07:00
Aleix Conchillo Flaqué
08956e914a livekit: remove unnecessary transport cleanup() function 2025-03-20 11:23:40 -07:00
Aleix Conchillo Flaqué
5a39f146f6 LLMUserContextAggregator: fix emulated user started/stopped speaking issues 2025-03-20 11:23:40 -07:00
kompfner
de8a831ee1 Merge pull request #1414 from pipecat-ai/march-main
March OpenAI updates
2025-03-20 14:22:09 -04:00
Aleix Conchillo Flaqué
efa5f133d7 openai_realtime: fix and update function calling 2025-03-20 11:14:59 -07:00
Paul Kompfner
44380bc8c0 Remove duplicate changelog entry due to rebase mistake 2025-03-20 13:51:16 -04:00
Paul Kompfner
721ee75887 Comment tweak 2025-03-20 13:43:00 -04:00
Paul Kompfner
ada68f0699 More robust handling of conversation item retrieval errors in OpenAIRealtimeBetaLLMService 2025-03-20 13:43:00 -04:00
Mark Backman
70dbf0d6fc Updated default models for OpenAISTTService and OpenAITTSService to gpt-4o based models 2025-03-20 13:42:56 -04:00
Paul Kompfner
f0774268cc Rename gpt-4o-transcribe-latest to gpt-4o-transcribe in OpenAIRealtimeBetaLLMService 2025-03-20 13:39:40 -04:00
Chad Bailey
2ae5bdd8a9 lets talk about dogs 2025-03-20 13:39:40 -04:00
Chad Bailey
0d74bcacb7 updated models in the 07g example 2025-03-20 13:39:40 -04:00
Paul Kompfner
f94a099111 Revert the default model to be "gpt-4o-realtime-preview-2024-12-17" In OpenAIRealtimeBetaLLMService 2025-03-20 13:39:36 -04:00
Paul Kompfner
3dd4ef7230 Tweak changelog entries describing slate of recent updates to OpenAIRealtimeBetaLLMService 2025-03-20 13:36:22 -04:00
Paul Kompfner
e707efbffa Update changelog with slate of recent updates to OpenAIRealtimeBetaLLMService 2025-03-20 13:35:12 -04:00
Paul Kompfner
7b594093dd Handle the possibility of multiple concurrent calls to retrieve_conversation_item() in the OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
31317ce77d Add error handling to the retrieve_conversation_item() method of the OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
f693a3c70f Add retrieve_conversation_item() method to OpenAIRealtimeBetaLLMService, using the new conversation.item.retrieve introspection message. 2025-03-20 13:31:28 -04:00
Paul Kompfner
39ca607bbb Add on_conversation_item_created and on_conversation_item_updated events to OpenAIRealtimeBetaLLMService.
The hope is that this will expose to the user conversation item ids at relevant times for them to use with the new `conversation.item.retrieve` introspection message.
2025-03-20 13:31:28 -04:00
Paul Kompfner
9840abd85b Make it so you specifying model=None when creating a InputAudioTranscription results in a validation error 2025-03-20 13:31:28 -04:00
Paul Kompfner
1075c25055 Add new semantic turn detection option to OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
e91610c69e linter fix 2025-03-20 13:31:28 -04:00
Paul Kompfner
1a20d9bed7 Add new input_audio_noise_reduction option to OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
d009b80438 Add new GPT-4o transcription option to OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
kompfner
fe5fc30211 Revert "Add new GPT-4o transcription option to OpenAIRealtimeBetaLLMService" 2025-03-20 13:31:28 -04:00
Paul Kompfner
be2cf6d556 formatting fix 2025-03-20 13:31:28 -04:00
Paul Kompfner
e80bfe22de Add new GPT-4o transcription option to OpenAIRealtimeBetaLLMService 2025-03-20 13:31:28 -04:00
Paul Kompfner
214c8f79eb linter fix 2025-03-20 13:31:28 -04:00
Paul Kompfner
16accafa6d formatting fix 2025-03-20 13:31:28 -04:00
Kwindla Hultman Kramer
4449e9a25b add response.done status=failed error 2025-03-20 13:31:28 -04:00
Kwindla Hultman Kramer
bfdf52bd69 change examples/foundational/19-openai-realtime-beta.py to use the new preview model 2025-03-20 13:31:28 -04:00
Kwindla Hultman Kramer
2b4debec11 add support for conversation.item.input_audio_transcription.delta 2025-03-20 13:31:28 -04:00
Mark Backman
f4626287cd Merge pull request #1411 from pipecat-ai/mb/add-fal-wizper
Add FalSTTService
2025-03-20 13:08:08 -04:00
Mark Backman
e4bb4aacb4 Example: Rename 07 ultravox example 2025-03-20 12:46:00 -04:00
Mark Backman
f298febacf Add FalSTTService 2025-03-20 12:45:16 -04:00
Aleix Conchillo Flaqué
c51291190b Merge pull request #1394 from pipecat-ai/aleix/function-calls-as-tasks
function calls as tasks
2025-03-20 09:34:37 -07:00
Aleix Conchillo Flaqué
e0c3f6ad83 services: mark function calls as completed even the result is None 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
b1d506c137 GoogleAssistantContextAggregator: properly update function response 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
1f6ed01ba6 LLMAssistantContextAggregator: remove tool call id with image requests 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
3e9678db84 user image requests can now be related to function calls 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
d455fd070e update CHANGELOG 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
d1550d5a85 tests: remove TestFrameProcessor, reimplement with run_test() 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
c15286b148 examples: deprecate start_callback from LLMService.register_function() 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
a98000fd1d function calling now run in tasks 2025-03-20 08:51:25 -07:00
Aleix Conchillo Flaqué
fc06306efd Merge pull request #1406 from pipecat-ai/aleix/pipeline-task-idle-timeouts
PipelineTask: automatically cancel tasks if pipeline is idle
2025-03-20 08:37:39 -07:00
Mark Backman
039fa59165 Merge pull request #1409 from pipecat-ai/aleix/segmented-stt-service-vad-events
SegmentedSTTService: use VAD events to detect valid audio
2025-03-20 09:11:08 -04:00
Aleix Conchillo Flaqué
0e14cec139 pyproject: update multiple libraries 2025-03-20 01:22:33 -07:00
Aleix Conchillo Flaqué
2417ec4f92 LLMUserContextAggregator: increase bot_interruption_timeout to 5 seconds 2025-03-20 01:20:34 -07:00
Aleix Conchillo Flaqué
7cdcd1c3d1 OpenAITTSService: allow specifying any model name 2025-03-20 01:20:34 -07:00
Aleix Conchillo Flaqué
b6be25ab84 SegmentedSTTService: use VAD events to detect valid audio 2025-03-20 00:31:49 -07:00
Aleix Conchillo Flaqué
e18d9f6a11 PipelineTask: automatically cancel tasks if pipeline is idle 2025-03-19 23:30:46 -07:00
Mark Backman
3a73346a41 Merge pull request #1408 from pipecat-ai/mb/claude-models-example
Update to Claude 3.7 Sonnet latest in examples
2025-03-20 01:44:59 -04:00
Aleix Conchillo Flaqué
8d58d1c8bb Merge pull request #1404 from pipecat-ai/aleix/gemini-push-frame-fixes
GeminiMultimodalLiveLLMService: fix duplicated messages in context
2025-03-19 21:51:39 -07:00
Mark Backman
07a77e066f Update to Claude 3.7 Sonnet latest in examples 2025-03-19 23:18:30 -04:00
Aleix Conchillo Flaqué
3024896d3d Merge pull request #1405 from pipecat-ai/aleix/tts-services-fallback
WebsocketTTSService: add `on_connection_error` and `reconnect_on_error`
2025-03-19 19:39:51 -07:00
Aleix Conchillo Flaqué
a3b5e4413a WebsocketTTSService: add on_connection_error and reconnect_on_error 2025-03-19 19:38:08 -07:00
Aleix Conchillo Flaqué
f31e77c4f6 pyproject: added empty tavus dependencies 2025-03-19 18:43:07 -07:00
Aleix Conchillo Flaqué
8942c2e053 GeminiMultimodalLiveLLMService: fix duplicated messages in context
Fixes #1384
2025-03-19 15:33:54 -07:00
Aleix Conchillo Flaqué
afb26be0ad Merge pull request #1396 from pipecat-ai/aleix/stt-service-audio-passthrough
SegmentedSTTService: allow audio to pass-through downstream
2025-03-19 11:16:40 -07:00
Aleix Conchillo Flaqué
48d73a2636 SegmentedSTTService: allow audio to pass-through downstream 2025-03-19 11:06:12 -07:00
Aleix Conchillo Flaqué
da531dabfd Merge pull request #1304 from pipecat-ai/aleix/handle-emails-user-email-gathering
add skip tags aggregator to support TTS service spelling out tags
2025-03-19 11:05:10 -07:00
Aleix Conchillo Flaqué
336e2f1579 TTSServices: for now just specify a single text aggregator 2025-03-19 11:02:29 -07:00
Aleix Conchillo Flaqué
fc0f404d26 examples: add new 36-user-email-gathering.py 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
54620133d4 services: add spelling out support to CartesiaTTSService and RimeTTSService 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
e7224473f2 utils(text): add new SkipTagsAggregator 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
1a3a268c9d utils(string): add new function parse_start_end_tags() 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
11984b89b7 utils(string): add support for floating point numbers 2025-03-19 10:57:29 -07:00
Aleix Conchillo Flaqué
1dbad2326a utils(string): support email addresses in end of sentence matching 2025-03-19 10:57:27 -07:00
Mark Backman
2e0c6c2bd1 Merge pull request #1397 from pipecat-ai/mb/disconnect-bot
Fix: RTVI message disconnect-bot now pushes EndTaskFrame
2025-03-19 10:45:24 -04:00
Nico
5f28834588 feature: add custom headers to AsyncOpenAI 2025-03-19 14:49:51 +01:00
Mark Backman
7f1ccab445 Fix: RTVI message disconnect-bot now pushes EndTaskFrame 2025-03-19 07:07:45 -04:00
Aleix Conchillo Flaqué
7ddac4eb88 Merge pull request #1395 from pipecat-ai/aleix/multiple-text-filters-and-aggregators
TTSService: allow passing multiple text filters and aggregators
2025-03-18 21:25:29 -07:00
Aleix Conchillo Flaqué
514ecda755 TTSService: allow passing multiple text filters and aggregators 2025-03-18 17:31:01 -07:00
balalo
48b6850df4 allow other function names 2025-03-18 20:45:31 +01:00
Aleix Conchillo Flaqué
71a38a120e Merge pull request #1376 from pipecat-ai/aleix/event-handlers-as-tasks
event handlers are now executed in separate tasks
2025-03-18 12:10:34 -07:00
Mark Backman
79616de7a4 Merge pull request #1392 from pipecat-ai/mb/fix-google-stt-timeout
Fix an issue where GoogleSTTService would timeout due to stream inact…
2025-03-18 14:17:44 -04:00
Mark Backman
6368fbe0dd Merge pull request #1318 from Vaibhav159/vl_google_vertex_llm
adding vertex google llm
2025-03-18 14:17:21 -04:00
Mark Backman
5dc8b48fbe Fix an issue where GoogleSTTService would timeout due to stream inactivity 2025-03-18 14:06:32 -04:00
Aleix Conchillo Flaqué
9112ff114f Merge pull request #1359 from lucasrothman/tavus-output-sample-rate
Tavus support for custom output rate
2025-03-18 10:16:34 -07:00
Aleix Conchillo Flaqué
32609b1132 event handlers are now executed in separate tasks 2025-03-18 09:25:39 -07:00
Vaibhav159
4303ed4991 rename service 2025-03-18 20:58:21 +05:30
Mark Backman
4677c34663 Merge pull request #1387 from pipecat-ai/mb/pattern-aggregator
Add PatternPairAggregator
2025-03-18 08:46:42 -04:00
Mark Backman
b28276446d Code review feedback 2025-03-18 07:49:54 -04:00
Mark Backman
2dee882710 Add unit tests 2025-03-18 07:30:37 -04:00
Mark Backman
6ec4052f29 Add CHANGELOG entries 2025-03-18 07:30:36 -04:00
Mark Backman
ddcc1fbb2f Add foundational example 35 2025-03-18 07:30:11 -04:00
Mark Backman
e731a0d41f Add PairPatternAggregator 2025-03-18 07:30:11 -04:00
Mark Backman
4918eab4e8 Merge pull request #1371 from pipecat-ai/mb/openai-realtime-transcription
Add TranscriptProcessor support for OpenAIRealtimeBetaLLMService
2025-03-18 07:28:07 -04:00
Mark Backman
11987765d8 Merge pull request #1381 from pipecat-ai/mb/recording-example-stt
Update the 34-audio-recording.py example to include an STT processor
2025-03-18 07:20:42 -04:00
Mark Backman
6f09ee25b8 Merge pull request #1385 from pipecat-ai/mb/add-neuphonic-readme
Add Google Imagen and Neuphonic TTS to README
2025-03-18 07:20:15 -04:00
Mark Backman
83dda8a759 Merge pull request #1390 from adnansiddiquei/add-neuphonic-languages
Added 5 new languages for Neuphonic: FR, PT, RU, ZH, HI.
2025-03-18 07:18:27 -04:00
Adnan Siddiquei
188677e601 Added 4 new languages: FR, PT, RU, ZH, HI. 2025-03-18 10:35:22 +00:00
balalo
dc5067407d Fix ruff check 2025-03-18 11:12:51 +01:00
balalo
1c19777d5e Fix format 2025-03-18 11:09:40 +01:00
balalo
2e1a18503b Set tool choice from context aggregator 2025-03-18 10:41:43 +01:00
Lucas Rothman
c57fa93a70 Renamed to sample_rate 2025-03-17 16:22:36 -07:00
Mark Backman
6885d07e88 Simplify the TranscriptProcessor _emit_aggregated_text logic 2025-03-17 16:36:03 -04:00
Mark Backman
acd0660f66 Update GeminiMultimodalLiveLLMService to work with the TranscriptProcessor 2025-03-17 16:36:03 -04:00
Mark Backman
3f002f8ffb Remove unnecessary TranscriptProcessor examples 2025-03-17 16:36:02 -04:00
Mark Backman
d5776c27f4 Update 19-openai-realtime-beta 2025-03-17 16:35:35 -04:00
Mark Backman
6e6905405b Update CHANGELOG 2025-03-17 16:35:35 -04:00
Mark Backman
571c10403f tests: Add additional coverage to test_transcript_processor 2025-03-17 16:35:35 -04:00
Mark Backman
5b6b700214 OpenAIRealtimeBetaLLMService outputs a TTSTextFrame 2025-03-17 16:35:35 -04:00
Mark Backman
1ad8e28025 Update TranscriptProcessor to more robustly handle different TTSTextFrame outputs 2025-03-17 16:35:35 -04:00
Mark Backman
3458f1b6de Add Google Imagen to README 2025-03-17 11:43:40 -04:00
Mark Backman
02dbef8f5a Add Neuphonic TTS to README 2025-03-17 11:28:51 -04:00
Zac
1baa52a17e Enhanced whisper.py with MLX Whisper model support and added optional mlx-whisper to pyproject.toml. Added error handling for missing modules and created a new WhisperSTTServiceMLX class for MLX Whisper integration. 2025-03-16 02:18:54 -04:00
Mark Backman
c1382b0691 Update the 34-audio-recording.py example to include an STT processor 2025-03-15 20:30:35 -04:00
Vaibhav159
5f000efc61 adding example 2025-03-15 10:36:26 +05:30
Vaibhav159
fa7da8f5f6 adding vertex llm 2025-03-15 10:21:40 +05:30
Mark Backman
8b86f6991d Merge pull request #1343 from pipecat-ai/mb/pipecat-cloud-example
Add a Pipecat Cloud deployment example
2025-03-14 20:49:45 -04:00
Mark Backman
d3cd1a6c59 Update with latest starter 2025-03-14 20:40:33 -04:00
Mark Backman
24220f38f0 Add a Pipecat Cloud deployment example 2025-03-14 20:40:29 -04:00
Aleix Conchillo Flaqué
1f8752ab03 Merge pull request #1378 from pipecat-ai/aleix/remove-deprecations
removed most deprecations
2025-03-14 14:42:34 -07:00
Aleix Conchillo Flaqué
16d7df1c9f removed most deprecations 2025-03-14 14:37:08 -07:00
Aleix Conchillo Flaqué
2474211291 Merge pull request #1379 from pipecat-ai/aleix/introduce-text-aggregators
introduce text aggregators
2025-03-14 13:03:49 -07:00
Aleix Conchillo Flaqué
b632d71465 TTSService: flush_audio() should be in the base class 2025-03-14 10:48:25 -07:00
Aleix Conchillo Flaqué
f8610a69a5 introduce text aggregators 2025-03-14 10:48:25 -07:00
Aleix Conchillo Flaqué
624a454f8b Merge pull request #1366 from adnansiddiquei/neuphonic-tts-plugin
Add integration for Neuphonic TTS
2025-03-14 10:27:24 -07:00
Aleix Conchillo Flaqué
11ba08b7ba Merge pull request #1377 from pipecat-ai/aleix/task-upstream-downstream-filters
PipelineTask: only call event handlers if a filter is matched
2025-03-14 08:49:24 -07:00
Adnan Siddiquei
11b13d053b Fixed a bug from previous commit. Removed the concept of model from Neuphonic. 2025-03-14 11:17:22 +00:00
Adnan Siddiquei
7dec8431e1 Review comments by aconchillo. 2025-03-14 10:52:13 +00:00
Aleix Conchillo Flaqué
ce3f3b2edb Merge pull request #1372 from pipecat-ai/khk-fix-multimodal-live-example
fix for 26-gemini-multimodal-live.py
2025-03-13 20:22:07 -07:00
Aleix Conchillo Flaqué
1b3b4ee04a PipelineTask: only call event handlers if a filter is matched 2025-03-13 18:44:30 -07:00
Mark Backman
676c5d9ba7 Merge pull request #1374 from pipecat-ai/mb/add-riva-to-readme 2025-03-13 20:41:05 -04:00
Mark Backman
6eb3a8409f README: Add Parakeet and FastPitch 2025-03-13 18:42:19 -04:00
Filipi Fuchter
526f9c2e06 Fixing the voice agent feedback when disconnected. 2025-03-13 18:41:40 -03:00
Kwindla Hultman Kramer
c9a31ea513 fix for 26-gemini-multimodal-live.py 2025-03-13 14:35:47 -07:00
Filipi Fuchter
2770d64a25 Fixing ruff format. 2025-03-13 17:38:11 -03:00
Filipi Fuchter
8a7e305619 Closing the old peer connection 2025-03-13 17:35:47 -03:00
Filipi Fuchter
8f2dadf5a0 Improving the reconnection logic to be able to recreate the peer connection in some cases. 2025-03-13 17:07:32 -03:00
Aleix Conchillo Flaqué
c0c7c5d600 Merge pull request #1370 from pipecat-ai/aleix/minor-ultravox-updates
services(ultravox): CHANGELOG, formatting and minor changes
2025-03-13 12:05:13 -07:00
Aleix Conchillo Flaqué
87004937be services(ultravox): CHANGELOG, formatting and minor changes 2025-03-13 11:49:18 -07:00
Aleix Conchillo Flaqué
b426be3067 Merge pull request #1331 from CerebriumAI/feature/ultravox
Added ultravox service
2025-03-13 10:40:00 -07:00
Aleix Conchillo Flaqué
b71e2b97ff Merge pull request #1368 from pipecat-ai/aleix/pipelinetask-frame-event-handlers
PipelineTask: add on_frame_reached_upstream and on_frame_reached_downstream
2025-03-13 10:31:33 -07:00
Aleix Conchillo Flaqué
25dcf7def6 PipelineTask: add on_frame_reached_upstream/on_frame_reached_downstream 2025-03-13 10:26:11 -07:00
Filipi Fuchter
30432639b4 Creating a keep alive connection 2025-03-13 14:20:25 -03:00
Adnan Siddiquei
1bf964a667 Added two examples on how to use Neuphonic as a TTS (07u). 2025-03-13 14:42:42 +00:00
Adnan Siddiquei
08fb931ef6 Swapped NEUPHONIC_API_TOKEN for NEUPHONIC_API_KEY. 2025-03-13 12:10:03 +00:00
Aleix Conchillo Flaqué
c5aa931096 Merge pull request #1358 from pipecat-ai/aleix/abstractmethod-fixes
ai_services: fix abstractmethod issues
2025-03-12 17:26:48 -07:00
Filipi Fuchter
d33a4b3a11 Implementing reconnection logic. 2025-03-12 18:23:12 -03:00
Filipi Fuchter
9cad8bfcc6 Increasing the time that we are waiting for the frame. 2025-03-12 15:36:40 -03:00
Mark Backman
b084a3e9e7 Merge pull request #1367 from MaCaki/macaki/rime/send_msg_in_flush_audio
[rime client] Sending over trailing space to help indicate end of utt…
2025-03-12 14:25:18 -04:00
macaki
5c9e33bc7a formatting 2025-03-12 12:20:18 -06:00
Filipi Fuchter
93d8ddf4f2 Only showing the timout warning to receive frame if the client is connected. 2025-03-12 15:13:59 -03:00
Adnan Siddiquei
0b9c4b2255 Fixed a couple of small bugs. 2025-03-12 18:04:48 +00:00
macaki
effb5f6cd8 added changelog 2025-03-12 11:57:25 -06:00
Adnan Siddiquei
ead555eb4b Corrected versions on pyproject.toml. 2025-03-12 17:39:04 +00:00
macaki
f843482968 [rime client] Sending over trailing space to help indicate end of utterance after a punctuation. 2025-03-12 11:26:43 -06:00
Adnan Siddiquei
23a4933af9 Initial implementation of Neuphonic service. A TTS provider. 2025-03-12 17:15:31 +00:00
Filipi Fuchter
0d05312071 Supporting renegotiation inside the voice agent server. 2025-03-12 11:55:38 -03:00
Filipi Fuchter
f8e33d8b7b Improving the video transform feedback when we are connecting, and cleaning the pc_id when disconnected. 2025-03-12 11:49:33 -03:00
Filipi Fuchter
f24c5b0aa7 Adding support for renegotiation. 2025-03-12 11:31:18 -03:00
Michael Louis
d9ef19233a Added foundational example for ultravox 2025-03-12 10:30:23 -04:00
Mark Backman
357334e3c9 Merge pull request #1341 from pipecat-ai/mb/fix-google-typo
Add a set_language convenience method for GoogleSTTService
2025-03-12 09:05:52 -04:00
Filipi Fuchter
da25e0c008 Configuring the bot to receive the video live. 2025-03-12 10:00:33 -03:00
Filipi Fuchter
c99d02d8bb Adding support for interruptions when using SmallWebRTCTransport. 2025-03-12 09:09:11 -03:00
Mark Backman
59ea94af86 Merge pull request #1360 from pipecat-ai/mb/update-cartesia-voice
Update Cartesia voice for demos
2025-03-12 08:02:26 -04:00
Mark Backman
4a363bebf0 Add a set_language convenience method for GoogleSTTService 2025-03-12 07:58:29 -04:00
Mark Backman
c196fb5f98 Merge pull request #1342 from pipecat-ai/mb/lmnt-flush-audio 2025-03-11 22:22:38 -04:00
Mark Backman
5f97f6ff94 Add flush_audio() to LmntTTSService 2025-03-11 21:57:54 -04:00
Mark Backman
5860fe5319 Merge pull request #1340 from pipecat-ai/mb/fish-flush
Add flush_audio to FishTTSService
2025-03-11 21:56:44 -04:00
Mark Backman
3522bbb533 tmp 2025-03-11 21:55:18 -04:00
Mark Backman
cfca7269f4 Update the Cartesia voice in all demos with one built for sonic-2 2025-03-11 21:53:03 -04:00
Mark Backman
e6f269a903 Add flush_audio to FishTTSService 2025-03-11 21:48:41 -04:00
Mark Backman
468e936a5f Merge pull request #1356 from pipecat-ai/mb/add-chirp-tts-support
Add support for Chirp voices in GoogleTTSService
2025-03-11 20:12:52 -04:00
Lucas Rothman
ecc4411128 Tavus support for custom output rate 2025-03-11 16:02:33 -07:00
Aleix Conchillo Flaqué
740ba4e759 ai_services: fix abstractmethod issues 2025-03-11 14:29:03 -07:00
Filipi Fuchter
e56c8f881c Full video transformation example using SmallWebRTCTransport. 2025-03-11 11:36:47 -03:00
Filipi Fuchter
a747f08017 Simple voice agent example using SmallWebRTCTransport. 2025-03-11 11:36:23 -03:00
Filipi Fuchter
c6c0b73345 P2P WebRTC transport option to Pipecat: SmallWebRTCTransport. 2025-03-11 11:35:39 -03:00
Filipi Fuchter
fde90ee01d Creating an EventEmitter util class 2025-03-11 11:33:47 -03:00
Filipi Fuchter
689a844aaf Created a new transport param to inform if the camera input should be enabled. 2025-03-11 11:33:24 -03:00
Filipi Fuchter
aab98b61a0 Fixed issue where sending too many images per second caused Gemini to ignore them. 2025-03-11 11:32:38 -03:00
Mark Backman
a62741df94 Add support for Chirp voices in GoogleTTSService 2025-03-11 07:56:27 -04:00
Mark Backman
5bd359ada9 Merge pull request #1354 from pipecat-ai/mb/cartesia-changelog
Changelog entry for Cartesia model update
2025-03-11 07:20:04 -04:00
Mark Backman
40562402a2 Changelog entry for Cartesia model update 2025-03-10 21:10:11 -04:00
Mark Backman
98e5089fbe Merge pull request #1353 from kunal-cai/main
[Cartesia] Update the default alias for Cartesia TTS Service
2025-03-10 21:07:19 -04:00
Kunal Shah
e1c8a09b60 [Cartesia] Update the default alias for Cartesia TTS Service 2025-03-10 14:43:58 -07:00
Filipi da Silva Fuchter
154fe65011 Merge pull request #1336 from pipecat-ai/fixing_function_calling_examples
Pipecat small fixes and refactored function calling examples
2025-03-07 16:10:27 -03:00
Mark Backman
61f534ca34 Merge pull request #1334 from pipecat-ai/aleix/user-and-bot-turn-audio
add support for user and bot turn audio
2025-03-06 18:35:56 -05:00
Mark Backman
a91c26785f Store recording in a folder 2025-03-06 18:31:48 -05:00
Aleix Conchillo Flaqué
d7e93551d2 examples(chatbot-audio-recording): add support for user/bot turn audio 2025-03-06 11:49:01 -08:00
Aleix Conchillo Flaqué
06c742a2ad AudioBufferProcessor: add on_user_turn_audio_data and on_bot_turn_audio_data 2025-03-06 11:49:01 -08:00
Filipi Fuchter
55b0797fd5 Removing the extra examples inside the unified-format-function-calling folder 2025-03-06 12:00:22 -03:00
Filipi Fuchter
21443b9a08 Refactored gemini multimodal example to use the unified format for function calling. 2025-03-06 11:59:08 -03:00
Filipi Fuchter
4b167a3c3d Fixing the ruff format. 2025-03-06 10:38:45 -03:00
Filipi Fuchter
2df77430aa Refactoring the 14 series examples to use the unified format for function calling. 2025-03-06 10:35:26 -03:00
Filipi Fuchter
2d114b15f9 Adding missing flush_audio method to AzureTTSService. 2025-03-06 10:34:25 -03:00
Filipi Fuchter
26000b616d Fixing the base_whisper services to implement set_language. 2025-03-06 10:15:04 -03:00
Aleix Conchillo Flaqué
710eebab09 Merge pull request #1332 from pipecat-ai/aleix/base-object-and-event-handlers
introduce BaseObject class
2025-03-05 13:41:27 -08:00
Dominic Stewart
532423eb4c Updated example to switch pipelines per the original request (#1320) 2025-03-05 13:40:36 -08:00
Aleix Conchillo Flaqué
bb29e50adb introduce BaseObject class 2025-03-05 13:38:53 -08:00
Filipi da Silva Fuchter
4048d6782b Merge pull request #1211 from pipecat-ai/function_calling_unified_format
Unified format for function calling
2025-03-05 18:30:22 -03:00
Filipi Fuchter
76d36a312b Adding the unified format function calling to the changelog. 2025-03-05 14:18:37 -03:00
Filipi Fuchter
2a75373c04 Created examples for unified format function calling. 2025-03-05 14:12:30 -03:00
Filipi Fuchter
a840b0e815 Prevents pytest from collecting TestFrameProcessor. 2025-03-05 14:11:52 -03:00
Filipi Fuchter
ebcde719a6 Integration test for function calling. 2025-03-05 14:11:16 -03:00
Filipi Fuchter
5c912927bb Unit tests for function calling adapters. 2025-03-05 14:11:02 -03:00
Filipi Fuchter
0e55db054e Created script to fix ruff format issues. 2025-03-05 14:10:47 -03:00
Filipi Fuchter
5967ac0d4f Implementing unified format for function calling. 2025-03-05 14:10:32 -03:00
Aleix Conchillo Flaqué
1451483cf7 Merge pull request #1330 from pipecat-ai/aleix/playht-update-0.1.12
pyproject: update pyht to 0.1.12
2025-03-04 18:35:03 -08:00
Michael Louis
3fe7c1d730 Added ultravox service 2025-03-04 13:59:03 -05:00
Aleix Conchillo Flaqué
c14b85c12b pyproject: update pyht to 0.1.12
Fixes #1309
2025-03-04 10:26:11 -08:00
kompfner
9f3c0219d7 Merge pull request #1329 from pipecat-ai/add-permissions-to-daily-meeting-token-properties
Add the `permissions` property to `DailyMeetingTokenProperties`
2025-03-03 14:44:10 -05:00
Aleix Conchillo Flaqué
ec36fef26e updated CHANGELOG and fix GladiaSTTService formatting 2025-03-03 09:53:03 -08:00
allenmylath
5f1848d24b Update gladia.py (#1317)
* Update gladia.py

According to gladia docs 
https://docs.gladia.io/api-reference/v2/live/init
speech threshould value close to 1 enables gladia to better isolate speeech from noise.
2025-03-03 09:51:11 -08:00
Aleix Conchillo Flaqué
d6867bd12f Merge pull request #1321 from pipecat-ai/aleix/allow-setting-context-aggregator-parameters
LLMService: add user/assistant args to create_context_aggregator()
2025-03-03 09:48:31 -08:00
Aleix Conchillo Flaqué
17a1f30572 LLMService: add user/assistant args to create_context_aggregator() 2025-03-03 09:46:37 -08:00
Paul Kompfner
8e0dc1f256 Add the permissions property to DailyMeetingTokenProperties 2025-03-03 10:13:25 -05:00
Kwindla Hultman Kramer
b9100beee3 Merge pull request #1327 from pipecat-ai/azure-realtime-changelog
CHANGELOG.md entry for AzureRealtimeBetaLLMService
2025-03-02 20:30:40 -08:00
Mark Backman
b8bc3d2565 Merge pull request #1326 from pipecat-ai/mb/11labs-speed
Add speed as InputParam to ElevenLabs TTS services
2025-03-02 15:20:01 -05:00
Kwindla Hultman Kramer
3213e85b7d CHANGELOG.md entry for AzureRealtimeBetaLLMService 2025-03-02 12:16:50 -08:00
Kwindla Hultman Kramer
de3bcd64c4 Merge pull request #1324 from pipecat-ai/azure-realtime
Support for Azure OpenAI Realtime API
2025-03-02 12:13:29 -08:00
Mark Backman
ad7f1eec12 Create a function to build voice_settings dictionary 2025-03-02 08:27:29 -05:00
Mark Backman
29310b4e92 Add speed as InputParam to ElevenLabs TTS services 2025-03-02 08:19:44 -05:00
Kwindla Hultman Kramer
2f4d36a146 docstring fixup 2025-03-01 15:44:10 -08:00
Kwindla Hultman Kramer
6c9bb782b1 add __init__.py 2025-03-01 15:42:20 -08:00
Kwindla Hultman Kramer
010d9103d4 support for Azure OpenAI Realtime API 2025-03-01 15:39:19 -08:00
Aleix Conchillo Flaqué
12131eb7c5 Merge pull request #1313 from Vaibhav159/vl_add_automated_formatting
using ruff automated formatting to avoid action failures.
2025-02-28 13:12:31 -08:00
Aleix Conchillo Flaqué
80b830322a Merge pull request #1311 from pipecat-ai/aleix/llm-full-response-aggregator
add new LLMFullResponseAggregator
2025-02-28 13:08:06 -08:00
Aleix Conchillo Flaqué
8db9d16174 add new LLMFullResponseAggregator 2025-02-28 13:05:21 -08:00
Aleix Conchillo Flaqué
1c92fab1fb Merge pull request #1308 from Vaibhav159/vl_google_openai_format
adding GoogleLLMOpenAIBetaService
2025-02-28 12:04:37 -08:00
Vaibhav159
974717d1b9 sync with main 2025-03-01 01:16:21 +05:30
Vaibhav159
59fb631390 fixing function calling and adding example 2025-03-01 01:14:37 +05:30
Vaibhav159
4824220260 adding GoogleLLMOpenAIBetaService 2025-03-01 01:14:26 +05:30
Mark Backman
55a338614d Merge pull request #1312 from pipecat-ai/mb/move-server-message-frame
Rename ServerMessageFrame to RTVIServerMessageFrame and move to rtvi.py
2025-02-28 13:59:31 -05:00
Vaibhav159
f033046963 using ruff automated formatting to avoid repeated failures 2025-02-28 08:25:15 +05:30
Mark Backman
6018fc068c Rename ServerMessageFrame to RTVIServerMessageFrame and move to rtvi.py 2025-02-27 20:07:07 -05:00
Aleix Conchillo Flaqué
d5b634301f Merge pull request #1302 from pipecat-ai/aleix/cleanup-llm-tts-logging
services: minor LLM and TTS logging improvements
2025-02-27 13:51:04 -08:00
Aleix Conchillo Flaqué
a37eb1049d Merge pull request #1310 from Canonical-AI-Inc/without-audio
Optional Recording
2025-02-27 13:37:39 -08:00
Adrian Cowham
803ea9d8bc update the canonical client so that the audio recording is optional as long as there is a transcript 2025-02-27 12:31:02 -08:00
Mark Backman
499bc25217 Merge pull request #1303 from pipecat-ai/mb/add-server-to-client-msg
Add a new generic server to client message and frame type
2025-02-27 12:56:57 -05:00
Mark Backman
53d403af4b Remove the RTVIServerMessage logic from the RTVIProcessor 2025-02-27 12:50:43 -05:00
Aleix Conchillo Flaqué
a0a8ea1641 Merge pull request #1301 from pipecat-ai/aleix/example-22d-fix-llm-aggregator 2025-02-26 22:39:48 -08:00
Mark Backman
26c68ccd7c Add a new generic server to client message and frame type 2025-02-26 18:59:06 -05:00
Aleix Conchillo Flaqué
fa010c8644 services: minor LLM and TTS logging improvements 2025-02-26 15:36:25 -08:00
Aleix Conchillo Flaqué
d58f398bc4 examples: fix for 22d-natural-conversation-gemini-audio.py 2025-02-26 13:15:07 -08:00
Aleix Conchillo Flaqué
11383a86a1 Merge pull request #1300 from pipecat-ai/aleix/prepare-0.0.58
update CHANGELOG for 0.0.58
2025-02-26 11:31:24 -08:00
Aleix Conchillo Flaqué
daa52ff8df update CHANGELOG for 0.0.58 2025-02-26 11:29:04 -08:00
Mark Backman
a5f41e22f7 Merge pull request #1299 from pipecat-ai/mb/add-track-level-recording
Added on_track_audio_data callback to AudioBufferProcessor for track level recording
2025-02-26 13:49:36 -05:00
Mark Backman
530bb5233d example: Added a foundational example (34) for audio recording 2025-02-26 13:44:32 -05:00
Aleix Conchillo Flaqué
4a64e09f6c Merge pull request #1297 from pipecat-ai/aleix/daily-python-0.15.0
pyproject: update daily-python, aiohttp and pydantic
2025-02-26 10:26:59 -08:00
Aleix Conchillo Flaqué
74582bb8d5 pyproject: update daily-python, aiohttp and pydantic 2025-02-26 10:22:34 -08:00
Mark Backman
1ca2101e3a Added on_track_audio_data callback to AudioBufferProcessor for track level recording 2025-02-26 10:48:56 -05:00
Aleix Conchillo Flaqué
e80311c323 Merge pull request #1296 from pipecat-ai/aleix/google-always-send-text-with-audio
GoogleLLMService: always send text with audio
2025-02-26 07:47:56 -08:00
Aleix Conchillo Flaqué
2f24c422b6 Merge pull request #1289 from pipecat-ai/aleix/tts-http-improvements
small TTS http improvements
2025-02-26 07:47:26 -08:00
Mark Backman
0d0b9fddef Merge pull request #1291 from pipecat-ai/mb/playht-http-protocol
PlayHTHttpTTSService now takes a separate protocol input
2025-02-26 08:09:49 -05:00
Mark Backman
1753cc99f4 PlayHTHttpTTSService now takes a separate protocol input 2025-02-26 08:01:54 -05:00
Aleix Conchillo Flaqué
4f8b036abe pyproject: remote httpx old dependency and upgrade anthropic/google-genai 2025-02-25 22:28:21 -08:00
Aleix Conchillo Flaqué
f83c89c202 examples: update google examples 2025-02-25 22:28:02 -08:00
Aleix Conchillo Flaqué
bb89a036e5 google: always send text part when sending inline audio 2025-02-25 22:27:38 -08:00
Aleix Conchillo Flaqué
b994a03466 examples: add more HTTP TTS services examples 2025-02-25 21:40:41 -08:00
Aleix Conchillo Flaqué
27161f8e3b BaseOutputTransport: cleanup audio buffer after bot stops talking 2025-02-25 21:39:47 -08:00
Aleix Conchillo Flaqué
8acf9a488b tts: some small HTTP-based services improvements 2025-02-25 21:39:47 -08:00
Aleix Conchillo Flaqué
96c6aeaada Merge pull request #1295 from pipecat-ai/aleix/pipelinetask-keyword-arguments
PipelineTask: force constructor keyword arguments
2025-02-25 19:00:58 -08:00
Aleix Conchillo Flaqué
6722aae598 PipelineTask: force constructor keyword arguments 2025-02-25 18:58:47 -08:00
Aleix Conchillo Flaqué
66564392a6 Merge pull request #1293 from pipecat-ai/aleix/log-pipecat-version
log pipecat version on application startup
2025-02-25 18:57:52 -08:00
Aleix Conchillo Flaqué
f258f5ab66 Merge pull request #1292 from pipecat-ai/aleix/audiocontext-terminate-nicely
AudioContextWordTTSService: wait for all requested audio
2025-02-25 18:56:41 -08:00
Aleix Conchillo Flaqué
f8f0578c3d log pipecat version on application startup 2025-02-25 18:55:45 -08:00
Aleix Conchillo Flaqué
aa60a413f3 Merge pull request #1294 from pipecat-ai/aleix/improve-test-requirements
improve test-requirements.txt
2025-02-25 18:55:18 -08:00
Aleix Conchillo Flaqué
3e66f2378d improve test-requirements.txt 2025-02-25 17:34:33 -08:00
Aleix Conchillo Flaqué
9a50f33e36 AudioContextWordTTSService: wait for all requested audio 2025-02-25 15:35:47 -08:00
Aleix Conchillo Flaqué
4bd5e9c0a7 Merge pull request #1285 from pipecat-ai/aleix/handle-stop-task-gracefully
handle stop task gracefully
2025-02-25 11:25:38 -08:00
Mark Backman
12092c8715 Merge pull request #1288 from pipecat-ai/mb/clean-up-tts-text-input
TTSService: Remove newlines before sending text to TTS service to gen…
2025-02-25 14:00:43 -05:00
Mark Backman
92cc6d39f2 TTSService: Remove newlines before sending text to TTS service to generate 2025-02-25 13:37:25 -05:00
Aleix Conchillo Flaqué
34a50033cb tk: use TkTransportParams in examples 2025-02-25 10:24:24 -08:00
Aleix Conchillo Flaqué
e60b65228b allow multiple StartFrames 2025-02-25 10:24:04 -08:00
Mark Backman
e74864335b Merge pull request #1287 from pipecat-ai/mb/30-observer-pipeline-task
Example 30: Move observers to PipelineTask
2025-02-25 12:11:23 -05:00
Mark Backman
27a088a457 Merge pull request #1286 from pipecat-ai/mb/update-grok-2
Set grok-2 as default model for GrokLLMSService
2025-02-25 12:11:09 -05:00
Mark Backman
cfe72143b8 Example 30: Move observers to PipelineTask 2025-02-25 10:54:25 -05:00
Mark Backman
36a729cbfe Set grok-2 as default model for GrokLLMSService 2025-02-25 10:00:45 -05:00
Aleix Conchillo Flaqué
d2f006682c introduce new BaseTaskManager 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
fb7fe540f5 tts: don't connect to websocket if already connected 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
1ec68bd071 make sure we don't create tasks if already created 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
4536d03e82 FrameProcessor: cancel input/push tasks on CancelFrame 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
699704732c asyncio: re-raise CancelledError in wait_for_task() 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
376d969a77 task: handle StopFrame and StopTaskFrame gracefully 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
68789dfcf0 frames: add new StopFrame 2025-02-24 21:34:23 -08:00
Aleix Conchillo Flaqué
fe9fc61c4e Merge pull request #1282 from pipecat-ai/aleix/pipelinetask-observers-constructor
PipelineTask: pass observers in contructor parameter
2025-02-24 21:29:46 -08:00
Aleix Conchillo Flaqué
6028f0f23a PipelineTask: pass observers in contructor parameter 2025-02-24 21:29:17 -08:00
Aleix Conchillo Flaqué
e9a0959e28 Merge pull request #1283 from pipecat-ai/aleix/check-dangling-tasks
PipelineTask: add check_dangling_tasks parameter
2025-02-24 21:26:32 -08:00
Dominic Stewart
f66be2cfa7 Dom/gemini system prompt switching (#1260)
* Updated example to use Gemini

* Fixed typo

* Based on feedback, made the gemini file something that can be called separately

* Updated the readme

* Updated the readme

* Changed example to use gemini 2.0 flash lite

* This works

* Improvement

* I think this works

* Updated the code to use the correct prompt broken down into smaller pieces

* Added a few more things to detect in the prompt

* Fixed import ordering

* Updated prompt for non gemini bot to look for more voicemail examples, plus added logic to detect if we're doing dialin or not to avoid a non-fatal dialin related error

* moved terminate call to handlers class

* Simplified logic for dialin

* Forgot to use the same logic for the openai bot

* Starting to add logic for native audio input for flash lite

* Fixed logic

* Fixed some code based on suggestions
2025-02-24 22:29:55 -06:00
Aleix Conchillo Flaqué
f818bed58f Merge pull request #1281 from pipecat-ai/aleix/google-context-aggregator-upgrade-context
google: updgrade OpenAILLMContext to GoogleLLMContext
2025-02-24 17:37:26 -08:00
Aleix Conchillo Flaqué
07b9be5308 PipelineTask: add check_dangling_tasks parameter 2025-02-24 17:33:10 -08:00
Aleix Conchillo Flaqué
40c2452d6e google: updgrade OpenAILLMContext to GoogleLLMContext 2025-02-24 15:35:18 -08:00
Aleix Conchillo Flaqué
30cdd1b71a Merge pull request #1280 from pipecat-ai/aleix/add-completion-timeout
services(llm): add on_completion_timeout event
2025-02-24 15:07:20 -08:00
Aleix Conchillo Flaqué
2110b79507 services(llm): add on_completion_timeout event 2025-02-24 14:55:36 -08:00
Aleix Conchillo Flaqué
fc544fa61c Merge pull request #1272 from pipecat-ai/aleix/tts-websocket-interruptions
services: fix some TTS websocket service interruption handling
2025-02-24 14:54:41 -08:00
Mark Backman
976fe95304 Merge pull request #1279 from pipecat-ai/mb/remove-open-optional-dep
Remove `openai` optional dependency from services as it's now required
2025-02-24 17:42:53 -05:00
Aleix Conchillo Flaqué
408270b647 lmnt: don't send "eof" before closing the socket 2025-02-24 14:37:37 -08:00
Mark Backman
1dfb75bc9d Merge pull request #1278 from pipecat-ai/mb/claude-3-7
Update AnthropicLLMService to use claude-3-7-sonnet-20250219 by default
2025-02-24 15:41:28 -05:00
Mark Backman
cefc2a1088 Fix test-requirements.text ordering 2025-02-24 15:06:13 -05:00
Mark Backman
3b9b9200ea Remove openai optional dependency from services as it's now required 2025-02-24 15:05:42 -05:00
Mark Backman
d6f29a0f4b Update AnthropicLLMService to use claude-3-7-sonnet-20250219 by default 2025-02-24 14:32:00 -05:00
Aleix Conchillo Flaqué
5b762d11ef Merge pull request #1228 from CarlKho-Minerva/main
Missing Cartesia~=1.3.1 → `test-requirements`
2025-02-24 08:47:41 -08:00
Aleix Conchillo Flaqué
2f3e2da6b9 Merge pull request #1259 from pipecat-ai/openai-not-optional
Since the `openai` package is used by pretty much everything in pipec…
2025-02-24 08:45:45 -08:00
allenmylath
45058d4a94 Update audio_buffer_processor.py (#1266) 2025-02-24 08:41:19 -08:00
Aleix Conchillo Flaqué
5b637bd826 services: fix some TTS websocket service interruption handling 2025-02-24 08:37:22 -08:00
Mark Backman
2d4fd7e903 Merge pull request #1274 from pipecat-ai/mb/add-ellipsis-test
Add one additional ellipsis test to test_utils_string
2025-02-23 11:26:20 -05:00
Mark Backman
b5662520aa Add one additional ellipsis test to test_utils_string 2025-02-23 11:04:24 -05:00
Aleix Conchillo Flaqué
af45c170b5 Merge pull request #1264 from pipecat-ai/aleix/add-log-observers
add initial log observers
2025-02-21 15:20:45 -08:00
Aleix Conchillo Flaqué
65f548b2ec examples(30-observer): update to use LLMLogObserver 2025-02-21 15:15:16 -08:00
Aleix Conchillo Flaqué
b29ab8c608 observers: add LLMLogObserver and TranscriptionLogObserver 2025-02-21 15:15:16 -08:00
Aleix Conchillo Flaqué
d6dc37f0b6 Merge pull request #1269 from pipecat-ai/aleix/endofsentence-support-ellipses
utils: add support for ellipses in match_endofsentence()
2025-02-21 15:08:22 -08:00
Aleix Conchillo Flaqué
12bce2e8c0 utils: add support for ellipses in match_endofsentence() 2025-02-21 15:05:50 -08:00
Aleix Conchillo Flaqué
4acf7296e0 Merge pull request #1261 from pipecat-ai/aleix/emualted-frames-being-triggered-prematurely
LLMUserContextAggregator: don't reset timer with interim transcription
2025-02-21 10:15:28 -08:00
Aleix Conchillo Flaqué
98706d429c LLMUserContextAggregator: make sure incoming transcription has text 2025-02-21 10:12:54 -08:00
Aleix Conchillo Flaqué
41720b1a13 LLMUserContextAggregator: don't reset timer with interim transcription
It turns out that in some cases we only get interim transcriptions (e.g. someone
is speaking very very softly or someone is talking in the background). In those
cases we don't want to interrupt the bot because there's really nothing to
interrupt the bot for.

We originally thought we should interrupt the bot right at the time we got an
interim frame, but this is causing too many false positives. It's actually
better to simply wait for a real transcription before interrupting (in case VAD
didn't interrupt).
2025-02-21 09:05:56 -08:00
Aleix Conchillo Flaqué
3ef4245166 Merge pull request #1265 from pipecat-ai/aleix/transport-remove-audio-out-is-live 2025-02-21 06:51:09 -08:00
Filipi da Silva Fuchter
3bb0797922 Merge pull request #1257 from pipecat-ai/fastapi_disconnect_issue
Fixed an issue where FastAPI was not triggering on_client_disconnected.
2025-02-21 09:15:15 -03:00
Filipi Fuchter
7c7b4c52af Fixed an issue where EndTaskFrame was not triggering on_client_disconnected or closing the WebSocket in FastAPI. 2025-02-21 09:11:58 -03:00
Aleix Conchillo Flaqué
01f083b7fc transports: remove TransportParams.audio_out_is_live 2025-02-20 23:33:06 -08:00
Aleix Conchillo Flaqué
91fcaebe25 Merge pull request #1263 from Vaibhav159/vl_fix_deepgram_sample_rate_mismatch
fixing deepgram mismatch
2025-02-20 22:39:06 -08:00
Vaibhav159
9c5fe5c85e fixing deepgram mismatch 2025-02-21 09:32:40 +05:30
Aleix Conchillo Flaqué
7e5e167a4b Merge pull request #1250 from pipecat-ai/aleix/context-aggregation-simulatenous-text-tools
AssistantContextAggregator: append aggregation and tools in the same turn
2025-02-20 17:32:57 -08:00
Aleix Conchillo Flaqué
d04c4b36f3 AssistantContextAggregator: append aggregation and tools in the same turn 2025-02-20 17:29:43 -08:00
Aleix Conchillo Flaqué
a811e53626 Merge pull request #1253 from pipecat-ai/aleix/http-tts-services-stopped-frame
HTTP TTS services stopped frame
2025-02-20 17:28:05 -08:00
Paul Kompfner
df57202a05 Since the openai package is used by pretty much everything in pipecat (due to OpenAILLMContext being the standard context representation), let's make it a non-optional dependency.
This change solves an issue faced by users who aren't intending to use OpenAI getting scary error messages saying that they need the `openai` optional dependency "in order to use OpenAI", along with an instruction to set the OPENAI_API_KEY environment variable.

Note that with this change we could theoretically remove from pyproject.toml a number of defined optional dependencies that list only the `openai` package as a dependency (like `deepseek`, for example), but I didn't want to "break the API" in terms of how users install/consume pipecat and its set of built-in services.

Finally, I removed the `python-deepcompare` dependency from the `openai` optional dependency, since it appears to me like it was added by mistake (my guess is it was used for debugging during development and then never removed).
2025-02-20 15:21:35 -05:00
Aleix Conchillo Flaqué
69e6f3fdb7 rime: pass aiohttp session to constructor 2025-02-20 07:36:24 -08:00
Aleix Conchillo Flaqué
6809254963 tts: fix metrics and TTSStoppedFrame frame in HTTP services
Fixes #1247
2025-02-20 07:36:21 -08:00
Aleix Conchillo Flaqué
81093d3bed Merge pull request #1252 from pipecat-ai/aleix/remove-vad-extra-logging
BaseInputTransport: remove VAD logging
2025-02-20 07:32:20 -08:00
Aleix Conchillo Flaqué
d9a67164f6 Merge pull request #1251 from pipecat-ai/aleix/fish-tts-service-push-stop-frame
FishAudioTTSService should push TTSStoppedFrame
2025-02-20 07:32:05 -08:00
Aleix Conchillo Flaqué
98259af54e update CHANGELOG 2025-02-19 22:05:48 -08:00
Dominic Stewart
039d144c79 examples(phone-bot): updated example to use Gemini (#1233) 2025-02-19 22:03:37 -08:00
Aleix Conchillo Flaqué
d0f67fc189 BaseInputTransport: remove VAD logging
These logs are very verbose. They were added to try to find an issue that
resulted in being because of low CPU/memory resources, but these logs were not
helpful to determine that.
2025-02-19 21:55:11 -08:00
Aleix Conchillo Flaqué
6e3f96aa83 fish: automatically send TTSStoppedFrame after timeout 2025-02-19 21:41:18 -08:00
Aleix Conchillo Flaqué
293677588d tts: make push_stop_frames default to 2.0s 2025-02-19 21:39:00 -08:00
Filipi da Silva Fuchter
77e777b1ce Merge pull request #1249 from pipecat-ai/invoking_call_start_function
Fixed an issue that `start_callback` was not invoked for some LLM services
2025-02-19 18:09:00 -03:00
Filipi Fuchter
7e7926059c Fixed an issue that start_callback was not invoked for some LLM services. 2025-02-19 18:04:20 -03:00
Aleix Conchillo Flaqué
c948754eff Merge pull request #1248 from pipecat-ai/aleix/daily-transport-room-url
daily: add room_url property
2025-02-19 09:46:46 -08:00
Aleix Conchillo Flaqué
83f1a8830d daily: add room_url property 2025-02-19 09:29:53 -08:00
James Hush
80f8e05fcf docs: fix transcripts in translation chatbot example (#1199) 2025-02-19 16:07:22 +08:00
Aleix Conchillo Flaqué
afd1a1e80b Merge pull request #1245 from pipecat-ai/aleix/stt-mute-filter-trace-logging 2025-02-18 21:21:55 -08:00
Aleix Conchillo Flaqué
84ac88cad7 STTMuteFilter: change suppressed logging to trace 2025-02-18 18:03:37 -08:00
Aleix Conchillo Flaqué
211163e5c7 Merge pull request #1241 from pipecat-ai/aleix/deepgram-nova-3
deepgram: use the new nova-3 model as default
2025-02-18 17:53:04 -08:00
Aleix Conchillo Flaqué
1b0bcebef6 deepgram: use the new nova-3 model as default 2025-02-18 17:51:54 -08:00
Aleix Conchillo Flaqué
89736b03c4 Merge pull request #1243 from pipecat-ai/aleix/add-deepgram-addons
deepgram: add ability to provide custom addons
2025-02-18 17:47:48 -08:00
Aleix Conchillo Flaqué
4edda718ed deepgram: add ability to provide custom addons 2025-02-18 17:45:41 -08:00
Aleix Conchillo Flaqué
22a62edc9e Merge pull request #1242 from pipecat-ai/aleix/utils-network-exponential
network: added exponential_backoff_time() function
2025-02-18 17:44:21 -08:00
Aleix Conchillo Flaqué
50b6cc8135 network: added exponential_backoff_time() function 2025-02-18 17:42:43 -08:00
Aleix Conchillo Flaqué
45cf36925a Merge pull request #1240 from pipecat-ai/aleix/handle-deepgram-on-error
deepgram: handle error event and reconnect
2025-02-18 17:41:29 -08:00
Filipi da Silva Fuchter
83a71e1fec Merge pull request #1112 from pipecat-ai/bot-ready-signalling-rn
React Native client for the bot ready example.
2025-02-18 15:17:38 -03:00
Filipi Fuchter
e809c8680e Upgrading to use the latest node stable version 2025-02-18 15:12:44 -03:00
Aleix Conchillo Flaqué
c926063d74 deepgram: handle error event and reconnect 2025-02-18 09:52:18 -08:00
Aleix Conchillo Flaqué
0334550356 Merge pull request #1238 from pipecat-ai/aleix/stt-mute-filter-ignore-input-audio-frames
STTMuteFilter: ignore audio frames so no transcriptions are generated
2025-02-18 09:48:13 -08:00
Aleix Conchillo Flaqué
90b9dce710 STTMuteFilter: ignore audio frames so no transcriptions are generated 2025-02-17 19:59:05 -08:00
Carl Kho
a5cdd5f1b8 Add Cartesia API key to dot-env.template 2025-02-14 21:29:37 -08:00
Carl Kho
5f937b8479 Update test requirements to include Cartesia version 1.3.1 2025-02-14 21:14:32 -08:00
Aleix Conchillo Flaqué
b45f7fee6f Merge pull request #1225 from pipecat-ai/aleix/prepare-0.0.57
update CHANGELOG for 0.0.57
2025-02-14 18:50:08 -08:00
Aleix Conchillo Flaqué
01c06c5cac update CHANGELOG for 0.0.57 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
329e89c1d9 TTSService: push BotStoppedSpeakingFrame 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
883410d8ac FrameProcessor: no need to create an input event every time 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
1f5b790dd0 TTSService: reset processing text during interruptions 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
a107b1cb4b examples(06a): use CartesiaTTSService 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
63950912f0 LLMAssistantContextAggregator: add missing variable initialization 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
2ce9402571 LLMAssistantResponseAggregator: initialize messages 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
f6912c0f9a utils: don't consider colon an end of sentence 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
633a4d4c58 FalImageGenService: load image async to not block the event loop 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
67da745bb3 tts: make frame pausing/resuming optional 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
5126d4de92 tts: handle incoming frames pausing/resuming from base TTSService class 2025-02-14 18:47:33 -08:00
Aleix Conchillo Flaqué
426d7ac213 transports: some local audio and tk updates 2025-02-14 18:47:33 -08:00
Mark Backman
9115692c72 Merge pull request #1227 from pipecat-ai/mb/fix-25-error
fix: ensure proper Google message format conversion in transcription …
2025-02-14 21:01:05 -05:00
Mark Backman
c26fe3f277 fix: ensure proper Google message format conversion in transcription filter 2025-02-14 20:28:26 -05:00
Mark Backman
47b059d387 Merge pull request #1226 from pipecat-ai/mb/add-transcript-processor-tests
tests: add tests for TranscriptProcessor
2025-02-14 19:50:38 -05:00
Mark Backman
a49d81e519 tests: add tests for TranscriptProcessor 2025-02-14 17:10:40 -05:00
Aleix Conchillo Flaqué
b3a575c7c7 Merge pull request #1212 from Vaibhav159/vl_fix_incorrect_has_regular_messages_check
fixing google llm service error
2025-02-14 13:16:37 -08:00
Aleix Conchillo Flaqué
790d0c1256 Merge pull request #1224 from M1ngXU/patch-1
Update openai.py
2025-02-14 13:13:00 -08:00
Aleix Conchillo Flaqué
ee7e0dc3f7 Merge pull request #1223 from pipecat-ai/aleix/audio-context-tts-service
audio context tts service and cartesia fixes
2025-02-14 12:12:42 -08:00
Aleix Conchillo Flaqué
f53ee79ddb RimeTTSService: use AudioContextWordTTSService 2025-02-14 11:55:54 -08:00
Aleix Conchillo Flaqué
aeadb40c3f CartesiaTTSService: use AudioContextWordTTSService
By supporting multiple audio requests we fix an issue that was causing audio
overlapping.
2025-02-14 11:55:54 -08:00
Aleix Conchillo Flaqué
cacb07f4c2 introduce AudioContextWordTTSService 2025-02-14 11:55:54 -08:00
M1ngXU
0b91d821fb Update openai.py
d
2025-02-14 20:27:08 +01:00
Aleix Conchillo Flaqué
af66a43056 Merge pull request #1222 from pipecat-ai/aleix/websocket-service-handle-clean-disconnection
WebsocketService: handle clean server disconnection
2025-02-14 10:33:54 -08:00
Aleix Conchillo Flaqué
e006dcf172 WebsocketService: handle clean server disconnection
The websocket async iterator doesn't raise an exception when the server
disconnects cleanly. We should handle that and raise an exception so we can
reconnect.
2025-02-14 10:11:56 -08:00
Filipi da Silva Fuchter
8588f8b0d8 Merge pull request #1220 from pipecat-ai/instant_voice_demo_example
Instant voice example.
2025-02-14 14:24:13 -03:00
Filipi Fuchter
bff54547b0 Instant voice example. 2025-02-14 14:19:17 -03:00
Mark Backman
b2754bf208 Merge pull request #1219 from pipecat-ai/mb/markdown-text-filter-tests
Add MarkdownTextFilter tests
2025-02-13 21:10:52 -05:00
Mark Backman
9a4942b0d0 Merge pull request #1218 from pipecat-ai/mb/user-idle-tests
Add UserIdleProcessor tests
2025-02-13 18:53:22 -05:00
Mark Backman
ed6201910b Add MarkdownTextFilter tests 2025-02-13 18:51:46 -05:00
Mark Backman
ac5ebc587e Add tests for UserIdleProcessor 2025-02-13 18:47:29 -05:00
Aleix Conchillo Flaqué
dff4c54e57 Merge pull request #1209 from pipecat-ai/aleix/reimplement-llm-response-aggregators
reimplement LLM response aggregators
2025-02-13 15:30:40 -08:00
Aleix Conchillo Flaqué
c744409651 SegmentedSTTService: fix process_audio_frame() arguments 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
7578fbeaef update google requirements 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
5909dff423 LLMContextResponseAggregator: add VAD emulation support 2025-02-13 15:25:22 -08:00
Aleix Conchillo Flaqué
a6502df72c services: forgot to pass context instead of user aggregator 2025-02-13 13:50:33 -08:00
Aleix Conchillo Flaqué
e0d24d7fc0 update CHANGELOG 2025-02-13 13:21:32 -08:00
Aleix Conchillo Flaqué
99779046a8 services: use push_context_frame() 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
67cdc0063a BaseTransportOutput: allow pushing frames upstream 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
b28f752afa tests: add anthropic and google aggregator tests 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
463078e375 initialize assistant aggregators with context and push upstream instead 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
84510fd521 LLMUserContextAggregator: add space between transcriptions 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
9f6a1c093a LLMUserContextAggregator: reset user speaking time after bot interruption 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
b602e78625 tests: add OpenAI context aggregator tests 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
7c815121ea LLMContextResponseAggregator: add missing reset() implementation 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
16a107948b services: missing kwargs in anthropic/openai user context aggregator 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
839aa7d935 llm_response: add some initial docstrings to LLM aggregators 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
4cbcfe2b0b LLMUserContextAggregator: interrupt the bot if VAD happened a while back 2025-02-13 13:20:38 -08:00
Aleix Conchillo Flaqué
91a628d1ba UserResponseAggregator: implement on top of LLMUserResponseAggregator 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
50288eeaaa tests: add LLM response aggregators tests 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
e1f2bbceb3 reimplement LLM response aggregators 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
8bdd7ed0ed tests: implement langchain tests with run_test() 2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
1b7dfe8126 tests: add a new SleepFrame
The new SleepFrame allow us to control when system frames are pushed to the
pipeline.
2025-02-13 13:20:37 -08:00
Aleix Conchillo Flaqué
d1ee851a65 tests: rename some variables to make things clearer 2025-02-13 13:20:37 -08:00
Filipi da Silva Fuchter
0358673b46 Merge pull request #1215 from pipecat-ai/instant_voice_demo
Instant voice demo improvements - part 02
2025-02-13 18:14:15 -03:00
Filipi Fuchter
16fe1b10e9 - Added support for the RTVIProcessor to handle buffered audio in base64 format, converting it into InputAudioRawFrame for transport.
- Added support for the `RTVIProcessor` to trigger `start_audio_in_streaming` only after the `client-ready` message.
2025-02-13 18:08:55 -03:00
Filipi Fuchter
f001819df8 - Added a new audio_in_stream_on_start field to TransportParams.
- Added a new method `start_audio_in_streaming` in the `BaseInputTransport`.
- Updated `DailyTransport` to respect the `audio_in_stream_on_start` field, ensuring it only starts receiving the audio input if it is enabled.
2025-02-13 18:08:36 -03:00
Filipi Fuchter
dceec60186 Updated FastAPIWebsocketOutputTransport to send TransportMessageFrame and TransportMessageUrgentFrame to the serializer. 2025-02-13 18:07:33 -03:00
Filipi Fuchter
b96979a4ed Update WebsocketServer to not wrap the message inside a text frame. 2025-02-13 18:07:04 -03:00
Mark Backman
745c40def4 Merge pull request #1214 from pipecat-ai/mb/stt-mute-tests
Improve STTMuteFilter, add tests
2025-02-13 09:50:43 -05:00
Mark Backman
42ab62716d Merge pull request #1198 from pipecat-ai/mb/more-whisper-params
Add prompt and temperature args to OpenAI and Groq hosted Whisper STT…
2025-02-13 09:16:38 -05:00
Mark Backman
16ba2010aa Refactor process_frame to be more consistent 2025-02-13 09:15:29 -05:00
Mark Backman
ec0ca46617 Fix temperature docstrings to reference optional 2025-02-13 09:04:20 -05:00
Mark Backman
6ff1f526ff Merge pull request #1216 from pipecat-ai/mb/google-cloud-speech
Add the google-cloud-speech package to the google dependency
2025-02-13 07:04:34 -05:00
Mark Backman
84143cc80c self._muted now returns from STT process_audio_frames 2025-02-13 07:00:44 -05:00
Mark Backman
229dccedc6 Add the google-cloud-speech package to the google dependency 2025-02-12 23:19:17 -05:00
Aleix Conchillo Flaqué
68aaa1f8f4 Merge pull request #1213 from pipecat-ai/aleix/base-transport-output-bot-vad-stop-secs
BaseOutputTransport: use specific VAD stop secs for the bot
2025-02-12 19:01:56 -08:00
Aleix Conchillo Flaqué
f110a45c85 BaseOutputTransport: use specific VAD stop secs for the bot 2025-02-12 19:01:39 -08:00
Mark Backman
1e8a86de63 Handle starting muted, add tests 2025-02-12 19:01:49 -05:00
Mark Backman
ee93e2a2b1 Reorder frame pushing for STTMuteFilter, update STTMuteFrame to SystemFrame 2025-02-12 15:51:18 -05:00
Mark Backman
2e87a019a8 Merge pull request #1208 from pipecat-ai/mb/stt-mute-first-bot-speech
Add new STTMuteStrategy: MUTE_UNTIL_FIRST_BOT_COMPLETE
2025-02-12 12:21:02 -05:00
Vaibhav159
687b3d9d4c fixing google llm service error 2025-02-12 22:22:04 +05:30
Mark Backman
397768d872 Add new STTMuteStrategy: MUTE_UNTIL_FIRST_BOT_COMPLETE 2025-02-12 10:59:28 -05:00
Mark Backman
24cdcd74e6 Merge pull request #1197 from pipecat-ai/mb/google-stt
Add GoogleSTTService
2025-02-12 10:16:18 -05:00
Mark Backman
5d6370690c Add _reconnect_if_needed to simplify reconnect logic 2025-02-12 10:11:18 -05:00
Mark Backman
9f728aa623 Add reconnect logic to handle Google's 5 min time limit 2025-02-12 10:11:18 -05:00
Mark Backman
32d8f6153f Update InputParams to languages: support str or List of Languages 2025-02-12 10:11:18 -05:00
Mark Backman
8c2071f248 Add ClientOptions for region selection 2025-02-12 10:11:18 -05:00
Mark Backman
a9c2197dc6 Add ability to update options 2025-02-12 10:11:18 -05:00
Mark Backman
ce0358804b Docstrings and cleanup 2025-02-12 10:11:18 -05:00
Mark Backman
66a6a6a295 Enable interim transcriptions, add VAD events option 2025-02-12 10:11:18 -05:00
Mark Backman
9f1732c390 Update CHANGELOG and README 2025-02-12 10:11:17 -05:00
Mark Backman
b44ddf2456 07n uses all Google services 2025-02-12 10:09:36 -05:00
Mark Backman
17420f4d0c Update language support 2025-02-12 10:09:36 -05:00
Mark Backman
6cb55ec2cb Add GoogleSTTService 2025-02-12 10:09:36 -05:00
Filipi da Silva Fuchter
e2b4554a54 Merge pull request #1129 from pipecat-ai/instant_voice_demo
Pipecat improvements for the instant voice demo
2025-02-12 11:53:40 -03:00
Mark Backman
fd68b82e48 Merge pull request #1163 from pipecat-ai/mb/rime-websocket
Add RimeTTSService
2025-02-12 09:51:56 -05:00
Filipi Fuchter
cc90f5ab9f Sending the RTVI messages to the websocket 2025-02-12 11:46:49 -03:00
Filipi Fuchter
08f40d9179 Adding support to DailyTransport receive raw-audio through appMessage 2025-02-12 11:46:37 -03:00
Aleix Conchillo Flaqué
80e1325621 include codecov.yml 2025-02-11 23:46:19 -08:00
Aleix Conchillo Flaqué
ed76a5bfa5 Merge pull request #1202 from pipecat-ai/aleix/fix-simli-audiolayout-error
simli: fix audio layout error
2025-02-11 22:24:22 -08:00
Mark Backman
69b0d9035f Mark end_time as unused 2025-02-11 17:44:52 -05:00
Mark Backman
dcc63dd648 Use the vendor default for temperature 2025-02-11 14:29:33 -05:00
Aleix Conchillo Flaqué
2d08f42870 Merge pull request #1204 from pipecat-ai/aleix/add-coverage-support
github: add coverage support
2025-02-11 11:09:25 -08:00
Mark Backman
0814c0bc82 Merge pull request #1203 from pipecat-ai/expose-update-remote-participants-on-daily-transport
Expose `update_remote_participants()` from `DailyTransport`
2025-02-11 13:57:08 -05:00
Paul Kompfner
28e233b195 Update CHANGELOG to reflect the addition of update_remote_participants() 2025-02-11 13:23:47 -05:00
Aleix Conchillo Flaqué
6e4d2d6ade examples: fix more dependabot warnings 2025-02-11 10:09:33 -08:00
Aleix Conchillo Flaqué
266135ec54 examples: fix dependabot warnings 2025-02-11 10:07:05 -08:00
Aleix Conchillo Flaqué
d81aa48262 test-requirements: update transformers to 4.48.0 2025-02-11 10:04:21 -08:00
Aleix Conchillo Flaqué
8c7752fbc2 github: add coverage support 2025-02-11 09:58:21 -08:00
Julien Le Bourg
77fb63372a fix: incorrectly changed the base type in my last pull request for L… (#1184)
* fix: incorrectly changed the base type in my last pull request for  LocalAudioTransport

* update examples to use the new LocalTransportParams

* add local device select example
2025-02-11 08:35:57 -08:00
Paul Kompfner
5a8279d3c2 Expose update_remote_participants() from DailyTransport 2025-02-11 11:28:03 -05:00
Aleix Conchillo Flaqué
4db620198a simli: fix audio layout error
Fixes #1201
2025-02-11 07:05:35 -08:00
Mark Backman
d35f4c6b99 Add prompt and temperature args to OpenAI and Groq hosted Whisper STT services 2025-02-10 21:06:37 -05:00
Aleix Conchillo Flaqué
0a990b2aaa Merge pull request #1196 from pipecat-ai/aleix/audio-buffer-processor-continuous-intermittent-stream
AudioBufferProcessor: handle continuous and intermittent user audio
2025-02-10 16:07:12 -08:00
Mark Backman
97586b132d Simplify _calculate_word_times 2025-02-10 18:45:49 -05:00
Mark Backman
8020db350e Update RimeHttpTTSService to use mistv2 model by default 2025-02-10 18:45:48 -05:00
Mark Backman
54f64b8dad Code review feedback 2025-02-10 18:45:08 -05:00
Mark Backman
8f8a3ae7f9 Add RimeTTSService 2025-02-10 18:45:06 -05:00
Mark Backman
344aff5681 Merge pull request #1191 from pipecat-ai/mb/azure-tts-error-handling
Improve AzureTTSService error handling
2025-02-10 18:01:39 -05:00
Mark Backman
0d2e90cff1 Merge pull request #1190 from pipecat-ai/mb/languages-hosted-whisper
Add language support to OpenAI and Groq hosted Whisper
2025-02-10 17:49:38 -05:00
Mark Backman
1a8dd6b713 Improve AzureTTSService error handling 2025-02-10 17:48:55 -05:00
Mark Backman
2dc585aee0 Merge pull request #1185 from pipecat-ai/mb/update-readme-hacking
Add missing pip install -e . step to the README, and clarify steps
2025-02-10 17:45:58 -05:00
Mark Backman
a64fa44811 Merge pull request #1186 from pipecat-ai/mb/whisper-multilingual
Add language support to WhisperSTTService
2025-02-10 17:26:10 -05:00
Aleix Conchillo Flaqué
baeb83484d Merge pull request #1194 from Vaibhav159/vl_fix_elevenlabs_disconnect_issue
fixing disconnect issue
2025-02-10 13:41:59 -08:00
Vaibhav159
b0c3f80963 resolve merge conf 2025-02-11 03:03:32 +05:30
Aleix Conchillo Flaqué
eb3c9b1e75 AudioBufferProcessor: handle continuous and intermittent user audio
Fixes #1172
2025-02-10 11:26:31 -08:00
Mark Backman
ad4cbdb1ec Merge pull request #1159 from Canonical-AI-Inc/gemini-rag
Gemini 2.0 Flash Lite RAG example
2025-02-10 13:42:11 -05:00
Aleix Conchillo Flaqué
32baee924b RTVI: fix premature bot-tts-text messages (#1193) 2025-02-10 10:37:54 -08:00
Adrian Cowham
9cc53509d1 PR feedback: renamed file, added docstring, changed file read logic 2025-02-10 09:39:01 -08:00
Vaibhav159
2c62d3bf32 break once ConnectionClosed error 2025-02-10 23:04:05 +05:30
Vaibhav159
b06b16adb7 fixing disconnect issue 2025-02-10 22:55:20 +05:30
Mark Backman
cd52d73027 Add language support to OpenAI and Groq hosted Whisper 2025-02-10 10:18:00 -05:00
Mark Backman
c9d8c572c7 Add language support to WhisperSTTService 2025-02-09 10:51:23 -05:00
Mark Backman
d9439fd398 Add missing pip install -e . step to the README, and clarify steps 2025-02-09 09:15:10 -05:00
Mark Backman
081abcedb3 Merge pull request #1176 from pipecat-ai/mb/stt-mute-deprecate-stt-service
Deprecate stt_service parameter in STTMuteFilter
2025-02-09 08:35:22 -05:00
Mark Backman
1455e24ad1 Add keyword args, collocated warnings import with the deprecation 2025-02-09 08:29:20 -05:00
Mark Backman
4613cf4790 Merge pull request #1181 from pipecat-ai/mb/daily-docstrings
Add docstrings to daily.py
2025-02-09 08:05:59 -05:00
Mark Backman
7aa2e1209d Merge pull request #1177 from pipecat-ai/mb/perplexity
Add PerplexityLLMService
2025-02-09 08:05:46 -05:00
Mark Backman
76daaab6ca Add PerplexityLLMService 2025-02-09 08:00:31 -05:00
Mark Backman
37cfe870cc Merge pull request #1183 from pipecat-ai/mb/add-groq-stt
Add GroqSTTService, BaseWhisperSTTService, and refactor OpenAISTTService
2025-02-09 07:56:35 -05:00
Mark Backman
160167758b Add docstrings to daily.py 2025-02-09 07:53:51 -05:00
Mark Backman
4b634713a5 Merge pull request #1182 from pipecat-ai/mb/28c-optional-db
Update 28c option to output to log line only by default
2025-02-09 07:52:21 -05:00
Mark Backman
72954d5f15 Remove to base_whisper.py 2025-02-09 07:51:30 -05:00
Mark Backman
f2b07271c1 Update GroqLLMService to use llama-3.3-70b-versatile as the default model 2025-02-09 07:51:30 -05:00
Mark Backman
32b9de5f51 Add GroqSTTService, BaseWhisperSTTService, and refactor OpenAISTTService 2025-02-09 07:51:28 -05:00
Mark Backman
71ce8f9bcf Merge pull request #1179 from pipecat-ai/mb/remove-command-dash-badge
Remove CommandDash badge from README
2025-02-09 07:47:32 -05:00
Mark Backman
7d05728e2f Update 28c option to output to log line only by default 2025-02-08 10:00:45 -05:00
Mark Backman
dee5448b57 Merge pull request #1123 from pipecat-ai/cb/sqlite
Add SQLite storage to the Gemini persistent storage example
2025-02-08 09:07:52 -05:00
Mark Backman
d67861925a Merge pull request #1128 from golbin/whisper-api
Add Whisper STT service using OpenAI API
2025-02-08 08:35:26 -05:00
Mark Backman
0180619d44 Merge pull request #1173 from TheCodingLand/local-pyaudio-device-ids
adds configurable device ids for local audio transport
2025-02-08 08:04:00 -05:00
Mark Backman
f07e498612 Remove CommandDash badge from README 2025-02-08 07:59:39 -05:00
TheCodingLand
57964cb929 fix LocalAudioTransport param type 2025-02-08 12:32:20 +01:00
TheCodingLand
6840c77684 apply ruff formatting 2025-02-08 12:03:23 +01:00
Mark Backman
a1b58115ce Deprecate stt_service parameter in STTMuteFilter 2025-02-07 19:24:03 -05:00
chadbailey59
23eb6e3d46 storybot fixes (#1175)
* storybot fixes

* readme cleanup
2025-02-07 13:58:02 -06:00
Mark Backman
74a2c38c6c Merge pull request #1174 from pipecat-ai/mb/bump-google-genai-version
Bump google-genai version to 1.0.0
2025-02-07 14:53:44 -05:00
Mark Backman
90b217fda8 Bump google-genai version to 1.0.0 2025-02-07 14:32:37 -05:00
Aleix Conchillo Flaqué
6855bc0ada Merge pull request #1166 from pipecat-ai/aleix/google-rtvi-observer
rtvi: separate specific google RTVI into a GoogleRTVIObserver
2025-02-08 03:19:02 +08:00
TheCodingLand
a359434307 remove Doc and Annotated imports 2025-02-07 19:42:34 +01:00
TheCodingLand
856c8959c3 enhance doc 2025-02-07 19:38:26 +01:00
TheCodingLand
8da7a42137 adds configurable input and output device ids for local audio 2025-02-07 19:23:18 +01:00
Aleix Conchillo Flaqué
510a0f5ef5 rtvi: deprecate RTVI.observer() 2025-02-07 09:19:43 -08:00
Aleix Conchillo Flaqué
03ac744bcf rtvi: deprecate frame processors 2025-02-07 09:17:29 -08:00
Aleix Conchillo Flaqué
b058461a7d GoogleRTVIObserver: add explicit constructor 2025-02-07 09:15:32 -08:00
Mark Backman
abd9f16b90 Export .rtvi, update new-chatbot example, rename and update foundational 32 2025-02-07 09:15:32 -08:00
Aleix Conchillo Flaqué
d07732f2e8 rtvi: separate specific google RTVI into a GoogleRTVIObserver 2025-02-07 09:15:32 -08:00
Aleix Conchillo Flaqué
4d25582e16 dev-requirements: update pyright and ruff 2025-02-06 21:51:57 -08:00
Aleix Conchillo Flaqué
d4b2160f9c Merge pull request #1161 from pipecat-ai/aleix/prepare-0.0.56
update CHANGELOG for 0.0.56
2025-02-06 13:50:04 -08:00
Aleix Conchillo Flaqué
dd7926aab5 update CHANGELOG for 0.0.56 2025-02-06 13:45:13 -08:00
Aleix Conchillo Flaqué
070bf66980 transports: fix local transports audio cleanup 2025-02-06 13:45:13 -08:00
Aleix Conchillo Flaqué
962fc27dbd Merge pull request #1160 from pipecat-ai/aleix/fix-unit-test-logging
tests: remove logger from tests.utils
2025-02-06 13:26:37 -08:00
Mark Backman
3d4d6132fc Merge pull request #1158 from pipecat-ai/mb/update-22c
Update foundation examples 22b, 22c, and 22d to be ready for function…
2025-02-06 16:25:05 -05:00
Aleix Conchillo Flaqué
a96d9294b7 tests: remove logger from tests.utils 2025-02-06 13:18:28 -08:00
Aleix Conchillo Flaqué
a6e78550d5 Merge pull request #1156 from pipecat-ai/aleix/prefer-optional
prefer Optional over to "| None"
2025-02-06 13:08:48 -08:00
Adrian Cowham
d9f6b7b93c added an example using using Gemini's large context window for RAG 2025-02-06 12:49:29 -08:00
Mark Backman
969de92ad9 Update foundation examples 22b, 22c, and 22d to be ready for function calling 2025-02-06 15:36:16 -05:00
Aleix Conchillo Flaqué
c4dbe92b30 prefer Optional over to "| None" 2025-02-06 11:11:37 -08:00
Aleix Conchillo Flaqué
684764fece Merge pull request #1155 from pipecat-ai/aleix/sentry-fixes-and-example
sentry fixes and example
2025-02-06 11:09:31 -08:00
Aleix Conchillo Flaqué
c4be07693f examples: added sentry-metrics example 2025-02-06 10:46:04 -08:00
Aleix Conchillo Flaqué
c5d5ca8232 SentryMetrics: use transactions and call parent methods 2025-02-06 10:44:38 -08:00
Mark Backman
428e763814 Merge pull request #1149 from pipecat-ai/mb/update-google-default-llm-model
Use gemini-2.0-flash-001 as the default model for GoogleLLMService
2025-02-06 12:41:13 -05:00
Mark Backman
0efa2711ff Merge pull request #1152 from pipecat-ai/mb/docstrings
Add docstrings for PipelineTask and related classes/functions
2025-02-06 12:30:12 -05:00
Mark Backman
4904f52cee Use gemini-2.0-flash-001 as the default model for GoogleLLMService 2025-02-06 12:29:15 -05:00
Aleix Conchillo Flaqué
dbcf14ddb4 Merge pull request #1154 from pipecat-ai/aleix/twilio-telnyx-sample-rates
serializers: don't update twilio/telnyx sample rates
2025-02-06 09:27:42 -08:00
Aleix Conchillo Flaqué
7c13ec10d9 examples: cleanup ElevenLabsTTSService constructor arguments 2025-02-06 09:25:52 -08:00
Aleix Conchillo Flaqué
29b9dccc53 serializers: don't update twilio/telnyx sample rates 2025-02-06 09:25:52 -08:00
Aleix Conchillo Flaqué
e8ce826473 Merge pull request #1151 from pipecat-ai/aleix/base-output-transport-resample
BaseOutputTransport: resample incoming audio if needed
2025-02-06 09:25:07 -08:00
Aleix Conchillo Flaqué
bbb991dfd8 Merge pull request #1153 from pipecat-ai/aleix/base-input-transport-show-vad
BaseInputTransport: show VAD results when interruptions not allowed
2025-02-06 09:24:12 -08:00
Mark Backman
4432e7e4f7 Add docstrings for PipelineTask and related classes/functions 2025-02-06 11:04:54 -05:00
Aleix Conchillo Flaqué
ee9cce64b2 BaseInputTransport: show VAD results when interruptions not allowed 2025-02-06 07:40:03 -08:00
Aleix Conchillo Flaqué
1ae4f0150d BaseOutputTransport: resample incoming audio if needed 2025-02-06 07:37:43 -08:00
Mark Backman
4c77c3ed34 Merge pull request #1148 from pipecat-ai/mb/fix-twilio-serializer
Fix sample rate handling in Twilio and Telnyx serializers
2025-02-06 10:25:13 -05:00
Aleix Conchillo Flaqué
975b97472a Merge pull request #1144 from pipecat-ai/aleix/frame-processor-missing-init-warning
FrameProcessor: add an error about missing super().process_frame(...)
2025-02-06 07:18:35 -08:00
Mark Backman
c8ccf13bc7 fix: Use audio_in_sample_rate to deserialize data for TelnyxFrameSerializer 2025-02-06 09:59:21 -05:00
Mark Backman
ba59736f87 fix: Use audio_in_sample_rate to deserialize data for TwilioFrameSerializer 2025-02-06 09:55:15 -05:00
Jin Kim
5989e1ed16 Merge branch 'main' into whisper-api 2025-02-06 13:14:36 +09:00
fatwang2
8cda4512ad Merge branch 'pipecat-ai:main' into main 2025-02-06 10:50:25 +08:00
Aleix Conchillo Flaqué
bc21a0b817 FrameProcessor: add an error about missing super().process_frame(...) 2025-02-05 18:33:03 -08:00
Aleix Conchillo Flaqué
99d3227ff5 Merge pull request #1126 from pipecat-ai/aleix/prepare-0.0.55
update CHANGELOG for 0.0.55
2025-02-05 11:32:39 -08:00
Aleix Conchillo Flaqué
7730f59635 update CHANGELOG for 0.0.55 2025-02-05 11:30:40 -08:00
Aleix Conchillo Flaqué
ba31546c32 Merge pull request #1139 from pipecat-ai/aleix/task-start-metadata
pipeline task start metadata and unit test improvements
2025-02-05 10:51:51 -08:00
Aleix Conchillo Flaqué
a363d12d1f dev-requirements: fix conflicts because of nvidia-riva-client 2025-02-05 10:34:46 -08:00
Aleix Conchillo Flaqué
feab9c8fa2 tests: run_test() now uses PipelineTask 2025-02-05 10:34:38 -08:00
Aleix Conchillo Flaqué
61f6669926 task: allow passing StartFrame metadata via start_metadata param 2025-02-05 10:34:38 -08:00
Aleix Conchillo Flaqué
3be69908d2 Merge pull request #1131 from pipecat-ai/aleix/global-audio-sample-rates
introduce PipelineParams audio input/output sample rates
2025-02-05 08:11:25 -08:00
Aleix Conchillo Flaqué
fcb80ec330 playht: don't set sample_rate in _settings 2025-02-05 07:46:24 -08:00
Mark Backman
c9f5684e2f OpenAITTSService: Add warning about changing sample_rate 2025-02-05 10:13:46 -05:00
Mark Backman
c257fa1573 AzureTTSService, AzureHttpTTSService: add start() method 2025-02-05 10:05:19 -05:00
Mark Backman
97c55da29f PlayHTHttpTTSService: add start() method to set sample_rate 2025-02-05 09:54:41 -05:00
Aleix Conchillo Flaqué
49426aa9a1 transport(websocket): improve exception logging 2025-02-04 23:50:45 -08:00
Aleix Conchillo Flaqué
0a333c26da services(elevenlabs): warn if sample rate not supported 2025-02-04 23:50:21 -08:00
Aleix Conchillo Flaqué
75a29424ff examples(telnyx-chatbot): use cartesia so we can use 8khz 2025-02-04 23:49:50 -08:00
Filipi da Silva Fuchter
cd1b429308 Merge pull request #1133 from pipecat-ai/fixing_krisp_issue
Fixing the issue in Krisp when trying to create more than one
2025-02-04 20:44:29 -03:00
Filipi Fuchter
7f1ae4b8cc Fixing the issue in Krisp when trying to create more than one filter in the same process. 2025-02-04 20:10:56 -03:00
Aleix Conchillo Flaqué
af9fd811cd examples(moondream-chatbot): fix UserImageRequester 2025-02-04 14:37:53 -08:00
Aleix Conchillo Flaqué
69f5c9b9d3 update anthropic and openpipe versions 2025-02-04 14:37:36 -08:00
Aleix Conchillo Flaqué
ab45e481be introduce PipelineParams audio input/output sample rates 2025-02-04 14:12:56 -08:00
Pedro Moreira
79ac696973 Add support for Piper TTS 2025-02-04 13:51:33 -03:00
Jin Kim
ef1e4277d3 Add an example for Whisper using OpenAI API 2025-02-04 10:32:55 +09:00
Jin Kim
823b763b25 Change OpenAI example file name 2025-02-04 10:28:06 +09:00
Jin Kim
3cb189eb1f Add whisper STT service using OpenAI API 2025-02-04 10:27:28 +09:00
Aleix Conchillo Flaqué
cc54255c41 Merge pull request #1125 from pipecat-ai/aleix/twilio-chatbot-improvements 2025-02-03 11:10:33 -08:00
Aleix Conchillo Flaqué
1cdb66f889 examples(twilio-chatbot): create sample rate variable 2025-02-03 10:58:06 -08:00
Aleix Conchillo Flaqué
51a86a509c examples: multiple twilio-chatbot improvements 2025-02-03 10:36:24 -08:00
Aleix Conchillo Flaqué
824898f7b7 Merge pull request #1121 from pipecat-ai/aleix/audio-resamplers
introduce audio resamplers
2025-02-03 10:32:55 -08:00
Aleix Conchillo Flaqué
57dadb6359 audio(utils): some variable renames 2025-02-03 09:33:04 -08:00
Aleix Conchillo Flaqué
5dcdc68ef5 examples: fix 22 series initial gate state 2025-02-03 09:16:58 -08:00
Aleix Conchillo Flaqué
aafb2db620 GatedOpenAILLMContextAggregator: use keyword argument and add start_open 2025-02-03 09:16:44 -08:00
Aleix Conchillo Flaqué
f3f22cf61c AudioBufferProcessor: add start_recording()/stop_recording() 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
371c2f3704 canonical: do not reset audio buffers 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
1f14f62696 AudioBufferProcessor: fix audio buffer silence computation 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
06449eff2c BaseAudioResampler: make resample() async 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
dcfb86583d serializers: serialize()/deserialize() are now async 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
cda34a1320 AudioBufferProcessor: fix user/bot audio buffers silence padding 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
13611fd8e1 AudioBufferProcessor: call callback on CancelFrame 2025-02-01 11:06:58 -08:00
Aleix Conchillo Flaqué
fc89aad469 introduce audio resamplers 2025-02-01 11:06:55 -08:00
Aleix Conchillo Flaqué
6c7474e1a2 frames: add pass to DTMFFrames 2025-01-31 18:37:40 -08:00
Aleix Conchillo Flaqué
95f0dbf3f3 CHANGELOG.md: task.cancel() and EndFrame clarification 2025-01-31 18:35:35 -08:00
Aleix Conchillo Flaqué
11aeb68ddb frames: fix type s/OuputDTMFFrame/OutputDTMFFrame/ 2025-01-31 18:28:38 -08:00
Aleix Conchillo Flaqué
a43c102fc8 Merge pull request #1064 from jcbjoe/jg/additional_dtmf_frames
Added: Additional DTMF frames
2025-01-31 18:25:08 -08:00
Chad Bailey
d236973c0f moved sqlite code back to a single example 2025-01-31 23:18:06 +00:00
Mark Backman
16b49bdce6 Merge pull request #1122 from pipecat-ai/mb/openai-org-id
Add organization and project level auth in OpenAILLMService
2025-01-31 14:35:26 -05:00
Mark Backman
41477c8f78 Add organization and project level auth in OpenAILLMService 2025-01-31 14:27:25 -05:00
Aleix Conchillo Flaqué
bb9a2560c3 Merge pull request #1118 from pipecat-ai/aleix/task-manager
introduce TaskManager
2025-01-31 10:24:52 -08:00
Aleix Conchillo Flaqué
002699f16c rtvi: delay creating tasks until we get StartFrame 2025-01-31 10:06:11 -08:00
chadbailey59
a17243bc1e More Storybot updates (#1116)
* initial changes for gemini storybot

* storybot updates for gemini

* more storybot updates

* interim interruptible commit

* cleanup

* cleanup

* cleanup

* first draft

* wip

* more storybot fixes

* more storybot updates WIP

* committing before changing the image prompting strategy

* wip

* prompt updating

* cleanup

* cleanup

* cleanup

* readme cleanup

* fixup
2025-01-30 20:13:18 -06:00
Aleix Conchillo Flaqué
d95819746a tests: make sure QueuedFrameProcessor push frames 2025-01-30 13:48:44 -08:00
Aleix Conchillo Flaqué
b65f32e8e1 task: start TaskObserver when tasks can be created
We have to start proxy observer tasks once we know the TaskManager has an event
loop.
2025-01-30 13:46:56 -08:00
Aleix Conchillo Flaqué
0131d0a531 examples: make sure unhandled frames are always pushed 2025-01-30 13:15:49 -08:00
Aleix Conchillo Flaqué
642affb2fe add missing super().process_frame() calls 2025-01-30 13:15:17 -08:00
Aleix Conchillo Flaqué
a145005498 SyncParallelPipeline: cleanup source/sink processors 2025-01-30 13:13:02 -08:00
Aleix Conchillo Flaqué
241f241ed9 SyncParallelPipeline: don't add source/sink processors inside pipeline 2025-01-30 13:12:37 -08:00
Aleix Conchillo Flaqué
85e572e2d8 gladia: cleanup receive messages task 2025-01-30 13:10:47 -08:00
Aleix Conchillo Flaqué
10716e8ec1 utils: protect obj_id() and obj_count() with a lock 2025-01-30 13:10:36 -08:00
Aleix Conchillo Flaqué
41d60a14cc introduce TaskManager and PipelineRunner event loop 2025-01-30 13:10:36 -08:00
Aleix Conchillo Flaqué
e69c065a86 update CHANGELOG and fix formatting 2025-01-30 08:55:29 -08:00
Aleix Conchillo Flaqué
f90c17ab30 Merge pull request #1083 from team-telnyx/creating_telnyx_chatbot
Creating telnyx chatbot
2025-01-30 08:49:20 -08:00
Aleix Conchillo Flaqué
bc4fdd587a Merge pull request #1103 from pipecat-ai/aleix/tts-service-push-silence-before-tts-stop-frame
services(tts): allow pushing silence audio before TTSStoppedFrame
2025-01-30 08:48:41 -08:00
Aleix Conchillo Flaqué
665a6017f9 services(tts): allow pushing silence audio before TTSStoppedFrame 2025-01-30 08:46:56 -08:00
Aleix Conchillo Flaqué
4119d7a115 Merge pull request #1104 from pipecat-ai/aleix/twilio-transport-message-frames
serializers(twilio): handle transport message frames
2025-01-30 08:45:55 -08:00
Aleix Conchillo Flaqué
2634b03ffa serializers(twilio): handle transport message frames 2025-01-30 08:30:09 -08:00
Aleix Conchillo Flaqué
6a50759b9f Merge pull request #1105 from pipecat-ai/aleix/websocket-client
added new websocket client transport
2025-01-30 08:28:26 -08:00
Mark Backman
7982faba67 Merge pull request #1115 from pipecat-ai/mb/elevenlabs-language-fixes
Improve ElevenLabs language checking logic
2025-01-30 10:03:22 -05:00
Mark Backman
2b4bf57c04 Improve ElevenLabs language checking logic 2025-01-30 09:52:36 -05:00
Filipi Fuchter
7e3e126730 Migrating the base API URL for the react native example to an .env file. 2025-01-30 10:42:16 -03:00
Filipi Fuchter
75ca0571bb Improving the layout from the bot ready react native demo. 2025-01-30 10:31:04 -03:00
Filipi Fuchter
a48e5d0714 Only sending the message when it is a remote audio track. 2025-01-30 10:14:37 -03:00
Filipi Fuchter
2b6a992207 Sending the app-message to start playing audio once the track has started. 2025-01-30 09:37:33 -03:00
Filipi Fuchter
24cf106ed2 Refactoring the code to ask for the room that it should connect. 2025-01-30 09:14:18 -03:00
Rafal Skorski
b93e4ab9cb Formatting adjusted and the encoding selection moved from TelnyFrameSerilaizer to websocket_endpoint function in server.py 2025-01-30 12:52:30 +01:00
Dominic Stewart
c140c04b9a Merge pull request #1080 from DominicStewart/dom/voicemail-detection-bot
Add voicemail detection example
2025-01-30 09:20:12 +09:00
Dominic
a7c8d2af8e Removed extra space too 2025-01-30 09:18:29 +09:00
Dominic
f3f520a76a Removed formatting that vs code automatically adds to readme file 2025-01-30 09:17:27 +09:00
Mark Backman
5e0f42a3e0 Merge pull request #1111 from pipecat-ai/mb/gemini-restructure-messages
GoogleLLMContext: Allow _restructure_from_openai_messages to handle c…
2025-01-29 19:06:47 -05:00
Filipi Fuchter
95c8346cb5 Starting to create a react native client for the bot ready example. 2025-01-29 19:00:42 -03:00
Mark Backman
220ce9fd0f GoogleLLMContext: Allow _restructure_from_openai_messages to handle context frames that contain function call data and / or messages 2025-01-29 16:01:39 -05:00
Filipi da Silva Fuchter
5d0486a26f Merge pull request #1008 from pipecat-ai/cutting_initial_words
Avoid cutting off the beginning of the audio
2025-01-29 17:02:40 -03:00
Chad Bailey
bc98c2e36c added sqlite storage example 2025-01-29 19:12:15 +00:00
Aleix Conchillo Flaqué
091258f617 improve create_task names 2025-01-29 11:11:40 -08:00
Aleix Conchillo Flaqué
2a1408eb2a transports(websocket server): remove unused variable 2025-01-29 11:11:40 -08:00
Aleix Conchillo Flaqué
6393b41d58 transports(websocket): added WebsocketClientTransport 2025-01-29 11:11:37 -08:00
Filipi Fuchter
2a5728264c Adding missing dependency to openai 2025-01-29 15:52:42 -03:00
Filipi Fuchter
2ef0735462 Adding readme to teach how to use. 2025-01-29 15:45:48 -03:00
Filipi Fuchter
80bbfff4be Merge branch 'main' into cutting_initial_words 2025-01-29 15:36:52 -03:00
Aleix Conchillo Flaqué
4ff68e66b9 Merge pull request #1110 from pipecat-ai/aleix/frame-metadata
frames: added metadata field to Frame class
2025-01-29 10:30:59 -08:00
Aleix Conchillo Flaqué
3a688840fc frames: added metadata field to Frame class 2025-01-29 09:53:21 -08:00
Aleix Conchillo Flaqué
2ca8b95bbf Merge pull request #1106 from Vaibhav159/vl_moving_test_utils_to_pipecat_package
moving test utils inside of package
2025-01-29 09:44:34 -08:00
Mark Backman
2aafc6bd1d Merge pull request #1107 from AngeloGiacco/angelo/increase-ws-connection
fix: elevenlabs tts increase websocket max message size limit to 16MB
2025-01-29 10:04:42 -05:00
Angelo Giacco
0ff9ef8707 fix: add changelog 2025-01-29 14:27:39 +00:00
Angelo Giacco
596cae994d fix: elevenlabs tts increase websocket max message size limit to 16MB 2025-01-29 13:55:27 +00:00
Dominic
9ad9cb1ff8 Cleaned up formatting 2025-01-29 17:36:08 +09:00
Dominic Stewart
60e800e9ba Merge branch 'main' into dom/voicemail-detection-bot 2025-01-29 17:30:56 +09:00
Dominic
1c8f0ed7da Finalised code and added a bit about this example to the README 2025-01-29 17:27:44 +09:00
Vaibhav159
8407a86532 moving test utils inside of package 2025-01-29 12:46:43 +05:30
Dominic
417d661d28 Updated bot_runner and bot_daily with adjustments necessary to run voicemail detection from bot_daily code 2025-01-29 16:11:45 +09:00
Aleix Conchillo Flaqué
8cd23c42fc Merge pull request #1100 from pipecat-ai/aleix/use-task-cancel-on-left-disconnected
use `task.cancel()` when participant leaves/disconnects
2025-01-28 16:02:02 -08:00
Aleix Conchillo Flaqué
0547a15695 task: allow queuing a CancelFrame to cancel the task 2025-01-28 15:59:36 -08:00
Aleix Conchillo Flaqué
3fe2124314 examples: use task.cancel() when participant leaves or disconnects 2025-01-28 15:46:20 -08:00
Aleix Conchillo Flaqué
ba358a4f0a task: cleanup processors after task finishes running 2025-01-28 15:02:25 -08:00
Aleix Conchillo Flaqué
79ef8c947d Merge pull request #1099 from pipecat-ai/aleix/daily-transport-queue-events
transports(daily): queue events until join completes
2025-01-28 14:38:25 -08:00
Aleix Conchillo Flaqué
f024476b08 transports(daily): queue events until join completes 2025-01-28 11:22:42 -08:00
Dominic
73690a13d9 Moved voicemail detection to phone-chatbot and working on that now 2025-01-28 22:31:08 +09:00
Dominic
6ebf06a6fb Removed start_terminate_call function as unnecessary 2025-01-28 10:39:10 +09:00
Dominic
2f4f779c91 Fixed a few things 2025-01-28 10:39:10 +09:00
Dominic
941ee6e5e8 Add voicemail detection example 2025-01-28 10:39:10 +09:00
Aleix Conchillo Flaqué
cd5075ed7a Merge pull request #1097 from pipecat-ai/aleix/pipecat-0.0.57
prepare CHANGELOG for 0.0.54
2025-01-27 14:56:51 -08:00
Aleix Conchillo Flaqué
6f41a667c8 prepare CHANGELOG for 0.0.54 2025-01-27 14:48:56 -08:00
Aleix Conchillo Flaqué
0b222a7eae Merge pull request #1085 from pipecat-ai/aleix/task-creation-and-cancellation
improve task creation and cancellation
2025-01-27 14:47:20 -08:00
Aleix Conchillo Flaqué
f09f4b8fc4 services(tavus): fix EndFrame and CancelFrame processing 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
cca241a2b7 examples(22c): fix cancel_task call 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
1489e44740 gemini(multimodal live): fix model audio queue variable 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
f55f78e70e update CHANGELOG.md 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
10202dc529 transports(websockets): cancel or wait for tasks to finish 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
498805a34c FrameProcessor: add wait_for_task() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
509f143e1b update CHANGELOG.md 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
737e4fa3bd gemini(multimodal live): connect on StartFrame 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
8b5228a105 utils: move task functions to asyncio module 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
6cc01bc5b0 examples: update 14 series with TTSSpeakFrame 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
2a2928d96c gemini: create transcribe tasks only once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
a3a6adbd17 user_idle_processor: add missing parent cleanup() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
bf5ced18b2 fix parallel pipelines cleanup 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
2eccd1b1e9 utils: update some logging levels 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
9374bed878 tests: langchain fixes 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
c03d0352b1 utils/tasks: added new documentation 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
af90b8b4fa utils: add wait_for_task() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
0a9daa2f56 task: avoid canceling tasks more than once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
e48c0e52ef transports(daily): avoid canceling task more than once 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
6bca8396d3 utils: error if we try to cancel the same task multiple times 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
c2d8a45a07 runner: warn about remaining dangling tasks 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
80a7f1b1e7 runner: improve signal handler task cancellation 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
aff6e24560 pipeline: fix pipeline cleanup 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
cb93f6b368 utils: store created tasks and add current_tasks() 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
ff0bcec33a transports: improve task naming 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
5885fcc230 add id and name properties 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
57b186cde8 base_transport: add name and id fields 2025-01-27 14:42:23 -08:00
Aleix Conchillo Flaqué
d1a3f404a5 improve task creation and cancellation
If a FrameProcessor needs to create a task it should use
FrameProcessor.create_task() and FrameProcessor.cancel_task(). This gives
Pipecat more control over all the tasks that are created in Pipecat.

Both functions internally use the utils module: utils.create_task() and
utils.cancel_task() which should also be used outside of FrameProcessors. That
is, unless strictly necessary, we should avoid using asyncio.create_task().
2025-01-27 14:42:23 -08:00
chadbailey59
179ddbea7d Add dialout to the Daily phone example (#998)
* added dialout to daily phone example

* cleanup

* cleanup

* pre-commit hook

* Fix typo

* More explicit README instructions

---------

Co-authored-by: Mark Backman <mark@daily.co>
2025-01-27 12:21:30 -06:00
Mark Backman
86c1e6a3bd Merge pull request #1081 from pipecat-ai/mb/user-idle-add-retry
Added retry functionality and a new callback to the UserIdleProcessor
2025-01-27 10:30:45 -05:00
Mark Backman
9e9822f17d Use inspect.signature to determine which callback to use 2025-01-27 10:24:58 -05:00
Mark Backman
5f9671e2ca Added retry functionality and a new callback to the UserIdleProcessor 2025-01-27 10:24:57 -05:00
Mark Backman
aac8961ae5 Merge pull request #1078 from pipecat-ai/mb/improve-error-handling-truncate-audio
Add better error handling for OpenAIRealtimeBetaLLMService truncate errors
2025-01-27 08:54:39 -05:00
Mark Backman
3e6377346a Merge pull request #1093 from pipecat-ai/mb/update-example-6a 2025-01-26 19:43:39 -05:00
Mark Backman
9d9a622b1a Merge pull request #1094 from pipecat-ai/mb/readme-service-section 2025-01-26 19:43:12 -05:00
Mark Backman
3e9a6b6262 Merge pull request #1095 from pipecat-ai/mb/elevenlabs-lang-codes 2025-01-26 12:21:28 -05:00
Mark Backman
fb3097560f Remove eleven_multilinguagal_v2 from language code list 2025-01-26 07:17:38 -05:00
Mark Backman
ff6368add0 Update README.md
Adding a section so that table can be linked to.
2025-01-25 16:12:53 -05:00
Mark Backman
89fd03d86f Merge pull request #1090 from vengad-arrowhead/main
Adding hindi danda symbol as end of sentence marker
2025-01-25 09:36:19 -05:00
Mark Backman
0672530d6b Fix foundational example 6a to switch images when the bot is speaking 2025-01-25 08:40:42 -05:00
vengadanathan srinivasan
7a0cfc8d3d Adding hindi danda symbol as end of sentence marker 2025-01-25 14:55:51 +05:30
Mark Backman
b881dd57b3 Merge pull request #1086 from pipecat-ai/mb/fix-expiry-time-type-mismatch 2025-01-24 17:31:08 -05:00
Mark Backman
abf0d0d053 Improve token parameter construction using DailyMeetingTokenProperties 2025-01-24 17:22:31 -05:00
Mark Backman
1acdf7aff7 Fix expiry_time type validation in get_token REST API helper 2025-01-24 17:21:50 -05:00
Mark Backman
96b90abda6 Merge pull request #1082 from pipecat-ai/mb/update-function-calling-examples
Update function calling examples to push a TextFrame in the start_cal…
2025-01-24 17:21:13 -05:00
Filipi da Silva Fuchter
202a844eeb Merge pull request #1051 from pipecat-ai/gemini_grounding_metadata_rtvi
Sending Search Response to RTVI
2025-01-24 19:20:50 -03:00
Filipi Fuchter
655d56f634 Fixing pydantic validation when creating meeting token. 2025-01-24 19:15:56 -03:00
Filipi Fuchter
07c84b733b Sending Search Response to RTVI 2025-01-24 18:59:46 -03:00
Filipi da Silva Fuchter
7c52736ff6 Merge pull request #1030 from pipecat-ai/gemini_grounding_metadata
Introduce support for extracting and processing grounding metadata from GoogleLLMService.
2025-01-24 15:41:54 -03:00
Mark Backman
48ce751602 Merge pull request #1075 from Vaibhav159/vl_add_daily_meeting_token_v2
adding models to DailyRestHelper
2025-01-24 13:21:52 -05:00
Vaibhav159
1f1e2dac2b wrapping things up 2025-01-24 23:44:23 +05:30
Vaibhav159
71c2dc3d05 minor typing change 2025-01-24 23:38:44 +05:30
Vaibhav159
ef02ece662 doc string 2025-01-24 22:47:40 +05:30
Vaibhav159
d5818fad5b addressing comments 2025-01-24 22:46:54 +05:30
Rafal Skorski
9c22bd8df1 Improving read me and encoding support 2025-01-24 16:44:11 +01:00
Mark Backman
dbea86baae Update function calling examples to push a TextFrame in the start_callback 2025-01-24 10:21:08 -05:00
Vaibhav159
c5faac1cf8 adding RecordingsBucketConfig 2025-01-24 15:14:20 +05:30
Vaibhav159
e106d7a215 adding line space 2025-01-24 09:12:07 +05:30
Vaibhav159
40c1a8369a updated changelog 2025-01-24 09:11:15 +05:30
Vaibhav159
6ab2404a98 adding more properties to daily room 2025-01-24 09:10:25 +05:30
Mark Backman
e61c996a2e Merge pull request #1079 from ecdeng/patch-1
Update cartesia.py to use the new model pointer `sonic`
2025-01-23 22:15:30 -05:00
Eric Deng
2c81dc1f06 Update cartesia.py to use the new model pointer sonic instead of sonic-english
We are now using `sonic` as a pointer to the latest stable release (https://docs.cartesia.ai/build-with-sonic/models#continuous-updates). sonic-english will forever point to `sonic-2024-10-19`, which is already out of date.
2025-01-23 15:47:07 -08:00
Mark Backman
53251dcb88 Add better error handling for OpenAIRealtimeBetaLLMService truncate errors 2025-01-23 14:25:08 -05:00
Mark Backman
d4e4b12109 Merge pull request #1071 from porcelaincode/patch-1
Update runner.py
2025-01-23 13:19:22 -05:00
Mark Backman
466d26a4f2 Merge pull request #1077 from Vaibhav159/vl_fix_missing_leftover_audio
adding missing audio buffer fix
2025-01-23 13:16:41 -05:00
Vaibhav159
ef511d580d adding missing audio buffer fix 2025-01-23 23:17:49 +05:30
Vaibhav159
5957ddb038 adding missing audio buffer fix 2025-01-23 23:17:18 +05:30
Vaibhav159
799c2d14b8 adding meeting token v2 func 2025-01-23 21:40:42 +05:30
Rafal Skorski
8eef21db6e Adding telnyx serializer 2025-01-23 15:39:46 +01:00
vatsal
dee1224530 Update runner.py 2025-01-23 13:21:49 +05:30
Mark Backman
fc6aa6eae8 Merge pull request #1060 from chhao01/patch-1
[bug]TypeError: object of type 'NoneType' has no len()
2025-01-22 19:14:35 -05:00
Mark Backman
ddd5bf70ab Merge pull request #1061 from Allenmylath/patch-21
Update README.md
2025-01-22 19:13:15 -05:00
allenmylath
aa59744444 Update examples/README.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-01-23 05:38:37 +05:30
chadbailey59
067ddfe505 Storytelling chatbot updates (#1066)
* initial changes for gemini storybot

* storybot updates for gemini

* more storybot updates

* interim interruptible commit

* cleanup

* cleanup

* cleanup

* cleanup
2025-01-22 15:20:21 -06:00
Mark Backman
a64df978e7 Merge pull request #1046 from pipecat-ai/mb/transcript-tts
Modified `TranscriptProcessor` to use `TTSTextFrame`s
2025-01-22 15:46:01 -05:00
Mark Backman
7167719761 Emit a transcription callback when receiving a CancelFrame, update examples accordingly 2025-01-22 14:56:29 -05:00
Mark Backman
e1430be9f9 Code review fixes 2025-01-22 14:56:29 -05:00
Mark Backman
c2fe8e7fdb Updated CHANGELOG 2025-01-22 14:56:28 -05:00
Mark Backman
31c77d8e35 Update examples for the updated TranscriptProcessor 2025-01-22 14:56:00 -05:00
Mark Backman
2a60d54830 Update the AssistantTranscriptProcessor to use TTSTextFrames in place of OpenAILLMContextFrames 2025-01-22 14:56:00 -05:00
Aleix Conchillo Flaqué
b3c99887dc Merge pull request #1068 from Canonical-AI-Inc/import-fix
Fixing missing import
2025-01-22 11:37:49 -08:00
Mark Backman
38ad75cc17 Merge pull request #1065 from pipecat-ai/mb/fix-openai_realtime-function-calling
OpenAIRealtimeBetaLLMService: Fixed an error in function calling
2025-01-22 14:37:01 -05:00
Adrian Cowham
2debac314c fixing missing import 2025-01-22 11:06:53 -08:00
Mark Backman
e0c9a1a1a2 Merge pull request #1041 from Allenmylath/patch-20
Update bot.py
2025-01-22 09:18:19 -05:00
allenmylath
4cdcca588e Update examples/moondream-chatbot/bot.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-01-22 19:40:12 +05:30
allenmylath
a90e81e2eb Update examples/moondream-chatbot/bot.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-01-22 19:38:36 +05:30
Mark Backman
0ba60c9e28 Merge pull request #975 from imsakg/main
fix(gemini): prevent non-audio modality processing
2025-01-22 09:03:18 -05:00
Mark Backman
5ca5fbd825 OpenAIRealtimeBetaLLMService: Fixed an error in function calling 2025-01-22 08:54:03 -05:00
Joe Garlick
b72504f1cb Added: Additional DTMF frames 2025-01-22 13:47:23 +00:00
allenmylath
2b52e2c109 Update README.md
Silero-tts changed to VAD, also description regarding session handling added to websocket chatbot
2025-01-22 14:42:35 +05:30
Cheng Hao
7e8fc2e7e2 [bug]TypeError: object of type 'NoneType' has no len()
Sometimes the chunk.choices is None, and I got exception like: 
```
TypeError: object of type 'NoneType' has no len()
```
2025-01-22 15:31:27 +08:00
Aleix Conchillo Flaqué
0d79a9eaa6 update CHANGELOG.md 2025-01-21 18:00:10 -08:00
Aleix Conchillo Flaqué
f89b9ec23f Merge pull request #1057 from pipecat-ai/aleix/replace-resampy-soxr
improve audio resampling by switching from resampy to soxr
2025-01-21 17:52:49 -08:00
Mark Backman
20d5824e56 Merge pull request #1058 from pipecat-ai/mb/fix-trace-log 2025-01-21 20:44:50 -05:00
Aleix Conchillo Flaqué
f23baa78d8 test-requirements: add soxr and remove resampy 2025-01-21 17:40:17 -08:00
Aleix Conchillo Flaqué
cacd6ba3fa improve audio resampling by switching from resampy to soxr 2025-01-21 17:40:17 -08:00
Aleix Conchillo Flaqué
f87ecd3a51 Merge pull request #1048 from pipecat-ai/aleix/add-unittest-utils
tests: add some initial run_test() utilities
2025-01-21 17:39:06 -08:00
Mark Backman
b96a922aa8 Fix trace log line for resume_processing_frames 2025-01-21 18:15:03 -05:00
Aleix Conchillo Flaqué
401d3ff267 tests: added PipelineTask tests 2025-01-21 11:45:43 -08:00
Aleix Conchillo Flaqué
ab4221a4db task: added BaseTask 2025-01-21 11:45:43 -08:00
Aleix Conchillo Flaqué
bd6f82cf94 task: allow specifying heartbeat period 2025-01-21 11:45:43 -08:00
Aleix Conchillo Flaqué
dd21b424d6 pyproject: ignore 'audioop' deprecation warning 2025-01-21 10:27:34 -08:00
Aleix Conchillo Flaqué
76884877dd tests: add pytest-asyncio dependency 2025-01-21 10:23:19 -08:00
Aleix Conchillo Flaqué
0d6c680133 README: add unit tests badge 2025-01-21 10:14:37 -08:00
Aleix Conchillo Flaqué
a27fe4bde2 tests: move test_ai_services to test_utils_string 2025-01-21 10:06:14 -08:00
Aleix Conchillo Flaqué
177cb2ca8b tests: initial pipeline and parallelpipeline tests 2025-01-21 09:57:54 -08:00
Aleix Conchillo Flaqué
3c970a3cee tests: add more filter tests 2025-01-21 09:43:57 -08:00
Aleix Conchillo Flaqué
af02f8f1cd filters(frame_filter): allow more than one frame 2025-01-21 09:43:33 -08:00
Aleix Conchillo Flaqué
2e0fb198bf frame_processor: allow pushing more frames after EndFrame
This can be useful for testing purposes. In real practice, there shouldn't be
any frames after an EndFrame is pushed.
2025-01-21 09:42:15 -08:00
Filipi da Silva Fuchter
4f758c5a3b Merge pull request #1050 from pipecat-ai/fix_rtvi_warning_msg
Ignoring transport messages that are not intended to RTVI.
2025-01-21 13:36:50 -03:00
Rafal Skorski
89b87289e2 elevenlabs key added to env.example 2025-01-21 17:12:27 +01:00
Rafal Skorski
e0e190a1a2 Create telnyx chat bot example application 2025-01-21 17:09:55 +01:00
Filipi Fuchter
3e0836b340 Ignoring transport messages that are not intended to RTVI. 2025-01-21 10:08:14 -03:00
Aleix Conchillo Flaqué
2f23693bf3 tests: fix test_protobuf_serializer.py 2025-01-20 18:39:59 -08:00
Aleix Conchillo Flaqué
b7dd9748cf serializers: fix special fix initialization 2025-01-20 18:39:41 -08:00
Aleix Conchillo Flaqué
d4d9c3b7ae tests: fix test_aggregators.py 2025-01-20 18:16:14 -08:00
Aleix Conchillo Flaqué
090bc81ec5 tests: add some initial run_test() utilities 2025-01-20 17:41:21 -08:00
Filipi Fuchter
9b61633aa0 Introduce support for extracting and processing grounding metadata from Google LLM responses. 2025-01-20 11:28:12 -03:00
Mark Backman
e3d53d3d9a Merge pull request #1044 from pipecat-ai/mb/elevenlabs-http-fix-voice-settings
Fixed a type error when using voice_settings in ElevenLabsHttpTTSService
2025-01-20 08:11:38 -05:00
Mark Backman
262d3a19c9 Fixed a type error when using voice_settings in ElevenLabsHttpTTSService 2025-01-20 07:57:02 -05:00
allenmylath
491feb691c Update bot.py
quiet and talking frames are determined based on BotStartedSpeakingFrame and BotStoppedSpeakingFrame not ttsframe
2025-01-20 14:00:17 +05:30
Aleix Conchillo Flaqué
e4f83b237e update CHANGELOG (remove 07d-interruptible-elevenlabs-http.py) 2025-01-19 11:36:18 -08:00
fatwang2
fc90bdc638 changed to HailuoHttpTTSService 2025-01-19 09:43:48 +08:00
fatwang2
5a88165a26 Merge branch 'pipecat-ai:main' into main 2025-01-19 09:40:08 +08:00
Aleix Conchillo Flaqué
a169e0cde9 Merge pull request #1035 from pipecat-ai/aleix/prepare-0.0.53
update CHANGELOG for 0.0.53
2025-01-18 14:50:35 -08:00
Aleix Conchillo Flaqué
c6d643d4ec update CHANGELOG for 0.0.53 2025-01-18 14:48:48 -08:00
Aleix Conchillo Flaqué
2abbd4bb27 Merge pull request #1039 from pipecat-ai/aleix/fish-audio-websocket-service
services(fish): FishAudioTTSService to use WebsocketService
2025-01-18 14:48:20 -08:00
Aleix Conchillo Flaqué
e0011a3996 services(fish): FishAudioTTSService to use WebsocketService 2025-01-18 14:29:45 -08:00
Aleix Conchillo Flaqué
ea44c59ddd Merge pull request #1037 from Vaibhav159/fixing_unused_11labs_package
removing unused 11labs package imports
2025-01-17 22:08:04 -08:00
Vaibhav159
a9c7dbbc05 removing unused code 2025-01-18 10:58:07 +05:30
Vaibhav159
8a87e92b2b adding missing 11labs package 2025-01-18 10:48:57 +05:30
Mark Backman
982f2becc6 Merge pull request #1002 from pipecat-ai/mb/add-on-error-callback
Register the on_error handler
2025-01-17 21:58:59 -05:00
Mark Backman
e049ae470d Register the on_error handler 2025-01-17 21:49:42 -05:00
Mark Backman
e159f2dce1 Merge pull request #1024 from pipecat-ai/mb/elevenlabs-http
Add ElevenLabsHttpTTSService
2025-01-17 21:30:31 -05:00
Aleix Conchillo Flaqué
e9162ae467 Merge pull request #1004 from Fluentsai/feature/dtmf_input
Twilio serializer reading dtmf websocket messages
2025-01-17 18:14:46 -08:00
Aleix Conchillo Flaqué
bb65512ff4 Merge pull request #1034 from pipecat-ai/aleix/ulaw-resample-update
ulaw resample update
2025-01-17 17:47:18 -08:00
Mark Backman
b81323d676 Code review fixes + docstrings 2025-01-17 20:12:43 -05:00
Aleix Conchillo Flaqué
65fa77dfa5 audio: use resample_audio to resample ulaw bytes 2025-01-17 15:24:41 -08:00
Aleix Conchillo Flaqué
9ddd9ae27c Merge pull request #1011 from Vaibhav159/vl_deepgram_metrics_without_vad
adding metric generation without deepgram VAD
2025-01-17 14:47:19 -08:00
Aleix Conchillo Flaqué
12fc6e17ef Merge pull request #1033 from pipecat-ai/aleix/observers-performance
task: add TaskObserver and avoid pipeline blocking
2025-01-17 14:43:26 -08:00
Aleix Conchillo Flaqué
3e4020cdba task: add TaskObserver and avoid pipeline blocking
Observers now process frames in separate tasks. This avoids blocking the
pipeline while the observer is processing the frame.
2025-01-17 11:15:52 -08:00
Aleix Conchillo Flaqué
4f883ee31f Merge pull request #1023 from pipecat-ai/aleix/introduce-heartbeat-frames
introduce heartbeat frames
2025-01-17 10:31:07 -08:00
Mark Backman
3ff360f042 Merge pull request #1032 from pipecat-ai/mb/user-idle-fixes
Start UserIdleProcessor on speaking frame, fix bug not pushing EndFrame
2025-01-17 13:18:09 -05:00
Aleix Conchillo Flaqué
45cbad5b3e task: add HEARTBEAT_MONITOR_SECONDS 2025-01-17 10:11:28 -08:00
Aleix Conchillo Flaqué
477d0d154b frame_processor: make sure clock is initialized 2025-01-17 10:05:23 -08:00
Aleix Conchillo Flaqué
4b3c776f58 task: don't use push queue to send a heartbeat
This is because we might be waiting for the EndFrame. Currently, if we push an
EndFrame to the task, the task will block until the EndFrame traverses all the
pipeline.
2025-01-17 10:04:24 -08:00
Aleix Conchillo Flaqué
da0c4cfd99 task: increase heartbeat monitoring to 5 seconds 2025-01-17 10:04:05 -08:00
Aleix Conchillo Flaqué
f22a00570d task: start heartbeats task when push task starts 2025-01-17 10:03:13 -08:00
Mark Backman
85f4663a41 Start UserIdleProcessor on speaking frame, fix bug not pushing EndFrame 2025-01-17 12:54:17 -05:00
Aleix Conchillo Flaqué
915e3bb3c7 Merge pull request #1029 from Vaibhav159/vl_fixing_idle_frame_processor_logic
fixing IdleFrameProcessor and UserIdleProcessor init logic
2025-01-17 06:48:13 -08:00
Vaibhav159
80779c48d6 sort fix 2025-01-17 20:07:25 +05:30
Vaibhav159
c444557965 fixing IdleFrameProcessor and UserIdleProcessor init logic 2025-01-17 19:50:53 +05:30
Mark Backman
d51893f61c Refactor for aiohttp, correct use of settings 2025-01-16 23:49:53 -05:00
fatwang2
3466842cd4 add hailuo tts service 2025-01-17 12:46:05 +08:00
Mark Backman
740d2743df Add TTFB metrics 2025-01-16 23:05:53 -05:00
Mark Backman
0dd22fb879 Merge pull request #1022 from pipecat-ai/mb/fix-abstractmethod
Remove @abstractmethod from set_model and set_model in TTSService class
2025-01-16 22:59:26 -05:00
Mark Backman
225b65c3d2 Add ElevenLabsHttpTTSService 2025-01-16 22:46:32 -05:00
Aleix Conchillo Flaqué
2503f76107 examples: add 31-heartbeats.py 2025-01-16 19:31:13 -08:00
Aleix Conchillo Flaqué
ff8aa68942 introduce heartbeat frames 2025-01-16 19:31:13 -08:00
Maxim Makatchev
c5edbf4b75 Made InputDTMFFrame a DataFrame and moved up to data frames 2025-01-17 12:27:04 +09:00
Aleix Conchillo Flaqué
799777774b Merge pull request #1018 from pipecat-ai/aleix/streamline-thread-pool-executors
transports: streamline max_workers for ThreadPoolExecutors
2025-01-16 19:05:41 -08:00
Mark Backman
fdef8a97e2 Remove @abstractmethod from set_model and set_model in TTSService class 2025-01-16 21:36:51 -05:00
Mark Backman
0163247410 Merge pull request #1021 from pipecat-ai/mb/improve-30
Add a second observer to the 30-observer.py example
2025-01-16 21:19:35 -05:00
James Hush
221e044046 demo: Update translator bot example (#1005)
* docs: Update translator bot example

Updates the translator bot to do the following:

- Allow you to specify the in and out languages
- Uses TranscriptionProcessor to handle transcriptions

* Simplify the example, improve performance

---------

Co-authored-by: Mark Backman <mark@daily.co>
2025-01-17 10:08:15 +08:00
Mark Backman
532fd31fd7 Add a second observer to the 30-observer.py example 2025-01-16 19:46:18 -05:00
Mark Backman
3e178fd46f Merge pull request #1020 from pipecat-ai/mb/observer-foundational
Add foundational example 30 to show how to use an Observer
2025-01-16 19:28:26 -05:00
Mark Backman
07cb8b7a89 Extend the example to include BotStartedSpeakingFrame and BotStoppedSpeakingFrame 2025-01-16 19:24:01 -05:00
Mark Backman
e805738d4c Merge pull request #1009 from pipecat-ai/mb/tts-ignore-interim-transcripts
TTSService should only process LLMTextFrames
2025-01-16 17:09:24 -05:00
Mark Backman
119bc7e35f Update check to exclude transcription frames 2025-01-16 16:43:46 -05:00
Mark Backman
b9b02845a3 Add foundational example 30 to show how to use an Observer 2025-01-16 16:37:32 -05:00
Aleix Conchillo Flaqué
3714f12edc Merge pull request #1019 from Canonical-AI-Inc/canonical-transcripts
Add transcript to Canonical Metrics Service
2025-01-16 13:36:55 -08:00
Aleix Conchillo Flaqué
d2b8171197 transports: streamline max_workers for ThreadPoolExecutors 2025-01-16 13:34:04 -08:00
Filipi Fuchter
c4c15eff39 Sending a silence frame to prevent the audio from clipping. 2025-01-16 18:30:19 -03:00
Adrian Cowham
d0b48c95bb updated the example to use stereo audio and pass in the context. also updated the service to send the transcripts if they're available 2025-01-16 13:12:38 -08:00
Aleix Conchillo Flaqué
73ed0c1ad7 Merge pull request #1017 from pipecat-ai/aleix/additional-trace-logging
additional trace logging
2025-01-16 12:38:47 -08:00
Vanessa Pyne
c211580fec Merge pull request #1016 from pipecat-ai/vp-1007-nonetype
services(gemini_multimodal_live): set content to [] if not present in messages
2025-01-16 14:14:50 -06:00
Aleix Conchillo Flaqué
359b55a85e additional trace logging 2025-01-16 11:19:42 -08:00
Filipi Fuchter
7efd00e0f7 Asking for the bot to send the audio only when the audio element is already on playing state. 2025-01-16 16:00:56 -03:00
kompfner
8b602a3f62 Merge pull request #1010 from pipecat-ai/ios-simplechatbot-assorted-improvements
iOS SimpleChatbot assorted improvements
2025-01-16 13:59:45 -05:00
kompfner
485c231f69 Merge pull request #1012 from pipecat-ai/simplechatbot-readme-local-pipecat
Add to the SimpleChatbot server README a step for pointing to the loc…
2025-01-16 13:46:19 -05:00
vipyne
8ba3b150eb services(gemini_multimodal_live): set content to [] if not present in messages
... which it will be if the message is a tool call
2025-01-16 11:59:02 -06:00
Paul Kompfner
b5f72b4378 Add to the SimpleChatbot server README a step for pointing to the local version of pipecat 2025-01-16 11:59:44 -05:00
Vaibhav159
85e7d62f94 fixing log text 2025-01-16 21:36:51 +05:30
Vaibhav159
923d33eeff fixing ruff 2025-01-16 21:32:48 +05:30
Vaibhav159
7ee6e7193d adding metric generation without deepgram VAD 2025-01-16 21:23:56 +05:30
Paul Kompfner
156fffe6fc In iOS SimpleChatbot demo, add clarifying note to Audio Settings section header explaining that "(No selection = system default)".
Ideally we could add a row showing that the system default is selected, but this is OK as a short-term fix. Also, the presence of that row might suggest that "system default" is selectable, but it's not: this is currently a limitation in the Pipecat Client.
2025-01-16 10:32:55 -05:00
Paul Kompfner
c9834e2712 In iOS SimpleChatbot demo, remove unused LLMHelperDelegate protocol conformance 2025-01-16 10:31:17 -05:00
Paul Kompfner
1e7e307f69 In iOS SimpleChatbot demo, call release() when disconnecting the voice client, since we're not using it after disconnecting 2025-01-16 10:30:06 -05:00
Mark Backman
67e47a388d TTSService should only process LLMTextFrames 2025-01-16 10:03:24 -05:00
Filipi Fuchter
119c0da299 Configuring a proxy so we can test from mobile 2025-01-16 11:02:53 -03:00
Filipi Fuchter
ea1323723d Handling the signalling to play the audio 2025-01-16 10:42:22 -03:00
Filipi Fuchter
d2efe27350 Improving the logs and updating status 2025-01-16 10:36:45 -03:00
Filipi Fuchter
5dc7d2a378 Creating the bot when pressing to connect. 2025-01-16 10:28:39 -03:00
Filipi Fuchter
88c540f9bc Starting to create the example signalling through app message. 2025-01-16 10:14:38 -03:00
Maxim Makatchev
dcf317f2fa Twilio serializer reading dtmf websocket messages and generating InputDTMFFrame containing the corresponding value of KeypadEntry 2025-01-16 17:43:12 +09:00
Aleix Conchillo Flaqué
b8ffd7b16b Merge pull request #996 from pipecat-ai/aleix/introduce-observers
introduce pipeline frame observers
2025-01-15 18:05:33 -08:00
Aleix Conchillo Flaqué
08f1dda94e observers: add a timestamp to on_push_frame() 2025-01-15 17:45:00 -08:00
Aleix Conchillo Flaqué
45039e7cde update CHANGELOG.md 2025-01-15 17:45:00 -08:00
Aleix Conchillo Flaqué
e50c76d075 examples(simple-chatbot): use RTVIObserver for server-client messages 2025-01-15 17:45:00 -08:00
Aleix Conchillo Flaqué
dd9f9179cc rtvi(RTVIObserver): use observers for RTVI server->client messages 2025-01-15 17:45:00 -08:00
Aleix Conchillo Flaqué
c8da531402 pipeline(task): add support for pipeline frame observers 2025-01-15 17:43:59 -08:00
Aleix Conchillo Flaqué
25bcaf5c7c observers: introduce pipeline observers 2025-01-15 17:43:59 -08:00
Aleix Conchillo Flaqué
2d0f3341c3 frames: add LLMTextFrame and TTSTextFrame
This is to distinguish what type of service has generated the TextFrames.
2025-01-15 17:43:59 -08:00
Aleix Conchillo Flaqué
7626d7b04b Merge pull request #999 from pipecat-ai/aleix/add-pre-commit-hooks
add pre-commit hooks
2025-01-15 17:39:34 -08:00
Aleix Conchillo Flaqué
f78520f7d0 add pre-commit hooks
Fixes #945
2025-01-15 13:44:21 -08:00
Aleix Conchillo Flaqué
bb4766455d Merge pull request #997 from pipecat-ai/aleix/update-dependencies-01-15-25
update dependencies (go back to numpy1)
2025-01-15 13:35:46 -08:00
Aleix Conchillo Flaqué
9dacbbbbf4 fix ruff formatting 2025-01-15 13:02:13 -08:00
Aleix Conchillo Flaqué
4de192fbb0 update dependencies (go back to numpy1)
Fixes #911, #913
2025-01-15 12:04:28 -08:00
kompfner
80b6c28431 Merge pull request #992 from pipecat-ai/live-updates-to-selected-and-available-mics
In the iOS SimpleChatbot demo, wire up live updates to the selected m…
2025-01-15 15:00:14 -05:00
Mark Backman
f471744bca Merge pull request #995 from pipecat-ai/vp-riva-bump
deps(riva): bump to 2.18.0
2025-01-15 14:35:39 -05:00
Mark Backman
d5df4b064b Merge pull request #987 from pipecat-ai/mb/deepseek-typo
Fix error log in DeepSeekLLMService and CerebrasLLMService
2025-01-15 14:31:34 -05:00
Mark Backman
06a0e29920 Merge pull request #991 from pipecat-ai/mb/update-web-simple-chatbot
Update simple-chatbot example to use the latest client SDKs
2025-01-15 13:36:03 -05:00
Aleix Conchillo Flaqué
64eb8e7262 Merge pull request #994 from Vaibhav159/vl_deepgram_with_vad
finalize on DeepgramSTTService on VAD
2025-01-15 10:28:11 -08:00
Filipi da Silva Fuchter
d8386c12dc Merge pull request #990 from pipecat-ai/bumping_ios_example
Using PipecatClient version 0.3.2
2025-01-15 14:29:01 -03:00
vipyne
50e798bcd9 deps(riva): bump to 2.18.0 2025-01-15 10:24:57 -06:00
Vaibhav159
d1ac7751da finalize on DeepgramSTTService 2025-01-15 20:43:23 +05:30
Paul Kompfner
110ce27c91 In the iOS SimpleChatbot demo, wire up live updates to the selected mic and available mics list. This is beneficial for a few reasons:
- Live updates are nice! We can now more easily see what's going on when we connect or disconnect a mic.
- Resolves an issue where the initial selected mic was not shown.
- Let us see when the Pipecat client automatically switches to a new mic, like when one is connected.
2025-01-15 09:56:27 -05:00
Mark Backman
8b657158ca Update React simple-chatbot client to use latest client SDKs 2025-01-15 09:50:43 -05:00
Mark Backman
cce14fca97 Update JS simple-chatbot client to use latest client SDKs 2025-01-15 09:47:20 -05:00
Filipi Fuchter
7c051516d8 Using PipecatClient version 0.3.2 2025-01-15 09:57:57 -03:00
Mark Backman
5f402ad741 Merge pull request #988 from pipecat-ai/mb/readme-openrouter
Update README.md
2025-01-14 18:38:35 -05:00
Mark Backman
a80b186cea Update README.md
Add OpenRouter to the README
2025-01-14 18:08:14 -05:00
Mark Backman
c65aaf3b2e Merge pull request #967 from sahilsuman933/openrouter-integration
feat(services): Add OpenRouter LLM Service Integration
2025-01-14 18:06:13 -05:00
Mark Backman
e815d7776f Fix error log in DeepSeekLLMService and CerebrasLLMService 2025-01-14 18:03:29 -05:00
sahil suman
11fc08ef24 fix changelog
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:57:09 +05:30
sahil suman
6f3b0fdf73 fix changelog
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:56:16 +05:30
sahil suman
885bc32827 added changes in changelog.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:53:04 +05:30
sahil suman
7339cc7197 Merge remote-tracking branch 'origin/main' into openrouter-integration 2025-01-15 02:52:19 +05:30
sahil suman
62e9e6bc5a changed the file name.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:21:58 +05:30
Sahil Suman
329da50338 Update src/pipecat/services/openrouter.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-01-15 02:20:22 +05:30
sahil suman
4d307d26d8 made the required changes.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 02:19:05 +05:30
Mark Backman
a74b9354ec Merge pull request #962 from pipecat-ai/mb/improve-tts-reconnection-logic
Improve websocket based TTS service reconnection logic
2025-01-14 14:48:00 -05:00
sahil suman
11381a536f added example for function calling and made the required changes.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-15 01:00:33 +05:30
Mark Backman
b53bc8a879 _calculate_wait_times as private, add and use WebsocketServiceException 2025-01-14 13:20:13 -05:00
Mark Backman
e3d8910814 Update CHANGELOG 2025-01-14 13:12:40 -05:00
Mark Backman
e60a59434f Refactor LMNTTTSService to make a websocket connection directly, then use the WebsocketService base class 2025-01-14 13:09:58 -05:00
Mark Backman
5e5de618f3 Update PlayHTTTSService to use the WebsocketService base class 2025-01-14 13:09:58 -05:00
Mark Backman
8af92f7923 Update ElevenLabsTTSService to use the WebsocketService base class 2025-01-14 13:09:58 -05:00
Mark Backman
f39e17857e Add a WebsocketService base class to retry, ensure that retries reset after a successful connection, update Cartesia to use the new WebsocketService 2025-01-14 13:09:58 -05:00
Aleix Conchillo Flaqué
5b632de04a Merge pull request #982 from pipecat-ai/aleix/pipelinetask-cleanup-sink
pipeline(task): cleanup Sink processor
2025-01-14 09:14:03 -08:00
Mark Backman
6bcc196489 Merge pull request #969 from pipecat-ai/mb/deepseek
Add support for DeepSeek LLM
2025-01-14 09:40:06 -05:00
Mark Backman
66375e9dff Update dot-env.template API keys 2025-01-14 09:34:34 -05:00
Mark Backman
bc839492b6 Add support for DeepSeek LLM 2025-01-14 09:34:33 -05:00
Filipi da Silva Fuchter
4854645637 Merge pull request #960 from pipecat-ai/example_gemini_with_goolge_search
Example with Gemini using google search to retrieve news.
2025-01-14 10:07:15 -03:00
Mark Backman
98e80b7d4a Merge pull request #970 from pipecat-ai/mb/user-controlled-run-llm
Add an override_run_llm option to optionally defer function call completion
2025-01-13 18:48:00 -05:00
Mark Backman
8c0ecb89de Refactor for new on_context_updated callback and new frame properties 2025-01-13 17:20:41 -05:00
Aleix Conchillo Flaqué
4c8fcb2cfc pipeline(task): cleanup Sink processor
Fixes #953
2025-01-13 13:29:44 -08:00
Aleix Conchillo Flaqué
92313d6ce7 Merge pull request #972 from pipecat-ai/aleix/simple-chatbot-android-workflow-update
github: only run android simple-chatbot worflow if android example modified
2025-01-13 13:26:12 -08:00
Mark Backman
1ca6ecc46e Update CHANGELOG 2025-01-13 09:49:09 -05:00
Mark Backman
f1947d7d38 Update Anthropic and Gemini to allow overriding run_llm 2025-01-13 09:48:43 -05:00
Mark Backman
0852570212 Update Grok for function call override 2025-01-13 09:48:43 -05:00
Mark Backman
874b8bb136 Allow for an override of running a completion after a function call completes, OpenAI 2025-01-13 09:48:43 -05:00
Mark Backman
da1878537b Merge pull request #974 from pipecat-ai/mb/26d-example
Align 26d example with foundation norms
2025-01-12 19:44:31 -05:00
Mark Backman
f406d93b0f Align 26d example with foundation norms 2025-01-12 19:19:16 -05:00
Aleix Conchillo Flaqué
3cd2b90177 Merge pull request #971 from pipecat-ai/aleix/update-copyright-keep-original-year
update copyright keeping original year (2024)
2025-01-12 11:37:15 -08:00
Aleix Conchillo Flaqué
c4f0c7bcfd github: only run android simple-chatbot worflow if android example modified 2025-01-12 11:35:34 -08:00
Aleix Conchillo Flaqué
95e69597f3 update copyright keeping original year (2024) 2025-01-12 11:34:00 -08:00
Aleix Conchillo Flaqué
710baa5e17 Merge pull request #973 from pipecat-ai/aleix/simple-chatbot-clients
examples/simple-chatbot: move clients to client directory
2025-01-12 11:28:21 -08:00
Mert Sefa AKGUN
14e5419913 fix(gemini): prevent non-audio modality processing
Add an early return in the _handle_transcribe_model_audio method to
prevent unnecessary processing when the modalities setting is not set
to audio. This change ensures that audio transcription only occurs
when appropriate.
2025-01-12 22:17:10 +03:00
Mark Backman
8c953bac41 Merge pull request #966 from imsakg/main
fix(services): handle TranscriptionFrame separately in TTSService
2025-01-12 11:33:38 -05:00
Mark Backman
4c0861ce39 Some addition links and README changes 2025-01-12 09:27:23 -05:00
Mark Backman
12b1e1db9d Merge pull request #965 from pipecat-ai/mb/aws-add-session-token
Add optional aws_session_token for PollyTTSService
2025-01-12 09:13:03 -05:00
Mark Backman
53bfdfd83f Merge pull request #963 from pipecat-ai/mb/cleanup-examples
Update examples to align with latest best practices
2025-01-12 09:12:34 -05:00
Mark Backman
2a5593afea Merge pull request #968 from pipecat-ai/mb/readme-websocket
Update README.md
2025-01-12 09:12:19 -05:00
Aleix Conchillo Flaqué
a04a920e54 examples/simple-chatbot: move clients to client directory 2025-01-11 19:16:05 -08:00
Aleix Conchillo Flaqué
2ce6d92455 Merge pull request #959 from KevGTL/fix-livekit-transport
fix: push input audio frame only via push_audio_frame()
2025-01-11 19:03:35 -08:00
Mark Backman
1ecd5da219 Update README.md
Add websocket docs links to README.
2025-01-11 08:37:17 -05:00
sahil suman
e04da334d7 add support for openrouter.
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
2025-01-11 17:50:58 +05:30
Mert Sefa AKGUN
7ec351813c style(ai_services): fix import order with ruff 2025-01-11 13:04:26 +03:00
Mert Sefa AKGUN
df6c2fc403 fix(services): handle TranscriptionFrame separately in TTSService
Exclude TranscriptionFrame from text frame processing in TTSService by updating the type check condition. This resolves unintended processing behavior when handling different frame types.
2025-01-11 13:00:38 +03:00
Mark Backman
71e107725c Add optional aws_session_token for PollyTTSService 2025-01-10 19:33:47 -05:00
Mark Backman
4d0c11fcab Update examples to align with latest best practices 2025-01-10 15:07:06 -05:00
Mark Backman
a8ae79831e Merge pull request #921 from pipecat-ai/mb/playht-http
PlayHTHttpTTSService fixes
2025-01-10 13:26:45 -05:00
Mark Backman
86516d2415 PlayHTHttpTTSService fixes 2025-01-10 13:21:27 -05:00
Vanessa Pyne
5cd9dab14b Merge pull request #949 from imsakg/main
fix(examples): correct TTS service import and setup
2025-01-10 10:58:50 -06:00
Kwindla Hultman Kramer
a3e2e06975 Merge pull request #961 from pipecat-ai/khk/tiny-chatbot-readme-fix
fixed 404 in SimpleChatbot iOS example README
2025-01-10 08:45:05 -08:00
Kwindla Hultman Kramer
e7107b99c5 fixed 404 in SimpleChatbot iOS example README 2025-01-10 08:37:13 -08:00
Filipi Fuchter
aa1b8879ee Fixing ruff format 2025-01-10 13:21:51 -03:00
Mark Backman
6802459165 Merge pull request #956 from pipecat-ai/mb/tavus
Update the Tavus example and comment about using the PERSONA_ID
2025-01-10 11:18:05 -05:00
Filipi Fuchter
6719d1fddc Example with Gemini using google search to retrieve news. 2025-01-10 13:13:59 -03:00
kompfner
a798bf18f2 Merge pull request #955 from pipecat-ai/ios-simple-chatbot-mainactor-fixes
iOS SimpleChatbot @MainActor fixes
2025-01-10 09:37:02 -05:00
Kevin Oury
f9d0cca60f fix: push input audio frame only via push_audio_frame() 2025-01-10 15:02:38 +01:00
Mark Backman
cb22de0d13 Update the Tavus example and comment about using the PERSONA_ID 2025-01-10 08:01:00 -05:00
marcus-daily
7d161cc53b Setting target SDK to 35 2025-01-10 09:50:37 +00:00
marcus-daily
255abf46ef Updating Gradle and AGP 2025-01-10 09:50:37 +00:00
marcus-daily
27579bcb70 Fixing imports 2025-01-10 09:50:37 +00:00
marcus-daily
1295b64879 Updating library dependencies 2025-01-10 09:50:37 +00:00
marcus-daily
ca57670f65 Removing unnecessary drawables 2025-01-10 09:50:37 +00:00
marcus-daily
06d0a231b9 Android demo app for simple-chatbot example 2025-01-10 09:50:37 +00:00
Mert Sefa AKGUN
67af4e619b style(examples): fix ruff formatting in Gemini text example
Refactor `CartesiaTTSService` instantiation to comply with line
length requirements from the ruff linter.
2025-01-10 12:32:53 +03:00
Mert Sefa AKGUN
21c274944e Update examples/foundational/26d-gemini-multimodal-live-text.py
Co-authored-by: Vanessa Pyne <vipyne@gmail.com>
2025-01-10 12:28:13 +03:00
Paul Kompfner
3239249feb In the iOS SimpleChatbot, fix @MainActor-related warnings (which would be errors in Swift 6). The delegate methods aren't contractually guaranteed to run on the main thread, so we can't mark them as @MainActor. 2025-01-09 17:35:44 -05:00
Paul Kompfner
216979c377 Bump iOS SimpleChatbot's pipecat-client-ios-daily dependency to version 0.3.1 2025-01-09 16:22:26 -05:00
Filipi da Silva Fuchter
b9db53d3cd Merge pull request #952 from pipecat-ai/fixing_gemini_function_calling
Fixing GeminiMultimodalLiveLLMService function calling to work with pipecat-flows
2025-01-09 17:50:25 -03:00
Filipi Fuchter
58bfcc8370 Fixing GeminiMultimodalLiveLLMService function calling when using with pipecat-flows. 2025-01-09 12:22:37 -03:00
Mert Sefa AKGUN
6664c492ac feat(gemini): enable audio transcription in live text example
Add options to transcribe both user and model audio during the GeminiMultimodalLiveLLMService setup in the 26d-gemini-multimodal-live-text.py example.
2025-01-09 15:38:33 +03:00
Mert Sefa AKGUN
7634058f97 fix(examples): correct TTS service import and setup
- Update import to use CartesiaTTSService instead of CartesiaMultiLingualTTSService.
- Adjust GeminiMultimodalLiveLLMService setup to use set_model_modalities with TEXT modality.
2025-01-09 02:19:08 +03:00
Mark Backman
39c6446bdc Merge pull request #947 from pipecat-ai/mb/add-rime-set-voices
Add setters for model and voice to RimeHttpTTSService
2025-01-08 14:25:24 -05:00
Filipi da Silva Fuchter
2df7dfcc91 Merge pull request #943 from pipecat-ai/simple_chat_bot_ios
SimpleChatbot iOS app.
2025-01-08 16:17:39 -03:00
Mark Backman
c23c9e046c Add setters for model and voice to RimeHttpTTSService 2025-01-08 14:17:32 -05:00
Mark Backman
9dae753e8c Merge pull request #926 from imsakg/main
feat(gemini): add text handling to GeminiMultimodalLive
2025-01-08 13:42:17 -05:00
Mert Sefa AKGUN
40e9ee6d63 fix(examples): correct import order in Gemini example
- Move `CartesiaMultiLingualTTSService` import to maintain proper order.
- Reorganize `enum` import to adhere to styling standards.
2025-01-08 21:14:29 +03:00
Mert Sefa AKGUN
a342fe732e docs: update CHANGELOG with Gemini modalities and examples 2025-01-08 19:34:42 +03:00
Mert Sefa AKGUN
a729834482 refactor(gemini): reposition WebSocket connection code
Move WebSocket connection setup earlier in the function for better
organization and to prepare for subsequent configuration steps.
2025-01-08 19:29:36 +03:00
Mert Sefa AKGUN
94a6f1086e feat(gemini): change default modality to AUDIO
Modify the default modality in the `InputParams` class from TEXT to AUDIO
to better align with the intended use case for GeminiMultimodalLive
service.
2025-01-08 19:29:36 +03:00
Mert Sefa AKGUN
b42d3a8257 feat(gemini): add modality configuration for GeminiMultimodalLive
- Introduce `GeminiMultimodalModalities` enum for modality options.
- Add modality field to `InputParams`, defaulting to text.
- Simplify modality setup with `set_model_modalities` method.
- Refactor WebSocket configuration to support dynamic response modalities.
2025-01-08 19:29:36 +03:00
Mert Sefa AKGUN
12ae980abe feat(gemini): handle full text response in GeminiMultimodalLive
- Add a buffer to store bot text responses.
- Push a `LLMFullResponseStartFrame` when text begins.
- Clear the text buffer and send `LLMFullResponseEndFrame` after processing.
2025-01-08 19:29:36 +03:00
Mert Sefa AKGUN
cdb909958c feat(examples): add Gemini multimodal live text example
Introduce a new example `26d-gemini-multimodal-live-text.py` to
demonstrate the use of GeminiMultimodalLiveLLMService with text-only
responses. This example sets up a pipeline for audio input via DailyTransport,
processing with Gemini, and output via Cartesia TTS.
2025-01-08 19:29:35 +03:00
Mert Sefa AKGUN
c72c3025f6 feat(gemini): add configuration methods for response modalities
- Introduce `set_model_only_audio` and `set_model_only_text` methods
  to toggle between audio-only and text-only response modes in
  `GeminiMultimodalLiveLLMService`.
- Refactor configuration setup to a class attribute for improved
  reusability and maintenance.
- Remove redundant configuration instantiation in the WebSocket
  connection setup process.
2025-01-08 19:29:35 +03:00
Mert Sefa AKGUN
5cbd719780 feat(gemini): add text handling to GeminiMultimodalLive
- Introduce text attribute in Part class for handling string data.
- Incorporate text processing in GeminiMultimodalLiveLLMService to push TextFrame if text is present.
2025-01-08 19:29:35 +03:00
Filipi Fuchter
23d6290672 Removing not used class. 2025-01-08 12:05:04 -03:00
Filipi Fuchter
d4e7e11981 SimpleChatbot iOS app. 2025-01-08 12:00:11 -03:00
Mark Backman
8057fe3fcf Merge pull request #742 from Vaibhav159/vl_feature_websocket_fastapi_timeout
adding session_timeout param
2025-01-08 09:05:41 -05:00
Vaibhav159
3b446234a7 fix hyperlink 2025-01-08 10:54:27 +05:30
Vaibhav159
768487ffb3 final changelog 2025-01-08 10:53:32 +05:30
Vaibhav159
2da5620d10 adding changelog 2025-01-08 10:50:09 +05:30
Vaibhav159
af90d65b3b adding session timeout example in websocket-server example 2025-01-08 10:43:10 +05:30
Vaibhav159
c8569a7b67 Merge remote-tracking branch 'upstream/main' into vl_feature_websocket_fastapi_timeout 2025-01-08 10:21:36 +05:30
Vaibhav159
0ecd98c873 Merge branch 'main' into vl_feature_websocket_fastapi_timeout 2025-01-08 10:20:55 +05:30
Mark Backman
6f863ba2c6 Merge pull request #938 from jcbjoe/jg/optional-authentication-polly
Changed Polly authentication params to be optional
2025-01-07 15:37:23 -05:00
Mark Backman
602ca5ebe6 Merge pull request #939 from Vaibhav159/vl_adding_daily_room_properties
adding more daily room params
2025-01-07 14:33:59 -05:00
Vaibhav159
787ade41f3 adding missing doc string 2025-01-08 00:58:01 +05:30
Joe Garlick
bb767831d5 Added: Changelog entry 2025-01-07 19:05:02 +00:00
Mark Backman
bc25a771dc Merge pull request #935 from pipecat-ai/hush/modalUpdate
docs: update dependencies for modal demo
2025-01-07 13:57:46 -05:00
Vaibhav159
f37626f81d adding more daily room params 2025-01-07 21:38:05 +05:30
Mark Backman
9d54578e65 Merge pull request #934 from pipecat-ai/mb/bump-open-ai-version
Bump openai version to 1.59.0 for realtime and model updates
2025-01-07 08:29:45 -05:00
Joe Garlick
79afe7ec2a Changed: Polly authentication information to be optional 2025-01-07 11:43:57 +00:00
James Hush
2c1fd3c3cc docs: update dependencies for modal demo 2025-01-07 15:45:55 +08:00
Mark Backman
b0dd8e03a6 Bump openai version to 1.59.0 for realtime and model updates 2025-01-06 17:05:22 -05:00
Mark Backman
ee20e48ef8 Merge pull request #931 from pipecat-ai/mb/fix-openai-realtime-
Fix truncation timing of OpenAIRealtimeBetaLLMService
2025-01-06 16:25:09 -05:00
Mark Backman
12b5c5a646 Fix truncation timing of OpenAIRealtimeBetaLLMService 2025-01-06 15:37:58 -05:00
Mark Backman
7a021cc82d Merge pull request #929 from pipecat-ai/mb/add-google-journey-support
Added support for Google Journey TTS voices
2025-01-06 15:13:00 -05:00
Mark Backman
3e1ec4a8ee Added support for Google Journey TTS voices 2025-01-06 14:54:34 -05:00
Mark Backman
a1377b7f1a Merge pull request #924 from xtreme-sameer-vohra/patch-1
Update frames.py
2025-01-06 14:13:10 -05:00
Mark Backman
d6335886e2 Merge pull request #848 from Vaibhav159/vl_add_audio_and_chat_livekit_example
adding example for livekit audio and chat version
2025-01-06 13:27:38 -05:00
Vaibhav159
b3b7a5f023 adding 2025 license 2025-01-06 22:10:46 +05:30
Vaibhav159
5138017b57 ruff changes 2025-01-06 22:07:59 +05:30
Vaibhav159
87670067d7 adding changelog 2025-01-06 22:03:11 +05:30
Vaibhav159
656cd2859e Merge branch 'main' into vl_add_audio_and_chat_livekit_example 2025-01-06 21:57:43 +05:30
Mark Backman
15b2cc210c Merge pull request #927 from pipecat-ai/mb/update-copyright
Update copyright to 2025
2025-01-06 10:33:04 -05:00
Mark Backman
4667624b60 Update copyright to 2025 2025-01-06 10:19:37 -05:00
Sameer Vohra
d07ba80572 Update frames.py
fix minor typo in docs
2025-01-05 22:57:54 -05:00
Aleix Conchillo Flaqué
386ba61483 Merge pull request #909 from pipecat-ai/aleix/pipecat-0.0.52
update CHANGELOG for 0.0.52
2024-12-24 08:16:05 -08:00
Aleix Conchillo Flaqué
e9d275f270 update CHANGELOG for 0.0.52 2024-12-23 19:52:34 -08:00
Aleix Conchillo Flaqué
3a4994370c update README 2024-12-23 19:20:23 -08:00
Aleix Conchillo Flaqué
6125ea882d update README 2024-12-23 19:19:39 -08:00
Aleix Conchillo Flaqué
0a1ce1bb63 update CHANGELOG 2024-12-23 19:13:59 -08:00
Kwindla Hultman Kramer
ab3bcde5f7 Merge pull request #907 from pipecat-ai/khk/gemini-20241221
Gemini unary API fixes and natural conversation demo
2024-12-23 17:34:57 -08:00
Kwindla Hultman Kramer
1368d3db5c revert elevenlabs example changes 2024-12-23 17:33:59 -08:00
Aleix Conchillo Flaqué
cd7dec7391 Merge pull request #906 from pipecat-ai/aleix/fix-duplicate-base-output-frames
transports(base_output): fix duplicate push_frame()
2024-12-23 06:12:31 -08:00
Kwindla Hultman Kramer
a5e985094b remove stray line 2024-12-22 19:45:57 -08:00
Aleix Conchillo Flaqué
c04c69df95 transports(base_output): fix duplicate push_frame() 2024-12-22 14:43:38 -08:00
Aleix Conchillo Flaqué
9c105e25ac Merge pull request #905 from pipecat-ai/aleix/daily-python-0.14.2
pyproject: update daily-python to 0.14.2
2024-12-22 13:03:25 -08:00
Aleix Conchillo Flaqué
6901c4fa57 pyproject: update daily-python to 0.14.2 2024-12-22 12:30:17 -08:00
Mark Backman
469c13c07e Merge pull request #903 from pipecat-ai/mb/send-prebuilt-chat
Add the ability to send_prebuilt_chat_message when using the DailyTra…
2024-12-22 14:33:50 -05:00
Mark Backman
46871ae686 Merge pull request #899 from pipecat-ai/mb/add-fish-audio
Add Fish Audio TTS service
2024-12-22 14:26:59 -05:00
Kwindla Hultman Kramer
ab5df1a236 feature complete gemini audio, transcription, and phrase endpointing demo 2024-12-22 11:19:02 -08:00
Kwindla Hultman Kramer
f5f0de00e4 still some cleanup to do 2024-12-21 23:04:00 -08:00
Kwindla Hultman Kramer
f3dd35bfd9 working but needs cleanup 2024-12-21 22:18:56 -08:00
Kwindla Hultman Kramer
53a5e63990 function calling dead-end 2024-12-21 18:10:25 -08:00
Kwindla Hultman Kramer
d435a6a6d6 fixes to audio buffer 2024-12-21 16:22:53 -08:00
Kwindla Hultman Kramer
59240c7b96 delay gemini multimodal live websocket connect 2024-12-21 14:36:37 -08:00
Mark Backman
6c11753985 Add the ability to send_prebuilt_chat_message when using the DailyTransport 2024-12-21 14:04:46 -05:00
Mark Backman
6fabb7e7d5 Fix metrics calculations 2024-12-21 13:25:43 -05:00
Mark Backman
bce218915e Add Fish to the README 2024-12-21 12:54:07 -05:00
Mark Backman
627c91f4a6 Flush the audio 2024-12-21 12:52:28 -05:00
Mark Backman
dac4468ca1 Add Fish Audio TTS service 2024-12-21 12:42:56 -05:00
Mark Backman
503eddf7d6 Merge pull request #897 from pipecat-ai/mb/update-playht
Update PlayHT to use the latest Websocket connection endpoint
2024-12-20 20:31:41 -05:00
Aleix Conchillo Flaqué
1a0f6f2a21 Merge pull request #898 from pipecat-ai/aleix/reset-input-queue-flag-if-interruption
frame_processor: reset input queue flag with interruptions
2024-12-20 13:58:12 -08:00
Aleix Conchillo Flaqué
43759295cc frame_processor: reset input queue flag with interruptions 2024-12-20 09:33:20 -08:00
Mark Backman
900b95eb92 Update PlayHT to use the latest Websocket connection endpoint 2024-12-20 10:44:47 -05:00
marcus-daily
41d07692ca Fix import order 2024-12-20 14:30:38 +00:00
marcus-daily
dcf6b6e120 Add an RTVIProcessor to the simple-chatbot pipeline 2024-12-20 14:30:38 +00:00
Mark Backman
99dba3b6b9 Merge pull request #893 from pipecat-ai/mb/changelog-11L
Added an `auto_mode` input parameter to `ElevenLabsTTSService`
2024-12-19 21:38:06 -05:00
Aleix Conchillo Flaqué
4547609ffb examples(01a): remove unused import 2024-12-19 17:49:27 -08:00
Mark Backman
9554804a49 Update 11L default model, allow language to be used by more models 2024-12-19 20:33:58 -05:00
Mark Backman
656cbc35e1 Make auto_mode an input parametere for ElevenLabsTTSService; add changelog entry 2024-12-19 20:33:56 -05:00
Aleix Conchillo Flaqué
6f7c4dd998 Merge pull request #894 from pipecat-ai/aleix/daily-python-0.14.0
transports(daily): update to daily-python 0.14.0
2024-12-19 17:14:31 -08:00
Aleix Conchillo Flaqué
8b496f8c6f transports(daily): daily-python 0.14.0 (SIP transfer/refer, DTMF) 2024-12-19 17:08:29 -08:00
Aleix Conchillo Flaqué
15047f5f0a Merge pull request #885 from pipecat-ai/aleix/parallelpipeline-wait-for-slowest-endframe
pipeline(parallel): wait for slowest endframe
2024-12-19 15:18:22 -08:00
Aleix Conchillo Flaqué
e08c24dc41 Merge pull request #883 from pipecat-ai/aleix/base-output-transport-avoid-pushing-endframe
transport(base output): avoid pushing EndFrame twice
2024-12-19 11:26:31 -08:00
Aleix Conchillo Flaqué
5341739ece transport(base output): avoid pushing EndFrame twice 2024-12-19 11:19:49 -08:00
Mark Backman
5b0fc3fa15 Merge pull request #891 from louisjoecodes/louis/flush-shorter-messages-elevenlabs
feat: set auto_mode=true - ElevenLabs tts WSS
2024-12-19 12:08:04 -05:00
Louis Jordan
b7b8e59e9e feat: set auto_mode=true - ElevenLabs tts WSS 2024-12-19 16:57:17 +00:00
Mark Backman
6e0d3aef32 Merge pull request #860 from pipecat-ai/mb/transcription
Add a TranscriptProcessor and new frames
2024-12-19 08:15:53 -05:00
Mark Backman
1ccc84dd7a Merge pull request #888 from pipecat-ai/mb/add-cerebras
Add CerebrasLLMService and foundational example
2024-12-19 08:14:53 -05:00
Mark Backman
c9dd906057 Tailor chat completion inputs to Cerebras API 2024-12-19 08:10:33 -05:00
Mark Backman
4f093f11db Add CerebrasLLMService and foundational example 2024-12-19 08:10:31 -05:00
Mark Backman
887a9170b2 Merge pull request #889 from pipecat-ai/mb/openai-realtime-model
Add model parameter to OpenAI realtime service constructor, update de…
2024-12-19 08:08:52 -05:00
Aleix Conchillo Flaqué
f2e191855a Merge pull request #881 from pipecat-ai/aleix/langchain-updates
pyproject: update langchaing to 0.3.12
2024-12-18 19:42:39 -08:00
Aleix Conchillo Flaqué
78b90e9591 Merge pull request #884 from pipecat-ai/aleix/filters-handle-endframe
processors(filters): allow passing EndFrame
2024-12-18 19:35:56 -08:00
Aleix Conchillo Flaqué
17decee788 Merge pull request #882 from pipecat-ai/aleix/stop-transport-parent-first
transports: call parent stop() before disconnecting
2024-12-18 19:35:39 -08:00
Aleix Conchillo Flaqué
f89014d100 pyproject: update langchaing to 0.3.12 2024-12-18 19:34:49 -08:00
Mark Backman
3b3e22fe7c Add model parameter to OpenAI realtime service constructor, update default model 2024-12-18 18:12:51 -05:00
Aleix Conchillo Flaqué
0df0194cc1 Merge pull request #886 from pipecat-ai/aleix/koala-noise-suppression
audio(koala): add new audio filter KoalaFilter
2024-12-18 14:02:04 -08:00
Mark Backman
8a7a61914e Code review feedback 2024-12-17 22:35:13 -05:00
Mark Backman
1117c21483 Refactor TranscriptProcessor into user and assistant processors 2024-12-17 22:34:22 -05:00
Mark Backman
4211664a77 TranscriptProcessor to handle simple and list content 2024-12-17 22:34:03 -05:00
Mark Backman
1f8a217cd1 Code review changes 2024-12-17 22:34:02 -05:00
Mark Backman
b5bd662fe1 Add changelog and rename examples 2024-12-17 22:33:39 -05:00
Mark Backman
dd2703317a Add timestamp frames and include timestamps in the transcription event and frame 2024-12-17 22:31:15 -05:00
Mark Backman
77aeda36eb Update OpenAI's from_standard_message to convert back to OpenAI's simple format 2024-12-17 22:31:15 -05:00
Mark Backman
51b235df4b Add docstrings for Google and Anthropic's to_standard_messages and from_standard_message functions 2024-12-17 22:31:15 -05:00
Mark Backman
4f2aee5fba Update OpenAI's to_standard_messages to return the verboase message format 2024-12-17 22:31:15 -05:00
Mark Backman
55879bf365 Add TranscriptionProcessor 2024-12-17 22:31:15 -05:00
Aleix Conchillo Flaqué
7322badbe7 audio(koala): add new audio filter KoalaFilter 2024-12-17 18:45:10 -08:00
Aleix Conchillo Flaqué
42bea578e8 pipeline(parallel): wait for slowest endframe
If we are sending an EndFrame and a ParallelPipeline has multiple pipelines we
want to wait before pushing the EndFrame downstream until the slowest pipeline
is finished. Otherwise, we could be disconnecting from the transport too early.
2024-12-17 17:05:11 -08:00
Aleix Conchillo Flaqué
2dfdceb9e6 processors(filters): allow passing EndFrame 2024-12-17 16:22:19 -08:00
Aleix Conchillo Flaqué
5bfcac1f5c transports: call parent stop() before disconnecting
This rollbacks a previous change https://github.com/pipecat-ai/pipecat/pull/855
which was trying to fix an issue in the wrong way.

The reasoning behind this fix is that the parent class might be sending audio or
messages (through the subclass) and if we disconnect before all the data is sent
we will run into incomplete audio or even errors. Therefore, we first make sure
the parent tasks stop and then it will be safe to disconnect.
2024-12-17 16:02:33 -08:00
Aleix Conchillo Flaqué
fb9f72d38b Merge pull request #880 from pipecat-ai/aleix/ruff-check-import-linter
ruff check import linter
2024-12-17 14:14:47 -08:00
Aleix Conchillo Flaqué
146a341a38 Merge pull request #879 from Vaibhav159/vl_add_readme_for_ruff_formatter_in_pycharm
updating readme to support auto-formatting of ruff in pycharm
2024-12-17 11:49:01 -08:00
Aleix Conchillo Flaqué
b9ca667d31 pyproject: use tool.ruff.lint sections 2024-12-17 11:40:43 -08:00
Aleix Conchillo Flaqué
5c57cccea3 github: run ruff check import linter 2024-12-17 11:29:28 -08:00
Aleix Conchillo Flaqué
17162258a2 fix ruff linter import organization 2024-12-17 11:28:58 -08:00
Aleix Conchillo Flaqué
da3fb98101 examples(storytelling-chatbot): update dependencies 2024-12-17 11:24:50 -08:00
Aleix Conchillo Flaqué
6244124d14 README: added Emacs import re-organization with Ruff 2024-12-17 11:20:18 -08:00
Vaibhav159
53049adeea removing --config flag 2024-12-18 00:47:00 +05:30
Vaibhav159
4208d2d7c4 updating readme to support auto-formatting of ruff in pycharm 2024-12-17 23:38:36 +05:30
Mark Backman
9f7f74e4d8 Merge pull request #869 from Vaibhav159/vl_fixing_deepgram_language_bug_#868
fixing [#868] bug where deepgram client fails due to langauge
2024-12-17 12:50:57 -05:00
Vaibhav159
f14d32d09e fixing ruff issue 2024-12-17 23:11:18 +05:30
Vaibhav159
7351e281e2 ruff change 2024-12-17 22:21:56 +05:30
Vaibhav159
b94b10f7d6 added change log 2024-12-17 22:11:52 +05:30
Vaibhav159
1cc90eb1a3 Merge branch 'main' into vl_fixing_deepgram_language_bug_#868 2024-12-17 22:09:30 +05:30
Vaibhav159
5f7d28bb05 adding type check and value check 2024-12-17 22:07:35 +05:30
Mark Backman
204a08ab8f Merge pull request #877 from pipecat-ai/mb/grok-function-calling-fix
Add custom assistant context aggregator for Grok due to content requi…
2024-12-17 10:51:19 -05:00
Aleix Conchillo Flaqué
141b0a6560 sentry: fix formatting 2024-12-17 07:14:31 -08:00
Mark Backman
ca086a856f Add custom assistant context aggregator for Grok due to content requirement in function calling 2024-12-17 09:11:21 -05:00
Aleix Conchillo Flaqué
fe0a7d07bd update CHANGELOG 2024-12-16 21:02:38 -08:00
Aleix Conchillo Flaqué
79eb29d614 Merge pull request #875 from pipecat-ai/aleix/update-dependencies
update dependencies
2024-12-16 20:58:30 -08:00
Aleix Conchillo Flaqué
da15c83bab fix ruff formatting 2024-12-16 20:52:40 -08:00
Aleix Conchillo Flaqué
d6bac77b3c pyproject: add audioop-lts for python 3.13 2024-12-16 20:50:25 -08:00
Aleix Conchillo Flaqué
7faa4eb295 update dev-requirements 2024-12-16 20:50:25 -08:00
Aleix Conchillo Flaqué
0e31413851 pyproject: update numpy, pydantic, loguru 2024-12-16 19:20:34 -08:00
Aleix Conchillo Flaqué
16948b251d services: fix infinite websocket-bases TTS services retries
Fixes #871
2024-12-16 16:36:44 -08:00
Mark Backman
f3112a8638 Merge pull request #866 from pipecat-ai/mb/readme-links
Fix a bunch of README docs links
2024-12-16 10:51:01 -05:00
Mark Backman
0293d40e4e Merge pull request #870 from pipecat-ai/mb/dotenv
Add python-dotenv to dev-requirements.txt
2024-12-16 10:50:46 -05:00
Mark Backman
64038442ed Add python-dotenv to dev-requirements.txt 2024-12-16 09:23:12 -05:00
Vaibhav159
facc280599 fixing [#868] bug where deepgram client fails due to langauge 2024-12-16 17:47:50 +05:30
Mark Backman
f90cbe8086 Fix a bunch of README docs links 2024-12-15 14:30:20 -05:00
Mark Backman
09a611d44b Merge pull request #856 from pipecat-ai/mb/daily-rest-helpers
Remove default 5 min exp time for created rooms, add docstrings
2024-12-13 12:08:58 -05:00
Mark Backman
16d7fb2c4a Remove default 5 min exp time for created rooms, add docstrings 2024-12-13 12:02:26 -05:00
Aleix Conchillo Flaqué
643160c960 Merge pull request #858 from pipecat-ai/aleix/fastpitch-timeout
riva: make sure we don't block on fastpitch
2024-12-13 08:20:38 -08:00
Aleix Conchillo Flaqué
aac907aadb riva: make sure we don't block on fastpitch 2024-12-13 07:32:51 -08:00
Aleix Conchillo Flaqué
8f24ca4e58 Merge pull request #857 from pipecat-ai/aleix/fix-riva-tts-audio-stuttering
riva: fix FastPitchTTSService audio stuttering
2024-12-12 22:20:00 -08:00
Aleix Conchillo Flaqué
420ce16807 riva: fix FastPitchTTSService audio stuttering 2024-12-12 22:15:44 -08:00
Aleix Conchillo Flaqué
2b8c35c681 Merge pull request #855 from pipecat-ai/aleix/transport-services-disconnect-fixes
transports(services): disconnect client first
2024-12-12 19:40:03 -08:00
Mark Backman
3d96369193 Merge pull request #852 from pipecat-ai/mb/readme-docs-badge
Add docs badge to README
2024-12-12 22:21:41 -05:00
Aleix Conchillo Flaqué
d44b36a07c Merge pull request #854 from pipecat-ai/aleix/aiservice-add-missing-process-frame
AIService: add missing super().process_frame()
2024-12-12 19:10:21 -08:00
Aleix Conchillo Flaqué
ccc96994e9 pyproject: update livekit 2024-12-12 19:09:36 -08:00
Aleix Conchillo Flaqué
337d421338 transports: disconnect client first 2024-12-12 19:09:06 -08:00
Aleix Conchillo Flaqué
752720b4d5 AIService: add missing super().process_frame() 2024-12-12 17:25:38 -08:00
Aleix Conchillo Flaqué
f8e69cfa00 Merge pull request #853 from pipecat-ai/revert-849-aleix/no-need-for-super-process-frame
Revert "no longer necessary to call super().process_frame(frame, direction)"
2024-12-12 17:21:20 -08:00
Aleix Conchillo Flaqué
6d11911d83 Revert "no longer necessary to call super().process_frame(frame, direction)" 2024-12-12 17:03:40 -08:00
Mark Backman
ec6e71c8ea Add docs badge to README 2024-12-12 18:08:24 -05:00
Aleix Conchillo Flaqué
10f854aeba Merge pull request #846 from pipecat-ai/aleix/base-output-transport-audio-sync
transport(output): fix non-audio frames sync after audio frames
2024-12-12 14:29:42 -08:00
Aleix Conchillo Flaqué
d8caf007b0 Merge pull request #849 from pipecat-ai/aleix/no-need-for-super-process-frame
no longer necessary to call super().process_frame(frame, direction)
2024-12-12 14:29:10 -08:00
Mark Backman
26ea64ef12 Merge pull request #850 from pipecat-ai/mb/fix-docs-builds
Fix docs generation build issues
2024-12-12 17:27:00 -05:00
Mark Backman
19c178ebc7 Fix docs generation build issues 2024-12-12 17:18:04 -05:00
Aleix Conchillo Flaqué
3c3fd67d96 no longer necessary to call super().process_frame(frame, direction) 2024-12-12 13:03:41 -08:00
Mark Backman
7bbc0ee8df Merge pull request #845 from pipecat-ai/mb/more-docs-updates
Docs auto-gen improvements
2024-12-12 15:42:34 -05:00
Mark Backman
67804edce6 Remove formats from .readthedocs.yaml 2024-12-12 15:41:11 -05:00
Mark Backman
ec082d0888 Remove deprecated VAD module 2024-12-12 15:32:38 -05:00
Mark Backman
8631d71d5a Fix more missing docs 2024-12-12 15:16:37 -05:00
Vaibhav159
62fc95300b adding livekit audio and chat version 2024-12-13 01:09:47 +05:30
Aleix Conchillo Flaqué
db7eaed980 transport(output): fix non-audio frames sync after audio frames 2024-12-12 10:56:02 -08:00
Mark Backman
44c5220104 Update README 2024-12-12 13:28:05 -05:00
Mark Backman
276fd86ecb More fixes for missing packages 2024-12-12 13:25:13 -05:00
Mark Backman
2de0737056 Merge pull request #844 from pipecat-ai/cb-gemini-example-fix
Update requirements.txt for simple-chatbot
2024-12-12 11:18:58 -05:00
Mark Backman
b5d5a0e923 Add special cases for displaying some names 2024-12-12 11:15:36 -05:00
Mark Backman
f3ed12c30b Clean up module and package display names 2024-12-12 11:11:53 -05:00
Mark Backman
e14399727b Add README and build script for local testing 2024-12-12 11:06:53 -05:00
Mark Backman
414dcf9810 Improve TOC in sidebar, fix missing services 2024-12-12 11:06:09 -05:00
chadbailey59
88d530e840 Update requirements.txt for simple-chatbot
The gemini example doesn't actually work from a fresh install, because the requirements.txt file doesn't include google :)
2024-12-12 09:31:15 -06:00
Aleix Conchillo Flaqué
af821d8e95 Merge pull request #841 from pipecat-ai/aleix/aws-to-polly
polly: renamed AWSTTSService to PollyTTSService
2024-12-11 18:13:02 -08:00
Aleix Conchillo Flaqué
133e1aff6c polly: renamed AWSTTSService to PollyTTSService 2024-12-11 17:56:43 -08:00
Aleix Conchillo Flaqué
def415f476 Merge pull request #840 from pipecat-ai/aleix/11labs-playht-more-languages
tts: support more languages in playht and elevenlabs
2024-12-11 14:58:03 -08:00
Aleix Conchillo Flaqué
a34d16dabe tts: support more languages in playht and elevenlabs 2024-12-11 14:53:24 -08:00
Mark Backman
ec7260b237 Merge pull request #839 from pipecat-ai/mb/bump-versions
Bump openai and aiohttp package versions
2024-12-11 17:06:15 -05:00
Mark Backman
96c6c71d5b Bump openai and aiohttp package versions 2024-12-11 16:48:36 -05:00
Aleix Conchillo Flaqué
8e140b2be6 Merge pull request #838 from pipecat-ai/aleix/prepare-0.0.50
update CHANGELOG fot 0.0.50
2024-12-11 11:49:15 -08:00
Aleix Conchillo Flaqué
a70c785b2e update CHANGELOG fot 0.0.50 2024-12-11 11:33:13 -08:00
Aleix Conchillo Flaqué
f1d3c5e9ad Merge pull request #837 from pipecat-ai/aleix/update-protobuf-to-5.29.1
pyproject: update protobuf to 5.29.1
2024-12-11 11:31:49 -08:00
Aleix Conchillo Flaqué
346329ba73 pyproject: update protobuf to 5.29.1 2024-12-11 11:29:48 -08:00
Aleix Conchillo Flaqué
6089d4255c Merge pull request #836 from pipecat-ai/aleix/moondream-studypal-fixes
examples: fixes for moondream-chatbot and studypal
2024-12-11 11:16:09 -08:00
Aleix Conchillo Flaqué
cff9bb6068 Merge pull request #835 from pipecat-ai/aleix/even-more-parallel-pipeline-fixes
parallel_pipeline: fix system frames and parallel pipelines again
2024-12-11 11:15:59 -08:00
Aleix Conchillo Flaqué
fdefdc9d68 Merge pull request #834 from pipecat-ai/aleix/transcription-are-text
frames: transcriptions should be TextFrames as before
2024-12-11 11:15:43 -08:00
Aleix Conchillo Flaqué
2dd418a38d parallel_pipeline: fix system frames and parallel pipelines again
The previous fixes didn't take into account that system frames can be generated
inside the internal pipelines.
2024-12-11 10:55:04 -08:00
Aleix Conchillo Flaqué
42f5ec20f6 examples: fixes for moondream-chatbot and studypal 2024-12-11 10:46:38 -08:00
Aleix Conchillo Flaqué
5b5125b74c frames: transcriptions should be TextFrames as before 2024-12-11 10:42:38 -08:00
Mark Backman
be4df5f713 Merge pull request #833 from pipecat-ai/mb/update-changelog-for-gemini
Update the CHANGELOG and README for Gemini Multimodal Live
2024-12-11 11:41:42 -05:00
Mark Backman
5418cdc4d1 Update the CHANGELOG and README for Gemini Multimodal Live 2024-12-11 11:40:16 -05:00
Mark Backman
6c9f5a81dc Merge pull request #832 from pipecat-ai/khk/gemini-live-function-calling
Gemini Multimodal Live function calling example
2024-12-11 11:39:19 -05:00
Mark Backman
027e360436 Fix demo numbering and prompt the bot to say hi in 26b 2024-12-11 11:36:38 -05:00
Kwindla Hultman Kramer
c219172266 Gemini Multimodal Live function calling example 2024-12-11 08:29:09 -08:00
Mark Backman
7b040be209 Merge pull request #830 from pipecat-ai/khk/gemini-multimodal-live
Gemini Multimodal Live API service
2024-12-11 11:25:55 -05:00
Mark Backman
0d74531f36 Minor changes to demos 2024-12-11 11:23:59 -05:00
Mark Backman
3341c4f608 Merge pull request #831 from pipecat-ai/mb/gemini-simple-chatbot
Gemini updates to the simple-chatbot demo
2024-12-11 11:15:15 -05:00
Mark Backman
1e45e55528 Add copyright block to audio_transcriber 2024-12-11 11:06:48 -05:00
Mark Backman
8086a94e49 Renumber foundational demos 2024-12-11 10:56:51 -05:00
Kwindla Hultman Kramer
81895f4a5c Gemini Multimodal Live API service 2024-12-11 07:38:23 -08:00
Mark Backman
2846d6f461 Update READMEs and comment files 2024-12-11 00:06:35 -05:00
Mark Backman
14f309ce2b Add Gemini Live bot file 2024-12-10 22:25:17 -05:00
Aleix Conchillo Flaqué
62ec2f5d1e Merge pull request #814 from pipecat-ai/aleix/simli-updates
minor simli updates
2024-12-10 18:48:29 -08:00
Aleix Conchillo Flaqué
4f9a4ebce2 Merge pull request #820 from pipecat-ai/aleix/more-parallelpipeline-fixes
parallel_pipeline: fix system frames again
2024-12-10 18:43:34 -08:00
Aleix Conchillo Flaqué
5b478a5c7a add SimliVideoService to CHANGELOG 2024-12-10 18:42:26 -08:00
Aleix Conchillo Flaqué
87c1f2bcce services(simli): remove ready flag, events vs sleep, handle CancelledError 2024-12-10 18:42:12 -08:00
Aleix Conchillo Flaqué
b85072637f examples(26-simli-layer): use room returned by configure() 2024-12-10 18:42:12 -08:00
Aleix Conchillo Flaqué
ffe1e023e7 Merge pull request #819 from pipecat-ai/aleix/fix-openaillmcontext-from-image-frame
fix OpenAILLMContext from image frame
2024-12-10 18:39:55 -08:00
Aleix Conchillo Flaqué
9a358b2e86 Merge pull request #824 from pipecat-ai/aleix/openpipe-use-openai-base-service
services(openpipe): use OpenAILLMService to get access to aggregators
2024-12-10 18:34:46 -08:00
Aleix Conchillo Flaqué
b034c6e247 Merge pull request #821 from pipecat-ai/aleix/update-pyproject
pyproject: update onnxruntime, whisper and azure
2024-12-10 18:34:27 -08:00
Aleix Conchillo Flaqué
c7ca0eea0f Merge pull request #823 from pipecat-ai/aleix/fix-15a-switch-languages
examples: fix 15a-switch-languages pipeline
2024-12-10 18:34:13 -08:00
Aleix Conchillo Flaqué
29d931cdcd Merge pull request #822 from pipecat-ai/aleix/fix-11-sound-effects
examples: fix 11-sound-effects
2024-12-10 18:33:53 -08:00
Aleix Conchillo Flaqué
ecf0c61af9 services(openpipe): use OpenAILLMService to get access to aggregators 2024-12-10 18:29:03 -08:00
Aleix Conchillo Flaqué
67e8252d76 examples: fix 15a-switch-languages pipeline 2024-12-10 18:27:49 -08:00
Aleix Conchillo Flaqué
775aa9493e examples: fix 11-sound-effects 2024-12-10 18:25:43 -08:00
Aleix Conchillo Flaqué
c446f91d4a pyproject: update onnxruntime, whisper and azure 2024-12-10 18:16:27 -08:00
Aleix Conchillo Flaqué
7b6bbc29ed parallel_pipeline: fix system frames again 2024-12-10 18:12:33 -08:00
Aleix Conchillo Flaqué
9e7ecccf1e google: fix VisionImageRawFrame context 2024-12-10 17:39:52 -08:00
Aleix Conchillo Flaqué
a618bd3fa6 openai: remove from_image_frame() and use add_image_frame_message() 2024-12-10 17:39:52 -08:00
Aleix Conchillo Flaqué
246c825a82 examples: rename 07p-interruptible-google-audio-in to 07s 2024-12-10 17:07:17 -08:00
Aleix Conchillo Flaqué
9e6fabf110 Merge pull request #818 from pipecat-ai/aleix/fastpitch-rename
riva: rename FastpitchTTSService to FastPitchTTSService
2024-12-10 13:36:38 -08:00
Aleix Conchillo Flaqué
d2dabe4358 riva: rename FastpitchTTSService to FastPitchTTSService 2024-12-10 13:30:43 -08:00
Vanessa Pyne
1db624575f Merge pull request #795 from pipecat-ai/vp-nvidia-riva
[WIP] add nvidia riva
2024-12-10 15:17:26 -06:00
vipyne
a49b4e450b services(riva): check service config before running tts 2024-12-10 15:15:46 -06:00
vipyne
9211a37efc services(riva): convention tweaks 2024-12-10 15:15:46 -06:00
vipyne
3f9d39329c services(riva): model -> function_id 2024-12-10 15:15:46 -06:00
vipyne
5a98ae6380 chore: update test-requirements 2024-12-10 15:15:46 -06:00
vipyne
8caad15e9b examples trivial update 2024-12-10 15:15:46 -06:00
vipyne
9222d9f721 services(riva): cleanup 2024-12-10 15:15:46 -06:00
vipyne
5a467a30a3 add nvidia riva - fastpitch 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
d74e728332 pyproject: update google-cloud-texttospeech to 2.21.1 2024-12-10 15:15:46 -06:00
vipyne
8a9fdaf441 services(riva): cleanup 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
4b55c73fbe services(riva): make FastpitchTTSService asyncio 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
7e407e5548 services(riva): first working version of ParakeetSTTService 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
ce94421c90 pyproject: add riva option and update protobuf and playht 2024-12-10 15:15:46 -06:00
vipyne
49ce3dcb27 add nvidia riva - fastpitch 2024-12-10 15:15:46 -06:00
Aleix Conchillo Flaqué
6ba2dea6f0 Merge pull request #812 from zzz-heygen/zzz/fix_serializer_backward_compat
fix: make ProtobufFrameSerializer backwards compatible
2024-12-10 13:11:09 -08:00
Aleix Conchillo Flaqué
9ac34ac371 Merge pull request #816 from pipecat-ai/aleix/rtvi-version-update
rtvi: update protocol version to 0.3.0
2024-12-10 11:52:28 -08:00
Aleix Conchillo Flaqué
a8644d2129 Merge pull request #815 from pipecat-ai/aleix/identity-filter
processors(filters): add IdentityFilter
2024-12-10 11:09:20 -08:00
Aleix Conchillo Flaqué
3bf15476a4 processors(filters): add IdentityFilter 2024-12-10 11:01:59 -08:00
Aleix Conchillo Flaqué
acb3e21432 rtvi: update protocol version to 0.3.0 2024-12-10 10:57:42 -08:00
Mark Backman
8c9c81d84b Merge pull request #810 from pipecat-ai/mb/read-the-docs
Changes for Read the Docs hosting
2024-12-10 12:48:26 -05:00
Aleix Conchillo Flaqué
e51e2f781d Merge pull request #765 from simliai/simli
Add Simli Service
2024-12-10 09:23:06 -08:00
Dan Goodman
af6f5ecc86 customize Anthropic client via kwargs, also bumps default model version (#813)
* customize Anthropic client via kwargs

* bump default model
2024-12-10 09:13:44 -08:00
antonyesk601
81a18633ca Remove duplicate frame push if simli connection isn't ready 2024-12-10 10:18:31 +00:00
antonyesk601
397342d0b9 Inizialize simli_client on StartFrame; Follow variable naming scheme; Use logger instead of print statements; 2024-12-10 10:11:07 +00:00
zzz
d6b3a50108 x 2024-12-10 07:50:50 +00:00
Mark Backman
66b08161f1 Changes for Read the Docs hosting 2024-12-10 00:54:21 -05:00
Mark Backman
e7fa1cacce Merge pull request #800 from pipecat-ai/mb/autogen-docs
Auto-generate API reference docs
2024-12-09 22:05:08 -05:00
Mark Backman
2d3864ee09 Move API docs generation to docs/api 2024-12-09 20:44:10 -05:00
Aleix Conchillo Flaqué
0287f06379 Merge pull request #809 from pipecat-ai/aleix/parallel-pipeline-fix-system-frames
fix system frames parallel pipeline
2024-12-09 15:48:27 -08:00
Mark Backman
681c8ffb1d Merge pull request #807 from pipecat-ai/mb/stt-mute-strategy
Add new STT mute strategy, accept a set of strategies
2024-12-09 18:34:30 -05:00
Mark Backman
676643d558 Code review fixes 2024-12-09 18:27:07 -05:00
Mark Backman
0c4cbc2615 Push FunctionCall Frames upstream and downstream; update example 2024-12-09 18:27:07 -05:00
Aleix Conchillo Flaqué
e690c98230 transports(daily): no need for joining flag
This was put back because of an issue in ParallelPipeline but that issue is now
fixed so the joining check is not really necessary.
2024-12-09 09:38:30 -08:00
Aleix Conchillo Flaqué
e0a6c6871c parallel_pipeline: don't queue system frames 2024-12-09 09:38:30 -08:00
Mark Backman
29a042a101 Add changelog entry 2024-12-09 10:52:32 -05:00
Mark Backman
1cc2da571e Add new STT mute strategy, accept a set of strategies 2024-12-09 10:50:08 -05:00
Kwindla Hultman Kramer
c6b401b5d1 Merge pull request #805 from pipecat-ai/khk/parallel-pipeline-fix
Check to avoid double-join in ParallelPipeline case
2024-12-07 21:49:16 -08:00
Kwindla Hultman Kramer
315b7fcc34 check to avoid double-join 2024-12-07 21:22:36 -08:00
Mark Backman
e9f5fe0f37 Merge pull request #802 from Allenmylath/patch-22
Update README.md
2024-12-07 10:14:44 -05:00
allenmylath
64faf2218e Update examples/patient-intake/README.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2024-12-07 19:08:00 +05:30
allenmylath
e77a785a7d Update README.md 2024-12-07 13:36:50 +05:30
Mark Backman
03a269fb87 Merge pull request #801 from pipecat-ai/aleix/rtvi-handle-transport-urgent-frames
rtvi: handle transport urgent frames
2024-12-06 21:33:18 -05:00
Aleix Conchillo Flaqué
d1a55c6063 rtvi: handle transport urgent frames 2024-12-06 17:51:09 -08:00
Mark Backman
61d0fa42f1 Add a workflow to generate the docs 2024-12-06 20:32:33 -05:00
Mark Backman
16de1fca9b Add Read the Docs config 2024-12-06 20:15:17 -05:00
Mark Backman
2ad83f23c8 Initial reference docs commit 2024-12-06 19:44:44 -05:00
Aleix Conchillo Flaqué
422ee98db0 Merge pull request #798 from pipecat-ai/aleix/functioncall-data-frames
frames: FunctionCallResultFrame should be a DataFrame as before
2024-12-06 16:38:23 -08:00
Aleix Conchillo Flaqué
3d4620cf95 frames: FunctionCallResultFrame should be a DataFrame as before 2024-12-06 11:54:50 -08:00
Aleix Conchillo Flaqué
752a6f02b5 Merge pull request #799 from pipecat-ai/aleix/cartesia-interruptions-fix
cartesia: fix broken interruptions
2024-12-06 11:52:22 -08:00
Aleix Conchillo Flaqué
7e41809ec2 cartesia: fix broken interruptions 2024-12-06 11:49:03 -08:00
Aleix Conchillo Flaqué
e344a73d14 Merge pull request #797 from pipecat-ai/aleix/xtts-default-language
services(xtts): default language to Language.EN
2024-12-06 11:00:53 -08:00
Aleix Conchillo Flaqué
d6f480fa50 Merge pull request #791 from pipecat-ai/aleix/fastapi-generic-websocket
FastAPIWebsocketTransport: fix to work with text and binary
2024-12-06 10:46:16 -08:00
Aleix Conchillo Flaqué
423d6485f8 services(xtts): default language to Language.EN 2024-12-06 10:45:20 -08:00
Aleix Conchillo Flaqué
842b3de7f5 FastAPIWebsocketTransport: fix to work with text and binary 2024-12-06 10:31:42 -08:00
Aleix Conchillo Flaqué
3cb7829624 update CHANGELOG 2024-12-06 10:31:11 -08:00
Aleix Conchillo Flaqué
4292507616 Merge pull request #793 from balalofernandez/send-interruption-to-cartesia
fix: Send interruption to cartesia
2024-12-06 10:26:34 -08:00
Aleix Conchillo Flaqué
98c9759f41 Merge pull request #796 from pipecat-ai/aleix/improve-tts-reconnection
services: improve Cartesia, 11Labs, PlayHT and LMNT TTS reconnection
2024-12-06 10:22:54 -08:00
Aleix Conchillo Flaqué
bafb867ffc services: improve Cartesia, 11Labs, PlayHT and LMNT TTS reconnection 2024-12-06 10:11:59 -08:00
Mark Backman
b05809be2e Merge pull request #794 from pipecat-ai/mb/upgrade-anthropic
Upgrade Anthropic to the latest to avoid collision with aiohttp 3.11.9
2024-12-06 12:01:51 -05:00
Mark Backman
57d346ce13 Upgrade Anthropic to the latest to avoid collision with aiohttp 3.11.9 2024-12-06 11:59:19 -05:00
balalo
9001cb17ce Fix interruption frame to avoid issues with sending None 2024-12-06 17:42:46 +01:00
Mark Backman
40cfd9776f Merge pull request #792 from pipecat-ai/mb/cartesia-languages
Add additional languages for Cartesia
2024-12-06 09:57:38 -05:00
Mark Backman
d68b3ad1b2 Add additional languages for Cartesia 2024-12-06 09:22:05 -05:00
Kwindla Hultman Kramer
9b51588b92 Merge pull request #782 from pipecat-ai/khk/flash-transcription
Async Google LLM + Gemini Flash transcription example
2024-12-05 12:50:18 -08:00
Aleix Conchillo Flaqué
9a36a4ca32 Merge pull request #790 from pipecat-ai/aleix/base-output-transport-wait-for-output-tasks
transports(base_output): wait for output tasks on EndFrame
2024-12-05 11:30:55 -08:00
Aleix Conchillo Flaqué
f80a97b545 transports(base_output): wait for output tasks on EndFrame 2024-12-05 11:26:18 -08:00
Mark Backman
274278e229 Merge pull request #789 from pipecat-ai/mb/update-simple-chatbot-demo
Add RTVI transcripts, align styling
2024-12-05 11:56:07 -05:00
Mark Backman
6b94bcac03 Add RTVI transcripts, align styling 2024-12-05 11:12:48 -05:00
Aleix Conchillo Flaqué
969b87dee9 update aiohttp version to 3.11.9 2024-12-05 07:35:21 -08:00
balalo
bc699735a3 Send interruption message to cartesia 2024-12-05 16:23:40 +01:00
Mark Backman
00fd381808 Merge pull request #745 from pipecat-ai/mb/user-idle
Only run the UserIdleProcessor while pipeline is running
2024-12-05 10:12:02 -05:00
Mark Backman
672b1c6d73 Merge pull request #786 from Allenmylath/patch-21
Update README.md
2024-12-05 09:15:24 -05:00
Mark Backman
f455eb171b Merge pull request #784 from pipecat-ai/mb/simple-bot-client
Update the simple-chatbot demo to have JS and React clients
2024-12-05 08:34:33 -05:00
allenmylath
62c8c90e17 Update README.md 2024-12-05 13:23:05 +05:30
Aleix Conchillo Flaqué
28bb448605 Merge pull request #783 from pipecat-ai/aleix/deepgram-vad-event-handlers
deepgram: add VAD event handlers
2024-12-04 19:35:22 -08:00
Aleix Conchillo Flaqué
3d76b30a7c deepgram: add VAD event handlers 2024-12-04 19:31:09 -08:00
Aleix Conchillo Flaqué
0ae8ca0813 Merge pull request #781 from pipecat-ai/aleix/websocket-transports-mixer-fixes
websocket transports mixer fixes
2024-12-04 19:12:20 -08:00
Aleix Conchillo Flaqué
0935d773f5 transport(websockets): fix initial busy loop when using audio mixers 2024-12-04 19:10:39 -08:00
Aleix Conchillo Flaqué
e0f7a8a9f4 audio(mixer): SoundfileMixer doesn't resample files anymore 2024-12-04 19:09:50 -08:00
Aleix Conchillo Flaqué
2a0e01898f Merge pull request #780 from pipecat-ai/aleix/gstreamer-default-sample-rate
gstreamer: update default sample rate to 24000
2024-12-04 19:09:02 -08:00
Aleix Conchillo Flaqué
9d25e325dd Merge pull request #779 from pipecat-ai/aleix/websocket-server-audio-mixins-fix
frames: fix AudioRawFrame mixin
2024-12-04 19:08:41 -08:00
Aleix Conchillo Flaqué
37c21426bf Merge pull request #778 from pipecat-ai/aleix/transports-disconnect-on-last-transport
transports: fix premature input transport closing
2024-12-04 19:08:23 -08:00
Mark Backman
c467ec8ded Merge pull request #772 from pipecat-ai/mb/nim-llm
Add a NIM LLM service
2024-12-04 21:41:09 -05:00
Kwindla Hultman Kramer
a367a038f1 fix for finally clause 2024-12-04 18:31:30 -08:00
Mark Backman
e45a123eab Add image to README 2024-12-04 21:29:22 -05:00
Mark Backman
2ecc0e2b13 Remove node modules 2024-12-04 21:28:17 -05:00
Mark Backman
d532e924cd Add .gitignore 2024-12-04 21:28:17 -05:00
Mark Backman
36208049dc Update changelog 2024-12-04 21:28:17 -05:00
Mark Backman
1d11419691 Update the simple-chatbot demo to have JS and React clients 2024-12-04 21:13:14 -05:00
Mark Backman
05451f882d Merge pull request #777 from pipecat-ai/mb/twilio-example
Improve twilio-chatbot README
2024-12-04 20:26:45 -05:00
Kwindla Hultman Kramer
9c22f5b81b async google llm 2024-12-04 15:52:52 -08:00
Aleix Conchillo Flaqué
891f261191 gstreamer: update default sample rate to 24000 2024-12-04 14:41:44 -08:00
Aleix Conchillo Flaqué
13c27eaa1d frames: fix AudioRawFrame mixin 2024-12-04 13:25:37 -08:00
Mark Backman
c395d1a234 Merge pull request #773 from Allenmylath/patch-20
Update README.md
2024-12-04 14:45:38 -05:00
Mark Backman
49639c8631 Improve the twilio-chatbot README 2024-12-04 14:42:05 -05:00
Mark Backman
695a98a1f7 Remove streams.xml from version control 2024-12-04 14:26:10 -05:00
Mark Backman
5cbc37472c Update .gitignore to exclude streams.xml 2024-12-04 14:25:10 -05:00
Aleix Conchillo Flaqué
5b6d9a1050 transports: fix premature input transport closing 2024-12-04 10:56:57 -08:00
allenmylath
332d36475b Update examples/patient-intake/README.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2024-12-04 23:27:25 +05:30
Mark Backman
29b67578e3 Update README 2024-12-04 12:52:09 -05:00
Mark Backman
9db3743901 Update pyproject.toml with a nim optional dep 2024-12-04 12:52:09 -05:00
Mark Backman
496aded031 Update changelog 2024-12-04 12:38:05 -05:00
Mark Backman
1c1fa0db65 Add a NIM LLM service 2024-12-04 12:35:24 -05:00
Mark Backman
a2ad40d7e0 Merge pull request #775 from pipecat-ai/mb/llm-stubs
Added LLM services for GroqLLMService and GrokLLMService
2024-12-04 12:26:19 -05:00
Mark Backman
2bb3682d88 Update README 2024-12-04 12:24:39 -05:00
Kwindla Hultman Kramer
f33f08d667 partially working audio+transcription parallel pipelines 2024-12-04 08:51:35 -08:00
Mark Backman
d9bc2b618f Update FireworksLLMService to use OpenAILLMService 2024-12-04 11:51:05 -05:00
Mark Backman
d5a50e2cad Update AzureLLMService to use OpenAILLMService 2024-12-04 11:01:56 -05:00
Mark Backman
7013343bf0 Update the changelog 2024-12-04 10:10:55 -05:00
Mark Backman
728acba8a5 Add LLMService stubs for Grok and Groq, add examples 2024-12-04 10:08:28 -05:00
allenmylath
3b2c78747c Update README.md 2024-12-04 10:24:17 +05:30
allenmylath
44a0acffc8 Update README.md 2024-12-04 10:21:17 +05:30
Aleix Conchillo Flaqué
c31d5a4f1a Merge pull request #771 from pipecat-ai/aleix/daily-execute-callbacks-from-task
transports(daily): use a task to execute callbacks
2024-12-03 19:55:38 -08:00
Aleix Conchillo Flaqué
52caaa4afb transports(daily): use a task to execute callbacks
This commit fixes an issue where we were not waiting for
`asyncio.run_coroutine_threadsafe` to complete which can cause a series of
undesired issues (e.g. not actually executing the coroutine).
2024-12-03 18:58:54 -08:00
Aleix Conchillo Flaqué
115e75d808 Merge pull request #770 from pipecat-ai/aleix/system-input-frames-and-audio-buffer-processor
system input frames and audio buffer processor fixes
2024-12-03 18:58:13 -08:00
Mark Backman
897e024dd8 Only run the UserIdleProcessor while pipeline is running 2024-12-03 21:09:03 -05:00
Aleix Conchillo Flaqué
1cf93f1dcb FrameProcessor: ignore other frames during CancelFrame 2024-12-03 16:26:29 -08:00
Aleix Conchillo Flaqué
d278996d5b updated CHANGELOG 2024-12-03 16:12:40 -08:00
Aleix Conchillo Flaqué
322dd0cea1 AudioBufferProcessor: use on_audio_data event handler to retrieve audio 2024-12-03 16:12:40 -08:00
Aleix Conchillo Flaqué
a6a4910931 transports(services): incoming transport messages should be urgent 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
52cefaa9d6 frames: remove AppFrame 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
42658ecd92 frames: use mixins for audio and image data 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
a6606a4040 transports(base_output): remove unused code 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
d6c944cdc1 processors(audio): fix AudioBufferProcessor interruptions 2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
a5c7b02a73 frames: input frames are now system frames
Input frames from a transport should be processed fast and there's no need for
them to be queued internally in each element.
2024-12-03 14:30:15 -08:00
Aleix Conchillo Flaqué
6b9223d87e Merge pull request #768 from pipecat-ai/aleix/websocket-server-interruptions
transports(websockets): use frame serializers during interruptions
2024-12-02 19:18:20 -08:00
Aleix Conchillo Flaqué
c2135cbe11 transports(websockets): use frame serializers during interruptions 2024-12-02 19:17:17 -08:00
Aleix Conchillo Flaqué
32495ddd0b Merge pull request #769 from pipecat-ai/aleix/daily-subscribe-video-source
transports(daily): subscribe to the desired video source
2024-12-02 19:16:14 -08:00
Aleix Conchillo Flaqué
4301f0abf7 Merge pull request #767 from pipecat-ai/aleix/warn-transcription-no-token
transports(daily): warn if transcription enabled but no token provided
2024-12-02 15:06:35 -08:00
Aleix Conchillo Flaqué
5e854c4d03 transports(daily): subscribe to the desired video source 2024-12-02 12:13:23 -08:00
Aleix Conchillo Flaqué
bec46a87ae Merge pull request #766 from Allenmylath/patch-20
Update requirements.txt
2024-12-02 10:32:36 -08:00
Aleix Conchillo Flaqué
71cf94e936 transports(daily): warn if transcription enabled but no token provided 2024-12-02 09:55:17 -08:00
allenmylath
acbecf1c4c Update requirements.txt
daily is not used here.transport is fastapi websocket.
2024-12-02 21:36:29 +05:30
Mark Backman
6095fd342e Merge pull request #763 from Allenmylath/patch-19
Update README.md
2024-12-02 09:30:36 -05:00
Waleed
bf40b4936b updated env template; added simli variables 2024-12-02 12:05:55 +01:00
Waleed
c60dd8d4d2 updated environment variable name for cartesia 2024-12-02 12:05:32 +01:00
Waleed
d472aaf391 updated readme. Added simli 2024-12-02 11:50:51 +01:00
Waleed
6cc0b74e6c integrated simli 2024-12-02 11:35:46 +01:00
allenmylath
23316fbcf9 Update README.md 2024-12-02 13:35:44 +05:30
James Hush
5e22ef251d fix: add logging and error handling for issue #721 (#755) 2024-11-29 13:06:45 +08:00
Mark Backman
c5324df807 Merge pull request #752 from pipecat-ai/mb/google-context-message-conversion
Use Google Gemini message format when adding message to the LLM context
2024-11-27 14:13:17 -05:00
Mark Backman
3c19a7ae3d Use Google Gemini message format when adding message to the LLM context 2024-11-27 12:46:51 -05:00
Mark Backman
98c0a6e047 Merge pull request #749 from pipecat-ai/mb/pipecat-flows-standalone
Make Pipecat Flows an independent package
2024-11-25 17:09:11 -05:00
Mark Backman
f599e160de Make Pipecat Flows an independent package 2024-11-25 13:42:08 -05:00
Mark Backman
11c5d822f9 Merge pull request #746 from pipecat-ai/mb/update-flows
Bumping pipecat-ai-flows version
2024-11-22 11:25:03 -05:00
Mark Backman
c3e22f0931 Bumping pipecat-ai-flows version 2024-11-22 11:21:40 -05:00
Kwindla Hultman Kramer
9409546f90 Merge pull request #743 from pipecat-ai/khk/gemini-exp
Empty text content bug fix for Gemini
2024-11-21 14:04:28 -08:00
Kwindla Hultman Kramer
8ddac0ccd8 Testing with gemini-exp-1114. Bug fix 2024-11-21 10:33:12 -08:00
Vaibhav159
6e8e7fa19a adding session_timeout in fastapi 2024-11-21 14:56:42 +05:30
Vaibhav159
7dfa886669 moving logic to WebsocketServerInputTransport 2024-11-21 14:45:24 +05:30
Vaibhav159
da254c5143 correcting _monitor_websocket 2024-11-21 12:36:51 +05:30
Vaibhav159
e11f128110 adding on_session_timeout 2024-11-21 12:34:32 +05:30
Vaibhav-Lodha
3aa89fb13a adding session_timeout param 2024-11-21 12:20:51 +05:30
Mark Backman
f938960d50 Merge pull request #736 from pipecat-ai/mb/language-support
Make language support more robust
2024-11-20 13:03:47 -05:00
Mark Backman
2981d87bc1 Update changelog 2024-11-20 12:56:35 -05:00
Mark Backman
106042bbb2 Make language support more robust 2024-11-20 12:56:11 -05:00
Filipi da Silva Fuchter
d25ddeb962 Merge pull request #739 from pipecat-ai/krisp_v7
bumping krisp to support v7
2024-11-20 11:39:39 -03:00
Filipi Fuchter
c441baa692 bumping krisp to support v7 2024-11-20 11:37:45 -03:00
Mark Backman
676ff14913 Merge pull request #735 from pipecat-ai/vp-internal-push-frame-fix
internal push frame fix
2024-11-20 06:34:40 -05:00
Vanessa Pyne
14893ade92 Update src/pipecat/processors/frame_processor.py
Co-authored-by: Mark Backman <mark@daily.co>
2024-11-19 22:37:58 -06:00
Mark Backman
2a39ff69d6 Merge pull request #720 from pipecat-ai/mb/conversation-flow 2024-11-19 21:46:20 -05:00
Mark Backman
e79289454a Merge pull request #734 from pipecat-ai/mb/fix-cartesia 2024-11-19 21:27:52 -05:00
Mark Backman
25d02da1b2 Merge pull request #738 from pipecat-ai/mb/natural-conversation-demo 2024-11-19 21:27:38 -05:00
Mark Backman
a36fc370fa Improve the 22c foundational example 2024-11-19 15:49:40 -05:00
Mark Backman
e4c2f6d4c2 Update changelog 2024-11-18 21:32:53 -05:00
Mark Backman
97659ca3f0 Use the new pipecat-ai-flows module 2024-11-18 21:29:35 -05:00
vipyne
e00c75ce3f fix: raise exception in internal_push_frame 2024-11-18 16:01:04 -06:00
Mark Backman
cf62167f54 Revert: services(cartesia): generated TTSStoppedFrame after no more audio 2024-11-18 12:25:04 -05:00
Mark Backman
b3dfeb61c4 Add CHANGELOG entry 2024-11-18 12:18:20 -05:00
Mark Backman
bd020320cd Support a list of messages 2024-11-18 12:18:20 -05:00
Mark Backman
7a55d2d7db Add end session handler and update example 2024-11-18 12:18:20 -05:00
Mark Backman
b7308dca5d Fix issue where actions would execute on terminating nodes 2024-11-18 12:18:20 -05:00
Mark Backman
5301f44b3b Add pre- and post-actions 2024-11-18 12:18:20 -05:00
Mark Backman
686165b95a Add ability to register actions 2024-11-18 12:18:20 -05:00
Mark Backman
4e0ecdd673 Class name updates and remove FrameProcessor base class 2024-11-18 12:18:20 -05:00
Mark Backman
1b74560f9d Move function registration into the ConversationFlowProcessor class 2024-11-18 12:18:20 -05:00
Mark Backman
0c1070433f Clean up and commenting 2024-11-18 12:18:20 -05:00
Mark Backman
ece2c08cde debugging 2024-11-18 12:18:20 -05:00
Mark Backman
0b9742da9e Add a conversation flow processor 2024-11-18 12:18:20 -05:00
Aleix Conchillo Flaqué
635aa6eb5b Merge pull request #729 from pipecat-ai/aleix/fastapi-websocket-dont-close
transports(fastapi): don't try to close socket
2024-11-18 16:01:41 +01:00
Mark Backman
1ff17cc2b6 Merge pull request #733 from pipecat-ai/aleix/add-missing-init-files
processors: add missing __init__.py
2024-11-18 09:44:56 -05:00
Mark Backman
41ce9e9087 Merge pull request #697 from pipecat-ai/cst/leave-message
add handler for disconnect-bot message
2024-11-18 09:38:11 -05:00
Mark Backman
4803c54ecf Update CHANGELOG 2024-11-18 09:36:19 -05:00
Christian Stuff
5d7b3f2b38 add handler for disconnect-bot message 2024-11-18 09:33:30 -05:00
Aleix Conchillo Flaqué
23e5b1ec4d processors: add missing __init__.py 2024-11-18 11:32:20 +01:00
Aleix Conchillo Flaqué
7f5a8928b8 transports(fastapi): don't try to close socket
The websocket is passed from outside (in the transport constructor) so we should
not be trying to close it. FastAPI does actually close it later. We didn't see
any issue because these functions were not implemented properly. The value to
check was `application_state` instead of `client_state`. But in any case,
Pipecat should not be responsible for closing things passed from outside.
2024-11-18 01:15:19 +01:00
Aleix Conchillo Flaqué
53f675f5cf Merge pull request #727 from pipecat-ai/aleix/pipecat-0.0.49
update CHANGELOG for 0.0.49
2024-11-18 06:27:12 +08:00
Aleix Conchillo Flaqué
8173e4ce55 update CHANGELOG for 0.0.49 2024-11-17 23:26:09 +01:00
Aleix Conchillo Flaqué
5445bb0363 rtvi: add on_bot_started event 2024-11-17 22:40:00 +01:00
Mark Backman
a2a94724e5 Merge pull request #725 from pipecat-ai/mb/fix-simple-chatbot
Fix simple-chatbot example
2024-11-16 12:10:05 -05:00
Aleix Conchillo Flaqué
a8f9b0635a Merge pull request #722 from pipecat-ai/aleix/more-dailin-events
transports(daily): add more dial-in events
2024-11-17 01:09:01 +08:00
Mark Backman
4273a31fd5 Fix simple-chatbot example 2024-11-16 07:48:42 -05:00
Aleix Conchillo Flaqué
67f975a2c8 transports(daily): add more dial-in events 2024-11-16 01:22:50 +01:00
Mark Backman
d0bca67666 Merge pull request #716 from pipecat-ai/mb/mute-stt-service
Add STTMuteFilter to un/mute the STT
2024-11-14 19:55:00 -05:00
Mark Backman
966974bfc6 Change STTMuteProcessor to STTMuteFilter 2024-11-14 19:47:37 -05:00
Mark Backman
f807f233bd Suppress UserStartedSpeakingFrame and UserStoppedSpeakingFrame when muted 2024-11-14 17:11:51 -05:00
Mark Backman
33108f5798 Code review feedback 2024-11-14 17:05:08 -05:00
Mark Backman
52de825af8 Update CHANGELOG 2024-11-14 13:47:08 -05:00
Mark Backman
5fe679039c Add STTMuteProcessor to un/mute the STT 2024-11-14 13:35:02 -05:00
Kwindla Hultman Kramer
534f710f5d Merge pull request #688 from pipecat-ai/khk/natural-conversation
More work on llm-as-judge phrase endpointing
2024-11-14 09:15:16 -08:00
Mark Backman
53a11744a8 Merge pull request #712 from pipecat-ai/aleix/some-languages-tweaks
some languages tweaks
2024-11-14 09:33:26 -05:00
Mark Backman
72412cc0c4 Code review feedback 2024-11-14 09:31:04 -05:00
Mark Backman
b77ac07bc6 Merge pull request #715 from pipecat-ai/mb/update-readme-2
Add visual divider below Pipecat README image
2024-11-14 08:54:25 -05:00
Mark Backman
eb6926e0ce Add visual divider below Pipecat README image 2024-11-14 08:51:07 -05:00
Mark Backman
3b2c9de944 Merge pull request #713 from pipecat-ai/mb/update-readme
Update README
2024-11-14 08:45:28 -05:00
Mark Backman
27ff868e5a Move CONTRIBUTING to top directory 2024-11-14 08:43:03 -05:00
Mark Backman
57ef525a8e Update README 2024-11-14 08:43:03 -05:00
Aleix Conchillo Flaqué
d1db54d5fe examples(playht): use a 2.0 engine 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
4f88fc0eb8 services(tts): initialize language to the proper language code 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
37d1f4c4e1 services(tts): some language to service language cleanup 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
ef9e86d997 services(playht): make sure we only skip wav header no matter the size 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
2d2ef5a417 services(playht): voice engine is Play3.0-mini 2024-11-13 17:19:23 +01:00
Aleix Conchillo Flaqué
c1fff00586 services(playht): fix language codes 2024-11-13 17:19:23 +01:00
Mark Backman
0af2196f50 Merge pull request #708 from pipecat-ai/mb/add-rime-ai
Add RimeTTSService
2024-11-12 18:29:53 -05:00
Mark Backman
cd42320788 Update changelog 2024-11-12 18:28:04 -05:00
Mark Backman
70fce52499 Merge pull request #710 from pipecat-ai/mb/update-readme-krisp
Update Krisp README instructions
2024-11-12 11:15:25 -05:00
Mark Backman
70b60c0593 Update Krisp README instructions 2024-11-12 10:26:12 -05:00
Jon Taylor
2d8aa03f31 Merge pull request #706 from pipecat-ai/jpt/modal-example
barebones modal.com deployment example
2024-11-12 11:41:00 +00:00
Kwindla Hultman Kramer
581ff26704 Merge pull request #707 from pipecat-ai/khk/clean-up
tiny PR to remove old comment lines
2024-11-11 21:14:16 -08:00
Kwindla Hultman Kramer
335178ff06 some gemini audio input examples 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
ee53535f41 gemini audio-in with no transcription 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
91ac40307e small fix and more prompt examples 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
b6c2c1f730 anthropic natural conversation example using claude haiku 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
b56c789ae4 fixes for proposed judge pipeline 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
bd435d9e62 missing commit 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
55a81df84f contributing to llm-as-judge phrase endpointing work 2024-11-11 21:04:50 -08:00
Kwindla Hultman Kramer
87434460f5 temp hacking 2024-11-11 21:04:50 -08:00
Mark Backman
958ec42e8d Add Rime.ai TTS service 2024-11-11 21:58:09 -05:00
Jon Taylor
d1fff60d1d barebones modal.com deployment example 2024-11-11 22:30:07 +00:00
Kwindla Hultman Kramer
1438e5654a remove old comment 2024-11-10 16:08:10 -08:00
Aleix Conchillo Flaqué
1d4be0139a Merge pull request #705 from pipecat-ai/aleix/prepare-0.0.48
update CHANGELOG for 0.0.48
2024-11-10 14:08:33 -08:00
Aleix Conchillo Flaqué
f58c3ee322 update CHANGELOG for 0.0.48 2024-11-10 23:01:03 +01:00
Aleix Conchillo Flaqué
379750df91 Merge pull request #704 from pipecat-ai/aleix/cartesia-tts-stopped-frame
services(cartesia): generated TTSStoppedFrame after no more audio
2024-11-10 05:17:36 -08:00
Aleix Conchillo Flaqué
d125a38737 services(cartesia): generated TTSStoppedFrame after no more audio
The TTSStoppedFrame should be generated when the TTS services stoped generating
audio not when the bot stops speaking.
2024-11-10 09:55:45 +01:00
Mark Backman
446bb0aeaf Merge pull request #702 from pipecat-ai/mb/azure-websocket
Add an Azure TTS websocket service
2024-11-09 17:41:53 -05:00
Aleix Conchillo Flaqué
d839080834 Merge pull request #642 from pipecat-ai/aleix/input-queues-block-frames
introduce frame processor input queues block frames
2024-11-09 14:30:17 -08:00
Mark Backman
9b85d0642b Add a changelog entry 2024-11-09 12:37:29 -05:00
Mark Backman
230b51a117 Add an Azure TTS websocket service 2024-11-09 12:37:29 -05:00
Mark Backman
3a965ca396 Merge pull request #701 from pipecat-ai/khk/anthropic-function-calling-fix
fixes for anthropic function calling
2024-11-09 06:39:34 -05:00
Kwindla Hultman Kramer
33fc5bf990 improved 20c-persistent-context-anthropic.py 2024-11-08 16:42:30 -08:00
Kwindla Hultman Kramer
a54ca08405 fixes for anthropic function calling 2024-11-08 16:33:02 -08:00
Filipi da Silva Fuchter
4379db43ed Merge pull request #689 from pipecat-ai/filipi/krisp
Making pipecat work with Krisp
2024-11-08 16:22:52 -03:00
Filipi Fuchter
e915c676aa Added support for Krisp audio filter 2024-11-08 16:18:10 -03:00
Mark Backman
e0a003afa1 Merge pull request #695 from pipecat-ai/mb/initialize-azure-lang
Initialize the speech_recognition_language for Azure TTS
2024-11-08 06:40:40 -05:00
James Hush
d5666727ce feat: toggle looping with soundfile mixer (#693)
* feat: toggle looping with soundfile mixer

* Implement PR changes
2024-11-07 21:08:37 -08:00
Mark Backman
f6d7402530 Update changelog 2024-11-07 15:16:03 -05:00
Mark Backman
aefe190c9f Initialize the speech_recognition_language for Azure TTS 2024-11-07 15:14:05 -05:00
Vanessa Pyne
29925a8f21 Merge pull request #551 from Allenmylath/patch-3
Frame types and short descriptionCreate Frames.md
2024-11-07 10:05:32 -06:00
Aleix Conchillo Flaqué
beb3271168 services(tts): make sure word timestamp is reset properly 2024-11-06 18:54:12 -08:00
Aleix Conchillo Flaqué
b959ac6e1e Merge pull request #694 from pipecat-ai/aleix/daily-add-on-transcription-message
transports(daily): call on_transcription_message event handler
2024-11-06 15:21:17 -08:00
Aleix Conchillo Flaqué
17f4286942 transports(daily): call on_transcription_message event handler 2024-11-06 15:10:58 -08:00
Aleix Conchillo Flaqué
ce89bbb16e tts(elevenlabs): support pausing and resuming frames while speaking 2024-11-06 14:38:33 -08:00
Aleix Conchillo Flaqué
865768039b processors: remove block_on_frames and add pause_processing_frames() instead 2024-11-06 14:20:25 -08:00
Aleix Conchillo Flaqué
7071482583 try to use queue_frame() instead of process_frame() 2024-11-06 14:18:21 -08:00
Aleix Conchillo Flaqué
5353d13151 update CHANGELOG 2024-11-06 13:16:58 -08:00
Aleix Conchillo Flaqué
a9e565f355 processors: fix input queue interruptions 2024-11-06 13:12:24 -08:00
Aleix Conchillo Flaqué
b6f0c16591 examples: restore EndFrame() on 01 and 02 foundational 2024-11-06 13:05:03 -08:00
Aleix Conchillo Flaqué
49005d02f5 services(tts): use TTSSpeakFrame in say() method 2024-11-06 13:05:03 -08:00
Aleix Conchillo Flaqué
6d8b885071 transports(base_output): push bot started/stopped frames downstream 2024-11-06 13:04:37 -08:00
Aleix Conchillo Flaqué
2eccb33e73 processors: allow passing a callback when queued frame is processed 2024-11-06 13:04:37 -08:00
Aleix Conchillo Flaqué
22ca4c5a02 processors: cancel input task and empty queue with interruptions 2024-11-06 13:04:37 -08:00
Aleix Conchillo Flaqué
84f26ac1ca processors: introduce input queues
Frame processors can now decide if they should continue processing frames or
not, and if so also decide when to continue processing frames. For example,
asynchronous TTS services will stop processing frames until they have generated
all the audio for an LLM response.
2024-11-06 12:13:49 -08:00
Aleix Conchillo Flaqué
74937411e6 Merge pull request #691 from pipecat-ai/aleix/rtvi-manual-bot-ready
rtvi: bot-ready message needs to be sent manual
2024-11-06 10:53:25 -08:00
Aleix Conchillo Flaqué
8aab068ffd rtvi: bot-ready message needs to be sent manual 2024-11-05 10:52:54 -08:00
Aleix Conchillo Flaqué
bd50201ce4 transports(daily): just make it clear we subscribe to camera 2024-11-04 17:32:46 -08:00
Aleix Conchillo Flaqué
6082da284e Merge pull request #611 from pipecat-ai/aleix/audio-filters
introduce audio filters
2024-11-04 16:34:47 -08:00
Aleix Conchillo Flaqué
358c458265 transports(base_input): handle filter contorl frames 2024-11-04 16:19:52 -08:00
Aleix Conchillo Flaqué
807dbbe326 audio(noisereduce): allow enabling/disabling filter 2024-11-04 16:13:29 -08:00
Aleix Conchillo Flaqué
3c116b291d audio(mixers): some cosmetics 2024-11-04 15:37:08 -08:00
Aleix Conchillo Flaqué
0dd413ee90 audio(filters): add noisereduce filter 2024-11-04 15:37:08 -08:00
Aleix Conchillo Flaqué
abc8ede3d7 introduce audio filters 2024-11-04 15:37:08 -08:00
Aleix Conchillo Flaqué
126324ca1b Merge pull request #687 from pipecat-ai/aleix/transport-audio-mixers
introduce transport audio mixers
2024-11-04 13:14:36 -08:00
Aleix Conchillo Flaqué
602915ae18 examples(websocket-server): allow interruptions 2024-11-04 13:05:02 -08:00
Aleix Conchillo Flaqué
0ac9e2dd3f transports(network): synchronize with time before sending data 2024-11-04 13:04:18 -08:00
Aleix Conchillo Flaqué
a9ef5ca95d examples: add bot background sound example 2024-11-03 11:13:02 -08:00
Aleix Conchillo Flaqué
81c476dd4c introduce output transport audio mixers 2024-11-03 11:13:02 -08:00
Kwindla Hultman Kramer
151242d3a0 Merge pull request #666 from pipecat-ai/khk/realtime-pipecat-vad
Support using Pipecat turn detection instead of OpenAI Realtime API turn detection
2024-11-02 08:36:31 -07:00
Kwindla Hultman Kramer
93c6e5098c added comment explaining config of TurnDetection 2024-11-02 08:24:54 -07:00
Aleix Conchillo Flaqué
4455b2a428 rtvi: create queues before tasks 2024-11-01 23:06:50 -07:00
Aleix Conchillo Flaqué
94062592ef base_output: generate smaller audio frames of the same incoming type 2024-11-01 23:06:50 -07:00
Aleix Conchillo Flaqué
d2401a76c8 base_output: only generate bot speaking with TTS audio frames 2024-11-01 23:06:50 -07:00
Aleix Conchillo Flaqué
e2b1b56e86 examples: don't require room token if using an STT 2024-11-01 23:06:50 -07:00
Mark Backman
84bd767312 Merge pull request #685 from pipecat-ai/mb/add-recording-events
Add recording events and callbacks
2024-11-01 12:02:46 -04:00
Mark Backman
802c29e9e1 Add recording events and callbacks 2024-11-01 10:20:00 -04:00
Aleix Conchillo Flaqué
f83381860c Merge pull request #677 from pipecat-ai/aleix/add-notifier-and-notifier-filters
add notifiers and more frame filters
2024-10-31 15:55:07 -07:00
Aleix Conchillo Flaqué
4dad1bfe49 examples: add foundational/22-natural-conversation.py 2024-10-31 12:10:33 -07:00
marcus-daily
9ee8896b64 Removing unnecessary ruff arguments from README 2024-10-31 18:02:29 +00:00
marcus-daily
5f7a2f66d4 Add .idea to .gitignore 2024-10-31 18:02:29 +00:00
marcus-daily
76e5f1e847 Remove unnecessary ruff params in CI 2024-10-31 15:07:28 +00:00
marcus-daily
6975340d6c Set Ruff config for the project 2024-10-31 15:07:28 +00:00
marcus-daily
0f4cf56418 Load dotenv in simple chatbot server (fixes #415) 2024-10-31 12:08:30 +00:00
Aleix Conchillo Flaqué
018e51e8a3 add notifiers and more frame filters 2024-10-30 16:36:17 -07:00
Vanessa Pyne
b050143952 Merge pull request #676 from RonakAgarwalVani/fix/chunk-choices-delta-none
Fix uncaught exception when accessing 'tool_calls' in NoneType delta in response handling
2024-10-30 14:44:32 -05:00
Mark Backman
98ea1f0791 Merge pull request #675 from pipecat-ai/mb/playht-add-request-id
Add a request_id to each TTS sequence
2024-10-30 13:56:15 -04:00
Mark Backman
8272c35527 Use a request_id in TTS commands for the PlayHT websocket service 2024-10-30 13:54:18 -04:00
Mark Backman
e973e82e05 Merge pull request #672 from pipecat-ai/mb/fix-playht
Fix PlayHT TTFB metrics
2024-10-30 13:53:02 -04:00
RonakAgarwalVani
d1396bf618 Update openai.py 2024-10-30 14:26:49 +05:30
Vanessa Pyne
8186e423de Merge pull request #637 from pipecat-ai/vp-issue-template
docs: add ISSUE_TEMPLATE.md
2024-10-29 15:08:42 -05:00
vipyne
3010addb8b docs: add CONTRIBUTING.md 2024-10-29 15:03:07 -05:00
vipyne
029e0d391e docs: add ISSUE_TEMPLATE.md 2024-10-29 15:03:07 -05:00
Vanessa Pyne
bf31223577 Merge pull request #671 from pipecat-ai/vp-issue-635
docs: small fix
2024-10-29 14:34:13 -05:00
vipyne
42cc79154f docs: small fix 2024-10-29 14:33:57 -05:00
Mark Backman
05b857006a Update changelog 2024-10-28 20:56:29 -04:00
Mark Backman
2e57d21b89 Fix ttfb metrics 2024-10-28 20:27:24 -04:00
Aleix Conchillo Flaqué
fa05ec46be Merge pull request #667 from pipecat-ai/aleix/base-output-bot-speaking-detection
transports(base_output): use audio frames for bot speaking detection
2024-10-28 10:54:54 -07:00
Aleix Conchillo Flaqué
e3ce619284 transports(base_output): use audio frames for bot speaking detection 2024-10-28 10:07:37 -07:00
Vanessa Pyne
fb512dcd74 Merge pull request #630 from MoofSoup/update-readme
docs: simplify readme
2024-10-28 10:26:30 -05:00
Aleix Conchillo Flaqué
ca15d97383 Merge pull request #662 from pipecat-ai/aleix/daily-transport-async-functions
transports(daily): make functions async
2024-10-25 16:14:06 -07:00
Aleix Conchillo Flaqué
b32448e967 transports(daily): make functions async 2024-10-25 15:01:52 -07:00
Aleix Conchillo Flaqué
7e30da6183 Merge pull request #661 from pipecat-ai/aleix/allow-updating-subscritption-before
transports(daily): allow updating subscriptions before join
2024-10-25 15:00:34 -07:00
Aleix Conchillo Flaqué
a6dd2600d2 examples(tavus): await update_subscriptions 2024-10-25 14:56:56 -07:00
Aleix Conchillo Flaqué
b905b57dfc transports(daily): allow updating subscriptions before join 2024-10-25 14:46:17 -07:00
Kwindla Hultman Kramer
e1a7edfb58 make it possible to use Pipecat turn detection instead of OpenAI turn detection 2024-10-25 15:59:48 -05:00
Aleix Conchillo Flaqué
1b30b1fc23 Merge pull request #665 from pipecat-ai/aleix/fix-bot-started-stopped-speaking
transports(base_output): fix constant bot started/stopped speaking fr…
2024-10-25 13:00:38 -07:00
Aleix Conchillo Flaqué
55026898f6 transports(base_output): use vad stop secs for bot stopped speaking 2024-10-25 12:59:15 -07:00
Aleix Conchillo Flaqué
4283557894 audio(vad): expose params property 2024-10-25 12:59:15 -07:00
Aleix Conchillo Flaqué
5ab00e01aa transports(base_output): fix constant bot started/stopped speaking frames 2024-10-25 12:10:24 -07:00
Aleix Conchillo Flaqué
fcfc729e83 Merge pull request #664 from pipecat-ai/aleix/fix-aws-stuttering
services(aws): read stream and resample in a thread
2024-10-25 11:49:28 -07:00
Aleix Conchillo Flaqué
4eacb34fd8 services(aws): read stream and resample in a thread 2024-10-25 11:22:28 -07:00
Aleix Conchillo Flaqué
3a8aacccf7 Merge pull request #663 from pipecat-ai/aleix/audio-resampling-with-resampy
audio: use resamply for audio resampling
2024-10-25 10:16:20 -07:00
roey
54c0bf0c70 Adding TavusVideoService layer (#617)
Co-authored-by: roey <159067767+roey-tavus@users.noreply.github.com>
Co-authored-by: Mert Gerdan <mert@tavus.io>
Co-authored-by: Aleix Conchillo Flaqué <aleix@daily.co>
2024-10-25 09:46:25 -07:00
Aleix Conchillo Flaqué
778b05a252 audio: use resamply for audio resampling 2024-10-25 09:22:22 -07:00
Mark Backman
f16a416c2b Merge pull request #660 from pipecat-ai/mb/add-gemini-inputs
Add input params to Google Gemini
2024-10-24 20:58:19 -04:00
Aleix Conchillo Flaqué
1be63bccb8 Merge pull request #647 from pipecat-ai/aleix/daily-transport-only-transcribe-users
transport(daily): only transcribe users
2024-10-24 17:40:34 -07:00
Mark Backman
37820ac0df Add input params to Google Gemini 2024-10-24 20:12:41 -04:00
Aleix Conchillo Flaqué
8ea80d43f4 transports(daily): only transcribe user audio 2024-10-24 17:06:43 -07:00
Aleix Conchillo Flaqué
e117d70a00 update to daily-python 0.12.0 2024-10-24 16:49:19 -07:00
Aleix Conchillo Flaqué
2ba753272a Merge pull request #658 from pipecat-ai/aleix/default-to-24000-sample-rate
update TTS and transport output sample rate to 24000
2024-10-24 16:48:41 -07:00
Aleix Conchillo Flaqué
60c8c2f6e9 examples(15a): use daily transcription instead of local whisper 2024-10-24 16:47:41 -07:00
Aleix Conchillo Flaqué
cfb48200c2 services(azure): support sample rates 2024-10-24 16:47:35 -07:00
Aleix Conchillo Flaqué
6d317c6e8e audio: don't resample if same sample rate 2024-10-24 16:47:35 -07:00
Aleix Conchillo Flaqué
158d52856f transports(livekit): fix VADAnalyzer import 2024-10-24 16:47:35 -07:00
Aleix Conchillo Flaqué
92a69e404f update TTS and transport output sample rate to 24000 2024-10-24 16:47:35 -07:00
Aleix Conchillo Flaqué
d24c6185d8 Merge pull request #654 from pipecat-ai/aleix/daily-allow-completion-futures
transport(daily): allow completion futures
2024-10-24 14:28:53 -07:00
Mark Backman
1fd21578a6 Merge pull request #657 from pipecat-ai/mb/add-elevenlabs-output-format-type
Add ElevenLabs output format type
2024-10-24 17:07:04 -04:00
Mark Backman
700db87127 Merge pull request #656 from pipecat-ai/mb/add-gemini-metrics
Add Gemini token usage metrics
2024-10-24 17:04:56 -04:00
Mark Backman
6f1310569c Add ElevenLabs output format type 2024-10-24 17:03:45 -04:00
Aleix Conchillo Flaqué
14cedb0be8 Merge pull request #655 from pipecat-ai/aleix/fix-together-params
services(together): fix together AI InputParams
2024-10-24 13:51:38 -07:00
Mark Backman
fae97f9051 Add Gemini token usage metrics 2024-10-24 16:37:21 -04:00
Aleix Conchillo Flaqué
d930a46e64 services(together): fix together AI InputParams 2024-10-24 13:08:35 -07:00
Aleix Conchillo Flaqué
2e6b5d1843 transports(daily): fix aiohttp timeout 2024-10-24 11:44:30 -07:00
Aleix Conchillo Flaqué
88362db034 transports(daily): no more need for an output message queue 2024-10-24 11:44:30 -07:00
Aleix Conchillo Flaqué
f7f0c44c32 transports(daily): don't block event handlers 2024-10-24 11:44:30 -07:00
Mark Backman
33553b71d4 Merge pull request #653 from pipecat-ai/mb/align-tts-constructors
Align TTSService constructors
2024-10-24 13:52:43 -04:00
Mark Backman
be8ca505cd Merge pull request #652 from pipecat-ai/khk/more-gemini
Gemini new context manager and rewrite to use google data structures internally
2024-10-24 13:47:38 -04:00
Mark Backman
e957cce422 Align TTSService constructors 2024-10-24 13:42:06 -04:00
Mark Backman
418a13a4ec Merge pull request #650 from pipecat-ai/mb/assembly-fix
AssemblyAI: don't disconnect on language change
2024-10-24 11:26:56 -04:00
Mark Backman
fc445c0a1f Merge pull request #649 from pipecat-ai/mb/open-ai-max-tokens
Add max_tokens and max_completion_tokens inputs for OpenAI
2024-10-24 11:26:44 -04:00
Mark Backman
f0c65468ed AssemblyAI: don't disconnect on language change 2024-10-24 08:30:48 -04:00
Mark Backman
ce6a2bdcf7 Add max tokens inputs to OpenAI 2024-10-24 07:03:45 -04:00
Mark Backman
673542e235 Merge pull request #646 from pipecat-ai/mb/grok-function-calling
Support function calling for Grok
2024-10-23 21:56:38 -04:00
Kwindla Hultman Kramer
e032b0b70a gemini context aggregators 2024-10-23 18:44:09 -07:00
Mark Backman
e39f7e965b Support function calling for Grok 2024-10-23 17:22:26 -04:00
Mattie Ruth
d26751e968 add missing PipelineParams to enable the metrics (#645) 2024-10-23 16:46:46 -04:00
Aleix Conchillo Flaqué
e0ca4a9c23 Merge pull request #643 from pipecat-ai/aleix/daily-update-subscriptions
transports(daily): add update_subscriptions()
2024-10-22 17:07:07 -07:00
Aleix Conchillo Flaqué
801e52c095 transports(daily): add update_subscriptions() 2024-10-22 15:02:55 -07:00
Aleix Conchillo Flaqué
a46eaa838b Merge pull request #641 from pipecat-ai/aleix/prepare-0.0.47
prepare 0.0.47
2024-10-22 10:30:42 -07:00
Aleix Conchillo Flaqué
7c432499db update CHANGELOG for 0.0.47 2024-10-22 10:02:50 -07:00
Aleix Conchillo Flaqué
8d75fcc9f0 use warnings package to report deprecated code 2024-10-22 10:02:21 -07:00
Aleix Conchillo Flaqué
61d73f81ae Merge pull request #639 from pipecat-ai/aleix/daily-transcription-model
transport(daily): use "nova-2-general" for transcription
2024-10-22 09:40:43 -07:00
Aleix Conchillo Flaqué
951255def9 transport(daily): use "nova-2-general" for transcription 2024-10-22 09:40:03 -07:00
Moof Soup
bf5a7c3562 docs: Clarify README example and token usage
clarified readme example
2024-10-21 19:54:34 -07:00
Mark Backman
e556f34094 Merge pull request #638 from pipecat-ai/mb/fix-silero-vad-import
Fix Silero VAD import issue
2024-10-21 20:48:06 -04:00
Mark Backman
ccc3691620 Fix Silero VAD import issue 2024-10-21 20:39:20 -04:00
Vanessa Pyne
5321affda7 Merge pull request #588 from Allenmylath/patch-11
Update README.md
2024-10-21 11:20:05 -05:00
Mark Backman
e5ad8dc67b Merge pull request #627 from pipecat-ai/mb/upgrade-gladia-to-v2-api
Update GladiaSTTService to use the Gladia V2 API
2024-10-21 12:01:20 -04:00
Mark Backman
46927805bc Update GladiaSTTService to use the Gladia V2 API 2024-10-21 07:10:38 -04:00
Aleix Conchillo Flaqué
b6b1ef0a40 Merge pull request #589 from Allenmylath/patch-12
Update Dockerfile
2024-10-20 10:59:43 -07:00
Mark Backman
e62f762382 Merge pull request #625 from pipecat-ai/mb/add-assemblyai-stt
Add support for AssemblyAI STT
2024-10-20 13:59:33 -04:00
Aleix Conchillo Flaqué
dbfda14342 Merge pull request #587 from Allenmylath/patch-9
Update env.example
2024-10-20 10:58:50 -07:00
Aleix Conchillo Flaqué
fee85418cd Merge pull request #620 from gregschwartz/main
Start agent/call/bot at localhost root
2024-10-20 10:14:10 -07:00
Mark Backman
015faa3dbd Update CHANGELOG and README 2024-10-20 08:57:57 -04:00
Mark Backman
1dbf4ff27d Add AssemblyAI STT service 2024-10-20 08:57:57 -04:00
Aleix Conchillo Flaqué
4f1b2dce9b Merge pull request #624 from pvilchez/fix_enable_usage_metrics
Fixing `enable_usage_metrics` setting.
2024-10-20 01:00:12 -07:00
Paul Vilchez
5640bd9447 Fixing a config mismatch which caused usage stats to only report when enable_metrics was true. 2024-10-20 03:33:13 -04:00
Aleix Conchillo Flaqué
ee5ae0d631 Merge pull request #621 from pipecat-ai/aleix/prepare-0.0.46
update CHANGELOG for 0.0.46
2024-10-19 18:26:05 -07:00
Aleix Conchillo Flaqué
4b8a4b86fe update CHANGELOG for 0.0.46 2024-10-19 18:25:29 -07:00
Aleix Conchillo Flaqué
3556c9ce0f Merge pull request #618 from pipecat-ai/aleix/examples-switch-to-llm-context
examples: use OpenAILLMContext in all the examples
2024-10-19 18:24:39 -07:00
Aleix Conchillo Flaqué
f971dbe027 examples(audio-recording): record audio into a file 2024-10-19 18:24:00 -07:00
Aleix Conchillo Flaqué
3815e9dec3 examples: fix dialin-chatbot python arguments 2024-10-19 18:24:00 -07:00
Aleix Conchillo Flaqué
320f622255 examples: upgrade storytelling frontend packages 2024-10-19 18:24:00 -07:00
Aleix Conchillo Flaqué
be4bdabdf4 examples: use OpenAILLMContext in all the examples 2024-10-19 18:24:00 -07:00
Greg Schwartz
1fa52b62aa Put start agent/call at localhost root. Before you had to read in the docs to go to /start, or /start_call or /start_bot. Which isn't mentioned in the console output, and is inconsistent, adding friction to learning the codebase 2024-10-19 16:18:43 -07:00
Aleix Conchillo Flaqué
4f66e5d55f Merge pull request #619 from pipecat-ai/aleix/split-vad
move SileroVAD processor to processors package
2024-10-18 23:30:07 -07:00
Aleix Conchillo Flaqué
3502509d3e move SileroVAD processor to processors package 2024-10-18 23:28:29 -07:00
Aleix Conchillo Flaqué
d71ea1c0e0 Merge pull request #615 from DamienDeepgram/patch-1
Update default Deepgram model
2024-10-18 22:47:30 -07:00
Kwindla Hultman Kramer
07712cdb16 gemini function calling and partial implementation of standard context stuff 2024-10-18 17:14:57 -07:00
DamienDeepgram
13f232bafc Update default model 2024-10-18 15:33:50 -07:00
Aleix Conchillo Flaqué
9dd3354b89 Merge pull request #613 from pipecat-ai/aleix/examples-endframe
examples: use EndFrame() when the participant leaves
2024-10-18 11:18:26 -07:00
Aleix Conchillo Flaqué
8c006c24a3 README: update example 2024-10-18 11:18:03 -07:00
Aleix Conchillo Flaqué
4550545528 examples: use EndFrame() when the participant leaves 2024-10-18 11:18:03 -07:00
Aleix Conchillo Flaqué
020f371ecb pyproject: update onnxruntime to support python 3.12 2024-10-18 10:20:28 -07:00
Aleix Conchillo Flaqué
f3c0767c81 Merge pull request #610 from pipecat-ai/aleix/stt-push-audio
allow STT services to passthrough audio frames
2024-10-17 21:02:30 -07:00
Aleix Conchillo Flaqué
c9318ecd5c examples: minor fixes 2024-10-17 16:15:09 -07:00
Aleix Conchillo Flaqué
12eb9437c1 services(stt): allow STT service to passthrough audio 2024-10-17 16:15:09 -07:00
Aleix Conchillo Flaqué
71c8c0dcdb Merge pull request #609 from pipecat-ai/aleix/livekit-force-specifying-vad
livekit force specifying vad
2024-10-17 14:08:55 -07:00
Aleix Conchillo Flaqué
8108423742 transport(livekit): force specifying a vad analyzer
Don't default to SileroVADAnalyzer(). Also, resample to input sample rate.
2024-10-17 14:06:43 -07:00
Aleix Conchillo Flaqué
d67e08be4d Merge pull request #608 from pipecat-ai/aleix/add-audio-utils-and-resample
add audio utils and resample
2024-10-17 14:00:49 -07:00
Aleix Conchillo Flaqué
d3f4ac61b6 move utils.audio to audio.utils and add resample_audio() 2024-10-17 13:59:32 -07:00
Aleix Conchillo Flaqué
c6d28bb0db Merge pull request #607 from pipecat-ai/aleix/pipecat-vad-deprecation
move vad package to audio.vad
2024-10-17 13:51:20 -07:00
Aleix Conchillo Flaqué
2a37b2459a move vad package to audio.vad 2024-10-17 13:49:16 -07:00
Mark Backman
d1000f2fe4 Merge pull request #606 from pipecat-ai/mb/add-playht-options
PlayHT: Add websocket TTS service; rename existing service to PlayHTHttpTTSService, upgrade client, add input params
2024-10-17 16:46:59 -04:00
Mark Backman
e2d7af4b62 Update changelog 2024-10-17 16:16:29 -04:00
Mark Backman
da3810f1a2 Add websocket support for PlayHT 2024-10-17 15:41:33 -04:00
Aleix Conchillo Flaqué
eb21597d1a Merge pull request #603 from pipecat-ai/aleix/silero-vad-processor-fixes
vad: add support for interruption to SileroVAD processor
2024-10-17 10:48:39 -07:00
Aleix Conchillo Flaqué
e3eea0c02f vad: add support for interruption to SileroVAD processor 2024-10-17 10:48:25 -07:00
Mark Backman
45606e177c Add input options to PlayHT, upgrade to latest PlayHT model 2024-10-17 11:56:12 -04:00
Aleix Conchillo Flaqué
197d7b3e2b Merge pull request #604 from natestraub/patch-1
services(livekit) - Stop Sending EndFrame when Participant Disconnects
2024-10-17 08:48:57 -07:00
Nathan Straub
d4ec6827ce services(livekit) - Stop Sending EndFrame when Participant Disconnects
How It Works Now:
A participant disconnecting triggers and EndFrame, invoking stop() on the input and output transports and causing the LiveKit room to disconnect.  

Proposal:
Match the daily implementation, and just trigger the callbacks in the LiveKitTransport.  Leave it up to the implementor to decide whether to send EndFrames when this happens.
2024-10-16 23:53:31 -07:00
Aleix Conchillo Flaqué
e31d1152db Merge pull request #601 from pipecat-ai/aleix/openai-realtime-misc
services(openai): rename OpenAILLMServiceRealtimeBeta to OpenAIRealti…
2024-10-16 16:20:18 -07:00
Mark Backman
bb48a81103 Merge pull request #602 from pipecat-ai/mb/adjust-logger-levels
Adjust log levels for log messages
2024-10-16 18:00:35 -04:00
Mark Backman
55f1ae2564 Adjust log levels for log messages 2024-10-16 17:30:47 -04:00
Kwindla Hultman Kramer
280691b1b3 explanatory comment in 19-openai-realtime-beta.py 2024-10-16 14:27:48 -07:00
Kwindla Hultman Kramer
93c9e219ce fix for message handling bug on initialization 2024-10-16 12:40:20 -07:00
Aleix Conchillo Flaqué
edd44cc181 services(openai): rename OpenAILLMServiceRealtimeBeta to OpenAIRealtimeBetaLLMService 2024-10-16 10:20:19 -07:00
Aleix Conchillo Flaqué
4075b19f7c Merge pull request #600 from pipecat-ai/aleix/prepare-0.0.45
update CHANGELOG to 0.0.45
2024-10-16 09:18:37 -07:00
Aleix Conchillo Flaqué
bb14918a33 update CHANGELOG to 0.0.45 2024-10-16 09:17:33 -07:00
Mark Backman
2aee8a12f8 Merge pull request #599 from pipecat-ai/mb/remove-metrics-from-transport
Move metrics from transport to rtvi
2024-10-16 11:39:58 -04:00
Mark Backman
5760fadb44 Update changelog 2024-10-16 11:33:56 -04:00
Mark Backman
af5a7e9092 Move metrics from transport to rtvi 2024-10-16 11:33:56 -04:00
Mark Backman
8d9a7486d1 Merge pull request #598 from pipecat-ai/mb/add-daily-metrics-message-frame
Comply with RTVI format for sending metrics data via Daily transport
2024-10-16 10:14:44 -04:00
Mark Backman
00d0f9ae48 Comply with RTVI format for sending metrics data 2024-10-16 09:00:38 -04:00
Aleix Conchillo Flaqué
d255b7d1b2 Merge pull request #596 from pipecat-ai/aleix/prepare-0.0.44
prepare for pipecat 0.0.44
2024-10-15 18:13:07 -07:00
Aleix Conchillo Flaqué
4eb2c95b63 update CHANGELOG for 0.0.44 2024-10-15 17:51:01 -07:00
Aleix Conchillo Flaqué
3910aeb4de transports(daily): don't send messages if not joined 2024-10-15 17:51:01 -07:00
Aleix Conchillo Flaqué
713dcb7a4d transports(daily): cancel messages task when canceling 2024-10-15 17:51:01 -07:00
Aleix Conchillo Flaqué
04da51c7d8 transport(base_output): push EndFrame downstream at the right time 2024-10-15 17:51:01 -07:00
Aleix Conchillo Flaqué
e52d18e42d processors(audiobuffer): make functions public 2024-10-15 15:31:59 -07:00
Aleix Conchillo Flaqué
0c4a513ca2 Merge pull request #595 from pipecat-ai/aleix/bot-speaking-system-frames
bot speaking system frames
2024-10-15 15:30:11 -07:00
Aleix Conchillo Flaqué
4a71eacac3 rtvi: reset bot transcription with interruptions 2024-10-15 14:58:21 -07:00
Aleix Conchillo Flaqué
f0d89e57ad frames: some frames need to be SystemFrames
We want to process user and bot started/stopped speaking frames as fast as
possible. If we queue them they might be processed too late.
2024-10-15 14:37:56 -07:00
Mark Backman
79b52d4301 Merge pull request #594 from pipecat-ai/mb/more-text-filter-massaging
More edge case handling for text filtering
2024-10-15 14:51:43 -04:00
Mark Backman
bb00dbefbc More edge case handling for text filtering 2024-10-15 14:08:27 -04:00
Aleix Conchillo Flaqué
0c250c0603 Merge pull request #583 from pipecat-ai/aleix/add-pts-to-llm-full-response-end-frame
add pts to llm full response end frame
2024-10-15 10:39:50 -07:00
Aleix Conchillo Flaqué
7bbaf4dfe9 rtvi: merge TTS/TTSText and LLM/LLMText processors 2024-10-15 10:24:43 -07:00
Aleix Conchillo Flaqué
3a3bf3fe34 services(cartesia): schedule TTSStoppedFrame after text 2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
616aa54f75 ruff formatting 2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
164f06415c servcies(cartesia): no need to send LLMFullResponseEndFrame
Interruptions are already handled by context aggregators.
2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
51bc4839d1 transport(base_output): simplify code 2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
6d778e0491 services: add pts to LLMFullResponseEndFrame in WordTTSService 2024-10-15 10:06:28 -07:00
Aleix Conchillo Flaqué
fc4fa2faaa Merge pull request #593 from pipecat-ai/aleix/bot-transcription-processor
rtvi: add RTVIBotTranscriptionProcessor to send `bot-transcription`
2024-10-15 10:03:39 -07:00
Aleix Conchillo Flaqué
90b7f65545 rtvi: add RTVIBotTranscriptionProcessor to send bot-transcription 2024-10-15 10:03:20 -07:00
Kwindla Hultman Kramer
f7b7f0d680 Merge pull request #541 from pipecat-ai/khk/openai-realtime-beta
openai realtime beta
2024-10-14 21:02:06 -07:00
Kwindla Hultman Kramer
5431c44e51 remove two debug lines 2024-10-14 21:01:20 -07:00
Kwindla Hultman Kramer
40b3e50815 fix system, consecutive same role, and empty message parsing for anthropic 2024-10-14 20:56:42 -07:00
allenmylath
ec98a13a08 Update Dockerfile
utils and assets not used in this example hence removed
2024-10-15 08:18:16 +05:30
allenmylath
b999b76f70 Update README.md
readme description still shows simple-chatbot definition hence made more accurate description
2024-10-15 08:14:43 +05:30
allenmylath
b64dbe7bb4 Update env.example
canonical api url is also used from env.
2024-10-15 08:10:07 +05:30
Kwindla Hultman Kramer
2f6232fac9 fix for initial-messages with single message, and hoisting system message into instructions 2024-10-14 18:14:35 -07:00
Aleix Conchillo Flaqué
b4f2525c76 Merge pull request #585 from pipecat-ai/aleix/daily-urgent-transport-message-hang
transports(daily): send transport messages in a task
2024-10-14 16:31:10 -07:00
Aleix Conchillo Flaqué
8e956a4e88 Merge pull request #584 from pipecat-ai/aleix/urgent-bot-tts-audio
rtvi: bot-tts-audio messages should also be urgent
2024-10-14 16:25:35 -07:00
Aleix Conchillo Flaqué
7b9712daad transports(daily): send transport messages in a task
We queue transport messages and send them in a task to avoid potential hangs by
sending urgent transport messages from a transport event handler.
2024-10-14 16:19:53 -07:00
Kwindla Hultman Kramer
d4269acd67 user started/stopped speaking frames and interruption frames 2024-10-14 16:07:04 -07:00
Kwindla Hultman Kramer
d2ae82fb38 added back in missing LLMFullResponseStartFrame and LLMFullResponseEndFrame 2024-10-14 15:18:50 -07:00
Lewis Wolfgang
270949e6cd Merge pull request #582 from pipecat-ai/lewis/update_readme_aboutsilerofirstrun
Minor README update about Silero VAD.
2024-10-14 16:26:28 -04:00
Aleix Conchillo Flaqué
cfada94c13 rtvi: bot-tts-audio messages should also be urgent 2024-10-14 12:46:11 -07:00
Lewis Wolfgang
68fd6f7c44 Minor README update about Silero VAD.
We no longer download the model during first run - it's part of the repo.
2024-10-14 13:11:16 -04:00
Mark Backman
96bfcc3dca Merge pull request #571 from pipecat-ai/mb/add-code-filtering
Add code and table filtering option to MarkdownTextFilter
2024-10-14 12:54:16 -04:00
Mark Backman
b0890b1f75 Code review fixes 2024-10-14 12:52:16 -04:00
Aleix Conchillo Flaqué
802b3e42c4 Merge pull request #579 from Allenmylath/patch-16
Update Dockerfile
2024-10-14 08:58:02 -07:00
Aleix Conchillo Flaqué
bd134839ff Merge pull request #578 from Allenmylath/patch-15
Create Dockerfile
2024-10-14 08:57:34 -07:00
Aleix Conchillo Flaqué
428ce63e17 Merge pull request #575 from Allenmylath/patch-12
Update README.md
2024-10-14 08:55:12 -07:00
Aleix Conchillo Flaqué
46d6cde383 Merge pull request #574 from Allenmylath/patch-11
Update requirements.txt
2024-10-14 08:54:44 -07:00
allenmylath
6de82b3c11 Create .env.example (#562)
* Create .env.example

.env.example file with required env variables not added hence adding

* Rename .env.example to env.example

file name corrected as directed
2024-10-14 08:52:46 -07:00
Mark Backman
ec0bc7a057 A few bug fixes 2024-10-14 09:44:20 -04:00
allenmylath
c62156a4c3 Update Dockerfile
assets and utils files not found hence removed
2024-10-14 12:00:29 +05:30
allenmylath
e8618a07d0 Create Dockerfile
there is Dockerfile in other examples. this docker file assumes that there is a .env file(i added env.example in another pull request)
2024-10-14 11:49:35 +05:30
allenmylath
0ba99514a9 Update README.md
env.example added hence addying copy command will be necessary
2024-10-14 11:22:56 +05:30
allenmylath
837c8dad27 Update requirements.txt
whisper not used but deepgram used hence changed
2024-10-14 11:20:12 +05:30
allenmylath
0e69625a01 Rename frames.md to frame.md
edited again to frame.md
2024-10-14 10:07:47 +05:30
allenmylath
4e0823fced Rename Frames.md to frames.md
file name changed as requested
2024-10-14 10:05:26 +05:30
Kwindla Hultman Kramer
6f2a464451 conversation save/load for openai, openai-realtime, and anthropic 2024-10-13 18:12:03 -07:00
Kwindla Hultman Kramer
ac4c5ab369 response content item truncation when interrupted 2024-10-13 14:38:04 -07:00
Kwindla Hultman Kramer
9e95419301 much cleanup 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
f390ec9608 temp commit; debugging 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
ce8a83efba tools frame support and wip message resetting/loading 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
e5a2bf9564 context management improvements 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
7838018686 fix default response properties getting appended to ResponseCreateEvent 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
31916ed9fd turn on/off openai vad 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
3a2fbc2b19 send user started/stopped speaking event from openai realtime events
send user started/stopped speaking event from openai realtime events
2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
43520b44da add 'failed' case to Response event object 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
ab4a8d791a RTVI processors should use TextFrame not TextFrame and all subclasses 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
40dc546b81 function call fix and user transcription frames 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
5426891feb added input audio pause setting. no frame to update that state, yet. 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
1c5ccd3406 fixes for settings updates, context updates, and response creation 2024-10-12 21:58:11 -07:00
Mark Backman
3a745bfa3f Handle self._context of None 2024-10-12 21:58:11 -07:00
Mark Backman
ac4e39991e Update ai_services for OpenAI Realtime param inputs 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
c870832da6 types seem complete; some ws error handling 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
e782016c57 renamed a file 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
00badaf98e more pydantic cleanup 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
7dfac0163b bits of pydantic 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
09a3c2a82d major functionality working (not configurable, occasional timing bugs maybe) 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
c32c65014b definitely broke something in the pipeline 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
f082eb10a2 small cleanup 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
b8898e449e lots of debugging statements. multiple function calls broken 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
d1f6d229ca space exploration prompt 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
4fa0318005 configurability via constructor 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
93ebb9d541 working 19-openai-realtime-beta.py example 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
16101c79c5 beginning of realtime impl 2024-10-12 21:58:11 -07:00
Kwindla Hultman Kramer
c866b3f2c9 Merge pull request #572 from pipecat-ai/khk/fix-deepgram-settings
fix for Deepgram settings not merging properly
2024-10-12 20:07:04 -07:00
Mark Backman
c26a45721f Set inputs as Optional 2024-10-12 21:52:56 -04:00
Mark Backman
d9c900f872 Satisfy minimal text requirements for Cartesia and OpenAI 2024-10-12 21:27:37 -04:00
chadbailey59
73becbad29 fixed parallel async function calls bug (#569) 2024-10-12 17:45:24 -05:00
Aleix Conchillo Flaqué
f1df3de263 Merge pull request #560 from Allenmylath/patch-7
Update requirements.txt aiohttp missing
2024-10-12 14:52:24 -07:00
Aleix Conchillo Flaqué
3bc5c8cda7 Merge pull request #557 from Allenmylath/patch-4
Update env.example wrong tts service in env
2024-10-12 14:51:54 -07:00
Aleix Conchillo Flaqué
7b3b1058b2 Merge pull request #559 from Allenmylath/patch-6
Update server.py
2024-10-12 14:51:24 -07:00
Aleix Conchillo Flaqué
87473f857f Merge pull request #558 from Allenmylath/patch-5
Update env.example wrong tts
2024-10-12 14:50:52 -07:00
Aleix Conchillo Flaqué
a96209185c Merge pull request #546 from Allenmylath/patch-2
Update README.md
2024-10-12 14:46:15 -07:00
Aleix Conchillo Flaqué
34cc2ed1a1 Merge pull request #532 from nmaswood/nmaswood/format-logs
Format and Support Unicode for LLM Message Debug Logs
2024-10-12 14:42:58 -07:00
Aleix Conchillo Flaqué
667aa0c25a Merge pull request #542 from joachimchauvet/main
Update LiveKit audio transport for changes introduced in v0.0.42
2024-10-12 14:13:02 -07:00
Mark Backman
12707f4ff7 _settings needs to be Dict 2024-10-12 12:19:54 -04:00
Kwindla Hultman Kramer
53451899a7 fix for Deepgram settings not merging 2024-10-11 21:07:39 -07:00
Aleix Conchillo Flaqué
dc73b20c0b Merge pull request #451 from Canonical-AI-Inc/recording
Audio recording FrameProcessor
2024-10-11 13:48:19 -07:00
Adrian Cowham
4330374ba4 passing kwargs and forcing keyword-only arguments 2024-10-11 12:01:51 -07:00
Adrian Cowham
79c8aa2c4a ruff formatting 2024-10-11 11:35:02 -07:00
Adrian Cowham
083d221dd2 PR feedback 2024-10-11 11:29:01 -07:00
Mark Backman
74d47b725f Add table filtering 2024-10-11 14:10:47 -04:00
Adrian Cowham
917e482876 Merge branch 'main' into recording 2024-10-11 10:36:04 -07:00
Adrian Cowham
522d931950 better interruption handling by moving the processors after the transport output 2024-10-11 10:33:12 -07:00
Mark Backman
d10c7ac7ce Add Changelog entry 2024-10-11 13:28:34 -04:00
Mark Backman
84705427c5 Add code filtering option to MarkdownTextFilter 2024-10-11 11:11:58 -04:00
Aleix Conchillo Flaqué
66a76af341 Merge pull request #567 from pipecat-ai/aleix/prepare-0.0.43
update CHANGELOG for 0.0.43
2024-10-10 14:09:18 -07:00
Aleix Conchillo Flaqué
d402d91c2f update CHANGELOG for 0.0.43 2024-10-10 14:06:18 -07:00
Mark Backman
b05130a089 Merge pull request #566 from pipecat-ai/mb/make-markdown-modifiable
Mark the Markdown processor a util, and allow it to take inputs
2024-10-10 17:00:19 -04:00
Mark Backman
b3cc0779f0 Update the changelog 2024-10-10 16:49:20 -04:00
Mark Backman
cbecae40a9 Mark the Markdown processor a util, and allow it to take inputs 2024-10-10 16:43:48 -04:00
Mark Backman
5b8753c8b6 Add speak_code input param 2024-10-10 13:17:37 -04:00
Mark Backman
3c5f9457f1 More edge case improvements 2024-10-10 12:07:00 -04:00
Mark Backman
e32e56d0bc Merge pull request #565 from pipecat-ai/mb/add-markdown-remover
Add a new processor which removes markdown and special chars from TTS text
2024-10-10 07:16:42 -04:00
Mark Backman
788aec665b Add a new processor which removes markdown and special chars from TTS text 2024-10-10 07:11:31 -04:00
Mark Backman
3cada03a92 Merge pull request #564 from pipecat-ai/mb/bot-tts-text-urgent
Make bot-tts-text messages urgent
2024-10-08 19:26:46 -04:00
Mark Backman
e21fb520f9 Make bot-tts-text messages urgent 2024-10-08 17:07:08 -04:00
allenmylath
864f4d385f Update requirements.txt aiohttp missing
aiohttp is not included but uded in code
2024-10-08 16:39:25 +05:30
allenmylath
26ac2878ae Update server.py
desccription of fastapi sagrgumentparser wrongly shown as stroy teller instead of patient-intake
2024-10-08 15:18:26 +05:30
allenmylath
cac63f5565 Update env.example wrong tts
cartesian used in code but elevenlabs in .env example
2024-10-08 14:24:23 +05:30
allenmylath
aadffd6199 Update env.example wrong tts service in env
cartesian used in code but env got elevenlabs
2024-10-08 14:15:54 +05:30
Aleix Conchillo Flaqué
3403197a90 Merge pull request #552 from pipecat-ai/aleix/rtvi-user-llm-text
rtvi: add RTVIUserLLMTextProcessor
2024-10-07 08:33:29 -07:00
Aleix Conchillo Flaqué
8cdb9ab1ad rtvi: internal transport message should be urgent 2024-10-07 08:04:14 -07:00
Mark Backman
5dbf26d283 Handle cases where text is either a list or a string 2024-10-07 07:21:32 -04:00
Mark Backman
8001bab9b0 Remove another instance of urgent=true 2024-10-07 06:58:32 -04:00
Aleix Conchillo Flaqué
12d0686adc rtvi: rename bot-audio to bot-tts-audio 2024-10-06 16:50:55 -07:00
Aleix Conchillo Flaqué
a28a5e954a add TransportMessageSystemFrame 2024-10-06 16:50:12 -07:00
Aleix Conchillo Flaqué
bb966a89d2 rtvi: add RTVIUserLLMTextProcessor 2024-10-06 01:05:58 -07:00
Aleix Conchillo Flaqué
4a74eb3321 use isinstance tuples 2024-10-06 00:45:27 -07:00
Aleix Conchillo Flaqué
1f54ee6991 pyproject: update deepgram to 3.7.3 2024-10-06 00:40:47 -07:00
Allenmylath
40af3571f0 Create Frames.md
Made asmall explanation for diffrent types of frames in pipcat
2024-10-05 22:04:03 +05:30
joachimchauvet
86143f79a1 use new InputAudioRawFrame and OutputAudioRawFrame 2024-10-05 14:17:27 +03:00
joachimchauvet
b373bc82b5 match behavior of Daily's on_first_participant_joined 2024-10-05 14:17:27 +03:00
Mark Backman
ea2a05a04b Merge pull request #545 from pipecat-ai/mb/fix-language-handling
Improve language string handling for TTS services
2024-10-04 10:03:06 -04:00
Mark Backman
5692ca586c Merge pull request #547 from pipecat-ai/mb/update-test-requirements
Update fastapi version in test-requirements.txt
2024-10-04 08:28:05 -04:00
Mark Backman
a11ad81f02 Update fastapi version in test-requirements.txt 2024-10-04 07:35:48 -04:00
Allenmylath
805efdb144 Update README.md
the description provided is that of simple chatbot and also the video of simple chatbot hence changed
2024-10-04 10:19:38 +05:30
Mark Backman
c49b31e6ad Add CHANGELOG entry 2024-10-03 23:13:59 -04:00
Mark Backman
7796a272ce Improve language handling for TTS services 2024-10-03 23:09:27 -04:00
Adrian Cowham
678e87fd31 comment back in some code 2024-10-03 14:12:23 -07:00
Adrian Cowham
4d81a2ebfe nuked the code that marks user audio in favor for InputAudioRawFrame. also moving to stereo instead of mono with the human and bot on their own channel. 2024-10-03 14:10:03 -07:00
Adrian Cowham
2d82702e04 merge from main 2024-10-03 09:42:06 -07:00
Mark Backman
27dcf83f37 Merge pull request #543 from pipecat-ai/mb/fix-deepgram-stt-language
Deepgram: disconnect and reconnect on language change
2024-10-03 12:40:27 -04:00
Mark Backman
72db83528d Update changelog 2024-10-03 12:37:26 -04:00
Mark Backman
45c7d36b2e Deepgram: disconnect and reconnect on language change 2024-10-03 12:31:42 -04:00
Aleix Conchillo Flaqué
65eeb0f1f6 Merge pull request #540 from pipecat-ai/cb/interruption-fix
Fixed RTVI `tts:interrupt` action not interrupting
2024-10-02 13:46:52 -07:00
Aleix Conchillo Flaqué
1d7d0bb1ea Merge pull request #539 from pipecat-ai/aleix/pipecat-0.0.42-fixes
pipecat 0.0.42 fixes
2024-10-02 13:34:28 -07:00
Aleix Conchillo Flaqué
598936bc53 services: apply service language code before using service 2024-10-02 13:30:01 -07:00
Chad Bailey
b1bf6f7733 fixed botinterruptionframe 2024-10-02 19:43:51 +00:00
Aleix Conchillo Flaqué
75d27aeb9f examples(storytelling): update packages 2024-10-02 12:00:00 -07:00
Aleix Conchillo Flaqué
0a37caf4b4 openai: fix image json logging 2024-10-02 11:57:50 -07:00
Aleix Conchillo Flaqué
6db65f4335 cartesia: use model_name instead of model_id 2024-10-02 11:57:36 -07:00
Aleix Conchillo Flaqué
3648874301 gladia: fix languages 2024-10-02 11:57:25 -07:00
Aleix Conchillo Flaqué
8bcb5d7fd2 services: async generators should yield frames 2024-10-02 11:57:08 -07:00
Aleix Conchillo Flaqué
8c01a900cd google: allow using GOOGLE_APPLICATION_CREDENTIALS 2024-10-02 11:56:01 -07:00
Mark Backman
d378e699d2 Merge pull request #538 from Allenmylath/patch-2
Update env.example for wrong tts
2024-10-02 12:53:50 -04:00
Mark Backman
c25c375c41 Merge pull request #537 from pipecat-ai/mb/fix-nested-strings
Fix nested strings issue
2024-10-02 12:39:00 -04:00
Allenmylath
70c3ff31fd Update env.example
elevenlabs is not used in code instead cartesian is used hence changed
2024-10-02 21:59:51 +05:30
Mark Backman
cd2e29f285 Fix nested strings issue 2024-10-02 12:26:30 -04:00
Aleix Conchillo Flaqué
6d4d7d763d Merge pull request #534 from pipecat-ai/aleix/prepare-0.0.42
update CHANGELOG for 0.0.42
2024-10-02 08:36:32 -07:00
Aleix Conchillo Flaqué
6c1851eef8 update CHANGELOG for 0.0.42 2024-10-02 08:36:17 -07:00
Mark Backman
096a15eef6 Merge pull request #527 from pipecat-ai/mb/google-tts-inputs
Further consolidate service update settings into a single ServiceUpdateSettingsFrame class
2024-10-02 11:13:25 -04:00
Mark Backman
3d642df2b0 Revert aligning voice_id name in TTS service constructor 2024-10-02 11:07:48 -04:00
Mark Backman
d75a02dc51 Use Language enum and set languages accordingly 2024-10-01 21:03:01 -04:00
Mark Backman
28643b453d Update to use LLM, STT, TTS subclasses and remove setter methods 2024-10-01 20:30:27 -04:00
Nasr Maswood
d5635de5f6 add new lines and unicode to JSON debug logs 2024-10-01 13:31:58 -04:00
Mark Backman
88cca7bf68 Consolidate service UpdateSettingsFrame into a single ServiceUpdateSettingsFrame 2024-10-01 11:01:04 -04:00
Mark Backman
a397b859fe Add support for gender and google_style inputs to Google TTS 2024-10-01 10:39:45 -04:00
Kwindla Hultman Kramer
8aae4e9856 Merge pull request #531 from pipecat-ai/khk/function-calling-improvements 2024-10-01 07:23:38 -07:00
Kwindla Hultman Kramer
92d8b37229 implement vision for openai 2024-09-30 21:49:29 -07:00
Kwindla Hultman Kramer
0801fc578b Merge pull request #530 from pipecat-ai/khk/tts-say-fix
fix for multi-sentence tts say utterances
2024-09-30 20:59:53 -07:00
Kwindla Hultman Kramer
0d5cb84531 function calling testing and improvements 2024-09-30 20:59:28 -07:00
Kwindla Hultman Kramer
47b943a117 Merge pull request #522 from pipecat-ai/rebase-openai-multi-function-call
Handle parallel function calls for OpenAI LLMs
2024-09-30 16:23:37 -07:00
Kwindla Hultman Kramer
128355add5 fix for multi-sentence tts say utterances 2024-09-30 16:19:31 -07:00
Kwindla Hultman Kramer
0499fe41e4 get rid of some debug log lines used during development 2024-09-30 16:08:33 -07:00
Kwindla Hultman Kramer
6ad3437fd2 throw error if the llm tries to call a function that's not registered 2024-09-30 16:08:33 -07:00
Kwindla Hultman Kramer
a5c73ec829 handle openai multiple function calls 2024-09-30 16:08:30 -07:00
JeevanReddy
def04ac0ce openai can give multiple tool calls, current implementation assumes only one function call at a time. Fixed this to handle multiple function calls. 2024-09-30 16:07:56 -07:00
Kwindla Hultman Kramer
5d63615b1b Merge pull request #528 from pipecat-ai/khk/sentence-splits
TTS sentence aggregation fix
2024-09-30 16:07:21 -07:00
Kwindla Hultman Kramer
90ee284fe0 Merge pull request #520 from pipecat-ai/khk/context-frame-push
pushing context frames from assistant aggregators
2024-09-30 16:06:54 -07:00
Kwindla Hultman Kramer
539e0b66fb small fix as per aleix 2024-09-30 16:05:32 -07:00
Kwindla Hultman Kramer
fef393dcac assistant aggregator switch for space padding or not 2024-09-30 16:05:32 -07:00
Kwindla Hultman Kramer
ed607d5c4b typo fix 2024-09-30 16:05:32 -07:00
Kwindla Hultman Kramer
37da7e44cd whitespace fix 2024-09-30 16:05:32 -07:00
Kwindla Hultman Kramer
69c7edd60c pushing context frames from assistant aggregators 2024-09-30 16:05:28 -07:00
Aleix Conchillo Flaqué
392f210371 Merge pull request #524 from pipecat-ai/aleix/everything-is-async
all frame processors are asynchrnous
2024-09-30 15:59:03 -07:00
Mark Backman
9a63df1ea1 Merge pull request #529 from pipecat-ai/mb/daily-python-0-11-0
Update daily-python to 0.11.0
2024-09-30 18:29:27 -04:00
Mark Backman
f8a75cede9 Update daily-python to 0.11.0 2024-09-30 18:22:38 -04:00
Aleix Conchillo Flaqué
4d1e370e02 pipeline(task): since everything is async tasks should wait for EndFrame 2024-09-30 15:11:21 -07:00
Aleix Conchillo Flaqué
d080a31a5c tests: fix langchanin tests 2024-09-30 15:11:21 -07:00
Aleix Conchillo Flaqué
a90ebdfe7c syncparallelpipeline: fix now that all frames are asynchronous 2024-09-30 15:11:21 -07:00
Aleix Conchillo Flaqué
c8995b82e5 all frame processors are asynchrnous
In this commit we make all frame processors asynchronous, that is, they have an
internal queue and they push frames using a task from that queue.
2024-09-30 15:11:21 -07:00
Kwindla Hultman Kramer
6b7f924af6 tts sentence aggregation fix 2024-09-30 14:33:08 -07:00
Mark Backman
51580e5349 Merge pull request #526 from pipecat-ai/mb/google-tts-lang-update
Set Google TTS default language to en-US
2024-09-30 15:32:43 -04:00
Mark Backman
ed49cebf2c Set Google TTS default language to en-US 2024-09-30 15:16:46 -04:00
Mark Backman
46ac76701e Merge pull request #517 from pipecat-ai/mb/update-settings-frame
Consolidate update frames classes into a single UpdateSettingsFrame class
2024-09-30 12:56:45 -04:00
Mark Backman
1f77863aef Code review feedback 2024-09-30 12:50:40 -04:00
Mark Backman
d7555609fd Add TTS update settings options 2024-09-30 12:50:40 -04:00
Mark Backman
7fe118ce63 Align use of language param across TTS services 2024-09-30 12:50:40 -04:00
Mark Backman
44a349386c Consolidate update frames classes into a single UpdateSettingsFrame class 2024-09-30 12:50:39 -04:00
Mark Backman
97cba92fa5 Merge pull request #516 from pipecat-ai/mb/google-tts
Add Google TTS
2024-09-30 12:25:16 -04:00
Aleix Conchillo Flaqué
d9b16d4f73 services: import cosmetics 2024-09-27 13:32:27 -07:00
Aleix Conchillo Flaqué
50b6580fbb livekit: add license notice 2024-09-27 13:28:33 -07:00
Mark Backman
e7548f9494 Code review feedback 2024-09-27 08:02:44 -04:00
Mark Backman
830d2df671 Add Google TTS 2024-09-27 07:36:20 -04:00
Aleix Conchillo Flaqué
13b50a07db Merge pull request #515 from pipecat-ai/aleix/rtvi-frame-processors
RTVI frame processors
2024-09-27 00:48:09 -07:00
Aleix Conchillo Flaqué
4501dca133 Merge pull request #467 from joachimchauvet/main
Add LiveKit audio transport
2024-09-26 22:58:25 -07:00
Aleix Conchillo Flaqué
2c8e566507 rtvi: update version to 0.2 2024-09-26 22:42:36 -07:00
Aleix Conchillo Flaqué
6e8a202107 rtvi: fix handling transport messages 2024-09-26 22:42:19 -07:00
Aleix Conchillo Flaqué
2a05cd35b0 rtvi: add multiple RTVI frame processors 2024-09-26 22:42:08 -07:00
Mark Backman
55a70cde8f Merge pull request #514 from pipecat-ai/mb/aws-polly-tts
Add AWS Polly TTS support
2024-09-26 22:20:13 -04:00
Mark Backman
706c00d897 Code review feedback 2024-09-26 22:13:37 -04:00
Aleix Conchillo Flaqué
d323ea9e95 async_generator: keep pushing frames downstream 2024-09-26 16:44:49 -07:00
Aleix Conchillo Flaqué
b8ece84c6e services: super should be super() 2024-09-26 10:39:26 -07:00
Mark Backman
a018112a13 Merge pull request #510 from pipecat-ai/mb/deepgram-tts-http
Improve usability of Deepgram TTS: use Deepgram client, remove aiohttp
2024-09-26 13:38:42 -04:00
Mark Backman
d3a477902b Add changelog entry 2024-09-26 13:35:59 -04:00
Mark Backman
298b151486 Add setter methods 2024-09-26 13:35:59 -04:00
Mark Backman
6a6ea251ae Add AWS Polly TTS support 2024-09-26 13:35:59 -04:00
Aleix Conchillo Flaqué
c7c709a0a7 github: cache venv when running tests 2024-09-26 10:32:22 -07:00
Aleix Conchillo Flaqué
6ac57b4854 Merge pull request #494 from badbye/full-width-punctuations
add full-width punctuations as end of the sentence
2024-09-26 10:17:10 -07:00
Aleix Conchillo Flaqué
f5e0b946c7 services(cartesia): fix string formatting 2024-09-26 09:08:37 -07:00
Mark Backman
b1818cc370 Merge pull request #435 from golbin/main
Add speed and emotion options for Cartesia.
2024-09-26 07:14:59 -04:00
Jin Kim
d05717a1bd Apply Ruff formater 2024-09-26 19:52:25 +09:00
Aleix Conchillo Flaqué
d11daee31a Merge pull request #509 from pipecat-ai/aleix/frameprocessor-event-handlers
frame processor event handlers
2024-09-25 19:50:30 -07:00
Mark Backman
73da8c1910 Improve usability of Deepgram TTS: use Deepgram client, remove aiohttp 2024-09-25 22:43:10 -04:00
Aleix Conchillo Flaqué
f06aa300d0 rtvi: add on_bot_ready event 2024-09-25 16:52:18 -07:00
Aleix Conchillo Flaqué
c4e94e280e processors: add support for event handlers 2024-09-25 16:35:33 -07:00
Kwindla Hultman Kramer
8f2941c575 Merge pull request #492 from pipecat-ai/khk/flush-more-audio
add calls to flush_audio for say() and rtvi action
2024-09-25 12:35:50 -07:00
joachimchauvet
447baad5c3 update send_metrics() to support changes introduced in #474 2024-09-25 21:38:55 +03:00
Mark Backman
2703813e8a Merge pull request #496 from pipecat-ai/mb/azure-tts-inputs
Add Azure TTS input params
2024-09-25 14:38:01 -04:00
Mark Backman
521e152150 Merge pull request #495 from pipecat-ai/mb/elevenlabs-input-lang
Add language_code support for ElevenLabs TTS
2024-09-25 14:37:44 -04:00
Kwindla Hultman Kramer
3d43ad0f4d actually save the file 2024-09-25 10:59:00 -07:00
Kwindla Hultman Kramer
3621fceae2 fixes as noted by aleix 2024-09-25 09:19:58 -07:00
Aleix Conchillo Flaqué
e123f33c03 Merge pull request #506 from pipecat-ai/aleix/async-generator-processor
processors: add AsyncGeneratorProcessor
2024-09-25 00:04:09 -07:00
Aleix Conchillo Flaqué
b8713666c2 processors: add AsyncGeneratorProcessor 2024-09-25 00:01:04 -07:00
Aleix Conchillo Flaqué
cf0ab85e2c Merge pull request #505 from pipecat-ai/aleix/init-task-variables
initialize task variables and add minor description
2024-09-24 23:59:38 -07:00
Aleix Conchillo Flaqué
8502c7c801 Merge pull request #504 from pipecat-ai/aleix/rtvi-handle-frame
rtvi: add RTVIProcessor.handle_message()
2024-09-24 23:59:26 -07:00
Aleix Conchillo Flaqué
e89814dc6b Merge pull request #503 from pipecat-ai/aleix/end-cancel-task-frames
frames: add EndTaskFrame and CancelTaskFrame
2024-09-24 23:59:10 -07:00
Aleix Conchillo Flaqué
9461bacf0d pyproject: update fastapi to 0.115.0 2024-09-24 19:24:37 -07:00
Aleix Conchillo Flaqué
e276dcbab7 initialize task variables and add minor description 2024-09-24 19:19:00 -07:00
Aleix Conchillo Flaqué
1a3de0e819 rtvi: add RTVIProcessor.handle_message() 2024-09-24 19:12:06 -07:00
Aleix Conchillo Flaqué
ee3786fe15 frames: add EndTaskFrame and CancelTaskFrame 2024-09-24 19:10:22 -07:00
Aleix Conchillo Flaqué
31b5667cee frames: log text with [] so we can distinguish spaces better 2024-09-24 13:10:40 -07:00
Aleix Conchillo Flaqué
a483f1a083 rtvi: handle all actions from the action task 2024-09-24 10:48:15 -07:00
Aleix Conchillo Flaqué
2ecec1c9f8 Merge pull request #500 from pipecat-ai/aleix/rtvi-action-frames-task
RTVI action frames task
2024-09-24 10:13:43 -07:00
Aleix Conchillo Flaqué
08ac311971 rtvi: use task to process incoming action frames 2024-09-24 09:36:53 -07:00
Aleix Conchillo Flaqué
cb49b6a0d6 rtvi: add llm-text and tts-text server messages 2024-09-24 09:36:43 -07:00
Aleix Conchillo Flaqué
016da177db Merge pull request #499 from mercuryyy/main
Fix syntax error in deepgram.py
2024-09-24 09:10:05 -07:00
joachimchauvet
ec5998bc36 remove _internal_push_frame from LiveKitInputTransport 2024-09-24 14:54:37 +03:00
mercuryyy
b1e17ee347 Fix syntax error in deepgram.py 2024-09-24 07:45:29 -04:00
joachimchauvet
b6e1d6e6ae format with ruff 2024-09-24 10:21:02 +03:00
joachimchauvet
fa609f1afc adjust output sample rate and create user token 2024-09-24 10:16:54 +03:00
joachimchauvet
470b5eafe7 move tenacity imports inside try block 2024-09-24 10:16:54 +03:00
joachimchauvet
2e5b0c1d6b add tenacity dependency 2024-09-24 10:16:54 +03:00
joachimchauvet
a9390d96a1 add LiveKit audio transport 2024-09-24 10:16:54 +03:00
Mark Backman
8ee9621d66 Add setter functions 2024-09-23 21:12:01 -04:00
Jin Kim
49f2123893 Apply and Fix upstream changes for Cartesia 2024-09-24 07:59:26 +09:00
Jin Kim
cf72129852 Merge remote-tracking branch 'upstream/main' 2024-09-24 07:18:22 +09:00
Mark Backman
8edee8155d Add input params to Azure TTS 2024-09-23 17:52:23 -04:00
chadbailey59
c262b272fa Added RTVIActionFrame (#464)
* added RTVIActionFrame

* server-sent events

* reverted log changes

* fixup
2024-09-23 14:51:17 -05:00
Aleix Conchillo Flaqué
9ef9c1c58a Merge pull request #497 from pipecat-ai/aleix/ruff-formater
introduce Ruff formatting
2024-09-23 10:42:54 -07:00
Aleix Conchillo Flaqué
c7ff79a652 processors: fix formatting string 2024-09-23 09:53:37 -07:00
Aleix Conchillo Flaqué
da81df5284 github: install dev-requirements when running tests 2024-09-23 09:53:37 -07:00
Aleix Conchillo Flaqué
a4420dc88b README: add vscode and emacs ruff instructions 2024-09-23 09:53:37 -07:00
Aleix Conchillo Flaqué
eeb8338dce introduce Ruff formatting 2024-09-23 09:53:37 -07:00
Cyril S.
dfa4ac81fd Implement Sentry instrumentation for performance and error tracking (#470)
* feat: Add Sentry support in FrameProcessor

This update add optional Sentry integration for performance tracking and error monitoring.

Key changes include:

- Add conditional Sentry import and initialization check
- Implement Sentry spans in FrameProcessorMetrics to measure TTFB (Time To First Byte) and processing time when Sentry is available
- Maintain existing metrics functionality with MetricsFrame regardless of Sentry availability

* feat: Enable metrics in DeepgramSTTService for Sentry

This commit enhances the DeepgramSTTService class to enable metrics generation for use with Sentry.

Key changes include:

1. Enable general metrics generation:
   - Implement `can_generate_metrics` method, returning True when VAD is enabled
   - This allows metrics to be collected and used by both Sentry and the metrics system in frame_processor.py

2. Integrate Sentry-compatible performance tracking:
   - Add start_ttfb_metrics and start_processing_metrics calls in the VAD speech detection handler
   - Implement stop_ttfb_metrics call when receiving transcripts
   - Add stop_processing_metrics for final transcripts

3. Enhance VAD support for metrics:
   - Add `vad_enabled` property to check VAD event availability
   - Implement VAD-based speech detection handler for precise metric timing

These changes enable detailed performance tracking via both Sentry and the general metrics system when VAD is active. This allows for better monitoring and analysis of the speech-to-text process, providing valuable insights through Sentry and any other metrics consumers in the pipeline.

* Update frame_processor.py

* Refactor to support flexible metrics implementation

- Modified the __init__ method to accept a metrics parameter that is either FrameProcessorMetrics or one of its subclasses
- Updated the metrics initialization to create an instance with the processor's name
- Moved all FrameProcessorMetrics-related logic to a new processors\metrics\base.py file

* Implement flexible metrics system with Sentry integration

1. Created a new metrics module in processors/metrics/

2. Implemented FrameProcessorMetrics base class in base.py:

3. Implemented SentryMetrics class in sentry.py:
   - Inherits from FrameProcessorMetrics
   - Integrates with Sentry SDK for advanced metrics tracking
   - Implements Sentry-specific span creation and management for TTFB and processing metrics
   - Handles cases where Sentry is not available or initialized
2024-09-23 08:44:14 -07:00
Lewis Wolfgang
ea16dca8aa Merge pull request #469 from pipecat-ai/lewis/remove_torch_dependency
Remove torch dependency for using silero_vad
2024-09-23 09:59:40 -04:00
Mark Backman
306632b29a Add language_code support for ElevenLabs TTS 2024-09-23 09:01:02 -04:00
duyalei
4533ed014f add full-width punctuations as end of the sentence 2024-09-23 16:35:00 +08:00
Jin Kim
68cc4186ad Merge remote-tracking branch 'upstream/main' 2024-09-23 16:34:31 +09:00
Mark Backman
9a4e749c7c Merge pull request #491 from pipecat-ai/mb/elevenlabs-inputs
Add voice_settings and optimize_streaming_latency to ElevenLabs
2024-09-22 21:54:21 -04:00
Mark Backman
55c645c614 Add voice_settings and optimize_streaming_latency to ElevenLabs 2024-09-22 13:58:50 -04:00
Mark Backman
a1024bb365 Merge pull request #490 from pipecat-ai/mb/llm-rtvi-service-option
Add control frames for LLM param updates
2024-09-21 20:10:17 -04:00
Mark Backman
dfc82c3ba4 Merge pull request #486 from pipecat-ai/mb/llm-extra-params
Add extra input param to LLMs
2024-09-21 18:25:47 -04:00
Mark Backman
9e27a8aad0 Add control frames for LLM param updates 2024-09-21 00:02:58 -04:00
Mark Backman
c73111afea Add extra input param to LLMs 2024-09-21 00:01:25 -04:00
Kwindla Hultman Kramer
26a64afd8d Merge pull request #485 from pipecat-ai/khk/metrics-model-exclude-none
fixup for serialization issue
2024-09-20 18:24:19 -07:00
Kwindla Hultman Kramer
78a3f081de fixup for serialization issue 2024-09-20 18:21:06 -07:00
Mark Backman
e8f8a49646 Merge pull request #484 from pipecat-ai/mb/llm-input-params
Add input params for OpenAI, Anthropic, Together AI LLMs
2024-09-20 20:35:49 -04:00
Mark Backman
219304c5ee Added Changelog entries 2024-09-20 20:31:42 -04:00
Mark Backman
f3fd312b83 Add Together AI interruptible example 2024-09-20 20:21:19 -04:00
Mark Backman
357e66d64d Input params for Together AI LLM 2024-09-20 20:21:19 -04:00
Mark Backman
4fa1ea8c4b Input params for Anthropic LLM 2024-09-20 20:21:19 -04:00
Mark Backman
3b81cd462d Input params to OpenAI LLM 2024-09-20 20:21:19 -04:00
Aleix Conchillo Flaqué
14acf05a26 Merge pull request #480 from pipecat-ai/aleix/input-output-frames
introduce input/output audio and image frames
2024-09-20 14:44:37 -07:00
Mattie Ruth
58d9c84bc9 Merge pull request #474 from pipecat-ai/ruthless/improve-metrics-types-2
Ruthless/improve metrics types 2
2024-09-20 09:47:24 -04:00
Aleix Conchillo Flaqué
7e39d9ad3d introduce input/output audio and image frames
We now distinguish between input and output audio and image frames. We introduce
`InputAudioRawFrame`, `OutputAudioRawFrame`, `InputImageRawFrame` and
`OutputImageRawFrame` (and other subclasses of those). The input frames usually
come from an input transport and are meant to be processed inside the pipeline
to generate new frames. However, the input frames will not be sent through an
output transport. The output frames can also be processed by any frame processor
in the pipeline and they are allowed to be sent by the output transport.
2024-09-19 23:11:03 -07:00
mattie ruth backman
a4edb3dab1 Cleanup on aisle METRICS. Note: See below, this is a breaking change
1. Fleshed out MetricsFrames and broke it into a proper set of types
2. Add model_name as a property to the AIService so that it can be
   automatically included in metrics and also remove that
   overhead from all the various services themselves

Breaking change!

Because of the types improvements, the MetricsFrame type has
changed. Each frame will have a list of metrics simlilar to before
except each item in the list will only contain one type of metric:
"ttfb", "tokens", "characters", or "processing". Previously these
fields would be in every entry but set to None if they didn't apply.

While this changes internal handling of the MetricsFrame, it does NOT
break the RTVI/daily messaging of metrics. That format remains the same.

Also. Remember to use model_name for accessing a service's current
model and set_model_name for setting it.
2024-09-19 21:30:34 -04:00
Mattie Ruth
ed409d0460 Merge pull request #478 from pipecat-ai/ruthless/get-tests-running
Ruthless/get tests running
2024-09-19 21:01:27 -04:00
mattie ruth backman
50b45ac2da get the test infrastructure running again
disable broken tests for now
2024-09-19 20:58:17 -04:00
Kwindla Hultman Kramer
29bcbc68c5 Merge pull request #479 from pipecat-ai/khk/small-fixes
fix small issues that crept into main
2024-09-19 17:25:27 -07:00
Kwindla Hultman Kramer
affbe9ac7d fix small issues that crept into main 2024-09-19 17:17:33 -07:00
Aleix Conchillo Flaqué
1790fa452f Merge pull request #436 from pipecat-ai/aleix/frameprocessor-single-task
introduce synchronous and asynchronous frame processors
2024-09-19 11:22:56 -07:00
Aleix Conchillo Flaqué
607a246572 updated CHANGELOG with sync/async frame processors 2024-09-19 01:32:17 -07:00
Aleix Conchillo Flaqué
4f1b06e6b2 pipeline: renamed ParallelTask to SyncParallelPipeline 2024-09-19 01:32:17 -07:00
Aleix Conchillo Flaqué
62e9a33a70 examples: use CartesiaHttpTTSService to synchronize frames 2024-09-19 01:32:17 -07:00
Aleix Conchillo Flaqué
3298f935ef services(fal,moondream): add missing **kwargs 2024-09-19 01:32:17 -07:00
Aleix Conchillo Flaqué
0e8f56c752 services: move TTSService push_stop_frames to AsyncTTSService 2024-09-19 01:32:15 -07:00
Aleix Conchillo Flaqué
8224538372 services(cartesia): added CartesiaHttpTTSService 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
fbf6eef68f transports(base_output): wait for sink tasks before canceling audio/video tasks 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
f078d156de frames: StartFrame is now a SystemFrame 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
23d6eed5ea transports: input()/output() return subclass instead of base class 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
0ed3d118d6 services(moondream); update revision to 2024-08-26 2024-09-19 01:31:12 -07:00
Aleix Conchillo Flaqué
337f048864 introduce synchronous and asynchronous frame processors
Pipecat has a pipeline-based architecture. The pipeline consists of frame
processors linked to each other. The elements travelling across the pipeline are
called frames.

To have a deterministic behavior the frames travelling through the pipeline
should always be ordered, except system frames which are out-of-band frames. To
achieve that, each frame processor should only output frames from a single task.

There are synchronous and asynchronous frame processors. The synchronous
processors push output frames from the same task that they receive input frames,
and therefore only pushing frames from one task. Asynchrnous frame processors
can have internal tasks to perform things asynchrnously (e.g. receiving data
from a websocket) but they also have a single task where they push frames from.
2024-09-19 01:31:10 -07:00
Mark Backman
6f3c421621 Merge pull request #475 from pipecat-ai/mb/tts-sample-rate
Add sample_rate setting to TTS services
2024-09-18 14:59:09 -04:00
Mark Backman
eadd68d40b Add sample_rate setting to TTS services 2024-09-18 14:50:20 -04:00
Lewis Wolfgang
71202e3cd5 Remove torch dependency for using silero_vad 2024-09-17 16:48:52 -04:00
Jin Kim
75008d8f11 Add speed and emotion setting method to Cartesia TTS service 2024-09-18 00:51:45 +09:00
Jin Kim
2da0ecbe3c Revert "model_id" as a main argument 2024-09-18 00:38:12 +09:00
Jin Kim
c7f814b2dc Merge remote-tracking branch 'upstream/main' 2024-09-18 00:33:29 +09:00
Aleix Conchillo Flaqué
13a4a05388 Merge pull request #466 from pipecat-ai/aleix/elevenlabs-cartesia-close-websocket-first
services(cartesia,elevenlabs): close websocket before the receiving task
2024-09-16 23:55:28 -07:00
Aleix Conchillo Flaqué
20c019ae16 services(cartesia,elevenlabs): close websocket before the receiving task 2024-09-16 23:54:21 -07:00
Adrian Cowham
387a36dd8a missed a debug print statement 2024-09-16 17:43:42 -07:00
Aleix Conchillo Flaqué
d9d6571c73 Merge pull request #465 from kunal-cai/ks--fix-ws
[Cartesia] Fix streaming truncation bug with Twilio Fast API WS
2024-09-16 17:17:13 -07:00
Kunal Shah
540cad4844 Undo sorting 2024-09-16 16:07:19 -07:00
Kunal Shah
0a26b650c0 Undo sorting 2024-09-16 16:06:25 -07:00
Kunal Shah
adaac003e5 [Cartesia] Fix streaming truncation bug with Twilio Fast API WS 2024-09-16 15:59:06 -07:00
Adrian Cowham
2e02ab740d PR feedback 2024-09-15 20:59:17 -07:00
Aleix Conchillo Flaqué
3d4f125071 Merge pull request #454 from pipecat-ai/aleix/initial-pipeline-clock-support
initial pipeline clock support
2024-09-13 13:51:04 -07:00
Aleix Conchillo Flaqué
bce87f8717 update CHANGELOG.md 2024-09-13 13:50:03 -07:00
Aleix Conchillo Flaqué
1fe940bd6b servceis(cartesia,elevenlabs): use word start times instead 2024-09-13 13:10:44 -07:00
Aleix Conchillo Flaqué
cb36a71381 fix some linting 2024-09-13 09:56:12 -07:00
Aleix Conchillo Flaqué
5acc4928fe examples: add 07d-interruptible-elevenlabs.py 2024-09-13 09:43:18 -07:00
Aleix Conchillo Flaqué
434493b8aa services(elevenlabs): implement word-by-word support through websockets 2024-09-13 09:31:35 -07:00
Aleix Conchillo Flaqué
f08b25dbb2 examples: assistant aggregator should always goes after transport 2024-09-12 00:37:34 -07:00
Aleix Conchillo Flaqué
3665734972 transports(output): initial sink clock synchronization 2024-09-12 00:37:34 -07:00
Aleix Conchillo Flaqué
a98d78cdea services(lmnt): change to subclass of AsyncTTSService 2024-09-12 00:37:34 -07:00
Aleix Conchillo Flaqué
80f6d74e80 services(cartesia): change to subclass of AsyncWordTTSService 2024-09-12 00:37:34 -07:00
Aleix Conchillo Flaqué
02d926e9bd services: create AsyncTTSService and AsyncWordTTSService 2024-09-12 00:31:48 -07:00
Aleix Conchillo Flaqué
7749692f72 processors: get pipeline clock from StartFrame 2024-09-12 00:31:48 -07:00
Aleix Conchillo Flaqué
7807cbeb39 pipeline(task): add a clock to the pipeline task 2024-09-12 00:31:48 -07:00
Aleix Conchillo Flaqué
72f231b327 frames: add a presentation timestamp (pts) to each frame 2024-09-12 00:31:48 -07:00
Aleix Conchillo Flaqué
3cbe97d346 clocks: added new BaseClock and SystemClock 2024-09-12 00:31:48 -07:00
Adrian Cowham
b4eff2028f Merge branch 'main' into recording 2024-09-10 10:18:57 -07:00
Adrian Cowham
f411bf33fd adding a frame processor with the ability to save a conversation to a buffer and another frame processor to upload audio to Canonical for evaluation and metrics collection. Examples included 2024-09-10 10:15:48 -07:00
Kwindla Hultman Kramer
b880e1a60e Merge pull request #448 from pipecat-ai/khk/aggregation-leading-space
fix for leading space in context aggregator strings
2024-09-10 09:57:35 -07:00
Aleix Conchillo Flaqué
886046e696 Merge pull request #445 from dleybz/patch-1
Update requirements.txt
2024-09-09 17:54:33 -07:00
Aleix Conchillo Flaqué
9106a5f8ae Merge pull request #449 from pipecat-ai/aleix/audio-out-bitrate
transports(daily): allow setting audio output bitrate (default 96kpbs)
2024-09-09 08:39:06 -07:00
Aleix Conchillo Flaqué
98286336bf transports(daily): allow setting audio output bitrate (default 96kpbs)
Fixes #388
2024-09-08 19:39:17 -07:00
Jin Kim
fa0deededa Add voice options and make to use InputParams for Cartesia. 2024-09-09 10:53:23 +09:00
Kwindla Hultman Kramer
081b001c8b fix for leading space in context aggregator strings 2024-09-07 16:42:52 -07:00
Danny D. Leybzon
c92531a02f Update requirements.txt
request.form() throws an error if you don't have python-multipart installed
2024-09-06 20:22:18 +02:00
Aleix Conchillo Flaqué
748a7af602 update CHANGELOG.md 2024-09-05 19:05:29 -07:00
Aleix Conchillo Flaqué
f4a0de6327 Merge pull request #444 from pipecat-ai/aleix/elevenlabs-streaming
services(elevenlabs): add elevenlabs package and use streaming
2024-09-05 11:24:12 -07:00
Aleix Conchillo Flaqué
e405d7af9f services(elevenlabs): add elevenlabs package and use streaming 2024-09-05 11:20:01 -07:00
Aashraya
51cd7fd285 twiliohandle interruption (#422)
* add interuption handler in twilio serializer

* fix autopep8

* revert ruff autoformatting

* address pr comments

* change interruption frame to user started frame in serializer

* remove overrrident handle interrupt

* remove unused import

* change userstarted to interuption frame
2024-09-02 11:06:38 -07:00
Aleix Conchillo Flaqué
aba5f89174 Merge pull request #437 from soof-golan/soof-obj-id-generation
Generate ids with itertools.count
2024-09-02 10:53:48 -07:00
Soof Golan
5c0f5a1613 Generate ids with itertools.count
Avoids the critical section with threading.Lock in favor of itertools.count.

`count` objects are threadsafe, and their critical section is implemented in C and provide better performance that Python level locking.
2024-09-02 15:39:58 +02:00
Aleix Conchillo Flaqué
7c342f7ba2 Merge pull request #433 from pipecat-ai/aleix/process-all-startframes
StartFrame should be the first frame every processor receives
2024-08-30 14:17:38 -07:00
Aleix Conchillo Flaqué
37e2388758 StartFrame should be the first frame every processor receives
Fixes #427
2024-08-29 22:43:44 -07:00
Aleix Conchillo Flaqué
05f0492a8d Merge pull request #421 from pipecat-ai/aleix/improve-multi-lingual-support
improve multi lingual support
2024-08-29 13:19:40 -07:00
Aleix Conchillo Flaqué
c0ac5c6ae8 services(lmnt): fix example and update README and CHANGELOG 2024-08-29 11:11:24 -07:00
Aleix Conchillo Flaqué
be923687fb processors(rtvi): user decices if bot interrupts on update config 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
5f32fb125d updated CHANGELOG.md 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
ae6fbb3146 services: just set model, voice, language independently 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
864768635a services: add voice and language to set_model() 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
d7c9679977 services: allow TTSModelUpdateFrame to also update language and voice 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
fedfc366f6 services(deepgram): fix strenum values 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
b3b39626e1 services: allow switching STT language and mdoel at the same time 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
4e0ece17b6 services: added support for setting STT model and language 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
fd3fdacdee transcriptions: added more languages 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
a253606d50 services(daily): on_joined now returns all data not only participant 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
568d9dc0a3 services(whisper): inherit from SegmentedSTTService 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
6629b853c5 services(deepgram): inherit from STTService instead of AsyncAIService 2024-08-29 11:00:03 -07:00
Aleix Conchillo Flaqué
3931cb3235 services(cartesia): allow setting language and language voices 2024-08-29 11:00:01 -07:00
Aleix Conchillo Flaqué
38cd86ad52 services: added language to transcription frames 2024-08-29 10:59:02 -07:00
Aleix Conchillo Flaqué
c0cdabf61d frames: adde TTSLanguageUpdateFrame and TTSLanguageVoicesUpdateFrame 2024-08-29 10:59:02 -07:00
Aleix Conchillo Flaqué
51270a96c5 frames: add language to transcription frames 2024-08-29 10:59:02 -07:00
Kwindla Hultman Kramer
84d72c0d5c Merge pull request #425 from pipecat-ai/khk/rtvi-together-function-calling
fixup type mismatches between rtvi data structures and together.py
2024-08-28 13:11:52 -07:00
Aleix Conchillo Flaqué
79aca8169a Merge pull request #391 from sharvil/pr/add-lmnt
LMNT TTS
2024-08-27 21:40:46 -07:00
Kwindla Hultman Kramer
b9d362bd62 fixup type mismatches between rtvi data structures and together.py 2024-08-27 17:39:21 -07:00
Sharvil Nanavati
87c4a1bee1 Move stop frame task creation into TTSService.start 2024-08-27 04:45:21 +00:00
Sharvil Nanavati
c979762b70 Handle cancellation, stopping, and restarting 2024-08-27 01:24:00 +00:00
Sharvil Nanavati
1d92fc3199 Merge branch 'main' into pr/add-lmnt 2024-08-24 10:07:52 -07:00
Sharvil Nanavati
8ac7fb1a67 Use a single long-lived Task to push TTSStoppedFrame 2024-08-24 16:18:07 +00:00
Sharvil Nanavati
60c3d33def Default LMNT to 24kHz, add example 2024-08-24 15:40:29 +00:00
Sharvil Nanavati
8a39d3f4eb services: add a generic mechanism to produce TTSStoppedFrames 2024-08-24 15:40:12 +00:00
Aleix Conchillo Flaqué
e038767b6f Merge pull request #413 from pipecat-ai/aleix/pipecat-0.0.41
prepare pipecat 0.0.41
2024-08-22 17:01:43 -07:00
Aleix Conchillo Flaqué
0c46b3e481 prepare pipecat 0.0.41 2024-08-22 11:50:20 -07:00
Aleix Conchillo Flaqué
d42f072ff5 examples: fix studypal errors and update requirements 2024-08-22 11:50:05 -07:00
Aleix Conchillo Flaqué
9b6f29c24a Merge pull request #414 from pipecat-ai/aleix/add-livekit-dependency
added livekit dependency
2024-08-22 10:55:43 -07:00
Aleix Conchillo Flaqué
873d5dc23f added livekit dependency 2024-08-22 10:54:18 -07:00
Aleix Conchillo Flaqué
6d141fd47f Merge pull request #396 from nulyang/feat/livekit-serializers
Add livekit audio serializers
2024-08-22 10:44:24 -07:00
Aleix Conchillo Flaqué
c6f6cb2947 Merge pull request #412 from pipecat-ai/aleix/fastapi-variable-clash
transports(fastapi): fix variable name clash
2024-08-22 09:50:23 -07:00
Aleix Conchillo Flaqué
0eb189ce7f transports(fastapi): fix variable name clash 2024-08-22 08:50:03 -07:00
Sharvil Nanavati
f4fd7b7028 LMNT TTS 2024-08-22 00:47:41 +00:00
Aleix Conchillo Flaqué
21de8e0a35 transport(out): log bot started/stopped speaking 2024-08-21 17:23:44 -07:00
Aleix Conchillo Flaqué
6f55d494bd frames: use VADParams type in VADParamsUpdateFrame 2024-08-21 17:23:12 -07:00
Aleix Conchillo Flaqué
d216edc567 Merge pull request #409 from aashsach/anthropic-empty-tool-argument
handle empty parameters for anthropic function calling
2024-08-21 16:14:51 -07:00
Aashraya
ec6063ecc4 system is not a list, it is handled and assisgned as string 2024-08-21 16:31:50 +05:30
Aashraya
40fe4ce6fb handle empty parameters for anthropic function calling 2024-08-21 15:49:36 +05:30
Aleix Conchillo Flaqué
31d87a4048 update CHANGELOG.md for 0.0.40 2024-08-20 11:48:40 -07:00
Aleix Conchillo Flaqué
ac8b171fa9 Merge pull request #406 from pipecat-ai/hush/cartesiaDocs
Hush/cartesia docs
2024-08-20 11:17:52 -07:00
James Hush
1f06d78213 github: remove *requirements.txt from tests.yaml 2024-08-20 11:16:25 -07:00
James Hush
28eba17df8 docs: update Cartesia references 2024-08-20 11:13:13 -07:00
Aleix Conchillo Flaqué
dfc2e62339 Merge pull request #405 from pipecat-ai/aleix/revert-dailysettings-aliases
Revert "transports(daily): use aliases in DailyDialinSettings"
2024-08-20 08:53:31 -07:00
Aleix Conchillo Flaqué
80c89a39c9 processors(rtvi): add support for client-ready message (fix) 2024-08-20 07:54:11 -07:00
Aleix Conchillo Flaqué
9d1c16e996 Revert "transports(daily): use aliases in DailyDialinSettings"
This reverts commit 47d375309d.
2024-08-20 07:52:35 -07:00
Aleix Conchillo Flaqué
86604c2353 examples(studypal): use aiohttp instead of requests 2024-08-19 18:11:30 -07:00
Aleix Conchillo Flaqué
8f31a02938 Merge pull request #403 from yashn35/studypal-demo
Add studypal
2024-08-19 17:39:19 -07:00
Aleix Conchillo Flaqué
47d375309d transports(daily): use aliases in DailyDialinSettings 2024-08-19 17:27:43 -07:00
Yash Narayan
980265ca97 Add studypal 2024-08-19 16:58:29 -07:00
Aleix Conchillo Flaqué
90479fff95 processors(rtvi): add set_client_ready() 2024-08-19 16:41:43 -07:00
Aleix Conchillo Flaqué
1ce1fcb0ce Merge pull request #401 from pipecat-ai/aleix/use-cartesia-in-more-examples
examples: use Cartesia TTS in most examples
2024-08-19 16:07:35 -07:00
Aleix Conchillo Flaqué
1a662376fc examples: use Cartesia TTS in most examples 2024-08-19 15:31:34 -07:00
Aleix Conchillo Flaqué
1d24f926ec Merge pull request #400 from pipecat-ai/aleix/rtvi-client-ready
processors(rtvi): add support for client-ready message
2024-08-19 10:53:49 -07:00
Aleix Conchillo Flaqué
4f2c37c940 processors(rtvi): add support for client-ready message 2024-08-19 10:33:18 -07:00
Aleix Conchillo Flaqué
042115a6bb processors(rtvi): update initial config when sending bot-ready message 2024-08-19 09:32:27 -07:00
Aleix Conchillo Flaqué
c9f1469b41 transports(daily/helpers): add server error message to the logs 2024-08-19 08:44:05 -07:00
Aleix Conchillo Flaqué
54c9f604c9 updated CHANGELOG with VADParamsUpdateFrame 2024-08-18 21:20:40 -07:00
Kwindla Hultman Kramer
56fbcd6562 Merge pull request #397 from pipecat-ai/khk/rtvi-vad-params
VADParamsUpdateFrame and handling thereof
2024-08-18 21:14:58 -07:00
Kwindla Hultman Kramer
e6b0500568 make VADAnalyzer:set_params() public 2024-08-18 21:11:18 -07:00
Aleix Conchillo Flaqué
41038b6673 Merge pull request #394 from pipecat-ai/aleix/fix-function-calling-examples
fix function calling examples
2024-08-18 20:55:29 -07:00
Aleix Conchillo Flaqué
26d03f26c9 services(openai, anthropic): a None result should not run inference 2024-08-18 20:48:43 -07:00
Aleix Conchillo Flaqué
f3a4e54996 function calling: start callback should have function name first 2024-08-18 20:48:20 -07:00
Kwindla Hultman Kramer
925e80bb20 VADParamsUpdateFrame and handling thereof 2024-08-18 13:34:46 -07:00
nulyang
9bda09b1a8 serializers(livekit): Add audio serializers 2024-08-18 23:40:32 +08:00
Aleix Conchillo Flaqué
ef0d0531fa services: moved request_image_frame() to LLMService 2024-08-17 23:59:38 -07:00
Aleix Conchillo Flaqué
6520f20ffe fix function calling examples 2024-08-17 23:32:39 -07:00
Aleix Conchillo Flaqué
ebc4e0924b Merge pull request #387 from pipecat-ai/aleix/update-reqs-081624
update pyproject.toml and remove requirements files
2024-08-17 23:29:47 -07:00
Aleix Conchillo Flaqué
9e7c0e6033 Merge pull request #390 from sharvil/pr/websocket-fix
transports(websocket): fix `_audio_buffer` being accidentally overwritten
2024-08-17 23:26:35 -07:00
Aleix Conchillo Flaqué
cf5720f316 update CHANGELOG.md 2024-08-17 21:00:32 -07:00
Kwindla Hultman Kramer
655b468269 Merge pull request #393 from pipecat-ai/khk/anthropic-tools-ordering
fix for out-of-order image messages in anthropic context
2024-08-17 15:07:27 -07:00
Kwindla Hultman Kramer
17f8c93e44 fix for out-of-order image messages in anthropic context 2024-08-17 14:47:29 -07:00
Aleix Conchillo Flaqué
5b4061b0d5 processors(rtvi): fix send_error() 2024-08-16 23:46:57 -07:00
Aleix Conchillo Flaqué
6ce0227e98 processors(rtvi): error-response should always include and error 2024-08-16 23:23:55 -07:00
Aleix Conchillo Flaqué
a583a28850 processors(rtvi): error message should use error field 2024-08-16 23:22:27 -07:00
Aleix Conchillo Flaqué
32daf65adc processors(rtvi): send to the client if errors are fatal 2024-08-16 23:17:55 -07:00
Aleix Conchillo Flaqué
e22c80610e frames: add new FatalErrorFrame 2024-08-16 23:17:31 -07:00
Sharvil Nanavati
374f1e7e01 transports(websocket): fix _audio_buffer being accidentally overwritten
`BaseOutputTransport` declares an `_audio_buffer` instance variable.
`WebsocketServerOutputTransport` accidentally reuses that variable
internally assuming it's class-local and not inherited.

This PR renames the variable in `WebsocketServerOutputTransport`
to avoid the name collision.
2024-08-17 05:28:05 +00:00
Aleix Conchillo Flaqué
d2dfa93bf1 processors(rtvi): send bot-ready when participant joins 2024-08-16 13:58:21 -07:00
Aleix Conchillo Flaqué
fa8c6712c6 transports(daily): fix multiple DailyTransport initialization 2024-08-16 13:32:34 -07:00
Aleix Conchillo Flaqué
4c2b84cb4d update pyproject.toml and remove requirements files 2024-08-16 09:28:46 -07:00
Aleix Conchillo Flaqué
b57c9d569b Merge pull request #352 from pipecat-ai/aleix/rtvi-0.1
processors(rtvi): rtvi 0.1 message protocol
2024-08-15 17:35:50 -07:00
Aleix Conchillo Flaqué
f0e50ba000 Merge pull request #336 from nulyang/fix/azure-transcriptionframe
services(azure): fix TranscriptionFrame parameter type
2024-08-15 17:08:56 -07:00
Mattie Ruth
4a6638f749 Merge pull request #385 from pipecat-ai/mrkb/anthropic-beta-caching
Mrkb/anthropic beta caching
2024-08-15 18:26:51 -04:00
Aleix Conchillo Flaqué
31577252f3 processors(rtvi): handle ErrorFrames 2024-08-15 15:23:31 -07:00
Aleix Conchillo Flaqué
5d71c50080 transports(daily): make sure audio_in_task exists before canceling 2024-08-15 15:23:07 -07:00
Aleix Conchillo Flaqué
981269d594 pipeline(task): process ErrorFrame in same task and stop pipeline task 2024-08-15 15:22:40 -07:00
mattie ruth backman
848db985fc bump anthropic in 3.10 requirements 2024-08-15 16:51:48 -04:00
mattie ruth backman
d5d8e31447 add cache tokens to metrics event 2024-08-15 16:51:48 -04:00
Aleix Conchillo Flaqué
66670a2370 Merge pull request #384 from pipecat-ai/aleix/enable-prompt-caching-frames
services(anthropic): allow setting enable prompt caching via frame
2024-08-15 13:26:39 -07:00
Aleix Conchillo Flaqué
5637f349c6 services(anthropic): allow setting enable prompt caching via frame 2024-08-15 12:43:29 -07:00
Aleix Conchillo Flaqué
93248e1d00 Merge pull request #382 from pipecat-ai/khk/anthropic-beta-caching
Support for Anthropic prompt caching beta
2024-08-15 12:34:54 -07:00
Kwindla Hultman Kramer
187769357f update version number of anthropic dependency 2024-08-15 12:28:41 -07:00
Aleix Conchillo Flaqué
5be6422cc8 Revert "processors(rtvi): process options in the order they are defined"
This reverts commit 61ac83e2d9.
2024-08-15 11:51:00 -07:00
Aleix Conchillo Flaqué
8670b2d994 utils: add match_endofsentence and use it in processors 2024-08-15 11:26:25 -07:00
Aleix Conchillo Flaqué
0bc6db428d processors(rtvi): implement bot-started-speaking and bot-stopped-speaking 2024-08-15 11:05:10 -07:00
Aleix Conchillo Flaqué
67d565930e services: send TTSStartFrame/TTSStopFrame when really needed 2024-08-15 11:05:10 -07:00
Aleix Conchillo Flaqué
b2a7ff6fd3 processors(rtvi): all transport messages should be urgent 2024-08-15 11:05:10 -07:00
Aleix Conchillo Flaqué
425a730d7c transports(base_output): send urgent transport messages immediately 2024-08-15 11:05:10 -07:00
Aleix Conchillo Flaqué
84c5709722 frames: add urgent field to TransportMessageFrame 2024-08-15 11:05:10 -07:00
Kwindla Hultman Kramer
94deec01c9 okay, both files now 2024-08-15 00:57:10 -07:00
Kwindla Hultman Kramer
6e0dd4a779 Anthropic beta prompt caching 2024-08-15 00:54:43 -07:00
Kwindla Hultman Kramer
14bde340dd Merge pull request #381 from pipecat-ai/khk/anthropic-fixup-0814.2
Fixup anthropic context set_messages
2024-08-14 23:34:31 -07:00
Kwindla Hultman Kramer
253765c611 and fixing anthropic demos 2024-08-14 23:14:20 -07:00
Kwindla Hultman Kramer
2b26d7182f replaces 379 2024-08-14 22:40:09 -07:00
Aleix Conchillo Flaqué
61ac83e2d9 processors(rtvi): process options in the order they are defined 2024-08-14 22:26:49 -07:00
Aleix Conchillo Flaqué
d5c7b28cad Merge pull request #380 from pipecat-ai/aleix/rtvi-0.1-context-aggregators-updates
processors(aggregators): multiple LLM aggregators updates
2024-08-14 20:43:50 -07:00
Aleix Conchillo Flaqué
959580a708 processors(logger): fix linting 2024-08-14 20:39:24 -07:00
Aleix Conchillo Flaqué
3a5cd17ea3 processors(aggregators): multiple LLM aggregators updates 2024-08-14 20:23:18 -07:00
Kwindla Hultman Kramer
b78981bb9d Merge pull request #374 from pipecat-ai/khk/together
Together.ai service implementation with Llama 3.1 function calling
2024-08-14 17:29:07 -07:00
Kwindla Hultman Kramer
a6d90b0a00 linting fixes to anthropic.py 2024-08-14 17:27:00 -07:00
Aleix Conchillo Flaqué
67016492f2 transports(daily/helpers): add delete_room_from_url() 2024-08-14 17:14:02 -07:00
Aleix Conchillo Flaqué
2c38089527 processors(rtvi): handle incoming messages in a separate task 2024-08-14 15:34:02 -07:00
Kwindla Hultman Kramer
48f68ba6dc Service for together.ai, including Llama 3.1 function calling support 2024-08-13 15:01:54 -07:00
Aleix Conchillo Flaqué
574df4ba3d processors(rtvi): make sure to send bot-ready when transport is joined 2024-08-13 13:25:15 -07:00
Aleix Conchillo Flaqué
49ca16d125 pipeline(task): only send initial metrics frames if metrics enabled 2024-08-13 12:22:37 -07:00
Aleix Conchillo Flaqué
87525b085e processors(rtvi): linting and make send_error() public 2024-08-13 11:21:51 -07:00
Aleix Conchillo Flaqué
6b53c6add3 transports(daily): DailyTransport default DailyParams 2024-08-13 11:13:18 -07:00
Kwindla Hultman Kramer
29ca1b7855 Anthropic tool use core Pipecat pieces refactored (#369)
* processors(rtvi): rtvi 0.1 message protocol

* added a single function call handler

* wip - function calling

* fixup

* fixup

* fixup

* processors(rtvi): no need for configure_on_start()

* processors(rtvi): add new option values if they haven't been set yet

* Add the model name to the LLM usage metrics

* wip - anthropic tool calling

* still wip - anthropic tool use and vision

* anthropic tools and vision working

* anthropic tool calling and vision

* Cartesia error handling

* Anthropic tool use core Pipecat pieces refactored as per plan

* aleix has good ideas

* Usage metrics for Anthropic LLMs

* fix function call result state not getting cleared bug

* Pass **kwargs through from AnthropicLLMService constructor

* about to tinker with anthropic

* added openai function calling

* openai function calling

* fixup

---------

Co-authored-by: Aleix Conchillo Flaqué <aleix@daily.co>
Co-authored-by: Chad Bailey <chadbailey@gmail.com>
Co-authored-by: mattie ruth backman <mattieruth@gmail.com>
Co-authored-by: chadbailey59 <chadbailey59@users.noreply.github.com>
2024-08-13 13:01:24 -05:00
Aleix Conchillo Flaqué
a42d0c9907 processors(rtvi): add interrupt_bot() 2024-08-13 09:22:43 -07:00
marcus-daily
8bc6ceaa3d Fixing pep8 2024-08-13 15:32:23 +01:00
marcus-daily
0b8a1ab5d1 Handle describe-actions message 2024-08-13 15:32:23 +01:00
Brian Hill
358c287db2 chore: Enable build without git 2024-08-12 11:38:41 -04:00
Brian Hill
2e68453655 Merge pull request #371 from pipecat-ai/cbrianhill/allow-build-without-git
chore: Enable build without git
2024-08-12 10:15:55 -04:00
Brian Hill
89b8a9de7d chore: Enable build without git 2024-08-12 09:36:25 -04:00
Aleix Conchillo Flaqué
c4c2058df9 processors(rtvi): handle frames pushed from outside in order 2024-08-11 23:09:11 -07:00
Aleix Conchillo Flaqué
0d85c0085f processors(rtvi): interrupt the bot if a new config is received 2024-08-11 23:09:11 -07:00
Mattie Ruth
6fa8a8f84f Merge pull request #365 from pipecat-ai/ruthless/metrics 2024-08-11 20:35:05 -04:00
mattie ruth backman
a97775bff3 Add the model name to the LLM usage metrics 2024-08-11 12:08:46 -04:00
Aleix Conchillo Flaqué
32640e054d processors(rtvi): add new option values if they haven't been set yet 2024-08-10 21:25:39 -07:00
Aleix Conchillo Flaqué
aa42da5658 processors(rtvi): no need for configure_on_start() 2024-08-10 21:25:21 -07:00
nulyang
900a94a825 services(azure): fix TranscriptionFrame parameter type 2024-08-10 13:00:03 +08:00
Aleix Conchillo Flaqué
c37552de70 processors(rtvi): add support for action responses 2024-08-09 18:12:37 -07:00
Aleix Conchillo Flaqué
916b37926c processors(rtvi): rtvi 0.1 message protocol 2024-08-09 17:24:38 -07:00
Aleix Conchillo Flaqué
2b76c3c15a update macos-py3.10-requirements 2024-08-09 17:18:30 -07:00
Aleix Conchillo Flaqué
cedd7dde18 update linux-py3.10-requirements.txt 2024-08-09 17:14:46 -07:00
Lewis Wolfgang
d088608d8e Merge pull request #340 from pipecat-ai/lewis/silero-vad-via-pip
Install Silero VAD via pip
2024-08-09 13:27:29 -04:00
Aleix Conchillo Flaqué
06ee29bb8b Merge pull request #359 from pipecat-ai/aleix/twilio-elevenlabs-sample-rates
twilio and elevenlabs sample rates
2024-08-09 09:38:35 -07:00
Aleix Conchillo Flaqué
d255e954d6 services(elevenlabs): allow specifying output_format 2024-08-09 09:38:20 -07:00
Aleix Conchillo Flaqué
6a7ab6b8ac serializers(twilio): allow specifying input and output sample rates 2024-08-09 09:37:51 -07:00
Aleix Conchillo Flaqué
45b18cc0b1 Merge pull request #358 from pipecat-ai/aleix/daily-create-room-exp-fixes
transports(daily): fixed create_room expirations
2024-08-09 09:37:01 -07:00
Aleix Conchillo Flaqué
0479431f0a Merge pull request #357 from pipecat-ai/aleix/daily-on-participant-updated
transports(daily): added on_participant_updated event
2024-08-09 09:36:46 -07:00
Aleix Conchillo Flaqué
ec58dbd791 transports(daily): added on_participant_updated event
Fixes #353
2024-08-09 09:36:24 -07:00
Aleix Conchillo Flaqué
91de68aab3 Merge pull request #355 from pipecat-ai/aleix/usage-metrics-update
processors(base): add start_llm_usage_metrics and start_tts_usage_met…
2024-08-09 09:35:36 -07:00
Aleix Conchillo Flaqué
85efc30145 Merge pull request #356 from pipecat-ai/aleix/eleven_turbo_v2_5
services(elevenlabs): update default model to eleven_turbo_v2_5
2024-08-09 09:34:47 -07:00
Aleix Conchillo Flaqué
0032594f21 transports(daily): fixed create_room expirations
Fixes #348
2024-08-08 22:04:22 -07:00
Aleix Conchillo Flaqué
829fdc5679 services(elevenlabs): update default model to eleven_turbo_v2_5
Fixes #349
2024-08-08 21:38:18 -07:00
Aleix Conchillo Flaqué
22e176e329 processors(base): add start_llm_usage_metrics and start_tts_usage_metrics 2024-08-08 16:46:56 -07:00
Lewis Wolfgang
826a70a137 Merge pull request #354 from pipecat-ai/lewis/delete_room_by_name
Add delete_room_by_name to DailyRESTHelper
2024-08-08 17:09:21 -04:00
Lewis Wolfgang
dd0ea674af Treat 404 (room not found) as a success for deletion 2024-08-08 16:57:58 -04:00
Lewis Wolfgang
a4761b8921 Add delete_room_by_name to DailyRESTHelper 2024-08-08 16:31:01 -04:00
chadbailey59
3958bb7903 Additional LLM and TTS metrics (#343)
* added llm and tts usage metrics

* Metrics debug logging

* cleanup
2024-08-07 08:55:51 -05:00
Aleix Conchillo Flaqué
83a037a7ce Merge pull request #345 from pipecat-ai/aleix/base-output-render-time-fixes
transports(base_output): improve render sleep computation
2024-08-06 17:30:47 -07:00
Aleix Conchillo Flaqué
a3eb8337a6 Merge pull request #342 from pipecat-ai/aleix/base-output-transport-push-audio
transport(base_output): push audio downstream
2024-08-06 17:30:32 -07:00
Aleix Conchillo Flaqué
541072f8e0 transports(base_output): improve render sleep computation 2024-08-06 17:20:41 -07:00
Aleix Conchillo Flaqué
881248cbd6 transport(base_output): push audio downstream 2024-08-05 14:00:09 -07:00
Aleix Conchillo Flaqué
d4979f5e64 Merge pull request #337 from pipecat-ai/aleix/audio-video-sync-and-gstreamer
audio/video sync and gstreamer
2024-08-05 09:28:11 -07:00
Aleix Conchillo Flaqué
4133cd03bb processors(gstreamer): add clock_sync property 2024-08-05 09:23:25 -07:00
Lewis Wolfgang
9f07c3ca27 Fly.io example: remove step to cache silero models.
No longer necessary.
2024-08-05 10:12:35 -04:00
Lewis Wolfgang
b20bacb9ed Remove no longer needed code 2024-08-05 10:10:39 -04:00
Lewis Wolfgang
97cfbfee1d Install silero via pip 2024-08-05 10:01:27 -04:00
Aleix Conchillo Flaqué
fa7c941792 examples(gstreamer): add new GStreamer examples 2024-08-04 12:29:36 -07:00
Aleix Conchillo Flaqué
4738879f32 processors(gstreamer): add new GStreamerPipelineSource 2024-08-04 12:29:34 -07:00
Aleix Conchillo Flaqué
d5d88f756a transport(output): improve audio and image handling for video use cases 2024-08-04 12:29:08 -07:00
Aleix Conchillo Flaqué
65b136bf15 Merge pull request #334 from pipecat-ai/aleix/cleanup-examples-remove-requests
cleanup examples and remove requests
2024-08-01 22:05:01 -07:00
Aleix Conchillo Flaqué
bee0b238e4 examples(storytelling-chatbot): include package-lock.json 2024-08-01 18:23:30 -07:00
Aleix Conchillo Flaqué
c891168ffb services: revert optional aiohttp.ClientSession 2024-08-01 18:22:56 -07:00
Aleix Conchillo Flaqué
6376c2f6aa transport(websocket): fix cancel 2024-08-01 18:09:16 -07:00
Aleix Conchillo Flaqué
4d9b7cdd61 DailyRESTHelper now receives an aiohttp client session 2024-08-01 18:08:57 -07:00
Aleix Conchillo Flaqué
8263d1dd6f update CHANGELOG with latest changes 2024-07-31 23:44:07 -07:00
Aleix Conchillo Flaqué
faf41c0b36 services: ignore yielded None values 2024-07-31 23:41:03 -07:00
Aleix Conchillo Flaqué
27a09c0b2c cleanup examples and remove requests library 2024-07-31 23:39:51 -07:00
Aleix Conchillo Flaqué
3db7f6a284 Merge pull request #333 from pipecat-ai/aleix/allow-internal-http-sessions-rebased
services: allow internal http sessions if none is given
2024-07-31 21:57:00 -07:00
Aleix Conchillo Flaqué
3bfeb5b5ef services: allow internal http sessions if none is given 2024-07-31 21:56:19 -07:00
Aleix Conchillo Flaqué
62a7a555b5 Merge pull request #330 from pipecat-ai/aleix/stop-and-cancel-are-different
EndFrame tries to end gracefully CancelFrame cancels tasks
2024-07-31 15:51:29 -07:00
Aleix Conchillo Flaqué
d60e99a043 examples(06a-image-sync): make sure frames go downstream 2024-07-30 11:41:58 -07:00
Aleix Conchillo Flaqué
77723b34c7 EndFrame tries to end gracefully CancelFrame cancels tasks 2024-07-30 11:41:19 -07:00
Aleix Conchillo Flaqué
c466d34a06 Merge pull request #328 from pipecat-ai/aleix/rtvi-towards-custom-pipelines
processors(rtvi): refactor to allow future custom pipelines
2024-07-29 15:07:57 -07:00
Aleix Conchillo Flaqué
f816897833 Merge pull request #327 from pipecat-ai/aleix/bot-start-stop-speaking-frames
bot start stop speaking frames
2024-07-27 17:21:23 -07:00
Aleix Conchillo Flaqué
c1e8a5e522 processors(rtvi): refactor to allow future custom pipelines 2024-07-26 10:26:36 -07:00
Aleix Conchillo Flaqué
76aca32f2e transport(output): emit new bot start|stop speaking frames 2024-07-25 14:50:33 -07:00
Aleix Conchillo Flaqué
7e31b2a795 processors(user_idle): use user speaking instead of interruption frames 2024-07-25 14:47:56 -07:00
Aleix Conchillo Flaqué
028e38a86b Merge pull request #326 from pipecat-ai/aleix/rtvi-bot-ready-fixes
rtvi: send bot-ready when pipeline is ready and first participant joins
2024-07-25 11:39:14 -07:00
Aleix Conchillo Flaqué
8cf7649855 processors(rtvi): send bot-ready when pipeline AND first participant joins 2024-07-25 11:25:51 -07:00
Aleix Conchillo Flaqué
64f5119b08 transports(base): allow registering event handlers without decorators 2024-07-25 11:24:24 -07:00
Aleix Conchillo Flaqué
4d606aefb3 update CHANGELOG 2024-07-25 09:57:01 -07:00
Ankur Duggal
4bafdaa04d Deepgram Adjustments (#313) 2024-07-25 09:51:51 -07:00
Aleix Conchillo Flaqué
5afe1abf82 Merge pull request #323 from pipecat-ai/aleix/base-input-handle-incoming-interruptions
transports(inputs): handle start/stop interruption frames
2024-07-24 15:16:18 -07:00
Aleix Conchillo Flaqué
f066d50b98 transports(inputs): handle start/stop interruption frames 2024-07-24 15:15:09 -07:00
Aleix Conchillo Flaqué
91103e21cc github(publish_test): download tags and depth to 100 2024-07-24 14:49:09 -07:00
Aleix Conchillo Flaqué
f44dabcd65 Merge pull request #322 from pipecat-ai/aleix/base-input-transport-system-frames-fix
transports(inputs): don't queue incoming system frames
2024-07-24 14:44:18 -07:00
Aleix Conchillo Flaqué
0fd2fca231 frames: StartFrame is now a control frame 2024-07-24 14:42:59 -07:00
Aleix Conchillo Flaqué
5bb64098e7 transports(inputs): don't queue incoming system frames 2024-07-24 14:35:00 -07:00
Aleix Conchillo Flaqué
3fc85e75e0 Merge pull request #320 from pipecat-ai/aleix/req-updates-072324
update project requirements and dependencies
2024-07-23 17:45:18 -07:00
Aleix Conchillo Flaqué
3f61ea16b7 update project requirements and dependencies 2024-07-23 17:35:47 -07:00
Aleix Conchillo Flaqué
4b393092b5 Merge pull request #319 from pipecat-ai/aleix/daily-completion-callbacks-0.0.39-fix
transports(daily): fix completion callbacks handling
2024-07-23 15:27:26 -07:00
Aleix Conchillo Flaqué
b583f5162b transports(daily): fix completion callbacks handling 2024-07-23 15:25:59 -07:00
Aleix Conchillo Flaqué
060a22f395 github: only run publish_test manually
We need to run this manually to avoid test.pypi.org project size limits.
2024-07-23 14:19:24 -07:00
Aleix Conchillo Flaqué
d3e85355f1 Merge pull request #318 from pipecat-ai/aleix/prepare-0.0.38
update CHANGELOG for 0.0.38
2024-07-23 14:12:01 -07:00
Aleix Conchillo Flaqué
83e730b768 update CHANGELOG for 0.0.38 2024-07-23 14:10:10 -07:00
Aleix Conchillo Flaqué
5fcc96446c Merge pull request #317 from pipecat-ai/aleix/silero-repo-params
vad(silero): expose cache and repo parameters
2024-07-23 12:13:20 -07:00
Aleix Conchillo Flaqué
ad88925154 vad(silero): expose cache and repo parameters 2024-07-23 12:12:28 -07:00
Aleix Conchillo Flaqué
0a6ddbf15c Merge pull request #316 from pipecat-ai/aleix/metrics-improvements
metrics improvements
2024-07-23 11:23:57 -07:00
Aleix Conchillo Flaqué
08e0722d97 fix initial metrics format 2024-07-23 11:23:03 -07:00
Aleix Conchillo Flaqué
05d4fba551 processors(rtvi): send initial empty metrics 2024-07-23 11:22:41 -07:00
Aleix Conchillo Flaqué
f41c2b3c9f transports(daily): don't send empty metrics 2024-07-23 11:22:41 -07:00
Aleix Conchillo Flaqué
69f64899fe pipeline: add send_initial_empty_metrics flag 2024-07-23 11:22:41 -07:00
Aleix Conchillo Flaqué
33f0865430 Merge pull request #315 from pipecat-ai/aleix/stop-transcription-error
transports(daily): wait until start|stop_transcription are finished
2024-07-23 11:18:59 -07:00
Aleix Conchillo Flaqué
ad5b9202ab transports(daily): wait until start|stop_transcription are finished
Fixes #305
2024-07-22 22:59:30 -07:00
Aleix Conchillo Flaqué
1676693091 Merge pull request #314 from pipecat-ai/aleix/transcription-timestamps
services: transcription timestamp should use ISO8601 format
2024-07-22 22:43:01 -07:00
Aleix Conchillo Flaqué
0852b50b8f services: transcription timestamp should use ISO8601 format 2024-07-22 22:40:28 -07:00
Aleix Conchillo Flaqué
eb998aa502 Merge pull request #312 from pipecat-ai/aleix/rtvi-support
RTVI support
2024-07-22 16:58:40 -07:00
Aleix Conchillo Flaqué
6dab0e9de7 update CHANGELOG for 0.0.37 2024-07-22 16:00:30 -07:00
Aleix Conchillo Flaqué
95ff1d141c update CHANGELOG with RTVIProcessor 2024-07-22 16:00:26 -07:00
Aleix Conchillo Flaqué
87bc8a9da6 examples: remove RTVI since there are full demos elsewhere 2024-07-22 15:53:39 -07:00
Aleix Conchillo Flaqué
087fe9a537 services(cartesia): fix TTFB 2024-07-22 15:30:16 -07:00
Aleix Conchillo Flaqué
c1170260b5 processors(rtvi): use generic LLM and TTS names 2024-07-22 15:27:33 -07:00
Aleix Conchillo Flaqué
65cdf50774 processors(rtvi): fix task cleanup 2024-07-22 15:01:45 -07:00
Aleix Conchillo Flaqué
9233bb490c processors(rtvi): add support for "tts-text" messages 2024-07-22 11:40:17 -07:00
Aleix Conchillo Flaqué
43932220f7 processors(rtvi): use only user-transcription 2024-07-22 09:40:16 -07:00
Aleix Conchillo Flaqué
cea4d1894e processors(rtvi): change voice before LLM updates 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
80baa0358d processors(rtvi): lable is now rtvi 2024-07-22 09:32:18 -07:00
Chad Bailey
5d73db53a0 initial pseudo function calling 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
302ea90dce processors(rtvi): messages now require an id 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
37b04ed283 processors(rtvi): use send a type=response as command responses 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
be6995cfdf processors(rtvi): renamed realtime-ai to rtvi 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
dfbc11300c processors(realtime-ai): use label instead of tag 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
82d539d174 processors(realtime-ai): add support for interrupting the bot 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
6e00f31014 updated CHANGELOG with new frames and realtime-ai changes 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
a46ac3cc92 examples: moved 18-realtime-ai.py to examples/realtime-ai 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
6fbf98d8e2 processors(realtime-ai): llm-context now uses a data field 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
f094c42728 processors(realtime-ai): add transcription messages 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
13827e1282 processors(realtime-ai): send a successful response for every command 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
32170b47d9 processors(realtime-ai): add user-[start|stopped]-speaking messages 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
09c05354c2 processors(realtime-ai): fix voice initialization 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
b0b1475563 processors(realtime-ai): add support making TTS to speak 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
b85dd7283a processors(realtime-ai): add support for appending to the LLM context 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
846ae765e5 services(TTSService): fix sentence cleanup 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
4c629e538e processors(realtime-ai): add assistant before output transport
Cartesia can do word-to-word output instead of full sentences. This means that
for properly adding things into the context we need to add it before the
transport, otherwise some words might be lost.
2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
f6e22bb3b9 processors(realtime-ai): add silero vad to the transport 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
46a048d7f6 processors(realtime-ai): allow default setup to be None 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
bd9f4eea06 processors(realtime-ai): provide default values 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
0a672e61e2 processors(realtime-ai): update it to use groq by default 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
29a8530221 processors(realtime-ai): add support for updating config (model, voice...) 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
3e738642a7 processors(realtime-ai): add support for getting/updating LLM context 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
f551f55f03 examples: add new foundational/18-realtime-ai.py 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
9f012c8002 processors: add new RealtimeAIProcessor 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
0a69a9e5ef transport(daily): also accept TransportMessageFrame 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
194790183a processor: add support for setting a processor parent 2024-07-22 09:32:18 -07:00
Aleix Conchillo Flaqué
2227721173 update CHANGELOG with StatelessTextTransformer fix (update) 2024-07-22 09:30:45 -07:00
Aleix Conchillo Flaqué
77a53da5f5 update CHANGELOG with StatelessTextTransformer fix 2024-07-22 09:28:38 -07:00
Aleix Conchillo Flaqué
ab63ff275d Merge pull request #310 from weedge/fix/StatelessTextTransformer
fix: push_frame use TextFrame
2024-07-22 09:25:27 -07:00
weedge
e5363f65f0 fix: push_frame use TextFrame
Signed-off-by: weedge <weege007@gmail.com>
2024-07-22 17:29:06 +08:00
Lewis Wolfgang
ffc157de65 Merge pull request #307 from pipecat-ai/lewis/increase_openai_keepalive_expiry
Allow openai http connections to remain open in the pool indefinitely.
2024-07-19 07:09:17 -04:00
Lewis Wolfgang
f9fdadb4c0 Allow openai http connections to remain open in the pool indefinitely.
Rather than expiring in 5 seconds.
2024-07-18 11:18:21 -04:00
Aleix Conchillo Flaqué
4efccb79f2 Merge pull request #306 from pipecat-ai/aleix/remove-llm-response-start-end-frame
remove LLMResponseStartFrame and LLMResponseEndFrame
2024-07-17 21:51:02 -07:00
Aleix Conchillo Flaqué
337968199a update CHANGELOG with CartesiaTTSService and TTSService updates 2024-07-17 20:58:10 -07:00
Aleix Conchillo Flaqué
37027f68cb remove LLMResponseStartFrame and LLMResponseEndFrame
This was added in the past to properly handle interruptions for the
LLMAssistantContextAggregator. But this is not necessary anymore since we can
handle interruptions by just processing the StartInterruptionFrame, so there's
no need for these extra frames.
2024-07-17 20:53:35 -07:00
Kwindla Hultman Kramer
d1b62c5495 Merge pull request #304 from pipecat-ai/khk/cartesia-continue
Cartesia streaming (WebSocket) and word-level timestamps support
2024-07-17 20:29:15 -07:00
Kwindla Hultman Kramer
355fe01cb7 fixed forgotten renames 2024-07-17 20:28:27 -07:00
Kwindla Hultman Kramer
9d050a16c7 committing an uncommitted file 2024-07-17 20:23:41 -07:00
Kwindla Hultman Kramer
fa53c67606 comments re fixes 2024-07-17 18:30:45 -07:00
Kwindla Hultman Kramer
5006376fe6 undo changes to 02-llm-say-one-thing.py 2024-07-17 15:18:47 -07:00
Kwindla Hultman Kramer
2204b8e205 cartesia streaming and context management via word-level timestamps 2024-07-17 15:17:00 -07:00
Kwindla Hultman Kramer
270007b17c wip - using cartesia word timestamps for context management 2024-07-17 14:13:52 -07:00
Kwindla Hultman Kramer
568eb2ef4c cartesia websockets and streaming 2024-07-17 14:13:52 -07:00
Kwindla Hultman Kramer
73ca9184a8 wip cartesia continuation (not working yet) 2024-07-17 14:13:52 -07:00
Aleix Conchillo Flaqué
5e8e11e16e pyproject: require python >= 3.10 2024-07-17 09:52:42 -07:00
Aleix Conchillo Flaqué
029bbc16f2 Merge pull request #286 from TomTom101/feat/regex_endofsentence
fix: No more falsely detect a sentence end on "U.S.A", "3:00 a.m."
2024-07-17 09:49:21 -07:00
Aleix Conchillo Flaqué
9e3d87e4f6 Merge pull request #291 from adidoit/main
Fix error with readme example - SyntaxError: positional argument follows keyword argument
2024-07-15 13:10:17 -04:00
Aleix Conchillo Flaqué
f1410a1127 Merge pull request #297 from wtlow003/main
fix: minor typo
2024-07-15 13:08:23 -04:00
wtlow003
2b980d16c3 fix: minor typo 2024-07-12 18:27:57 +08:00
Adi Pradhan
b2b97aafb8 fix error with readme example - SyntaxError: positional argument follows keyword argument 2024-07-10 09:50:20 -04:00
TomTom101
da2082b025 chore: Combined combinable lookaheads 2024-07-06 11:11:40 +02:00
TomTom101
327ea9d547 chore: Make it a const 2024-07-06 11:08:51 +02:00
TomTom101
b23db4a202 chore: commented regex 2024-07-06 11:06:52 +02:00
TomTom101
d1a36004ab fix: No more falsely detect a sentence end on "U.S.A", "3:00 a.m." and more 2024-07-06 11:01:32 +02:00
Jon Taylor
6071920c45 Merge pull request #284 from pipecat-ai/jpt/storybot-load-balance
Update storybot demo
2024-07-03 19:48:32 +01:00
Jon Taylor
5f539e1fba fixed teardown 2024-07-03 17:02:54 +01:00
Jon Taylor
8e1539c360 virtualized deployment and added room-based balancing 2024-07-03 16:48:14 +01:00
Aleix Conchillo Flaqué
065cfb2aca Merge pull request #280 from pipecat-ai/aleix/library-updates-070224
library updates 070224 and pipecat 0.0.36
2024-07-02 10:14:03 -07:00
Aleix Conchillo Flaqué
3147534e86 update CHANGELOG for 0.0.36 2024-07-02 10:13:26 -07:00
Aleix Conchillo Flaqué
be5603bf16 examples: fix 06a-image-sync.py 2024-07-02 10:11:50 -07:00
Aleix Conchillo Flaqué
b9b0bcdcbd services(azure): close the audio stream on exit 2024-07-02 10:11:35 -07:00
Aleix Conchillo Flaqué
5bcece56f3 services(cartesia): make sure we close the client on exit 2024-07-02 10:11:16 -07:00
Aleix Conchillo Flaqué
d67faef88c pyproject: multiple library updates 2024-07-02 09:05:37 -07:00
Aleix Conchillo Flaqué
8f6db5e905 Merge pull request #279 from pipecat-ai/aleix/gladia-stt-support
add Gladia STT support
2024-07-02 08:07:35 -07:00
Aleix Conchillo Flaqué
82e93a0560 use exclude_none=True when dumping BaseModels 2024-07-02 08:03:31 -07:00
Aleix Conchillo Flaqué
a9a82c083b services: add GladiaSTTService support 2024-07-02 08:03:29 -07:00
Aleix Conchillo Flaqué
974d9c33ed Merge pull request #278 from pipecat-ai/aleix/detect-user-idle
add support for detecting user idle
2024-07-02 08:01:27 -07:00
Jon Taylor
c1957ab694 Merge pull request #274 from pipecat-ai/jpt/deployment-examples
Example deployment pattern for fly.io
2024-07-02 10:17:13 +01:00
Jon Taylor
b20a10a4bc fixed double fly 2024-07-02 10:17:01 +01:00
Aleix Conchillo Flaqué
be14ce465d transports(daily): make sure we don't send data if client is closed 2024-07-01 18:26:13 -07:00
Aleix Conchillo Flaqué
d1ca0c5614 examples: added new 17-detect-user-idle.py 2024-07-01 18:17:43 -07:00
Aleix Conchillo Flaqué
535514f506 processors: added new UserIdleProcessor 2024-07-01 18:17:43 -07:00
Aleix Conchillo Flaqué
933b63cf13 processors: added new IdleFrameProcessor 2024-07-01 14:57:42 -07:00
Aleix Conchillo Flaqué
d7c3e380a5 added BotSpeakingFrame 2024-07-01 14:57:18 -07:00
Aleix Conchillo Flaqué
c5298f78cb add more missing keyword-only arguments 2024-07-01 12:34:53 -07:00
Jon Taylor
4f8f7b8d1d added on_call_state event to prevent idle vms 2024-07-01 19:21:16 +01:00
Aleix Conchillo Flaqué
d7d46919ac update macos-py3.10-requirements.txt 2024-07-01 11:00:59 -07:00
Aleix Conchillo Flaqué
e5d73d2e2e update linux-py3.10-requirements.txt 2024-07-01 10:58:49 -07:00
Aleix Conchillo Flaqué
b145e8ec90 update README with XTTS 2024-07-01 10:49:43 -07:00
Aleix Conchillo Flaqué
97ff4a1fb8 Merge pull request #275 from pipecat-ai/aleix/add-missing-keyword-separators
add missing keyword separators
2024-07-01 10:45:31 -07:00
Aleix Conchillo Flaqué
5018a552c1 services(xtts): no need the WAV header 2024-07-01 10:44:32 -07:00
Aleix Conchillo Flaqué
7f9fd9ffce examples: added 07i-interruptible-xtts 2024-07-01 10:41:34 -07:00
Aleix Conchillo Flaqué
ddd0ca6a8f update CHANGELOG 2024-07-01 10:27:26 -07:00
Aleix Conchillo Flaqué
06f817c7e3 transport(websocket): don't send if serializer returns None 2024-07-01 10:27:26 -07:00
Aleix Conchillo Flaqué
df4c3e56c4 services: add missing * keyword separator 2024-07-01 10:27:26 -07:00
Aleix Conchillo Flaqué
9d5c2b9656 Merge pull request #276 from eddieoz/feature/xtts
Added service XTTS
2024-07-01 10:26:53 -07:00
eddieoz
7ce59c5e2e added service xtts 2024-07-01 20:17:19 +03:00
Aleix Conchillo Flaqué
1c9631fc78 Merge pull request #271 from pipecat-ai/aleix/silero-vad-version
vad(silero): allow specifying a Silero VAD version
2024-07-01 09:39:59 -07:00
Aleix Conchillo Flaqué
efbe7297f7 vad(silero): allow specifying a Silero VAD version 2024-07-01 09:38:43 -07:00
Aleix Conchillo Flaqué
1b45946a61 Merge pull request #270 from pipecat-ai/aleix/async-frame-processor
add new AsyncFrameProcessor and AsyncAIService
2024-07-01 09:37:51 -07:00
Aleix Conchillo Flaqué
cbf5a6362c add new AsyncFrameProcessor and AsyncAIService 2024-07-01 09:37:02 -07:00
Aleix Conchillo Flaqué
583b96c341 Merge pull request #269 from pipecat-ai/aleix/improve-error-handling
improve error handling and don't swallow exceptions
2024-07-01 09:36:00 -07:00
Aleix Conchillo Flaqué
fc0920504d improve error handling and don't swallow exceptions 2024-07-01 09:35:45 -07:00
Aleix Conchillo Flaqué
abd65a93b2 Merge pull request #268 from pipecat-ai/aleix/websocket-dont-send-if-closed
transports(websocket): don't send data if websocket closed
2024-07-01 09:33:45 -07:00
Aleix Conchillo Flaqué
c3244fdd7a transports(websocket): don't send data if websocket closed 2024-07-01 09:31:58 -07:00
Aleix Conchillo Flaqué
e8f58938b0 Merge pull request #267 from pipecat-ai/aleix/processing-metrics
add support for processing metrics
2024-07-01 09:31:05 -07:00
Jon Taylor
602b4f34b1 added example fly.toml 2024-07-01 16:50:53 +01:00
Jon Taylor
0399c84dfa added flyio deployment example 2024-07-01 16:46:38 +01:00
Aleix Conchillo Flaqué
fd5d879bf5 add support for processing metrics
Processing metrics indicate how much time a processor takes to generate all of
its output.
2024-06-28 14:26:57 -07:00
Aleix Conchillo Flaqué
8dff460307 Merge pull request #266 from pipecat-ai/aleix/silero-num-frames-fixes
vad: fix Silero VAD required number of frames
2024-06-28 11:25:55 -07:00
Aleix Conchillo Flaqué
cce1ddb183 vad: fix Silero VAD required number of frames 2024-06-28 10:45:48 -07:00
Aleix Conchillo Flaqué
8691d14289 Merge pull request #255 from Viking5274/main
Fix twilio error
2024-06-26 10:17:03 -07:00
daniil5701133
dd402da9e5 added handling streamSid after first wss connect
fixx name
2024-06-26 18:56:30 +03:00
Aleix Conchillo Flaqué
2fd04248f1 examples(storytelling-chatbot): upgrade npm vulnerabilities 2024-06-25 22:04:55 -07:00
Aleix Conchillo Flaqué
0ac42006f8 Merge pull request #260 from pipecat-ai/aleix/more-interruption-fixes
more interruption fixes
2024-06-25 21:52:02 -07:00
Aleix Conchillo Flaqué
66e331248d update CHANGELOG for 0.0.34 2024-06-25 21:43:23 -07:00
Aleix Conchillo Flaqué
4be3e8c87d aggregators: revert using intermediate results 2024-06-25 21:33:17 -07:00
Aleix Conchillo Flaqué
dac033fe61 services(azure): allow transcriptions during interruptions
If the user interrupts we can't just discard transcriptions because the user is
actually interrupting and talking.
2024-06-25 21:33:06 -07:00
Aleix Conchillo Flaqué
d302cbb114 services(deepgram): allow transcriptions during interruptions
If the user interrupts we can't just discard transcriptions because the user is
actually interrupting and talking.
2024-06-25 21:32:21 -07:00
Aleix Conchillo Flaqué
e3b407db28 Merge pull request #259 from pipecat-ai/aleix/prepare-0.0.33
update CHANGELOG for 0.0.33
2024-06-25 12:05:07 -07:00
Aleix Conchillo Flaqué
4ef623f09e update CHANGELOG for 0.0.33 2024-06-25 11:53:07 -07:00
Aleix Conchillo Flaqué
253530a63d Merge pull request #258 from pipecat-ai/aleix/upgrade-cartesia-1.0.0
services(cartesia): upgrade to new cartesia 1.0.0
2024-06-25 11:52:04 -07:00
Aleix Conchillo Flaqué
4f38d989f5 services(cartesia): upgrade to new cartesia 1.0.0 2024-06-25 11:51:34 -07:00
Aleix Conchillo Flaqué
84074e90ee Merge pull request #257 from pipecat-ai/aleix/cancel-all-tasks-when-interrutpted
cancel all tasks when interrutpted
2024-06-25 11:16:00 -07:00
Aleix Conchillo Flaqué
38aee7d8f2 services(azure): cancel tasks when interrupted and ignore incoming transcriptions 2024-06-25 11:15:26 -07:00
Aleix Conchillo Flaqué
64198313c6 services(deepgram): cancel tasks when interrupted and ignore incoming transcriptions 2024-06-25 11:15:07 -07:00
Aleix Conchillo Flaqué
d61b6c301c transports(base_input): create push tasks after pushing interruption 2024-06-25 11:15:07 -07:00
Aleix Conchillo Flaqué
83d1931266 Merge pull request #256 from pipecat-ai/aleix/tts-cleanup-when-interrupted
services(tts): strip before TTS and cleanup when interrupted
2024-06-25 11:14:32 -07:00
Aleix Conchillo Flaqué
c31f2ab285 services(tts): strip before TTS and cleanup when interrupted 2024-06-25 11:13:19 -07:00
Aleix Conchillo Flaqué
0ddc5721b4 Merge pull request #252 from pipecat-ai/aleix/daily-check-size-read-audio-frames
transports(daily): always check size of read audio frames
2024-06-25 09:45:05 -07:00
Aleix Conchillo Flaqué
98bd183bc4 pyproject: fix cartesia version and update requirements files 2024-06-25 09:43:54 -07:00
Aleix Conchillo Flaqué
aaa154524c Merge pull request #253 from pipecat-ai/aleix/llm-response-use-intermediate-results
aggregators: uses intermediate results for LLMAssistantResponseAggreg…
2024-06-24 19:21:14 -07:00
Aleix Conchillo Flaqué
beced68337 aggregators: uses intermediate results for LLMAssistantResponseAggregator 2024-06-24 17:33:45 -07:00
Aleix Conchillo Flaqué
94823ab952 transports(daily): always check size of read audio frames 2024-06-24 14:56:24 -07:00
Kwindla Hultman Kramer
0b6a19802f Merge pull request #250 from pipecat-ai/lewis/flush-tts-on-llm-response-end
Flush output from TTSService on LLMFullResponseEndFrame
2024-06-22 20:37:45 -04:00
Lewis Wolfgang
c4a2d2197c Flush output from TTSService on LLMFullResponseEndFrame
To cover cases when the LLM response does not end in punctuation.
2024-06-22 14:57:44 -04:00
Aleix Conchillo Flaqué
269d06aa15 Merge pull request #249 from pipecat-ai/aleix/pipecat-0.0.32
update CHANGELOG.md for 0.0.32
2024-06-22 09:21:21 -07:00
Aleix Conchillo Flaqué
dfef1f2c54 update CHANGELOG.md for 0.0.32 2024-06-22 09:19:22 -07:00
Aleix Conchillo Flaqué
b62beaba0b Merge pull request #248 from pipecat-ai/aleix/deepgramstt-url
services(deepgram): add url to DeepgramSTTService
2024-06-21 22:26:23 -07:00
Aleix Conchillo Flaqué
adf414e40f services(deepgram): add url to DeepgramSTTService 2024-06-21 16:52:28 -07:00
Aleix Conchillo Flaqué
dc64e57f63 Merge pull request #241 from pipecat-ai/aleix/transports-async
transports: fully use asyncio in all read/write operations
2024-06-21 16:00:08 -07:00
Aleix Conchillo Flaqué
d3e410b2ac transports: fully use asyncio in all read/write operations 2024-06-21 15:55:15 -07:00
Aleix Conchillo Flaqué
c544b2474b update linux-py3.10-requirements with fastapi and new daily-python 2024-06-21 15:44:01 -07:00
Aleix Conchillo Flaqué
18243de358 add fastapi and update macos-py3.10-requirements.txt 2024-06-21 13:16:47 -07:00
Aleix Conchillo Flaqué
6625895d1f update macos-py3.10-requirements.txt 2024-06-21 13:13:02 -07:00
Aleix Conchillo Flaqué
f9ecce739e Merge pull request #247 from pipecat-ai/aleix/twilio-updates
some twilio updates
2024-06-21 10:14:40 -07:00
Aleix Conchillo Flaqué
0075dd8386 update linux/macos-py3.10-requirements.txt 2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué
eef1cde816 updated CHANGELOG.md with fastapi and twilio updates 2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué
8d867c30c6 transports(websocket): verify websockets module 2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué
42c668b7ae examples(twilio-chatbot): update instructions and renames 2024-06-21 09:48:12 -07:00
Aleix Conchillo Flaqué
b62227b4ae serializers(twilio): formatting and allow str | bytes | None 2024-06-21 09:47:17 -07:00
Aleix Conchillo Flaqué
25ef0cb87b serializers: allow str | bytes | None 2024-06-21 09:42:43 -07:00
Aleix Conchillo Flaqué
e195941aa5 Merge pull request #246 from pipecat-ai/aleix/daily-dialout-answered
transports(daily): added dialout_answered event
2024-06-20 18:37:24 -07:00
Aleix Conchillo Flaqué
e09eef1dd7 Merge pull request #243 from Viking5274/main
Add twilio_websocket_service with example
2024-06-20 14:09:48 -07:00
Aleix Conchillo Flaqué
7c13663a4e transports(daily): added dialout_answered event 2024-06-20 13:01:25 -07:00
daniil5701133
5753869e5e add twilio-chatbot example with README.md info how to start app
created twilio_websocket_service.py, TwilioFrameSerializer.py

moved pcm_16000_to_ulaw_8000 and ulaw_8000_to_pcm_16000 to src/pipecat/utils/audio.py
fixed callback on disconnect
2024-06-20 23:00:01 +03:00
chadbailey59
ba878a19f4 fixed "Dr." interruption (#245) 2024-06-19 20:53:04 -05:00
Aleix Conchillo Flaqué
55a9de78cd Merge pull request #239 from pipecat-ai/aleix/azure-stt
azure stt support
2024-06-14 14:07:07 +08:00
Aleix Conchillo Flaqué
ff51fc9091 updated CHANGELOG and README 2024-06-13 17:03:49 -07:00
Aleix Conchillo Flaqué
a4f857ee34 examples: use new AzureSTTService in 07f-interruptible-azure 2024-06-13 17:03:49 -07:00
Aleix Conchillo Flaqué
3250d74bef services(azure): new AzureSTTService 2024-06-13 17:03:49 -07:00
Aleix Conchillo Flaqué
c086160239 examples: cleanup some 07 interruptible examples 2024-06-13 16:36:10 -07:00
Aleix Conchillo Flaqué
6cdccaff53 Merge pull request #238 from pipecat-ai/aleix/pipecat-0.0.31
pipecat 0.0.31
2024-06-14 06:31:41 +08:00
Aleix Conchillo Flaqué
a9ab8de25d update CHANGELOG for 0.0.31 2024-06-13 15:31:03 -07:00
Aleix Conchillo Flaqué
2a29cb18a5 transports(base_output): chunk audio into 20ms instead of 10ms 2024-06-13 15:30:41 -07:00
Aleix Conchillo Flaqué
4193a4f415 Merge pull request #237 from pipecat-ai/aleix/pipecat-0.0.30
update CHANGELOG for 0.0.30
2024-06-14 05:28:14 +08:00
Aleix Conchillo Flaqué
0226ec450a update CHANGELOG for 0.0.30 2024-06-13 14:27:37 -07:00
Aleix Conchillo Flaqué
020b8ebb35 Merge pull request #236 from pipecat-ai/aleix/report-only-initial-ttfb
report only initial ttfb
2024-06-14 05:24:52 +08:00
Aleix Conchillo Flaqué
1170b30c1b aggregator(user_response): also handle small VADParams.stop_secs 2024-06-13 13:30:31 -07:00
Aleix Conchillo Flaqué
0004d4a906 vad: reduce smoothing factor and increase confidence 2024-06-13 13:30:11 -07:00
Aleix Conchillo Flaqué
cb27e86266 metrics: allow sending only initial TTFB metrics 2024-06-13 13:30:00 -07:00
Aleix Conchillo Flaqué
77a3b2ea5c Merge pull request #235 from pipecat-ai/aleix/openpipe-refactoring
openpipe refactoring
2024-06-14 01:28:50 +08:00
Aleix Conchillo Flaqué
099e65f3b6 report processor name in error logs 2024-06-13 10:20:45 -07:00
Aleix Conchillo Flaqué
befb8db120 update pyproject and requirements 2024-06-13 10:20:45 -07:00
Aleix Conchillo Flaqué
9992d826b1 examples: renamed 06b-listen... to 07h-inte... 2024-06-13 10:18:20 -07:00
Aleix Conchillo Flaqué
18604e1a39 re-add removed CHANGELOG lines 2024-06-13 10:18:20 -07:00
Aleix Conchillo Flaqué
312c569182 services(openpipe): refactored so it's based on BaseOpenAILLMService 2024-06-13 09:30:50 -07:00
Aleix Conchillo Flaqué
b43e0ed130 Merge pull request #233 from KwalAI/openpipe-integration
OpenPipe Integration
2024-06-13 22:41:57 +08:00
Aleix Conchillo Flaqué
289debea34 Merge pull request #234 from pipecat-ai/aleix/fix-daily-room-properties-exp
transports(helpers): fix DailyRoomProperties.exp
2024-06-13 22:38:41 +08:00
Aleix Conchillo Flaqué
ccd6af7016 transports(helpers): fix DailyRoomProperties.exp 2024-06-12 23:15:22 -07:00
Ankur Duggal
effc69e4e4 formatting 2024-06-12 15:01:19 -07:00
Ankur Duggal
c7a0d0db64 OpenPipe Integration 2024-06-12 14:23:56 -07:00
Aleix Conchillo Flaqué
50d69a1ca4 Merge pull request #231 from pipecat-ai/aleix/websocket-deserializer-none
serializer: allow deserialize() to return None
2024-06-13 04:36:03 +08:00
Aleix Conchillo Flaqué
8a6b8fe70a Merge pull request #232 from pipecat-ai/aleix/pyproject-deepgram
pyproject: add deepgram-sdk
2024-06-13 03:53:08 +08:00
Aleix Conchillo Flaqué
c4e53aea71 update macos-py3.10-requirements with deepgram 2024-06-12 12:52:20 -07:00
Aleix Conchillo Flaqué
ad5125e93f pyproject: add deepgram-sdk 2024-06-12 12:50:18 -07:00
Aleix Conchillo Flaqué
8d92cbac93 Merge pull request #230 from pipecat-ai/aleix/processor-names
processor names
2024-06-13 03:16:07 +08:00
Aleix Conchillo Flaqué
0225443ec8 transports(base): always send MetricsFrame 2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué
71e1d0a334 pipeline: send initial TTFB initial metrics from PipelineTask 2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué
83f69e02fd allow specifying frame processor names 2024-06-12 12:15:29 -07:00
Aleix Conchillo Flaqué
e1b2da1ff0 serializer: allow deserialize() to return None 2024-06-12 12:11:36 -07:00
Kwindla Hultman Kramer
5eb1b90a4b Merge pull request #229 from pipecat-ai/khk-deepgram-url-configurable
Deepgram TTS service improvements
2024-06-12 14:52:04 -04:00
Kwindla Hultman Kramer
9c4ee74b91 bot to test for demo 2024-06-12 10:41:49 -07:00
Aleix Conchillo Flaqué
f65f566829 re-add transports/services/helpers/__init__.py 2024-06-12 10:37:28 -07:00
Aleix Conchillo Flaqué
c8ad3123b7 Merge pull request #207 from pipecat-ai/dialin-example
New example: Dialin bot (call your Pipecat via phone)
2024-06-13 01:36:00 +08:00
Jon Taylor
8cefce28cf added example fly toml 2024-06-12 10:35:03 -07:00
Jon Taylor
a834d26885 removed https from daily boy 2024-06-12 10:35:03 -07:00
Jon Taylor
810e3cd551 added fly.example.toml due to gitignore 2024-06-12 10:35:03 -07:00
Jon Taylor
f258fa96cd added env to dockerignore 2024-06-12 10:35:03 -07:00
Jon Taylor
757ec61f14 added deepgram to readme 2024-06-12 10:35:03 -07:00
Jon Taylor
2c933f43d8 linting errors and removed unusued sip url 2024-06-12 10:35:03 -07:00
Jon Taylor
cc5bfa8af8 removed helps and fixed linting 2024-06-12 10:35:03 -07:00
Jon Taylor
de9f3e55f1 new example: dialin 2024-06-12 10:35:03 -07:00
Aleix Conchillo Flaqué
ed0c986218 Merge pull request #228 from pipecat-ai/aleix/websocket-fixes
websocket fixes
2024-06-13 01:30:21 +08:00
Aleix Conchillo Flaqué
72c27215b6 transports(websocket): use push_audio_frame() 2024-06-12 10:29:39 -07:00
Aleix Conchillo Flaqué
c23b14f768 examples: use DeepgramSTTService in websocker-server 2024-06-12 10:29:22 -07:00
Aleix Conchillo Flaqué
81282f9c4d services(deepgram): keep conenction alive 2024-06-12 10:29:22 -07:00
Aleix Conchillo Flaqué
2b324f6f81 Merge pull request #227 from pipecat-ai/aleix/daily-room-properties-extra
transports(daily): DailyRoomProperties now allow extra unknown parame…
2024-06-13 00:25:07 +08:00
Kwindla Hultman Kramer
049f110344 PipelineTask should not exit when Deepgram TTS returns a Bad Request "unutterable" 2024-06-12 09:24:09 -07:00
Kwindla Hultman Kramer
448a0307a8 rebasing 2024-06-12 07:54:18 -07:00
Aleix Conchillo Flaqué
7390e42f5c transports(daily): DailyRoomProperties now allow extra unknown parameters 2024-06-11 22:31:32 -07:00
Aleix Conchillo Flaqué
ee880d229f Merge pull request #223 from pipecat-ai/aleix/fix-lower-vad-stop-secs
processors: fix LLMResponseAggregator with lower VAD values
2024-06-12 13:30:34 +08:00
Aleix Conchillo Flaqué
9cd07d81f8 processors: fix LLMResponseAggregator with lower VAD values 2024-06-11 22:30:06 -07:00
Aleix Conchillo Flaqué
b453d089c3 Merge pull request #226 from pipecat-ai/aleix/chunk-audio-output
transport: chunk longer audio frames
2024-06-12 13:28:28 +08:00
Aleix Conchillo Flaqué
7410fe1d1e transport: chunk longer audio frames 2024-06-11 17:50:51 -07:00
Aleix Conchillo Flaqué
6323a77431 Merge pull request #224 from pipecat-ai/aleix/deepgram-stt-simple
deepgram stt simple
2024-06-12 08:48:19 +08:00
Aleix Conchillo Flaqué
0aedaa8553 services(deepgram): abstract StartFrame/EndFrame/CancelFrame 2024-06-10 21:18:42 -07:00
Aleix Conchillo Flaqué
6554479d39 transports: don't queue system frames 2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué
ce2ebd3198 examples: updated 07c-interruptible-deepgram to usee DeepgramSTTService 2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué
13ea1efc96 examples: add new 13b-deepgram-transcription 2024-06-10 21:00:01 -07:00
Aleix Conchillo Flaqué
ef380321cf services: added new DeepgramSTTService 2024-06-10 21:00:01 -07:00
Kwindla Hultman Kramer
294b037730 configurable deepgram base url 2024-06-08 09:38:48 -04:00
Aleix Conchillo Flaqué
7603996612 Merge pull request #220 from pipecat-ai/aleix/pipecat-0.0.29
update CHANGELOG for 0.0.29
2024-06-08 04:43:52 +08:00
Aleix Conchillo Flaqué
3048d2b0b1 update CHANGELOG for 0.0.29 2024-06-07 13:43:00 -07:00
Aleix Conchillo Flaqué
0bb47a09d2 Merge pull request #218 from pipecat-ai/aleix/send-inital-metrics-mapping
send inital metrics mapping
2024-06-08 04:41:59 +08:00
Aleix Conchillo Flaqué
1afe6901d9 processors: add processors_with_metrics() and can_generate_metrics() 2024-06-07 13:38:21 -07:00
Aleix Conchillo Flaqué
3e019fb512 services(openai): remove unused _chat_completions 2024-06-07 13:18:11 -07:00
Aleix Conchillo Flaqué
e069aa9608 updated CHANGELOG with BasePipeline 2024-06-07 13:18:09 -07:00
Aleix Conchillo Flaqué
0b32e42d25 transports(daily): fix extra super().process_frame() 2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué
8d18be5069 services(anthropic): fix metrics 2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué
e715d99d0c pipeline: send initial ttfb metrics mapping 2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué
dc28590247 moved ParallelTask to pipecat.pipeline.parallel_task 2024-06-07 13:17:50 -07:00
Aleix Conchillo Flaqué
139f158ea1 Merge pull request #219 from pipecat-ai/aleix/switch-voices
switch voices and languages
2024-06-08 04:13:25 +08:00
Aleix Conchillo Flaqué
4b2a18837f services(whisper): add text logging 2024-06-07 13:12:51 -07:00
Aleix Conchillo Flaqué
b4340d0185 services(whisper): increase no speech probability to 0.4 2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué
90d11398e6 examples: add 15a-switch-languages 2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué
bf8c73b25b examples: add 15-switch-voices 2024-06-07 13:12:21 -07:00
Aleix Conchillo Flaqué
21cd21de1b processors(filters): add FunctionFilter 2024-06-07 13:12:18 -07:00
Aleix Conchillo Flaqué
c25f6e56e7 Merge pull request #217 from pipecat-ai/khk-tts-timings
Added TTFB timings for all TTS services
2024-06-07 05:42:52 +08:00
Aleix Conchillo Flaqué
a1f1d1995c transports: allow sending metrics 2024-06-06 14:35:34 -07:00
Aleix Conchillo Flaqué
390582d7f3 services: use start/stop_ttfb_metrics to report TTFB metrics 2024-06-06 14:00:10 -07:00
Aleix Conchillo Flaqué
e765a29ca2 processors: implement base process_frame(). all subsclassed should call it 2024-06-06 10:54:21 -07:00
Kwindla Hultman Kramer
cf5c244487 Merge branch 'main' into khk-tts-timings 2024-06-06 13:05:42 -04:00
Kwindla Hultman Kramer
a5eb30a93d changelog 2024-06-06 11:49:05 -04:00
Kwindla Hultman Kramer
ac7bc35944 azure tts ttfb 2024-06-06 11:45:48 -04:00
Kwindla Hultman Kramer
ddfd721f6e openai tts ttfb 2024-06-06 11:32:47 -04:00
Kwindla Hultman Kramer
aee3916cd1 cartesia async fixed 2024-06-06 11:24:26 -04:00
Kwindla Hultman Kramer
3eff1e559b pipecat async working, but maybe needs a threaded implementation 2024-06-06 11:11:06 -04:00
Kwindla Hultman Kramer
1a542c91fa temp commit, woring on playht 2024-06-06 10:48:22 -04:00
Aleix Conchillo Flaqué
cd60a84f8a Merge pull request #215 from pipecat-ai/aleix/silero-vad-memory-fix
vad(silero): fix memory issue
2024-06-06 05:50:47 +08:00
Aleix Conchillo Flaqué
3dd4bac6e6 vad(silero): fix memory issue 2024-06-05 14:50:28 -07:00
Kwindla Hultman Kramer
06ff9cfede added timing logs for cartesia, deepgram, elevenlabs 2024-06-05 16:12:10 -04:00
Aleix Conchillo Flaqué
2d1ed9a304 Merge pull request #214 from pipecat-ai/aleix/pipecat-0.0.27
transports(daily): added participants() and participant_counts()
2024-06-06 03:15:34 +08:00
Aleix Conchillo Flaqué
50b51c05f6 transports(daily): added participants() and participant_counts() 2024-06-05 12:14:00 -07:00
Aleix Conchillo Flaqué
5ce4b8dd5b update CHANGELOG with OpenAITTSService 2024-06-05 11:44:24 -07:00
Aleix Conchillo Flaqué
2f4467b5a5 Merge pull request #213 from pipecat-ai/aleix/pipecat-0.0.26
update CHANGELOG for 0.0.26
2024-06-06 01:10:01 +08:00
Aleix Conchillo Flaqué
e91ab54a69 update CHANGELOG for 0.0.26 2024-06-05 10:07:45 -07:00
Aleix Conchillo Flaqué
6a33432c82 Merge pull request #212 from pipecat-ai/aleix/make-pinlesscallupdate-public
transports(daily): move pinlessCallUpdate to public api
2024-06-05 23:14:14 +08:00
Aleix Conchillo Flaqué
135654a080 transports(daily): move pinlessCallUpdate to public api 2024-06-05 08:08:56 -07:00
Aleix Conchillo Flaqué
7b708a2bee Merge pull request #211 from pipecat-ai/aleix/base-transport-async
various fixes and improvements
2024-06-05 22:57:35 +08:00
Aleix Conchillo Flaqué
b515c28417 services(cartesia): allow output_format and model_id 2024-06-04 19:24:33 -07:00
Aleix Conchillo Flaqué
854ffb0323 update CHANGELOG for DailyRESTHelper 2024-06-04 15:45:17 -07:00
Aleix Conchillo Flaqué
891b7b22ea transports: push EndFrame/CancelFrame before stopping push task 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
c8d37a7227 pipeline(runner): add support for SIGTERM 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
489060881d update macos-py3.10-requirements 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
d56a4cce1b update CHANGELOG with latest changes 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
7eb9dfde38 pyproject: include langchain-community and langchain-openai 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
571e10f83e services(anthropic): fix interruptions with anthropic 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
af202d4fe5 pipeline(task): introduce has_finished() 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
4057fbbcfd transports(tk): fix pyaudio output stream cleanup 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
5cdb8a79a1 examples: use camera_out_is_live for live video 2024-06-04 15:43:54 -07:00
Aleix Conchillo Flaqué
a674b43243 transport: remove redundant camera thread and switch audio pull for push 2024-06-04 15:43:54 -07:00
Jon Taylor
ac41f13b7c Merge pull request #205 from pipecat-ai/daily_rest_helpers
Created REST helpers for Daily covering commonly used methods for running / deployment
2024-06-04 22:26:39 +02:00
Jon Taylor
003b9887b1 made sip and sipuri optional and None 2024-06-04 19:03:58 +02:00
Jon Taylor
ba45c2ab5b addressed review (urllib import and linting 2024-06-04 18:39:35 +02:00
Aleix Conchillo Flaqué
9d36a48a80 Merge pull request #208 from pipecat-ai/aleix/cartesia-voice-load-startup
services(cartesia): load voices on startup
2024-06-04 22:54:25 +08:00
Aleix Conchillo Flaqué
20a525635e Merge pull request #201 from TomTom101/TomTom101/openai_tts
Added OpenAI TTS (#196)
2024-06-04 22:53:56 +08:00
Aleix Conchillo Flaqué
659eceea95 services(cartesia): load voices on startup 2024-06-03 14:08:04 -07:00
TomTom101
d462c03d00 chore: Review comments 2024-06-03 20:13:15 +02:00
Jon Taylor
6591e07eb4 removed hardcoded 'https' from API url 2024-06-03 19:32:14 +02:00
Aleix Conchillo Flaqué
fe71825954 Merge pull request #206 from pipecat-ai/aleix/fix-deepgram-tts
services(deepgram): fixed DeepgramTTSService
2024-06-04 00:28:53 +08:00
Aleix Conchillo Flaqué
43516f84fe services(deepgram): fixed DeepgramTTSService 2024-06-03 07:53:46 -07:00
Jon Taylor
0849edb00b added Daily REST helpers file for common methods used in Pipecat bots 2024-06-03 16:38:13 +02:00
Aleix Conchillo Flaqué
dd3b4083eb Merge pull request #204 from TomTom101/TomTom101/langchain
fix: Fixed imports, support new PipelineParams
2024-06-03 03:16:30 +08:00
TomTom101
89673a4040 test(langchain): Use new PipelineParams in test 2024-06-02 20:19:55 +02:00
TomTom101
410dbd3dfc fix: Fixed imports, support new PipelineParams 2024-06-02 20:16:11 +02:00
TomTom101
7085b1ea3f doc(openai): Added hint re the 24kHz sample rate 2024-06-01 20:35:46 +02:00
TomTom101
8683cae719 feat: OpenAITTS 2024-06-01 10:13:28 +02:00
Aleix Conchillo Flaqué
0197efa524 Merge pull request #200 from pipecat-ai/aleix/changelog-0.0.25
update CHANGELOG.md for version 0.0.25
2024-06-01 07:48:42 +08:00
Aleix Conchillo Flaqué
16e76caa33 update CHANGELOG.md for version 0.0.25 2024-05-31 16:48:03 -07:00
Aleix Conchillo Flaqué
1f5240694d Merge pull request #199 from pipecat-ai/aleix/langchain-changelog
move LangchainProcessor to processors/frameworks and update CHANGELOG
2024-06-01 07:46:51 +08:00
Aleix Conchillo Flaqué
f087151db7 move LangchainProcessor to processors/frameworks and update CHANGELOG 2024-05-31 16:45:39 -07:00
Aleix Conchillo Flaqué
0b691ff597 Merge pull request #198 from pipecat-ai/aleix/websocket-transport
websocket transport support
2024-06-01 04:40:39 +08:00
TomTom101
ae049961b7 wip: untested 2024-05-31 22:30:52 +02:00
Aleix Conchillo Flaqué
0d6eee705f Merge pull request #190 from TomTom101/TomTom101/langchain
Langchain service
2024-06-01 04:21:12 +08:00
Aleix Conchillo Flaqué
58d20ec9dc transport(websocket-server): add on_client_disconnected 2024-05-31 12:52:43 -07:00
Aleix Conchillo Flaqué
38befe1dc1 examples(websocket): rename server.py to bot.py 2024-05-31 12:09:54 -07:00
Aleix Conchillo Flaqué
2f335100a5 remove storage folder 2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué
3fef818843 examples(websocket-server): use VAD analyzer from transport 2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué
428c8af77e transports(websocket): base class from BaseInputTransport 2024-05-31 11:54:18 -07:00
Aleix Conchillo Flaqué
54fccd2e25 pipeline: cleanup processors one by one 2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué
66c6a5dc0f transports(websocket): base class from BaseOutputTransport 2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué
92561ae19d some event loop parameter updates 2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué
b85e93410b transports(daily): fix event handlers callback 2024-05-31 11:37:43 -07:00
Aleix Conchillo Flaqué
593993ba97 transports(base_input): remove unnecessary task 2024-05-31 11:37:41 -07:00
Aleix Conchillo Flaqué
7b8b606278 update CHANGELOG and create websocker-server instructions 2024-05-31 11:37:19 -07:00
Aleix Conchillo Flaqué
7116ad0607 examples: fix websocket-client audio playback 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
c507044277 examples: use gpt-4o model by default 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
5f45a9d90f examples: websocket-server updates 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
e31e87aabd transport(websocket): update audio_frame_size 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
2957416d90 serializers(protobuf): support id and name fields 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
b9b761b67a added sample_rate and num_channels to protobuf AudioRawFrame 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
a7539e9317 transports: simplify and fix async and nested decorators 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
75575c0c68 use get_event_loop() and move event handlers to BaseTransport 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
77b3e08214 examples: add and update wbesocket eaxmples 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
956b783c1a transports: added new WebsocketServerTransport 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
e90c080470 serializers: added BaseSerializer 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
37aabaa03a frames: generate protobuf pb2 file for pipecat package 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
3e289a7bef pyproject: add protobuf dependency 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
6dd5e3fdf5 dev-requirements: add grpcio-tools 2024-05-31 11:36:52 -07:00
Aleix Conchillo Flaqué
e60df3c7c0 Merge pull request #195 from pipecat-ai/aleix/function-calling-move-to-llmservice
function calling move to LLMService
2024-06-01 02:36:29 +08:00
Aleix Conchillo Flaqué
42f772beed examples: some function calling examples cleanup 2024-05-31 11:36:04 -07:00
Aleix Conchillo Flaqué
3655c4a0fc services: move function calling registration to LLMService 2024-05-31 11:36:04 -07:00
Aleix Conchillo Flaqué
012dbffd94 update CHANGELOG.md for function calling 2024-05-31 11:36:03 -07:00
TomTom101
4b39efeee3 fix(langchain): try/catch langchain import in service; Only langchain is installed with the [langchain] extra (#190) 2024-05-31 10:19:27 +02:00
Kwindla Hultman Kramer
19caf750fd Merge pull request #194 from pipecat-ai/khk-cartesia-changelog
Added cartesia line to CHANGELOG.md
2024-05-30 14:18:41 -07:00
Kwindla Hultman Kramer
296611714f added cartesia line to CHANGELOG.md 2024-05-30 10:41:00 -07:00
chadbailey59
4c3d19cc8b Function calling (#175)
* added function calling code back

* removed old llm_context file

* added integration testing for openai

* added function calling example

* added function callbacks

* added function start callback

* fixup

* fixup

* added different return type support for function calling

* intake example working

* added frame loggers

* cleanup

* fixup

* Update openai.py

* removed function call frame types

* fixup

* re-added example

* renumbered wake phrase

* fixup for autopep8

* remove unused imports
2024-05-30 12:25:39 -05:00
Aleix Conchillo Flaqué
a3ba07c7a3 Merge pull request #193 from pipecat-ai/aleix/fix-camera-out-enabled-cpu
transport(output): fix high CPU usage with camera_out_enabled and no …
2024-05-31 01:25:06 +08:00
Kwindla Hultman Kramer
a1579808b2 Merge pull request #189 from pipecat-ai/khk-cartesia-etc
Cartesia TTS
2024-05-30 10:24:45 -07:00
Aleix Conchillo Flaqué
aecb9f5816 transport(output): fix high CPU usage with camera_out_enabled and no images 2024-05-30 10:18:43 -07:00
Aleix Conchillo Flaqué
a5d42a526c Merge pull request #191 from pipecat-ai/aleix/fix-silero-vad
vad: fix silero vad frame processor
2024-05-30 23:25:52 +08:00
Aleix Conchillo Flaqué
a9472f8116 vad: fix silero vad frame processor 2024-05-30 07:50:58 -07:00
TomTom101
b19243ab75 fix: corrected hint to install Langchain libs 2024-05-30 10:53:42 +02:00
TomTom101
2bf094b950 test(langchain): Rewrite to unittest, make it meaningful 2024-05-30 10:43:33 +02:00
Kwindla Hultman Kramer
d5f106ae19 pr fixes 2024-05-29 23:41:35 -07:00
Kwindla Hultman Kramer
920745345a cartesia tts support 2024-05-29 23:35:35 -07:00
TomTom101
143033d7db fix: install langchain-community with the langchain extra 2024-05-30 03:15:14 +02:00
TomTom101
335990c145 wip: hint to install langchain_community 2024-05-30 03:15:14 +02:00
TomTom101
6d24e836b0 wip: Example using LC message history 2024-05-30 03:15:14 +02:00
TomTom101
278a2fed56 wip: First stab at langchain support
Is this a service or processor?
How to deal with conversation history? LC has sophisticated means of this, but might get in the way of `LLMResponseAggregator`
2024-05-30 03:15:14 +02:00
831 changed files with 127934 additions and 15742 deletions

87
.github/ISSUE_TEMPLATE/1-bug_report.yml vendored Normal file
View File

@@ -0,0 +1,87 @@
name: Bug report
description: Report a bug or unexpected behavior
type: Bug
body:
- type: markdown
attributes:
value: |
## Bug Report
Thank you for taking the time to fill out this bug report.
- type: markdown
attributes:
value: |
### Environment
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using?
placeholder: e.g., 0.0.63
validations:
required: true
- type: input
id: python-version
attributes:
label: Python version
description: Which Python version are you using?
placeholder: e.g., 3.12.8
validations:
required: true
- type: input
id: os
attributes:
label: Operating System
description: Which OS are you using?
placeholder: e.g., Ubuntu 24.04, Windows 11, macOS 12.5
validations:
required: true
- type: textarea
id: description
attributes:
label: Issue description
description: Provide a clear description of the issue.
validations:
required: true
- type: textarea
id: repro
attributes:
label: Reproduction steps
description: List the steps to reproduce the issue.
placeholder: |
1. Do this...
2. Then do that...
3. Observe the error...
validations:
required: true
- type: textarea
id: expected
attributes:
label: Expected behavior
description: What did you expect to happen?
validations:
required: true
- type: textarea
id: actual
attributes:
label: Actual behavior
description: What actually happened?
validations:
required: true
- type: textarea
id: logs
attributes:
label: Logs
description: If applicable, include any relevant logs or error messages
render: shell
validations:
required: false

67
.github/ISSUE_TEMPLATE/2-question.yml vendored Normal file
View File

@@ -0,0 +1,67 @@
name: Question
description: Ask a question or get help
type: Question
body:
- type: markdown
attributes:
value: |
## Question
Use this form to ask a question about pipecat.
- type: markdown
attributes:
value: |
### Environment (if applicable)
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using? (if applicable)
placeholder: e.g., 0.0.63
validations:
required: false
- type: input
id: python-version
attributes:
label: Python version
description: Which Python version are you using? (if applicable)
placeholder: e.g., 3.12.8
validations:
required: false
- type: input
id: os
attributes:
label: Operating System
description: Which OS are you using? (if applicable)
placeholder: e.g., Ubuntu 24.04, Windows 11, macOS 12.5
validations:
required: false
- type: textarea
id: question
attributes:
label: Question
description: Provide your question in detail here.
validations:
required: true
- type: textarea
id: tried
attributes:
label: What I've tried
description: Describe what you've already tried or research you've done.
placeholder: I've looked at the documentation and tried...
validations:
required: false
- type: textarea
id: context
attributes:
label: Context
description: Any additional context or information that might help others understand your question better.
validations:
required: false

View File

@@ -0,0 +1,52 @@
name: Feature request
description: Suggest an enhancement or new feature
type: Enhancement
body:
- type: markdown
attributes:
value: |
## Feature Request
Thank you for suggesting an enhancement to pipecat.
- type: textarea
id: problem
attributes:
label: Problem Statement
description: A clear description of the problem this feature would solve.
placeholder: I'm always frustrated when...
validations:
required: true
- type: textarea
id: solution
attributes:
label: Proposed Solution
description: A clear and concise description of what you want to happen.
validations:
required: true
- type: textarea
id: alternatives
attributes:
label: Alternative Solutions
description: Any alternative solutions or features you've considered.
validations:
required: false
- type: textarea
id: context
attributes:
label: Additional Context
description: Add any other context, mockups, or screenshots about the feature request here.
placeholder: You can drag and drop images here to include them.
validations:
required: false
- type: checkboxes
id: contribution
attributes:
label: Would you be willing to help implement this feature?
options:
- label: Yes, I'd like to contribute
- label: No, I'm just suggesting

View File

@@ -0,0 +1,82 @@
name: Service Issue
description: An issue with a third-party service
type: Service Issue
body:
- type: markdown
attributes:
value: |
## Service Issue
Use this form to report an issue with a third-party service integration.
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using?
placeholder: e.g., 0.0.63
validations:
required: true
- type: input
id: service-name
attributes:
label: Service Name
description: Which third-party service is having issues?
placeholder: e.g., OpenAI, ElevenLabs, Anthropic
validations:
required: true
- type: input
id: service-version
attributes:
label: Service or model version
description: Which version of the service API or model are you using?
placeholder: e.g., v1, gpt-4.1
validations:
required: false
- type: textarea
id: description
attributes:
label: Issue Description
description: Provide a clear description of the service issue.
validations:
required: true
- type: textarea
id: reproduction
attributes:
label: Reproduction Steps
description: Provide steps to reproduce the issue.
placeholder: |
1. Configure service X
2. Call method Y
3. See error Z
validations:
required: true
- type: textarea
id: expected
attributes:
label: Expected Behavior
description: What did you expect to happen?
validations:
required: true
- type: textarea
id: actual
attributes:
label: Actual Behavior
description: What actually happened?
validations:
required: true
- type: textarea
id: logs
attributes:
label: Error Logs
description: If available, include any error messages or logs.
render: shell
validations:
required: false

View File

@@ -0,0 +1,56 @@
name: New Service
description: Request to support a new third-party service
type: New Service
body:
- type: markdown
attributes:
value: |
## New Service Request
Use this form to request support for a new third-party service in pipecat.
- type: input
id: service-name
attributes:
label: Service Name
description: What is the name of the third-party service?
placeholder: e.g., NewAPI, SomeService
validations:
required: true
- type: input
id: service-website
attributes:
label: Service Website
description: Link to the service's website or documentation
placeholder: e.g., https://newapi.com
validations:
required: true
- type: textarea
id: service-description
attributes:
label: Service Description
description: Briefly describe what this service does and how it works.
validations:
required: true
- type: textarea
id: api-info
attributes:
label: API Information
description: If available, provide details about the service's API.
placeholder: |
- API documentation link
- Authentication method
- Key endpoints you'd like supported
validations:
required: false
- type: checkboxes
id: contribution
attributes:
label: Would you be willing to help implement this service?
options:
- label: Yes, I'd like to contribute
- label: No, I'm just suggesting

74
.github/ISSUE_TEMPLATE/6-dependency.yml vendored Normal file
View File

@@ -0,0 +1,74 @@
name: Dependency Issue
description: An issue with a Pipecat dependency (not a third-party service)
type: Dependency Issue
body:
- type: markdown
attributes:
value: |
## Dependency Issue
Use this form to report an issue with a Pipecat dependency.
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using?
placeholder: e.g., 0.0.63
validations:
required: true
- type: input
id: dependency-name
attributes:
label: Dependency Name
description: Which Pipecat dependency is causing the issue?
placeholder: e.g., openai, anthropic, fastapi
validations:
required: true
- type: input
id: dependency-version
attributes:
label: Dependency Version
description: Which version of the dependency are you using?
placeholder: e.g., 1.2.3
validations:
required: true
- type: textarea
id: description
attributes:
label: Issue Description
description: Provide a clear description of the dependency issue.
validations:
required: true
- type: textarea
id: impact
attributes:
label: Impact
description: How is this dependency issue affecting your usage of pipecat?
validations:
required: true
- type: textarea
id: reproduction
attributes:
label: Reproduction Steps
description: If applicable, provide steps to reproduce the issue.
placeholder: |
1. Install dependency X
2. Run command Y
3. See error Z
validations:
required: false
- type: textarea
id: logs
attributes:
label: Error Logs
description: If applicable, include any relevant error messages or logs.
render: shell
validations:
required: false

View File

@@ -0,0 +1,70 @@
name: Troubleshooting
description: Help with a specific use case
type: Troubleshooting
body:
- type: markdown
attributes:
value: |
## Troubleshooting Request
Use this form to get help with a specific use case or implementation.
- type: input
id: pipecat-version
attributes:
label: pipecat version
description: Which version are you using?
placeholder: e.g., 0.0.63
validations:
required: true
- type: input
id: python-version
attributes:
label: Python version
description: Which version of Python are you using?
placeholder: e.g., 3.12.8
validations:
required: true
- type: input
id: os
attributes:
label: Operating System
description: Which OS are you using?
placeholder: e.g., Ubuntu 24.04, Windows 11, macOS 12.5
validations:
required: true
- type: textarea
id: use-case
attributes:
label: Use Case Description
description: Describe what you're trying to accomplish with pipecat.
validations:
required: true
- type: textarea
id: current-approach
attributes:
label: Current Approach
description: What have you tried so far? Include code snippets if relevant.
render: python
validations:
required: true
- type: textarea
id: errors
attributes:
label: Errors or Unexpected Behavior
description: Describe any errors or unexpected behavior you're encountering.
validations:
required: true
- type: textarea
id: additional-context
attributes:
label: Additional Context
description: Any other information that might help us understand your situation.
validations:
required: false

1
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@@ -0,0 +1 @@
blank_issues_enabled: false

1
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1 @@
#### Please describe the changes in your PR. If it is addressing an issue, please reference that as well.

View File

@@ -21,24 +21,20 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Build project
run: |
source .venv/bin/activate
python -m build
- name: Install project and other Python dependencies
run: |
source .venv/bin/activate
pip install --editable .
run: uv build
- name: Install project in editable mode
run: uv pip install --editable .

47
.github/workflows/coverage.yaml vendored Normal file
View File

@@ -0,0 +1,47 @@
name: coverage
on:
workflow_dispatch:
push:
branches:
- main
pull_request:
branches:
- "**"
paths-ignore:
- "docs/**"
jobs:
coverage:
name: "Coverage"
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
version: "latest"
- name: Set up Python
run: uv python install 3.12
- name: Install system packages
run: |
sudo apt-get install -y portaudio19-dev
- name: Install dependencies
run: |
uv sync --group dev --extra anthropic --extra aws --extra google --extra langchain
- name: Run tests with coverage
run: |
uv run coverage run
uv run coverage xml
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
slug: pipecat-ai/pipecat

43
.github/workflows/format.yaml vendored Normal file
View File

@@ -0,0 +1,43 @@
name: format
on:
workflow_dispatch:
push:
branches:
- main
pull_request:
branches:
- "**"
paths-ignore:
- "docs/**"
concurrency:
group: build-format-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
ruff-format:
name: "Code quality checks"
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Ruff formatter
id: ruff-format
run: uv run ruff format --diff
- name: Ruff linter (all rules)
id: ruff-check
run: uv run ruff check

View File

@@ -1,44 +0,0 @@
name: lint
on:
workflow_dispatch:
push:
branches:
- main
pull_request:
branches:
- "**"
paths-ignore:
- "docs/**"
concurrency:
group: build-lint-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
autopep8:
name: "Formatting lints"
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install development Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
- name: autopep8
id: autopep8
run: |
source .venv/bin/activate
autopep8 --max-line-length 100 --exit-code -r -d --exclude "*_pb2.py" -a -a src/
- name: Fail if autopep8 requires changes
if: steps.autopep8.outputs.exit-code == 2
run: exit 1

View File

@@ -5,7 +5,7 @@ on:
inputs:
gitref:
type: string
description: "what git ref to build"
description: "what git tag to build (e.g. v0.0.74)"
required: true
jobs:
@@ -17,23 +17,17 @@ jobs:
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.gitref }}
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Build project
run: |
source .venv/bin/activate
python -m build
run: uv build
- name: Upload wheels
uses: actions/upload-artifact@v4
with:

View File

@@ -1,10 +1,6 @@
name: publish-test
on:
workflow_dispatch:
push:
branches:
- main
on: workflow_dispatch
jobs:
build:
@@ -14,26 +10,18 @@ jobs:
- name: Checkout repo
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.gitref }}
fetch-tags: true
fetch-depth: 100
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Build project
run: |
source .venv/bin/activate
python -m build
run: uv build
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
@@ -43,7 +31,7 @@ jobs:
publish-to-test-pypi:
name: "Publish to Test PyPI"
runs-on: ubuntu-latest
needs: [ build ]
needs: [build]
environment:
name: testpypi
url: https://pypi.org/p/pipecat-ai

View File

@@ -0,0 +1,61 @@
name: Python Compatibility Test
on:
push:
branches: [main, develop]
paths: ['pyproject.toml']
pull_request:
branches: [main, develop]
paths: ['pyproject.toml']
jobs:
test-compatibility:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: ['3.10.18', '3.11.13', '3.12.11', '3.13.5']
name: Python ${{ matrix.python-version }}
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y \
portaudio19-dev \
libcairo2-dev \
libgirepository1.0-dev \
pkg-config
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
version: 'latest'
- name: Set up Python ${{ matrix.python-version }}
run: |
uv python install ${{ matrix.python-version }}
uv python pin ${{ matrix.python-version }}
- name: Test uv sync with all extras (Python < 3.13)
if: "!startsWith(matrix.python-version, '3.13.')"
run: |
uv sync --group dev --all-extras --no-extra krisp
- name: Test uv sync without PyTorch extras (Python 3.13+)
if: startsWith(matrix.python-version, '3.13.')
run: |
uv sync --group dev --all-extras \
--no-extra krisp \
--no-extra ultravox \
--no-extra local-smart-turn \
--no-extra moondream \
--no-extra mlx-whisper
- name: Verify installation
run: |
uv run python --version
uv run python -c "import pipecat; print('✅ Pipecat imports successfully')"

51
.github/workflows/sync-quickstart.yaml vendored Normal file
View File

@@ -0,0 +1,51 @@
name: Sync Quickstart to pipecat-quickstart repo
on:
push:
branches: [main]
paths:
- 'examples/quickstart/**'
workflow_dispatch: # Manual trigger
jobs:
sync-quickstart:
runs-on: ubuntu-latest
steps:
- name: Checkout main repo
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Checkout quickstart repo
uses: actions/checkout@v4
with:
repository: pipecat-ai/pipecat-quickstart
token: ${{ secrets.QUICKSTART_SYNC_TOKEN }}
path: quickstart-repo
- name: Sync files (excluding uv.lock and README.md)
run: |
# Copy all files except uv.lock and README.md
find examples/quickstart -type f \
-not -name "README.md" \
-not -name "uv.lock" \
-exec cp {} quickstart-repo/ \;
- name: Commit and push changes
run: |
cd quickstart-repo
git config user.name "GitHub Action"
git config user.email "action@github.com"
git add .
# Only commit if there are changes
if ! git diff --staged --quiet; then
git commit -m "Sync from pipecat main repo
Updated files from examples/quickstart/
Commit: ${{ github.sha }}
"
git push
else
echo "No changes to sync"
fi

View File

@@ -1,4 +1,4 @@
name: test
name: tests
on:
workflow_dispatch:
@@ -20,30 +20,25 @@ jobs:
name: "Unit and Integration Tests"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Checkout repo
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
version: "latest"
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Cache virtual environment
uses: actions/cache@v3
with:
# We are hashing requirements-dev.txt and requirements-extra.txt which
# contain all dependencies needed to run the tests and examples.
key: venv-${{ runner.os }}-${{ steps.setup_python.outputs.python-version}}-${{ hashFiles('linux-py3.10-requirements.txt') }}-${{ hashFiles('dev-requirements.txt') }}
path: .venv
run: uv python install 3.12
- name: Install system packages
run: sudo apt-get install -y portaudio19-dev
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
sudo apt-get install -y portaudio19-dev
- name: Install dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r linux-py3.10-requirements.txt -r dev-requirements.txt
uv sync --group dev --extra anthropic --extra aws --extra google --extra langchain
- name: Test with pytest
run: |
source .venv/bin/activate
pytest --doctest-modules --ignore-glob="*to_be_updated*" src tests
uv run pytest

28
.gitignore vendored
View File

@@ -4,9 +4,10 @@ __pycache__/
*~
venv
.venv
/.idea
#*#
# Distribution / packaging
# Distribution / Packaging
.Python
build/
develop-eggs/
@@ -27,4 +28,27 @@ share/python-wheels/
MANIFEST
.DS_Store
.env
fly.toml
fly.toml
# Examples
examples/**/node_modules/
examples/**/.expo/
examples/**/dist/
examples/**/npm-debug.*
examples/**/*.jks
examples/**/*.p8
examples/**/*.p12
examples/**/*.key
examples/**/*.mobileprovision
examples/**/*.orig.*
examples/**/web-build/
# macOS
.DS_Store
# Documentation
docs/api/_build/
docs/api/api
# uv
.python-version

8
.pre-commit-config.yaml Normal file
View File

@@ -0,0 +1,8 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.12.1
hooks:
- id: ruff
language_version: python3
args: [--fix]
- id: ruff-format

28
.readthedocs.yaml Normal file
View File

@@ -0,0 +1,28 @@
version: 2
build:
os: ubuntu-22.04
tools:
python: '3.12'
apt_packages:
- portaudio19-dev
- python3-dev
- libasound2-dev
jobs:
post_install:
- pip install uv
- UV_PROJECT_ENVIRONMENT=$READTHEDOCS_VIRTUALENV_PATH uv sync --group docs --all-extras --no-extra krisp --no-extra gstreamer --no-extra ultravox --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
sphinx:
configuration: docs/api/conf.py
fail_on_warning: false
search:
ranking:
api/*: 5
getting-started/*: 4
guides/*: 3
submodules:
include: all
recursive: true

File diff suppressed because it is too large Load Diff

337
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,337 @@
## Contributing to Pipecat
We welcome contributions of all kinds! Your help is appreciated. Follow these steps to get involved:
1. **Fork this repository**: Start by forking the Pipecat Documentation repository to your GitHub account.
2. **Clone the repository**: Clone your forked repository to your local machine.
```bash
git clone https://github.com/your-username/pipecat
```
3. **Create a branch**: For your contribution, create a new branch.
```bash
git checkout -b your-branch-name
```
4. **Make your changes**: Edit or add files as necessary.
5. **Test your changes**: Ensure that your changes look correct and follow the style set in the codebase.
6. **Commit your changes**: Once you're satisfied with your changes, commit them with a meaningful message.
```bash
git commit -m "Description of your changes"
```
7. **Push your changes**: Push your branch to your forked repository.
```bash
git push origin your-branch-name
```
8. **Submit a Pull Request (PR)**: Open a PR from your forked repository to the main branch of this repo.
> Important: Describe the changes you've made clearly!
Our maintainers will review your PR, and once everything is good, your contributions will be merged!
## Dependency Management
This project uses [uv](https://docs.astral.sh/uv/) for dependency management. The `uv.lock` file is committed to ensure reproducible builds.
### Adding or Updating Dependencies
1. Edit `pyproject.toml` to add/update dependencies
2. Run `uv lock` to update the lockfile with new dependency resolution
3. Run `uv sync` to install the updated dependencies locally
4. Always commit both files together:
```bash
git add pyproject.toml uv.lock
git commit -m "feat: add new dependency for feature X"
```
**Important:** Never manually edit `uv.lock`. It's auto-generated by `uv lock`.
## Code Style and Documentation
### Python Code Style
We use Ruff for code linting and formatting. Please ensure your code passes all linting checks before submitting a PR.
### Docstring Conventions
We follow Google-style docstrings with these specific conventions:
**Regular Classes:**
- Class docstring describes the class purpose and key functionality
- `__init__` method has its own docstring with complete `Args:` section documenting all parameters
- All public methods must have docstrings with `Args:` and `Returns:` sections as appropriate
**Dataclasses:**
- Class docstring describes the purpose and documents all fields in a `Parameters:` section
- No `__init__` docstring (auto-generated)
**Properties:**
- Must have docstrings with `Returns:` section
**Abstract Methods:**
- Must have docstrings explaining what subclasses should implement
**`__init__.py` Files:**
- **Skip docstrings** for pure import/re-export modules
- **Add brief docstrings** for top-level packages or those with initialization logic
**Enums:**
- Class docstring describes the enumeration purpose
- Use `Parameters:` section to document each enum value and its meaning
- No `__init__` docstring (Enums don't have custom constructors)
**Code Examples in Docstrings:**
- Use `Examples:` as a section header for multiple examples
- Use descriptive text followed by double colons (`::`) for each example
- **Always include a blank line after the `::"`**
- Indent all code consistently within each block
- Separate multiple examples with blank lines for readability
**Lists and Bullets in Docstrings:**
- Use dashes (`-`) for bullet points, not asterisks (`*`)
- **Add a blank line before bullet lists** when they follow a colon
- Use section headers like "Supported features:" or "Behavior:" before lists
- For complex nested information, consider using paragraph format instead
**Deprecations:**
- Use `warnings.warn()` in code for runtime deprecation warnings
- Add `.. deprecated::` directive in docstrings for documentation visibility
- Include version information and describe current status
- Describe parameters in present tense, use directive to indicate deprecation status
#### Examples:
```python
# Regular class
class MyService(BaseService):
"""Description of what the service does.
Provides detailed explanation of the service's functionality,
key features, and usage patterns.
Supported features:
- Feature one with detailed explanation
- Feature two with additional context
- Feature three for advanced use cases
"""
def __init__(self, param1: str, old_param: str = None, **kwargs):
"""Initialize the service.
Args:
param1: Description of param1.
old_param: Controls legacy behavior.
.. deprecated:: 1.2.0
This parameter no longer has any effect and will be removed in version 2.0.
**kwargs: Additional arguments passed to parent.
"""
if old_param is not None:
import warnings
warnings.warn(
"Parameter 'old_param' is deprecated and will be removed in version 2.0.",
DeprecationWarning,
)
super().__init__(**kwargs)
@property
def sample_rate(self) -> int:
"""Get the current sample rate.
Returns:
The sample rate in Hz.
"""
return self._sample_rate
async def process_data(self, data: str) -> bool:
"""Process the provided data.
Args:
data: The data to process.
Returns:
True if processing succeeded.
"""
pass
# Dataclass with code examples
@dataclass
class MessageFrame:
"""Frame containing messages in OpenAI format.
Supports both simple and content list message formats.
Example::
[
{"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi there!"}
]
Parameters:
messages: List of messages in OpenAI format.
"""
messages: List[dict]
# Enum class
class Status(Enum):
"""Status codes for processing operations.
Parameters:
PENDING: Operation is queued but not started.
RUNNING: Operation is currently in progress.
COMPLETED: Operation finished successfully.
FAILED: Operation encountered an error.
"""
PENDING = "pending"
RUNNING = "running"
COMPLETED = "completed"
FAILED = "failed"
```
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
- Demonstrating empathy and kindness toward other people
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
- Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
- The use of sexualized language or imagery, and sexual attention or advances of
any kind
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or email address,
without their explicit permission
- Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at pipecat-ai@daily.co.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations

View File

@@ -1,40 +0,0 @@
# setup
FROM python:3.11.5
WORKDIR /app
COPY requirements.txt /app
COPY *.py /app
COPY pyproject.toml /app
COPY src/ /app/src/
COPY examples/ /app/examples/
WORKDIR /app
RUN ls --recursive /app/
RUN pip3 install --upgrade -r requirements.txt
RUN python -m build .
RUN pip3 install .
RUN pip3 install gunicorn
# If running on Ubuntu, Azure TTS requires some extra config
# https://learn.microsoft.com/en-us/azure/ai-services/speech-service/quickstarts/setup-platform?pivots=programming-language-python&tabs=linux%2Cubuntu%2Cdotnetcli%2Cdotnet%2Cjre%2Cmaven%2Cnodejs%2Cmac%2Cpypi
RUN wget -O - https://www.openssl.org/source/openssl-1.1.1w.tar.gz | tar zxf -
WORKDIR openssl-1.1.1w
RUN ./config --prefix=/usr/local
RUN make -j $(nproc)
RUN make install_sw install_ssldirs
RUN ldconfig -v
ENV SSL_CERT_DIR=/etc/ssl/certs
#ENV LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
RUN apt clean
RUN apt-get update
RUN apt-get -y install build-essential libssl-dev ca-certificates libasound2 wget
ENV PYTHONUNBUFFERED=1
WORKDIR /app
EXPOSE 8000
# run
CMD ["gunicorn", "--workers=2", "--log-level", "debug", "--chdir", "examples/server", "--capture-output", "daily-bot-manager:app", "--bind=0.0.0.0:8000"]

View File

@@ -1,6 +1,6 @@
BSD 2-Clause License
Copyright (c) 2024, Kwindla Hultman Kramer
Copyright (c) 20242025, Daily
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

4
MANIFEST.in Normal file
View File

@@ -0,0 +1,4 @@
prune docs
prune examples
prune scripts
prune tests

331
README.md
View File

@@ -1,221 +1,254 @@
<div align="center">
 <img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
</div>
<h1><div align="center">
<img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
</div></h1>
# Pipecat
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![codecov](https://codecov.io/gh/pipecat-ai/pipecat/graph/badge.svg?token=LNVUIVO4Y9)](https://codecov.io/gh/pipecat-ai/pipecat) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/pipecat-ai/pipecat)
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) [![Discord](https://img.shields.io/discord/1239284677165056021
)](https://discord.gg/pipecat)
# 🎙️ Pipecat: Real-Time Voice & Multimodal AI Agents
`pipecat` is a framework for building voice (and multimodal) conversational agents. Things like personal coaches, meeting assistants, [story-telling toys for kids](https://storytelling-chatbot.fly.dev/), customer support bots, [intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0), and snarky social companions.
**Pipecat** is an open-source Python framework for building real-time voice and multimodal conversational agents. Orchestrate audio and video, AI services, different transports, and conversation pipelines effortlessly—so you can focus on what makes your agent unique.
Take a look at some example apps:
> Want to dive right in? Try the [quickstart](https://docs.pipecat.ai/getting-started/quickstart).
## 🚀 What You Can Build
- **Voice Assistants** natural, streaming conversations with AI
- **AI Companions** coaches, meeting assistants, characters
- **Multimodal Interfaces** voice, video, images, and more
- **Interactive Storytelling** creative tools with generative media
- **Business Agents** customer intake, support bots, guided flows
- **Complex Dialog Systems** design logic with structured conversations
🧭 Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
🔍 Looking for help debugging your pipeline and processors? Check out [Whisker](https://github.com/pipecat-ai/whisker), a real-time Pipecat debugger.
## 🧠 Why Pipecat?
- **Voice-first**: Integrates speech recognition, text-to-speech, and conversation handling
- **Pluggable**: Supports many AI services and tools
- **Composable Pipelines**: Build complex behavior from modular components
- **Real-Time**: Ultra-low latency interaction with different transports (e.g. WebSockets or WebRTC)
## 📱 Client SDKs
You can connect to Pipecat from any platform using our official SDKs:
<table>
<tr>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/javascript/javascript-original.svg" width="40" height="40" alt="JavaScript"/>
<a href="https://docs.pipecat.ai/client/js/introduction">JavaScript</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React"/>
<a href="https://docs.pipecat.ai/client/react/introduction">React</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React Native"/>
<a href="https://docs.pipecat.ai/client/react-native/introduction">React Native</a>
</td>
</tr>
<tr>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/swift/swift-original.svg" width="40" height="40" alt="Swift"/>
<a href="https://docs.pipecat.ai/client/ios/introduction">Swift</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/kotlin/kotlin-original.svg" width="40" height="40" alt="Kotlin"/>
<a href="https://docs.pipecat.ai/client/android/introduction">Kotlin</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/cplusplus/cplusplus-original.svg" width="40" height="40" alt="JavaScript"/>
<a href="https://docs.pipecat.ai/client/c++/introduction">C++</a>
</td>
</tr>
</table>
## 🎬 See it in action
<p float="left">
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/simple-chatbot/image.png" width="280" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/storytelling-chatbot/image.png" width="280" /></a>
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/simple-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/storytelling-chatbot/image.png" width="400" /></a>
<br/>
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/translation-chatbot/image.png" width="280" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/moondream-chatbot/image.png" width="280" /></a>
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/translation-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/moondream-chatbot/image.png" width="400" /></a>
</p>
## Getting started with voice agents
## 🧩 Available services
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when youre ready. You can also add a 📞 telephone number, 🖼️ image output, 📺 video input, use different LLMs, and more.
| Category | Services |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx) |
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
```shell
# install the module
pip install pipecat-ai
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
# set up an .env file with API keys
cp dot-env.template .env
```
## ⚡ Getting started
By default, in order to minimize dependencies, only the basic framework functionality is available. Some third-party AI services require additional dependencies that you can install with:
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you're ready.
```shell
pip install "pipecat-ai[option,...]"
```
1. Install uv
Your project may or may not need these, so they're made available as optional requirements. Here is a list:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
- **AI services**: `anthropic`, `azure`, `deepgram`, `google`, `fal`, `moondream`, `openai`, `playht`, `silero`, `whisper`
- **Transports**: `local`, `websocket`, `daily`
> **Need help?** Refer to the [uv install documentation](https://docs.astral.sh/uv/getting-started/installation/).
## Code examples
2. Install the module
- [foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
- [example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development
```bash
# For new projects
uv init my-pipecat-app
cd my-pipecat-app
uv add pipecat-ai
## A simple voice agent running locally
# Or for existing projects
uv add pipecat-ai
```
Here is a very basic Pipecat bot that greets a user when they join a real-time session. We'll use [Daily](https://daily.co) for real-time media transport, and [ElevenLabs](https://elevenlabs.io/) for text-to-speech.
3. Set up your environment
```python
#app.py
```bash
cp env.example .env
```
import asyncio
import aiohttp
4. To keep things lightweight, only the core framework is included by default. If you need support for third-party AI services, you can add the necessary dependencies with:
from pipecat.frames.frames import EndFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
```bash
uv add "pipecat-ai[option,...]"
```
async def main():
async with aiohttp.ClientSession() as session:
# Use Daily as a real-time media transport (WebRTC)
transport = DailyTransport(
room_url=...,
token=...,
"Bot Name",
DailyParams(audio_out_enabled=True))
> **Using pip?** You can still use `pip install pipecat-ai` and `pip install "pipecat-ai[option,...]"` to get set up.
# Use Eleven Labs for Text-to-Speech
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=...,
voice_id=...,
)
## 🧪 Code examples
# Simple pipeline that will process text to speech and output the result
pipeline = Pipeline([tts, transport.output()])
- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
- [Example apps](https://github.com/pipecat-ai/pipecat-examples) — complete applications that you can use as starting points for development
# Create Pipecat processor that can run one or more pipelines tasks
runner = PipelineRunner()
## 🛠️ Contributing to the framework
# Assign the task callable to run the pipeline
task = PipelineTask(pipeline)
### Prerequisites
# Register an event handler to play audio when a
# participant joins the transport WebRTC session
@transport.event_handler("on_participant_joined")
async def on_new_participant_joined(transport, participant):
participant_name = participant["info"]["userName"] or ''
# Queue a TextFrame that will get spoken by the TTS service (Eleven Labs)
await task.queue_frames([TextFrame(f"Hello there, {participant_name}!"), EndFrame()])
**Minimum Python Version:** 3.10
**Recommended Python Version:** 3.12
# Run the pipeline task
await runner.run(task)
### Setup Steps
if __name__ == "__main__":
asyncio.run(main())
```
1. Clone the repository and navigate to it:
Run it with:
```bash
git clone https://github.com/pipecat-ai/pipecat.git
cd pipecat
```
```shell
python app.py
```
2. Install development and testing dependencies:
Daily provides a prebuilt WebRTC user interface. Whilst the app is running, you can visit at `https://<yourdomain>.daily.co/<room_url>` and listen to the bot say hello!
```bash
uv sync --group dev --all-extras \
--no-extra gstreamer \
--no-extra krisp \
--no-extra local \
--no-extra ultravox # (ultravox not fully supported on macOS)
```
3. Install the git pre-commit hooks:
## WebRTC for production use
```bash
uv run pre-commit install
```
WebSockets are fine for server-to-server communication or for initial development. But for production use, youll need client-server audio to use a protocol designed for real-time media transport. (For an explanation of the difference between WebSockets and WebRTC, see [this post.](https://www.daily.co/blog/how-to-talk-to-an-llm-with-your-voice/#webrtc))
One way to get up and running quickly with WebRTC is to sign up for a Daily developer account. Daily gives you SDKs and global infrastructure for audio (and video) routing. Every account gets 10,000 audio/video/transcription minutes free each month.
Sign up [here](https://dashboard.daily.co/u/signup) and [create a room](https://docs.daily.co/reference/rest-api/rooms) in the developer Dashboard.
## What is VAD?
Voice Activity Detection &mdash; very important for knowing when a user has finished speaking to your bot. If you are not using press-to-talk, and want Pipecat to detect when the user has finished talking, VAD is an essential component for a natural feeling conversation.
Pipecast makes use of WebRTC VAD by default when using a WebRTC transport layer. Optionally, you can use Silero VAD for improved accuracy at the cost of higher CPU usage.
```shell
pip install pipecat-ai[silero]
```
The first time your run your bot with Silero, startup may take a while whilst it downloads and caches the model in the background. You can check the progress of this in the console.
## Hacking on the framework itself
_Note that you may need to set up a virtual environment before following the instructions below. For instance, you might need to run the following from the root of the repo:_
```shell
python3 -m venv venv
source venv/bin/activate
```
From the root of this repo, run the following:
```shell
pip install -r dev-requirements.txt -r {env}-requirements.txt
python -m build
```
This builds the package. To use the package locally (eg to run sample files), run
```shell
pip install --editable .
```
If you want to use this package from another directory, you can run:
```shell
pip install path_to_this_repo
```
> **Note**: Some extras (local, gstreamer) require system dependencies. See documentation if you encounter build errors.
### Running tests
From the root directory, run:
To run all tests, from the root directory:
```shell
pytest --doctest-modules --ignore-glob="*to_be_updated*" src tests
```bash
uv run pytest
```
## Setting up your editor
Run a specific test suite:
This project uses strict [PEP 8](https://peps.python.org/pep-0008/) formatting.
```bash
uv run pytest tests/test_name.py
```
### Emacs
### Setting up your editor
You can use [use-package](https://github.com/jwiegley/use-package) to install [py-autopep8](https://codeberg.org/ideasman42/emacs-py-autopep8) package and configure `autopep8` arguments:
This project uses strict [PEP 8](https://peps.python.org/pep-0008/) formatting via [Ruff](https://github.com/astral-sh/ruff).
#### Emacs
You can use [use-package](https://github.com/jwiegley/use-package) to install [emacs-lazy-ruff](https://github.com/christophermadsen/emacs-lazy-ruff) package and configure `ruff` arguments:
```elisp
(use-package py-autopep8
(use-package lazy-ruff
:ensure t
:defer t
:hook ((python-mode . py-autopep8-mode))
:hook ((python-mode . lazy-ruff-mode))
:config
(setq py-autopep8-options '("-a" "-a", "--max-line-length=100")))
(setq lazy-ruff-format-command "ruff format")
(setq lazy-ruff-check-command "ruff check --select I"))
```
`autopep8` was installed in the `venv` environment described before, so you should be able to use [pyvenv-auto](https://github.com/ryotaro612/pyvenv-auto) to automatically load that environment inside Emacs.
`ruff` was installed in the `venv` environment described before, so you should be able to use [pyvenv-auto](https://github.com/ryotaro612/pyvenv-auto) to automatically load that environment inside Emacs.
```elisp
(use-package pyvenv-auto
:ensure t
:defer t
:hook ((python-mode . pyvenv-auto-run)))
```
### Visual Studio Code
#### Visual Studio Code
Install the
[autopep8](https://marketplace.visualstudio.com/items?itemName=ms-python.autopep8) extension. Then edit the user settings (_Ctrl-Shift-P_ `Open User Settings (JSON)`) and set it as the default Python formatter, enable formatting on save and configure `autopep8` arguments:
[Ruff](https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff) extension. Then edit the user settings (_Ctrl-Shift-P_ `Open User Settings (JSON)`) and set it as the default Python formatter, and enable formatting on save:
```json
"[python]": {
"editor.defaultFormatter": "ms-python.autopep8",
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true
},
"autopep8.args": [
"-a",
"-a",
"--max-line-length=100"
],
}
```
## Getting help
#### PyCharm
`ruff` was installed in the `venv` environment described before, now to enable autoformatting on save, go to `File` -> `Settings` -> `Tools` -> `File Watchers` and add a new watcher with the following settings:
1. **Name**: `Ruff formatter`
2. **File type**: `Python`
3. **Working directory**: `$ContentRoot$`
4. **Arguments**: `format $FilePath$`
5. **Program**: `$PyInterpreterDirectory$/ruff`
## 🤝 Contributing
We welcome contributions from the community! Whether you're fixing bugs, improving documentation, or adding new features, here's how you can help:
- **Found a bug?** Open an [issue](https://github.com/pipecat-ai/pipecat/issues)
- **Have a feature idea?** Start a [discussion](https://discord.gg/pipecat)
- **Want to contribute code?** Check our [CONTRIBUTING.md](CONTRIBUTING.md) guide
- **Documentation improvements?** [Docs](https://github.com/pipecat-ai/docs) PRs are always welcome
Before submitting a pull request, please check existing issues and PRs to avoid duplicates.
We aim to review all contributions promptly and provide constructive feedback to help get your changes merged.
## 🛟 Getting help
➡️ [Join our Discord](https://discord.gg/pipecat)
➡️ [Read the docs](https://docs.pipecat.ai)
➡️ [Reach us on X](https://x.com/pipecat_ai)

11
codecov.yml Normal file
View File

@@ -0,0 +1,11 @@
coverage:
range: 50..90 # coverage lower than 50 is red, higher than 90 green, between color code
status:
project:
default:
target: auto # auto % coverage target
threshold: 5% # allow for 5% reduction of coverage without failing
# do not run coverage on patch nor changes
patch: false

View File

@@ -1,6 +0,0 @@
autopep8~=2.1.0
build~=1.2.1
pip-tools~=7.4.1
pytest~=8.2.0
setuptools~=69.5.1
setuptools_scm~=8.1.0

View File

@@ -1,10 +0,0 @@
# Pipecat Docs
## [Architecture Overview](architecture.md)
Learn about the thinking behind the framework's design.
## [A Frame's Progress](frame-progress.md)
See how a Frame is processed through a Transport, a Pipeline, and a series of Frame Processors.

20
docs/api/Makefile Normal file
View File

@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

109
docs/api/README.md Normal file
View File

@@ -0,0 +1,109 @@
# Pipecat Documentation
This directory contains the source files for auto-generating Pipecat's server API reference documentation.
## Setup
1. Install documentation dependencies:
```bash
pip install -r requirements.txt
```
2. Make the build scripts executable:
```bash
chmod +x build-docs.sh rtd-test.py
```
## Building Documentation
From this directory, you can build the documentation in several ways:
### Local Build
```bash
# Using the build script (automatically opens docs when done)
./build-docs.sh
# Or directly with sphinx-build
sphinx-build -b html . _build/html -W --keep-going
```
### ReadTheDocs Test Build
To test the documentation build process exactly as it would run on ReadTheDocs:
```bash
./rtd-test.py
```
This script:
- Creates a fresh virtual environment
- Installs all dependencies as specified in requirements files
- Handles conflicting dependencies (like grpcio versions for Riva and PlayHT)
- Builds the documentation in an isolated environment
- Provides detailed logging of the build process
Use this script to verify your documentation will build correctly on ReadTheDocs before pushing changes.
## Viewing Documentation
The built documentation will be available at `_build/html/index.html`. To open:
```bash
# On MacOS
open _build/html/index.html
# On Linux
xdg-open _build/html/index.html
# On Windows
start _build/html/index.html
```
## Directory Structure
```
.
├── api/ # Auto-generated API documentation
├── _build/ # Built documentation
├── _static/ # Static files (images, css, etc.)
├── conf.py # Sphinx configuration
├── index.rst # Main documentation entry point
├── requirements-base.txt # Base documentation dependencies
├── requirements-riva.txt # Riva-specific dependencies
├── requirements-playht.txt # PlayHT-specific dependencies
├── build-docs.sh # Local build script
└── rtd-test.py # ReadTheDocs test build script
```
## Notes
- Documentation is auto-generated from Python docstrings
- Service modules are automatically detected and included
- The build process matches our ReadTheDocs configuration
- Warnings are treated as errors (-W flag) to maintain consistency
- The --keep-going flag ensures all errors are reported
- Dependencies are split into multiple requirements files to handle version conflicts
## Troubleshooting
If you encounter missing service modules:
1. Verify the service is installed with its extras: `pip install pipecat-ai[service-name]`
2. Check the build logs for import errors
3. Ensure the service module is properly initialized in the package
4. Run `./rtd-test.py` to test in an isolated environment matching ReadTheDocs
For dependency conflicts:
1. Check the requirements files for version specifications
2. Use `rtd-test.py` to verify dependency resolution
3. Consider adding service-specific requirements files if needed
For more information:
- [ReadTheDocs Configuration](.readthedocs.yaml)
- [Sphinx Documentation](https://www.sphinx-doc.org/)

27
docs/api/build-docs.sh Executable file
View File

@@ -0,0 +1,27 @@
#!/bin/bash
# Build docs using uv
echo "Installing dependencies with uv..."
uv sync --group docs --all-extras --no-extra krisp --no-extra gstreamer --no-extra ultravox --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
# Check if sphinx-build is available
if ! uv run sphinx-build --version &> /dev/null; then
echo "Error: sphinx-build is not available" >&2
exit 1
fi
# Clean previous build
rm -rf _build
echo "Building documentation..."
# Build docs matching ReadTheDocs configuration
uv run sphinx-build -b html -d _build/doctrees . _build/html -W --keep-going
if [ $? -eq 0 ]; then
echo "Documentation built successfully!"
# Open docs (MacOS)
open _build/html/index.html
else
echo "Documentation build failed!" >&2
exit 1
fi

210
docs/api/conf.py Normal file
View File

@@ -0,0 +1,210 @@
import logging
import os
import sys
from datetime import datetime
from pathlib import Path
# Configure logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger("sphinx-build")
# Add source directory to path
docs_dir = Path(__file__).parent
project_root = docs_dir.parent.parent
sys.path.insert(0, str(project_root / "src"))
# Project information
project = "pipecat-ai"
current_year = datetime.now().year
copyright = f"2024-{current_year}, Daily" if current_year > 2024 else "2024, Daily"
author = "Daily"
# General configuration
extensions = [
"sphinx.ext.autodoc",
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinx.ext.intersphinx",
]
suppress_warnings = [
"autodoc.mocked_object",
"toc.not_included",
]
# Napoleon settings
napoleon_google_docstring = True
napoleon_include_init_with_doc = True
# AutoDoc settings
autodoc_default_options = {
"members": True,
"member-order": "bysource",
"undoc-members": False,
"exclude-members": "__weakref__,model_config",
"show-inheritance": True,
}
# Mock imports for optional dependencies
autodoc_mock_imports = [
# Krisp - has build issues on some platforms
"pipecat_ai_krisp",
"krisp",
# System-specific GUI libraries
"_tkinter",
"tkinter",
# Platform-specific audio libraries (if needed)
"gi",
"gi.require_version",
"gi.repository",
# OpenCV - sometimes has import issues during docs build
"cv2",
# Heavy ML packages excluded from ReadTheDocs
# ultravox dependencies
"vllm",
"vllm.engine.arg_utils",
# local-smart-turn dependencies
"coremltools",
"coremltools.models",
"coremltools.models.MLModel",
"torch",
"torch.nn",
"torch.nn.functional",
"torchaudio",
# moondream dependencies
"transformers",
"transformers.AutoTokenizer",
"transformers.AutoFeatureExtractor",
"AutoFeatureExtractor",
"timm",
"einops",
"intel_extension_for_pytorch",
"huggingface_hub",
# riva dependencies
"riva",
"riva.client",
"riva.client.Auth",
"riva.client.ASRService",
"riva.client.StreamingRecognitionConfig",
"riva.client.RecognitionConfig",
"riva.client.AudioEncoding",
"riva.client.proto.riva_tts_pb2",
"riva.client.SpeechSynthesisService",
# MLX dependencies (Apple Silicon specific)
"mlx",
"mlx_whisper", # Note: might need underscore format too
]
# HTML output settings
html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"] if os.path.exists("_static") else []
autodoc_typehints = "signature" # Show type hints in the signature only, not in the docstring
html_show_sphinx = False
def import_core_modules():
"""Import core pipecat modules for autodoc to discover."""
core_modules = [
"pipecat",
"pipecat.frames",
"pipecat.pipeline",
"pipecat.processors",
"pipecat.services",
"pipecat.transports",
"pipecat.audio",
"pipecat.adapters",
"pipecat.clocks",
"pipecat.metrics",
"pipecat.observers",
"pipecat.runner",
"pipecat.serializers",
"pipecat.sync",
"pipecat.transcriptions",
"pipecat.utils",
]
for module_name in core_modules:
try:
__import__(module_name)
logger.info(f"Successfully imported {module_name}")
except ImportError as e:
logger.warning(f"Failed to import {module_name}: {e}")
def clean_title(title: str) -> str:
"""Automatically clean module titles."""
# Remove everything after space (like 'module', 'processor', etc.)
title = title.split(" ")[0]
# Get the last part of the dot-separated path
parts = title.split(".")
title = parts[-1]
return title
def setup(app):
"""Generate API documentation during Sphinx build."""
from sphinx.ext.apidoc import main
docs_dir = Path(__file__).parent
project_root = docs_dir.parent.parent
output_dir = str(docs_dir / "api")
source_dir = str(project_root / "src" / "pipecat")
# Clean existing files
if Path(output_dir).exists():
import shutil
shutil.rmtree(output_dir)
logger.info(f"Cleaned existing documentation in {output_dir}")
logger.info(f"Generating API documentation...")
logger.info(f"Output directory: {output_dir}")
logger.info(f"Source directory: {source_dir}")
excludes = [
str(project_root / "src/pipecat/pipeline/to_be_updated"),
str(project_root / "src/pipecat/examples"),
str(project_root / "src/pipecat/tests"),
"**/test_*.py",
"**/tests/*.py",
]
try:
main(
[
"-f", # Force overwriting
"-e", # Don't generate empty files
"-M", # Put module documentation before submodule documentation
"--no-toc", # Don't create a table of contents file
"--separate", # Put documentation for each module in its own page
"--module-first", # Module documentation before submodule documentation
"--implicit-namespaces", # Added: Handle implicit namespace packages
"-o",
output_dir,
source_dir,
]
+ excludes
)
logger.info("API documentation generated successfully!")
# Process generated RST files to update titles
for rst_file in Path(output_dir).glob("**/*.rst"): # Changed to recursive glob
content = rst_file.read_text()
lines = content.split("\n")
# Find and clean up the title
if lines and "=" in lines[1]: # Title is typically the first line
old_title = lines[0]
new_title = clean_title(old_title)
content = content.replace(old_title, new_title)
rst_file.write_text(content)
logger.info(f"Updated title: {old_title} -> {new_title}")
except Exception as e:
logger.error(f"Error generating API documentation: {e}", exc_info=True)
import_core_modules()

36
docs/api/index.rst Normal file
View File

@@ -0,0 +1,36 @@
Pipecat API Reference
=====================
Welcome to the Pipecat API reference.
Use the navigation on the left to browse modules, or search using the search box.
**New to Pipecat?** Check out the `main documentation <https://docs.pipecat.ai>`_ for tutorials, guides, and client SDK information.
Quick Links
-----------
* `GitHub Repository <https://github.com/pipecat-ai/pipecat>`_
* `Join our Community <https://discord.gg/pipecat>`_
.. toctree::
:maxdepth: 2
:caption: API Reference
:hidden:
Adapters <api/pipecat.adapters>
Audio <api/pipecat.audio>
Clocks <api/pipecat.clocks>
Extensions <api/pipecat.extensions>
Frames <api/pipecat.frames>
Metrics <api/pipecat.metrics>
Observers <api/pipecat.observers>
Pipeline <api/pipecat.pipeline>
Processors <api/pipecat.processors>
Runner <api/pipecat.runner>
Serializers <api/pipecat.serializers>
Services <api/pipecat.services>
Sync <api/pipecat.sync>
Transcriptions <api/pipecat.transcriptions>
Transports <api/pipecat.transports>
Utils <api/pipecat.utils>

35
docs/api/make.bat Normal file
View File

@@ -0,0 +1,35 @@
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)
if "%1" == "" goto help
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
:end
popd

38
docs/api/rtd-test.sh Executable file
View File

@@ -0,0 +1,38 @@
#!/bin/bash
set -e
# Configuration
DOCS_DIR=$(pwd)
PROJECT_ROOT=$(cd ../../ && pwd)
TEST_DIR="/tmp/rtd-test-$(date +%Y%m%d_%H%M%S)"
echo "Creating test directory: $TEST_DIR"
mkdir -p "$TEST_DIR"
cd "$TEST_DIR"
# Create virtual environment
python -m venv venv
source venv/bin/activate
echo "Installing build dependencies..."
pip install --upgrade pip wheel setuptools
echo "Installing documentation dependencies..."
pip install -r "$DOCS_DIR/requirements.txt"
echo "Building documentation..."
cd "$DOCS_DIR"
sphinx-build -b html . "_build/html"
echo "Build complete. Check _build/html directory for output."
# Print summary
echo -e "\n=== Build Summary ==="
echo "Documentation: $DOCS_DIR/_build/html"
echo "Test environment: $TEST_DIR"
echo -e "\nTo view the documentation:"
echo "open $DOCS_DIR/_build/html/index.html"
# Print installed packages for verification
echo -e "\n=== Installed Packages ==="
pip freeze | grep -E "sphinx|pipecat"

View File

@@ -1,17 +0,0 @@
# Pipecat architecture guide
## Frames
Frames can represent discrete chunks of data, for instance a chunk of text, a chunk of audio, or an image. They can also be used to as control flow, for instance a frame that indicates that there is no more data available, or that a user started or stopped talking. They can also represent more complex data structures, such as a message array used for an LLM completion.
## FrameProcessors
Frame processors operate on frames. Every frame processor implements a `process_frame` method that consumes one frame and produces zero or more frames. Frame processors can do simple transforms, such as concatenating text fragments into sentences, or they can treat frames as input for an AI Service, and emit chat completions based on message arrays or transform text into audio or images.
## Pipelines
Pipelines are lists of frame processors linked together. Frame processors can push frames upstream or downstream to their peers. A very simple pipeline might chain an LLM frame processor to a text-to-speech frame processor, with a transport as an output.
## Transports
Transports provide input and output frame processors to receive or send frames respectively. For example, the `DailyTransport` does this with a WebRTC session joined to a Daily.co room.

View File

@@ -1,46 +0,0 @@
# A Frame's Progress
1. A user says “Hello, LLM” and the cloud transcription service delivers a transcription to the Transport.
![A transcript frame arrives](images/frame-progress-01.png)
2. The Transport places a Transcription frame in the Pipelines source queue.
![Frame in source queue](images/frame-progress-02.png)
3. The Pipeline passes the Transcription frame to the first Frame Processor in its list, the LLM User Message Aggregator.
![To UMA](images/frame-progress-03.png)
4. The LLM User Message Aggregator updates the LLM Context with a `{“user”: “Hello LLM”}` message.
![Update context](images/frame-progress-04.png)
5. The LLM User Message Aggregator yields an LLM Message Frame, containing the updated LLM Context. The Pipeline passes this frame to the LLM Frame Processor.
![Update context](images/frame-progress-05.png)
6. The LLM Frame Processor creates a streaming chat completion based on the LLM context and yields the first chunk of a response, Text Frame with the value “Hi, “. The Pipeline passes this frame to the TTS Frame Processor. The TTS Frame Processor aggregates this response but doesnt yield anything, yet, because its waiting for a full sentence.
![LLM yields Text](images/frame-progress-06.png)
7. The LLM Frame Processor yields another Text Frame with the value “there.”. The Pipeline passes this frame to the TTS Frame Processor.
![LLM yields more Text](images/frame-progress-07.png)
8. The TTS Frame Processor now has a full sentence, so it starts streaming audio based on “Hi, there.” It yields the first chunk of streaming audio as an Audio frame, which the Pipeline passes to the LLM Assistant Message Aggregator.
![TTS yields Audio](images/frame-progress-08.png)
9. The LLM Assistant Message Aggregator doesnt do anything with Audio frames, so it immediately yields the frame, unchanged. This is the convention for all Frame Processors: frames that the processor doesnt process should be immediately yielded.
![pass-through](images/frame-progress-09.png)
10. The Pipeline places the first Audio frame in its sink queue, which is being watched by the Transport. Since the frame is now in a queue, the Pipeline can continue processing other frames. Note that the source and sink queues form a sort of “boundary of concurrent processing” between a Pipeline and the outside world. In a Pipeline, Frames are processed sequentially; once a Frame is on a queue it can be processed in parallel with the frames being processed by the Pipeline. TODO: link to a more in-depth section about this.
![sink queue](images/frame-progress-10.png)
11. The TTS Frame Processor yields another Audio frame as the Transport transmits the first Audio frame.
![parallel audio](images/frame-progress-11.png)
12. As before, the LLM Assistant Message Aggregator immediately yields the Audio frame and the Pipeline places the Audio frame in the sink queue.
![sink queue 2](images/frame-progress-12.png)
13. The TTS Frame Processor has no more frames to yield. The LLM Frame Processor emits an LLM Response End Frame, which the Pipeline passes to the TTS Frame Processor.
![response end](images/frame-progress-13.png)
14. The TTS Frame Processor immediately yields the LLM Response End Frame, so the Pipeline passes it along to the LLM Assistant Message Aggregator. The LLM Assistant Message Aggregator updates the LLM Context with the full response from the LLM. TODO TODO: I realized I forgot that the TSS Frame Processor also yields the Text frames that the LLM emitted so that the LLM Assistant Message Aggregator could accumulate them, arrggh.
![response end](images/frame-progress-14.png)
15. The system is quiet, and waiting for the next message from the Transport.
![response end](images/frame-progress-15.png)

Binary file not shown.

Before

Width:  |  Height:  |  Size: 98 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 91 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 92 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 92 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 98 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 94 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 94 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 95 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 94 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 96 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 110 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 102 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 111 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 117 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 98 KiB

View File

@@ -1,35 +0,0 @@
# Anthropic
ANTHROPIC_API_KEY=...
# Azure
AZURE_SPEECH_REGION=...
AZURE_SPEECH_API_KEY=...
AZURE_CHATGPT_API_KEY=...
AZURE_CHATGPT_ENDPOINT=https://...
AZURE_CHATGPT_MODEL=...
AZURE_DALLE_API_KEY=...
AZURE_DALLE_ENDPOINT=https://...
AZURE_DALLE_MODEL=...
# Daily
DAILY_API_KEY=...
DAILY_SAMPLE_ROOM_URL=https://...
# ElevenLabs
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
# Fal
FAL_KEY=...
# Fireworks
FIREWORKS_API_KEY=...
# PlayHT
PLAY_HT_USER_ID=...
PLAY_HT_API_KEY=...
# OpenAI
OPENAI_API_KEY=...

157
env.example Normal file
View File

@@ -0,0 +1,157 @@
# AI-COUSTICS
AICOUSTICS_LICENSE_KEY=...
# Anthropic
ANTHROPIC_API_KEY=...
# Async
ASYNCAI_API_KEY=...
ASYNCAI_VOICE_ID=...
# AWS
AWS_SECRET_ACCESS_KEY=...
AWS_ACCESS_KEY_ID=...
AWS_REGION=...
# Azure
AZURE_SPEECH_REGION=...
AZURE_SPEECH_API_KEY=...
AZURE_CHATGPT_API_KEY=...
AZURE_CHATGPT_ENDPOINT=https://...
AZURE_CHATGPT_MODEL=...
AZURE_DALLE_API_KEY=...
AZURE_DALLE_ENDPOINT=https://...
AZURE_DALLE_MODEL=...
# Cartesia
CARTESIA_API_KEY=...
# Daily
DAILY_API_KEY=...
DAILY_SAMPLE_ROOM_URL=https://...
# Deepgram
DEEPGRAM_API_KEY=...
# ElevenLabs
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
# Neuphonic
NEUPHONIC_API_KEY=...
# Fal
FAL_KEY=...
# Fireworks
FIREWORKS_API_KEY=...
# Gladia
GLADIA_API_KEY=...
GLADIA_REGION=...
# Google
GOOGLE_API_KEY=...
GOOGLE_CLOUD_PROJECT_ID=...
GOOGLE_TEST_CREDENTIALS=...
GOOGLE_VERTEX_TEST_CREDENTIALS=...
# LMNT
LMNT_API_KEY=...
LMNT_VOICE_ID=...
# Perplexity
PERPLEXITY_API_KEY=...
# PlayHT
PLAYHT_USER_ID=...
PLAYHT_API_KEY=...
# OpenAI
OPENAI_API_KEY=...
# OpenPipe
OPENPIPE_API_KEY=...
# Tavus
TAVUS_API_KEY=...
TAVUS_REPLICA_ID=...
TAVUS_PERSONA_ID=...
# Simli
SIMLI_API_KEY=...
SIMLI_FACE_ID=...
# Krisp
KRISP_MODEL_PATH=...
# DeepSeek
DEEPSEEK_API_KEY=...
# Groq
GROQ_API_KEY=...
# Grok
GROK_API_KEY=...
# Inworld
INWORLD_API_KEY=...
# Together.ai
TOGETHER_API_KEY=...
# Cerebras
CEREBRAS_API_KEY=...
# Fish Audio
FISH_API_KEY=...
# Assembly AI
ASSEMBLYAI_API_KEY=...
# OpenRouter
OPENROUTER_API_KEY=...
# Piper
PIPER_BASE_URL=...
# Smart turn
LOCAL_SMART_TURN_MODEL_PATH=...
FAL_SMART_TURN_API_KEY=...
# Twilio
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
# MiniMax
MINIMAX_API_KEY=...
MINIMAX_GROUP_ID=...
# Sarvam AI
SARVAM_API_KEY=...
# Soniox
SONIOX_API_KEY=
# Speechmatics
SPEECHMATICS_API_KEY=...
# SambaNova
SAMBANOVA_API_KEY=...
# Sentry
SENTRY_DSN=...
# Heygen
HEYGEN_API_KEY=...
# Mistral
MISTRAL_API_KEY=...
# NVIDIA
NVIDIA_API_KEY=...
# Qwen
QWEN_API_KEY=...

View File

@@ -1,84 +1,31 @@
# Pipecat Examples
This directory contains examples to help you learn how to build with Pipecat.
# Pipecat &mdash; Examples
## Getting Started
## Foundational snippets
Small snippets that build on each other, introducing one or two concepts at a time.
New to Pipecat? Start here:
➡️ [Take a look](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational)
- **[Quickstart](quickstart/)** - Get your first voice AI bot running in 5 minutes _(coming soon)_
- **[Client/Server Web](client-server-web/)** - Learn to build web applications with Pipecat's client SDKs _(coming soon)_
- **[Phone Bot with Twilio](phone-bot-twilio/)** - Connect your bot to a phone number _(coming soon)_
## Chatbot examples
Collection of self-contained real-time voice and video AI demo applications built with Pipecat.
## Foundational Examples
### Quickstart
Single-file examples that introduce core Pipecat concepts one at a time. These examples:
Each project has its own set of dependencies and configuration variables. They intentionally avoids shared code across projects &mdash; you can grab whichever demo folder you want to work with as a starting point.
- Build on each other progressively
- Focus on specific features or integrations
- Are used for testing with every Pipecat release
We recommend you start with a virtual environment:
See the **[Foundational Examples README](foundational/)** for the complete list.
```shell
cd pipecat-ai/examples/simple-chatbot
## More Advanced Examples
python -m venv venv
Ready to explore complex use cases? Visit **[pipecat-examples](https://github.com/pipecat-ai/pipecat-examples)** for:
source venv/bin/activate
pip install -r requirements.txt
```
Next, follow the steps in the README for each demo.
Make sure you `pip install -r requirements.txt` for each demo project, so you can be sure to have the necessary service dependencies that extend the functionality of Pipecat. You can read more about the framework architecture [here](https://github.com/pipecat-ai/pipecat/tree/main/docs).
## Projects:
| Project | Description | Services |
| -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------- |
| [Simple Chatbot](simple-chatbot) | Basic voice-driven conversational bot. A good starting point for learning the flow of the framework. | Deepgram, OpenAI, Daily, Daily Prebuilt UI |
| [Storytelling Chatbot](storytelling-chatbot) | Stitches together multiple third-party services to create a collaborative storytime experience. | Deepgram, ElevenLabs, Open AI, Fal, Daily, Custom UI |
| [Translation Chatbot](translation-chatbot) | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI |
| [Moondream Chatbot](moondream-chatbot) | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU** | Deepgram, OpenAI, Moondream, Daily, Daily Prebuilt UI |
| Function-calling Chatbot (TBC) | A chatbot that can call functions in response to user input | Deepgram, OpenAI, Fireworks, Daily, Daily Prebuilt UI |
> [!IMPORTANT]
> These example projects use Daily as a WebRTC transport and can be joined using their hosted Prebuilt UI.
> It provides a quick way to join a real-time session with your bot and test your ideas without building any frontend code. If you'd like to see an example of a custom UI, try Storybot.
## FAQ
### Deployment
For each of these demos we've included a `Dockerfile`. Out of the box, this should provide everything needed to get the respective demo running on a VM:
```shell
docker build username/app:tag .
docker run -p 7860:7860 --env-file ./.env username/app:tag
docker push ...
```
### SSL
If you're working with a custom UI (such as with the Storytelling Chatbot), it's important to ensure your deployment platform supports HTTPS, as accessing user devices such as mics and webcams requires SSL.
If you try to run a custom UI without SSL, you may see an error in the console telling you that `navigator` is undefined, or no devices are available.
### Are these examples production ready?
Yes, kind of.
These demos attempt to keep things simple and are unopinionated regarding environment or scalability.
We're using FastAPI to spawn a subprocess for the bots / agents &mdash; useful for small tests, but not so great for production grade apps with many concurrent users. You can see how this works in each project's `start` endpoint in `server.py`.
Creating virtualized worker pools and on-demand instances is out of scope for these examples, but we hope to add some examples to this repo soon!
For projects that have CUDA as a requirement, such as Moondream Chatbot, be sure to deploy to a GPU-powered platform (such as [fly.io](https://fly.io) or [Runpod](https://runpod.io).)
## Getting help
➡️ [Join our Discord](https://discord.gg/pipecat)
➡️ [Reach us on Twitter](https://x.com/pipecat_ai)
- Production-ready applications
- Multi-platform client implementations
- Telephony integrations
- Multimodal and creative applications
- Deployment and monitoring examples

View File

@@ -0,0 +1,70 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.piper.tts import PiperTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),
"webrtc": lambda: TransportParams(audio_out_enabled=True),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
# Create an HTTP session
async with aiohttp.ClientSession() as session:
tts = PiperTTSService(
base_url=os.getenv("PIPER_BASE_URL"), aiohttp_session=session, sample_rate=24000
)
task = PipelineTask(
Pipeline([tts, transport.output()]),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
# Register an event handler so we can play the audio when the client joins
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
await task.queue_frames([TTSSpeakFrame(f"Hello there!"), EndFrame()])
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,71 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.rime.tts import RimeHttpTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),
"webrtc": lambda: TransportParams(audio_out_enabled=True),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
# Create an HTTP session
async with aiohttp.ClientSession() as session:
tts = RimeHttpTTSService(
api_key=os.getenv("RIME_API_KEY", ""),
voice_id="rex",
aiohttp_session=session,
)
task = PipelineTask(
Pipeline([tts, transport.output()]),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
# Register an event handler so we can play the audio when the client joins
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
await task.queue_frames([TTSSpeakFrame(f"Hello there!"), EndFrame()])
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -1,56 +1,68 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
from pipecat.frames.frames import EndFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from runner import configure
from loguru import logger
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),
"webrtc": lambda: TransportParams(audio_out_enabled=True),
}
async def main(room_url):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url, None, "Say One Thing", DailyParams(audio_out_enabled=True))
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
runner = PipelineRunner()
task = PipelineTask(
Pipeline([tts, transport.output()]),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
task = PipelineTask(Pipeline([tts, transport.output()]))
# Register an event handler so we can play the audio when the client joins
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
await task.queue_frames([TTSSpeakFrame(f"Hello there!"), EndFrame()])
# Register an event handler so we can play the audio when the
# participant joins.
@transport.event_handler("on_participant_joined")
async def on_new_participant_joined(transport, participant):
participant_name = participant["info"]["userName"] or ''
await task.queue_frames([TextFrame(f"Hello there, {participant_name}!"), EndFrame()])
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
await runner.run(task)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url))
from pipecat.runner.run import main
main()

View File

@@ -1,25 +1,23 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
from pipecat.frames.frames import EndFrame, TextFrame
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.audio import LocalAudioTransport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.local.audio import LocalAudioTransport, LocalAudioTransportParams
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
@@ -27,26 +25,24 @@ logger.add(sys.stderr, level="DEBUG")
async def main():
async with aiohttp.ClientSession() as session:
transport = LocalAudioTransport(TransportParams(audio_out_enabled=True))
transport = LocalAudioTransport(LocalAudioTransportParams(audio_out_enabled=True))
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
pipeline = Pipeline([tts, transport.output()])
pipeline = Pipeline([tts, transport.output()])
task = PipelineTask(pipeline)
task = PipelineTask(pipeline)
async def say_something():
await asyncio.sleep(1)
await task.queue_frames([TextFrame("Hello there!"), EndFrame()])
async def say_something():
await asyncio.sleep(1)
await task.queue_frames([TTSSpeakFrame("Hello there, how is it going!"), EndFrame()])
runner = PipelineRunner()
runner = PipelineRunner(handle_sigint=False if sys.platform == "win32" else True)
await asyncio.gather(runner.run(task), say_something())
await asyncio.gather(runner.run(task), say_something())
if __name__ == "__main__":

View File

@@ -0,0 +1,62 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.runner.livekit import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.livekit.transport import LiveKitParams, LiveKitTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
(url, token, room_name) = await configure()
transport = LiveKitTransport(
url=url,
token=token,
room_name=room_name,
params=LiveKitParams(audio_out_enabled=True),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
runner = PipelineRunner()
task = PipelineTask(Pipeline([tts, transport.output()]))
# Register an event handler so we can play the audio when the
# participant joins.
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant_id):
await asyncio.sleep(1)
await task.queue_frame(
TTSSpeakFrame(
"Hello there! How are you doing today? Would you like to talk about the weather?"
)
)
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,65 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.riva.tts import FastPitchTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),
"webrtc": lambda: TransportParams(audio_out_enabled=True),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
tts = FastPitchTTSService(api_key=os.getenv("NVIDIA_API_KEY"))
task = PipelineTask(
Pipeline([tts, transport.output()]),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
# Register an event handler so we can play the audio when the client joins
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
await task.queue_frames([TTSSpeakFrame(f"Hello there!"), EndFrame()])
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -1,68 +1,79 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
from pipecat.frames.frames import EndFrame, LLMMessagesFrame
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import EndFrame, LLMContextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from runner import configure
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(audio_out_enabled=True),
"twilio": lambda: FastAPIWebsocketParams(audio_out_enabled=True),
"webrtc": lambda: TransportParams(audio_out_enabled=True),
}
async def main(room_url):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
None,
"Say One Thing From an LLM",
DailyParams(audio_out_enabled=True))
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4-turbo-preview")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are an LLM in a WebRTC session, and this is a 'hello world' demo. Say hello to the world.",
}]
messages = [
{
"role": "system",
"content": "You are an LLM in a WebRTC session, and this is a 'hello world' demo. Say hello to the world.",
}
]
runner = PipelineRunner()
task = PipelineTask(
Pipeline([llm, tts, transport.output()]),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
task = PipelineTask(Pipeline([llm, tts, transport.output()]))
# Register an event handler so we can play the audio when the client joins
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
await task.queue_frames([LLMContextFrame(LLMContext(messages)), EndFrame()])
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await task.queue_frames([LLMMessagesFrame(messages), EndFrame()])
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url))
from pipecat.runner.run import main
main()

View File

@@ -1,68 +1,83 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.fal import FalImageGenService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.fal.image import FalImageGenService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from runner import configure
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
),
"webrtc": lambda: TransportParams(
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
),
}
async def main(room_url):
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
# Create an HTTP session
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
None,
"Show a still frame image",
DailyParams(
camera_out_enabled=True,
camera_out_width=1024,
camera_out_height=1024
)
)
imagegen = FalImageGenService(
params=FalImageGenService.InputParams(
image_size="square_hd"
),
params=FalImageGenService.InputParams(image_size="square_hd"),
aiohttp_session=session,
key=os.getenv("FAL_KEY"),
)
runner = PipelineRunner()
task = PipelineTask(
Pipeline([imagegen, transport.output()]),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
task = PipelineTask(Pipeline([imagegen, transport.output()]))
# Register an event handler so we can play the audio when the client joins
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
await task.queue_frame(TextFrame("a cat in the style of picasso"))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
# Note that we do not put an EndFrame() item in the pipeline for this demo.
# This means that the bot will stay in the channel until it times out.
# An EndFrame() in the pipeline would cause the transport to shut
# down.
await task.queue_frames([TextFrame("a cat in the style of picasso")])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url))
from pipecat.runner.run import main
main()

View File

@@ -1,27 +1,25 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
import tkinter as tk
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.fal import FalImageGenService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.tk import TkLocalTransport
from pipecat.services.fal.image import FalImageGenService
from pipecat.transports.local.tk import TkLocalTransport, TkTransportParams
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
@@ -35,15 +33,11 @@ async def main():
transport = TkLocalTransport(
tk_root,
TransportParams(
camera_out_enabled=True,
camera_out_width=1024,
camera_out_height=1024))
TkTransportParams(video_out_enabled=True, video_out_width=1024, video_out_height=1024),
)
imagegen = FalImageGenService(
params=FalImageGenService.InputParams(
image_size="square_hd"
),
params=FalImageGenService.InputParams(image_size="square_hd"),
aiohttp_session=session,
key=os.getenv("FAL_KEY"),
)
@@ -56,7 +50,7 @@ async def main():
runner = PipelineRunner()
async def run_tk():
while runner.is_active():
while not task.has_finished():
tk_root.update()
tk_root.update_idletasks()
await asyncio.sleep(0.1)

View File

@@ -0,0 +1,84 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.google.image import GoogleImageGenService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
),
"webrtc": lambda: TransportParams(
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
imagegen = GoogleImageGenService(
api_key=os.getenv("GOOGLE_API_KEY"),
)
task = PipelineTask(
Pipeline([imagegen, transport.output()]),
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
# Register an event handler so we can play the audio when the client joins
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
await task.queue_frame(TextFrame("a cat in the style of picasso"))
await task.queue_frame(TextFrame("a dog in the style of picasso"))
await task.queue_frame(TextFrame("a fish in the style of picasso"))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,178 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import asyncio
import os
from contextlib import asynccontextmanager
from typing import Dict
import uvicorn
from dotenv import load_dotenv
from fastapi import BackgroundTasks, FastAPI
from fastapi.responses import RedirectResponse
from loguru import logger
from pipecat_ai_small_webrtc_prebuilt.frontend import SmallWebRTCPrebuiltUI
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.smallwebrtc.connection import IceServer, SmallWebRTCConnection
from pipecat.transports.smallwebrtc.transport import SmallWebRTCTransport
load_dotenv(override=True)
app = FastAPI()
# Store connections by pc_id
pcs_map: Dict[str, SmallWebRTCConnection] = {}
ice_servers = [
IceServer(
urls="stun:stun.l.google.com:19302",
)
]
# Mount the frontend at /
app.mount("/client", SmallWebRTCPrebuiltUI)
async def run_example(webrtc_connection: SmallWebRTCConnection):
logger.info(f"Starting bot")
# Create a transport using the WebRTC connection
transport = SmallWebRTCTransport(
webrtc_connection=webrtc_connection,
params=TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
)
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=False)
await runner.run(task)
@app.get("/", include_in_schema=False)
async def root_redirect():
return RedirectResponse(url="/client/")
@app.post("/api/offer")
async def offer(request: dict, background_tasks: BackgroundTasks):
pc_id = request.get("pc_id")
if pc_id and pc_id in pcs_map:
pipecat_connection = pcs_map[pc_id]
logger.info(f"Reusing existing connection for pc_id: {pc_id}")
await pipecat_connection.renegotiate(
sdp=request["sdp"],
type=request["type"],
restart_pc=request.get("restart_pc", False),
)
else:
pipecat_connection = SmallWebRTCConnection(ice_servers)
await pipecat_connection.initialize(sdp=request["sdp"], type=request["type"])
@pipecat_connection.event_handler("closed")
async def handle_disconnected(webrtc_connection: SmallWebRTCConnection):
logger.info(f"Discarding peer connection for pc_id: {webrtc_connection.pc_id}")
pcs_map.pop(webrtc_connection.pc_id, None)
# Run example function with SmallWebRTC transport arguments.
background_tasks.add_task(run_example, pipecat_connection)
answer = pipecat_connection.get_answer()
# Updating the peer connection inside the map
pcs_map[answer["pc_id"]] = pipecat_connection
return answer
@asynccontextmanager
async def lifespan(app: FastAPI):
yield # Run app
coros = [pc.disconnect() for pc in pcs_map.values()]
await asyncio.gather(*coros)
pcs_map.clear()
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Pipecat Bot Runner")
parser.add_argument(
"--host", default="localhost", help="Host for HTTP server (default: localhost)"
)
parser.add_argument(
"--port", type=int, default=7860, help="Port for HTTP server (default: 7860)"
)
args = parser.parse_args()
uvicorn.run(app, host=args.host, port=args.port)

View File

@@ -1,86 +0,0 @@
#
# Copyright (c) 2024, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import aiohttp
import asyncio
import os
import sys
from pipecat.pipeline.merge_pipeline import SequentialMergePipeline
from pipecat.pipeline.pipeline import Pipeline
from pipecat.frames.frames import EndPipeFrame, LLMMessagesFrame, TextFrame
from pipecat.pipeline.task import PipelineTask
from pipecat.services.azure import AzureLLMService, AzureTTSService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.transport_services import TransportServiceOutput
from pipecat.services.transports.daily_transport import DailyTransport
from runner import configure
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main(room_url: str):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(room_url, None, "Static And Dynamic Speech")
meeting = TransportServiceOutput(transport, mic_enabled=True)
llm = AzureLLMService(
api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
model=os.getenv("AZURE_CHATGPT_MODEL"),
)
azure_tts = AzureTTSService(
api_key=os.getenv("AZURE_SPEECH_API_KEY"),
region=os.getenv("AZURE_SPEECH_REGION"),
)
elevenlabs_tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)
messages = [{"role": "system",
"content": "tell the user a joke about llamas"}]
# Start a task to run the LLM to create a joke, and convert the LLM
# output to audio frames. This task will run in parallel with generating
# and speaking the audio for static text, so there's no delay to speak
# the LLM response.
llm_pipeline = Pipeline([llm, elevenlabs_tts])
llm_task = PipelineTask(llm_pipeline)
await llm_task.queue_frames([LLMMessagesFrame(messages), EndPipeFrame()])
simple_tts_pipeline = Pipeline([azure_tts])
await simple_tts_pipeline.queue_frames(
[
TextFrame("My friend the LLM is going to tell a joke about llamas."),
EndPipeFrame(),
]
)
merge_pipeline = SequentialMergePipeline(
[simple_tts_pipeline, llm_pipeline])
await asyncio.gather(
transport.run(merge_pipeline),
simple_tts_pipeline.run_pipeline(),
llm_pipeline.run_pipeline(),
)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url))

View File

@@ -0,0 +1,107 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.daily import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.daily.transport import DailyLogLevel, DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, token) = await configure(session)
transport = DailyTransport(
room_url,
token,
"Respond bot",
DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
transcription_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
)
transport.set_log_level(DailyLogLevel.Info)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_transcription(participant["id"])
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
await task.cancel()
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,140 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import json
import os
import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
InterruptionFrame,
TranscriptionFrame,
TTSSpeakFrame,
UserStartedSpeakingFrame,
UserStoppedSpeakingFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.livekit import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.livekit.transport import LiveKitParams, LiveKitTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
(url, token, room_name) = await configure()
transport = LiveKitTransport(
url=url,
token=token,
room_name=room_name,
params=LiveKitParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
)
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. "
"Your goal is to demonstrate your capabilities in a succinct way. "
"Your output will be converted to audio so don't include special characters in your answers. "
"Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
)
# Register an event handler so we can play the audio when the
# participant joins.
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant_id):
await asyncio.sleep(1)
await task.queue_frame(
TTSSpeakFrame(
"Hello there! How are you doing today? Would you like to talk about the weather?"
)
)
# Register an event handler to receive data from the participant via text chat
# in the LiveKit room. This will be used to as transcription frames and
# interrupt the bot and pass it to llm for processing and
# then pass back to the participant as audio output.
@transport.event_handler("on_data_received")
async def on_data_received(transport, data, participant_id):
logger.info(f"Received data from participant {participant_id}: {data}")
# convert data from bytes to string
json_data = json.loads(data)
await task.queue_frames(
[
InterruptionFrame(),
UserStartedSpeakingFrame(),
TranscriptionFrame(
user_id=participant_id,
timestamp=json_data["timestamp"],
text=json_data["message"],
),
UserStoppedSpeakingFrame(),
],
)
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -1,51 +1,44 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
from dataclasses import dataclass
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import (
AppFrame,
EndFrame,
DataFrame,
Frame,
ImageRawFrame,
LLMContextFrame,
LLMFullResponseStartFrame,
LLMMessagesFrame,
TextFrame
TextFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.sync_parallel_pipeline import SyncParallelPipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.processors.aggregators.gated import GatedAggregator
from pipecat.processors.aggregators.llm_response import LLMFullResponseAggregator
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.aggregators.sentence import SentenceAggregator
from pipecat.processors.aggregators.parallel_task import ParallelTask
from pipecat.services.openai import OpenAILLMService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.fal import FalImageGenService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaHttpTTSService
from pipecat.services.fal.image import FalImageGenService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from runner import configure
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
@dataclass
class MonthFrame(AppFrame):
class MonthFrame(DataFrame):
month: str
def __str__(self):
@@ -59,6 +52,8 @@ class MonthPrepender(FrameProcessor):
self.prepend_to_next_text_frame = False
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, MonthFrame):
self.most_recent_month = frame.month
elif self.prepend_to_next_text_frame and isinstance(frame, TextFrame):
@@ -71,58 +66,72 @@ class MonthPrepender(FrameProcessor):
await self.push_frame(frame, direction)
async def main(room_url):
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_out_enabled=True,
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
),
"webrtc": lambda: TransportParams(
audio_out_enabled=True,
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
"""Run the Calendar Month Narration bot using WebRTC transport.
Args:
webrtc_connection: The WebRTC connection to use
room_name: Optional room name for display purposes
"""
logger.info(f"Starting bot")
# Create an HTTP session for API calls
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
None,
"Month Narration Bot",
DailyParams(
audio_out_enabled=True,
camera_out_enabled=True,
camera_out_width=1024,
camera_out_height=1024
)
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
tts = CartesiaHttpTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4-turbo-preview")
imagegen = FalImageGenService(
params=FalImageGenService.InputParams(
image_size="square_hd"
),
params=FalImageGenService.InputParams(image_size="square_hd"),
aiohttp_session=session,
key=os.getenv("FAL_KEY"),
)
gated_aggregator = GatedAggregator(
gate_open_fn=lambda frame: isinstance(frame, ImageRawFrame),
gate_close_fn=lambda frame: isinstance(frame, LLMFullResponseStartFrame),
start_open=False
)
sentence_aggregator = SentenceAggregator()
month_prepender = MonthPrepender()
llm_full_response_aggregator = LLMFullResponseAggregator()
pipeline = Pipeline([
llm, # LLM
sentence_aggregator, # Aggregates LLM output into full sentences
ParallelTask( # Run pipelines in parallel aggregating the result
[month_prepender, tts], # Create "Month: sentence" and output audio
[llm_full_response_aggregator, imagegen] # Aggregate full LLM response
),
gated_aggregator, # Queues everything until an image is available
transport.output() # Transport output
])
# With `SyncParallelPipeline` we synchronize audio and images by pushing
# them basically in order (e.g. I1 A1 A1 A1 I2 A2 A2 A2 A2 I3 A3). To do
# that, each pipeline runs concurrently and `SyncParallelPipeline` will
# wait for the input frame to be processed.
#
# Note that `SyncParallelPipeline` requires the last processor in each
# of the pipelines to be synchronous. In this case, we use
# `CartesiaHttpTTSService` and `FalImageGenService` which make HTTP
# requests and wait for the response.
pipeline = Pipeline(
[
llm, # LLM
sentence_aggregator, # Aggregates LLM output into full sentences
SyncParallelPipeline( # Run pipelines in parallel aggregating the result
[month_prepender, tts], # Create "Month: sentence" and output audio
[imagegen], # Generate image
),
transport.output(), # Transport output
]
)
frames = []
for month in [
@@ -146,19 +155,37 @@ async def main(room_url):
}
]
frames.append(MonthFrame(month=month))
frames.append(LLMMessagesFrame(messages))
frames.append(LLMContextFrame(LLMContext(messages)))
frames.append(EndFrame())
task = PipelineTask(
pipeline,
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
runner = PipelineRunner()
# Set up transport event handlers
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Start the month narration once connected
await task.queue_frames(frames)
task = PipelineTask(pipeline)
await task.queue_frames(frames)
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
# Run the pipeline
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url))
from pipecat.runner.run import main
main()

View File

@@ -1,32 +1,38 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import aiohttp
import asyncio
import os
import sys
import tkinter as tk
from pipecat.frames.frames import AudioRawFrame, Frame, URLImageRawFrame, LLMMessagesFrame, TextFrame
from pipecat.pipeline.parallel_pipeline import ParallelPipeline
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.llm_response import LLMFullResponseAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.openai import OpenAILLMService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.fal import FalImageGenService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.tk import TkLocalTransport
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from dotenv import load_dotenv
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
OutputAudioRawFrame,
TextFrame,
TTSAudioRawFrame,
URLImageRawFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.sync_parallel_pipeline import SyncParallelPipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.aggregators.sentence import SentenceAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.cartesia.tts import CartesiaHttpTTSService
from pipecat.services.fal.image import FalImageGenService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.local.tk import TkLocalTransport, TkTransportParams
load_dotenv(override=True)
@@ -42,7 +48,12 @@ async def main():
runner = PipelineRunner()
async def get_month_data(month):
messages = [{"role": "system", "content": f"Describe a nature photograph suitable for use in a calendar, for the month of {month}. Include only the image description with no preamble. Limit the description to one sentence, please.", }]
messages = [
{
"role": "system",
"content": f"Describe a nature photograph suitable for use in a calendar, for the month of {month}. Include only the image description with no preamble. Limit the description to one sentence, please.",
}
]
class ImageDescription(FrameProcessor):
def __init__(self):
@@ -50,6 +61,8 @@ async def main():
self.text = ""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, TextFrame):
self.text = frame.text
await self.push_frame(frame, direction)
@@ -58,12 +71,17 @@ async def main():
def __init__(self):
super().__init__()
self.audio = bytearray()
self.frame = None
async def process_frame(self, frame: Frame, direction: FrameDirection):
if isinstance(frame, AudioRawFrame):
await super().process_frame(frame, direction)
if isinstance(frame, TTSAudioRawFrame):
self.audio.extend(frame.audio)
self.frame = AudioRawFrame(
bytes(self.audio), frame.sample_rate, frame.num_channels)
self.frame = OutputAudioRawFrame(
bytes(self.audio), frame.sample_rate, frame.num_channels
)
await self.push_frame(frame, direction)
class ImageGrabber(FrameProcessor):
def __init__(self):
@@ -71,26 +89,26 @@ async def main():
self.frame = None
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, URLImageRawFrame):
self.frame = frame
await self.push_frame(frame, direction)
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4-turbo-preview")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"))
tts = CartesiaHttpTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
imagegen = FalImageGenService(
params=FalImageGenService.InputParams(
image_size="square_hd"
),
params=FalImageGenService.InputParams(image_size="square_hd"),
aiohttp_session=session,
key=os.getenv("FAL_KEY"))
key=os.getenv("FAL_KEY"),
)
aggregator = LLMFullResponseAggregator()
sentence_aggregator = SentenceAggregator()
description = ImageDescription()
@@ -98,16 +116,30 @@ async def main():
image_grabber = ImageGrabber()
pipeline = Pipeline([
llm,
aggregator,
description,
ParallelPipeline([tts, audio_grabber],
[imagegen, image_grabber])
])
# With `SyncParallelPipeline` we synchronize audio and images by
# pushing them basically in order (e.g. I1 A1 A1 A1 I2 A2 A2 A2 A2
# I3 A3). To do that, each pipeline runs concurrently and
# `SyncParallelPipeline` will wait for the input frame to be
# processed.
#
# Note that `SyncParallelPipeline` requires the last processor in
# each of the pipelines to be synchronous. In this case, we use
# `CartesiaHttpTTSService` and `FalImageGenService` which make HTTP
# requests and wait for the response.
pipeline = Pipeline(
[
llm, # LLM
sentence_aggregator, # Aggregates LLM output into full sentences
description, # Store sentence
SyncParallelPipeline(
[tts, audio_grabber], # Generate and store audio for the given sentence
[imagegen, image_grabber], # Generate and storeimage for the given sentence
),
]
)
task = PipelineTask(pipeline)
await task.queue_frame(LLMMessagesFrame(messages))
await task.queue_frame(LLMContextFrame(LLMContext(messages)))
await task.stop_when_done()
await runner.run(task)
@@ -121,24 +153,23 @@ async def main():
transport = TkLocalTransport(
tk_root,
TransportParams(
TkTransportParams(
audio_out_enabled=True,
camera_out_enabled=True,
camera_out_width=1024,
camera_out_height=1024))
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
),
)
pipeline = Pipeline([transport.output()])
task = PipelineTask(pipeline)
# We only specify 5 months as we create tasks all at once and we might
# get rate limited otherwise.
# We only specify a few months as we create tasks all at once and we
# might get rate limited otherwise.
months: list[str] = [
"January",
"February",
# "March",
# "April",
# "May",
]
# We create one task per month. This will be executed concurrently.
@@ -156,7 +187,7 @@ async def main():
await runner.stop_when_done()
async def run_tk():
while True:
while not task.has_finished():
tk_root.update()
tk_root.update_idletasks()
await asyncio.sleep(0.1)

View File

@@ -1,101 +1,157 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
from pipecat.frames.frames import LLMMessagesFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.llm_response import (
LLMAssistantResponseAggregator,
LLMUserResponseAggregator,
)
from pipecat.processors.logger import FrameLogger
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat.vad.silero import SileroVADAnalyzer
from runner import configure
from loguru import logger
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import Frame, LLMRunFrame, MetricsFrame
from pipecat.metrics.metrics import (
LLMUsageMetricsData,
ProcessingMetricsData,
TTFBMetricsData,
TTSUsageMetricsData,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
class MetricsLogger(FrameProcessor):
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, MetricsFrame):
for d in frame.data:
if isinstance(d, TTFBMetricsData):
print(f"!!! MetricsFrame: {frame}, ttfb: {d.value}")
elif isinstance(d, ProcessingMetricsData):
print(f"!!! MetricsFrame: {frame}, processing: {d.value}")
elif isinstance(d, LLMUsageMetricsData):
tokens = d.value
print(
f"!!! MetricsFrame: {frame}, tokens: {tokens.prompt_tokens}, characters: {tokens.completion_tokens}"
)
elif isinstance(d, TTSUsageMetricsData):
print(f"!!! MetricsFrame: {frame}, characters: {d.value}")
await self.push_frame(frame, direction)
async def main(room_url: str, token):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
token,
"Respond bot",
DailyParams(
audio_out_enabled=True,
transcription_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer()
)
)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4-turbo-preview")
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
fl_in = FrameLogger("Inner")
fl_out = FrameLogger("Outer")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
tma_in = LLMUserResponseAggregator(messages)
tma_out = LLMAssistantResponseAggregator(messages)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
pipeline = Pipeline([
fl_in,
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
ml = MetricsLogger()
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(),
tma_in,
stt,
context_aggregator.user(),
llm,
fl_out,
tts,
ml,
transport.output(),
tma_out
])
context_aggregator.assistant(),
]
)
task = PipelineTask(pipeline)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
transport.capture_participant_transcription(participant["id"])
# Kick off the conversation.
messages.append(
{"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMMessagesFrame(messages)])
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
runner = PipelineRunner()
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
await runner.run(task)
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url, token))
from pipecat.runner.run import main
main()

View File

@@ -1,39 +1,41 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
from PIL import Image
from pipecat.frames.frames import ImageRawFrame, Frame, SystemFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.llm_context import (
LLMAssistantContextAggregator,
LLMUserContextAggregator,
)
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.openai import OpenAILLMService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.transports.services.daily import DailyTransport
from pipecat.transports.services.daily import DailyParams
from runner import configure
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
from loguru import logger
from PIL import Image
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
BotStartedSpeakingFrame,
BotStoppedSpeakingFrame,
Frame,
LLMRunFrame,
OutputImageRawFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
class ImageSyncAggregator(FrameProcessor):
@@ -48,76 +50,125 @@ class ImageSyncAggregator(FrameProcessor):
self._waiting_image_bytes = self._waiting_image.tobytes()
async def process_frame(self, frame: Frame, direction: FrameDirection):
if not isinstance(frame, SystemFrame):
await self.push_frame(ImageRawFrame(image=self._speaking_image_bytes, size=(1024, 1024), format=self._speaking_image_format))
await self.push_frame(frame)
await self.push_frame(ImageRawFrame(image=self._waiting_image_bytes, size=(1024, 1024), format=self._waiting_image_format))
else:
await self.push_frame(frame)
await super().process_frame(frame, direction)
async def main(room_url: str, token):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
token,
"Respond bot",
DailyParams(
audio_out_enabled=True,
camera_out_width=1024,
camera_out_height=1024,
transcription_enabled=True
if isinstance(frame, BotStartedSpeakingFrame):
await self.push_frame(
OutputImageRawFrame(
image=self._speaking_image_bytes,
size=(1024, 1024),
format=self._speaking_image_format,
)
)
)
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)
elif isinstance(frame, BotStoppedSpeakingFrame):
await self.push_frame(
OutputImageRawFrame(
image=self._waiting_image_bytes,
size=(1024, 1024),
format=self._waiting_image_format,
)
)
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4-turbo-preview")
await self.push_frame(frame, direction)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
tma_in = LLMUserContextAggregator(messages)
tma_out = LLMAssistantContextAggregator(messages)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_out_enabled=True,
video_out_width=1024,
video_out_height=1024,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
image_sync_aggregator = ImageSyncAggregator(
os.path.join(os.path.dirname(__file__), "assets", "speaking.png"),
os.path.join(os.path.dirname(__file__), "assets", "waiting.png"),
)
pipeline = Pipeline([
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
image_sync_aggregator = ImageSyncAggregator(
os.path.join(os.path.dirname(__file__), "assets", "speaking.png"),
os.path.join(os.path.dirname(__file__), "assets", "waiting.png"),
)
pipeline = Pipeline(
[
transport.input(),
image_sync_aggregator,
tma_in,
stt,
context_aggregator.user(),
llm,
tts,
image_sync_aggregator,
transport.output(),
tma_out
])
context_aggregator.assistant(),
]
)
task = PipelineTask(pipeline)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
participant_name = participant["info"]["userName"] or ''
transport.capture_participant_transcription(participant["id"])
await task.queue_frames([TextFrame(f"Hi, this is {participant_name}.")])
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
await task.queue_frames([LLMRunFrame()])
runner = PipelineRunner()
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
await runner.run(task)
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url, token))
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,128 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaHttpTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaHttpTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -1,94 +1,155 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
import wave
from pipecat.frames.frames import LLMMessagesFrame
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
LLMFullResponseEndFrame,
LLMFullResponseStartFrame,
LLMRunFrame,
LLMTextFrame,
OutputAudioRawFrame,
TextFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_response import (
LLMAssistantResponseAggregator, LLMUserResponseAggregator)
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat.vad.silero import SileroVADAnalyzer
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from runner import configure
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def main(room_url: str, token):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
token,
"Respond bot",
DailyParams(
audio_out_enabled=True,
transcription_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer()
)
)
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4-turbo-preview")
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
tma_in = LLMUserResponseAggregator(messages)
tma_out = LLMAssistantResponseAggregator(messages)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
pipeline = Pipeline([
transport.input(), # Transport user input
tma_in, # User responses
llm, # LLM
tts, # TTS
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
tma_out # Assistant spoken responses
])
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
transport.capture_participant_transcription(participant["id"])
# Kick off the conversation.
messages.append(
{"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMMessagesFrame(messages)])
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
runner = PipelineRunner()
audio_file_path = os.path.join(os.path.dirname(__file__), "assets", "pre-recorded.wav")
await runner.run(task)
with wave.open(audio_file_path, "rb") as wav_file:
llm_text_frame = TextFrame(text="This is a pre-recorded message.")
llm_text_frame.skip_tts = True
audio_data = wav_file.readframes(wav_file.getnframes())
output_audio_raw_frame = OutputAudioRawFrame(
audio=audio_data, sample_rate=44100, num_channels=1
)
await task.queue_frames(
[
LLMRunFrame(),
LLMFullResponseStartFrame(),
llm_text_frame,
output_audio_raw_frame,
LLMFullResponseEndFrame(),
]
)
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url, token))
from pipecat.runner.run import main
main()

View File

@@ -1,95 +0,0 @@
#
# Copyright (c) 2024, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
from pipecat.frames.frames import LLMMessagesFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_response import (
LLMAssistantResponseAggregator, LLMUserResponseAggregator)
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.anthropic import AnthropicLLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat.vad.silero import SileroVADAnalyzer
from runner import configure
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main(room_url: str, token):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
token,
"Respond bot",
DailyParams(
audio_out_enabled=True,
transcription_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer()
)
)
tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)
llm = AnthropicLLMService(
api_key=os.getenv("ANTHROPIC_API_KEY"),
model="claude-3-opus-20240229")
# todo: think more about how to handle system prompts in a more general way. OpenAI,
# Google, and Anthropic all have slightly different approaches to providing a system
# prompt.
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative, helpful, and brief way. Say hello.",
},
]
tma_in = LLMUserResponseAggregator(messages)
tma_out = LLMAssistantResponseAggregator(messages)
pipeline = Pipeline([
transport.input(), # Transport user input
tma_in, # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
tma_out # Assistant spoken responses
])
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
transport.capture_participant_transcription(participant["id"])
# Kick off the conversation.
await task.queue_frames([LLMMessagesFrame(messages)])
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url, token))

View File

@@ -0,0 +1,180 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import (
LLMUserAggregatorParams,
)
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.base_llm import BaseOpenAILLMService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
"""Speechmatics STT Service Example
This example demonstrates using Speechmatics Speech-to-Text service with speaker diarization and intelligent speaker management. Key features:
1. Speaker Diarization
- Automatically identifies and distinguishes between different speakers
- First speaker is identified as 'S1', others get subsequent IDs
- Uses `enable_diarization` parameter to manage speaker detection
2. Smart Speaker Control
- `focus_speakers` parameter lets you target specific speakers (e.g. ["S1"])
- Other speakers will be wrapped in PASSIVE tags
- Only processes speech from focused speakers
- Words from all speakers are wrapped with XML tags for clear speaker identification
- Other speakers' speech only sent when focused speaker is active
3. Voice Activity Detection
- Built-in VAD using `enable_vad` parameter
- Remove `vad_analyzer` from `transport` config to use module's VAD
- Emits speaker started/stopped events
4. Configuration Options
- `operating_point` parameter defaults to `ENHANCED` for optimal accuracy
- Configurable `end_of_utterance_silence_trigger` (default 0.5s)
- Customizable speaker formatting
- Additional diarization settings available
For detailed information about operating points and configuration:
https://docs.speechmatics.com/rt-api-ref
"""
logger.info(f"Starting bot")
stt = SpeechmaticsSTTService(
api_key=os.getenv("SPEECHMATICS_API_KEY"),
params=SpeechmaticsSTTService.InputParams(
language=Language.EN,
enable_vad=True,
enable_diarization=True,
focus_speakers=["S1"],
end_of_utterance_silence_trigger=0.5,
speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
speaker_passive_format="<PASSIVE><{speaker_id}>{text}</{speaker_id}></PASSIVE>",
),
)
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
model="eleven_turbo_v2_5",
)
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
params=BaseOpenAILLMService.InputParams(temperature=0.75),
)
messages = [
{
"role": "system",
"content": (
"You are a helpful British assistant called Alfred. "
"Your goal is to demonstrate your capabilities in a succinct way. "
"Your output will be converted to audio so don't include special characters in your answers. "
"Always include punctuation in your responses. "
"Give very short replies - do not give longer replies unless strictly necessary. "
"Respond to what the user said in a concise, funny, creative and helpful way. "
"Use `<Sn/>` tags to identify different speakers - do not use tags in your replies. "
"Do not respond to speakers within `<PASSIVE/>` tags unless explicitly asked to. "
),
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(
context,
user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Say a short hello to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,169 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import (
LLMUserAggregatorParams,
)
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.base_llm import BaseOpenAILLMService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
"""Run example using Speechmatics STT.
This example will use diarization within our STT service and output the words spoken by
each individual speaker and wrap them with XML tags for the LLM to process. Note the
instructions in the system context for the LLM. This greatly improves the conversation
experience by allowing the LLM to understand who is speaking in a multi-party call.
By default, this example will use our ENHANCED operating point, which is optimized for
high accuracy. You can change this by setting the `operating_point` parameter to a different
value.
For more information on operating points, see the Speechmatics documentation:
https://docs.speechmatics.com/rt-api-ref
"""
logger.info(f"Starting bot")
stt = SpeechmaticsSTTService(
api_key=os.getenv("SPEECHMATICS_API_KEY"),
params=SpeechmaticsSTTService.InputParams(
language=Language.EN,
enable_diarization=True,
end_of_utterance_silence_trigger=0.5,
speaker_active_format="<{speaker_id}>{text}</{speaker_id}>",
),
)
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
model="eleven_turbo_v2_5",
)
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
params=BaseOpenAILLMService.InputParams(temperature=0.75),
)
messages = [
{
"role": "system",
"content": (
"You are a helpful British assistant called Alfred. "
"Your goal is to demonstrate your capabilities in a succinct way. "
"Your output will be converted to audio so don't include special characters in your answers. "
"Always include punctuation in your responses. "
"Give very short replies - do not give longer replies unless strictly necessary. "
"Respond to what the user said in a concise, funny, creative and helpful way. "
"Use `<Sn/>` tags to identify different speakers - do not use tags in your replies."
),
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(
context,
user_params=LLMUserAggregatorParams(aggregation_timeout=0.005),
)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Say a short hello to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,126 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.soniox.stt import SonioxSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = SonioxSTTService(
api_key=os.getenv("SONIOX_API_KEY"),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,138 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.inworld.tts import InworldTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
# Create an HTTP session
async with aiohttp.ClientSession() as session:
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
# Inworld TTS Service - Unified streaming and non-streaming
# Set streaming=True for real-time audio, streaming=False for complete audio generation
streaming = True # Toggle this to switch between modes
tts = InworldTTSService(
api_key=os.getenv("INWORLD_API_KEY", ""),
aiohttp_session=session,
voice_id="Ashley",
model="inworld-tts-1",
streaming=streaming, # True: real-time chunks, False: complete audio then playback
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are very knowledgable about dogs. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,133 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.asyncai.tts import AsyncAIHttpTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
# Create an HTTP session
async with aiohttp.ClientSession() as session:
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = AsyncAIHttpTTSService(
api_key=os.getenv("ASYNCAI_API_KEY", ""),
voice_id=os.getenv("ASYNCAI_VOICE_ID", "e0f39dc4-f691-4e78-bba5-5c636692cc04"),
aiohttp_session=session,
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,129 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.asyncai.tts import AsyncAITTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = AsyncAITTSService(
api_key=os.getenv("ASYNCAI_API_KEY", ""),
voice_id=os.getenv("ASYNCAI_VOICE_ID", "e0f39dc4-f691-4e78-bba5-5c636692cc04"),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,169 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import datetime
import os
import wave
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.filters.aic_filter import AICFilter
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# Create audio buffer processor so we can hear the audio fitler results.
audiobuffer = AudioBufferProcessor(
num_channels=2, # 1 for mono, 2 for stereo (user left, bot right)
enable_turn_audio=False, # Enable per-turn audio recording
)
def _create_aic_filter() -> AICFilter:
license_key = os.getenv("AICOUSTICS_LICENSE_KEY", "")
return AICFilter(
license_key=license_key,
enhancement_level=1.0,
)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
audio_in_filter=_create_aic_filter(),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
audio_in_filter=_create_aic_filter(),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
audio_in_filter=_create_aic_filter(),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
audiobuffer, # write audio data to a file
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
await audiobuffer.start_recording()
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio, sample_rate, num_channels):
# Save or process the composite audio
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"./conversation_{timestamp}.wav"
# Create the WAV file
with wave.open(filename, "wb") as wf:
wf.setnchannels(num_channels)
wf.setsampwidth(2) # 16-bit audio
wf.setframerate(sample_rate)
wf.writeframes(audio)
logger.info(f"Saved recording to {filename}")
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,157 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMMessagesUpdateFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.frameworks.langchain import LangchainProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
message_store = {}
def get_session_history(session_id: str) -> BaseChatMessageHistory:
if session_id not in message_store:
message_store[session_id] = ChatMessageHistory()
return message_store[session_id]
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
prompt = ChatPromptTemplate.from_messages(
[
(
"system",
"Be nice and helpful. Answer very briefly and without special characters like `#` or `*`. "
"Your response will be synthesized to voice and those characters will create unnatural sounds.",
),
MessagesPlaceholder("chat_history"),
("human", "{input}"),
]
)
chain = prompt | ChatOpenAI(model="gpt-4.1", temperature=0.7)
history_chain = RunnableWithMessageHistory(
chain,
get_session_history,
history_messages_key="chat_history",
input_messages_key="input",
)
lc = LangchainProcessor(history_chain)
context = LLMContext()
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
lc, # Langchain
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
# An `LLMContextFrame` will be picked up by the LangchainProcessor using
# only the content of the last message to inject it in the prompt defined
# above. So no role is required here.
messages = [({"content": "Please briefly introduce yourself to the user."})]
await task.queue_frames([LLMMessagesUpdateFrame(messages, run_llm=True)])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,133 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from deepgram import LiveOptions
from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import (
InterruptionFrame,
LLMRunFrame,
UserStartedSpeakingFrame,
UserStoppedSpeakingFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(
api_key=os.getenv("DEEPGRAM_API_KEY"),
live_options=LiveOptions(vad_events=True, utterance_end_ms="1000"),
)
tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@stt.event_handler("on_speech_started")
async def on_speech_started(stt, *args, **kwargs):
await task.queue_frames([InterruptionFrame(), UserStartedSpeakingFrame()])
@stt.event_handler("on_utterance_end")
async def on_utterance_end(stt, *args, **kwargs):
await task.queue_frames([UserStoppedSpeakingFrame()])
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -1,94 +1,126 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import aiohttp
import os
import sys
from pipecat.frames.frames import LLMMessagesFrame
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_response import (
LLMAssistantResponseAggregator, LLMUserResponseAggregator)
from pipecat.services.deepgram import DeepgramTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat.vad.silero import SileroVADAnalyzer
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from runner import configure
from loguru import logger
from dotenv import load_dotenv
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def main(room_url: str, token):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
token,
"Respond bot",
DailyParams(
audio_out_enabled=True,
transcription_enabled=True,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer()
)
)
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
tts = DeepgramTTSService(
aiohttp_session=session,
api_key=os.getenv("DEEPGRAM_API_KEY"),
voice="aura-helios-en"
)
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4-turbo-preview")
tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
tma_in = LLMUserResponseAggregator(messages)
tma_out = LLMAssistantResponseAggregator(messages)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
pipeline = Pipeline([
transport.input(), # Transport user input
tma_in, # User responses
llm, # LLM
tts, # TTS
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
tma_out # Assistant spoken responses
])
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
transport.capture_participant_transcription(participant["id"])
# Kick off the conversation.
messages.append(
{"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMMessagesFrame(messages)])
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
runner = PipelineRunner()
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
await runner.run(task)
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
(url, token) = configure()
asyncio.run(main(url, token))
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,137 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.elevenlabs.stt import ElevenLabsSTTService
from pipecat.services.elevenlabs.tts import ElevenLabsHttpTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
# Create an HTTP session
async with aiohttp.ClientSession() as session:
stt = ElevenLabsSTTService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
aiohttp_session=session,
)
tts = ElevenLabsHttpTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY", ""),
voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
aiohttp_session=session,
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,129 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY", ""),
voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,129 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.playht.tts import PlayHTHttpTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = PlayHTHttpTTSService(
user_id=os.getenv("PLAYHT_USER_ID"),
api_key=os.getenv("PLAYHT_API_KEY"),
voice_url="s3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json",
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,131 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.playht.tts import PlayHTTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = PlayHTTTSService(
user_id=os.getenv("PLAYHT_USER_ID"),
api_key=os.getenv("PLAYHT_API_KEY"),
voice_url="s3://voice-cloning-zero-shot/e46b4027-b38d-4d24-b292-38fbca2be0ef/original/manifest.json",
params=PlayHTTTSService.InputParams(language=Language.EN),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,135 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.azure.llm import AzureLLMService
from pipecat.services.azure.stt import AzureSTTService
from pipecat.services.azure.tts import AzureTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = AzureSTTService(
api_key=os.getenv("AZURE_SPEECH_API_KEY"),
region=os.getenv("AZURE_SPEECH_REGION"),
)
tts = AzureTTSService(
api_key=os.getenv("AZURE_SPEECH_API_KEY"),
region=os.getenv("AZURE_SPEECH_REGION"),
)
llm = AzureLLMService(
api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
model=os.getenv("AZURE_CHATGPT_MODEL"),
)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,130 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.openai.stt import OpenAISTTService
from pipecat.services.openai.tts import OpenAITTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = OpenAISTTService(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4o-transcribe",
prompt="Expect words related to dogs, such as breed names.",
)
tts = OpenAITTSService(api_key=os.getenv("OPENAI_API_KEY"), voice="ballad")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are very knowledgable about dogs. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_out_sample_rate=24000,
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,134 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import time
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openpipe.llm import OpenPipeLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
timestamp = int(time.time())
llm = OpenPipeLLMService(
api_key=os.getenv("OPENAI_API_KEY"),
openpipe_api_key=os.getenv("OPENPIPE_API_KEY"),
tags={"conversation_id": f"pipecat-{timestamp}"},
)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,132 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.xtts.tts import XTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
# Create an HTTP session
async with aiohttp.ClientSession() as session:
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = XTTSService(
aiohttp_session=session,
voice_id="Claribel Dervla",
base_url="http://localhost:8000",
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,137 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.gladia.config import GladiaInputParams, LanguageConfig
from pipecat.services.gladia.stt import GladiaSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = GladiaSTTService(
api_key=os.getenv("GLADIA_API_KEY", ""),
region=os.getenv("GLADIA_REGION"),
params=GladiaInputParams(
language_config=LanguageConfig(
languages=[Language.EN],
)
),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY", ""),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY", ""))
messages = [
{
"role": "system",
"content": f"You are a helpful LLM. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,125 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.lmnt.tts import LmntTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = LmntTTSService(api_key=os.getenv("LMNT_API_KEY"), voice_id="morgan")
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User respones
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,130 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMUserAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.groq.llm import GroqLLMService
from pipecat.services.groq.stt import GroqSTTService
from pipecat.services.groq.tts import GroqTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = GroqSTTService(api_key=os.getenv("GROQ_API_KEY"))
llm = GroqLLMService(
api_key=os.getenv("GROQ_API_KEY"), model="meta-llama/llama-4-maverick-17b-128e-instruct"
)
tts = GroqTTSService(api_key=os.getenv("GROQ_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(
context, user_params=LLMUserAggregatorParams(aggregation_timeout=0.05)
)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,177 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMMessagesAppendFrame, LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.frameworks.strands_agents import StrandsAgentsProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.aws.stt import AWSTranscribeSTTService
from pipecat.services.aws.tts import AWSPollyTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
# Strands agent setup
try:
from strands import Agent, tool
from strands.models import BedrockModel
except ImportError:
logger.warning("Strands not installed. Please install with: pip install strands-agents")
Agent = None
BedrockModel = None
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
}
def build_agent(model_id: str, max_tokens: int):
"""Create and configure a Strands agent for NAB customer service coaching.
Args:
model_id: The AWS Bedrock model ID to use
max_tokens: Maximum tokens for the model
Returns:
Configured Strands Agent
"""
@tool
def check_weather(location: str) -> str:
if location.lower() == "san francisco":
return "The weather in San Francisco is sunny and 30 degrees."
elif location.lower() == "sydney":
return "The weather in Sydney is cloudy and 20 degrees."
else:
return "I'm not sure about the weather in that location."
agent = Agent(
model=BedrockModel(
model_id=model_id,
max_tokens=max_tokens,
),
tools=[check_weather],
system_prompt="You are a helpful assistant that can check the weather in a given location.",
)
return agent
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = AWSTranscribeSTTService()
tts = AWSPollyTTSService(
region="us-west-2", # only specific regions support generative TTS
voice_id="Joanna",
params=AWSPollyTTSService.InputParams(engine="generative", rate="1.1"),
)
# Create Strands agent processor
try:
agent = build_agent(model_id="us.anthropic.claude-3-5-haiku-20241022-v1:0", max_tokens=8000)
llm = StrandsAgentsProcessor(agent=agent)
logger.info("Successfully created Strands agent for NAB customer service coaching")
except Exception as e:
logger.error(f"Failed to create Strands agent: {e}")
raise ValueError(
"Unable to create Strands processor. Please ensure you have properly "
"installed strands-agents and configured your AWS credentials."
)
# Setup context aggregators for message handling
context = LLMContext()
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # Speech-to-text
context_aggregator.user(), # User responses
llm, # Strands Agents processor
tts, # Text-to-speech
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
await task.queue_frames(
[
LLMMessagesAppendFrame(
messages=[
{
"role": "user",
"content": f"Greet the user and introduce yourself.",
}
],
run_llm=True,
)
]
)
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

Some files were not shown because too many files have changed in this diff Show More