Compare commits

...

131 Commits

Author SHA1 Message Date
Mark Backman
65ce5d0457 Add SelfClosingTagAggregator, example 45, and unit tests 2025-09-10 14:54:37 -04:00
Aleix Conchillo Flaqué
908325484d Merge pull request #2614 from pipecat-ai/aleix/readme-client-sdks-table
README: update clients' table
2025-09-10 10:21:18 -07:00
Mark Backman
dd6ff789c7 Merge pull request #2628 from pipecat-ai/mb/fix-13-push-frame
fix: 13 foundational examples now push frames from TranscriptionLogger
2025-09-10 09:13:04 -07:00
Mark Backman
f4938e0fad fix: 13 foundational examples now push frames from TranscriptionLogger 2025-09-10 10:40:10 -04:00
James Hush
e8f60c7c6f Handle missing rawResponse in transcription messages (#2623)
* Handle missing rawResponse in transcription messages

- Use message.get('rawResponse', {}) to safely access rawResponse field
- Default is_final to False when rawResponse is missing
- Add proper type annotations for better code clarity
- Minor import formatting cleanup

This prevents KeyError crashes when transcription messages from Daily's API
don't include the rawResponse field in edge cases.

* docs: add changelog line
2025-09-10 15:03:23 +08:00
kompfner
38f6e33f97 Merge pull request #2598 from pipecat-ai/pk/deprecate-vision-image-raw-frame
Remove `VisionImageRawFrame`, which was previously being handled dire…
2025-09-08 17:13:28 -04:00
Paul Kompfner
1c3e4e34e5 Minor fix to AWS Bedrock console logging to handle image messages in the context 2025-09-08 17:10:11 -04:00
Paul Kompfner
623c660027 Remove debugging comment 2025-09-08 17:01:51 -04:00
Paul Kompfner
a3e65ab3b5 The VisionImageRawFrame removal and corresponding VisionImageFrameAggregator deprecation will now happen in version 0.0.85 2025-09-08 17:01:47 -04:00
Paul Kompfner
f3a4b416df Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.
Removing `VisionImageRawFrame` lets us simplify LLM services' logic, getting us closer to the idealized architecture where all they care about is handling context frames.

This change is in service of getting us closer to ready to deprecate usage of `OpenAILLMContext` and subclasses in favor of the universal `LLMContext`, at least for the traditional text-to-text LLMs.

Why remove `VisionImageRawFrame` rather than deprecate? It's "internal"—only created by `VisionImageFrameAggregator`—and never intended to be used directly by users (it would be difficult to use directly anyway).

Move the logic that was once in `VisionImageFrameAggregator` directly into the examples. Reasoning:
- If `UserImageRequester` is defined in the examples, it makes sense for `UserImageProcessor` to be too, as it’s the flip side of the same coin, so to speak
- The logic is now pretty trivial
- This kind of one-shot, history-less image-describing pipeline shouldn't be common at all; it's ok for it to live in examples rather than as a dedicated class
- In the short term, this enables us to create `LLMContext`s for services that support it and `OpenAILLMContext`s for services that don't yet (AWS)

This commit also adds missing translation from OpenAI-format image context messages to AWS format. Note that this isn't a wasted effort in the face of the upcoming migration to universal `LLMContext`—this work will be reused as it has to be implemented there too.
2025-09-08 17:00:08 -04:00
Aleix Conchillo Flaqué
aa471a4ef5 update CHANGELOG with LiveKitTransport updates 2025-09-08 13:53:21 -07:00
Aleix Conchillo Flaqué
d55133a44f Merge pull request #2604 from alexyzhou/feature/livekit_video_and_bug_fix
Feature: Add support for livekit video stream and minor bug fixes
2025-09-08 13:51:14 -07:00
Aleix Conchillo Flaqué
0f1cf81691 README: update clients' table 2025-09-08 12:08:32 -07:00
kompfner
ac4d335799 Merge pull request #2613 from pipecat-ai/pk/mistral-message-fixups
Apply additional fixups to context messages to meet Mistral-specific …
2025-09-08 13:59:54 -04:00
Paul Kompfner
e65385c151 Tweak the Mistral-specific context messages fixup logic to handle the (mostly academic) possibility of a "tool" message appearing at the end 2025-09-08 13:55:09 -04:00
Paul Kompfner
0bb7df7a6b Remove stray debugging message 2025-09-08 13:38:26 -04:00
Paul Kompfner
daee1ddf3b Apply additional fixups to context messages to meet Mistral-specific requirements 2025-09-08 11:26:58 -04:00
Aleix Conchillo Flaqué
1cccb97ccf Merge pull request #2608 from pipecat-ai/aleix/deprecate-noisereducefilter
audio(filters): deprecate NoisereduceFilter
2025-09-07 20:54:09 -07:00
Aleix Conchillo Flaqué
d7794abf21 audio(filters): deprecate NoisereduceFilter 2025-09-07 20:52:17 -07:00
Aleix Conchillo Flaqué
6a6a63a532 Merge pull request #2607 from pipecat-ai/aleix/scripts-evals-improve-eval-prompt
scripts(evals): allow user to talk and only eval when needed
2025-09-07 20:49:43 -07:00
Mark Backman
6edb6fed41 Merge pull request #2606 from pipecat-ai/mb/quickstart-lockfile
Remove uv.lock from quickstart
2025-09-07 06:10:14 -07:00
Mark Backman
a537382816 Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)
* Add OpenAI Realtime module

* Add foundational examples for OpenAI Realtime

* Add deprecation warning to OpenAIRealtimeBetaLLMService

* Add deprecation warning to AzureRealtimeBetaLLMService

* Update Changelog
2025-09-07 09:09:57 -04:00
Aleix Conchillo Flaqué
46deaada70 scripts(evals): allow user to talk and only eval when needed 2025-09-06 19:19:08 -07:00
Mark Backman
dbc52bc6b0 Remove uv.lock from quickstart 2025-09-06 11:13:50 -04:00
Alex Zhou
d6432589f6 fix: fix format and lint by ruff 2025-09-06 10:50:47 +08:00
Alex Zhou
13b73d4406 feat: Add support for pipecat video stream; fix the bug of duplicate participants when connecting; fix the bug of RTVI messages sent via livekit messages; 2025-09-06 10:41:33 +08:00
Aleix Conchillo Flaqué
85d8282f7e Merge pull request #2602 from pipecat-ai/aleix/pipecat-0.0.84
update CHANGELOG for 0.0.84
2025-09-05 19:35:26 -07:00
Aleix Conchillo Flaqué
070690ec64 update CHANGELOG for 0.0.84 2025-09-05 18:22:50 -07:00
Aleix Conchillo Flaqué
b9c96fd623 Merge pull request #2601 from pipecat-ai/aleix/daily-python-0.19.9
pyproject: update daily-python to 0.19.9
2025-09-05 18:21:49 -07:00
Aleix Conchillo Flaqué
f8b2ab6331 pyproject: update daily-python to 0.19.9 2025-09-05 18:14:57 -07:00
Mark Backman
ea3f7e3c34 Merge pull request #2600 from pipecat-ai/mb/livekit-dtmf
LiveKitTransport: Add support to send DTMF
2025-09-05 15:25:32 -07:00
Mark Backman
2f44f88b08 LiveKitTransport: Add support to send DTMF 2025-09-05 18:23:04 -04:00
Mark Backman
25747a001b Merge pull request #2599 from pipecat-ai/mb/fix-daily-dtmf
DTMF: Add support for native DTMF implementation where available
2025-09-05 15:20:05 -07:00
Mark Backman
fbe4338440 DTMF: Add support for native DTMF implementation where available 2025-09-05 18:16:56 -04:00
Filipi da Silva Fuchter
64b4c65728 Merge pull request #2595 from pipecat-ai/filipi/heygen_quality
Improving HeyGen example video quality.
2025-09-05 17:19:25 -03:00
kompfner
29442969a9 Merge pull request #2597 from pipecat-ai/pk/fix-anthropic-tool-less-usage
Fix Anthropic tool-less usage
2025-09-05 15:30:29 -04:00
Paul Kompfner
dc2e1d4ad3 Fix Anthropic tool-less usage 2025-09-05 11:47:31 -04:00
Filipi Fuchter
5477dfcbea Improving HeyGen example video quality. 2025-09-05 11:30:01 -03:00
kompfner
516f0e08ab Merge pull request #2590 from pipecat-ai/pk/gemini-multimodal-live-doesnt-support-llm-context
Raise an error when attempting to use Gemini Multimodal Live with uni…
2025-09-05 09:22:33 -04:00
Paul Kompfner
246f9f3325 Raise an error when attempting to use Gemini Multimodal Live with universal LLMContext. This is exactly the same error we already have for the other s2s models, AWS Nova Sonic and OpenAI Realtime, it was just missing from this service. 2025-09-04 16:47:08 -04:00
kompfner
3d850e8cc5 Merge pull request #2574 from pipecat-ai/pk/expand-universal-llm-context-support-to-anthropic
Expand universal `LLMContext` support to Anthropic
2025-09-04 13:09:44 -04:00
Paul Kompfner
6e734a37f9 Fix a bug in AWSBedrockLLMService.run_inference(); it was expecting the wrong format for the system instruction 2025-09-04 13:04:15 -04:00
Paul Kompfner
f72ca2fd7d Remove unnecessary system_instruction argument from run_inference() methods 2025-09-04 13:04:15 -04:00
Paul Kompfner
0826d72f74 Add deprecation warning for using enable_prompt_caching_beta param 2025-09-04 13:04:15 -04:00
Paul Kompfner
ba5ebfa0ec Fixed subtle CHANGELOG conflict after release of 0.0.83: universal LLMContext support for Anthropic didn't make that release. Also, some automatic Prettier fixes. 2025-09-04 13:04:11 -04:00
Paul Kompfner
dc3412b2df Bump a deprecation to 0.0.84, as 0.0.83 just shipped 2025-09-04 13:03:06 -04:00
Paul Kompfner
b2e9fd9341 Rename Anthropic enable_prompt_caching_beta parameter to just enable_prompt_caching 2025-09-04 13:03:06 -04:00
Paul Kompfner
c11b207c97 Add Anthropic to CHANGELOG list of services newly supporting runtime LLM switching 2025-09-04 13:03:06 -04:00
Paul Kompfner
d6205027cf Trivial cleanup 2025-09-04 13:03:06 -04:00
Paul Kompfner
986160c077 Fix a bug where the Anthropic adapter's merge-consecutive-messages-with-the-same-role logic was unintentionally affecting the source LLMContext's messages, resulting in more and more duplication of text with each inference 2025-09-04 13:03:06 -04:00
Paul Kompfner
b56ff86fee Minor refactor of AnthropicLLMAdapter cache-control-marker-adding logic (without really changing its behavior) 2025-09-04 13:03:06 -04:00
Paul Kompfner
5c574eaad9 Add support for universal LLMContext to Anthropic LLM service 2025-09-04 13:03:06 -04:00
Paul Kompfner
2df231143a Add foundational example using Anthropic with universal LLMContext 2025-09-04 13:03:06 -04:00
Aleix Conchillo Flaqué
65298ab792 update CHANGELOG with AWSBedrockLLMService fix 2025-09-04 09:24:55 -07:00
Aleix Conchillo Flaqué
b609b02614 Merge pull request #2568 from ezisezis/fix-bedrock-timeouts
fix timeout handling in AWSBedrockLLMService
2025-09-04 09:23:28 -07:00
Aleix Conchillo Flaqué
f2b50c14d2 Merge pull request #2573 from pipecat-ai/vp-minor-fixes-07s
example 07s: minor typo updates
2025-09-04 09:21:32 -07:00
Aleix Conchillo Flaqué
ee3b023986 update CHANGELOG with OpenAIImageGenService fix 2025-09-04 09:20:02 -07:00
Aleix Conchillo Flaqué
0d9e1190d7 Merge pull request #2583 from sassanh/main
fix: openai image generator now initiates URLImageRawFrame with correct order of arguments
2025-09-04 09:17:51 -07:00
Mark Backman
595a7c7fbe Merge pull request #2587 from pipecat-ai/mb/update-quickstart-0.0.83
Update quickstart pyproject to use 0.0.83
2025-09-04 07:42:56 -07:00
Mark Backman
586586f743 Update quickstart pyproject to use 0.0.83 2025-09-04 10:36:58 -04:00
Mark Backman
a1c6ad539d Merge pull request #2585 from ashotbagh/feat/asyncai-multilingual-support
feat(asyncai): add multilingual TTS support
2025-09-04 05:03:45 -07:00
Ashot
daf7fed8b3 feat(asyncai): add multilingual TTS support 2025-09-04 13:58:50 +04:00
Sassan Haradji
a26647c433 fix: openai image generator now initiates URLImageRawFrame with correct order of arguments 2025-09-04 06:09:57 +03:30
Aleix Conchillo Flaqué
0fab56fc13 Merge pull request #2577 from pipecat-ai/aleix/pipecat-0.0.83
update CHANGELOG for 0.0.83
2025-09-03 16:49:24 -07:00
Aleix Conchillo Flaqué
f0baff94b2 update CHANGELOG for 0.0.83 2025-09-03 16:47:43 -07:00
Aleix Conchillo Flaqué
d146170fd6 Merge pull request #2580 from pipecat-ai/mb/add-cerebras-evals
Add 14k (CerebrasLLMService) to release evals
2025-09-03 15:08:28 -07:00
Filipi da Silva Fuchter
001a2d36e5 Merge pull request #2579 from pipecat-ai/filipi/input_message
Creating InputTransportMessageUrgentFrame
2025-09-03 19:01:07 -03:00
Filipi Fuchter
99e237b1e2 Fixed an issue where messages received from the transport were always being resent. 2025-09-03 18:58:34 -03:00
Aleix Conchillo Flaqué
978f644f19 Merge pull request #2578 from pipecat-ai/aleix/user-speaking-frame
add UserSpeakingFrame and UserStartedSpeakingFrame/UserStopeedSpeakingFrame updates
2025-09-03 14:55:45 -07:00
Aleix Conchillo Flaqué
5a4c6b9618 BaseInputTransport: push UserStartedSpeakingFrame/UserStoppedSpeakingFrame upstream 2025-09-03 14:32:32 -07:00
Mark Backman
977a57c8fb Add 14k (CerebrasLLMService) to release evals 2025-09-03 17:11:38 -04:00
Mark Backman
c64bc5a636 Merge pull request #2576 from joyceerhl/joyce/cerebras-default
fix: update default Cerebras model to GPT-OSS-120B
2025-09-03 14:10:28 -07:00
Joyce Er
eba006d39c Fix nits 2025-09-03 14:07:49 -07:00
Joyce Er
a001f6f193 Switch to GPT-OSS-120B 2025-09-03 14:00:27 -07:00
Aleix Conchillo Flaqué
09d6ec1098 introduce and push UserSpeakingFrame upstream/downstream 2025-09-03 13:56:01 -07:00
Aleix Conchillo Flaqué
f56be9315a Merge pull request #2572 from pipecat-ai/aleix/deepgram-disconnect-task
ParallePipeline: wait for CancelFrame in all branches
2025-09-03 13:10:55 -07:00
Mark Backman
8e5880b2e7 Merge pull request #2575 from pipecat-ai/mb/fix-foundational-frame-direction
fix: Specify frame direction in 06a push_frame
2025-09-03 12:46:10 -07:00
Joyce Er
d8ac6f2c1a fix: update default Cerebras model to Qwen 3 32B 2025-09-03 12:23:36 -07:00
Mark Backman
052ffe8712 fix: Specify frame direction in 06a push_frame 2025-09-03 15:07:05 -04:00
Aleix Conchillo Flaqué
b52296450c DeepgramSTTService: remove raising CancelledError 2025-09-03 11:24:02 -07:00
Aleix Conchillo Flaqué
c71cec04d3 ParallelPipeline: wait for CancelFrame in all branches 2025-09-03 11:23:37 -07:00
vipyne
83f64ecd3b example 07s: minor typo updates 2025-09-03 12:11:07 -05:00
Aleix Conchillo Flaqué
d19170d8b1 Merge pull request #2565 from pipecat-ai/aleix/reorganize-transports
transports: reorganize module
2025-09-03 08:52:49 -07:00
Mark Backman
8b95d74193 Merge pull request #2571 from pipecat-ai/mb/fix-docs-0.0.83
Fix docs generation before 0.0.83 release
2025-09-03 08:52:20 -07:00
Mark Backman
3c4694a8f1 Fix docs generation before 0.0.83 release 2025-09-03 11:31:14 -04:00
kompfner
b9748b1228 Merge pull request #2563 from pipecat-ai/pk/expand-universal-llm-context-support-to-more-llms
Expand universal `LLMContext` support to more LLMs
2025-09-03 11:20:26 -04:00
Paul Kompfner
def1cf1548 Update CHANGELOG with entry about expanded support among LLM services for new universal LLMContext 2025-09-03 09:07:57 -04:00
Paul Kompfner
9b216116f1 Remove supports_universal_context gate from OpenAI-library-based LLM services. It's no longer needed, as they all now support universal LLMContext. 2025-09-03 09:07:18 -04:00
Paul Kompfner
7cb372ebb9 Add support for universal LLMContext to Together.ai LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
6838bc1e51 Add a few missing LLMContext type hints 2025-09-03 09:07:18 -04:00
Paul Kompfner
e04f42167e Add support for universal LLMContext to SambaNova LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
91a3f63e28 Add support for universal LLMContext to Qwen LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b24eb76559 Add QWEN_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Paul Kompfner
d9ea02595b Add support for universal LLMContext to Ollama LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
5bc0e49baa Add support for universal LLMContext to Perplexity LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
ec138b97d9 Add support for universal LLMContext to OpenRouter LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
0c32cc29a7 Add support for universal LLMContext to OpenPipe LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
d740bab99e Add support for universal LLMContext to NVIDIA NIM LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
ac62183eb6 Add NVIDIA_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Paul Kompfner
34f823bcac Add support for universal LLMContext to Groq LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b4e1051066 Add support for universal LLMContext to Grok LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
d8882bc381 Add support for universal LLMContext to Google Vertex AI LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
da18d0a562 Add support for universal LLMContext to Fireworks AI LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
f8e13a82cf Fix Fireworks AI function calling example 2025-09-03 09:07:18 -04:00
Paul Kompfner
2b00d37e94 Add support for universal LLMContext to DeepSeek LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
2dbd17da4d Fix Cerebras function calling example 2025-09-03 09:07:18 -04:00
Paul Kompfner
d45fbd5455 Add support for universal LLMContext to Cerebras LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b22bdff6d0 Add support for universal LLMContext to Azure LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
2b286365e0 Add MISTRAL_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Eduards Klavins
0a3e98857e fix timeout handling in AWSBedrockLLMService 2025-09-03 11:52:30 +03:00
Aleix Conchillo Flaqué
aeb9f1ffca transports: reorganize module 2025-09-02 17:31:39 -07:00
Filipi da Silva Fuchter
7f1100bd4c Merge pull request #2513 from pipecat-ai/filipi/whatsapp
Support for the new WhatsApp Cloud API
2025-09-02 18:06:59 -03:00
Filipi Fuchter
8fbd9b5af7 Added support for WhatsApp User-initiated Calls. 2025-09-02 18:05:00 -03:00
Filipi Fuchter
49c1f0bd08 Fixed SmallWebRTCTransport to not use mid to decide if the transceiver should be sendrecv or not. 2025-09-02 18:04:51 -03:00
Aleix Conchillo Flaqué
ce7a0512f9 Merge pull request #2562 from pipecat-ai/aleix/ai-coustics-speech-enhancement
add ai-coustics speech enhancement filter
2025-09-02 13:13:28 -07:00
Aleix Conchillo Flaqué
fdcd14dd21 updated CHANGELOG with AICFilter and fix deprecations 2025-09-02 13:10:10 -07:00
Mark Backman
0386599163 Added Daily SIP room creation utility (#2560)
* Added Daily SIP room creation utility to configure() function

* Add sip_codecs to the DailyRoomSipParams
2025-09-02 14:12:04 -04:00
Corvin Jaedicke
c1ce3d7d2b bumped aic-sdk version to v1.0.1 with minor changes 2025-09-02 11:11:29 -07:00
Corvin Jaedicke
8ecece2d9c Add AIC SDK audio filter 2025-09-02 11:11:29 -07:00
Filipi da Silva Fuchter
0d8ab7abca Merge pull request #2552 from pipecat-ai/filipi/freeze_issues
Fixed an issue where the pipeline could freeze.
2025-09-02 14:43:08 -03:00
Filipi Fuchter
dea7c22020 Fixed an issue where the pipeline could freeze. 2025-09-02 13:58:41 -03:00
Mark Backman
cfe11267f4 Merge pull request #2546 from pipecat-ai/mb/update-changelog-mem0
Add mem0 changelog entry
2025-09-02 07:22:52 -07:00
Mark Backman
d0c97d3602 Add mem0 changelog entry 2025-09-02 10:03:17 -04:00
Mark Backman
37e1551abc Merge pull request #2555 from pipecat-ai/mb/update-uv-lock-quickstart 2025-09-02 04:19:15 -07:00
Mark Backman
e1477e79f0 Merge pull request #2538 from rimelabs/rime-flush-audio-update
Use Rime’s official {"operation": "flush"} command in flush_audio() for proper text buffer flushing
2025-09-01 18:01:20 -07:00
Mark Backman
547b126d98 Update required pipecat-ai version for quickstart 2025-09-01 20:52:44 -04:00
Mark Backman
447e3b28eb Update uv.lock for quickstart 2025-09-01 20:52:12 -04:00
gokuljs
472efa2971 ruff format fix 2025-09-02 04:25:28 +05:30
gokuljs
5f801743d0 Add changelog entry for RimeTTSService flush_audio API update 2025-09-01 22:16:02 +05:30
gokuljs
e3f2faabf7 Merge branch 'main' of github.com:rimelabs/pipecat into rime-flush-audio-update 2025-08-30 01:18:50 +05:30
gokuljs
e06bd6049e update flush operation in flush audio function under rime tts 2025-08-29 21:27:14 +05:30
253 changed files with 14692 additions and 11465 deletions

View File

@@ -7,7 +7,114 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
## Added
### Added
- Added `SelfClosingTagAggregator` to handle incomplete self-closing tags during
streaming (e.g., prevents splitting `<break time="0.1s"/>` when received as
`<break time="0.` + `1s"/>`).
- Added video streaming support to `LiveKitTransport`.
- Added `OpenAIRealtimeLLMService` and `AzureRealtimeLLMService` which provide
access to OpenAI Realtime.
### Removed
- Remove `VisionImageRawFrame` in favor of context frames (`LLMContextFrame` or
`OpenAILLMContextFrame`).
### Deprecated
- Deprecate `VisionImageFrameAggregator` because `VisionImageRawFrame` has been
removed. See the `12*` examples for the new recommended replacement pattern.
- `NoisereduceFilter` is now deprecated and will be removed in a future
version. Use other audio filters like `KrispFilter` or `AICFilter`.
- Deprecated `OpenAIRealtimeBetaLLMService` and `AzureRealtimeBetaLLMService`.
Use `OpenAIRealtimeLLMService` and `AzureRealtimeLLMService`, respectively.
Each service will be removed in an upcoming version, 1.0.0.
### Fixed
- Fixed a `LiveKitTransport` issue where RTVI messages were not properly
encoded.
- Add additional fixups to Mistral context messages to ensure they meet
Mistral-specific requirements, avoiding Mistral "invalid request" errors.
- Fixed `DailyTransport` transcription handling to gracefully handle missing
`rawResponse` field in transcription messages, preventing KeyError crashes.
## [0.0.84] - 2025-09-05
### Added
- Add the ability to send DTMF to `LiveKitTransport`.
- Expanded support for universal `LLMContext` to the Anthropic LLM service.
Using the universal `LLMContext` and associated `LLMContextAggregatorPair` is
a pre-requisite for using `LLMSwitcher` to switch between LLMs at runtime.
### Changed
- Updated `daily-python` to 0.19.9.
- Restored `DailyTransport`'s native DTMF support using Daily's `send_dtmf()`
method instead of generated audio tones.
### Fixed
- Fixed a `AWSBedrockLLMService` crash caused by an extra `await`.
- Fixed a `OpenAIImageGenService` issue where it was not creating
`URLImageRawFrame` correctly.
## [0.0.83] - 2025-09-03
### Added
- Added multilingual support for AsyncAI in `AsyncAITTSService` and `AsyncAIHttpTTSService`.
- New `languages`: `es`, `fr`, `de`, `it`.
- Added new frames `InputTransportMessageUrgentFrame` and
`DailyInputTransportMessageUrgentFrame` for transport messages received from
external sources.
- Added `UserSpeakingFrame`. This will be sent upstream and downstream while VAD
detects the user is speaking.
- Expanded support for universal `LLMContext` to more LLM services. Using the
universal `LLMContext` and associated `LLMContextAggregatorPair` is a
pre-requisite for using `LLMSwitcher` to switch between LLMs at runtime.
Here are the newly-supported services:
- Azure
- Cerebras
- Deepseek
- Fireworks AI
- Google Vertex AI
- Grok
- Groq
- Mistral
- NVIDIA NIM
- Ollama
- OpenPipe
- OpenRouter
- Perplexity
- Qwen
- SambaNova
- Together.ai
- Added support for WhatsApp User-initiated Calls.
- Added new audio filter `AICFilter`, speech enhancement for improving VAD/STT
performance, no ONNX dependency.
See https://ai-coustics.com/sdk/
- Added a timeout around cancel input tasks to prevent indefinite hangs when
cancellation is swallowed by third-party code.
- Added `pipecat.extensions.ivr` for automated IVR system navigation with
configurable goals and conversation handling. Supports DTMF input, verbal
@@ -42,16 +149,34 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
sending it through the transport. This makes sending DTMF generic across all
output transports.
- Added new config parameters to `GladiaSTTService`.
- Added new config parameters to `GladiaSTTService`.
- PreProcessingConfig > `audio_enhancer` to enhance audio quality.
- CustomVocabularyItem > `pronunciations` and `language` to specify special pronunciations and in which language it will be pronounced.
- CustomVocabularyItem > `pronunciations` and `language` to specify special
pronunciations and in which language it will be pronounced.
## Changed
### Changed
- `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` are also pushed
upstream.
- `ParallelPipeline` now waits for `CancelFrame` to finish in all branches
before pushing it downstream.
- Added `sip_codecs` to the `DailyRoomSipParams`.
- Updated the `configure()` function in `pipecat.runner.daily` to include new
args to create SIP-enabled rooms. Additionally, added new args to control the
room and token expiration durations.
- `pipecat.frames.frames.KeypadEntry` is deprecated and has been moved to
`pipecat.audio.dtmf.types.KeypadEntry`.
## Removed
- Updated `RimeTTSService`'s flush_audio message to conform with Rime's official
API.
- Updated the default model for `CerebrasLLMService` to GPT-OSS-120B.
### Removed
- Remove `StopInterruptionFrame`. This was a legacy frame that was not being
used really anywhere and it didn't provide any useful meaning. It was only
@@ -63,15 +188,41 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Remove deprecated `DailyTransport.send_dtmf()`.
## Deprecated
### Deprecated
- Transports have been re-organized.
```
pipecat.transports.network.small_webrtc -> pipecat.transports.smallwebrtc.transport
pipecat.transports.network.webrtc_connection -> pipecat.transports.smallwebrtc.connection
pipecat.transports.network.websocket_client -> pipecat.transports.websocket.client
pipecat.transports.network.websocket_server -> pipecat.transports.websocket.server
pipecat.transports.network.fastapi_websocket -> pipecat.transports.websocket.fastapi
pipecat.transports.services.daily -> pipecat.transports.daily.transport
pipecat.transports.services.helpers.daily_rest -> pipecat.transports.daily.utils
pipecat.transports.services.livekit -> pipecat.transports.livekit.transport
pipecat.transports.services.tavus -> pipecat.transports.tavus.transport
```
- `pipecat.frames.frames.KeypadEntry` is deprecated use
`pipecat.audio.dtmf.types.KeypadEntry` instead.
## Fixed
### Fixed
- Fixed an issue where messages received from the transport were always being resent.
- Fixed `SmallWebRTCTransport` to not use `mid` to decide if the transceiver should
be `sendrecv` or not.
- Fixed an issue where Deepgram swallowed `asyncio.CancelledError` during
disconnect, preventing tasks from being cancelled.
- Fixed an issue where `PipelineTask` was not cleaning up the observers.
### Performance
- Reduced latency and improved memory performance in `Mem0MemoryService`.
## [0.0.82] - 2025-08-28
### Added

View File

@@ -28,6 +28,41 @@
- **Composable Pipelines**: Build complex behavior from modular components
- **Real-Time**: Ultra-low latency interaction with different transports (e.g. WebSockets or WebRTC)
## 📱 Client SDKs
You can connect to Pipecat from any platform using our official SDKs:
<table>
<tr>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/javascript/javascript-original.svg" width="40" height="40" alt="JavaScript"/>
<a href="https://docs.pipecat.ai/client/js/introduction">JavaScript</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React"/>
<a href="https://docs.pipecat.ai/client/react/introduction">React</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React Native"/>
<a href="https://docs.pipecat.ai/client/react-native/introduction">React Native</a>
</td>
</tr>
<tr>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/swift/swift-original.svg" width="40" height="40" alt="Swift"/>
<a href="https://docs.pipecat.ai/client/ios/introduction">Swift</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/kotlin/kotlin-original.svg" width="40" height="40" alt="Kotlin"/>
<a href="https://docs.pipecat.ai/client/android/introduction">Kotlin</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/cplusplus/cplusplus-original.svg" width="40" height="40" alt="JavaScript"/>
<a href="https://docs.pipecat.ai/client/c++/introduction">C++</a>
</td>
</tr>
</table>
## 🎬 See it in action
<p float="left">
@@ -38,17 +73,6 @@
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/moondream-chatbot/image.png" width="400" /></a>
</p>
## 📱 Client SDKs
You can connect to Pipecat from any platform using our official SDKs:
| Platform | SDK Repo | Description |
| -------- | ------------------------------------------------------------------------------ | -------------------------------- |
| Web | [pipecat-client-web](https://github.com/pipecat-ai/pipecat-client-web) | JavaScript and React client SDKs |
| iOS | [pipecat-client-ios](https://github.com/pipecat-ai/pipecat-client-ios) | Swift SDK for iOS |
| Android | [pipecat-client-android](https://github.com/pipecat-ai/pipecat-client-android) | Kotlin SDK for Android |
| C++ | [pipecat-client-cxx](https://github.com/pipecat-ai/pipecat-client-cxx) | C++ client SDK |
## 🧩 Available services
| Category | Services |
@@ -62,7 +86,7 @@ You can connect to Pipecat from any platform using our official SDKs:
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)

View File

@@ -1,3 +1,6 @@
# AI-COUSTICS
AICOUSTICS_LICENSE_KEY=...
# Anthropic
ANTHROPIC_API_KEY=...
@@ -143,3 +146,12 @@ SENTRY_DSN=...
# Heygen
HEYGEN_API_KEY=...
# Mistral
MISTRAL_API_KEY=...
# NVIDIA
NVIDIA_API_KEY=...
# Qwen
QWEN_API_KEY=...

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.piper.tts import PiperTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.rime.tts import RimeHttpTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -17,8 +17,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -17,7 +17,7 @@ from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.runner.livekit import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.services.livekit import LiveKitParams, LiveKitTransport
from pipecat.transports.livekit.transport import LiveKitParams, LiveKitTransport
load_dotenv(override=True)

View File

@@ -17,8 +17,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.riva.tts import FastPitchTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -18,7 +18,7 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.fal.image import FalImageGenService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)

View File

@@ -17,7 +17,7 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.google.image import GoogleImageGenService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)

View File

@@ -27,8 +27,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
from pipecat.transports.network.webrtc_connection import IceServer, SmallWebRTCConnection
from pipecat.transports.smallwebrtc.connection import IceServer, SmallWebRTCConnection
from pipecat.transports.smallwebrtc.transport import SmallWebRTCTransport
load_dotenv(override=True)

View File

@@ -21,7 +21,7 @@ from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.runner.daily import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyLogLevel, DailyParams, DailyTransport
from pipecat.transports.daily.transport import DailyLogLevel, DailyParams, DailyTransport
load_dotenv(override=True)

View File

@@ -28,7 +28,7 @@ from pipecat.runner.livekit import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.livekit import LiveKitParams, LiveKitTransport
from pipecat.transports.livekit.transport import LiveKitParams, LiveKitTransport
load_dotenv(override=True)

View File

@@ -33,7 +33,7 @@ from pipecat.services.cartesia.tts import CartesiaHttpTTSService
from pipecat.services.fal.image import FalImageGenService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)

View File

@@ -28,8 +28,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -29,7 +29,7 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
@@ -66,7 +66,7 @@ class ImageSyncAggregator(FrameProcessor):
)
)
await self.push_frame(frame)
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get

View File

@@ -21,8 +21,8 @@ from pipecat.services.cartesia.tts import CartesiaHttpTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -21,8 +21,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -26,8 +26,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.soniox.stt import SonioxSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.inworld.tts import InworldTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.asyncai.tts import AsyncAIHttpTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.asyncai.tts import AsyncAITTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -0,0 +1,163 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import datetime
import os
import wave
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.filters.aic_filter import AICFilter
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# Create audio buffer processor so we can hear the audio fitler results.
audiobuffer = AudioBufferProcessor(
num_channels=2, # 1 for mono, 2 for stereo (user left, bot right)
enable_turn_audio=False, # Enable per-turn audio recording
user_continuous_stream=True, # User has continuous audio stream
)
def _create_aic_filter() -> AICFilter:
license_key = os.getenv("AICOUSTICS_LICENSE_KEY", "")
return AICFilter(
license_key=license_key,
enhancement_level=1.0,
)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
audio_in_filter=_create_aic_filter(),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
audio_in_filter=_create_aic_filter(),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
audio_in_filter=_create_aic_filter(),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
audiobuffer, # write audio data to a file
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
await audiobuffer.start_recording()
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio, sample_rate, num_channels):
# Save or process the composite audio
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"./conversation_{timestamp}.wav"
# Create the WAV file
with wave.open(filename, "wb") as wf:
wf.setnchannels(num_channels)
wf.setsampwidth(2) # 16-bit audio
wf.setframerate(sample_rate)
wf.writeframes(audio)
logger.info(f"Saved recording to {filename}")
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -33,8 +33,8 @@ from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -27,8 +27,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.elevenlabs.tts import ElevenLabsHttpTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.playht.tts import PlayHTHttpTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.playht.tts import PlayHTTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.azure.llm import AzureLLMService
from pipecat.services.azure.stt import AzureSTTService
from pipecat.services.azure.tts import AzureTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.openai.stt import OpenAISTTService
from pipecat.services.openai.tts import OpenAITTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openpipe.llm import OpenPipeLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.xtts.tts import XTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.services.gladia.stt import GladiaSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.lmnt.tts import LmntTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.groq.llm import GroqLLMService
from pipecat.services.groq.stt import GroqSTTService
from pipecat.services.groq.tts import GroqTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -20,8 +20,8 @@ from pipecat.services.aws.llm import AWSBedrockLLMService
from pipecat.services.aws.stt import AWSTranscribeSTTService
from pipecat.services.aws.tts import AWSPollyTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -41,8 +41,8 @@ from pipecat.services.google.stt import GoogleSTTService
from pipecat.services.google.tts import GeminiTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.google.stt import GoogleSTTService
from pipecat.services.google.tts import GoogleTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.assemblyai.stt import AssemblyAISTTService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.rime.tts import RimeHttpTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.rime.tts import RimeTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.nim.llm import NimLLMService
from pipecat.services.riva.stt import RivaSTTService
from pipecat.services.riva.tts import RivaTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -36,8 +36,8 @@ from pipecat.services.google.llm import GoogleLLMService
from pipecat.services.google.tts import GoogleTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -93,9 +93,8 @@ class UserAudioCollector(FrameProcessor):
elif isinstance(frame, UserStoppedSpeakingFrame):
self._user_speaking = False
self._context.add_audio_frames_message(audio_frames=self._audio_frames)
await self._user_context_aggregator.push_frame(
self._user_context_aggregator.get_context_frame()
)
await self._user_context_aggregator.push_frame(LLMRunFrame())
elif isinstance(frame, InputAudioRawFrame):
if self._user_speaking:
self._audio_frames.append(frame)
@@ -151,7 +150,7 @@ class TranscriptExtractor(FrameProcessor):
await self.push_frame(frame, direction)
class TanscriptionContextFixup(FrameProcessor):
class TranscriptionContextFixup(FrameProcessor):
def __init__(self, context):
super().__init__()
self._context = context
@@ -245,7 +244,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
context_aggregator = llm.create_context_aggregator(context)
audio_collector = UserAudioCollector(context, context_aggregator.user())
pull_transcript_out_of_llm_output = TranscriptExtractor(context)
fixup_context_messages = TanscriptionContextFixup(context)
fixup_context_messages = TranscriptionContextFixup(context)
pipeline = Pipeline(
[

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.fish.tts import FishAudioTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -19,8 +19,8 @@ from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.ultravox.stt import UltravoxSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.neuphonic.tts import NeuphonicHttpTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.neuphonic.tts import NeuphonicTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.fal.stt import FalSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.services.minimax.tts import MiniMaxHttpTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.sarvam.tts import SarvamHttpTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.sarvam.tts import SarvamTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -17,7 +17,7 @@ from pipecat.runner.daily import configure
from pipecat.services.azure import AzureLLMService, AzureTTSService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.fal import FalImageGenService
from pipecat.transports.services.daily import DailyTransport
from pipecat.transports.daily.transport import DailyTransport
load_dotenv(override=True)

View File

@@ -22,7 +22,7 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport, maybe_capture_participant_camera
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.local.tk import TkLocalTransport, TkTransportParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -32,8 +32,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -11,12 +11,19 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
@@ -28,12 +35,14 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.moondream.vision import MoondreamService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
@@ -46,9 +55,32 @@ class UserImageRequester(FrameProcessor):
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
)
await self.push_frame(frame, direction)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -78,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
vision_aggregator = VisionImageFrameAggregator()
image_processor = UserImageProcessor()
# If you run into weird description, try with use_cpu=True
moondream = MoondreamService()
@@ -96,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt,
user_response,
image_requester,
vision_aggregator,
image_processor,
moondream,
tts,
transport.output(),
@@ -119,7 +151,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):

View File

@@ -11,12 +11,19 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
@@ -28,12 +35,14 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.google.llm import GoogleLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
@@ -46,9 +55,32 @@ class UserImageRequester(FrameProcessor):
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
)
await self.push_frame(frame, direction)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -78,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
vision_aggregator = VisionImageFrameAggregator()
image_processor = UserImageProcessor()
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
@@ -96,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt,
user_response,
image_requester,
vision_aggregator,
image_processor,
google,
tts,
transport.output(),
@@ -123,7 +155,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):

View File

@@ -11,12 +11,19 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
@@ -28,12 +35,14 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
@@ -46,9 +55,32 @@ class UserImageRequester(FrameProcessor):
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
)
await self.push_frame(frame, direction)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -78,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
vision_aggregator = VisionImageFrameAggregator()
image_processor = UserImageProcessor()
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
@@ -96,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt,
user_response,
image_requester,
vision_aggregator,
image_processor,
openai,
tts,
transport.output(),
@@ -123,7 +155,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):

View File

@@ -11,12 +11,19 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
@@ -28,12 +35,14 @@ from pipecat.services.anthropic.llm import AnthropicLLMService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
@@ -46,9 +55,32 @@ class UserImageRequester(FrameProcessor):
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
)
await self.push_frame(frame, direction)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -78,7 +110,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
vision_aggregator = VisionImageFrameAggregator()
image_processor = UserImageProcessor()
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
@@ -96,7 +128,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt,
user_response,
image_requester,
vision_aggregator,
image_processor,
anthropic,
tts,
transport.output(),
@@ -123,7 +155,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):

View File

@@ -0,0 +1,186 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from typing import Optional
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import (
Frame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContext,
OpenAILLMContextFrame,
)
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
create_transport,
get_transport_client_id,
maybe_capture_participant_camera,
)
from pipecat.services.aws.llm import AWSBedrockLLMService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
def set_participant_id(self, participant_id: str):
self._participant_id = participant_id
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
# Note: AWS Bedrock does not yet support the universal LLMContext
context = OpenAILLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = OpenAILLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_in_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_in_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
user_response = UserResponseAggregator()
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
image_processor = UserImageProcessor()
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
# AWS for vision analysis
aws = AWSBedrockLLMService(
aws_region="us-west-2",
model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
params=AWSBedrockLLMService.InputParams(temperature=0.8),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
pipeline = Pipeline(
[
transport.input(),
stt,
user_response,
image_requester,
image_processor,
aws,
tts,
transport.output(),
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected: {client}")
await maybe_capture_participant_camera(transport, client)
# Set the participant ID in the image requester
client_id = get_transport_client_id(transport, client)
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.whisper.stt import WhisperSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -32,6 +32,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
async def main():
transport = LocalAudioTransport(

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService, Language, LiveOptions
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.gladia import GladiaSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -25,8 +25,8 @@ from pipecat.services.gladia.config import (
from pipecat.services.gladia.stt import GladiaSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -40,6 +40,9 @@ class TranscriptionLogger(FrameProcessor):
elif isinstance(frame, TranslationFrame):
print(f"Translation ({frame.language}): {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.assemblyai.stt import AssemblyAISTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -20,8 +20,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.whisper.stt import MLXModel, WhisperSTTServiceMLX
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -52,6 +52,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
self._last_transcription_time = time.time()
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.stt import CartesiaSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -31,6 +31,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -21,8 +21,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.sambanova.stt import SambaNovaSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -53,6 +53,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
self._last_transcription_time = time.time()
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -19,8 +19,8 @@ from pipecat.runner.utils import create_transport
from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -32,6 +32,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -19,8 +19,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.soniox.stt import SonioxSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -32,6 +32,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
transport_params = {
"daily": lambda: DailyParams(

View File

@@ -19,8 +19,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.azure.stt import AzureSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -32,6 +32,9 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
transport_params = {
"daily": lambda: DailyParams(

View File

@@ -24,8 +24,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -30,7 +30,7 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
@@ -97,7 +97,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm = AnthropicLLMService(
api_key=os.getenv("ANTHROPIC_API_KEY"),
model="claude-3-7-sonnet-latest",
enable_prompt_caching_beta=True,
params=AnthropicLLMService.InputParams(enable_prompt_caching=True),
)
llm.register_function("get_weather", get_weather)
llm.register_function("get_image", get_image)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.together.llm import TogetherLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -30,7 +30,7 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)

View File

@@ -30,7 +30,7 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.google.llm import GoogleLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)

View File

@@ -26,8 +26,8 @@ from pipecat.services.groq.llm import GroqLLMService
from pipecat.services.groq.stt import GroqSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.grok.llm import GrokLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.fireworks.llm import FireworksLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -75,9 +75,13 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# sent to the same callback with an additional function_name parameter.
llm.register_function("get_current_weather", fetch_weather_from_api)
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
# Disabling for now, as it ends up tripping up the model in this example
# ("let me check on that" ends up at the end of the context, which the
# model erroneously treats as a nudge to call the tool again; the
# ensuing inference then yields wonky results).
# @llm.event_handler("on_function_calls_started")
# async def on_function_calls_started(service, function_calls):
# await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
weather_function = FunctionSchema(
name="get_current_weather",
@@ -99,7 +103,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. Start by saying hello.",
},
]

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.nim.llm import NimLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.cerebras.llm import CerebrasLLMService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
@@ -67,7 +67,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = CerebrasLLMService(api_key=os.getenv("CEREBRAS_API_KEY"), model="llama-3.3-70b")
llm = CerebrasLLMService(api_key=os.getenv("CEREBRAS_API_KEY"))
# You can also register a function_name of None to get all functions
# sent to the same callback with an additional function_name parameter.
llm.register_function("get_current_weather", fetch_weather_from_api)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepseek.llm import DeepSeekLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.openrouter.llm import OpenRouterLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -28,8 +28,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.perplexity.llm import PerplexityLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.google.llm_openai import GoogleLLMOpenAIBetaService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.google.llm_vertex import GoogleVertexLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.qwen.llm import QwenLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.aws.stt import AWSTranscribeSTTService
from pipecat.services.aws.tts import AWSPollyTTSService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -26,8 +26,8 @@ from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.sambanova.llm import SambaNovaLLMService
from pipecat.services.sambanova.stt import SambaNovaSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)

Some files were not shown because too many files have changed in this diff Show More