Commit Graph

776 Commits

Author SHA1 Message Date
Aleix Conchillo Flaqué
2cf71239b0 examples(01b): use TTSSpeakFrame instead of TextFrame 2025-09-16 17:18:45 -04:00
Mark Backman
8ead309f8d 38b: Update bundled ONNX smart-turn model 2025-09-16 13:17:14 -04:00
Aleix Conchillo Flaqué
0f2b7bc01b examples(foundational): fix 19b-openai-realtime-beta-text 2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué
2a24061bbb examples(07ad): remove deprecated user_continuous_stream 2025-09-11 18:50:00 -07:00
Aleix Conchillo Flaqué
8249b014f0 frames: BotInterruptionFrame is deprecated, use InterruptionTaskFrame 2025-09-11 09:01:54 -07:00
Aleix Conchillo Flaqué
9d9f10ae0e frames: StartInterruptionFrame is deprecated, use InterruptionFrame 2025-09-11 09:01:54 -07:00
marcus-daily
a2e76bcad8 Smart Turn V3 support 2025-09-11 16:04:56 +01:00
kompfner
b31322e38e Merge pull request #2619 from pipecat-ai/pk/aws-universal-context
Expand universal `LLMContext` support to AWS Bedrock
2025-09-11 09:33:08 -04:00
Mark Backman
f4938e0fad fix: 13 foundational examples now push frames from TranscriptionLogger 2025-09-10 10:40:10 -04:00
Paul Kompfner
fedb8a201f Update 12d example to use LLMContext, now that AWS Bedrock supports it 2025-09-09 16:24:13 -04:00
Paul Kompfner
75f9914f49 Add support for universal LLMContext to AWS Bedrock LLM service 2025-09-09 15:25:04 -04:00
Paul Kompfner
f4d6715e32 Add foundational example using AWS Bedrock with universal LLMContext 2025-09-09 10:49:51 -04:00
Paul Kompfner
f3a4b416df Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.
Removing `VisionImageRawFrame` lets us simplify LLM services' logic, getting us closer to the idealized architecture where all they care about is handling context frames.

This change is in service of getting us closer to ready to deprecate usage of `OpenAILLMContext` and subclasses in favor of the universal `LLMContext`, at least for the traditional text-to-text LLMs.

Why remove `VisionImageRawFrame` rather than deprecate? It's "internal"—only created by `VisionImageFrameAggregator`—and never intended to be used directly by users (it would be difficult to use directly anyway).

Move the logic that was once in `VisionImageFrameAggregator` directly into the examples. Reasoning:
- If `UserImageRequester` is defined in the examples, it makes sense for `UserImageProcessor` to be too, as it’s the flip side of the same coin, so to speak
- The logic is now pretty trivial
- This kind of one-shot, history-less image-describing pipeline shouldn't be common at all; it's ok for it to live in examples rather than as a dedicated class
- In the short term, this enables us to create `LLMContext`s for services that support it and `OpenAILLMContext`s for services that don't yet (AWS)

This commit also adds missing translation from OpenAI-format image context messages to AWS format. Note that this isn't a wasted effort in the face of the upcoming migration to universal `LLMContext`—this work will be reused as it has to be implemented there too.
2025-09-08 17:00:08 -04:00
Mark Backman
a537382816 Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)
* Add OpenAI Realtime module

* Add foundational examples for OpenAI Realtime

* Add deprecation warning to OpenAIRealtimeBetaLLMService

* Add deprecation warning to AzureRealtimeBetaLLMService

* Update Changelog
2025-09-07 09:09:57 -04:00
Filipi Fuchter
5477dfcbea Improving HeyGen example video quality. 2025-09-05 11:30:01 -03:00
Paul Kompfner
b2e9fd9341 Rename Anthropic enable_prompt_caching_beta parameter to just enable_prompt_caching 2025-09-04 13:03:06 -04:00
Paul Kompfner
5c574eaad9 Add support for universal LLMContext to Anthropic LLM service 2025-09-04 13:03:06 -04:00
Paul Kompfner
2df231143a Add foundational example using Anthropic with universal LLMContext 2025-09-04 13:03:06 -04:00
Aleix Conchillo Flaqué
f2b50c14d2 Merge pull request #2573 from pipecat-ai/vp-minor-fixes-07s
example 07s: minor typo updates
2025-09-04 09:21:32 -07:00
Mark Backman
977a57c8fb Add 14k (CerebrasLLMService) to release evals 2025-09-03 17:11:38 -04:00
Mark Backman
c64bc5a636 Merge pull request #2576 from joyceerhl/joyce/cerebras-default
fix: update default Cerebras model to GPT-OSS-120B
2025-09-03 14:10:28 -07:00
Joyce Er
eba006d39c Fix nits 2025-09-03 14:07:49 -07:00
Joyce Er
a001f6f193 Switch to GPT-OSS-120B 2025-09-03 14:00:27 -07:00
Mark Backman
052ffe8712 fix: Specify frame direction in 06a push_frame 2025-09-03 15:07:05 -04:00
vipyne
83f64ecd3b example 07s: minor typo updates 2025-09-03 12:11:07 -05:00
Aleix Conchillo Flaqué
d19170d8b1 Merge pull request #2565 from pipecat-ai/aleix/reorganize-transports
transports: reorganize module
2025-09-03 08:52:49 -07:00
Paul Kompfner
f8e13a82cf Fix Fireworks AI function calling example 2025-09-03 09:07:18 -04:00
Paul Kompfner
2dbd17da4d Fix Cerebras function calling example 2025-09-03 09:07:18 -04:00
Aleix Conchillo Flaqué
aeb9f1ffca transports: reorganize module 2025-09-02 17:31:39 -07:00
Aleix Conchillo Flaqué
fdcd14dd21 updated CHANGELOG with AICFilter and fix deprecations 2025-09-02 13:10:10 -07:00
Corvin Jaedicke
8ecece2d9c Add AIC SDK audio filter 2025-09-02 11:11:29 -07:00
Aleix Conchillo Flaqué
bd7d9346b7 frames: remove StopInterruptionFrame 2025-08-29 16:40:01 -07:00
Aleix Conchillo Flaqué
64f2135ddc examples(14f): use default models 2025-08-28 11:38:59 -07:00
Paul Kompfner
189749b579 Add LLMRunFrame to trigger an LLM response, replacing context_aggregator.user().get_context_frame() 2025-08-28 09:53:33 -04:00
pratham-sarvam
6d582e41b7 Added Sarvam TTS Websocket Implementation (#2356)
* Added Sarvam TTS Websocket Implementation

* Addressed some of the comments on PR

* added change voice logic

* added changes from main

* pushing text frames and added flush audio

* updated docs string for better docs

* Addressed comments and added some improvements

* pushed optional args down

* removed new line

* made aiohttp session mandatory in http service

* added push frame and removed unused function

* removed pong message

* added disconnecting logic

---------

Co-authored-by: vinayak-sarvam <vinayak@sarvam.ai>
2025-08-26 18:10:26 -03:00
Paul Kompfner
f1f43fe500 After a rebase, rename foundational examples showing usage of universal context to avoid naming conflict with a recently-added example. 2025-08-26 09:44:15 -04:00
Paul Kompfner
a962459151 Change LLMContextAggregatorPair.create(context) to LLMContextAggregatorPair(context) 2025-08-26 09:44:15 -04:00
Paul Kompfner
688b136141 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add to Google LLM service support for universal LLM context
2025-08-26 09:44:15 -04:00
Paul Kompfner
809c4c1bc5 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add to OpenAI LLM service support for universal LLM context
2025-08-26 09:44:15 -04:00
Mark Backman
bd401e8d6f Rename TTSBuffer to TTSGate 2025-08-22 12:12:17 -04:00
Mark Backman
f0dfab23e7 Cleanup 2025-08-22 12:12:17 -04:00
Mark Backman
fbc907c371 Change path to extensions 2025-08-22 12:12:17 -04:00
Mark Backman
446bb5cddf Refactor callback to event 2025-08-22 12:12:17 -04:00
Mark Backman
1c1ee94074 Add 44 to evals, update evals to support user speaking first 2025-08-22 12:12:17 -04:00
Mark Backman
ce579d4266 Make on_voicemail_detected callback required, cleanup logging 2025-08-22 12:12:17 -04:00
Mark Backman
5a07b30c7a Class name changes, add TTSStarted/StoppedFrame to the TTSBuffer 2025-08-22 12:12:17 -04:00
Mark Backman
9da33f3897 Handle multiple user inputs from the user when a voicemail is detected; add a configurable timeout to emitting the callback 2025-08-22 12:12:17 -04:00
Mark Backman
5ca82ec61e Final docstrings, comments, and cleanup 2025-08-22 12:12:17 -04:00
Mark Backman
238d6bf9ab Add buffering logic 2025-08-22 12:12:17 -04:00
Mark Backman
90ae85bab2 More updates—added new voicemail module 2025-08-22 12:12:17 -04:00