Aleix Conchillo Flaqué
4f848e9631
Merge pull request #3227 from fixie-ai/mike/upstream
...
Add Ultravox service
2025-12-13 18:29:02 -08:00
Mike Depinet
2e4fa3f8db
PR comments
...
Also satisfy some Pyright complaints and update default model
2025-12-12 15:03:31 -08:00
Mike Depinet
4b81be7acf
Add Ultravox service (#1)
...
Adds support for using Ultravox Realtime as a speech-to-speech service.
Also removes the deprecated Ultravox speech-to-text vllm model integration to avoid confusion.
2025-12-12 10:16:15 -08:00
Filipi Fuchter
87fc860cd5
Changing the HeyGenVideoService example to use the live avatar API.
2025-12-12 08:52:10 -03:00
kompfner
1e98094394
Merge pull request #3175 from pipecat-ai/pk/thinking-exploration
...
Additional functionality related to thinking, for Google and Anthropic LLMs.
2025-12-11 17:15:37 -05:00
Paul Kompfner
ccdd6cde52
Fix a couple of typos in comments
2025-12-11 17:05:09 -05:00
Paul Kompfner
12979293ad
Add thinking examples to eval suite
2025-12-11 15:58:48 -05:00
Paul Kompfner
28248e9b00
Split up the thinking examples so that no llm command-line arg is needed to control which LLM to use. This change is preparation for adding these examples to our suite of evals.
2025-12-11 15:07:35 -05:00
kompfner
f41c3dcbc3
Merge pull request #3212 from pipecat-ai/pk/nova-2-sonic
...
Nova 2 Sonic support
2025-12-11 09:36:50 -05:00
Paul Kompfner
c37da6ab78
In the AWS Nova Sonic example, shorten the simulated weather function call delay
2025-12-09 16:53:18 -05:00
Paul Kompfner
1892854516
In the AWS Nova Sonic example, send back "location" from the weather-fetching function to help the model associate a tool response with a tool call. If you interrupt the model while more than one function call is outbound, it can seemingly get confused about which tool result goes with which call.
2025-12-09 16:27:23 -05:00
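The idea in commit 1892854516 (echoing an input argument back in the tool result so that concurrent results can be told apart) can be sketched in plain Python. This is a minimal illustration and not the code from the example file: `fetch_weather`, its return fields, and the delays are all hypothetical, and no Pipecat or AWS API is used.

```python
import asyncio


async def fetch_weather(location: str, delay_s: float = 0.01) -> dict:
    """Hypothetical weather tool that echoes `location` in its result."""
    await asyncio.sleep(delay_s)  # simulated lookup delay
    # Including "location" lets the consumer match this result to its call,
    # even when several calls are outstanding at once.
    return {"location": location, "conditions": "sunny", "temperature_f": 72}


async def main() -> list:
    # Two tool calls in flight at the same time; they may finish in any order.
    results = await asyncio.gather(
        fetch_weather("Boston", delay_s=0.02),
        fetch_weather("Lisbon", delay_s=0.01),
    )
    return list(results)


if __name__ == "__main__":
    for result in asyncio.run(main()):
        print(result)
```

Without the echoed field, both results would carry only weather data and nothing that ties each one back to the request that produced it.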
Mark Backman
735e597bf2
Merge pull request #3209 from pipecat-ai/hush/07n-prompt
...
Update system prompt in Gemini example to be more instructive
2025-12-09 15:45:46 -05:00
Paul Kompfner
3e66cb50e0
Update AWS Nova Sonic example to showcase async tool calling
2025-12-09 12:44:21 -05:00
Paul Kompfner
0c5bccd1f1
Changes related to Nova 2 Sonic's support for the model speaking first
2025-12-09 11:55:23 -05:00
Paul Kompfner
ca5e668f4a
Update AWSNovaSonicLLMService docstring with more (and more up-to-date) info
2025-12-09 10:14:27 -05:00
Paul Kompfner
53de6c0b9a
Update list of supported regions in 40-aws-nova-sonic.py
2025-12-09 09:46:53 -05:00
James Hush
83877ab1e6
Update system prompt in Gemini example to be more instructive
...
Changed the on_client_connected system message from a direct greeting to
an instruction that tells the AI to introduce itself, giving the LLM more
flexibility in how it starts the conversation.
2025-12-09 09:04:10 +01:00
Aleix Conchillo Flaqué
cfd1cada8c
VoicemailDetector: add on_conversation_detected event
2025-12-08 11:57:14 -08:00
Paul Kompfner
61674d7758
Add process_thought constructor argument to TranscriptProcessor to control whether to handle thoughts in addition to assistant utterances. Defaults to False.
2025-12-08 10:27:36 -05:00
Paul Kompfner
ef703e9d16
Get rid of ThoughtTranscriptProcessor, moving its logic into AssistantTranscriptProcessor instead
2025-12-08 09:59:32 -05:00
Paul Kompfner
747bd4f737
Tweak the prompt of the thinking + functions example to not confuse Gemini as much (Gemini found the original prompt a bit ambiguous, it seems)
2025-12-08 09:29:10 -05:00
Paul Kompfner
c8c6f424cd
Add support for Gemini 3 Pro non-function-call-related thought signatures
2025-12-08 09:29:10 -05:00
Paul Kompfner
217f03b9cc
Add additional functionality related to "thinking", for Google and Anthropic LLMs.
...
Thinking, sometimes called "extended thinking" or "reasoning", is an LLM process where the model takes some additional time before giving an answer. It's useful for complex tasks that may require some level of planning and structured, step-by-step reasoning. The model can output its thoughts (or thought summaries, depending on the model) in addition to the answer. The thoughts are usually pretty granular and not really suitable for being spoken out loud in a conversation, but can be useful for logging or prompt debugging.
Here's what's added:
1. New typed input parameters for Google and Anthropic LLMs that control the models' thinking behavior (like how much thinking to do, and whether to output thoughts or thought summaries).
2. New frames for representing thoughts output by LLMs.
3. A generic mechanism for associating extra LLM-specific data with a function call in context, used specifically to support Google's function-call-related "thought signatures", which are necessary to ensure thinking continuity between function calls in a chain (where the model thinks, makes a function call, thinks some more, etc.).
4. A generic mechanism for recording LLM thoughts to context, used specifically to support Anthropic, whose thought signatures are expected to appear alongside the text of the thoughts within assistant context messages.
5. An expansion of `TranscriptProcessor` to process LLM thoughts in addition to user and assistant utterances.
2025-12-08 09:29:01 -05:00
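The separation described in commit 217f03b9cc (thoughts kept apart from the spoken answer, and recorded in a transcript only on request) can be sketched as follows. This is a standalone illustration under stated assumptions: `ThoughtChunk`, `TextChunk`, `Transcript`, and `spoken_text` are hypothetical stand-ins and are not the frame or processor classes added by the commit. The `process_thoughts` flag only mirrors the opt-in behavior that commit 61674d7758 describes as defaulting to False.

```python
from dataclasses import dataclass, field


@dataclass
class TextChunk:
    """Hypothetical stand-in for a piece of the model's answer."""
    text: str


@dataclass
class ThoughtChunk:
    """Hypothetical stand-in for a piece of the model's thinking output."""
    text: str


@dataclass
class Transcript:
    """Records assistant text always, and thoughts only when opted in."""
    process_thoughts: bool = False
    entries: list = field(default_factory=list)

    def handle(self, chunk) -> None:
        if isinstance(chunk, ThoughtChunk):
            if self.process_thoughts:
                self.entries.append(("thought", chunk.text))
        elif isinstance(chunk, TextChunk):
            self.entries.append(("assistant", chunk.text))


def spoken_text(chunks) -> str:
    """Return only the answer text; thoughts are never sent to speech."""
    return " ".join(c.text for c in chunks if isinstance(c, TextChunk))


if __name__ == "__main__":
    stream = [ThoughtChunk("User wants a greeting."), TextChunk("Hello!")]
    transcript = Transcript(process_thoughts=True)
    for chunk in stream:
        transcript.handle(chunk)
    print(transcript.entries)
    print(spoken_text(stream))
```

Keeping thoughts as a distinct type, instead of tagging ordinary text, means a downstream consumer such as a speech stage can ignore them by type without inspecting their content.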
Aleix Conchillo Flaqué
4d03270bc3
examples(foundational): update 14i-fireworks with new serverless model
2025-12-05 15:31:29 -08:00
vipyne
82e0253a62
add mcp filter example and changelog
2025-12-05 10:56:59 -06:00
Mark Backman
91dec044c4
Merge pull request #3171 from LaurentMazare/gradium
...
Gradium integration.
2025-12-05 09:43:44 -05:00
laurent
07ebf8534a
Add the example.
2025-12-05 10:51:22 +01:00
Aleix Conchillo Flaqué
8dc9872ed5
deprecate pipecat.sync package
2025-12-03 18:44:41 -08:00
vipyne
a8280522e5
examples: rename nvidia foundational examples
2025-12-01 22:41:17 -06:00
Aleix Conchillo Flaqué
7648d0436c
examples(19): linting
2025-12-01 18:30:34 -08:00
Mark Backman
0ece8b5894
Add 07c Deepgram SageMaker example
2025-11-24 16:41:01 -05:00
mattie ruth backman
24266c238f
Augmented PatternPairAggregator so that matched patterns can...
...
be treated as their own aggregation, taking advantage of the new
ability to assign a type to an aggregation
2025-11-21 17:16:10 -05:00
vipyne
68292bd75f
rename MCP foundational examples
2025-11-19 10:34:13 -06:00
vipyne
42423bff41
update MCP foundational examples
2025-11-19 10:29:18 -06:00
Aleix Conchillo Flaqué
ceaf53fdb0
LLMContext: async create_image_message/create_audio_message fixes
2025-11-18 19:41:13 -08:00
Mark Backman
62a0f0c0f5
Merge pull request #3070 from ivaaan/hume-timestamps
2025-11-18 19:56:20 -05:00
ivaaan
f325eeb95b
rm TranscriptProcessor 2
2025-11-18 20:41:10 +01:00
ivaaan
c2309efd7e
rm TranscriptProcessor
2025-11-18 20:35:09 +01:00
Ivan A
a38f208135
Update examples/foundational/07ae-interruptible-hume.py
...
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-11-18 20:30:28 +01:00
Mark Backman
153201542b
Fix foundational 30 example to output TTSTextFrames synced to audio
2025-11-18 13:29:06 -05:00
Ivan A
8dbe119a73
Merge branch 'main' into hume-timestamps
2025-11-18 18:38:24 +01:00
ivaaan
26f96d0be8
upd example
2025-11-18 18:31:38 +01:00
Aleix Conchillo Flaqué
9f45ad4d2e
LLMContext: create_image_message/create_audio_message are now async
2025-11-18 09:04:40 -08:00
Paul Kompfner
5095fc6a64
Update Moondream example so that Moondream service output makes it into the context, even if the TTS service is disabled
2025-11-17 15:16:19 -05:00
ivaaan
9156e21727
fix formatting
2025-11-17 14:00:03 +01:00
Filipi Fuchter
04dbbabc03
Introduced a minimum confidence parameter in DeepgramFluxSTTService to avoid generating transcriptions below a defined threshold.
2025-11-17 09:54:30 -03:00
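The thresholding described in commit 04dbbabc03 can be sketched generically. This is not the `DeepgramFluxSTTService` implementation: the `Transcription` type, the `accept` helper, the `min_confidence` name, and the choice of an inclusive comparison (`>=`) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transcription:
    """Hypothetical transcription result with a confidence score in [0, 1]."""
    text: str
    confidence: float


def accept(result: Transcription, min_confidence: Optional[float] = None) -> bool:
    """Return True if the result should be emitted.

    With no threshold configured, everything is emitted. Treating a score
    equal to the threshold as acceptable is an assumption of this sketch.
    """
    if min_confidence is None:
        return True
    return result.confidence >= min_confidence


if __name__ == "__main__":
    candidates = [Transcription("hello", 0.92), Transcription("uh", 0.31)]
    kept = [c.text for c in candidates if accept(c, min_confidence=0.5)]
    print(kept)
```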
ivaaan
71869a116d
fix errors
2025-11-17 13:51:04 +01:00
ivaaan
2f2bde9856
add timestamps to example
2025-11-17 13:40:03 +01:00
Mark Backman
74a0e8c88d
Merge pull request #3050 from ai-coustics/aic-vad-analyzer
...
feat(ai-coustics): add ai-coustics integrated VAD
2025-11-14 08:11:15 -05:00
kompfner
e83ac82bf3
Merge pull request #3042 from pipecat-ai/pk/follow-up-inter-frame-spaces
...
Follow-up to #3041
2025-11-13 11:03:06 -05:00