Commit Graph

1445 Commits

Author SHA1 Message Date
Paul Kompfner
217f03b9cc Add additional functionality related to "thinking", for Google and Anthropic LLMs.
Thinking, sometimes called "extended thinking" or "reasoning", is an LLM process where the model takes some additional time before giving an answer. It's useful for complex tasks that may require some level of planning and structured, step-by-step reasoning. The model can output its thoughts (or thought summaries, depending on the model) in addition to the answer. The thoughts are usually pretty granular and not really suitable for being spoken out loud in a conversation, but can be useful for logging or prompt debugging.

Here's what's added:

1. New typed input parameters for Google and Anthropic LLMs that control the models' thinking behavior (like how much thinking to do, and whether to output thoughts or thought summaries).
2. New frames for representing thoughts output by LLMs.
3. A generic mechanism for associating extra LLM-specific data with a function call in context, used specifically to support Google's function-call-related "thought signatures", which are necessary to ensure thinking continuity between function calls in a chain (where the model thinks, makes a function call, thinks some more, etc.)
4. A generic mechanism for recording LLM thoughts to context, used specifically to support Anthropic, whose thought signatures are expected to appear alongside the text of the thoughts within assistant context messages.
5. An expansion of `TranscriptProcessor` to process LLM thoughts in addition to user and assistant utterances.
2025-12-08 09:29:01 -05:00
Aleix Conchillo Flaqué
4d03270bc3 examples(foundational): update 14i-fireworks with new serverless model 2025-12-05 15:31:29 -08:00
vipyne
82e0253a62 add mcp filter example and changelog 2025-12-05 10:56:59 -06:00
Mark Backman
91dec044c4 Merge pull request #3171 from LaurentMazare/gradium
Gradium integration.
2025-12-05 09:43:44 -05:00
laurent
07ebf8534a Add the example. 2025-12-05 10:51:22 +01:00
Aleix Conchillo Flaqué
8dc9872ed5 deprecate pipecat.sync package 2025-12-03 18:44:41 -08:00
vipyne
a8280522e5 examples: rename nvidia foundational examples 2025-12-01 22:41:17 -06:00
Aleix Conchillo Flaqué
7648d0436c examples(19): linting 2025-12-01 18:30:34 -08:00
Mark Backman
0ece8b5894 Add 07c Deepgram SageMaker example 2025-11-24 16:41:01 -05:00
mattie ruth backman
24266c238f Augmented PatternPairAggregator so that matched patterns can...
be treated as their own aggregation, taking advantage of the new
ability to assign a type to an aggregation
2025-11-21 17:16:10 -05:00
vipyne
68292bd75f rename MCP foundational examples 2025-11-19 10:34:13 -06:00
vipyne
42423bff41 update MCP foundational examples 2025-11-19 10:29:18 -06:00
Aleix Conchillo Flaqué
ceaf53fdb0 LLMContext: async create_image_message/create_audio_message fixes 2025-11-18 19:41:13 -08:00
Mark Backman
62a0f0c0f5 Merge pull request #3070 from ivaaan/hume-timestamps 2025-11-18 19:56:20 -05:00
ivaaan
f325eeb95b rm TranscriptProcessor 2 2025-11-18 20:41:10 +01:00
ivaaan
c2309efd7e rm TranscriptProcessor 2025-11-18 20:35:09 +01:00
Ivan A
a38f208135 Update examples/foundational/07ae-interruptible-hume.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-11-18 20:30:28 +01:00
Mark Backman
153201542b Fix foundational 30 example to output TTSTextFrames synced to audio 2025-11-18 13:29:06 -05:00
Ivan A
8dbe119a73 Merge branch 'main' into hume-timestamps 2025-11-18 18:38:24 +01:00
ivaaan
26f96d0be8 upd example 2025-11-18 18:31:38 +01:00
Aleix Conchillo Flaqué
9f45ad4d2e LLMContext: create_image_message/create_audio_message are now async 2025-11-18 09:04:40 -08:00
Paul Kompfner
5095fc6a64 Update Moondream example so that Moondream service output makes it into the context, even if the TTS service is disabled 2025-11-17 15:16:19 -05:00
ivaaan
9156e21727 fix formatting 2025-11-17 14:00:03 +01:00
Filipi Fuchter
04dbbabc03 Introduced a minimum confidence parameter in DeepgramFluxSTTService to avoid generating transcriptions below a defined threshold. 2025-11-17 09:54:30 -03:00
ivaaan
71869a116d fix errors 2025-11-17 13:51:04 +01:00
ivaaan
2f2bde9856 add timestamps to example 2025-11-17 13:40:03 +01:00
Mark Backman
74a0e8c88d Merge pull request #3050 from ai-coustics/aic-vad-analyzer
feat(ai-coustics): add ai-coustics integrated VAD
2025-11-14 08:11:15 -05:00
kompfner
e83ac82bf3 Merge pull request #3042 from pipecat-ai/pk/follow-up-inter-frame-spaces
Follow-up to #3041
2025-11-13 11:03:06 -05:00
Mark Backman
edbf96b3c5 Update GeminiTTSService for streaming, other Google TTS improvements 2025-11-13 10:22:34 -05:00
Paul Kompfner
8851d18f92 Tweak the LLM prompt again to try to fix the issue of LLMs sometimes omitting punctuation in their output. 2025-11-13 10:02:33 -05:00
Mark Backman
0e37658f8d Add ElevenLabsRealtimeSTTService 2025-11-13 09:49:05 -05:00
Corvin Jaedicke
a7b2052b38 add ai-coustics VAD 2025-11-13 14:20:35 +01:00
Paul Kompfner
1802f949ef Fix an issue with some examples where punctuation was missing from the LLM output, by tweaking the LLM prompt. 2025-11-12 17:12:03 -05:00
Paul Kompfner
5222ff99de Apply includes_inter_frame_spaces = True in all LLM and TTS services that need it.
Note that for `LLMTextFrame`s, the right behavior is pretty much always `includes_inter_frame_spaces = True`. I decided *not* to go ahead and make that the default for `LLMTextFrame`s, though, simply to not introduce a subtle behavior change for creative/unexpected use-cases that were relying on text in hand-crafted `LLMTextFrame`s being handled a certain way. Ditto for `TTSTextFrame`s.

Also, fix an issue in `NeuphonicTTSService` where it wasn't pushing `TTSTextFrame`s.

Also, fix the broken `SarvamHttpTTSService` example.

Also, add a couple of missing examples.
2025-11-12 15:10:11 -05:00
Aleix Conchillo Flaqué
0ed430e7e2 examples(foundational): use DeepgramSTTService in 07 2025-11-07 11:34:11 -08:00
Aleix Conchillo Flaqué
4f1468e0fa scripts(evals): improve eval prompt 2025-11-07 10:05:46 -08:00
Paul Kompfner
359d220162 Document a OpenAIRealtimeLLMService gotcha in an example. 2025-11-07 10:32:27 -05:00
Paul Kompfner
c3306bb4f2 Support for passing in a ToolsSchema in lieu of a list of provider-specific dicts when updating OpenAIRealtimeLLMService using LLMUpdateSettingsFrame. 2025-11-07 10:18:29 -05:00
Mark Backman
1fb6d6bd23 GoogleSTTService: Add more robust handling of 409 errors 2025-11-06 14:35:53 -05:00
Mark Backman
9f2ddcc5f4 Merge pull request #2927 from pipecat-ai/marcus/2025-10-28_sample_rtvi_fix
Add RTVIProcessor to foundational example 38b
2025-11-06 10:19:10 -05:00
Mark Backman
961e28517e Remove arg from RTVIProcessor 2025-11-06 10:16:31 -05:00
Paul Kompfner
13d6078ea0 Minor tweak to an example for clarity. 2025-11-05 15:30:01 -05:00
Paul Kompfner
9ce33f23b9 Add an example demonstrating MCP usage with a speech-to-speech service (GeminiLiveLLMService) using the pattern of passing in tools in the constructor 2025-11-05 15:29:04 -05:00
Paul Kompfner
bee4165ba4 Add LLMSwitcher.register_direct_function() 2025-11-05 15:28:19 -05:00
Paul Kompfner
0184493711 Update the service switcher example to illustrate registering tools on all LLMs in a switcher 2025-11-05 15:27:00 -05:00
vipyne
b7a4d7371c wrap tools = await mcp.register_tools(llm) in try in examples 2025-11-04 09:01:12 -06:00
vipyne
ef88d6a2ea update example 39-mcp-stdio.py to use different mcp server
https://www.loom.com/share/a9f0a270261d4c6cb054ab2b4dcd6084

SO to Rijksmuseum MCP
https://github.com/r-huijts/rijksmuseum-mcp
2025-11-04 09:01:12 -06:00
Mark Backman
0abc699f24 Merge pull request #2964 from pipecat-ai/mb/14j-nim-updates
Fix 14j foundational example
2025-11-04 07:24:53 -05:00
Mark Backman
1c53a5fd01 Fix 14j foundational example 2025-11-03 14:57:44 -05:00
Paul Kompfner
87131850bc GeminiLiveLLMService supports context-provided system instruction and tools 2025-11-03 10:30:46 -05:00