pipecat

Author	SHA1	Message	Date
Paul Kompfner	217f03b9cc	Add additional functionality related to "thinking", for Google and Anthropic LLMs. Thinking, sometimes called "extended thinking" or "reasoning", is an LLM process where the model takes some additional time before giving an answer. It's useful for complex tasks that may require some level of planning and structured, step-by-step reasoning. The model can output its thoughts (or thought summaries, depending on the model) in addition to the answer. The thoughts are usually pretty granular and not really suitable for being spoken out loud in a conversation, but can be useful for logging or prompt debugging. Here's what's added: 1. New typed input parameters for Google and Anthropic LLMs that control the models' thinking behavior (like how much thinking to do, and whether to output thoughts or thought summaries). 2. New frames for representing thoughts output by LLMs. 3. A generic mechanism for associating extra LLM-specific data with a function call in context, used specifically to support Google's function-call-related "thought signatures", which are necessary to ensure thinking continuity between function calls in a chain (where the model thinks, makes a function call, thinks some more, etc.) 4. A generic mechanism for recording LLM thoughts to context, used specifically to support Anthropic, whose thought signatures are expected to appear alongside the text of the thoughts within assistant context messages. 5. An expansion of `TranscriptProcessor` to process LLM thoughts in addition to user and assistant utterances.	2025-12-08 09:29:01 -05:00
Aleix Conchillo Flaqué	4d03270bc3	examples(foundational): update 14i-fireworks with new serverless model	2025-12-05 15:31:29 -08:00
vipyne	82e0253a62	add mcp filter example and changelog	2025-12-05 10:56:59 -06:00
Mark Backman	91dec044c4	Merge pull request #3171 from LaurentMazare/gradium Gradium integration.	2025-12-05 09:43:44 -05:00
laurent	07ebf8534a	Add the example.	2025-12-05 10:51:22 +01:00
Aleix Conchillo Flaqué	8dc9872ed5	deprecate pipecat.sync package	2025-12-03 18:44:41 -08:00
vipyne	a8280522e5	examples: rename nvidia foundational examples	2025-12-01 22:41:17 -06:00
Aleix Conchillo Flaqué	7648d0436c	examples(19): linting	2025-12-01 18:30:34 -08:00
Mark Backman	0ece8b5894	Add 07c Deepgram SageMaker example	2025-11-24 16:41:01 -05:00
mattie ruth backman	24266c238f	Augmented PatternPairAggregator so that matched patterns can... be treated as their own aggregation, taking advantage of the new ability to assign a type to an aggregation	2025-11-21 17:16:10 -05:00
vipyne	68292bd75f	rename MCP foundational examples	2025-11-19 10:34:13 -06:00
vipyne	42423bff41	update MCP foundational examples	2025-11-19 10:29:18 -06:00
Aleix Conchillo Flaqué	ceaf53fdb0	LLMContext: async create_image_message/create_audio_message fixes	2025-11-18 19:41:13 -08:00
Mark Backman	62a0f0c0f5	Merge pull request #3070 from ivaaan/hume-timestamps	2025-11-18 19:56:20 -05:00
ivaaan	f325eeb95b	rm TranscriptProcessor 2	2025-11-18 20:41:10 +01:00
ivaaan	c2309efd7e	rm TranscriptProcessor	2025-11-18 20:35:09 +01:00
Ivan A	a38f208135	Update examples/foundational/07ae-interruptible-hume.py Co-authored-by: Mark Backman <m.backman@gmail.com>	2025-11-18 20:30:28 +01:00
Mark Backman	153201542b	Fix foundational 30 example to output TTSTextFrames synced to audio	2025-11-18 13:29:06 -05:00
Ivan A	8dbe119a73	Merge branch 'main' into hume-timestamps	2025-11-18 18:38:24 +01:00
ivaaan	26f96d0be8	upd example	2025-11-18 18:31:38 +01:00
Aleix Conchillo Flaqué	9f45ad4d2e	LLMContext: create_image_message/create_audio_message are now async	2025-11-18 09:04:40 -08:00
Paul Kompfner	5095fc6a64	Update Moondream example so that Moondream service output makes it into the context, even if the TTS service is disabled	2025-11-17 15:16:19 -05:00
ivaaan	9156e21727	fix formatting	2025-11-17 14:00:03 +01:00
Filipi Fuchter	04dbbabc03	Introduced a minimum confidence parameter in DeepgramFluxSTTService to avoid generating transcriptions below a defined threshold.	2025-11-17 09:54:30 -03:00
ivaaan	71869a116d	fix errors	2025-11-17 13:51:04 +01:00
ivaaan	2f2bde9856	add timestamps to example	2025-11-17 13:40:03 +01:00
Mark Backman	74a0e8c88d	Merge pull request #3050 from ai-coustics/aic-vad-analyzer feat(ai-coustics): add ai-coustics integrated VAD	2025-11-14 08:11:15 -05:00
kompfner	e83ac82bf3	Merge pull request #3042 from pipecat-ai/pk/follow-up-inter-frame-spaces Follow-up to #3041	2025-11-13 11:03:06 -05:00
Mark Backman	edbf96b3c5	Update GeminiTTSService for streaming, other Google TTS improvements	2025-11-13 10:22:34 -05:00
Paul Kompfner	8851d18f92	Tweak the LLM prompt again to try to fix the issue of LLMs sometimes omitting punctuation in their output.	2025-11-13 10:02:33 -05:00
Mark Backman	0e37658f8d	Add ElevenLabsRealtimeSTTService	2025-11-13 09:49:05 -05:00
Corvin Jaedicke	a7b2052b38	add ai-coustics VAD	2025-11-13 14:20:35 +01:00
Paul Kompfner	1802f949ef	Fix an issue with some examples where punctuation was missing from the LLM output, by tweaking the LLM prompt.	2025-11-12 17:12:03 -05:00
Paul Kompfner	5222ff99de	Apply `includes_inter_frame_spaces = True` in all LLM and TTS services that need it. Note that for `LLMTextFrame`s, the right behavior is pretty much always `includes_inter_frame_spaces = True`. I decided not to go ahead and make that the default for `LLMTextFrame`s, though, simply to not introduce a subtle behavior change for creative/unexpected use-cases that were relying on text in hand-crafted `LLMTextFrame`s being handled a certain way. Ditto for `TTSTextFrame`s. Also, fix an issue in `NeuphonicTTSService` where it wasn't pushing `TTSTextFrame`s. Also, fix the broken `SarvamHttpTTSService` example. Also, add a couple of missing examples.	2025-11-12 15:10:11 -05:00
Aleix Conchillo Flaqué	0ed430e7e2	examples(foundational): use DeepgramSTTService in 07	2025-11-07 11:34:11 -08:00
Aleix Conchillo Flaqué	4f1468e0fa	scripts(evals): improve eval prompt	2025-11-07 10:05:46 -08:00
Paul Kompfner	359d220162	Document a `OpenAIRealtimeLLMService` gotcha in an example.	2025-11-07 10:32:27 -05:00
Paul Kompfner	c3306bb4f2	Support for passing in a `ToolsSchema` in lieu of a list of provider-specific dicts when updating `OpenAIRealtimeLLMService` using `LLMUpdateSettingsFrame`.	2025-11-07 10:18:29 -05:00
Mark Backman	1fb6d6bd23	GoogleSTTService: Add more robust handling of 409 errors	2025-11-06 14:35:53 -05:00
Mark Backman	9f2ddcc5f4	Merge pull request #2927 from pipecat-ai/marcus/2025-10-28_sample_rtvi_fix Add RTVIProcessor to foundational example 38b	2025-11-06 10:19:10 -05:00
Mark Backman	961e28517e	Remove arg from RTVIProcessor	2025-11-06 10:16:31 -05:00
Paul Kompfner	13d6078ea0	Minor tweak to an example for clarity.	2025-11-05 15:30:01 -05:00
Paul Kompfner	9ce33f23b9	Add an example demonstrating MCP usage with a speech-to-speech service (`GeminiLiveLLMService`) using the pattern of passing in tools in the constructor	2025-11-05 15:29:04 -05:00
Paul Kompfner	bee4165ba4	Add `LLMSwitcher.register_direct_function()`	2025-11-05 15:28:19 -05:00
Paul Kompfner	0184493711	Update the service switcher example to illustrate registering tools on all LLMs in a switcher	2025-11-05 15:27:00 -05:00
vipyne	b7a4d7371c	wrap `tools = await mcp.register_tools(llm)` in try in examples	2025-11-04 09:01:12 -06:00
vipyne	ef88d6a2ea	update example 39-mcp-stdio.py to use different mcp server https://www.loom.com/share/a9f0a270261d4c6cb054ab2b4dcd6084 SO to Rijksmuseum MCP https://github.com/r-huijts/rijksmuseum-mcp	2025-11-04 09:01:12 -06:00
Mark Backman	0abc699f24	Merge pull request #2964 from pipecat-ai/mb/14j-nim-updates Fix 14j foundational example	2025-11-04 07:24:53 -05:00
Mark Backman	1c53a5fd01	Fix 14j foundational example	2025-11-03 14:57:44 -05:00
Paul Kompfner	87131850bc	`GeminiLiveLLMService` supports context-provided system instruction and tools	2025-11-03 10:30:46 -05:00

1445 Commits