pipecat

Author	SHA1	Message	Date
Aleix Conchillo Flaqué	a9cca0b934	LLMAssistantAggregatorParams: copy to llm_response_universal	2025-12-19 14:47:02 -08:00
Aleix Conchillo Flaqué	7e6b0839b0	examples(foundational): don't use legacy LLMUserAggregatorParams	2025-12-19 14:47:02 -08:00
Paul Kompfner	9b6f5853cf	Deprecate `OpenAILLMContext` and associated things	2025-12-19 11:23:06 -05:00
Mark Backman	56c58f7302	Move Ultravox foundational example to 50, add to release evals	2025-12-18 13:38:12 -05:00
Aleix Conchillo Flaqué	d07b37b288	scripts(evals): more eval prompts improvements	2025-12-17 09:55:12 -08:00
Mark Backman	afa7573834	Merge pull request #3239 from pipecat-ai/mb/update-inworld-tts Inworld TTS services: Add websocket TTS class, add word-timestamp ali…	2025-12-16 16:26:43 -05:00
Mark Backman	bd3bf9a00e	Inworld TTS services: Add websocket TTS class, add word-timestamp alignment	2025-12-16 13:47:24 -05:00
kompfner	92f934031d	Merge pull request #3224 from pipecat-ai/pk/simplify-gemini-thinking Clean up logic related to applying Gemini thought signatures to conte…	2025-12-16 13:35:17 -05:00
Aleix Conchillo Flaqué	a14c911fb2	scripts(evals): improve eval assertion on exit	2025-12-14 12:37:05 -08:00
Aleix Conchillo Flaqué	4f848e9631	Merge pull request #3227 from fixie-ai/mike/upstream Add Ultravox service	2025-12-13 18:29:02 -08:00
Paul Kompfner	e604e9b490	Support conversations with Gemini 3 Pro Image (model "gemini-3-pro-image-preview"). Prior to this change, after the model generated an image the conversation would not be able to progress. It would stall out because we were never storing the image in context, so the model would never realize it already did the work of generating an image. We didn't run into issues with Gemini 2.5 Flash Image, because that model always followed up an image with a text message.	2025-12-12 18:20:17 -05:00
Mike Depinet	2e4fa3f8db	PR comments Also satisfy some Pyright complaints and update default model	2025-12-12 15:03:31 -08:00
Mike Depinet	4b81be7acf	Add Ultravox service (#1) Adds support for using Ultravox Realtime as a speech-to-speech service. Also removes the deprecated Ultravox speech-to-text vllm model integration to avoid confusion.	2025-12-12 10:16:15 -08:00
Paul Kompfner	64471d65f8	Clean up logic related to applying Gemini thought signatures to context messages	2025-12-12 12:53:11 -05:00
Filipi Fuchter	87fc860cd5	Changing the HeyGenVideoService example to use the live avatar API.	2025-12-12 08:52:10 -03:00
kompfner	1e98094394	Merge pull request #3175 from pipecat-ai/pk/thinking-exploration Additional functionality related to thinking, for Google and Anthropic LLMs.	2025-12-11 17:15:37 -05:00
Paul Kompfner	ccdd6cde52	Fix a couple of typos in comments	2025-12-11 17:05:09 -05:00
Paul Kompfner	12979293ad	Add thinking examples to eval suite	2025-12-11 15:58:48 -05:00
Paul Kompfner	28248e9b00	Split up thinking examples so that there isn't an `llm` command-line arg for controlling which LLM to use. This change is preparation for adding these examples to our suite of evals.	2025-12-11 15:07:35 -05:00
kompfner	f41c3dcbc3	Merge pull request #3212 from pipecat-ai/pk/nova-2-sonic Nova 2 Sonic support	2025-12-11 09:36:50 -05:00
Paul Kompfner	c37da6ab78	In the AWS Nova Sonic example, shorten the simulated weather function call delay	2025-12-09 16:53:18 -05:00
Paul Kompfner	1892854516	In the AWS Nova Sonic example, send back "location" from the weather-fetching function to help the model associate a tool response with a tool call...if you interrupt the model while more than one function call is outbound, it seemingly can get confused about which tool result goes with which call.	2025-12-09 16:27:23 -05:00
Mark Backman	735e597bf2	Merge pull request #3209 from pipecat-ai/hush/07n-prompt Update system prompt in Gemini example to be more instructive	2025-12-09 15:45:46 -05:00
Paul Kompfner	3e66cb50e0	Update AWS Nova Sonic example to showcase async tool calling	2025-12-09 12:44:21 -05:00
Paul Kompfner	0c5bccd1f1	Changes related to Nova 2 Sonic's support for the model speaking first	2025-12-09 11:55:23 -05:00
Paul Kompfner	ca5e668f4a	Update `AWSNovaSonicLLMService` docstring with more (and more up-to-date) info	2025-12-09 10:14:27 -05:00
Paul Kompfner	53de6c0b9a	Update list of supported regions in 40-aws-nova-sonic.py	2025-12-09 09:46:53 -05:00
James Hush	83877ab1e6	Update system prompt in Gemini example to be more instructive Changed the on_client_connected system message from a direct greeting to an instruction that tells the AI to introduce itself, giving the LLM more flexibility in how it starts the conversation.	2025-12-09 09:04:10 +01:00
Aleix Conchillo Flaqué	cfd1cada8c	VoicemailDetector: add on_conversation_detected event	2025-12-08 11:57:14 -08:00
Paul Kompfner	61674d7758	Add `process_thought` constructor argument to `TranscriptProcessor` to control whether to handle thoughts in addition to assistant utterances. Defaults to `False`.	2025-12-08 10:27:36 -05:00
Paul Kompfner	ef703e9d16	Get rid of `ThoughtTranscriptProcessor`, moving its logic into `AssistantTranscriptProcessor` instead	2025-12-08 09:59:32 -05:00
Paul Kompfner	747bd4f737	Tweak the prompt of the thinking + functions example to not confuse Gemini as much (Gemini found the original prompt a bit ambiguous, it seems)	2025-12-08 09:29:10 -05:00
Paul Kompfner	c8c6f424cd	Add support for Gemini 3 Pro non-function-call-related thought signatures	2025-12-08 09:29:10 -05:00
Paul Kompfner	217f03b9cc	Add additional functionality related to "thinking", for Google and Anthropic LLMs. Thinking, sometimes called "extended thinking" or "reasoning", is an LLM process where the model takes some additional time before giving an answer. It's useful for complex tasks that may require some level of planning and structured, step-by-step reasoning. The model can output its thoughts (or thought summaries, depending on the model) in addition to the answer. The thoughts are usually pretty granular and not really suitable for being spoken out loud in a conversation, but can be useful for logging or prompt debugging. Here's what's added: 1. New typed input parameters for Google and Anthropic LLMs that control the models' thinking behavior (like how much thinking to do, and whether to output thoughts or thought summaries). 2. New frames for representing thoughts output by LLMs. 3. A generic mechanism for associating extra LLM-specific data with a function call in context, used specifically to support Google's function-call-related "thought signatures", which are necessary to ensure thinking continuity between function calls in a chain (where the model thinks, makes a function call, thinks some more, etc.) 4. A generic mechanism for recording LLM thoughts to context, used specifically to support Anthropic, whose thought signatures are expected to appear alongside the text of the thoughts within assistant context messages. 5. An expansion of `TranscriptProcessor` to process LLM thoughts in addition to user and assistant utterances.	2025-12-08 09:29:01 -05:00
Aleix Conchillo Flaqué	4d03270bc3	examples(foundational): update 14i-fireworks with new serverless model	2025-12-05 15:31:29 -08:00
vipyne	82e0253a62	add mcp filter example and changelog	2025-12-05 10:56:59 -06:00
Mark Backman	91dec044c4	Merge pull request #3171 from LaurentMazare/gradium Gradium integration.	2025-12-05 09:43:44 -05:00
laurent	07ebf8534a	Add the example.	2025-12-05 10:51:22 +01:00
Aleix Conchillo Flaqué	8dc9872ed5	deprecate pipecat.sync package	2025-12-03 18:44:41 -08:00
vipyne	a8280522e5	examples: rename nvidia foundational examples	2025-12-01 22:41:17 -06:00
Aleix Conchillo Flaqué	7648d0436c	examples(19): linting	2025-12-01 18:30:34 -08:00
Mark Backman	0ece8b5894	Add 07c Deepgram SageMaker example	2025-11-24 16:41:01 -05:00
mattie ruth backman	24266c238f	Augmented PatternPairAggregator so that matched patterns can... be treated as their own aggregation, taking advantage of the new ability to assign a type to an aggregation	2025-11-21 17:16:10 -05:00
vipyne	68292bd75f	rename MCP foundational examples	2025-11-19 10:34:13 -06:00
vipyne	42423bff41	update MCP foundational examples	2025-11-19 10:29:18 -06:00
Aleix Conchillo Flaqué	ceaf53fdb0	LLMContext: async create_image_message/create_audio_message fixes	2025-11-18 19:41:13 -08:00
Mark Backman	62a0f0c0f5	Merge pull request #3070 from ivaaan/hume-timestamps	2025-11-18 19:56:20 -05:00
ivaaan	f325eeb95b	rm TranscriptProcessor 2	2025-11-18 20:41:10 +01:00
ivaaan	c2309efd7e	rm TranscriptProcessor	2025-11-18 20:35:09 +01:00
Ivan A	a38f208135	Update examples/foundational/07ae-interruptible-hume.py Co-authored-by: Mark Backman <m.backman@gmail.com>	2025-11-18 20:30:28 +01:00

1478 Commits