pipecat

Author	SHA1	Message	Date
Aleix Conchillo Flaqué	4f848e9631	Merge pull request #3227 from fixie-ai/mike/upstream Add Ultravox service	2025-12-13 18:29:02 -08:00
Mike Depinet	2e4fa3f8db	PR comments Also satisfy some Pyright complaints and update default model	2025-12-12 15:03:31 -08:00
Mike Depinet	4b81be7acf	Add Ultravox service (#1) Adds support for using Ultravox Realtime as a speech-to-speech service. Also removes the deprecated Ultravox speech-to-text vllm model integration to avoid confusion.	2025-12-12 10:16:15 -08:00
Filipi Fuchter	87fc860cd5	Changing the HeyGenVideoService example to use the live avatar API.	2025-12-12 08:52:10 -03:00
kompfner	1e98094394	Merge pull request #3175 from pipecat-ai/pk/thinking-exploration Additional functionality related to thinking, for Google and Anthropic LLMs.	2025-12-11 17:15:37 -05:00
Paul Kompfner	ccdd6cde52	Fix a couple of typos in comments	2025-12-11 17:05:09 -05:00
Paul Kompfner	12979293ad	Add thinking examples to eval suite	2025-12-11 15:58:48 -05:00
Paul Kompfner	28248e9b00	Split up thinking examples so that there isn't an `llm` command-line arg for controlling which LLM to use. This change is preparation for adding these examples to our suite of evals.	2025-12-11 15:07:35 -05:00
kompfner	f41c3dcbc3	Merge pull request #3212 from pipecat-ai/pk/nova-2-sonic Nova 2 Sonic support	2025-12-11 09:36:50 -05:00
Paul Kompfner	c37da6ab78	In the AWS Nova Sonic example, shorten the simulated weather function call delay	2025-12-09 16:53:18 -05:00
Paul Kompfner	1892854516	In the AWS Nova Sonic example, send back "location" from the weather-fetching function to help the model associate a tool response with a tool call. If you interrupt the model while more than one function call is outbound, it seemingly can get confused about which tool result goes with which call.	2025-12-09 16:27:23 -05:00
Mark Backman	735e597bf2	Merge pull request #3209 from pipecat-ai/hush/07n-prompt Update system prompt in Gemini example to be more instructive	2025-12-09 15:45:46 -05:00
Paul Kompfner	3e66cb50e0	Update AWS Nova Sonic example to showcase async tool calling	2025-12-09 12:44:21 -05:00
Paul Kompfner	0c5bccd1f1	Changes related to Nova 2 Sonic's support for the model speaking first	2025-12-09 11:55:23 -05:00
Paul Kompfner	ca5e668f4a	Update `AWSNovaSonicLLMService` docstring with more (and more up-to-date) info	2025-12-09 10:14:27 -05:00
Paul Kompfner	53de6c0b9a	Update list of supported regions in 40-aws-nova-sonic.py	2025-12-09 09:46:53 -05:00
James Hush	83877ab1e6	Update system prompt in Gemini example to be more instructive Changed the on_client_connected system message from a direct greeting to an instruction that tells the AI to introduce itself, giving the LLM more flexibility in how it starts the conversation.	2025-12-09 09:04:10 +01:00
Aleix Conchillo Flaqué	cfd1cada8c	VoicemailDetector: add on_conversation_detected event	2025-12-08 11:57:14 -08:00
Paul Kompfner	61674d7758	Add `process_thought` constructor argument to `TranscriptProcessor` to control whether to handle thoughts in addition to assistant utterances. Defaults to `False`.	2025-12-08 10:27:36 -05:00
Paul Kompfner	ef703e9d16	Get rid of `ThoughtTranscriptProcessor`, moving its logic into `AssistantTranscriptProcessor` instead	2025-12-08 09:59:32 -05:00
Paul Kompfner	747bd4f737	Tweak the prompt of the thinking + functions example to not confuse Gemini as much (Gemini found the original prompt a bit ambiguous, it seems)	2025-12-08 09:29:10 -05:00
Paul Kompfner	c8c6f424cd	Add support for Gemini 3 Pro non-function-call-related thought signatures	2025-12-08 09:29:10 -05:00
Paul Kompfner	217f03b9cc	Add additional functionality related to "thinking", for Google and Anthropic LLMs. Thinking, sometimes called "extended thinking" or "reasoning", is an LLM process where the model takes some additional time before giving an answer. It's useful for complex tasks that may require some level of planning and structured, step-by-step reasoning. The model can output its thoughts (or thought summaries, depending on the model) in addition to the answer. The thoughts are usually pretty granular and not really suitable for being spoken out loud in a conversation, but can be useful for logging or prompt debugging. Here's what's added: 1. New typed input parameters for Google and Anthropic LLMs that control the models' thinking behavior (like how much thinking to do, and whether to output thoughts or thought summaries). 2. New frames for representing thoughts output by LLMs. 3. A generic mechanism for associating extra LLM-specific data with a function call in context, used specifically to support Google's function-call-related "thought signatures", which are necessary to ensure thinking continuity between function calls in a chain (where the model thinks, makes a function call, thinks some more, etc.) 4. A generic mechanism for recording LLM thoughts to context, used specifically to support Anthropic, whose thought signatures are expected to appear alongside the text of the thoughts within assistant context messages. 5. An expansion of `TranscriptProcessor` to process LLM thoughts in addition to user and assistant utterances.	2025-12-08 09:29:01 -05:00
Aleix Conchillo Flaqué	4d03270bc3	examples(foundational): update 14i-fireworks with new serverless model	2025-12-05 15:31:29 -08:00
vipyne	82e0253a62	add mcp filter example and changelog	2025-12-05 10:56:59 -06:00
Mark Backman	91dec044c4	Merge pull request #3171 from LaurentMazare/gradium Gradium integration.	2025-12-05 09:43:44 -05:00
laurent	07ebf8534a	Add the example.	2025-12-05 10:51:22 +01:00
Aleix Conchillo Flaqué	8dc9872ed5	deprecate pipecat.sync package	2025-12-03 18:44:41 -08:00
vipyne	a8280522e5	examples: rename nvidia foundational examples	2025-12-01 22:41:17 -06:00
Aleix Conchillo Flaqué	7648d0436c	examples(19): linting	2025-12-01 18:30:34 -08:00
Mark Backman	0ece8b5894	Add 07c Deepgram SageMaker example	2025-11-24 16:41:01 -05:00
mattie ruth backman	24266c238f	Augmented PatternPairAggregator so that matched patterns can be treated as their own aggregation, taking advantage of the new ability to assign a type to an aggregation	2025-11-21 17:16:10 -05:00
vipyne	68292bd75f	rename MCP foundational examples	2025-11-19 10:34:13 -06:00
vipyne	42423bff41	update MCP foundational examples	2025-11-19 10:29:18 -06:00
Aleix Conchillo Flaqué	ceaf53fdb0	LLMContext: async create_image_message/create_audio_message fixes	2025-11-18 19:41:13 -08:00
Mark Backman	62a0f0c0f5	Merge pull request #3070 from ivaaan/hume-timestamps	2025-11-18 19:56:20 -05:00
ivaaan	f325eeb95b	rm TranscriptProcessor 2	2025-11-18 20:41:10 +01:00
ivaaan	c2309efd7e	rm TranscriptProcessor	2025-11-18 20:35:09 +01:00
Ivan A	a38f208135	Update examples/foundational/07ae-interruptible-hume.py Co-authored-by: Mark Backman <m.backman@gmail.com>	2025-11-18 20:30:28 +01:00
Mark Backman	153201542b	Fix foundational 30 example to output TTSTextFrames synced to audio	2025-11-18 13:29:06 -05:00
Ivan A	8dbe119a73	Merge branch 'main' into hume-timestamps	2025-11-18 18:38:24 +01:00
ivaaan	26f96d0be8	upd example	2025-11-18 18:31:38 +01:00
Aleix Conchillo Flaqué	9f45ad4d2e	LLMContext: create_image_message/create_audio_message are now async	2025-11-18 09:04:40 -08:00
Paul Kompfner	5095fc6a64	Update Moondream example so that Moondream service output makes it into the context, even if the TTS service is disabled	2025-11-17 15:16:19 -05:00
ivaaan	9156e21727	fix formatting	2025-11-17 14:00:03 +01:00
Filipi Fuchter	04dbbabc03	Introduced a minimum confidence parameter in DeepgramFluxSTTService to avoid generating transcriptions below a defined threshold.	2025-11-17 09:54:30 -03:00
ivaaan	71869a116d	fix errors	2025-11-17 13:51:04 +01:00
ivaaan	2f2bde9856	add timestamps to example	2025-11-17 13:40:03 +01:00
Mark Backman	74a0e8c88d	Merge pull request #3050 from ai-coustics/aic-vad-analyzer feat(ai-coustics): add ai-coustics integrated VAD	2025-11-14 08:11:15 -05:00
kompfner	e83ac82bf3	Merge pull request #3042 from pipecat-ai/pk/follow-up-inter-frame-spaces Follow-up to #3041	2025-11-13 11:03:06 -05:00

1467 Commits