Commit Graph

1497 Commits

Author SHA1 Message Date
Aleix Conchillo Flaqué
eb5a797b12 turns: rename bot turn start to user turn stop strategies 2025-12-30 14:33:58 -08:00
Aleix Conchillo Flaqué
97ab0d4f53 examples: added 52-live-translation without interruptions 2025-12-29 17:30:06 -08:00
Mark Backman
c28ed2206c DeepgramSTTService pushes user started/stopped speaking and interruption frames 2025-12-29 10:17:35 -08:00
Aleix Conchillo Flaqué
c821e9f8fd turns: add external user and bot turn start strategies
External strategies are strategies where the logic for user turn start and turn
end comes from a different processor (e.g. an STT).
2025-12-29 10:17:35 -08:00
Aleix Conchillo Flaqué
0e8e3afc85 Merge pull request #3307 from pipecat-ai/aleix/simplify-turns-package-imports
turns: simplify imports and don't require full strategy module path
2025-12-28 18:51:23 -08:00
Aleix Conchillo Flaqué
5496aa722f turns: simplify imports and don't require full strategy module path 2025-12-28 16:20:15 -08:00
Aleix Conchillo Flaqué
5b93fb9609 PipelineTask: deprecate allow_interruptions parameter 2025-12-28 08:27:02 -08:00
Aleix Conchillo Flaqué
192ede6e34 Merge pull request #3298 from pipecat-ai/aleix/push-user-started-speaking-first
push UserStartedSpeakingFrame before interruption
2025-12-28 08:24:50 -08:00
Aleix Conchillo Flaqué
8b861d9143 LLMUserAggregator: move turn_start_strategies from PipelineTask 2025-12-28 08:16:34 -08:00
Aleix Conchillo Flaqué
094d9fd7d7 turns(mute): make strategies available in __init__ 2025-12-28 08:12:44 -08:00
Aleix Conchillo Flaqué
0efa36a04e examples(foundational): added 24-user-mute-strategy.py example 2025-12-27 13:49:31 -08:00
Aleix Conchillo Flaqué
260b7e7959 push UserStartedSpeakingFrame before interruption 2025-12-24 15:33:44 -08:00
Mark Backman
49b53d72a9 Merge pull request #3276 from pipecat-ai/mb/grok-realtime-cleanup
GrokRealtimeLLMService cleanup
2025-12-22 18:13:23 -05:00
Mark Backman
93689827e9 Revert turn strategies changes to quickstart 2025-12-22 18:05:05 -05:00
Mark Backman
348fa5a719 Improve SessionProperties initialization: remove voice from args, set default for TurnDetection 2025-12-20 08:02:48 -05:00
Mark Backman
0576783c5e Improve sample_rate handling in GrokRealtimeLLMService 2025-12-20 07:46:31 -05:00
Mrunmay Chichkhede
d7d979dde1 feat: Add GrokRealtimeLLMService for xAI Grok Voice Agent API (#3267) 2025-12-20 07:04:12 -05:00
Sam Sykes
76bae6e699 Update SpeechmaticsSTTService to use the python voice SDK 2025-12-19 19:59:18 -05:00
Aleix Conchillo Flaqué
d22e1f18bb examples: update with new user and bot turn start strategies 2025-12-19 14:47:02 -08:00
Aleix Conchillo Flaqué
a9cca0b934 LLMAssistantAggregatorParams: copy to llm_response_universal 2025-12-19 14:47:02 -08:00
Aleix Conchillo Flaqué
7e6b0839b0 examples(foundational): don't use legacy LLMUserAggregatorParams 2025-12-19 14:47:02 -08:00
Paul Kompfner
9b6f5853cf Deprecate OpenAILLMContext and associated things 2025-12-19 11:23:06 -05:00
Mark Backman
56c58f7302 Move Ultravox foundational example to 50, add to release evals 2025-12-18 13:38:12 -05:00
Aleix Conchillo Flaqué
d07b37b288 scripts(evals): more eval prompts improvements 2025-12-17 09:55:12 -08:00
Mark Backman
afa7573834 Merge pull request #3239 from pipecat-ai/mb/update-inworld-tts
Inworld TTS services: Add websocket TTS class, add word-timestamp ali…
2025-12-16 16:26:43 -05:00
Mark Backman
bd3bf9a00e Inworld TTS services: Add websocket TTS class, add word-timestamp alignment 2025-12-16 13:47:24 -05:00
kompfner
92f934031d Merge pull request #3224 from pipecat-ai/pk/simplify-gemini-thinking
Clean up logic related to applying Gemini thought signatures to conte…
2025-12-16 13:35:17 -05:00
Aleix Conchillo Flaqué
a14c911fb2 scripts(evals): improve eval assertion on exit 2025-12-14 12:37:05 -08:00
Aleix Conchillo Flaqué
4f848e9631 Merge pull request #3227 from fixie-ai/mike/upstream
Add Ultravox service
2025-12-13 18:29:02 -08:00
Paul Kompfner
e604e9b490 Support conversations with Gemini 3 Pro Image (model "gemini-3-pro-image-preview").
Prior to this change, after the model generated an image the conversation would not be able to progress. It would stall out because we were never storing the image in context, so the model would never realize it already did the work of generating an image. We didn't run into issues with Gemini 2.5 Flash Image, because that model always followed up an image with a text message.
2025-12-12 18:20:17 -05:00
Mike Depinet
2e4fa3f8db PR comments
Also satisfy some Pyright complaints and update default model
2025-12-12 15:03:31 -08:00
Mike Depinet
4b81be7acf Add Ultravox service (#1)
Adds support for using Ultravox Realtime as a speech-to-speech service.

Also removes the deprecated Ultravox speech-to-text vllm model integration to avoid confusion.
2025-12-12 10:16:15 -08:00
Paul Kompfner
64471d65f8 Clean up logic related to applying Gemini thought signatures to context messages 2025-12-12 12:53:11 -05:00
Filipi Fuchter
87fc860cd5 Changing the HeyGenVideoService example to use the live avatar API. 2025-12-12 08:52:10 -03:00
kompfner
1e98094394 Merge pull request #3175 from pipecat-ai/pk/thinking-exploration
Additional functionality related to thinking, for Google and Anthropic LLMs.
2025-12-11 17:15:37 -05:00
Paul Kompfner
ccdd6cde52 Fix a couple of typos in comments 2025-12-11 17:05:09 -05:00
Paul Kompfner
12979293ad Add thinking examples to eval suite 2025-12-11 15:58:48 -05:00
Paul Kompfner
28248e9b00 Split up thinking examples so that there isn't an llm command-line arg for controlling which LLM to use. This change is preparation for adding these examples to our suite of evals. 2025-12-11 15:07:35 -05:00
kompfner
f41c3dcbc3 Merge pull request #3212 from pipecat-ai/pk/nova-2-sonic
Nova 2 Sonic support
2025-12-11 09:36:50 -05:00
Paul Kompfner
c37da6ab78 In the AWS Nova Sonic example, shorten the simulated weather function call delay 2025-12-09 16:53:18 -05:00
Paul Kompfner
1892854516 In the AWS Nova Sonic example, send back "location" from the weather-fetching function to help the model associate a tool response with a tool call...if you interrupt the model while more than one function call is outbound, it seemingly can get confused about which tool result goes with which call. 2025-12-09 16:27:23 -05:00
Mark Backman
735e597bf2 Merge pull request #3209 from pipecat-ai/hush/07n-prompt
Update system prompt in Gemini example to be more instructive
2025-12-09 15:45:46 -05:00
Paul Kompfner
3e66cb50e0 Update AWS Nova Sonic example to showcase async tool calling 2025-12-09 12:44:21 -05:00
Paul Kompfner
0c5bccd1f1 Changes related to Nova 2 Sonic's support for the model speaking first 2025-12-09 11:55:23 -05:00
Paul Kompfner
ca5e668f4a Update AWSNovaSonicLLMService docstring with more (and more up-to-date) info 2025-12-09 10:14:27 -05:00
Paul Kompfner
53de6c0b9a Update list of supported regions in 40-aws-nova-sonic.py 2025-12-09 09:46:53 -05:00
James Hush
83877ab1e6 Update system prompt in Gemini example to be more instructive
Changed the on_client_connected system message from a direct greeting to
an instruction that tells the AI to introduce itself, giving the LLM more
flexibility in how it starts the conversation.
2025-12-09 09:04:10 +01:00
Aleix Conchillo Flaqué
cfd1cada8c VoicemailDetector: add on_conversation_detected event 2025-12-08 11:57:14 -08:00
Paul Kompfner
61674d7758 Add process_thought constructor argument to TranscriptProcessor to control whether to handle thoughts in addition to assistant utterances. Defaults to False. 2025-12-08 10:27:36 -05:00
Paul Kompfner
ef703e9d16 Get rid of ThoughtTranscriptProcessor, moving its logic into AssistantTranscriptProcessor instead 2025-12-08 09:59:32 -05:00