Aleix Conchillo Flaqué
eb5a797b12
turns: rename bot turn start to user turn stop strategies
2025-12-30 14:33:58 -08:00
Aleix Conchillo Flaqué
97ab0d4f53
examples: added 52-live-translation without interruptions
2025-12-29 17:30:06 -08:00
Mark Backman
c28ed2206c
DeepgramSTTService pushes user started/stopped speaking and interruption frames
2025-12-29 10:17:35 -08:00
Aleix Conchillo Flaqué
c821e9f8fd
turns: add external user and bot turn start strategies
...
External strategies are strategies where the logic for user turn start and turn
end comes from a different processor (e.g. an STT).
2025-12-29 10:17:35 -08:00
Aleix Conchillo Flaqué
0e8e3afc85
Merge pull request #3307 from pipecat-ai/aleix/simplify-turns-package-imports
...
turns: simplify imports and don't require full strategy module path
2025-12-28 18:51:23 -08:00
Aleix Conchillo Flaqué
5496aa722f
turns: simplify imports and don't require full strategy module path
2025-12-28 16:20:15 -08:00
Aleix Conchillo Flaqué
5b93fb9609
PipelineTask: deprecate allow_interruptions parameter
2025-12-28 08:27:02 -08:00
Aleix Conchillo Flaqué
192ede6e34
Merge pull request #3298 from pipecat-ai/aleix/push-user-started-speaking-first
...
push UserStartedSpeakingFrame before interruption
2025-12-28 08:24:50 -08:00
Aleix Conchillo Flaqué
8b861d9143
LLMUserAggregator: move turn_start_strategies from PipelineTask
2025-12-28 08:16:34 -08:00
Aleix Conchillo Flaqué
094d9fd7d7
turns(mute): make strategies available in __init__
2025-12-28 08:12:44 -08:00
Aleix Conchillo Flaqué
0efa36a04e
examples(foundational): added 24-user-mute-strategy.py example
2025-12-27 13:49:31 -08:00
Aleix Conchillo Flaqué
260b7e7959
push UserStartedSpeakingFrame before interruption
2025-12-24 15:33:44 -08:00
Mark Backman
49b53d72a9
Merge pull request #3276 from pipecat-ai/mb/grok-realtime-cleanup
...
GrokRealtimeLLMService cleanup
2025-12-22 18:13:23 -05:00
Mark Backman
93689827e9
Revert turn strategies changes to quickstart
2025-12-22 18:05:05 -05:00
Mark Backman
348fa5a719
Improve SessionProperties initialization: remove voice from args, set default for TurnDetection
2025-12-20 08:02:48 -05:00
Mark Backman
0576783c5e
Improve sample_rate handling in GrokRealtimeLLMService
2025-12-20 07:46:31 -05:00
Mrunmay Chichkhede
d7d979dde1
feat: Add GrokRealtimeLLMService for xAI Grok Voice Agent API (#3267)
2025-12-20 07:04:12 -05:00
Sam Sykes
76bae6e699
Update SpeechmaticsSTTService to use the python voice SDK
2025-12-19 19:59:18 -05:00
Aleix Conchillo Flaqué
d22e1f18bb
examples: update with new user and bot turn start strategies
2025-12-19 14:47:02 -08:00
Aleix Conchillo Flaqué
a9cca0b934
LLMAssistantAggregatorParams: copy to llm_response_universal
2025-12-19 14:47:02 -08:00
Aleix Conchillo Flaqué
7e6b0839b0
examples(foundational): don't use legacy LLMUserAggregatorParams
2025-12-19 14:47:02 -08:00
Paul Kompfner
9b6f5853cf
Deprecate OpenAILLMContext and associated things
2025-12-19 11:23:06 -05:00
Mark Backman
56c58f7302
Move Ultravox foundational example to 50, add to release evals
2025-12-18 13:38:12 -05:00
Aleix Conchillo Flaqué
d07b37b288
scripts(evals): more eval prompts improvements
2025-12-17 09:55:12 -08:00
Mark Backman
afa7573834
Merge pull request #3239 from pipecat-ai/mb/update-inworld-tts
...
Inworld TTS services: Add websocket TTS class, add word-timestamp ali…
2025-12-16 16:26:43 -05:00
Mark Backman
bd3bf9a00e
Inworld TTS services: Add websocket TTS class, add word-timestamp alignment
2025-12-16 13:47:24 -05:00
kompfner
92f934031d
Merge pull request #3224 from pipecat-ai/pk/simplify-gemini-thinking
...
Clean up logic related to applying Gemini thought signatures to conte…
2025-12-16 13:35:17 -05:00
Aleix Conchillo Flaqué
a14c911fb2
scripts(evals): improve eval assertion on exit
2025-12-14 12:37:05 -08:00
Aleix Conchillo Flaqué
4f848e9631
Merge pull request #3227 from fixie-ai/mike/upstream
...
Add Ultravox service
2025-12-13 18:29:02 -08:00
Paul Kompfner
e604e9b490
Support conversations with Gemini 3 Pro Image (model "gemini-3-pro-image-preview").
...
Prior to this change, after the model generated an image the conversation would not be able to progress. It would stall out because we were never storing the image in context, so the model would never realize it already did the work of generating an image. We didn't run into issues with Gemini 2.5 Flash Image, because that model always followed up an image with a text message.
2025-12-12 18:20:17 -05:00
Mike Depinet
2e4fa3f8db
PR comments
...
Also satisfy some Pyright complaints and update default model
2025-12-12 15:03:31 -08:00
Mike Depinet
4b81be7acf
Add Ultravox service (#1)
...
Adds support for using Ultravox Realtime as a speech-to-speech service.
Also removes the deprecated Ultravox speech-to-text vllm model integration to avoid confusion.
2025-12-12 10:16:15 -08:00
Paul Kompfner
64471d65f8
Clean up logic related to applying Gemini thought signatures to context messages
2025-12-12 12:53:11 -05:00
Filipi Fuchter
87fc860cd5
Changing the HeyGenVideoService example to use the live avatar API.
2025-12-12 08:52:10 -03:00
kompfner
1e98094394
Merge pull request #3175 from pipecat-ai/pk/thinking-exploration
...
Additional functionality related to thinking, for Google and Anthropic LLMs.
2025-12-11 17:15:37 -05:00
Paul Kompfner
ccdd6cde52
Fix a couple of typos in comments
2025-12-11 17:05:09 -05:00
Paul Kompfner
12979293ad
Add thinking examples to eval suite
2025-12-11 15:58:48 -05:00
Paul Kompfner
28248e9b00
Split up thinking examples so that there isn't an llm command-line arg for controlling which LLM to use. This change is preparation for adding these examples to our suite of evals.
2025-12-11 15:07:35 -05:00
kompfner
f41c3dcbc3
Merge pull request #3212 from pipecat-ai/pk/nova-2-sonic
...
Nova 2 Sonic support
2025-12-11 09:36:50 -05:00
Paul Kompfner
c37da6ab78
In the AWS Nova Sonic example, shorten the simulated weather function call delay
2025-12-09 16:53:18 -05:00
Paul Kompfner
1892854516
In the AWS Nova Sonic example, send back "location" from the weather-fetching function to help the model associate a tool response with a tool call. If you interrupt the model while more than one function call is outbound, it seemingly can get confused about which tool result goes with which call.
2025-12-09 16:27:23 -05:00
Mark Backman
735e597bf2
Merge pull request #3209 from pipecat-ai/hush/07n-prompt
...
Update system prompt in Gemini example to be more instructive
2025-12-09 15:45:46 -05:00
Paul Kompfner
3e66cb50e0
Update AWS Nova Sonic example to showcase async tool calling
2025-12-09 12:44:21 -05:00
Paul Kompfner
0c5bccd1f1
Changes related to Nova 2 Sonic's support for the model speaking first
2025-12-09 11:55:23 -05:00
Paul Kompfner
ca5e668f4a
Update AWSNovaSonicLLMService docstring with more (and more up-to-date) info
2025-12-09 10:14:27 -05:00
Paul Kompfner
53de6c0b9a
Update list of supported regions in 40-aws-nova-sonic.py
2025-12-09 09:46:53 -05:00
James Hush
83877ab1e6
Update system prompt in Gemini example to be more instructive
...
Changed the on_client_connected system message from a direct greeting to
an instruction that tells the AI to introduce itself, giving the LLM more
flexibility in how it starts the conversation.
2025-12-09 09:04:10 +01:00
Aleix Conchillo Flaqué
cfd1cada8c
VoicemailDetector: add on_conversation_detected event
2025-12-08 11:57:14 -08:00
Paul Kompfner
61674d7758
Add process_thought constructor argument to TranscriptProcessor to control whether to handle thoughts in addition to assistant utterances. Defaults to False.
2025-12-08 10:27:36 -05:00
Paul Kompfner
ef703e9d16
Get rid of ThoughtTranscriptProcessor, moving its logic into AssistantTranscriptProcessor instead
2025-12-08 09:59:32 -05:00