pipecat

Author	SHA1	Message	Date
asilvestre	c61672194d	Vonage Video Connector Transport	2026-05-18 14:40:49 +02:00
Filipi da Silva Fuchter	c51a817efa	Merge pull request #4442 from pipecat-ai/filipi/runner_all_transports Unified start route to make all transports available	2026-05-18 09:27:44 -03:00
Bismeet singh	d85eda6da8	Merge pull request #4507 from BismeetSingh/fix/elevenlabs-stt-service-crash-language Fix/elevenlabs stt service crash language	2026-05-17 10:17:07 -04:00
Aleix Conchillo Flaqué	71feb42711	Merge pull request #4503 from pipecat-ai/changelog-1.2.1 Release 1.2.1 - Changelog Update v1.2.1	2026-05-15 15:19:55 -07:00
aconchillo	6b93ca0cb6	Update changelog for version 1.2.1	2026-05-15 22:18:46 +00:00
Aleix Conchillo Flaqué	b6ecce754b	Merge pull request #4501 from pipecat-ai/aleix/fix-filter-incomplete-tool-calls Fix filter-incomplete + function-calling deadlock	2026-05-15 15:11:45 -07:00
Aleix Conchillo Flaqué	d39e6bf921	Add changelog for #4501	2026-05-15 14:54:51 -07:00
Aleix Conchillo Flaqué	63064860ef	Move OpenAITTSService instructions into Settings in the example Mirrors the deprecation in ``OpenAITTSService.__init__``: ``instructions`` is now a Settings field. The constructor still accepts it for backward compatibility but the canonical path is through ``Settings``.	2026-05-15 14:54:51 -07:00
Aleix Conchillo Flaqué	f5158d51e7	Add filter-incomplete + function-calling turn-management example A copy of ``turn-management-filter-incomplete-turns.py`` extended with a ``get_weather(location)`` direct function. Exercises the path where the LLM responds to a complete user turn by calling a tool — used to reproduce (and now verify the fix for) the ``_user_speaking`` gating bug between filter-incomplete and function calls.	2026-05-15 14:54:51 -07:00
Aleix Conchillo Flaqué	94dbd2fa68	Broadcast UserTurnInferenceCompletedFrame on tool calls in filter-incomplete With ``filter_incomplete_user_turns`` enabled, an LLM that responded to a user turn by calling a tool (without first emitting a ✓ marker) never finalized the user turn. ``UserStoppedSpeakingFrame`` stayed deferred, the assistant aggregator kept ``_user_speaking=True``, and when ``FunctionCallResultFrame`` arrived its ``not self._user_speaking`` gate dropped the context push — the LLM continuation never ran and the call hung silently. Broadcast ``UserTurnInferenceCompletedFrame`` on ``FunctionCallsStartedFrame`` (i.e. the moment the LLM commits to a tool call, before the function dispatches), gated by a new ``_turn_completion_broadcasted`` flag so the ✓ path and the tool-call path don't both fire. The flag resets in ``_turn_reset`` alongside the other per-turn state. Emitting on the start frame rather than ``LLMFullResponseEndFrame`` also shrinks the race window — ``UserStoppedSpeakingFrame`` (a ``SystemFrame``) has the maximum possible head start over the ``FunctionCallResultFrame`` (``DataFrame``) that follows.	2026-05-15 14:50:35 -07:00
Mark Backman	c6ea6c6522	Merge pull request #4500 from pipecat-ai/mb/update-gradium-endpoints Update Gradium STT/TTS endpoints to region-neutral URLs	2026-05-15 15:59:14 -04:00
Mark Backman	58a22aeeb1	Add changelog for #4500	2026-05-15 15:19:39 -04:00
Mark Backman	5403aa56e4	Remove Gradium endpoint overrides from voice example Drop the explicit US-region URLs so the example picks up the new region-neutral defaults in GradiumSTTService and GradiumTTSService.	2026-05-15 15:17:12 -04:00
Mark Backman	0e0d76d020	Update Gradium endpoints to region-neutral URLs Drop the EU-region default from the STT/TTS WebSocket URLs in favor of the generic api.gradium.ai endpoint, and remove the explicit overrides from the examples so they pick up the new defaults.	2026-05-15 15:02:05 -04:00
filipi87	b493ed8d3a	Removing the websocket transport from elevenlabs example.	2026-05-15 10:11:38 -03:00
filipi87	c3338667b1	Mounting the prebuilt frontend UI and root redirect for all transports.	2026-05-15 10:06:47 -03:00
Aleix Conchillo Flaqué	ea296babe9	Merge pull request #4498 from pipecat-ai/changelog-1.2.0 Release 1.2.0 - Changelog Update v1.2.0	2026-05-14 14:47:47 -07:00
aconchillo	b13af2b053	Update changelog for version 1.2.0	2026-05-14 21:45:36 +00:00
Aleix Conchillo Flaqué	7b6d878f07	update uv.lock	2026-05-14 14:41:38 -07:00
Aleix Conchillo Flaqué	8e405f15aa	changelog: fix 4446.change.md file name	2026-05-14 14:38:54 -07:00
Aleix Conchillo Flaqué	44a40e8eb2	Merge pull request #4497 from pipecat-ai/aleix/fix-tts-context-id-fallback Fall back to _turn_context_id in get_active_audio_context_id	2026-05-14 13:34:34 -07:00
Aleix Conchillo Flaqué	ea97cb1a78	Add changelog for #4497	2026-05-14 13:22:50 -07:00
Aleix Conchillo Flaqué	22650b1b56	Move QwenLLMService model into Settings in the qwen example Mirrors the deprecation in ``QwenLLMService.__init__``: ``model`` should be passed via ``settings=QwenLLMService.Settings(model=...)`` instead of as a direct constructor arg.	2026-05-14 13:22:07 -07:00
Aleix Conchillo Flaqué	b76831e677	Fall back to _turn_context_id in get_active_audio_context_id TTS services whose wire protocol does not echo the context_id back on incoming audio (Sarvam, Smallest, Soniox, Inworld, ...) call ``get_active_audio_context_id()`` to tag each chunk. That accessor returned only ``_playing_context_id`` — the playback-side cursor set asynchronously by ``_audio_context_task_handler`` when it pops a context off the serialization queue. Result: incoming audio that arrived in the gap between contexts or at the very start of a turn (before the playback loop popped) had ``context_id=None`` and was dropped with ``unable to append audio to context: no context ID provided``. Fall back to ``_turn_context_id`` (the synthesis-side cursor, set as soon as the turn's context is created) so the gap is covered without prematurely nulling the playback cursor.	2026-05-14 13:22:00 -07:00
Mark Backman	b57111743f	Merge pull request #4495 from pipecat-ai/mb/soniox-stt-lang-counter	2026-05-14 15:57:31 -04:00
Mark Backman	dcbb0070c9	Add changelog for Soniox language selection	2026-05-14 15:42:43 -04:00
Mark Backman	73278d3309	Use majority language for Soniox transcripts	2026-05-14 15:18:43 -04:00
filipi87	c8efe319b3	Adding the changelog for the changes.	2026-05-14 11:10:33 -03:00
Mark Backman	49bda11ae8	Merge pull request #4482 from pipecat-ai/mb/soniox-stt-token-language Propagate Soniox token language	2026-05-13 16:28:56 -04:00
Aleix Conchillo Flaqué	07640582ce	Merge pull request #4467 from pipecat-ai/aleix/fix-tts-ttfb-tracing Fix metrics.ttfb and partial output on TTS/STT/LLM OpenTelemetry spans	2026-05-13 13:10:52 -07:00
Mark Backman	078af6969a	Merge pull request #4473 from timofey-TK/inworld-tts-v2 Add support for Inworld TTS v2 fields	2026-05-13 15:32:16 -04:00
Mark Backman	9f40ba21c2	Add changelog for Soniox language fix	2026-05-13 15:26:10 -04:00
Mark Backman	82f0896d6a	Propagate Soniox token language	2026-05-13 15:23:22 -04:00
kompfner	7e4cd23de4	Merge pull request #4474 from pipecat-ai/pk/inworld-realtime-tools Extend cancel_on_interruption=False to Inworld Realtime (best-effort + warning)	2026-05-13 15:12:34 -04:00
TimTk	97f50c8aa2	Address review: use resolve_language, narrow delivery_mode type, update changelog - Replace custom LANGUAGE_MAP fallback in language_to_inworld_language with resolve_language(language, LANGUAGE_MAP, use_base_code=False) to match the pattern used by other services and restore the unverified-language warning - Tighten delivery_mode type from str to Literal["STABLE", "BALANCED", "CREATIVE"] - Update changelog entry to mention delivery_mode and language normalization	2026-05-13 21:43:02 +03:00
Mark Backman	08680732f6	Merge pull request #4475 from pipecat-ai/mb/cartesia-korean-fix Fix Cartesia CJK timestamp spacing	2026-05-13 13:20:42 -04:00
Mark Backman	064b68aa01	Fix Cartesia CJK timestamp spacing	2026-05-13 13:13:40 -04:00
Filipi da Silva Fuchter	b0f8ea7e28	Merge pull request #4477 from pipecat-ai/filipi/nvidia_sagemaker_follow_up NVidia TTS Sagemaker: Buffering audio to avoid glitches.	2026-05-13 14:06:44 -03:00
filipi87	ad50c8d5d5	Buffering audio to avoid glitches.	2026-05-13 14:01:03 -03:00
Mark Backman	5fef239b68	Merge pull request #4450 from pipecat-ai/mb/gpt-realtime-whisper Default OpenAI Realtime transcription to gpt-realtime-whisper	2026-05-13 09:48:33 -04:00
Filipi da Silva Fuchter	9148e307cc	Merge pull request #4464 from pipecat-ai/filipi/nvidia_sagemaker NVidia sagemaker - TTS and STT services	2026-05-13 07:53:26 -03:00
Filipi da Silva Fuchter	703d23b658	Update examples/voice/voice-nvidia-sagemaker.py Co-authored-by: Mark Backman <mark@daily.co>	2026-05-13 06:36:57 -04:00
Filipi da Silva Fuchter	227ba288da	Update examples/voice/voice-nvidia-sagemaker.py Co-authored-by: Mark Backman <mark@daily.co>	2026-05-13 06:36:45 -04:00
Timofey	39e7f9e354	Fix Inworld TTS v2 request fields	2026-05-13 11:17:31 +03:00
Aleix Conchillo Flaqué	7cc7968abb	Fix pyright errors in service_decorators.py	2026-05-12 20:10:43 -07:00
Aleix Conchillo Flaqué	52d8008783	Add LLM interruption changelog entry for #4467	2026-05-12 20:10:43 -07:00
Aleix Conchillo Flaqué	a3ce963b54	Capture partial LLM output on interruption traced_llm only attached the aggregated ``output`` attribute to the span after the wrapped function returned successfully. When the LLM call was cancelled mid-stream (e.g. interruption during generation), the accumulated text was discarded — the span had no ``output``. Moved the attribute assignment into the ``finally`` block alongside the existing TTFB write so the partial text we already captured via the patched ``push_frame`` lands on the span regardless of whether ``f`` returned normally, raised, or was cancelled.	2026-05-12 20:10:43 -07:00
Aleix Conchillo Flaqué	e70ee603b2	Add STT changelog entry for #4467	2026-05-12 20:10:43 -07:00
Aleix Conchillo Flaqué	111e59a7b1	Apply the same span-scope fix to traced_stt @traced_stt had the same root issue as @traced_tts: the span lifetime was tied to a per-transcript handler call, which doesn't match the operation we want to trace. Now uses the __set_name__ pattern to install: - A push_frame wrapper that drives one STT span per finalized TranscriptionFrame. The span is anchored at speech start (VADUserStartedSpeakingFrame.timestamp - start_secs) but lazy-opened on the first TranscriptionFrame. Opening earlier (on VAD or UserStartedSpeakingFrame) races with TurnTraceObserver._handle_turn_started, which runs as a background task via _call_event_handler (sync=False), so the span would end up parented to the previous turn. Deferring the open to the first TranscriptionFrame avoids that race because STT only emits transcripts well after the turn observer has set the current turn's context. - A stop_ttfb_metrics wrapper that closes the span on the TTFB-timeout path (called with end_time != None from stt_service.py:566). The span is marked stt.timed_out=True and its end_time is pinned to the timeout's end_time (= _last_transcript_time) so the duration reflects when STT actually stopped responding, not when the timeout fired. Span lifecycle: - Open: lazy on first TranscriptionFrame of a segment. - Close (success): finalized=True attaches metrics.ttfb and closes the span. Multiple finalized transcripts in a single turn produce multiple spans. - Close (timeout): stop_ttfb_metrics(end_time=...) closes with stt.timed_out=True. - Close (orphan): UserStoppedSpeakingFrame closes any still-open span with stt.incomplete=True (covers turns where no finalized transcript and no timeout fired). No changes required outside service_decorators.py — stt_service.py and every per-service file are untouched.	2026-05-12 20:10:43 -07:00
Aleix Conchillo Flaqué	079282d140	Add changelog for #4467	2026-05-12 20:10:43 -07:00
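
Note on a3ce963b54 (Capture partial LLM output on interruption): the commit message describes moving the span's ``output`` assignment into a ``finally`` block so that text accumulated before a cancellation still lands on the span. The sketch below illustrates that general pattern only. It is not the code from ``service_decorators.py``; ``Span``, ``slow_stream``, ``traced_generate`` and ``run_interrupted`` are hypothetical names introduced here for illustration.

```python
import asyncio


class Span:
    """Hypothetical stand-in for a tracing span; it only records attributes."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        self.attributes[key] = value


async def slow_stream(first_chunk_seen):
    """Hypothetical LLM stream: yields one chunk, then stalls until cancelled."""
    yield "Hello"
    first_chunk_seen.set()  # the consumer has already stored "Hello" by now
    await asyncio.sleep(3600)
    yield ", world"


async def traced_generate(span, stream):
    """Accumulate streamed text and record it on the span in all exit paths."""
    collected = []
    try:
        async for chunk in stream:
            collected.append(chunk)
        return "".join(collected)
    finally:
        # Runs on normal return, on exceptions, and on cancellation,
        # so partial text is never discarded.
        span.set_attribute("output", "".join(collected))


async def run_interrupted():
    """Cancel the generation mid-stream and return the span for inspection."""
    span = Span()
    first_chunk_seen = asyncio.Event()
    task = asyncio.create_task(traced_generate(span, slow_stream(first_chunk_seen)))
    await first_chunk_seen.wait()
    task.cancel()  # simulates an interruption during generation
    try:
        await task
    except asyncio.CancelledError:
        pass
    return span
```

Calling ``asyncio.run(run_interrupted())`` returns a span whose ``output`` attribute holds the text received before the cancellation. If the assignment sat after the loop instead of in ``finally``, the attribute would be missing in that case.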

9537 Commits