pipecat

Author	SHA1	Message	Date
Aleix Conchillo Flaqué	0ccdd808e6	Fix traced_tts so metrics.ttfb reflects the real TTFB. Previously @traced_tts scoped the span to the lifetime of run_tts(). For streaming TTS services run_tts() returns as soon as the synthesis request is sent, long before audio chunks arrive, so: the span duration measured the WebSocket-send time, not synthesis time; the first synthesis recorded the WS-send duration as metrics.ttfb (via the in-progress fallback in FrameProcessorMetrics.ttfb); and subsequent syntheses recorded the previous call's TTFB on the current span (off-by-one). The decorator now uses a __set_name__ descriptor to wrap the owning class's setup() at class definition time. setup() installs per-instance patches on create_audio_context, append_to_audio_context, remove_audio_context, on_audio_context_completed, and reset_active_audio_context. These patches own the span lifetime: create_audio_context opens the span and sets baseline attributes; append_to_audio_context records metrics.ttfb on the first TTSAudioRawFrame (when stop_ttfb_metrics has produced a real value) and ends the span on an appended TTSStoppedFrame; on_audio_context_completed ends the span on natural completion (handles services that auto-push TTSStoppedFrame via push_frame, bypassing append_to_audio_context); remove_audio_context is a safety net for explicit removal paths; reset_active_audio_context is the interruption hook (always reached from _handle_interruption) and marks the span tts.interrupted=true only when nothing else has closed it. The run_tts wrapper now only attaches per-call attributes (text, metrics.character_count) to the already-open span. No changes required in tts_service.py or in any of the per-service files.	2026-05-12 20:10:43 -07:00
Mark Backman	3e8c5c08f4	Clarify realtime settings update condition	2026-05-12 17:48:53 -04:00
Mark Backman	644030584f	Centralize OpenAI audio constants	2026-05-12 17:48:53 -04:00
filipi87	0740021ff4	Removing changelog for sanitize_text_for_tts	2026-05-12 18:29:35 -03:00
filipi87	68f265fa62	Fixing ruff format.	2026-05-12 18:28:14 -03:00
filipi87	b9f052079d	Removing sanitize_text_for_tts	2026-05-12 18:22:15 -03:00
filipi87	130bb7371c	Removing sanitize_text_for_tts	2026-05-12 18:21:47 -03:00
filipi87	5d61763987	Refactoring how we are reconnecting the STT.	2026-05-12 18:20:19 -03:00
filipi87	7984556692	Fixing typecheck.	2026-05-12 18:00:07 -03:00
filipi87	bea9e4b3ba	New example voice-nvidia-sagemaker.py	2026-05-12 17:44:11 -03:00
Mark Backman	19df443500	Merge pull request #4471 from pipecat-ai/mb/fix-gstreamer-pyright-import	2026-05-12 16:34:48 -04:00
Mark Backman	07f241143b	Merge pull request #4469 from pipecat-ai/mb/remove-vad-analyzer-runner-utils-docstring	2026-05-12 16:34:27 -04:00
Mark Backman	2fdb9bbf42	Merge pull request #4462 from pipecat-ai/mb/cartesia-sonic-3.5	2026-05-12 16:34:04 -04:00
filipi87	0146947b68	Addressing the comments left in the PR review.	2026-05-12 17:12:19 -03:00
Paul Kompfner	863a1bf177	Add changelog for #4474	2026-05-12 16:04:12 -04:00
Paul Kompfner	58333b2705	Extend cancel_on_interruption=False to InworldRealtimeLLMService (best-effort). Same async-tool routing approach as #4441: detect async-tool messages in the LLM context, deliver the final result via the formal tool-result channel. Caveat: as of this writing, Inworld Realtime doesn't appear to handle the resulting delayed tool result reliably, so the routing is best-effort and the service emits a one-time warning when async-tool messages are seen. Streamed intermediate results remain unsupported. Also adds function calling to the realtime-inworld.py example, and softens the Inworld mention in the #4447 changelog now that the exclusion is being closed.	2026-05-12 16:03:34 -04:00
TimTk	ecaff1d1eb	Fix changelog fragment number	2026-05-12 22:21:59 +03:00
Mark Backman	e2bfa6352f	Add changelog for #4450	2026-05-12 15:20:57 -04:00
Mark Backman	abd28e2ac1	Update OpenAI realtime transcription default	2026-05-12 15:20:57 -04:00
kompfner	88deebbf5f	Merge pull request #4472 from pipecat-ai/pk/default-gpt-realtime-2: Switch OpenAIRealtimeLLMService default model to gpt-realtime-2	2026-05-12 15:17:12 -04:00
TimTk	9b55d4ddd4	Add support for Inworld TTS v2 fields	2026-05-12 22:13:09 +03:00
filipi87	c2bdc1aada	Fixing metrics and adding extra guard after sanitization.	2026-05-12 16:11:01 -03:00
Paul Kompfner	fc0589e8f1	Switch OpenAIRealtimeLLMService default model to gpt-realtime-2	2026-05-12 14:57:59 -04:00
kompfner	67f8d34e9f	Merge pull request #4470 from pipecat-ai/pk/gpt-realtime-2-reasoning-effort: Add reasoning support to OpenAIRealtimeLLMService for gpt-realtime-2	2026-05-12 14:43:39 -04:00
kompfner	d3b8710720	Merge pull request #4465 from pipecat-ai/pk/gpt-realtime-2: Handle gpt-realtime-2 multi-output-item audio responses	2026-05-12 14:30:15 -04:00
Mark Backman	86e2aa85d3	Fix GStreamer pipeline source pyright import	2026-05-12 14:16:36 -04:00
Paul Kompfner	b89500256d	Drop debug logging added while investigating multi-output-item audio	2026-05-12 14:05:16 -04:00
Paul Kompfner	a52bdef32b	Add reasoning support to OpenAIRealtimeLLMService for gpt-realtime-2	2026-05-12 13:55:19 -04:00
Mark Backman	afd9fc5fdf	Remove vad_analyzer from create_transport docstring example	2026-05-12 13:50:17 -04:00
filipi87	7f98dba925	Changelog files for the new nvidia features.	2026-05-12 14:43:12 -03:00
filipi87	6a27ed35b1	Fixing the Bidi client to accept None.	2026-05-12 12:19:30 -03:00
filipi87	a34864d643	Fixed ruff, pyright, and test_service_init failures	2026-05-12 11:39:52 -03:00
Paul Kompfner	007fa3a3a8	Handle gpt-realtime-2 multi-output-item audio responses. A single Realtime API response can now contain more than one audio item (observed with gpt-realtime-2), and the first item's audio.done can arrive after deltas from the second have started arriving. Deltas still arrive strictly in playback order across items, so we keep forwarding them as received, matching OpenAI's reference implementation. Adjusted OpenAIRealtimeLLMService so a multi-item response is treated as one continuous TTS turn: _handle_evt_audio_delta, on item switch, advances the tracked item in place (resetting total_size) without emitting another TTSStartedFrame, and truncation now always targets the latest item; _handle_evt_audio_done is debug-trace only and no longer pushes TTSStoppedFrame; _handle_evt_response_done pushes a single TTSStoppedFrame per turn, bookending the audio with the Started pushed on the first delta. Added tests covering single-item, overlapping multi-item, non-overlapping multi-item, and interrupt-during-multi-item (last-item-wins truncation).	2026-05-12 10:34:50 -04:00
filipi87	5dd7413c00	Nvidia Sagemaker Nemotron ASR STT service	2026-05-12 11:16:00 -03:00
filipi87	8e0a338d96	Nvidia Sagemaker Magpie TTS service	2026-05-12 11:15:42 -03:00
filipi87	d6655e7a5e	Fixing ruff format.	2026-05-12 10:40:09 -03:00
filipi87	33b73df6ec	Changing the websocket route to return the same data as PCC.	2026-05-12 10:38:15 -03:00
Mark Backman	d65aee9181	Add changelog for #4462	2026-05-11 17:34:00 -04:00
Mark Backman	1755016679	Update default Cartesia TTS model to sonic-3.5	2026-05-11 17:33:40 -04:00
Mark Backman	b7f6298601	Merge pull request #4461 from pipecat-ai/mb/security-vuln-2025-05-11: Update uv.lock for urllib3 and langchain-core	2026-05-11 15:58:05 -04:00
Mark Backman	396873ac7e	Merge pull request #4460 from pipecat-ai/mb/codex-skills: Add Codex skills and AGENTS.md	2026-05-11 15:57:49 -04:00
Mark Backman	5b33964a1b	Update uv.lock for urllib3 and langchain-core	2026-05-11 15:51:01 -04:00
Mark Backman	8b37cd1d3a	Add agent-neutral repository instructions	2026-05-11 15:43:43 -04:00
Mark Backman	7a2b667fa1	Add Codex skill symlinks	2026-05-11 15:27:49 -04:00
Mark Backman	ee8c607315	Merge pull request #4452 from pipecat-ai/mb/cleanup-frontmatter: Add cleanup skill frontmatter	2026-05-11 09:33:44 -04:00
Aleix Conchillo Flaqué	71578e7151	Merge pull request #4449 from pipecat-ai/aleix/base-object-task-manager: Move create_task and cancel_task from FrameProcessor to BaseObject	2026-05-10 20:36:54 -07:00
Aleix Conchillo Flaqué	77058b01c4	Add changelog for #4449	2026-05-10 20:34:52 -07:00
Aleix Conchillo Flaqué	4f85e7c089	Fix pyright cr_code access on Coroutine in BaseObject.create_task. `collections.abc.Coroutine` doesn't expose `cr_code`/`co_name`; only native coroutine objects do. Use `getattr` chains so pyright is happy and any non-native awaitable falls back to a generic task name instead of crashing.	2026-05-10 20:34:52 -07:00
Aleix Conchillo Flaqué	15531c8112	Wire TaskObserver via setup() instead of constructor. TaskObserver previously took a TaskManager in __init__ and reached into it directly. Since BaseObject now provides task_manager / create_task / cancel_task, drop the constructor argument and call `observer.setup(task_manager)` from PipelineTask._setup() before starting it.	2026-05-10 20:34:52 -07:00
Mark Backman	b9e8f13105	Add cleanup skill frontmatter	2026-05-09 12:30:20 -07:00
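
Commit 0ccdd808e6 describes a decorator that uses a __set_name__ descriptor to wrap the owning class's setup(), which in turn installs per-instance patches that own the span lifetime. The following is only a minimal sketch of that general pattern, not pipecat's code: traced_call, BaseService, DemoService, open_span, and the dict standing in for a tracing span are all hypothetical, and setup() is synchronous here purely to keep the example short.

```python
import functools


def _install_patches(instance):
    """Patch one instance so that creating an audio context opens a 'span'."""
    original_create = instance.create_audio_context  # bound method

    def create_audio_context(context_id):
        # The span (a plain dict in this sketch) is opened here, not in run_tts().
        instance.open_span = {"context_id": context_id}
        return original_create(context_id)

    # Instance attribute shadows the class method for this instance only.
    instance.create_audio_context = create_audio_context


class traced_call:
    """Hypothetical decorator that hooks the owner's setup() via __set_name__."""

    def __init__(self, func):
        self._func = func

    def __set_name__(self, owner, name):
        func = self._func
        original_setup = owner.setup  # resolved through the MRO

        @functools.wraps(original_setup)
        def setup(instance, *args, **kwargs):
            result = original_setup(instance, *args, **kwargs)
            _install_patches(instance)
            return result

        @functools.wraps(func)
        def wrapper(instance, text):
            # Only attach per-call attributes to the already-open span.
            if instance.open_span is not None:
                instance.open_span["text"] = text
            return func(instance, text)

        owner.setup = setup            # wrap setup() at class definition time
        setattr(owner, name, wrapper)  # replace the descriptor with a function


class BaseService:
    def __init__(self):
        self.open_span = None
        self.ready = False

    def setup(self):
        self.ready = True

    def create_audio_context(self, context_id):
        return context_id


class DemoService(BaseService):
    @traced_call
    def run_tts(self, text):
        return f"sent:{text}"


service = DemoService()
service.setup()
service.create_audio_context("ctx-1")
result = service.run_tts("hello")
```

In this sketch the decorated method never opens or closes anything itself; it only annotates whatever span the patched create_audio_context opened, which mirrors the division of responsibilities the commit message describes.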

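Commit 4f85e7c089 mentions using `getattr` chains so that a non-native awaitable falls back to a generic task name instead of crashing. A minimal sketch of that idea follows; task_name_for, fetch_audio, CustomAwaitable, and the "task" fallback are hypothetical names chosen for illustration and are not taken from the repository.

```python
from typing import Any


def task_name_for(coroutine: Any, fallback: str = "task") -> str:
    """Derive a readable task name, tolerating objects without cr_code."""
    # Native coroutine objects expose cr_code; other awaitables may not.
    code = getattr(coroutine, "cr_code", None)
    name = getattr(code, "co_name", None)
    return name if isinstance(name, str) else fallback


async def fetch_audio() -> str:
    return "done"


class CustomAwaitable:
    """Hypothetical awaitable that has no cr_code attribute."""

    def __await__(self):
        yield from ()
        return "done"


native = fetch_audio()
native_name = task_name_for(native)
custom_name = task_name_for(CustomAwaitable())
native.close()  # avoid a "coroutine was never awaited" warning
```

Because both lookups default to None, neither a missing `cr_code` nor a missing `co_name` raises AttributeError, and static checkers do not need the attributes to exist on the declared type.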
9537 Commits