Commit Graph

9522 Commits

Author SHA1 Message Date
Filipi da Silva Fuchter
e0e3cd336a Merge pull request #4529 from pipecat-ai/filipi/squash_skill
New skill to squash commits.
2026-05-20 16:06:23 -03:00
Mark Backman
709a0ce839 Merge pull request #4527 from pipecat-ai/mb/fix-elevenlabs-keepalive-1008
Fix ElevenLabs keepalive racing context-init (1008 disconnects)
2026-05-20 11:21:17 -04:00
Mark Backman
be93350eae Merge pull request #4522 from pipecat-ai/mb/stt-latency-smallest
Add P99 latency for Smallest AI, Mistral, XAI STT
2026-05-20 11:21:00 -04:00
Mark Backman
4a96ab7073 Merge pull request #4524 from pipecat-ai/mb/fix-runner-imports
Improve runner optional transport handling
2026-05-20 11:16:16 -04:00
filipi87
c321f50e76 New skill to squash commits. 2026-05-20 10:29:03 -03:00
Filipi da Silva Fuchter
bca337f97e Merge pull request #4380 from pipecat-ai/filipi/smart_text
Smart Text Handling
2026-05-20 10:18:30 -03:00
filipi87
5d9e8c5ac5 Removing debug log. 2026-05-20 10:13:46 -03:00
Mark Backman
70773bce0a Add changelog for PR #4527 2026-05-20 09:08:47 -04:00
filipi87
8bdb49bd1a chore: add changelogs for word-timestamp and frame-ordering fixes 2026-05-20 10:03:30 -03:00
filipi87
81bb81c1d0 test: add automated tests for word tracking, frame sequencing, and Cartesia TTS
Adds tests for AggregatedFrameSequencer, WordCompletionTracker, and
word_timestamp_utils (including CJK language scenarios). Updates existing
Cartesia TTS and TTS frame ordering tests to cover the new behaviours.
2026-05-20 10:03:26 -03:00
filipi87
e1bdee598c fix: preserve raw_text through TTS pipeline for correct LLM context attribution
TTSTextFrame entries were losing their original text structure when word
timestamps were enabled. AggregatedTextFrame now carries a raw_text field with
the original LLM-produced text (including pattern delimiters such as
<card>...</card>). The assistant context receives properly-tagged content
rather than the cleaned words returned by the TTS provider. Also handles words
that straddle two sentence boundaries by splitting and attributing each part
to its correct source frame.
2026-05-20 10:03:21 -03:00
filipi87
185a89bb3b fix: strip Cartesia SSML tags from word timestamp entries
SSML markup (e.g. <spell>, <emotion>, <break>) was leaking into word entries
returned by the Cartesia word-timestamps API. Tags are now stripped before
processing so word-to-text attribution remains accurate when SSML is present
in the TTS input.
2026-05-20 10:03:15 -03:00
filipi87
6b9deefbe3 fix: preserve frame insertion order in BaseOutputTransport for equal PTS values
Frames sharing the same presentation timestamp were being reordered by the
priority queue. Adds a monotonic counter as a tiebreaker so frames with equal
PTS are always emitted in insertion order, preventing subtle audio/text
sequencing bugs.
2026-05-20 10:03:08 -03:00
filipi87
deefc32faf fix: hold skipped TTS frames in position until preceding spoken frames complete
Skipped frames (e.g. code blocks filtered via skip_aggregator_types) were
emitted to the assistant context immediately instead of waiting for preceding
spoken frames to finish. Introduces AggregatedFrameSequencer to hold each
frame's slot and flush only after all earlier spoken sentences are complete,
keeping context ordering correct.
2026-05-20 10:03:03 -03:00
Mark Backman
a5e6886b80 Fix ElevenLabs keepalive racing context-init (1008 disconnects)
The keepalive could fire for a new turn's context before that context's
voice_settings context-init was sent, making the keepalive the context's
first message (no voice_settings) and causing ElevenLabs to reject the
later init with a 1008 policy violation. The keepalive now only targets a
context once its context-init has been sent (tracked in _context_init_sent).
2026-05-20 08:59:01 -04:00
Mark Backman
d11a4ba0cd Use shared telephony route availability checks 2026-05-20 08:57:48 -04:00
Mark Backman
38407e091d Add p99 values for Mistral and XAI 2026-05-19 22:51:33 -04:00
Mark Backman
82cd931efa Merge pull request #4306 from YFortin/fix/azure-tts-last-word-race
fix(azure-tts): Route completion through word boundary queue to prevent last word from being missed
2026-05-19 22:27:50 -04:00
Mark Backman
33e5d1f89b Add changelog for PR #4522 2026-05-19 18:33:58 -04:00
Mark Backman
861dd23873 Add changelog for runner updates 2026-05-19 17:31:07 -04:00
Mark Backman
b825dd779e Clarify runner startup banner 2026-05-19 17:31:07 -04:00
Mark Backman
1487da53a9 Improve runner optional transport handling 2026-05-19 17:03:16 -04:00
Mark Backman
aff84a5d9e Add P99 latency for Smallest AI STT 2026-05-19 11:05:15 -04:00
Mark Backman
c09f6d5adb Merge pull request #4052 from Vonage/vonage_video_connector_transport
Vonage WebRTC Transport Integration
2026-05-19 10:56:20 -04:00
asilvestre
e2d249e5d9 adding uv.lock 2026-05-19 16:33:38 +02:00
asilvestre
956b39b0dc remove extraenous await in cleanup 2026-05-19 16:33:04 +02:00
asilvestre
bc769eaa82 Changing the example to use OpenAI 2026-05-18 14:40:56 +02:00
asilvestre
ee5aa4dc71 SubscribeSettings to be pydantic and comment fixes 2026-05-18 14:40:56 +02:00
asilvestre
dd38fbc735 add documentation entry 2026-05-18 14:40:56 +02:00
asilvestre
a1c40df471 add documentation entry 2026-05-18 14:40:56 +02:00
asilvestre
c4ff9300c9 fix linting and typechecking 2026-05-18 14:40:56 +02:00
asilvestre
cab4585cbb added changelog 2026-05-18 14:40:56 +02:00
Antoni Silvestre
18368d047e Linting and changes to adapt to v1.0 2026-05-18 14:40:56 +02:00
asilvestre
e3abb4b6d7 apply suggestions in PR 2026-05-18 14:40:56 +02:00
Antoni Silvestre
0fd971d59d Update src/pipecat/runner/types.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-05-18 14:40:56 +02:00
asilvestre
c61672194d Vonage Video Connector Transport 2026-05-18 14:40:49 +02:00
Filipi da Silva Fuchter
c51a817efa Merge pull request #4442 from pipecat-ai/filipi/runner_all_transports
Unified start route to make all transports available
2026-05-18 09:27:44 -03:00
Bismeet singh
d85eda6da8 Merge pull request #4507 from BismeetSingh/fix/elevenlabs-stt-service-crash-language
Fix/elevenlabs stt service crash language
2026-05-17 10:17:07 -04:00
Aleix Conchillo Flaqué
71feb42711 Merge pull request #4503 from pipecat-ai/changelog-1.2.1
Release 1.2.1 - Changelog Update
v1.2.1
2026-05-15 15:19:55 -07:00
aconchillo
6b93ca0cb6 Update changelog for version 1.2.1 2026-05-15 22:18:46 +00:00
Aleix Conchillo Flaqué
b6ecce754b Merge pull request #4501 from pipecat-ai/aleix/fix-filter-incomplete-tool-calls
Fix filter-incomplete + function-calling deadlock
2026-05-15 15:11:45 -07:00
Aleix Conchillo Flaqué
d39e6bf921 Add changelog for #4501 2026-05-15 14:54:51 -07:00
Aleix Conchillo Flaqué
63064860ef Move OpenAITTSService instructions into Settings in the example
Mirrors the deprecation in ``OpenAITTSService.__init__``: ``instructions``
is now a Settings field. The constructor still accepts it for backward
compatibility but the canonical path is through ``Settings``.
2026-05-15 14:54:51 -07:00
Aleix Conchillo Flaqué
f5158d51e7 Add filter-incomplete + function-calling turn-management example
A copy of ``turn-management-filter-incomplete-turns.py`` extended with
a ``get_weather(location)`` direct function. Exercises the path where
the LLM responds to a complete user turn by calling a tool — used to
reproduce (and now verify the fix for) the ``_user_speaking`` gating
bug between filter-incomplete and function calls.
2026-05-15 14:54:51 -07:00
Aleix Conchillo Flaqué
94dbd2fa68 Broadcast UserTurnInferenceCompletedFrame on tool calls in filter-incomplete
With ``filter_incomplete_user_turns`` enabled, an LLM that responded to
a user turn by calling a tool (without first emitting a ✓ marker)
never finalized the user turn. ``UserStoppedSpeakingFrame`` stayed
deferred, the assistant aggregator kept ``_user_speaking=True``, and
when ``FunctionCallResultFrame`` arrived its ``not self._user_speaking``
gate dropped the context push — the LLM continuation never ran and
the call hung silently.

Broadcast ``UserTurnInferenceCompletedFrame`` on
``FunctionCallsStartedFrame`` (i.e. the moment the LLM commits to a
tool call, before the function dispatches), gated by a new
``_turn_completion_broadcasted`` flag so the ✓ path and the tool-call
path don't both fire. The flag resets in ``_turn_reset`` alongside
the other per-turn state.

Emitting on the start frame rather than ``LLMFullResponseEndFrame``
also shrinks the race window — ``UserStoppedSpeakingFrame`` (a
``SystemFrame``) has the maximum possible head start over the
``FunctionCallResultFrame`` (``DataFrame``) that follows.
2026-05-15 14:50:35 -07:00
Mark Backman
c6ea6c6522 Merge pull request #4500 from pipecat-ai/mb/update-gradium-endpoints
Update Gradium STT/TTS endpoints to region-neutral URLs
2026-05-15 15:59:14 -04:00
Mark Backman
58a22aeeb1 Add changelog for #4500 2026-05-15 15:19:39 -04:00
Mark Backman
5403aa56e4 Remove Gradium endpoint overrides from voice example
Drop the explicit US-region URLs so the example picks up the new
region-neutral defaults in GradiumSTTService and GradiumTTSService.
2026-05-15 15:17:12 -04:00
Mark Backman
0e0d76d020 Update Gradium endpoints to region-neutral URLs
Drop the EU-region default from the STT/TTS WebSocket URLs in favor of
the generic api.gradium.ai endpoint, and remove the explicit overrides
from the examples so they pick up the new defaults.
2026-05-15 15:02:05 -04:00
filipi87
b493ed8d3a Removing the websocket transport from elevenlabs example. 2026-05-15 10:11:38 -03:00