9537 Commits

Author SHA1 Message Date
filipi87
5f256e241c Fixing a race condition when cleaning up the Daily transport. 2026-05-07 11:29:57 -03:00
Mark Backman
954f63dc7b Document deprecation docstring convention in CLAUDE.md.
Adds an explicit Code Style bullet for the `.. deprecated::` Sphinx
directive (forbidding inline `[DEPRECATED]` tags) and extends the
Docstring Example with a Pydantic params class showing the directive
inside a `Parameters:` block — the context CONTRIBUTING.md's existing
example didn't cover.
2026-05-07 10:03:43 -04:00
Mark Backman
6cc66a3df1 Update video_out_bitrate deprecation to use sphinx directive.
Replaces the inline `[DEPRECATED]` tag with a `.. deprecated:: 1.1.0`
directive per CONTRIBUTING.md docstring conventions, so the deprecation
shows up properly in the rendered docs.
2026-05-07 09:57:21 -04:00
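As an illustration of the docstring convention described in the two commits above, here is a minimal sketch of a `.. deprecated::` directive nested under a `Parameters:` entry. The class and field names are hypothetical, and a plain dataclass stands in for the Pydantic params class so the snippet needs no third-party packages; only the directive syntax and the `1.1.0` version come from the commit messages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleOutputParams:
    """Hypothetical output settings, used only to show the docstring style.

    Parameters:
        framerate: Frames per second for outgoing video.
        legacy_bitrate: Target bitrate for outgoing video.

            .. deprecated:: 1.1.0
                Placeholder note. A real docstring would say what to use instead.
    """

    framerate: int = 30
    legacy_bitrate: Optional[int] = None
```

Sphinx renders the directive as a dedicated deprecation notice, which an inline `[DEPRECATED]` tag would not produce.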
filipi87
a445399337 Fixing a bug in the ElevenLabs TTS refactor where alignment state was reset too early mid-turn. 2026-05-07 10:10:54 -03:00
filipi87
5ed2057599 Merge branch 'main' into filipi/refactoring_elevenlabs 2026-05-07 09:32:53 -03:00
Filipi da Silva Fuchter
cacde00e26 Merge pull request #4435 from pipecat-ai/filipi/uninterruptible_frame
Refactoring TTSService to preserve uninterruptible frames.
2026-05-07 08:46:42 -03:00
Filipi da Silva Fuchter
b1b598f65e Merge pull request #4434 from pipecat-ai/filipi/fix_interruption_regression
Fix interruption blocked by slow non-uninterruptible frame in queue
2026-05-07 08:46:10 -03:00
filipi87
c48ee93892 Adding changelog entry for the fix. 2026-05-06 16:30:22 -03:00
filipi87
cf22dac171 Refactoring TTSService to preserve uninterruptible frames. 2026-05-06 16:26:45 -03:00
filipi87
36f6e22aee Adding changelog for the interruption fix. 2026-05-06 15:39:27 -03:00
filipi87
921a7a46cb Fix interruption blocked by slow non-uninterruptible frame in queue
When a non-uninterruptible frame was being processed slowly and an
uninterruptible frame was waiting in the queue, _start_interruption
skipped task cancellation. This caused interruptions to stall until
the slow frame finished, even though it had no reason to block them.

The fix: only skip cancellation when the *current* frame is
uninterruptible. Uninterruptible frames already in the queue are
preserved regardless, because __create_process_task calls
__reset_process_queue internally, which always retains them.

Fixes: https://github.com/pipecat-ai/pipecat/issues/4412
2026-05-06 15:35:43 -03:00
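The rule described in the commit above can be sketched as a simplified model. This is not the actual Pipecat code: `Frame`, `should_cancel_task`, and `reset_queue` are hypothetical names, and treating "no frame in progress" as cancellable is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """Hypothetical stand-in for a pipeline frame."""

    name: str
    uninterruptible: bool = False


def should_cancel_task(current: Optional[Frame]) -> bool:
    """Cancel on interruption unless the frame in progress is uninterruptible.

    Frames waiting in the queue do not influence this decision.
    """
    return current is None or not current.uninterruptible


def reset_queue(queue: List[Frame]) -> List[Frame]:
    """Drop interruptible frames and always keep uninterruptible ones."""
    return [frame for frame in queue if frame.uninterruptible]
```

With a slow interruptible frame in progress and an uninterruptible frame queued, `should_cancel_task` returns `True` and `reset_queue` still retains the queued uninterruptible frame, which is the combination the fix relies on.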
filipi87
fda18a9afa Adding changelog for the ElevenLabs improvement. 2026-05-06 14:58:18 -03:00
filipi87
d146a7f8e0 Refactoring ElevenLabs to send close_context as soon as the turn context is complete. 2026-05-06 14:55:49 -03:00
Filipi da Silva Fuchter
90f0f7cd27 Merge pull request #4431 from pipecat-ai/filipi/tts_deadlock
Fixing TTSService deadlock.
2026-05-06 14:52:04 -03:00
Mark Backman
37376b3506 Merge pull request #4429 from pipecat-ai/mb/update-grok-default-llm-model
fix(xai): update default Grok model to grok-4.20-non-reasoning
2026-05-06 13:41:05 -04:00
Mark Backman
729418c2b7 Merge pull request #4428 from pipecat-ai/mb/deprecate-resampy
chore(audio): deprecate ResampyResampler
2026-05-06 13:40:51 -04:00
filipi87
4512038a17 Creating a changelog entry for the fix. 2026-05-06 13:36:20 -03:00
filipi87
a23baf9de6 Fixing TTSService deadlock. 2026-05-06 13:32:26 -03:00
Mark Backman
d18fe7c39c feat(rtvi): type UI accessibility snapshots 2026-05-06 11:29:19 -04:00
Mark Backman
41124dc494 refactor(rtvi): clarify UI message names 2026-05-06 11:08:25 -04:00
Filipi da Silva Fuchter
95db08646c Merge pull request #4430 from pipecat-ai/filipi/flux_audio
Implementing dynamic watchdog timeout for Deepgram Flux STT
2026-05-06 11:40:06 -03:00
filipi87
03e5ebb266 Improving watchdog_min_timeout description. 2026-05-06 11:37:18 -03:00
filipi87
5daf267c11 Adding changelogs. 2026-05-06 11:26:14 -03:00
filipi87
1cb77b422a Created watchdog_min_timeout so the default value can be changed. 2026-05-06 11:22:37 -03:00
filipi87
0c779b4c3d Implementing dynamic watchdog timeout for Deepgram Flux STT 2026-05-06 11:01:58 -03:00
Mark Backman
138991418a docs(changelog): add 4429 entry for Grok default model update 2026-05-06 09:51:01 -04:00
Mark Backman
94e136a6b7 fix(xai): update default Grok model to grok-4.20-non-reasoning
grok-3 is being retired from the xAI API on May 15, 2026. Switch the
default to grok-4.20-non-reasoning, which xAI recommends for non-reasoning
workloads and is appropriate for real-time voice AI.
2026-05-06 09:48:39 -04:00
Mark Backman
9598e262b5 docs(changelog): add 4428 deprecation entry for ResampyResampler 2026-05-06 09:41:14 -04:00
Mark Backman
8c3521f2e4 chore(audio): deprecate ResampyResampler in favor of SOXR resamplers
Emits a DeprecationWarning on instantiation. ResampyResampler will be
removed in Pipecat 2.0 along with the default resampy and numba
dependencies.
2026-05-06 09:40:13 -04:00
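Emitting a `DeprecationWarning` on instantiation, as the commit above describes, generally looks like the following sketch. `LegacyResampler` is a hypothetical class and the warning text is illustrative; neither is copied from Pipecat.

```python
import warnings


class LegacyResampler:
    """Hypothetical resampler that warns whenever it is constructed."""

    def __init__(self) -> None:
        # stacklevel=2 attributes the warning to the caller's line rather
        # than to this constructor.
        warnings.warn(
            "LegacyResampler is deprecated and will be removed in a future "
            "release; use a SOXR-based resampler instead.",
            DeprecationWarning,
            stacklevel=2,
        )
```

Python hides `DeprecationWarning` by default outside of `__main__` and test runners, so users may need `-W default` or `warnings.simplefilter("default")` to see it.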
Mark Backman
eda98fb13f Merge pull request #4424 from pipecat-ai/mb/revert-elevenlabs-tts-alignment
fix(elevenlabs): only use normalizedAlignment when pronunciation dict is set
2026-05-06 08:27:25 -04:00
Mark Backman
3722ee223c Merge pull request #4419 from pipecat-ai/mb/fix-changelog-entry-4416
Fix changelog filename for 4416
2026-05-05 14:50:24 -04:00
Mark Backman
2620e76dab docs(elevenlabs): clarify alignment leading-space handling 2026-05-05 14:49:41 -04:00
Mark Backman
2447db766e docs(changelog): add 4424 entry for elevenlabs alignment selection fix 2026-05-05 14:49:41 -04:00
Mark Backman
61a81ed87b fix(elevenlabs): use alignment by default, normalizedAlignment only with pronunciation dicts
PR #4344 unconditionally switched to normalizedAlignment to fix garbled
words with pronunciation dictionaries (#4316). But normalizedAlignment
returns the post-normalized form of what was spoken - including
romanization of non-Latin scripts (Chinese rendered as pinyin), which
ends up in the LLM context and degrades subsequent turns.

Gate the switch on pronunciation_dictionary_locators being configured.
Adds a _select_alignment helper with preferred-with-fallback (both
fields are nullable per the API schema), used by both the WebSocket
and HTTP services. Tests cover dictionary mode, default mode, fallback
when preferred is missing or null, and HTTP field-name variants.
2026-05-05 14:49:41 -04:00
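The preferred-with-fallback selection described in the commit above might look roughly like this. The function name, signature, and key parameters are assumptions for illustration; the real `_select_alignment` helper may differ.

```python
from typing import Any, Mapping, Optional


def select_alignment(
    message: Mapping[str, Any],
    use_normalized: bool,
    plain_key: str = "alignment",
    normalized_key: str = "normalizedAlignment",
) -> Optional[Any]:
    """Return the preferred alignment field, falling back to the other one.

    Either field may be missing or null, so both cases are treated the same.
    """
    if use_normalized:
        preferred, fallback = normalized_key, plain_key
    else:
        preferred, fallback = plain_key, normalized_key
    value = message.get(preferred)
    if value is None:
        value = message.get(fallback)
    return value
```

Here `use_normalized` would be true only when pronunciation dictionary locators are configured. Passing the key names as parameters lets one helper cover payloads that spell the fields differently.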
Mark Backman
735cd09c7e Merge pull request #4422 from cshape/tts-2
feat(inworld): default to inworld-tts-2
2026-05-05 14:00:04 -04:00
Paul Kompfner
2616076bec Add deterministic dev-error demo example
``examples/function-calling/function-calling-missing-handler.py``
demonstrates the missing-handler path by deliberately advertising a
tool to the LLM without registering its handler — what happens when a
developer forgets to call ``register_function``. Exercises the new
``logger.error`` severity end-to-end without needing to coax the LLM
into hallucinating.
2026-05-05 13:08:00 -04:00
Paul Kompfner
40667e50fc Add changelog for #4404 2026-05-05 13:03:49 -04:00
Paul Kompfner
e06e0c0282 Mitigate tool-call-related hallucination
When tools change mid-conversation, LLMs can produce a few different
flavors of tool-call-related hallucination: calling tools that have
been removed, avoiding tools that have been re-added, or hallucinating
output (made-up answers or tool-call-shaped non-tool-calls) when tools
are unavailable.

This change introduces an opt-in ``add_tool_change_messages`` flag on
the LLM aggregators (preferred entry point: ``LLMContextAggregatorPair(
..., add_tool_change_messages=True)``) that appends a developer-role
message to the context whenever ``LLMSetToolsFrame`` changes the set
of advertised standard tools. Helps the LLM stay coherent across tool
changes by spelling out exactly what just became available or
unavailable. Both aggregators participate; whichever handles the
frame first wins, and the other (if any) sees an empty diff against
the shared context and stays silent — order-independent regardless of
whether the frame flows downstream or upstream.

Also tightens the existing missing-handler path (introduced in #4301):

- Reworded the terminal tool result to a neutral "The function
  ``X`` is not currently available." (overridable via
  ``LLMService.MISSING_FUNCTION_CALL_MESSAGE_TEMPLATE``). Previously
  it read "Error: function 'X' is not registered."
- Logs at the call site now distinguish developer error (tool
  advertised but no handler registered → ``logger.error``) from
  hallucination (tool not advertised → ``logger.warning``).

Includes a manual validation harness
(``examples/features/features-add-tool-change-messages.py``) that
exercises the new ``add_tool_change_messages`` mitigation by flipping
tool availability on a turn counter so its effect can be observed
end-to-end with the flag on vs. off.
2026-05-05 13:02:43 -04:00
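The diff-then-announce idea behind ``add_tool_change_messages`` can be sketched as below. This is a simplified model, not the aggregator code: `tool_change_message` and the message wording are hypothetical. Only the developer role and the "empty diff stays silent" behavior come from the commit message.

```python
from typing import Dict, Iterable, Optional


def tool_change_message(
    previous: Iterable[str], current: Iterable[str]
) -> Optional[Dict[str, str]]:
    """Build a developer-role message describing a change in advertised tools.

    Returns None when the two sets are identical, so an aggregator that sees
    an already-updated shared context adds nothing.
    """
    prev, curr = set(previous), set(current)
    added = sorted(curr - prev)
    removed = sorted(prev - curr)
    if not added and not removed:
        return None
    parts = []
    if added:
        parts.append("Now available: " + ", ".join(added) + ".")
    if removed:
        parts.append("No longer available: " + ", ".join(removed) + ".")
    return {"role": "developer", "content": " ".join(parts)}
```

Because the second aggregator compares against a context the first one already updated, its diff is empty and it returns `None`, which is what makes the result independent of processing order.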
Cale Shapera
84eefba4df docs: add changelog fragment for tts-2 default flip 2026-05-05 09:20:16 -07:00
Cale Shapera
fe3af5d9f7 feat(inworld): default to inworld-tts-2
Flip the default Inworld TTS model from inworld-tts-1.5-max to
inworld-tts-2 across:
- InworldHttpTTSService (HTTP)
- InworldTTSService (WebSocket)
- InworldRealtimeLLMService (cascade Realtime)

inworld-tts-1.5-max and inworld-tts-1.5-mini remain valid options;
existing users can pin the prior model explicitly via the model
setting. Docstring examples updated to reference the new default.
2026-05-05 09:20:16 -07:00
Mark Backman
7729eecfe4 Fix changelog filename for 4416 2026-05-04 21:54:58 -04:00
Mark Backman
fa31a2fd63 Merge pull request #4416 from pipecat-ai/mb/pr-4333-aws-credentials-review
feat(aws): add shared credential resolver with boto3 chain fallback
2026-05-04 21:48:33 -04:00
Mark Backman
678d40e102 docs(changelog): add 4333 entries for AWS credential resolver expansion 2026-05-04 19:30:37 -04:00
Mark Backman
8becafee38 fix(aws): use shared credential resolver in Polly, Bedrock, AgentCore
Polly TTS, Bedrock LLM, and AgentCore previously did
`arg or os.getenv("AWS_...")` and handed the result straight to
aioboto3.  When only one of `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY`
was set, aioboto3 received a half-populated kwarg and errored instead of
falling through to the boto3 credential provider chain (instance
profiles, IRSA, ECS task roles, SSO, etc.).

Route credential resolution through the shared `resolve_credentials()`
helper introduced for AWS Transcribe so all four services follow the
same `explicit → env → boto3 chain` fallback.  Add an
`AWSCredentials.to_boto_kwargs()` method to bridge the dataclass field
names (`access_key`, `secret_key`) to the aioboto3 kwargs
(`aws_access_key_id`, `aws_secret_access_key`).

No public API changes.  Behaviour is identical for fully-explicit and
fully-env-var configurations; partial env vars now correctly trigger
the chain instead of erroring.
2026-05-04 19:23:53 -04:00
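The `explicit → env → boto3 chain` fallback described above can be sketched without any AWS dependency. `Credentials` and `resolve_pair` are hypothetical stand-ins for `AWSCredentials` and `resolve_credentials()`, whose real signatures are not shown in this log. Session tokens and regions are left out, and the sketch assumes only a complete key pair counts at each step.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class Credentials:
    """Hypothetical credential pair using the dataclass-style field names."""

    access_key: str
    secret_key: str

    def to_boto_kwargs(self) -> Dict[str, str]:
        # Bridge the short field names to the client keyword arguments.
        return {
            "aws_access_key_id": self.access_key,
            "aws_secret_access_key": self.secret_key,
        }


def resolve_pair(
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Credentials]:
    """Resolve explicit arguments first, then environment variables.

    Returns None when neither source has a complete pair, so the caller can
    pass no credential kwargs and let the provider chain take over.
    """
    env = os.environ if env is None else env
    if access_key and secret_key:
        return Credentials(access_key, secret_key)
    env_access = env.get("AWS_ACCESS_KEY_ID")
    env_secret = env.get("AWS_SECRET_ACCESS_KEY")
    if env_access and env_secret:
        return Credentials(env_access, env_secret)
    return None
```

A caller would then build its client kwargs as `creds.to_boto_kwargs() if creds else {}`, which avoids handing a half-populated pair to the client.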
Mark Backman
83190d38e9 Merge pull request #4414 from pipecat-ai/mb/fix-ttsspeakframe-assistant-turn-stopped 2026-05-04 18:12:33 -04:00
Mark Backman
7519c26ac5 Merge pull request #4417 from pipecat-ai/mb/resolve-runner-filepath 2026-05-04 18:09:34 -04:00
Mark Backman
b2b7e9ee6f Merge pull request #4415 from pipecat-ai/mb/fix-elevenlabs-leading-spaces-flash 2026-05-04 18:08:31 -04:00
Mark Backman
e864d5778a ci: install runner extra for the coverage job 2026-05-04 16:44:47 -04:00
Mark Backman
89f10dd9a1 test: drop webrtc-dependent test, remove webrtc extra from CI 2026-05-04 16:42:05 -04:00
Mark Backman
f67e3ef0b2 ci: install runner and webrtc extras for the test job 2026-05-04 16:29:58 -04:00