Commit Graph

8582 Commits

Author SHA1 Message Date
Kwindla Hultman Kramer
3cd7d882fb Fix bundled Gemini Live transcription ordering 2026-03-25 18:56:00 -07:00
Chad Bailey
2d78533d77 Add changelog for Gemini Live server_content fix 2026-03-25 23:42:42 +00:00
Chad Bailey
c1dd44f947 Fix Gemini Live message handling to process all server_content fields
Gemini 3.x can bundle multiple fields (e.g. model_turn and
output_transcription) on the same server_content message. The previous
elif chain would only process the first matching field and silently
drop the rest. Switch to independent if checks so every field is
handled.
2026-03-25 23:42:07 +00:00
Mark Backman
4ee4002d5d Merge pull request #4137 from pipecat-ai/mb/language-string-log-level-debug
Downgrade unrecognized language string log from warning to debug
2026-03-25 12:26:46 -04:00
Mark Backman
ff5d055b3c Merge pull request #4031 from niczy/xai-tts-service
Add xAI TTS service
2026-03-25 10:57:08 -04:00
Mark Backman
adc003d6c7 Code review cleanup 2026-03-25 10:53:07 -04:00
Nicholas Zhao
bbd14de9c5 Address PR review: rename to XAIHttpTTSService, add language map, clean up API
- Rename XAITTSService → XAIHttpTTSService and XAITTSSettings → XAIHttpTTSSettings
- Add language_to_xai_language() with explicit LANGUAGE_MAP using resolve_language()
- Remove deprecated InputParams, params, voice, language init params
- Remove XAI_DEFAULT_SAMPLE_RATE and XAI_PCM_CODEC constants; add encoding param
- Set sample_rate=None default (picked up from PipelineParams or user)
- Use Language.EN enum instead of string "en" for default language
- Add changelog/4031.added.md
- Add 07e-interruptible-xai.py foundational example
- Update 14g-function-calling-grok.py to use XAIHttpTTSService
- Register 07e in run-release-evals.py
2026-03-25 10:46:54 -04:00
Nicholas Zhao
02b97035f8 Add xAI TTS service 2026-03-25 10:45:15 -04:00
Mark Backman
f470ff193e Update language tests to expect debug instead of warning 2026-03-25 10:26:10 -04:00
Mark Backman
7bc8b89a54 Add changelog for #4137 2026-03-25 10:21:44 -04:00
Mark Backman
a8eff6fbbf Downgrade unrecognized language string log from warning to debug
Service-specific language strings like Deepgram's "multi" are valid
pass-through values, not issues worth warning about.
2026-03-25 10:20:36 -04:00
kompfner
86e086c6b5 Merge pull request #4130 from pipecat-ai/pk/realtime-services-init-v-context-system-instructions-cleanup
Prefer init-provided system instructions in realtime services
2026-03-25 09:13:52 -04:00
Paul Kompfner
4bdfe1cf31 Add changelog for realtime system instruction preference change 2026-03-24 17:34:50 -04:00
Paul Kompfner
bb33045389 Add system instruction conflict resolution tests for realtime adapters
Test that OpenAI Realtime, Grok Realtime, and Nova Sonic adapters
prefer init-provided system_instruction over context-provided, warn
on conflicts, and don't warn for developer messages.
2026-03-24 17:30:35 -04:00
Paul Kompfner
ac2b1ecd47 Prefer init-provided system instruction in Grok Realtime
Add system_instruction parameter to the Grok Realtime adapter's
get_llm_invocation_params() and call _resolve_system_instruction() to
prefer init-provided over context-provided system instructions and
warn on conflicts. Previously context-provided took precedence.

Update the Grok Realtime example to use settings.system_instruction
instead of session_properties.instructions.
2026-03-24 17:29:19 -04:00
Paul Kompfner
e7dd84b552 Prefer init-provided system instruction in OpenAI Realtime
Add system_instruction parameter to the OpenAI Realtime adapter's
get_llm_invocation_params() and call _resolve_system_instruction() to
prefer init-provided over context-provided system instructions and
warn on conflicts. Previously context-provided took precedence.
2026-03-24 17:21:53 -04:00
Paul Kompfner
39329aaddb Prefer init-provided system instruction in Nova Sonic
Add system_instruction parameter to the Nova Sonic adapter's
get_llm_invocation_params() and call _resolve_system_instruction() to
prefer init-provided over context-provided system instructions and
warn on conflicts. Previously context-provided took precedence.

Remove the service-side fallback logic, as the adapter now handles
resolution.
2026-03-24 17:18:44 -04:00
Paul Kompfner
56a56a4174 Prefer init-provided system instruction in Gemini Live
Pass self._system_instruction_from_init to the adapter's
get_llm_invocation_params(), which calls _resolve_system_instruction()
to prefer init-provided over context-provided system instructions and
warn on conflicts. Previously context-provided took precedence.

Also fix the reconnect check to only reconnect when the resolved
system instruction actually differs from what the initial connection
used, avoiding unnecessary reconnects.
2026-03-24 17:06:56 -04:00
kompfner
b80328e038 Merge pull request #4125 from pipecat-ai/pk/gemini-live-endframe-deferral-issue
Gemini Live: fix EndFrame-deferral hang
2026-03-24 17:02:46 -04:00
kompfner
3a80be760b Merge pull request #4089 from pipecat-ai/pk/system-and-developer-message-handling-update
Centralize system message handling in adapters; add developer message support
2026-03-24 16:24:11 -04:00
Paul Kompfner
e0c49927cf Remove hard-coded model overrides from Together and Groq examples
Prefer service defaults — the hard-coded models we were using are no
longer available on these providers.
2026-03-24 16:05:15 -04:00
Paul Kompfner
45926a7135 Update Together.ai default model to openai/gpt-oss-20b
The previous default (meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo) is
no longer available as a serverless Together.ai model and now requires a
custom deployment. The new default is openai/gpt-oss-20b, one of
Together's recommended models for small & fast use-cases.
2026-03-24 16:05:15 -04:00
Paul Kompfner
8c678c1c98 Set supports_developer_role = False for more OpenAI-compatible services
DeepSeek, Mistral, OLLama, Qwen, SambaNova, and Together don't support
the "developer" message role.
2026-03-24 16:05:15 -04:00
Paul Kompfner
4c121332cf Convert developer messages to user for Cerebras (and lay groundwork for other incompatible services)
OpenAI-compatible services that don't support the "developer" message
role can now set supports_developer_role = False on the service class.
BaseOpenAILLMService passes this as convert_developer_to_user to the
adapter, which converts developer messages to user messages before
sending them to the API. Applied to Cerebras and Perplexity.

Also removes the now-redundant developer→user conversion step from
PerplexityLLMAdapter (handled by the parent adapter via the flag).
2026-03-24 16:05:15 -04:00
Paul Kompfner
74686f9190 Add changelog for Gemini Live system_instruction fix 2026-03-24 16:05:15 -04:00
Paul Kompfner
19bcc8620c Fix Gemini Live not honoring settings.system_instruction
_system_instruction_from_init was being set from the deprecated
`system_instruction` constructor parameter instead of
`self._settings.system_instruction`, so system instructions provided
via settings were silently ignored.
2026-03-24 16:05:15 -04:00
Paul Kompfner
0530722c58 Convert developer messages to user in Perplexity adapter
Perplexity doesn't support the "developer" role. Developer messages are
now converted to "user" before other transformations are applied.
2026-03-24 16:05:15 -04:00
Paul Kompfner
0d1b834770 Add developer message support to realtime adapters
OpenAI Realtime, Grok Realtime, and AWS Nova Sonic adapters now convert
"developer" role messages to "user" (consistent with all other non-OpenAI
adapters). Previously these messages were silently dropped. Adds starter
unit tests for all three realtime adapters.
2026-03-24 16:05:15 -04:00
Paul Kompfner
7a0f7b58d1 Remove bit of unintentionally-left-in debugging logic 2026-03-24 16:05:15 -04:00
Paul Kompfner
5806a3f0fa Use "developer" role for remaining developer-intent messages in examples 2026-03-24 16:05:04 -04:00
Paul Kompfner
27fabfc1b3 Improve warning message wording and formatting 2026-03-24 16:02:42 -04:00
Paul Kompfner
d779a5b4ea Use "developer" role for programmatic conversation-kickoff messages
These messages are developer instructions to the assistant (e.g. "Please
introduce yourself to the user"), not simulated user input. The
"developer" role is semantically correct for this purpose.
2026-03-24 16:02:42 -04:00
Paul Kompfner
2bb36b5b66 Update changelog for developer message simplification 2026-03-24 16:02:42 -04:00
Paul Kompfner
e0bc9c73c6 Add Anthropic interruptible example (07e) and register in release evals 2026-03-24 16:02:42 -04:00
Paul Kompfner
2135557689 Simplify: don't promote developer messages to system instruction
Developer messages are now always converted to "user" in non-OpenAI
adapters, never promoted to the system instruction. This removes an
inconsistency where adding an unrelated message to context would change
whether a developer message got promoted.

Simplifications:
- Rename _extract_initial_system_or_developer → _extract_initial_system
- Return Optional[str] instead of Tuple (role is always "system")
- Drop initial_context_message_role from _resolve_system_instruction
- Drop system_role fields from all ConvertedMessages dataclasses
2026-03-24 16:02:42 -04:00
Paul Kompfner
a0393b9af6 Fix: warn on system_instruction conflict even with single system message
When the only message in context was a system message,
_extract_initial_system_or_developer would convert it to "user" (to
prevent empty history) without warning about the conflict with
system_instruction. Now warns inline before converting, with a message
explaining both the conflict and the user-role conversion.
2026-03-24 16:02:42 -04:00
Paul Kompfner
64ba013b68 Move OpenAI Responses adapter tests into test_get_llm_invocation_params.py
Consolidates all adapter get_llm_invocation_params tests in one file.
Adds new tests for developer message handling in the Responses adapter.
2026-03-24 16:02:42 -04:00
Paul Kompfner
7377d88cf5 Move system_instruction tests into test_get_llm_invocation_params.py 2026-03-24 16:02:42 -04:00
Paul Kompfner
3bbec0a2c8 Broaden docstring: all non-OpenAI providers need non-empty messages 2026-03-24 16:02:42 -04:00
Paul Kompfner
e29a63e1ae Improve _extract_initial_system_or_developer docstring clarity 2026-03-24 16:02:42 -04:00
Paul Kompfner
45178972d7 Fix stale docstring in PerplexityLLMAdapter 2026-03-24 16:02:42 -04:00
Paul Kompfner
bb7199d143 Add changelog entries for #4089 2026-03-24 16:02:42 -04:00
Paul Kompfner
d4dea30407 Centralize system message handling in adapters; add developer message support
Two goals:

1. Centralize system_instruction vs context system message resolution into
   the LLM adapters. This eliminates duplication between in-pipeline and
   out-of-band (run_inference) code paths across ~16 locations in service
   llm.py files.

2. Add support for "developer" role messages in conversation context, which
   is facilitated by the above centralization.

Shared helpers on BaseLLMAdapter:
- _extract_initial_system_or_developer: extracts/converts messages[0]
  based on role and whether system_instruction is provided
- _resolve_system_instruction: warns on conflicts between system_instruction
  and context system messages, returns the effective instruction

Developer message handling (new):
- Non-OpenAI adapters: an initial "developer" message is promoted to the
  system instruction when no system_instruction is provided; otherwise it
  is converted to "user". Subsequent "developer" messages are always
  converted to "user". No conflict warning is emitted for developer
  messages (unlike "system" messages).
- OpenAI adapter: "developer" messages pass through in conversation
  history without triggering conflict warnings.
- OpenAI Responses adapter: "developer" messages are kept as "developer"
  role (same as "system", which is also converted to "developer" for the
  Responses API).

Other behavior changes:
- Gemini: "initial" system message detection now checks messages[0] only
  (previously searched anywhere in the list)
- Bedrock: a lone system message is now converted to "user" instead of
  being extracted to an empty message list (matches existing Anthropic
  behavior)
2026-03-24 16:02:42 -04:00
Mark Backman
b49bf1c83f Merge pull request #4127 from pipecat-ai/mb/tts-text-frame-ordering
Fix LLMFullResponseEndFrame racing ahead of final TTSTextFrame
2026-03-24 15:39:06 -04:00
Mark Backman
1b0f7ecb0e Merge pull request #4126 from pipecat-ai/mb/fix-tts-flush-phantom-contexts
Fix TTS flush creating phantom contexts on ElevenLabs
2026-03-24 15:33:58 -04:00
Mark Backman
8e57dd67a2 Add changelog for #4127 2026-03-24 15:10:48 -04:00
Mark Backman
5d71de8aad Fix LLMFullResponseEndFrame racing ahead of final TTSTextFrame
Route LLMFullResponseEndFrame through the serialization queue instead
of pushing it directly downstream when push_text_frames is enabled.
This ensures the frame is emitted only after the audio context is
fully drained, preserving correct ordering relative to TTSTextFrames.

Previously, the final sentence TTSTextFrame would arrive at the
LLMAssistantAggregator after LLMFullResponseEndFrame, causing it to
be dropped from the conversation context (especially with RTVI text
input where no subsequent interruption would flush the orphaned text).
2026-03-24 15:09:42 -04:00
Paul Kompfner
dc56cb2ccc Gemini Live: reset _bot_is_responding when releasing deferred EndFrame
Without this, the released EndFrame re-enters process_frame, sees
_bot_is_responding is still True, defers again, and loops indefinitely.
2026-03-24 15:01:07 -04:00
Paul Kompfner
063955b7eb Gemini Live: clean up EndFrame deferral state on disconnect
Cancel the deferral timeout task and clear the pending EndFrame during
disconnect, which could otherwise be left dangling after a
CancelFrame-triggered shutdown.
2026-03-24 14:30:14 -04:00
Mark Backman
e05bd54743 Add changelog for #4126 2026-03-24 13:43:07 -04:00