Commit Graph

460 Commits

Author SHA1 Message Date
Aleix Conchillo Flaqué
7dc763d512 Merge pull request #4272 from pipecat-ai/pk/llm-context-get-messages-elide-large-values
Add truncate_large_values to LLMContext.get_messages()
2026-04-13 15:04:41 -07:00
Paul Kompfner
1a02b5d61a Rename elide_large_values to truncate_large_values 2026-04-11 14:29:05 -04:00
Aleix Conchillo Flaqué
f91a113de7 tests: yield in wake phrase strategy setup to let tasks start
The strategy schedules background tasks during setup. Fast-running
tests could observe state before those tasks had a chance to run;
yielding once via asyncio.sleep(0) ensures they do.
2026-04-10 17:37:50 -07:00
Aleix Conchillo Flaqué
e553bb010f tests: migrate LLM tests to Settings-based constructor API
Replace the old `model=` / `params=InputParams(...)` style with the
new `settings=<Service>.Settings(...)` form across LLM service tests.
2026-04-10 17:37:49 -07:00
Paul Kompfner
812cdc6822 Add elide_large_values to LLMContext.get_messages()
Enable callers to get a compact version of context messages suitable
for serialization, logging, and debugging tools. For standard
messages, known binary data (base64 images, audio) is fully elided.
For LLM-specific messages, long string values are recursively
truncated. Adapter get_messages_for_logging() methods now use this.
2026-04-10 16:35:36 -04:00
Aleix Conchillo Flaqué
dcd21e7ff4 Rework audio idle detection with timestamp-based adaptive sleep
Replaces the per-frame asyncio.Event signaling with a monotonic
timestamp updated on each audio frame. The handler sleeps until the
next deadline (last_audio_time + timeout), recomputing on each wake-up
to account for audio arriving during sleep.

This avoids waking the handler on every audio frame (~50/s at 20ms
chunks), and guarantees detection latency is bounded by timeout rather
than 2 * timeout.

Also renames audio_starvation_timeout to audio_idle_timeout and
associated identifiers for consistency with existing pipecat naming
(user_idle_timeout, etc.).
2026-04-10 10:35:18 -07:00
Om Chauhan
cb2c1868b0 fix VAD stuck in SPEAKING state when audio stops mid-speech 2026-04-10 09:54:48 -07:00
kompfner
d07eebff20 Merge pull request #4248 from omChauhanDev/add-openai-custom-tools-support
Add custom_tools support for OpenAI adapters
2026-04-10 10:27:28 -04:00
Paul Kompfner
fc3307bc63 Use OpenAI SDK types for tool params in adapters and tests
These are TypedDicts (plain dicts at runtime), so no behavioral change
— just more descriptive type hints for readers. Use ToolParam instead
of FunctionToolParam for the Responses adapter to reflect that custom
non-function tools are supported. Use ChatCompletionToolParam instead
of Any for the completions adapter return type. Update tests to use
typed params in expected values.
2026-04-10 10:15:39 -04:00
Aleix Conchillo Flaqué
43ddbdf1ec Merge pull request #3797 from iamjr15/fix/idle-processor-event-race
Fix asyncio.Event race conditions in idle processors
2026-04-09 16:04:03 -07:00
iamjr15
565349d332 Fix asyncio.Event race conditions in idle processors
Move event.clear() from finally block to success path in
IdleFrameProcessor and UserIdleProcessor._idle_task_handler().
The finally block unconditionally cleared signals set during
async timeout callbacks, causing false-positive idle detection.

Closes #3402
2026-04-09 13:41:01 -07:00
Cale Shapera
ec574edd53 Add Inworld Realtime Service (#4140)
* Add Inworld Realtime LLM service

Adds a WebSocket-based realtime service for Inworld's cascade
STT/LLM/TTS API with semantic VAD, function calling, and streaming
transcription support.

New files:
- src/pipecat/services/inworld/realtime/ (service, events)
- src/pipecat/adapters/services/inworld_realtime_adapter.py
- examples/foundational/19zb-inworld-realtime.py

Also includes:
- websockets dependency for inworld extra in pyproject.toml
- Adapter and settings tests matching OpenAI/Grok realtime patterns
- Fix for double-response when server-side VAD is enabled

* Prefer init-provided system instruction in Inworld Realtime

Adopt _resolve_system_instruction() from BaseLLMAdapter, matching the
pattern applied to OpenAI Realtime, Grok Realtime, Gemini Live, and
Nova Sonic in the pk/realtime-services-init-v-context-system-instructions-cleanup
branch.

* Update changelog entry with PR number

* Fix changelog format to use bullet point

* Polish PR: default model, example cleanup, changelog update

- Change default model from gpt-4.1-nano to gpt-4.1-mini
- Add function calling demo to example
- Remove demo-testing artifact from system instruction
- Mention Router support in changelog

* Address PR review feedback for Inworld Realtime

- Move example to examples/realtime/realtime-inworld.py
- Change initial context role from "user" to "developer"
- Remove explicit sample rates from example; sync them in
  _ensure_audio_config so Inworld gets the transport's actual rates
- Add audio race condition guard in _handle_evt_audio_delta (matches
  OpenAI realtime pattern)
- Convert remaining "system"/"developer" messages to "user" in adapter
- Add clarifying comment for local-VAD vs server-VAD metrics paths

* Simplify example, add provider tracking, remove local VAD path

- Remove function calling from example, switch model to xai/grok-4-1-fast-non-reasoning
- Add pipecat-realtime session key prefix and provider_data metadata
  for Inworld traffic attribution
- Remove local VAD code path (Inworld only supports server-side VAD)
- Use typed InputAudioBufferAppendEvent for audio sends

* Default TTS model to inworld-tts-1.5-max

* Remove dead shimmed tools code, set STT/VAD defaults

- Remove non-functional AdapterType.SHIM custom tools code from adapter
- Default STT model to assemblyai/u3-rt-pro
- Default VAD eagerness to low
2026-04-09 13:04:17 -04:00
Om Chauhan
1443dfb070 added changelog 2026-04-08 08:48:26 +05:30
Om Chauhan
4bef85e363 added custom_tools support for OpenAI adapters 2026-04-08 08:40:03 +05:30
Filipi da Silva Fuchter
27a8a973b1 Merge pull request #4201 from pipecat-ai/mb/handle-recurring-disconnects
Fix WebsocketService infinite reconnection loop
2026-04-07 11:02:24 -03:00
Filipi da Silva Fuchter
6eccd16543 Merge pull request #4217 from pipecat-ai/filipi/async_tools
Supporting async function calls.
2026-04-07 09:35:03 -03:00
Paul Kompfner
70469e3c0c Assert no LLMContextFrame when run_llm is not set in message frame tests 2026-04-03 11:34:58 -04:00
Paul Kompfner
6111df947e Test LLMAssistantAggregator handling of upstream message frames
Add tests for LLMRunFrame, LLMMessagesAppendFrame, LLMMessagesUpdateFrame,
and LLMMessagesTransformFrame sent upstream to LLMAssistantAggregator,
mirroring the existing LLMUserAggregator downstream tests. Add
frames_to_send_direction param to run_test helper to support this.
2026-04-03 11:34:58 -04:00
Paul Kompfner
4eebfd65d9 Add a LLMMessagesTransformFrame to facilitate programmatically editing context in a frame-based way.
The previous approach required the caller to directly grab a reference to the context object, grab a "snapshot" of its messages *at that point in time*, transform the messages, and then push an `LLMMessagesUpdateFrame` with the transformed messages. This approach can lead to problems: what if there had already been a change to the context queued in the pipeline? The transformed messages would simply overwrite it without consideration.
2026-04-03 11:34:50 -04:00
Mark Backman
fbb49ffc8d Merge pull request #4233 from pipecat-ai/mb/remove-unused-imports-2026-04-02
Remove unused imports across codebase
2026-04-03 07:26:13 -04:00
Mark Backman
8adb38f87c Remove unused imports across codebase 2026-04-02 22:21:16 -04:00
Mark Backman
41e46ee69e Remove deprecated vad_events and should_interrupt from DeepgramSTTService
Deepgram's built-in VAD events were deprecated in 0.0.99 in favor of
Silero VAD. This removes vad_events from settings and LiveOptions,
the should_interrupt parameter, the vad_enabled property,
_on_speech_started/_on_utterance_end handlers, and simplifies
_on_message and process_frame accordingly.
2026-04-02 22:05:49 -04:00
Mark Backman
793ed8f9e3 Remove deprecated UserBotLatencyLogObserver and UserIdleProcessor
UserBotLatencyLogObserver (deprecated 0.0.102) is replaced by
UserBotLatencyObserver. UserIdleProcessor (deprecated 0.0.100) is
replaced by LLMUserAggregator with user_idle_timeout.
2026-04-02 21:54:36 -04:00
filipi87
929a0e33f4 Fixing the automated tests. 2026-04-02 16:58:28 -03:00
Aleix Conchillo Flaqué
976c644f90 Fix tests to expect SpeechControlParamsFrame from default turn strategy 2026-04-02 12:42:06 -07:00
Mark Backman
d503383c23 Remove deprecated interruption_strategies plumbing
The interruption_strategies mechanism was deprecated in v0.0.99 in favor
of LLMUserAggregator's user_turn_strategies. All evaluation logic was
already removed — this removes the remaining field definitions, property,
StartFrame propagation, conditional check in base_input.py, strategy
files, and test.
2026-04-02 11:19:17 -04:00
Mark Backman
2a118084bd Remove deprecated transcript_processor module 2026-04-02 10:57:05 -04:00
Mark Backman
87e8ed109a Remove deprecated STTMuteFilter, STTMuteConfig, and STTMuteStrategy 2026-04-02 10:52:41 -04:00
Mark Backman
41e3afbc2f Remove deprecated add_pattern_pair method from PatternPairAggregator 2026-04-02 10:28:01 -04:00
kompfner
a3c7f6c2af Merge pull request #4215 from pipecat-ai/pk/remove-openaillmcontext
Remove deprecated `OpenAILLMContext` as well as everything (code path…
2026-04-01 14:03:35 -04:00
Paul Kompfner
ebab75765d Fix stream cancellation tests to mock get_chat_completions
The tests were mocking the removed _stream_chat_completions_*_context
methods. Update them to mock get_chat_completions instead.
2026-03-31 18:54:23 -04:00
Paul Kompfner
394599d031 Remove deprecated OpenAILLMContext as well as everything (code paths or whole types) dependent on it (all of which were also deprecated) 2026-03-31 18:15:25 -04:00
mattie ruth backman
0f47076703 More RTVI version parsing improvements 2026-03-31 16:05:53 -04:00
mattie ruth backman
3e255f3d21 improve version format check 2026-03-31 16:05:53 -04:00
mattie ruth backman
565b9b961d add tests for rtvi versioning 2026-03-31 16:05:53 -04:00
Mark Backman
7501effad5 Remove deprecated service module shims and old implementations
Delete deprecated import shims that only re-export from new locations:
- services/ai_services.py
- services/gemini_multimodal_live/
- services/aws_nova_sonic/
- services/openai_realtime/
- services/deepgram/{stt,tts}_sagemaker.py
- services/google/{llm_openai,llm_vertex,google}.py
- services/google/gemini_live/llm_vertex.py
- services/riva/
- services/nim/

Remove deprecated implementations replaced by newer services:
- services/openai_realtime_beta/ (use openai.realtime)
- services/google/openai/ (use google.llm)

Also removes associated examples and tests for deleted services.
2026-03-31 15:34:14 -04:00
Paul Kompfner
712e42533d Introduce WebsocketLLMService and refactor OpenAIResponsesLLMService to use it
Add WebsocketLLMService as a base class for WebSocket-based LLM services,
parallel to WebsocketTTSService/WebsocketSTTService but codifying a
transactional request-response model rather than a continuous background
receive loop.

WebsocketLLMService provides:
- Connection lifecycle (start/stop/cancel → connect/disconnect)
- _ws_send/_ws_recv with transparent ConnectionClosed handling
  (auto-reconnect via exponential backoff → WebsocketReconnectedError)
- _ensure_connected with retry via _try_reconnect

OpenAIResponsesLLMService now inherits from WebsocketLLMService, removing
duplicated connection management code (_connect, _disconnect, _reconnect,
_ensure_connected, _ws_send, start, stop, cancel) and simplifying
_process_context from a loop with attempt tracking to a flat try/except
with a single retry.
2026-03-30 22:26:31 -04:00
Mark Backman
f6a3678f93 Improve tests 2026-03-30 12:46:30 -04:00
Mark Backman
86a16d53bc Detect quick connection failures in WebsocketService to prevent infinite reconnection loops
When a WebSocket server accepts the handshake but immediately closes the
connection (e.g. invalid API key returning close code 1008), the existing
exponential backoff does not help because the handshake keeps succeeding.
This tracks how long each connection survives and emits a non-fatal
ErrorFrame after 3 consecutive sub-5s failures, allowing ServiceSwitcher
failover instead of killing the pipeline.

Fixes #3711
2026-03-30 12:23:11 -04:00
Paul Kompfner
26f85687d6 Handle response cancellation by draining before next inference
Instead of trying to filter stale events inline (unreliable — the API
doesn't provide a way to correlate events to a specific response),
drain remaining events from a cancelled response before starting the
next one. On cancellation, send response.cancel and set a drain flag.
At the start of the next _process_context, read and discard events
until a terminal event arrives, ensuring a clean connection. Falls
back to reconnecting if draining times out.
2026-03-30 09:59:03 -04:00
Paul Kompfner
9defff2a34 Skip server-known output items in previous_response_id optimization
When using previous_response_id, the server already knows its own
output from the previous response. Store the raw response output and,
on the next call, compare it against the items following the matched
input prefix — checking role and text content for messages, and call_id
for function calls. If the items match, skip them and send only truly
new input (user messages, tool results). Falls back to full context if
either the prefix or the output comparison fails.
2026-03-30 09:59:03 -04:00
Paul Kompfner
f2a8a9e753 Add WebSocket-based OpenAI Responses LLM service with previous_response_id optimization
Introduce a WebSocket variant of the OpenAI Responses API service that
maintains a persistent connection to wss://api.openai.com/v1/responses
for lower-latency inference. The WebSocket variant automatically uses
previous_response_id to send only incremental context when possible,
falling back to full context on reconnection or cache miss.

The WebSocket variant becomes the new default OpenAIResponsesLLMService,
and the HTTP variant is renamed to OpenAIResponsesHttpLLMService. Both
share a private base class with common settings, parameter building,
and run_inference (always HTTP) logic.
2026-03-30 09:58:56 -04:00
Mark Backman
8c9e189394 Fix langchain imports for langchain 1.x compatibility
ChatPromptTemplate moved from langchain.prompts to langchain_core.prompts
in langchain 1.x.
2026-03-29 10:27:48 -04:00
OmercohenAviv
5fe48da2fb Merge branch 'main' into fix/heartbeat-monitor-configurable 2026-03-28 11:57:23 +03:00
OmercohenAviv
dccd98ec8a test 2026-03-28 11:53:51 +03:00
Mark Backman
47e53890e3 Fix FastAPI WebSocket disconnect race condition causing pipeline hang
When the remote side disconnects while send() is in flight, send() was
setting _closing=True. This prevented the receive loop from firing
on_client_disconnected, causing the pipeline to hang waiting for a
disconnect signal that never came.

The fix removes _closing from send() (that flag means we initiated the
close) and instead checks Starlette application_state in _can_send()
to suppress subsequent sends after a failure.

Fixes #3912
2026-03-28 00:01:25 -04:00
Mark Backman
5c51981207 Merge pull request #4149 from pipecat-ai/mb/fix-service-switcher-passthrough-errors
Fix ServiceSwitcher reacting to pass-through ErrorFrames
2026-03-26 16:34:45 -04:00
Mark Backman
c331c75d66 Add tests for send_media() exception handling in DeepgramSTTService 2026-03-26 09:20:58 -04:00
Mark Backman
7fef3b01eb Merge pull request #4142 from pipecat-ai/mb/grok-move-to-xai-module
Consolidate Grok services into xai module
2026-03-25 23:32:18 -04:00
Mark Backman
fdbdbc8be3 Fix ServiceSwitcher reacting to pass-through ErrorFrames from other pipeline stages
ErrorFrames propagating upstream from downstream processors (e.g. TTS) would
enter the ServiceSwitcher via process_frame, traverse the active service sub-pipeline,
and reach push_frame where they incorrectly triggered failover. Now only errors whose
processor is one of the managed services trigger handle_error. Also fix the log in
handle_error to attribute errors to the actual source processor rather than the
current active_service.

Closes #4139
2026-03-25 22:53:04 -04:00