159 Commits

Author SHA1 Message Date
Mark Backman
28f9203401 Code review fixes 2026-05-21 11:45:17 -04:00
Aleix Conchillo Flaqué
f5158d51e7 Add filter-incomplete + function-calling turn-management example
A copy of ``turn-management-filter-incomplete-turns.py`` extended with
a ``get_weather(location)`` direct function. Exercises the path where
the LLM responds to a complete user turn by calling a tool — used to
reproduce (and now verify the fix for) the ``_user_speaking`` gating
bug between filter-incomplete and function calls.
2026-05-15 14:54:51 -07:00
Paul Kompfner
1a4a6f4edf refactor(gemini-live): bring tool-result handling in line with the canonical realtime pattern
Lays groundwork for cancel_on_interruption=False support on Gemini Live by
restructuring _process_completed_function_calls to match the shape used by
AWSNovaSonicLLMService and OpenAIRealtimeLLMService in #4441: a single-pass
forward iteration over raw context messages that detects async-tool
messages via async_tool_messages.parse_message and routes them — started
skipped silently, intermediate logged-as-error and surfaced via push_error,
final delivered via the formal FunctionResponse channel.

Replaces the prior two-pass structure that went through the adapter for
sync results — the service now uses a lightweight self._tool_call_id_to_name
map (populated when the model issues tool calls) for the name lookup the
adapter used to provide. Extracts a new GeminiLLMAdapter.to_function_response_dict
static method for the dict-coercion logic that wraps non-dict tool returns
as {value: <result>} for Gemini's FunctionResponse.response field; the
adapter's existing inline copy in _from_standard_message uses it too.

Example consolidation:

- Folds realtime-gemini-live-function-calling.py into the base
  realtime-gemini-live.py example so the base exercises function calling
  out of the box (matching realtime-openai.py and realtime-aws-nova-sonic.py).
- Renames realtime-gemini-live-vertex-function-calling.py to
  realtime-gemini-live-vertex.py, mirroring the consolidation.
- Adds realtime-gemini-live-async-tool.py.
- Updates scripts/evals/run-release-evals.py for the renames.

This commit alone doesn't make cancel_on_interruption=False fully work on
Gemini Live — additional investigation is pending. This is foundational
work to be built on.
2026-05-08 16:42:54 -04:00
Aleix Conchillo Flaqué
ea3585146c chore(scripts): add release-changelog.py
Adds a script to unfill (single-line) entry paragraphs in CHANGELOG.md
while keeping `(PR [...])` on its own continuation line.
2026-04-27 15:07:53 -07:00
Mark Backman
10e58d6e42 Fix type errors in scripts and add to pyright checked set 2026-04-21 16:17:49 -04:00
Mark Backman
84891de04d Add voice/xai-http.py to release evals 2026-04-21 15:49:59 -04:00
filipi87
0340e25e9f Fixing typecheck for service switcher. 2026-04-17 12:44:57 -03:00
Aleix Conchillo Flaqué
b3bb6fdaa5 Modernize Python typing across the codebase
Automated via ruff UP006, UP007, UP035, UP045 rules (target: py311):

- Replace `typing.List`, `Dict`, `Tuple`, `Set`, `FrozenSet`, `Type`
  with their built-in equivalents (`list`, `dict`, `tuple`, etc.)
- Replace `typing.Optional[X]` with `X | None`
- Replace `typing.Union[X, Y]` with `X | Y`
- Move `Mapping`, `Sequence`, `Callable`, `Awaitable`,
  `MutableMapping`, `MutableSequence`, `Iterator`, `AsyncIterator`,
  `AsyncGenerator` imports from `typing` to `collections.abc`
- Remove now-unused `typing` imports
- Add `from __future__ import annotations` to 5 files that use
  forward-reference strings in `X | "Y"` annotations
2026-04-16 09:28:23 -07:00
Mark Backman
9ffcccdd84 Merge pull request #4253 from pipecat-ai/mb/mistral-stt
Add Mistral Voxtral Realtime STT service
2026-04-15 09:00:27 -04:00
Aleix Conchillo Flaqué
153814ecc2 scripts/evals: create recording subdirectories when saving audio
Example files can live under subdirectories (e.g. foundational/01.py),
so the recording path needs its parent directory created before the
audio file is written.
2026-04-10 13:19:20 -07:00
Mark Backman
215b2dc7f3 Add voice-mistral to evals 2026-04-07 15:37:07 -04:00
kompfner
a3c7f6c2af Merge pull request #4215 from pipecat-ai/pk/remove-openaillmcontext
Remove deprecated `OpenAILLMContext` as well as everything (code path…
2026-04-01 14:03:35 -04:00
Mark Backman
3ca656cae5 Update simli name to match others 2026-03-31 22:54:21 -04:00
Mark Backman
6a84d02156 Update evals
- Removed evals for removed services
- Added eval for function-calling-deepseek.py
2026-03-31 22:13:52 -04:00
Mark Backman
080da8b94c Update eval script paths to match renamed example files 2026-03-31 22:09:42 -04:00
Paul Kompfner
394599d031 Remove deprecated OpenAILLMContext as well as everything (code paths or whole types) dependent on it (all of which were also deprecated) 2026-03-31 18:15:25 -04:00
Mark Backman
47b41a0ff7 Rename services/ to voice/ and function-calling/, flatten to top level
Replace the nested services/speech/ and services/function-calling/ with
top-level voice/ and function-calling/ directories. Update eval script
paths and README to match.
2026-03-31 15:20:03 -04:00
Mark Backman
f14638a1fd Revert "Flatten services/ nesting: promote speech and function-calling to top level"
This reverts commit e1939ecd44.
2026-03-31 14:59:23 -04:00
Mark Backman
e1939ecd44 Flatten services/ nesting: promote speech and function-calling to top level
Move services/speech/* directly into services/ and services/function-calling/*
into top-level function-calling/. Update eval script paths and README.
2026-03-31 14:55:22 -04:00
Mark Backman
e719cbbe6d Reorganize examples into topic-based subfolders
Move 304 examples from a flat numbered directory into 14 descriptive
subfolders: getting-started, services (speech + function-calling),
transcription, vision, realtime, persistent-context,
context-summarization, update-settings (stt/tts/llm), turn-management,
thinking-and-mcp, transports, video-avatar, video-processing, and
features.

Strip numbered prefixes from filenames (e.g. 07c-interruptible-deepgram.py
becomes services/speech/deepgram.py) since the folder context makes them
redundant. Keep numbered prefixes only in getting-started/ where ordering
matters.

Update eval script paths and README to match the new structure.
2026-03-31 13:12:24 -04:00
Mark Backman
f2ce7ececc Move foundational examples to examples/ 2026-03-31 13:12:24 -04:00
Paul Kompfner
b5683556d4 Remove duplicate entries in run-release-evals.py, which appeared after a rebase 2026-03-30 10:03:43 -04:00
Paul Kompfner
f2a8a9e753 Add WebSocket-based OpenAI Responses LLM service with previous_response_id optimization
Introduce a WebSocket variant of the OpenAI Responses API service that
maintains a persistent connection to wss://api.openai.com/v1/responses
for lower-latency inference. The WebSocket variant automatically uses
previous_response_id to send only incremental context when possible,
falling back to full context on reconnection or cache miss.

The WebSocket variant becomes the new default OpenAIResponsesLLMService,
and the HTTP variant is renamed to OpenAIResponsesHttpLLMService. Both
share a private base class with common settings, parameter building,
and run_inference (always HTTP) logic.
2026-03-30 09:58:56 -04:00
Mark Backman
2177e28ee1 Remove OpenPipe integration
OpenPipe was acquired by CoreWeave in September 2025. The Python package
hasn't been updated since June 2025 and the repo since 2024. The openpipe
package caps openai<=1.97.1, creating dependency conflicts with other
extras. Remove the dead integration to clean up the codebase.
2026-03-29 10:12:35 -04:00
Mark Backman
63254fe337 Add NebiusLLMService with developer role and tool support fixes
- Add Nebius LLM service wrapping OpenAI-compatible Token Factory API
- Set supports_developer_role = False (Nebius rejects developer role)
- Default to openai/gpt-oss-120b model (supports function calling)
- Add Nebius function-calling example and env.example entry
- Fix Sarvam developer role support
- Update examples to use developer role for intro messages
2026-03-29 08:50:11 -04:00
Mark Backman
d8b0ed18fd Fix example numbering, add LemonSlice to evals 2026-03-27 10:11:37 -04:00
Mark Backman
21a729ae5d Merge pull request #4146 from pipecat-ai/mb/gemini-live-local-vad 2026-03-26 17:48:21 -04:00
Mark Backman
fe0633ecd1 Add 14s to release evals 2026-03-26 12:27:27 -04:00
Mark Backman
503e5e9106 Fix Gemini Live local VAD by sending correct activity events to server
When Gemini Live was configured with local VAD (server-side VAD disabled),
the service was listening for the wrong frame types and not sending
ActivityStart/ActivityEnd events to the server. Now it listens for
VADUserStartedSpeakingFrame/VADUserStoppedSpeakingFrame and sends the
appropriate activity signals when local VAD is in use.

Also removes the unnecessary local SileroVADAnalyzer from server-side VAD
examples and adds a new 26a example demonstrating local VAD configuration.
2026-03-25 18:00:13 -04:00
Mark Backman
adc003d6c7 Code review cleanup 2026-03-25 10:53:07 -04:00
Paul Kompfner
e0bc9c73c6 Add Anthropic interruptible example (07e) and register in release evals 2026-03-24 16:02:42 -04:00
Mark Backman
6eb988b729 Merge pull request #4092 from harshitajain165/harshita/smallest-tts-only
Add Smallest AI TTS service integration
2026-03-24 11:54:34 -04:00
Mark Backman
51d28b4a9f Code review fixes 2026-03-24 11:21:04 -04:00
kompfner
cf083b8411 Merge pull request #4078 from pipecat-ai/cb/gemini-updates
Updates for Gemini Live
2026-03-24 11:18:00 -04:00
Mark Backman
aa0b49d69f Code review fixes 2026-03-24 09:22:08 -04:00
dhruvladia-sarvam
349b8645f3 Merge branch 'main' into feat/sarvam-llm-integration 2026-03-24 16:34:12 +05:30
dhruvladia-sarvam
696196e30c alignment with pr 4081 2026-03-24 16:29:58 +05:30
Mark Backman
d314e2831a Simplify 26 name, update evals 2026-03-23 15:46:13 -04:00
Paul Kompfner
b1a8588209 feat: add 12- and 14d- image/video examples for OpenAI Responses 2026-03-18 15:39:06 -04:00
Paul Kompfner
45186cc4ce feat: add OpenAI Responses API LLM service
Add OpenAIResponsesLLMService using the Responses API, with a dedicated
adapter that converts LLMContext messages to Responses API input items
(system→developer, tool_calls→function_call, tool→function_call_output,
multimodal content conversion, and tools schema flattening).

- New adapter: open_ai_responses_adapter.py
- New service: openai/responses/llm.py
- Examples: 07-interruptible and 14-function-calling variants
- 19 unit tests for adapter conversion logic
- Eval entries for both examples
2026-03-18 11:45:23 -04:00
Mark Backman
786279f143 Remove unused imports, 2026-03-07 2026-03-09 12:44:47 -04:00
Mark Backman
cd28c82de3 Update examples to use the class Settings alias 2026-03-07 09:15:24 -05:00
Mark Backman
671e9a6846 TTS service and example updates 2026-03-06 20:53:22 -05:00
Aleix Conchillo Flaqué
593b75bc8b Update foundational examples to use "user" role
Use system_instruction on LLM service constructors instead of adding
system messages to LLMContext. Messages added to context now use
"user" role.
2026-03-06 09:53:33 -08:00
Mark Backman
ab37185208 Update run_eval_pipeline with the latest settings, system_instruction patterns 2026-03-06 08:32:59 -05:00
Mark Backman
62554a2390 Update examples 2026-03-06 08:30:00 -05:00
Aleix Conchillo Flaqué
3199168d3e scripts(evals): use context.add_message() 2026-03-05 19:14:06 -08:00
Aleix Conchillo Flaqué
1221e2dd76 Fix Daily transport log level and eval script import
Change participant_updated log from debug to trace (too noisy).
Fix deepgram LiveOptions import in eval script.
2026-03-05 16:37:02 -08:00
Mark Backman
eeb8ed8588 Remove Hathora service integration
Hathora is shutting down on March 5, 2026. Remove the STT/TTS services,
examples, and related references.
2026-03-04 22:10:06 -05:00
Mark Backman
65f563ad34 Add debug logging to KrispVivaTurn analyze_end_of_turn and update example
Move speech detection tracking outside the per-frame loop in append_audio
since is_speech applies to the whole buffer. Add debug log in
analyze_end_of_turn to show state and probability at decision time. Update
the Krisp VIVA example to use Cartesia TTS and turn analyzer strategy.
2026-02-23 21:35:35 -05:00