pipecat

Author	SHA1	Message	Date
filipi87	6ef7f6446a	Saving the audio inside the Tavus video so we can test.	2026-05-21 09:01:12 -03:00
filipi87	7c61c36825	Recording the audios that we are receiving.	2026-05-20 19:03:01 -03:00
filipi87	1338da6831	Don't inject silence in the proxy.	2026-05-20 18:26:11 -03:00
filipi87	6a238e0d62	Refactoring how we are handling the silence.	2026-05-20 17:41:11 -03:00
filipi87	e7bad7a007	Buffering the audio before sending back.	2026-05-20 17:23:15 -03:00
filipi87	b360fbf7fc	Handling interruption.	2026-05-20 16:26:47 -03:00
filipi87	f568b1d8df	Fixing ruff format.	2026-05-20 12:48:25 -03:00
filipi87	fd7af7ba9f	Changing the silence threshold to 10.	2026-05-20 12:45:31 -03:00
filipi87	b7d272a5be	Skipping webrtc injected silence.	2026-05-20 12:18:43 -03:00
filipi87	996aa461ac	Sending audio faster than realtime.	2026-05-20 12:03:24 -03:00
Aleix Conchillo Flaqué	f5158d51e7	Add filter-incomplete + function-calling turn-management example A copy of ``turn-management-filter-incomplete-turns.py`` extended with a ``get_weather(location)`` direct function. Exercises the path where the LLM responds to a complete user turn by calling a tool — used to reproduce (and now verify the fix for) the ``_user_speaking`` gating bug between filter-incomplete and function calls.	2026-05-15 14:54:51 -07:00
Paul Kompfner	1a4a6f4edf	refactor(gemini-live): bring tool-result handling in line with the canonical realtime pattern Lays groundwork for cancel_on_interruption=False support on Gemini Live by restructuring _process_completed_function_calls to match the shape used by AWSNovaSonicLLMService and OpenAIRealtimeLLMService in #4441: a single-pass forward iteration over raw context messages that detects async-tool messages via async_tool_messages.parse_message and routes them — started skipped silently, intermediate logged-as-error and surfaced via push_error, final delivered via the formal FunctionResponse channel. Replaces the prior two-pass structure that went through the adapter for sync results — the service now uses a lightweight self._tool_call_id_to_name map (populated when the model issues tool calls) for the name lookup the adapter used to provide. Extracts a new GeminiLLMAdapter.to_function_response_dict static method for the dict-coercion logic that wraps non-dict tool returns as {value: <result>} for Gemini's FunctionResponse.response field; the adapter's existing inline copy in _from_standard_message uses it too. Example consolidation: - Folds realtime-gemini-live-function-calling.py into the base realtime-gemini-live.py example so the base exercises function calling out of the box (matching realtime-openai.py and realtime-aws-nova-sonic.py). - Renames realtime-gemini-live-vertex-function-calling.py to realtime-gemini-live-vertex.py, mirroring the consolidation. - Adds realtime-gemini-live-async-tool.py. - Updates scripts/evals/run-release-evals.py for the renames. This commit alone doesn't make cancel_on_interruption=False fully work on Gemini Live — additional investigation is pending. This is foundational work to be built on.	2026-05-08 16:42:54 -04:00
Aleix Conchillo Flaqué	ea3585146c	chore(scripts): add release-changelog.py Adds a script to unfill (single-line) entry paragraphs in CHANGELOG.md while keeping `(PR [...])` on its own continuation line.	2026-04-27 15:07:53 -07:00
Mark Backman	10e58d6e42	Fix type errors in scripts and add to pyright checked set	2026-04-21 16:17:49 -04:00
Mark Backman	84891de04d	Add voice/xai-http.py to release evals	2026-04-21 15:49:59 -04:00
filipi87	0340e25e9f	Fixing typecheck for service switcher.	2026-04-17 12:44:57 -03:00
Aleix Conchillo Flaqué	b3bb6fdaa5	Modernize Python typing across the codebase Automated via ruff UP006, UP007, UP035, UP045 rules (target: py311): - Replace `typing.List`, `Dict`, `Tuple`, `Set`, `FrozenSet`, `Type` with their built-in equivalents (`list`, `dict`, `tuple`, etc.) - Replace `typing.Optional[X]` with `X | None` - Replace `typing.Union[X, Y]` with `X | Y` - Move `Mapping`, `Sequence`, `Callable`, `Awaitable`, `MutableMapping`, `MutableSequence`, `Iterator`, `AsyncIterator`, `AsyncGenerator` imports from `typing` to `collections.abc` - Remove now-unused `typing` imports - Add `from __future__ import annotations` to 5 files that use forward-reference strings in `X | "Y"` annotations	2026-04-16 09:28:23 -07:00
Mark Backman	9ffcccdd84	Merge pull request #4253 from pipecat-ai/mb/mistral-stt Add Mistral Voxtral Realtime STT service	2026-04-15 09:00:27 -04:00
Aleix Conchillo Flaqué	153814ecc2	scripts/evals: create recording subdirectories when saving audio Example files can live under subdirectories (e.g. foundational/01.py), so the recording path needs its parent directory created before the audio file is written.	2026-04-10 13:19:20 -07:00
Mark Backman	215b2dc7f3	Add voice-mistral to evals	2026-04-07 15:37:07 -04:00
kompfner	a3c7f6c2af	Merge pull request #4215 from pipecat-ai/pk/remove-openaillmcontext Remove deprecated `OpenAILLMContext` as well as everything (code path…	2026-04-01 14:03:35 -04:00
Mark Backman	3ca656cae5	Update simli name to match others	2026-03-31 22:54:21 -04:00
Mark Backman	6a84d02156	Update evals - Removed evals for removed services - Added eval for function-calling-deepseek.py	2026-03-31 22:13:52 -04:00
Mark Backman	080da8b94c	Update eval script paths to match renamed example files	2026-03-31 22:09:42 -04:00
Paul Kompfner	394599d031	Remove deprecated `OpenAILLMContext` as well as everything (code paths or whole types) dependent on it (all of which were also deprecated)	2026-03-31 18:15:25 -04:00
Mark Backman	47b41a0ff7	Rename services/ to voice/ and function-calling/, flatten to top level Replace the nested services/speech/ and services/function-calling/ with top-level voice/ and function-calling/ directories. Update eval script paths and README to match.	2026-03-31 15:20:03 -04:00
Mark Backman	f14638a1fd	Revert "Flatten services/ nesting: promote speech and function-calling to top level" This reverts commit `e1939ecd44`.	2026-03-31 14:59:23 -04:00
Mark Backman	e1939ecd44	Flatten services/ nesting: promote speech and function-calling to top level Move services/speech/* directly into services/ and services/function-calling/* into top-level function-calling/. Update eval script paths and README.	2026-03-31 14:55:22 -04:00
Mark Backman	e719cbbe6d	Reorganize examples into topic-based subfolders Move 304 examples from a flat numbered directory into 14 descriptive subfolders: getting-started, services (speech + function-calling), transcription, vision, realtime, persistent-context, context-summarization, update-settings (stt/tts/llm), turn-management, thinking-and-mcp, transports, video-avatar, video-processing, and features. Strip numbered prefixes from filenames (e.g. 07c-interruptible-deepgram.py becomes services/speech/deepgram.py) since the folder context makes them redundant. Keep numbered prefixes only in getting-started/ where ordering matters. Update eval script paths and README to match the new structure.	2026-03-31 13:12:24 -04:00
Mark Backman	f2ce7ececc	Move foundational examples to examples/	2026-03-31 13:12:24 -04:00
Paul Kompfner	b5683556d4	Remove duplicate entries in run-release-evals.py, which appeared after a rebase	2026-03-30 10:03:43 -04:00
Paul Kompfner	f2a8a9e753	Add WebSocket-based OpenAI Responses LLM service with previous_response_id optimization Introduce a WebSocket variant of the OpenAI Responses API service that maintains a persistent connection to wss://api.openai.com/v1/responses for lower-latency inference. The WebSocket variant automatically uses previous_response_id to send only incremental context when possible, falling back to full context on reconnection or cache miss. The WebSocket variant becomes the new default OpenAIResponsesLLMService, and the HTTP variant is renamed to OpenAIResponsesHttpLLMService. Both share a private base class with common settings, parameter building, and run_inference (always HTTP) logic.	2026-03-30 09:58:56 -04:00
Mark Backman	2177e28ee1	Remove OpenPipe integration OpenPipe was acquired by CoreWeave in September 2025. The Python package hasn't been updated since June 2025 and the repo since 2024. The openpipe package caps openai<=1.97.1, creating dependency conflicts with other extras. Remove the dead integration to clean up the codebase.	2026-03-29 10:12:35 -04:00
Mark Backman	63254fe337	Add NebiusLLMService with developer role and tool support fixes - Add Nebius LLM service wrapping OpenAI-compatible Token Factory API - Set supports_developer_role = False (Nebius rejects developer role) - Default to openai/gpt-oss-120b model (supports function calling) - Add Nebius function-calling example and env.example entry - Fix Sarvam developer role support - Update examples to use developer role for intro messages	2026-03-29 08:50:11 -04:00
Mark Backman	d8b0ed18fd	Fix example numbering, add LemonSlice to evals	2026-03-27 10:11:37 -04:00
Mark Backman	21a729ae5d	Merge pull request #4146 from pipecat-ai/mb/gemini-live-local-vad	2026-03-26 17:48:21 -04:00
Mark Backman	fe0633ecd1	Add 14s to release evals	2026-03-26 12:27:27 -04:00
Mark Backman	503e5e9106	Fix Gemini Live local VAD by sending correct activity events to server When Gemini Live was configured with local VAD (server-side VAD disabled), the service was listening for the wrong frame types and not sending ActivityStart/ActivityEnd events to the server. Now it listens for VADUserStartedSpeakingFrame/VADUserStoppedSpeakingFrame and sends the appropriate activity signals when local VAD is in use. Also removes the unnecessary local SileroVADAnalyzer from server-side VAD examples and adds a new 26a example demonstrating local VAD configuration.	2026-03-25 18:00:13 -04:00
Mark Backman	adc003d6c7	Code review cleanup	2026-03-25 10:53:07 -04:00
Paul Kompfner	e0bc9c73c6	Add Anthropic interruptible example (07e) and register in release evals	2026-03-24 16:02:42 -04:00
Mark Backman	6eb988b729	Merge pull request #4092 from harshitajain165/harshita/smallest-tts-only Add Smallest AI TTS service integration	2026-03-24 11:54:34 -04:00
Mark Backman	51d28b4a9f	Code review fixes	2026-03-24 11:21:04 -04:00
kompfner	cf083b8411	Merge pull request #4078 from pipecat-ai/cb/gemini-updates Updates for Gemini Live	2026-03-24 11:18:00 -04:00
Mark Backman	aa0b49d69f	Code review fixes	2026-03-24 09:22:08 -04:00
dhruvladia-sarvam	349b8645f3	Merge branch 'main' into feat/sarvam-llm-integration	2026-03-24 16:34:12 +05:30
dhruvladia-sarvam	696196e30c	alignment with pr 4081	2026-03-24 16:29:58 +05:30
Mark Backman	d314e2831a	Simplify 26 name, update evals	2026-03-23 15:46:13 -04:00
Paul Kompfner	b1a8588209	feat: add 12- and 14d- image/video examples for OpenAI Responses	2026-03-18 15:39:06 -04:00
Paul Kompfner	45186cc4ce	feat: add OpenAI Responses API LLM service Add OpenAIResponsesLLMService using the Responses API, with a dedicated adapter that converts LLMContext messages to Responses API input items (system→developer, tool_calls→function_call, tool→function_call_output, multimodal content conversion, and tools schema flattening). - New adapter: open_ai_responses_adapter.py - New service: openai/responses/llm.py - Examples: 07-interruptible and 14-function-calling variants - 19 unit tests for adapter conversion logic - Eval entries for both examples	2026-03-18 11:45:23 -04:00
Mark Backman	786279f143	Remove unused imports, 2026-03-07	2026-03-09 12:44:47 -04:00
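
Commit `153814ecc2` above notes that example files can live under subdirectories (e.g. foundational/01.py), so the recording path needs its parent directory created before the audio file is written. A minimal sketch of that idea follows. It is not the code from scripts/evals: the function name `save_recording`, its parameters, and the `.wav` suffix are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path


def save_recording(base_dir: Path, example_name: str, audio: bytes) -> Path:
    """Write audio for an example, creating missing parent directories first.

    Hypothetical helper: the name, signature, and .wav suffix are
    illustrative assumptions, not taken from the pipecat repository.
    """
    # "foundational/01.py" maps to "<base_dir>/foundational/01.wav"
    target = base_dir / Path(example_name).with_suffix(".wav")
    # Create the subdirectory chain; do nothing if it already exists.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(audio)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = save_recording(Path(tmp), "foundational/01.py", b"\x00\x00")
        print(path.relative_to(tmp))
```

Without the `mkdir` call, `write_bytes` would raise `FileNotFoundError` for any example whose name contains a subdirectory that does not exist yet.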

168 Commits