pipecat

Author	SHA1	Message	Date
Om Chauhan	e22f9f84bb	fixed MCPClient to reuse session across tool calls	2026-04-02 18:06:28 -05:00
Mark Backman	0c59819682	Remove allow_interruptions from voice-sarvam example This was missed from the allow_interruptions removal commit.	2026-04-02 11:32:44 -04:00
Mark Backman	e74930b954	Remove deprecated text_aggregator and text_filter params from TTS Remove the deprecated text_aggregator parameter from TTSService, CartesiaTTSService, and RimeTTSService, and the deprecated text_filter parameter from TTSService. Users should use LLMTextProcessor before the TTS service instead. Update the voice-switching example to use LLMTextProcessor with PatternPairAggregator.	2026-04-01 17:03:05 -04:00
Harshita Jain	bd6cbd7fe7	feat: add Smallest AI STT service integration (#4162) Add SmallestSTTService using the Pulse WebSocket API for real-time transcription. Includes SmallestSTTSettings dataclass, 32-language support with resolve_language fallback, VAD-driven finalize signal, and SMALLEST_TTFS_P99 latency constant. Also adds X-Source and X-Pipecat-Version headers to Smallest STT and TTS WebSocket connections.	2026-04-01 13:44:04 -04:00
Mark Backman	3ca656cae5	Update simli name to match others	2026-03-31 22:54:21 -04:00
Mark Backman	d3021b4590	Rename example files to prepend parent folder name, preventing package shadowing Example files like openai.py shadow installed packages when Python adds the script directory to sys.path. Prepend the parent folder name to each example file (e.g. openai.py -> function-calling-openai.py). Also split thinking-and-mcp/ into separate mcp/ and thinking/ directories.	2026-03-31 22:06:01 -04:00
Mark Backman	7501effad5	Remove deprecated service module shims and old implementations Delete deprecated import shims that only re-export from new locations: - services/ai_services.py - services/gemini_multimodal_live/ - services/aws_nova_sonic/ - services/openai_realtime/ - services/deepgram/{stt,tts}_sagemaker.py - services/google/{llm_openai,llm_vertex,google}.py - services/google/gemini_live/llm_vertex.py - services/riva/ - services/nim/ Remove deprecated implementations replaced by newer services: - services/openai_realtime_beta/ (use openai.realtime) - services/google/openai/ (use google.llm) Also removes associated examples and tests for deleted services.	2026-03-31 15:34:14 -04:00
Mark Backman	27cb078716	Add missing google-vertex.py file	2026-03-31 15:25:52 -04:00
Mark Backman	47b41a0ff7	Rename services/ to voice/ and function-calling/, flatten to top level Replace the nested services/speech/ and services/function-calling/ with top-level voice/ and function-calling/ directories. Update eval script paths and README to match.	2026-03-31 15:20:03 -04:00
Mark Backman	f14638a1fd	Revert "Flatten services/ nesting: promote speech and function-calling to top level" This reverts commit `e1939ecd44`.	2026-03-31 14:59:23 -04:00
Mark Backman	e1939ecd44	Flatten services/ nesting: promote speech and function-calling to top level Move services/speech/* directly into services/ and services/function-calling/* into top-level function-calling/. Update eval script paths and README.	2026-03-31 14:55:22 -04:00
Mark Backman	1d85aedcae	Split features/ into audio/, observability/, and rag/ subfolders Extract focused example groups from the catch-all features/ folder: - audio/: audio recording, background sound, sound effects - observability/: observer, heartbeats, sentry metrics - rag/: mem0, gemini-rag, gemini grounding metadata Update README to document the new folders.	2026-03-31 13:15:06 -04:00
Mark Backman	e719cbbe6d	Reorganize examples into topic-based subfolders Move 304 examples from a flat numbered directory into 14 descriptive subfolders: getting-started, services (speech + function-calling), transcription, vision, realtime, persistent-context, context-summarization, update-settings (stt/tts/llm), turn-management, thinking-and-mcp, transports, video-avatar, video-processing, and features. Strip numbered prefixes from filenames (e.g. 07c-interruptible-deepgram.py becomes services/speech/deepgram.py) since the folder context makes them redundant. Keep numbered prefixes only in getting-started/ where ordering matters. Update eval script paths and README to match the new structure.	2026-03-31 13:12:24 -04:00
Mark Backman	f2ce7ececc	Move foundational examples to examples/	2026-03-31 13:12:24 -04:00
kompfner	bd7496fa27	Merge pull request #4211 from pipecat-ai/pk/openai-responses-websocket-service-refactor Introduce WebsocketLLMService and refactor OpenAIResponsesLLMService …	2026-03-31 13:02:45 -04:00
Paul Kompfner	0a8bcf58c4	Register on_connection_error event handler in WebsocketLLMService	2026-03-31 10:52:33 -04:00
Paul Kompfner	30903042e5	Work around OpenAI Python SDK temperature bug in example	2026-03-31 10:16:30 -04:00
Mark Backman	32022a952e	Merge pull request #4205 from pipecat-ai/mb/remove-quickstart Remove quickstart example from repo	2026-03-30 18:58:49 -04:00
Mark Backman	b78ae40d3c	Remove quickstart example from repo	2026-03-30 18:20:41 -04:00
Aleix Conchillo Flaqué	dd1bea2a5f	audio(turn): remove FalSmartTurnAnalyzer and LocalSmartTurnAnalyzer	2026-03-30 14:04:29 -07:00
Aleix Conchillo Flaqué	f0d04dde1c	audio(filters): remove KrispFilter	2026-03-30 14:01:06 -07:00
Paul Kompfner	1c8d31de70	Add trace logging for previous_response_id decisions and fix example Add detailed trace-level logging to _apply_previous_response_optimization showing why the optimization was applied or fell back to full context, including the relevant data for debugging. Use append_to_context=False for the filler TTSSpeakFrame in the function-calling example to avoid altering the conversation history and breaking the previous_response_id prefix match.	2026-03-30 09:59:03 -04:00
Paul Kompfner	f2a8a9e753	Add WebSocket-based OpenAI Responses LLM service with previous_response_id optimization Introduce a WebSocket variant of the OpenAI Responses API service that maintains a persistent connection to wss://api.openai.com/v1/responses for lower-latency inference. The WebSocket variant automatically uses previous_response_id to send only incremental context when possible, falling back to full context on reconnection or cache miss. The WebSocket variant becomes the new default OpenAIResponsesLLMService, and the HTTP variant is renamed to OpenAIResponsesHttpLLMService. Both share a private base class with common settings, parameter building, and run_inference (always HTTP) logic.	2026-03-30 09:58:56 -04:00
Mark Backman	8c9e189394	Fix langchain imports for langchain 1.x compatibility ChatPromptTemplate moved from langchain.prompts to langchain_core.prompts in langchain 1.x.	2026-03-29 10:27:48 -04:00
Mark Backman	2177e28ee1	Remove OpenPipe integration OpenPipe was acquired by CoreWeave in September 2025. The Python package hasn't been updated since June 2025 and the repo since 2024. The openpipe package caps openai<=1.97.1, creating dependency conflicts with other extras. Remove the dead integration to clean up the codebase.	2026-03-29 10:12:35 -04:00
Mark Backman	63254fe337	Add NebiusLLMService with developer role and tool support fixes - Add Nebius LLM service wrapping OpenAI-compatible Token Factory API - Set supports_developer_role = False (Nebius rejects developer role) - Default to openai/gpt-oss-120b model (supports function calling) - Add Nebius function-calling example and env.example entry - Fix Sarvam developer role support - Update examples to use developer role for intro messages	2026-03-29 08:50:11 -04:00
Aleix Conchillo Flaqué	8b64166bb7	Fix Sarvam examples to use 'user' role instead of 'developer' Sarvam uses the OpenAI-compatible API but does not support the 'developer' role, causing errors. Use 'user' role instead.	2026-03-27 20:33:25 -07:00
Paul Kompfner	5caf53f086	Tweak 26i example system instruction for Gemini 3.1 Flash Live compatibility Gemini 3.1 Flash Live won't reliably report ending its turn until after it says something following a tool call. Restructure the system instruction so the model says goodbye after calling end_conversation, and add a comment explaining the deferred EndFrame behavior that makes this work.	2026-03-27 17:13:17 -04:00
Paul Kompfner	04adb697be	Warn when TEXT modality is set for Gemini Live, and remove 26d text example All recent Gemini Live models (including the default gemini-2.5-flash-native-audio-preview-12-2025, and going at least as far back as gemini-2.5-flash-native-audio-preview-09-2025) only support AUDIO as a response modality. We considered using `modalities=TEXT` as a Pipecat-level signal to suppress audio output frames (so developers could pair Gemini Live with an external TTS), but the output transcription from the API arrives too late relative to the audio to be useful for driving an external TTS service. For now, just log a warning when a TEXT modality is configured (at init or via set_model_modalities) and proceed as normal. The 26d text-modality example is removed since it no longer represents a viable configuration.	2026-03-27 16:21:15 -04:00
filipi87	f9670b9601	Removing the models from the Inworld example so we can use the default model.	2026-03-27 14:23:20 -03:00
Mark Backman	cbb3d99493	Merge pull request #4166 from pipecat-ai/mb/fix-example-ordering-56 Fix example numbering, add LemonSlice to evals	2026-03-27 10:29:07 -04:00
Filipi da Silva Fuchter	fb1996cedc	Merge pull request #4143 from pipecat-ai/cb/sagemaker-flux Add Deepgram Flux STT service for AWS SageMaker	2026-03-27 10:27:49 -04:00
Mark Backman	d8b0ed18fd	Fix example numbering, add LemonSlice to evals	2026-03-27 10:11:37 -04:00
Mark Backman	21a729ae5d	Merge pull request #4146 from pipecat-ai/mb/gemini-live-local-vad	2026-03-26 17:48:21 -04:00
filipi87	28683a7296	Moving flux_stt.py to deepgram/flux/sagemaker/stt.py	2026-03-26 17:43:51 -03:00
Mark Backman	533dcdba3f	Merge pull request #4154 from pipecat-ai/mb/deprecate-sambanova-stt Remove SambaNovaSTTService	2026-03-26 14:10:14 -04:00
Mark Backman	9c6d51c570	feat(mem0): add get_memories() convenience method to Mem0MemoryService Expose a public method for retrieving all stored memories outside the pipeline, avoiding the need for callers to reimplement client branching, OR filter construction, and asyncio.to_thread wrapping. Simplify the example get_initial_greeting() to use it.	2026-03-26 13:28:41 -04:00
Mark Backman	ca2bfd6f12	Remove SambaNovaSTTService SambaNova no longer offers speech-to-text audio models.	2026-03-26 12:22:06 -04:00
Mark Backman	6d1918f12a	Update GROK_API_KEY to XAI_API_KEY	2026-03-25 23:23:58 -04:00
Mark Backman	503e5e9106	Fix Gemini Live local VAD by sending correct activity events to server When Gemini Live was configured with local VAD (server-side VAD disabled), the service was listening for the wrong frame types and not sending ActivityStart/ActivityEnd events to the server. Now it listens for VADUserStartedSpeakingFrame/VADUserStoppedSpeakingFrame and sends the appropriate activity signals when local VAD is in use. Also removes the unnecessary local SileroVADAnalyzer from server-side VAD examples and adds a new 26a example demonstrating local VAD configuration.	2026-03-25 18:00:13 -04:00
Chad Bailey	4f0b2066c0	Add Deepgram Flux STT service for AWS SageMaker Add DeepgramFluxSageMakerSTTService that combines SageMaker's HTTP/2 transport with Flux's JSON turn detection protocol (StartOfTurn, EndOfTurn, EagerEndOfTurn, TurnResumed). Includes mid-stream Configure support, silence watchdog, and an example bot.	2026-03-25 19:09:52 +00:00
Mark Backman	1c99a537b2	Consolidate Grok services into xai module Both GrokLLMService and XAIHttpTTSService use the same xAI API (api.x.ai), so move Grok source files into the xai module. Leave deprecation shims in the old grok/ paths for backward compatibility.	2026-03-25 12:07:40 -04:00
Mark Backman	adc003d6c7	Code review cleanup	2026-03-25 10:53:07 -04:00
Nicholas Zhao	bbd14de9c5	Address PR review: rename to XAIHttpTTSService, add language map, clean up API - Rename XAITTSService → XAIHttpTTSService and XAITTSSettings → XAIHttpTTSSettings - Add language_to_xai_language() with explicit LANGUAGE_MAP using resolve_language() - Remove deprecated InputParams, params, voice, language init params - Remove XAI_DEFAULT_SAMPLE_RATE and XAI_PCM_CODEC constants; add encoding param - Set sample_rate=None default (picked up from PipelineParams or user) - Use Language.EN enum instead of string "en" for default language - Add changelog/4031.added.md - Add 07e-interruptible-xai.py foundational example - Update 14g-function-calling-grok.py to use XAIHttpTTSService - Register 07e in run-release-evals.py	2026-03-25 10:46:54 -04:00
Paul Kompfner	ac2b1ecd47	Prefer init-provided system instruction in Grok Realtime Add system_instruction parameter to the Grok Realtime adapter's get_llm_invocation_params() and call _resolve_system_instruction() to prefer init-provided over context-provided system instructions and warn on conflicts. Previously context-provided took precedence. Update the Grok Realtime example to use settings.system_instruction instead of session_properties.instructions.	2026-03-24 17:29:19 -04:00
Paul Kompfner	e0c49927cf	Remove hard-coded model overrides from Together and Groq examples Prefer service defaults — the hard-coded models we were using are no longer available on these providers.	2026-03-24 16:05:15 -04:00
Paul Kompfner	7a0f7b58d1	Remove bit of unintentionally-left-in debugging logic	2026-03-24 16:05:15 -04:00
Paul Kompfner	5806a3f0fa	Use "developer" role for remaining developer-intent messages in examples	2026-03-24 16:05:04 -04:00
Paul Kompfner	d779a5b4ea	Use "developer" role for programmatic conversation-kickoff messages These messages are developer instructions to the assistant (e.g. "Please introduce yourself to the user"), not simulated user input. The "developer" role is semantically correct for this purpose.	2026-03-24 16:02:42 -04:00
Paul Kompfner	e0bc9c73c6	Add Anthropic interruptible example (07e) and register in release evals	2026-03-24 16:02:42 -04:00

1838 Commits