pipecat

Author	SHA1	Message	Date
Mark Backman	63254fe337	Add NebiusLLMService with developer role and tool support fixes - Add Nebius LLM service wrapping OpenAI-compatible Token Factory API - Set supports_developer_role = False (Nebius rejects developer role) - Default to openai/gpt-oss-120b model (supports function calling) - Add Nebius function-calling example and env.example entry - Fix Sarvam developer role support - Update examples to use developer role for intro messages	2026-03-29 08:50:11 -04:00
Aleix Conchillo Flaqué	8b64166bb7	Fix Sarvam examples to use 'user' role instead of 'developer' Sarvam uses the OpenAI-compatible API but does not support the 'developer' role, causing errors. Use 'user' role instead.	2026-03-27 20:33:25 -07:00
Paul Kompfner	5caf53f086	Tweak 26i example system instruction for Gemini 3.1 Flash Live compatibility Gemini 3.1 Flash Live won't reliably report ending its turn until after it says something following a tool call. Restructure the system instruction so the model says goodbye after calling end_conversation, and add a comment explaining the deferred EndFrame behavior that makes this work.	2026-03-27 17:13:17 -04:00
Paul Kompfner	04adb697be	Warn when TEXT modality is set for Gemini Live, and remove 26d text example All recent Gemini Live models (including the default gemini-2.5-flash-native-audio-preview-12-2025, and going at least as far back as gemini-2.5-flash-native-audio-preview-09-2025) only support AUDIO as a response modality. We considered using `modalities=TEXT` as a Pipecat-level signal to suppress audio output frames (so developers could pair Gemini Live with an external TTS), but the output transcription from the API arrives too late relative to the audio to be useful for driving an external TTS service. For now, just log a warning when a TEXT modality is configured (at init or via set_model_modalities) and proceed as normal. The 26d text-modality example is removed since it no longer represents a viable configuration.	2026-03-27 16:21:15 -04:00
filipi87	f9670b9601	Removing the models from the Inworld example so we can use the default model.	2026-03-27 14:23:20 -03:00
Mark Backman	cbb3d99493	Merge pull request #4166 from pipecat-ai/mb/fix-example-ordering-56 Fix example numbering, add LemonSlice to evals	2026-03-27 10:29:07 -04:00
Filipi da Silva Fuchter	fb1996cedc	Merge pull request #4143 from pipecat-ai/cb/sagemaker-flux Add Deepgram Flux STT service for AWS SageMaker	2026-03-27 10:27:49 -04:00
Mark Backman	d8b0ed18fd	Fix example numbering, add LemonSlice to evals	2026-03-27 10:11:37 -04:00
Mark Backman	21a729ae5d	Merge pull request #4146 from pipecat-ai/mb/gemini-live-local-vad	2026-03-26 17:48:21 -04:00
filipi87	28683a7296	Moving flux_stt.py to deepgram/flux/sagemaker/stt.py	2026-03-26 17:43:51 -03:00
Mark Backman	533dcdba3f	Merge pull request #4154 from pipecat-ai/mb/deprecate-sambanova-stt Remove SambaNovaSTTService	2026-03-26 14:10:14 -04:00
Mark Backman	9c6d51c570	feat(mem0): add get_memories() convenience method to Mem0MemoryService Expose a public method for retrieving all stored memories outside the pipeline, avoiding the need for callers to reimplement client branching, OR filter construction, and asyncio.to_thread wrapping. Simplify the example get_initial_greeting() to use it.	2026-03-26 13:28:41 -04:00
Mark Backman	ca2bfd6f12	Remove SambaNovaSTTService SambaNova no longer offers speech-to-text audio models.	2026-03-26 12:22:06 -04:00
Mark Backman	6d1918f12a	Update GROK_API_KEY to XAI_API_KEY	2026-03-25 23:23:58 -04:00
Mark Backman	503e5e9106	Fix Gemini Live local VAD by sending correct activity events to server When Gemini Live was configured with local VAD (server-side VAD disabled), the service was listening for the wrong frame types and not sending ActivityStart/ActivityEnd events to the server. Now it listens for VADUserStartedSpeakingFrame/VADUserStoppedSpeakingFrame and sends the appropriate activity signals when local VAD is in use. Also removes the unnecessary local SileroVADAnalyzer from server-side VAD examples and adds a new 26a example demonstrating local VAD configuration.	2026-03-25 18:00:13 -04:00
Chad Bailey	4f0b2066c0	Add Deepgram Flux STT service for AWS SageMaker Add DeepgramFluxSageMakerSTTService that combines SageMaker's HTTP/2 transport with Flux's JSON turn detection protocol (StartOfTurn, EndOfTurn, EagerEndOfTurn, TurnResumed). Includes mid-stream Configure support, silence watchdog, and an example bot.	2026-03-25 19:09:52 +00:00
Mark Backman	1c99a537b2	Consolidate Grok services into xai module Both GrokLLMService and XAIHttpTTSService use the same xAI API (api.x.ai), so move Grok source files into the xai module. Leave deprecation shims in the old grok/ paths for backward compatibility.	2026-03-25 12:07:40 -04:00
Mark Backman	adc003d6c7	Code review cleanup	2026-03-25 10:53:07 -04:00
Nicholas Zhao	bbd14de9c5	Address PR review: rename to XAIHttpTTSService, add language map, clean up API - Rename XAITTSService → XAIHttpTTSService and XAITTSSettings → XAIHttpTTSSettings - Add language_to_xai_language() with explicit LANGUAGE_MAP using resolve_language() - Remove deprecated InputParams, params, voice, language init params - Remove XAI_DEFAULT_SAMPLE_RATE and XAI_PCM_CODEC constants; add encoding param - Set sample_rate=None default (picked up from PipelineParams or user) - Use Language.EN enum instead of string "en" for default language - Add changelog/4031.added.md - Add 07e-interruptible-xai.py foundational example - Update 14g-function-calling-grok.py to use XAIHttpTTSService - Register 07e in run-release-evals.py	2026-03-25 10:46:54 -04:00
Paul Kompfner	ac2b1ecd47	Prefer init-provided system instruction in Grok Realtime Add system_instruction parameter to the Grok Realtime adapter's get_llm_invocation_params() and call _resolve_system_instruction() to prefer init-provided over context-provided system instructions and warn on conflicts. Previously context-provided took precedence. Update the Grok Realtime example to use settings.system_instruction instead of session_properties.instructions.	2026-03-24 17:29:19 -04:00
Paul Kompfner	e0c49927cf	Remove hard-coded model overrides from Together and Groq examples Prefer service defaults — the hard-coded models we were using are no longer available on these providers.	2026-03-24 16:05:15 -04:00
Paul Kompfner	7a0f7b58d1	Remove bit of unintentionally-left-in debugging logic	2026-03-24 16:05:15 -04:00
Paul Kompfner	5806a3f0fa	Use "developer" role for remaining developer-intent messages in examples	2026-03-24 16:05:04 -04:00
Paul Kompfner	d779a5b4ea	Use "developer" role for programmatic conversation-kickoff messages These messages are developer instructions to the assistant (e.g. "Please introduce yourself to the user"), not simulated user input. The "developer" role is semantically correct for this purpose.	2026-03-24 16:02:42 -04:00
Paul Kompfner	e0bc9c73c6	Add Anthropic interruptible example (07e) and register in release evals	2026-03-24 16:02:42 -04:00
Mark Backman	6eb988b729	Merge pull request #4092 from harshitajain165/harshita/smallest-tts-only Add Smallest AI TTS service integration	2026-03-24 11:54:34 -04:00
Filipi da Silva Fuchter	0783edb185	Merge pull request #4120 from pipecat-ai/filipi/krisp-viva-vad-support Added cleanup() method to VADAnalyzer base class	2026-03-24 11:26:53 -04:00
kompfner	cf083b8411	Merge pull request #4078 from pipecat-ai/cb/gemini-updates Updates for Gemini Live	2026-03-24 11:18:00 -04:00
Harshita Jain	099814d74a	Add Smallest AI TTS service integration Adds SmallestTTSService, a WebSocket-based TTS service using Smallest AI's Lightning v3.1 model. Follows current Pipecat service conventions: - SmallestTTSSettings dataclass with runtime-updatable settings (voice, language, speed, etc.) - Reconnects on model change; keepalive every 30s to prevent idle timeout - TTS settings default to None so the API applies its own defaults - Model enum: SmallestTTSModel.LIGHTNING_V3_1 Includes a foundational example (07zl-interruptible-smallest.py) using Deepgram STT + Smallest TTS + OpenAI LLM. STT integration will follow in a separate PR once the hallucination/finalize behaviour is resolved. Made-with: Cursor	2026-03-24 11:11:10 -04:00
Paul Kompfner	68a440ae2e	Move inference_on_context_initialization comment to constructor level	2026-03-24 10:49:45 -04:00
filipi87	311afef7da	Fixing Krisp Viva example.	2026-03-24 10:48:22 -03:00
Filipi da Silva Fuchter	5ed183d215	Merge pull request #4022 from krispai/krisp-viva-vad-support Draft Implementation for Krisp VIVA VAD.	2026-03-24 09:44:32 -04:00
Mark Backman	aa0b49d69f	Code review fixes	2026-03-24 09:22:08 -04:00
Mark Backman	cdd8c3e5bb	Fix examples	2026-03-24 08:53:56 -04:00
Mark Backman	1c8a8f51d4	Code review fixes	2026-03-24 08:46:03 -04:00
dhruvladia-sarvam	349b8645f3	Merge branch 'main' into feat/sarvam-llm-integration	2026-03-24 16:34:12 +05:30
dhruvladia-sarvam	696196e30c	alignment with pr 4081	2026-03-24 16:29:58 +05:30
Garegin Harutyunyan	dacffccd3a	fixed runtime issue.	2026-03-24 12:56:19 +04:00
filipi87	ddd1b71b56	Renaming audio_out_insert_silence to audio_out_auto_silence	2026-03-23 17:57:42 -03:00
Mark Backman	d314e2831a	Simplify 26 name, update evals	2026-03-23 15:46:13 -04:00
Chad Bailey	844555c520	removed old Gemini Live example	2026-03-23 18:31:36 +00:00
Garegin Harutyunyan	f8c7414ea7	format fix.	2026-03-23 18:58:19 +04:00
Garegin Harutyunyan	f1f51de962	Merge branch 'main' into krisp-viva-vad-support	2026-03-23 18:35:58 +04:00
Garegin Harutyunyan	c32240e14b	Fixed review comments.	2026-03-23 17:44:48 +04:00
filipi87	936a39f4a1	Updating tavus examples to not send silence.	2026-03-22 14:41:23 -03:00
Paul Kompfner	4c456ada04	Remove 05a example, which was broken and isn't currently a priority to fix	2026-03-19 15:52:48 -04:00
kompfner	488dc1d07e	Merge pull request #4074 from pipecat-ai/pk/openai-responses-llm-service feat: add OpenAI Responses API LLM service	2026-03-19 15:44:26 -04:00
Paul Kompfner	dafbb2eb66	fix: typo "conversatione" → "conversation" in 20- examples	2026-03-19 15:38:38 -04:00
Paul Kompfner	d702ebd6a2	Add frame_order parameter to SyncParallelPipeline Adds a FrameOrder enum with ARRIVAL (default, existing behavior) and PIPELINE (pushes frames in pipeline definition order). This lets callers guarantee output ordering between parallel pipelines — e.g. ensuring image frames precede audio frames — without needing a separate reordering processor downstream. Updates the 05-sync-speech-and-image example to use FrameOrder.PIPELINE, removing the ImageBeforeAudioReorderer class entirely.	2026-03-19 09:43:51 -04:00
Paul Kompfner	5e7639812a	Add ImageBeforeAudioReorderer to sync-speech-and-image example Add a processor after SyncParallelPipeline that ensures each image frame precedes its corresponding TTS audio frames. SyncParallelPipeline batches them together but doesn't guarantee branch ordering. The reorderer detects when TTS frames arrive before their image (via context_id tracking) and holds them until the image arrives. Also rename ImageAudioSync to MarkImageForPlaybackSync for clarity.	2026-03-19 09:43:51 -04:00

1813 Commits