pipecat

Author	SHA1	Message	Date
Paul Kompfner	5b270fec8e	In AWS Nova Sonic examples, migrate to newer pattern of passing in `settings` with `voice` and `system_instruction`, in favor of passing in `voice_id` as a direct init arg and the system instruction as the first message in the context	2026-03-06 09:57:57 -05:00
Paul Kompfner	78deaa735d	Move `system_instruction` into `LLMSettings` Add `system_instruction` field to `LLMSettings` so it is runtime-updatable via settings. For Google (GoogleLLMService, GoogleVertexLLMService), deprecate the init-time arg since it was already shipped. For Anthropic, AWS Bedrock, and OpenAI, remove the init-time arg entirely since it was never shipped. Still need to handle realtime services (OpenAI Realtime, Grok Realtime, Gemini Live).	2026-03-06 09:57:08 -05:00
Mark Backman	8a203dd98f	Update more examples, misc services	2026-03-06 08:30:00 -05:00
Mark Backman	62554a2390	Update examples	2026-03-06 08:30:00 -05:00
Mark Backman	034e81ff18	Update STT service settings	2026-03-06 08:29:14 -05:00
Mark Backman	1274bb2c55	Update deprecation version to 0.0.105	2026-03-06 08:29:14 -05:00
Mark Backman	ca27e12c84	Merge pull request #3926 from pipecat-ai/mb/update-deps-2026-03-05 Update dependency version ranges for flexibility	2026-03-05 18:09:04 -05:00
zack	380726cfd3	Update AssemblyAI turn detection example to use keyterms_prompt Change the commented example from prompt string format to keyterms_prompt list format for better clarity and consistency with API best practices.	2026-03-05 15:47:54 -05:00
Mark Backman	3f97c91983	Update optional dependency version ranges and remove SDK dependencies Widen version ranges for stable packages (anthropic, azure, deepgram, groq, livekit, nvidia-riva-client, fastapi, ormsgpack, opentelemetry, faster-whisper) and add upper bounds to previously uncapped packages (hume, pyjwt, livekit-api, camb). Replace CartesiaHttpTTSService's internal use of the Cartesia SDK with direct aiohttp calls, accepting an optional aiohttp_session parameter. Replace fal-client SDK calls in FalSTTService and FalImageGenService with direct HTTP to bypass the SDK's aggressive retry/backoff logic that caused significant latency regressions.	2026-03-05 15:06:54 -05:00
Mark Backman	eeb8ed8588	Remove Hathora service integration Hathora is shutting down on March 5, 2026. Remove the STT/TTS services, examples, and related references.	2026-03-04 22:10:06 -05:00
Aleix Conchillo Flaqué	0004a116d8	examples(foundational): use system_instruction in all examples	2026-03-04 17:37:32 -08:00
Mark Backman	ca0ec16373	Merge pull request #3889 from ai-coustics/goedev/aic-voice-focus-and-memoryview-fix AIC Voice Focus version update & concurrency safety issue on audio buffer.	2026-03-03 09:28:13 -05:00
Mark Backman	c79a739c85	Merge pull request #3856 from zkleb-aai/assemblyai-u3-rt-pro Assemblyai u3 rt pro	2026-03-02 20:28:28 -05:00
Mark Backman	aad1211a57	Merge pull request #3885 from pipecat-ai/mb/latency-breakdown Add latency breakdown to UserBotLatencyObserver	2026-03-02 19:27:35 -05:00
Mark Backman	7dbb130666	Add chronological_events utility function to display UserBotLatencyObserver report	2026-03-02 19:23:42 -05:00
Aleix Conchillo Flaqué	141b0ee014	Merge pull request #3902 from pipecat-ai/aleix/deepgram-sagemaker-move Move Deepgram SageMaker modules to sagemaker/ subpackage	2026-03-02 15:25:17 -08:00
Aleix Conchillo Flaqué	088eb9b01c	examples: update to new sagemaker packages	2026-03-02 15:20:52 -08:00
zack	32773b42d6	Improve terminology: rename file and replace 'STT mode' with 'AssemblyAI turn detection' - Rename 07o-interruptible-assemblyai-stt.py -> 07o-interruptible-assemblyai-turn-detection.py - Replace 'STT mode' with 'AssemblyAI turn detection mode' throughout codebase - Replace 'Mode 1'/'Mode 2' with descriptive 'Pipecat turn detection'/'AssemblyAI turn detection' - Update changelog to use 'built-in turn detection' terminology - Addresses PR feedback about confusing terminology	2026-03-02 18:08:46 -05:00
zack	b449515410	Address PR review feedback: remove debug logs, fix hasattr logic, add VADAnalyzer	2026-03-02 17:54:31 -05:00
Mark Backman	aae9136df9	Review feedback	2026-03-02 17:52:39 -05:00
filipi87	49c73bb0a3	Merge branch 'main' into filipi/lemonslice # Conflicts: # README.md # uv.lock	2026-03-02 19:24:52 -03:00
filipi87	f07e55a4ed	Wrap LemonSlice session creation params in LemonSliceNewSessionRequest	2026-03-02 19:15:18 -03:00
filipi87	7afd7068b5	Retrieving the elevenlabs voice ID from environment variable	2026-03-02 19:02:51 -03:00
Mark Backman	ff5b985009	Convert observer data models to Pydantic BaseModel with timestamps Enables .model_dump() serialization for Pipecat Cloud collection. All metrics now include start_time (Unix timestamp) for timeline plotting alongside duration_secs.	2026-03-02 16:11:43 -05:00
Mark Backman	a738a4d82b	Add function call latency tracking to LatencyBreakdown	2026-03-02 16:11:43 -05:00
Mark Backman	ddba1b84a9	Add first-bot-speech latency to UserBotLatencyObserver Measure time from ClientConnectedFrame to first BotStartedSpeakingFrame, emitting a one-time on_first_bot_speech_latency event with breakdown.	2026-03-02 16:11:43 -05:00
Mark Backman	18155b6a63	Add latency breakdown to UserBotLatencyObserver Add per-service latency breakdown metrics alongside existing user-to-bot latency measurement. When enable_metrics=True, the observer now emits an on_latency_breakdown event with TTFB, text aggregation, and user turn duration metrics collected between VADUserStoppedSpeakingFrame and BotStartedSpeakingFrame. - Add LatencyBreakdown dataclass with ttfb, text_aggregation, user_turn_secs fields - Accumulate MetricsFrame data during user→bot cycles - Reset accumulators on InterruptionFrame to discard stale metrics - Measure user_turn_secs from actual user silence (VAD timestamp - stop_secs) to turn release (UserStoppedSpeakingFrame) - Filter zero-value TTFB entries from startup metric resets - Add frame deduplication using bounded deque + set pattern - Update example 29 with latency breakdown display	2026-03-02 16:11:43 -05:00
Mark Backman	b1e55fd6c2	Merge pull request #3881 from pipecat-ai/mb/startup-observer Add StartupTimingObserver	2026-03-02 16:07:28 -05:00
Aleix Conchillo Flaqué	193f93c2ce	Update Nvidia example to use llama-3.3-70b-instruct model	2026-03-02 10:16:27 -08:00
Mark Backman	68e8732e72	Add BotConnectedFrame and on_transport_timing_report event Add BotConnectedFrame (SystemFrame) pushed by SFU transports (Daily, LiveKit, HeyGen, Tavus) when the bot joins the room. Replace the on_transport_readiness_measured event with on_transport_timing_report which includes both bot_connected_secs and client_connected_secs.	2026-03-02 13:10:09 -05:00
Mark Backman	0836066898	Add ClientConnectedFrame and transport readiness timing Introduce ClientConnectedFrame (SystemFrame) pushed by all transports when a client connects. StartupTimingObserver uses this to measure transport readiness — the time from StartFrame to first client connection — via a new on_transport_readiness_measured event.	2026-03-02 13:10:09 -05:00
Mark Backman	c54232bdb4	Add StartupTimingObserver for measuring processor start() times Tracks how long each processor start method takes during pipeline startup by measuring StartFrame arrive/leave deltas. Emits a timing report via the on_startup_timing_report event and auto-logs a summary. Internal pipeline processors are excluded from reports by default.	2026-03-02 10:48:50 -05:00
Gökmen Görgen	8ff3e21654	use new version of vf model.	2026-03-02 11:22:51 +01:00
zack	42f91a9056	Apply ruff formatting fixes	2026-03-01 11:44:37 -05:00
zack	d1cbc81108	Fix 07o example to use new min_turn_silence parameter name in docs and comments	2026-03-01 11:36:46 -05:00
zack	07ae4b8d38	Update AssemblyAI examples to use u3-rt-pro and improve 55d example - Update 13d-assemblyai-transcription.py to explicitly use u3-rt-pro model - Update 55d-update-settings-assemblyai-stt.py to demonstrate keyterms updates instead of language updates - Add helpful logging to show before/after keyterms boosting effect - Use difficult names (Xiomara, Saoirse, Krzystof) to demonstrate boosting effectiveness	2026-03-01 11:27:31 -05:00
zack	21a409e447	Update prompt warning and rename min_end_of_turn_silence_when_confident to min_turn_silence - Add "beta feature" note to custom prompt warning - Rename min_end_of_turn_silence_when_confident parameter to min_turn_silence across all AssemblyAI code - Update documentation, examples, and test files to use new parameter name	2026-03-01 11:17:39 -05:00
Mark Backman	950a8628dc	Miscellaneous foundational example updates	2026-02-27 19:49:45 -05:00
zack	d7ce1eedd9	Add foundational examples for AssemblyAI u3-rt-pro - 07o-interruptible-assemblyai.py: Basic example using Pipecat VAD mode - 07o-interruptible-assemblyai-stt.py: Advanced example using STT-controlled turn detection with comprehensive documentation on u3-rt-pro features (turn detection tuning, prompt-based enhancement, speaker diarization)	2026-02-27 17:58:18 -05:00
filipi87	0839e3813f	Refactoring the examples to use the new context summarization classes.	2026-02-27 18:42:39 -03:00
filipi87	69414e8a5a	Added example 54b-context-summarization-manual-openai.py demonstrating on-demand summarization triggered via a function call tool.	2026-02-27 18:42:23 -03:00
Mark Backman	712305c5b1	Add example 54c showing custom context summarization	2026-02-27 12:07:34 -05:00
filipi87	1f45e80f9d	Updated the 52-live-translation.py example to demonstrate the fix	2026-02-27 11:56:52 -03:00
kompfner	7fe458fe59	Merge pull request #3817 from pipecat-ai/pk/service-settings-fix-back-compat-for-nested-external-sdk-types Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings`…	2026-02-26 11:08:27 -05:00
Mark Backman	907ff58d41	Align Ultravox Realtime service with OpenAI/Gemini patterns - Add InterruptionFrame handling with stop_all_metrics() - Add processing metrics (start/stop) at response boundaries - Fix agent transcript handling for voice and text modalities: - Voice mode: push LLMTextFrame (append_to_context=False) and TTSTextFrame for deltas, skip duplicated final text - Text mode: push LLMTextFrame with proper response lifecycle, no TTSTextFrame (downstream TTS handles audio) - Add output_medium parameter to AgentInputParams and OneShotInputParams - Improve TTFB measurement using VAD speech end time - Update example with user turn strategies and transcript events - Add text-only output example (50a-ultravox-realtime-text.py)	2026-02-26 10:44:36 -05:00
Mark Backman	3ae173520e	Code review feedback	2026-02-26 10:23:35 -05:00
Mark Backman	d69a337def	Add text_aggregation_mode parameter to TTSService Move the sentence vs token aggregation concern into text aggregators so all text flows through them regardless of mode. This enables pattern detection and tag handling to work in TOKEN mode. - Add TextAggregationMode enum (SENTENCE, TOKEN) as the user-facing TTS setting, separate from the internal AggregationType - Add TOKEN mode support to Simple, SkipTags, and PatternPair aggregators - Add text_aggregation_mode parameter to TTSService and all TTS subclasses - Deprecate aggregate_sentences in favor of text_aggregation_mode - Merge TTSService._process_text_frame() into a single codepath	2026-02-26 08:55:41 -05:00
Paul Kompfner	8b6aa4b912	Unflatten `LiveOptions` back into a single `live_options` field on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings`; add `apply_update` override with delta-merge semantics and `from_mapping` override for backward-compatible dict-style updates	2026-02-25 18:25:11 -05:00
Paul Kompfner	a4b6db6fb4	Flatten `LiveOptions` into individual fields on `DeepgramSTTSettings` and `DeepgramSageMakerSTTSettings` for backward-compatible dict-style updates via `STTUpdateSettingsFrame`; during the big service settings refactor, we accidentally got rid of the ability to update individual `LiveOptions` fields with a sparse update	2026-02-25 17:39:31 -05:00
kompfner	a8cb2a26d1	Merge pull request #3841 from pipecat-ai/pk/groq-tweaks A few Groq-related tweaks:	2026-02-25 15:54:33 -05:00
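
Commits 8b6aa4b912 and a4b6db6fb4 above both concern sparse, dict-style updates to nested settings. The sketch below illustrates the general idea of a delta merge under stated assumptions; it is not Pipecat's implementation. `SttSettings`, its fields, and `merge_sparse_update` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Mapping


@dataclass(frozen=True)
class SttSettings:
    """Hypothetical settings object with one nested options mapping."""

    model: str = "base"
    language: str = "en"
    options: dict = field(default_factory=dict)


def merge_sparse_update(current: SttSettings, delta: Mapping[str, Any]) -> SttSettings:
    """Return a new settings object with only the keys present in `delta` changed.

    Nested `options` are merged key by key instead of being replaced wholesale,
    so a sparse update does not wipe out options it never mentioned.
    """
    merged_options = {**current.options, **delta.get("options", {})}
    top_level = {k: v for k, v in delta.items() if k != "options"}
    return replace(current, **top_level, options=merged_options)


before = SttSettings(options={"punctuate": True, "interim_results": True})
after = merge_sparse_update(before, {"language": "es", "options": {"punctuate": False}})
```

Here `after` keeps `model` and `interim_results` from `before`, while `language` and `punctuate` take the new values. Replacing the whole nested mapping instead would silently drop `interim_results`, which is the kind of regression a4b6db6fb4 describes.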

1714 Commits
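
Commit 18155b6a63 in the log above mentions "frame deduplication using bounded deque + set pattern". A minimal sketch of that general pattern follows, assuming each item can be identified by an integer id. The class name `RecentIdFilter` and its method are hypothetical and are not taken from the Pipecat source.

```python
from collections import deque


class RecentIdFilter:
    """Remember the most recent `maxlen` ids and flag repeats.

    The deque records arrival order so the oldest id can be evicted,
    and the set gives constant-time membership checks.
    """

    def __init__(self, maxlen: int = 100) -> None:
        self._order: deque = deque()
        self._seen: set = set()
        self._maxlen = maxlen

    def is_duplicate(self, item_id: int) -> bool:
        if item_id in self._seen:
            return True
        # Evict the oldest id first so memory use stays bounded.
        if len(self._order) >= self._maxlen:
            oldest = self._order.popleft()
            self._seen.discard(oldest)
        self._order.append(item_id)
        self._seen.add(item_id)
        return False


recent = RecentIdFilter(maxlen=2)
results = [recent.is_duplicate(i) for i in (1, 1, 2, 3, 1, 3)]
```

With `maxlen=2`, the id `1` is forgotten once `2` and `3` have been recorded, so its later reappearance is treated as new. The bound trades perfect deduplication for fixed memory use.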