pipecat

Author	SHA1	Message	Date
Paul Kompfner	ea89819ece	chore: update previous_response_id comment	2026-03-20 21:34:22 -04:00
Paul Kompfner	c66a5a8ede	feat: set store=False and add run_inference tests Set store=False in Responses API calls since we send full conversation history as input items and don't use previous_response_id. Add 5 run_inference tests for OpenAIResponsesLLMService using real LLMContext and adapter (only HTTP client mocked).	2026-03-20 21:34:22 -04:00
Paul Kompfner	cd2886a4a8	chore: add note about previous_response_id and empty input handling	2026-03-20 21:34:22 -04:00
Paul Kompfner	312837a1d4	test: add run_inference tests for OpenAIResponsesLLMService Uses real LLMContext and adapter (only HTTP client is mocked) to test basic inference, client exception propagation, system_instruction override, empty context fallback, and max_tokens override.	2026-03-20 21:34:22 -04:00
Paul Kompfner	4d4e56cfef	test: add run_inference tests for OpenAIResponsesLLMService Tests cover basic inference, client exception propagation, system_instruction override, and max_tokens override.	2026-03-20 21:34:22 -04:00
Paul Kompfner	05e1d9f514	docs: add changelog for OpenAI Responses API service	2026-03-20 21:34:22 -04:00
Paul Kompfner	4d548117fa	feat: add OpenAI Responses API LLM service Add OpenAIResponsesLLMService using the Responses API, with a dedicated adapter that converts LLMContext messages to Responses API input items (system→developer, tool_calls→function_call, tool→function_call_output, multimodal content conversion, and tools schema flattening). - New adapter: open_ai_responses_adapter.py - New service: openai/responses/llm.py - Examples: 07-interruptible and 14-function-calling variants - 19 unit tests for adapter conversion logic - Eval entries for both examples	2026-03-20 21:34:22 -04:00
Paul Kompfner	b5c2d41ba3	Remove changelog fragment that no longer applies after a rebase	2026-03-20 21:34:22 -04:00
Paul Kompfner	dba2fc5451	Clarify SyncParallelPipeline docstrings Rewrite docstrings to more clearly explain what SyncParallelPipeline does: hold all output until every parallel branch finishes, so frames produced in response to a single input are released together.	2026-03-20 21:34:22 -04:00
Paul Kompfner	0a4acfa294	Add frame_order parameter to SyncParallelPipeline Adds a FrameOrder enum with ARRIVAL (default, existing behavior) and PIPELINE (pushes frames in pipeline definition order). This lets callers guarantee output ordering between parallel pipelines — e.g. ensuring image frames precede audio frames — without needing a separate reordering processor downstream. Updates the 05-sync-speech-and-image example to use FrameOrder.PIPELINE, removing the ImageBeforeAudioReorderer class entirely.	2026-03-20 21:34:22 -04:00
Paul Kompfner	ffdf629535	Add changelog entry for Whisker debugger fix	2026-03-20 21:34:22 -04:00
Paul Kompfner	a6b94c7424	Add changelog entries for PR #4029	2026-03-20 21:34:22 -04:00
Paul Kompfner	d2341e0199	Add ImageBeforeAudioReorderer to sync-speech-and-image example Add a processor after SyncParallelPipeline that ensures each image frame precedes its corresponding TTS audio frames. SyncParallelPipeline batches them together but doesn't guarantee branch ordering. The reorderer detects when TTS frames arrive before their image (via context_id tracking) and holds them until the image arrives. Also rename ImageAudioSync to MarkImageForPlaybackSync for clarity.	2026-03-20 21:34:22 -04:00
Paul Kompfner	4b66dd444b	Revert a couple of logs that were changed from `trace` to `debug` just for debugging	2026-03-20 21:34:22 -04:00
Paul Kompfner	7b859423ab	Use TextAggregationMode.TOKEN in the 05-sync-speech-and-image example since the SentenceAggregator already provides complete sentences.	2026-03-20 21:34:22 -04:00
Paul Kompfner	b68495ce0a	Add sync_with_audio support for OutputImageRawFrame Add a `sync_with_audio` field to `OutputImageRawFrame` that routes image frames through the audio queue in the output transport, ensuring images are only displayed after all preceding audio has been sent. This enables proper audio/image synchronization in pipelines like the calendar month narration example. Update the 05-sync-speech-and-image example to use an `ImageAudioSync` processor that sets this flag on image frames.	2026-03-20 21:34:22 -04:00
Paul Kompfner	f39472b150	Fix SyncParallelPipeline race condition with concurrent SystemFrame processing The FrameProcessor two-queue architecture processes SystemFrames and non-SystemFrames on separate concurrent async tasks. Both paths called SyncParallelPipeline.process_frame(), which used the same per-pipeline sink queues. A SystemFrame's wait_for_sync could steal frames from a concurrent non-SystemFrame's wait_for_sync, corrupting synchronization and stalling the pipeline. This was triggered by the auto-embedded RTVI processor (added in v0.0.101) which floods OutputTransportMessageUrgentFrame SystemFrames through the pipeline during LLM responses. Fix: SystemFrames (except EndFrame) now take a fast path — passed through internal pipelines and pushed downstream directly without touching the sink queues or drain logic. EndFrame retains the full drain behavior as a lifecycle frame.	2026-03-20 21:34:21 -04:00
Paul Kompfner	a8ea176ea3	Minor comment typo fix	2026-03-20 21:34:21 -04:00
Paul Kompfner	12cb9599ad	Fix bug resulting in `SyncParallelPipeline` breaking the Whisker debugger	2026-03-20 21:34:21 -04:00
filipi87	167f008e47	Mentioning the frame order fix in the changelog.	2026-03-20 21:34:21 -04:00
filipi87	fe8cb2f4e0	Always appending TTSTextFrame to the audio context.	2026-03-20 21:34:21 -04:00
filipi87	cdf44f7a3f	Fixing the frame ordering of the AggregatedTextFrame.	2026-03-20 21:34:21 -04:00
filipi87	d32a8a9ee2	Fixing TTS frame order.	2026-03-20 21:34:21 -04:00
joachimchauvet	ed160fd2e0	fix(livekit): suppress InvalidState log spam from audio mixer during interruptions	2026-03-20 21:34:21 -04:00
aconchillo	84eddb64d5	Update changelog for version 0.0.106	2026-03-20 21:34:21 -04:00
Aleix Conchillo Flaqué	189249caec	Add missing on_dtmf_event callback to Tavus transport The on_dtmf_event callback was added to DailyCallbacks in #4047 but the Tavus transport was not updated, causing a missing argument error.	2026-03-20 21:34:21 -04:00
Filipi da Silva Fuchter	3c90468e03	Fixed the ordering of `_maybe_pause_frame_processing` call in `TTSService` (#4071) * Fixing the invocation of pause_frame_processing at the correct time when receiving LLMFullResponseEndFrame and EndFrame.	2026-03-20 21:34:21 -04:00
Mark Backman	98d3f697f1	Add WakePhraseUserTurnStartStrategy (#4064) - Add WakePhraseUserTurnStartStrategy for gating interaction behind wake phrase detection, with timeout and single_activation modes - Add default_user_turn_start_strategies() and default_user_turn_stop_strategies() helper functions - Deprecate WakeCheckFilter in favor of the new strategy - Extend ProcessFrameResult to stop strategies for short-circuit evaluation - Fix MinWordsUserTurnStartStrategy including filtered text in output	2026-03-20 21:34:21 -04:00
Mark Backman	b9d996ff41	Improvements for Nova Sonic LLM and TTS output frames (#4042) * Fix empty user transcription causing spurious interruption in Nova Sonic Skip _report_user_transcription_ended() when _user_text_buffer is empty, which happens when the initial prompt is text-only. Previously, an empty TranscriptionFrame was pushed upstream, triggering a chain reaction: on_user_turn_stopped → UserStartedSpeakingFrame → interruption → premature BotStoppedSpeaking → multiple response start/stop cycles. * Improve TextFrame and assistant end of turn logic Now, SPECULATIVE text results are used to push the LLMTextFrame, AggregatedTextFrame, and TTSTextFrame. Additionally, the TTSTextFrames are pushed at the end of the corresponding audio segment. * Remove BotStoppedSpeakingFrame fallback from Nova Sonic Now that assistant response end is detected directly from Nova Sonic contentEnd events (END_TURN and INTERRUPTED), the BotStoppedSpeakingFrame handler is no longer needed. Inline the cleanup logic in reset_conversation.	2026-03-20 21:34:21 -04:00
Mark Backman	5de4256ab1	GradiumSTTService improvements (#4066) * Remove duplicate reconnection logic from Gradium STT The _receive_messages method had its own while-True reconnect loop, duplicating the reconnection handling already provided by WebsocketService._receive_task_handler (exponential backoff, max retries, error reporting). Flatten to just the inner message loop and let the base class handle reconnection. * Align Gradium STT VAD handling with base class patterns Replace the process_frame override with a _handle_vad_user_stopped_speaking override, which is the proper hook provided by STTService. Move start_processing_metrics() into run_stt (matching Gladia's pattern). Remove unused FrameDirection and VADUserStartedSpeakingFrame imports. * Add transcript aggregation delay after flushed to capture trailing tokens Gradium flushed response can arrive before all text tokens have been delivered. Instead of finalizing immediately on flushed, start a short timer (100ms) that allows trailing tokens to accumulate before pushing the final TranscriptionFrame. * Add changelog for PR #4066 * Change default encoding to pcm_16000 * Decouple encoding from sample_rate in Gradium STT The encoding parameter now takes just the base type (pcm, wav, opus) and the sample rate is derived from the pipeline audio_in_sample_rate, assembled dynamically via input_format_from_encoding(). This fixes the mismatch where SAMPLE_RATE=24000 was passed to the base class while encoding defaulted to pcm_16000.	2026-03-20 21:34:21 -04:00
Mark Backman	e2e0d9f8c4	fix: pass list-type Deepgram settings as lists instead of stringifying List-valued settings like keyterm, keywords, search, redact, and replace were being converted to strings before being passed to the SDK connect() method. The SDK expects lists so its encode_query can produce repeated query params (keyterm=a&keyterm=b).	2026-03-20 21:34:21 -04:00
Mark Backman	4c10fab0c9	Add changelog for #4046	2026-03-20 21:34:21 -04:00
Mark Backman	b610ba0aa5	Fix OpenAI STT crash when language is a plain string instead of Language enum	2026-03-20 21:34:21 -04:00
Mark Backman	d7d6ad6e96	Fix SonioxSTTService crash when language_hints contains plain strings (#4045) Refactor language_to_soniox_language to use resolve_language + LANGUAGE_MAP pattern consistent with other services. Fix resolve_language fallback to use str(language) instead of language.value so plain strings don't crash.	2026-03-20 21:34:21 -04:00
Mark Backman	7eedd5929d	Add changelog for #4026	2026-03-20 21:34:21 -04:00
Mark Backman	490e460c4b	Fix DeepgramSTTService base_url forcing HTTPS/WSS schemes The base_url parameter previously forced wss:// and https:// schemes, breaking air-gapped or private deployments that need ws:// or http://. Extract URL derivation into _derive_deepgram_urls() helper that respects the developer's scheme choice while deriving the paired WebSocket and HTTP URLs the Deepgram SDK requires. Closes #4019	2026-03-20 21:34:21 -04:00
Mark Backman	e1ce74c7a5	Fix deprecation warning when using filter_incomplete_user_turns	2026-03-20 21:34:21 -04:00
Mark Backman	5faac08d36	docs: add changelog for #4058	2026-03-20 21:34:21 -04:00
Mark Backman	4171a75f79	fix: resolve raw language strings through Language enum for proper service conversion Raw strings like "de-DE" passed as the language parameter to TTS/STT services were bypassing the Language enum resolution logic, causing silent failures (e.g. ElevenLabs expects "de" not "de-DE"). Now raw strings are first converted to Language enums so they go through the same resolve_language() path, with a warning logged for unrecognized strings.	2026-03-20 21:34:21 -04:00
Mark Backman	fa345a510f	Add changelog for #4057	2026-03-20 21:34:21 -04:00
Mark Backman	55fb274d5a	Fix stale state in user turn stop strategies between turns Reset stop strategies at turn start (not just turn stop) so that late transcriptions arriving between turns do not leave stale _text that causes premature stops on the next turn. Also cancel pending timeout tasks in reset() for both SpeechTimeout and TurnAnalyzer strategies.	2026-03-20 21:34:21 -04:00
Mark Backman	fffb16ad39	Update uv.lock with pyasn1 v0.6.3	2026-03-20 21:34:20 -04:00
Mark Backman	9a32364b34	feat: add enable_dialout parameter to configure() for dial-out rooms Expose enable_dialout as a configure() parameter (default False) so dial-out examples can opt in without needing to build DailyRoomProperties manually.	2026-03-20 21:34:20 -04:00
Mark Backman	732afde3ea	fix: clean up configure() type hints, deduplicate token expiry, and improve comment Narrow misleading Optional type hints on parameters that never accept None, extract the duplicated token_exp_duration * 60 * 60 calculation, remove unnecessary forward-reference quotes on DailyMeetingTokenProperties, and clarify why enable_dialout is explicitly set to False.	2026-03-20 21:34:20 -04:00
copilot-swe-agent[bot]	e5215a636f	fix: set enable_dialout to False in PSTN runner to prevent room creation failures Co-authored-by: jamsea <614910+jamsea@users.noreply.github.com>	2026-03-20 21:34:20 -04:00
copilot-swe-agent[bot]	c0bc94a9ce	Initial plan	2026-03-20 21:34:20 -04:00
Julien Vantyghem	d26f512ba3	update docstring following https://github.com/pipecat-ai/pipecat/pull/3916	2026-03-20 21:34:20 -04:00
Blaine Kasten	fe84a881dd	turn off server vad	2026-03-20 11:17:38 -05:00
Blaine Kasten	591c02fb0e	a few updates	2026-03-19 13:37:21 -05:00
Blaine Kasten	077610184d	Add together STT and TTS services	2026-03-17 07:24:02 -05:00
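
Note on e2e0d9f8c4: the message says list-valued settings must reach the SDK as lists so they can be encoded as repeated query params (keyterm=a&keyterm=b). The sketch below illustrates that difference with Python's standard urllib.parse.urlencode only; it is not the Deepgram SDK's encode_query, and the settings dict is a made-up example.

```python
from urllib.parse import urlencode

# Hypothetical settings dict, for illustration only.
settings = {"keyterm": ["a", "b"]}

# Passing the list through unchanged lets the encoder emit one
# query parameter per element (doseq=True expands sequences).
repeated = urlencode(settings, doseq=True)

# Stringifying the list first (the behavior the commit describes as the bug)
# collapses it into a single, percent-encoded parameter.
stringified = urlencode({key: str(value) for key, value in settings.items()})

print(repeated)     # keyterm=a&keyterm=b
print(stringified)  # one keyterm parameter containing the encoded list repr
```

The repeated form is what a server expecting multi-valued parameters can parse; the stringified form carries the Python list repr as a single opaque value.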

8403 Commits