pipecat

Author	SHA1	Message	Date
Gökmen Görgen	e25dccfc6b	update `aic-sdk` to `~=2.2.0` and rename `AICOUSTICS_LICENSE_KEY` to `AIC_LICENSE_KEY`.	2026-04-25 10:13:06 +02:00
Gökmen Görgen	3bbfc42854	remove adaptive audio enhancement example and support for runtime enhancement level updates in `AICFilter`.	2026-04-25 10:05:47 +02:00
Gökmen Görgen	3b2127f912	rename environment variables and references from `AICOUSTICS` to `AIC`.	2026-04-25 09:51:23 +02:00
Gökmen Görgen	ea12b10742	rename `mcp-aic-adaptive.py` to `mcp-aicoustics-adaptive.py`.	2026-04-25 09:51:23 +02:00
Gökmen Görgen	a2fbed86cf	add adaptive audio enhancement example and support for runtime enhancement level updates in `AICFilter`.	2026-04-25 09:51:23 +02:00
Gökmen Görgen	f75f361629	bump `aic-sdk` to 2.2.0 and update `AICFilter` with `model_id` and `enhancement_level` changes.	2026-04-25 09:51:23 +02:00
Filipi da Silva Fuchter	38a02271c5	Merge pull request #4368 from pipecat-ai/filipi/stt_service Fix issue where STTService unintentionally created a method with the same name as SegmentedSTTService.	2026-04-24 14:31:36 -03:00
filipi87	2ce203aeb8	Renaming the method to _maybe_reconnect_on_user_stopped_speaking.	2026-04-24 13:08:32 -03:00
filipi87	b30df95f13	Fix issue where STTService unintentionally created a method with the same name as SegmentedSTTService.	2026-04-24 13:00:38 -03:00
kompfner	6be8deee2a	Merge pull request #4361 from pipecat-ai/pk/pyright-fixes Some pyright fixes	2026-04-24 11:58:28 -04:00
Paul Kompfner	c113cacd59	refactor(types): name the LLMContext/OpenAI boundary with explicit cast helpers LLMContext's NotGiven, LLMContextToolChoice, and LLMStandardMessage are currently aliased to their OpenAI equivalents, so passing values between the two sides type-checks implicitly. That works today but obscures the fact that these are meant to be conceptually distinct — if LLMContext ever diverges from OpenAI's types, every implicit crossing would silently break. Introduce two module-private cast helpers in open_ai_adapter.py: - _openai_from_llm_context_tool_choice(tool_choice) - _openai_from_llm_standard_message(message) Both are typed no-ops today (implemented with typing.cast) but each carries a docstring explaining why the cast is present, and every boundary crossing now routes through a named function. Future readers (and future greps) can find the crossings; a later divergence becomes a mechanical find-and-update rather than hunting through adapter code. No behavior change, no pyright error delta.	2026-04-24 10:10:03 -04:00
Paul Kompfner	d0495eeef6	fix(types): narrow voice in SpeechmaticsTTSSettings to disallow None After widening TTSSettings.voice to str | None | _NotGiven (so other TTS services can opt into None as a valid "no voice" state), pyright flagged Speechmatics' URL builder receiving str | None where it required str. Speechmatics has no "no voice" mode (the URL path includes the voice name), so override the inherited field in SpeechmaticsTTSSettings to str | _NotGiven. The call site stays as a plain assert_given(...) without an extra None check.	2026-04-23 21:08:47 -04:00
Paul Kompfner	c3eb69165c	fix(types): accept SDK NotGiven in LLM Settings fields used for passthrough Three LLM services initialize certain Settings fields with the SDK's NOT_GIVEN (openai.NOT_GIVEN or anthropic.NOT_GIVEN) so the value flows unmodified into SDK API calls. The inherited field types from LLMSettings only admit pipecat's _NotGiven, so pyright flagged each constructor call as a flavor mismatch. Widen the field types in each service-specific Settings subclass so they accept both pipecat's _NotGiven (for delta-mode defaults) and the corresponding SDK NotGiven (for store-mode passthrough): - OpenAILLMSettings: frequency_penalty, presence_penalty, seed, temperature, top_p, max_tokens, max_completion_tokens. - OpenAIResponsesLLMSettings: temperature, top_p, max_completion_tokens. - AnthropicLLMSettings: temperature, top_k, top_p, thinking. Every overridden field is genuinely read from self._settings and passed directly to the SDK, so none of the overrides are vestigial. Clears 21 pyright errors and restores test_service_settings_complete parity with the pre-NOT_GIVEN-swap state.	2026-04-23 18:32:46 -04:00
Paul Kompfner	0302f6d05c	chore(pyright): drop newly-clean files from ignore list asyncai/tts and google/vertex/llm are now clean after the missing-None sweep (both benefited from the TTSSettings.voice / LLMSettings cascades). - src/pipecat/services/asyncai/tts.py - src/pipecat/services/google/vertex/llm.py	2026-04-23 18:18:00 -04:00
Paul Kompfner	b9ff333654	fix(types): admit None on settings fields that accept it as a default Service-specific Settings subclasses declared fields as T | _NotGiven (no None), but the services routinely pass None to those fields during init to mean "don't override; use the vendor's default". The field type just didn't reflect that a None value is valid, so pyright flagged every None at the call sites. Change the declarations to T | None | _NotGiven, matching the pattern already used by ServiceSettings.model and TTSSettings.language. No constructor-call changes; the default_factory stays NOT_GIVEN. Fields touched across 11 files: - services/settings.py: TTSSettings.voice (base class; covers asyncai, cartesia, elevenlabs, fish, hume, kokoro, lmnt, mistral, neuphonic, piper, resembleai, rime, xtts TTS services). - services/aws/llm.py: latency. - services/aws/tts.py: engine, pitch, rate, volume, lexicon_names. - services/azure/tts.py: emphasis, pitch, rate, role, style, style_degree, volume. - services/google/gemini_live/llm.py: vad. - services/google/llm.py: thinking. - services/google/stt.py: language_codes. - services/inworld/tts.py: speaking_rate, temperature. - services/openai/tts.py: instructions, speed. - services/speechmatics/stt.py: 13 fields (domain, operating_point, max_delay, end_of_utterance_, punctuation_overrides, _partials, split_sentences, enable_diarization, speaker_*, max_speakers, prefer_current_speaker, extra_params). - services/ultravox/llm.py: output_medium. Clears 94 pyright errors (1035 -> 941).	2026-04-23 18:18:00 -04:00
Paul Kompfner	92610944af	chore(pyright): drop newly-clean files from ignore list Three files no longer have pyright errors after the is_given / assert_given sweep — remove them from the ignore list (which serves as a live todo of files with remaining type errors). - src/pipecat/processors/gstreamer/pipeline_source.py - src/pipecat/services/camb/tts.py - src/pipecat/services/speechmatics/tts.py	2026-04-23 17:44:17 -04:00
Paul Kompfner	6a337f1bc6	fix(types): assert_given at store-mode settings read sites Apply assert_given across service modules to narrow reads from store-mode settings fields (self._settings.X, default_settings.X), where _NotGiven is declared in the field type but should never appear at runtime (enforced by validate_complete()). Two idioms used: - Inline wrap for single uses: func(assert_given(self._settings.enable_prompt_caching), ...) - Extract-and-reuse when the same value is used multiple times: thinking = assert_given(self._settings.thinking) if thinking: params["thinking"] = thinking.model_dump(...) 43 service files touched. Cleared ~172 pyright errors; remaining _NotGiven-related errors are in adjacent categories (flavor mismatch between openai/anthropic NotGiven and pipecat _NotGiven, settings field types that should allow None but don't) that need different fixes.	2026-04-23 17:39:17 -04:00
Filipi da Silva Fuchter	ef7fa07bf7	Merge pull request #4358 from pipecat-ai/filipi/fix_aiortc_sctp Fixed SmallWebRTC data channel silently stalling on networks with a 1280-byte MTU	2026-04-23 17:49:18 -03:00
filipi87	ce1506792e	Linking to the docs instead of full explanation.	2026-04-23 17:46:54 -03:00
Paul Kompfner	70f3d32734	feat(types): add assert_given for narrowing store-mode settings reads In store-mode settings objects, _NotGiven should never appear (the invariant enforced by validate_complete). But the declared field types still include _NotGiven because the same class doubles as delta mode, so every field read is typed X | None | _NotGiven and pyright flags operations that assume X | None. assert_given is a one-line extractor that narrows away _NotGiven and raises loudly if the invariant is violated, which is preferable to scattering is_given guards that defend against something that can't occur in practice. resolved_model = assert_given(self._settings.model) # str | None	2026-04-23 16:40:07 -04:00
Paul Kompfner	356618b448	fix(types): use is_given at call sites pyright flagged Replace direct identity checks against NOT_GIVEN with is_given() at sites where pyright's inability to narrow on non-singleton sentinels was causing type errors. - adapters/services/anthropic_adapter.py: narrow converted.system for _resolve_system_instruction. - services/openai/llm.py: narrow params.service_tier using OpenAI's is_given. - services/sarvam/llm.py: narrow tools / tool_choice using OpenAI's is_given (aliased as openai_is_given alongside the existing settings.is_given import). - services/sarvam/tts.py: narrow settings.voice using settings.is_given.	2026-04-23 16:15:07 -04:00
Paul Kompfner	1624d7a474	feat(types): add is_given TypeGuard helpers for NotGiven sentinels Pyright can't narrow identity checks against module-level NotGiven sentinels (they aren't typed as singletons), which leaves many NotGiven-bearing unions stuck as unnarrowed types throughout the codebase. Introduce is_given TypeGuard helpers so narrowing works via isinstance under the hood. Each helper is co-located with the NotGiven flavor it guards: - services/settings.py: upgrade the existing is_given to a TypeGuard. - processors/aggregators/llm_context.py: add an is_given for LLMContext's NotGiven. Treat LLMContext's re-exported types (LLMStandardMessage, LLMContextToolChoice, NOT_GIVEN, NotGiven) as LLMContext's own — independent definitions that happen to coincide with OpenAI's as an implementation detail. - adapters/services/anthropic_adapter.py: add is_given for anthropic's NotGiven. - adapters/services/open_ai_adapter.py: add is_given for openai's NotGiven.	2026-04-23 15:33:43 -04:00
Paul Kompfner	092b1dcb0f	fix(types): widen TLLMInvocationParams bound to Mapping[str, Any] TypedDict types are not subtypes of dict[...] in the type system (per PEP 589), so TypedDict-based invocation param classes could not satisfy the TypeVar bound. Mapping[str, Any] accepts TypedDicts while preserving the "string-keyed mapping" constraint.	2026-04-23 14:35:59 -04:00
Mark Backman	b90ea9bf6a	Merge pull request #4352 from pipecat-ai/mb/pyright-fixes-1-per-file More pyright fixes	2026-04-23 14:14:36 -04:00
kompfner	05c97804d5	Merge pull request #4359 from pipecat-ai/pk/changelog-4355-rename chore: rebind Gemini Live reconnect changelog fragment to PR #4355	2026-04-23 14:10:36 -04:00
Paul Kompfner	7a8357a569	chore: rebind Gemini Live reconnect changelog fragment to PR #4355 The original contributor's PR (#4328) landed as #4355. Rename the fragment so the rendered changelog links to the merged PR, and add the leading `- ` bullet prefix that towncrier expects.	2026-04-23 12:00:56 -04:00
filipi87	44756de15a	Adding changelog for the SmallWebRTC fix.	2026-04-23 12:19:56 -03:00
filipi87	94304ec74e	Fixed SmallWebRTC data channel silently stalling on networks with a 1280-byte MTU.	2026-04-23 12:18:33 -03:00
kompfner	a3fe34f4a2	Merge pull request #4355 from pipecat-ai/pk/gemini-live-context-reseed-on-reconnect Re-seed Gemini Live context on reconnect without session resumption	2026-04-23 11:00:22 -04:00
Sathwika Reddy Geereddy	21f6c2afa5	Update NVIDIA STT services for Nemotron Speech defaults and config parity (#4269) * Update NVIDIA STT services for Nemotron Speech defaults and config parity * Add changelog entry for PR #4269 * initialize boosted LM settings defaults in streaming STT * Align NVIDIA STT language handling with other STT services * add finalised flag to Nvidia stt final transcripts, remove processing latency logs * Changing interim transcription logging to tracing. --------- Co-authored-by: sathwika <geereddysath@nvidia.com> Co-authored-by: filipi87 <filipi87@gmail.com>	2026-04-23 09:01:27 -04:00
Filipi da Silva Fuchter	4d14251f4a	Merge pull request #4354 from pipecat-ai/filipi/includes_inter_frame_spaces feat(tts): add includes_inter_frame_spaces flag to word-timestamp API - follow-up	2026-04-23 08:49:26 -03:00
Paul Kompfner	1421c4ba22	fix: handle Gemini Live 2.5 quirks when re-seeding context on reconnect Extends the reconnect re-seeding fix to work cleanly on Gemini Live 2.5, which has stricter seed requirements than 3.x and a documented audio-input / history-recall limitation. Both initial connection and reconnect now share a single code path (`_create_initial_response(for_reconnect=...)`), with four well-documented cases. On Gemini 2.5 reconnect, `turn_complete=True` is now forced on the seed so the model produces a recap-style response immediately instead of briefly acting "forgetful" on the user's next utterance — the latter being especially jarring mid-conversation. When a 2.5 seed doesn't already end with a user turn (e.g. the bot had finished speaking before the disconnect), a blank user turn is appended to satisfy the server's seed-shape requirement. Gemini 3.x needs neither workaround.	2026-04-22 15:58:54 -04:00
filipi87	6b1d8d9fa5	Fixing merge conflicts.	2026-04-22 15:22:32 -03:00
filipi87	ac810e57ed	Merge branch 'main' into filipi/includes_inter_frame_spaces # Conflicts: # uv.lock	2026-04-22 15:22:06 -03:00
filipi87	bba7ca80e3	Bumping to small-webrtc-prebuilt 2.5.0 to fix karaoke highlighting.	2026-04-22 15:20:37 -03:00
filipi87	79250f1fe0	Making includes_inter_frame_spaces optional for word-timestamp.	2026-04-22 14:20:30 -03:00
Mark Backman	4f6e76e6fd	Add changelog entries for #4352	2026-04-22 12:23:33 -04:00
Mark Backman	b0962861c8	Acknowledge Tkinter's GC-reference idiom with a scoped type ignore Tkinter's `Label` only stores `PhotoImage` references at the C level, so Python GC eats them unless something on the Python side keeps a reference. The canonical fix is to stash the reference on the widget itself: `label.image = photo`. Tkinter widgets are plain Python objects, so the assignment works at runtime, but the stub declares no `image` attribute (correctly — there isn't one; we're adding it). Narrow the suppression to `# type: ignore[attr-defined]` on the one line. The existing comment above the assignment already documents why.	2026-04-22 12:19:16 -04:00
Mark Backman	ec7c35fe98	Move Mistral message fixups into MistralLLMAdapter Mistral imposes three conversation-history quirks on top of the OpenAI-compatible wire format: tool messages must be followed by an assistant message; non-initial system messages are rejected; trailing assistant messages require `prefix=True`. These rules were applied inline in `MistralLLMService.build_chat_completion_params`, which is the wrong layer — every other provider with OpenAI-compatible-but-quirky shape (Perplexity, etc.) owns its transformations in a `BaseLLMAdapter` subclass that runs during `get_llm_invocation_params`. Create `MistralLLMAdapter(OpenAILLMAdapter)` on the Perplexity template and wire it in via the existing `adapter_class` dispatch. The service now only handles Mistral-specific request-level mapping (`random_seed` in place of `seed`), and the message shape concerns live with other provider format logic. No behavior change. The transform function casts to `list[dict[str, Any]]` internally because mutating `role` and attaching Mistral's non-standard `prefix` field both step outside OpenAI's TypedDict contract; the cast at the return boundary encodes that we're emitting Mistral's extended schema, not OpenAI's.	2026-04-22 12:17:46 -04:00
Mark Backman	10b86b4bbe	Coerce inspect.getdoc() None to empty string before parsing `inspect.getdoc()` returns `str | None`, but `docstring_parser.parse()` requires `str`. Functions without a docstring produced `None`, which the type checker correctly flagged. Coerce to `""` at the call site. `docstring_parser.parse("")` returns an empty docstring whose `.description` and `.params` are already handled by the surrounding `or ""` fallbacks, so runtime behavior is unchanged.	2026-04-22 12:01:00 -04:00
Mark Backman	8ec56092c0	Remove duplicate ResponseCreated type	2026-04-22 11:58:15 -04:00
Mark Backman	0c3c5e5c7d	Widen ToolsSchema.standard_tools to Sequence for covariance `ToolsSchema.__init__` declared `standard_tools: list[FunctionSchema | DirectFunction]`. Callers (`BaseLLMAdapter`, `MCPService`) pass in `list[FunctionSchema]`, which is not assignable to the union list because `list` is invariant in its element type. Widen the parameter to `Sequence[...]` (covariant) so `list[X]` and `list[X | Y]` both fit. A narrower `list[FunctionSchema]` is still accepted, and nothing in this class mutates the argument: the constructor immediately copies it via `_map_standard_tools`. Also correct the `custom_tools` property return type to include `None`, matching the stored `_custom_tools` field. This single edit clears the pyright errors for three ignore-list entries: `tools_schema.py`, `base_llm_adapter.py`, and `mcp_service.py`.	2026-04-22 11:54:20 -04:00
Mark Backman	b64ed3f9e2	Narrow settings.model at service boundaries, not via truthiness Two services were reading `_settings.model` (typed `str | _NotGiven | None` because NOT_GIVEN is the default) and coercing it with `or ""` or similar. `_NotGiven.__bool__` returns False, so the runtime behavior happened to work, but the type was a lie: pyright saw `str | _NotGiven` flowing into APIs that required `str` or `str | None`. - `AIService._sync_model_name_to_metrics`: use `isinstance(model, str)` narrowing with an empty-string fallback. Equivalent runtime behavior, honest type, no truthiness dependency on a sentinel. - `SarvamLLMService.__init__`: validate the model is a real string before handing it to `_validate_model(str)`. A non-string model at this point is a configuration bug; raise `ValueError` so the error is clear and survives `python -O` (unlike an assert).	2026-04-22 11:52:20 -04:00
Mark Backman	5872006d6b	Encode lazy-init invariants at the right site, not at read sites Three spots had the same shape: a field starts None, a later method populates it, a read site later reads it. Pyright can't track the cross-method invariant. Rather than spray assertions at the read sites, fix each site at the structural level: - `FastAPIWebsocketInputTransport._monitor_websocket` now takes the session timeout as an argument. The task-creation site already guards on truthiness, so the call can pass the non-None value directly and the method's signature tells the truth. - `FrameProcessorMetrics.task_manager` raises `RuntimeError` instead of asserting. Asserts are stripped under `python -O`; a real raise keeps the runtime safety net and still narrows the type for pyright. - `SOXRStreamAudioResampler._maybe_initialize_sox_stream` returns the initialized stream. Callers use the return value and never touch the Optional `_soxr_stream` attribute, so narrowing stays inside the init method where the invariant is established.	2026-04-22 11:45:18 -04:00
Mark Backman	457eb7aa92	Mark abstract image/vision generators as real async generators `ImageGenService.run_image_gen` and `VisionService.run_vision` were declared `async def ... -> AsyncGenerator[Frame, None]` with `pass` bodies. Without a `yield` anywhere in the body, Python treats the function as a coroutine returning an `AsyncGenerator`, not as an async generator itself, so callers got a coroutine where they expected an iterator. Add `raise NotImplementedError; yield` so the body contains a yield (making this a real async generator) while still raising cleanly if a subclass ever calls `super().run_*` by mistake.	2026-04-22 11:19:23 -04:00
Mark Backman	14cd476b20	Drop pyright ignores for services fixed by run_stt/run_tts widening Deepgram STT, Gradium TTS, Smallest STT, and xAI STT/TTS had exactly one pyright error each, all of them the AsyncGenerator return-type mismatch resolved in `08fe9157c`. Remove them from the ignore list.	2026-04-22 11:09:27 -04:00
Mark Backman	3b0affe5b4	Guard run_stt WebSocket sends with try/except AssemblyAI, Cartesia, Gradium, and Soniox STT services sent audio over the WebSocket without catching transient send failures, so a single network hiccup could propagate an exception up through process_frame and end the pipeline. Other push-based STT services (Deepgram, xAI, Azure, Smallest, etc.) already guard their sends. Follow the deepgram/stt.py pattern: log a warning and continue. The existing connection-state check at the top of each call handles recovery on the next invocation.	2026-04-22 11:03:41 -04:00
Mark Backman	08fe9157cc	Widen run_stt/run_tts return type to AsyncGenerator[Frame | None, None] The push-based STT/TTS implementations send audio/text over a socket and receive results via a separate receive task, so there is nothing to yield inline. They yield `None` by design. The previous declaration of `AsyncGenerator[Frame, None]` disagreed with that, while the consumer (`AIService.process_generator`) already accepted `Frame | None`. Widen the producer side (abstract base and every subclass) so the type honestly describes the contract. Pure annotation change; no runtime behavior difference.	2026-04-22 11:01:50 -04:00
Mark Backman	3f3d3c9203	Merge pull request #4337 from pipecat-ai/mb/fix-speech-stop-strategy Split user-turn stop timeout into independent speech and STT timers	2026-04-22 10:23:03 -04:00
Mark Backman	6b6896a543	Merge pull request #4350 from pipecat-ai/mb/pyright-precise-ignore-list Expand pyright coverage to full src/pipecat with per-file ignores	2026-04-22 09:56:59 -04:00
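
The coroutine-versus-async-generator distinction described in commit 457eb7aa92 can be reproduced with the standard library alone. The sketch below is illustrative only: `BrokenService`, `FixedService`, and `consume` are hypothetical stand-ins, not pipecat classes, and `str` takes the place of `Frame`.

```python
import asyncio
import inspect
from collections.abc import AsyncGenerator


class BrokenService:
    # Hypothetical stand-in: no `yield` in the body, so Python compiles this
    # as a coroutine function despite the AsyncGenerator annotation.
    async def run(self) -> AsyncGenerator[str, None]:
        pass


class FixedService:
    # Hypothetical stand-in: the unreachable `yield` turns the function into
    # an async generator function while the body still raises when iterated.
    async def run(self) -> AsyncGenerator[str, None]:
        raise NotImplementedError
        yield  # never reached; only changes the kind of function


async def consume(service) -> str:
    """Iterate service.run() and report which exception, if any, surfaced."""
    try:
        async for _ in service.run():
            pass
    except NotImplementedError:
        return "NotImplementedError"
    except TypeError:
        # `async for` over a coroutine object fails before the body ever runs.
        return "TypeError"
    return "ok"


if __name__ == "__main__":
    print(inspect.isasyncgenfunction(BrokenService.run))
    print(inspect.isasyncgenfunction(FixedService.run))
    print(asyncio.run(consume(BrokenService())))
    print(asyncio.run(consume(FixedService())))
```

With the `pass` body, the caller's `async for` fails with a `TypeError` because it receives a coroutine object; with the `raise` plus `yield` body, iteration starts and the `NotImplementedError` reaches the caller.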

9179 Commits