pipecat

Author	SHA1	Message	Date
Mark Backman	58a038ddb2	Add Soniox real-time TTS service Introduce SonioxTTSService, a WebSocket TTS provider that streams text and receives audio over a persistent connection, multiplexing up to 5 concurrent streams per socket via Soniox's `stream_id`. Also updates the README service table and the Soniox voice example to use the new TTS end-to-end.	2026-04-27 16:04:02 -04:00
Gökmen Görgen	3b2127f912	rename environment variables and references from `AICOUSTICS` to `AIC`.	2026-04-25 09:51:23 +02:00
Gökmen Görgen	f75f361629	bump `aic-sdk` to 2.2.0 and update `AICFilter` with `model_id` and `enhancement_level` changes.	2026-04-25 09:51:23 +02:00
Mark Backman	d8f5c0be71	Add XAITTSService for xAI streaming WebSocket TTS Adds XAITTSService in the existing xai/tts.py module, alongside the existing XAIHttpTTSService. Connects to xAI's streaming endpoint at wss://api.x.ai/v1/tts, streams text.delta chunks up and base64 audio.delta chunks down on the same connection so audio starts flowing before the full utterance is synthesized. Extends InterruptibleTTSService since xAI's protocol is strictly sequential per connection and exposes neither a cancel verb nor a context ID — the only way to stop an in-flight utterance is to tear down the WebSocket, which is exactly what InterruptibleTTSService does on interruption when the bot is speaking. Voice, language, codec, and sample_rate are passed as query-string params at connect time; runtime setting changes reconnect the socket. Defaults to raw PCM so emitted TTSAudioRawFrame objects need no decoding downstream. Splits the existing example into voice-xai.py (WebSocket) and voice-xai-http.py (batch HTTP) so each variant has its own entry point. Promotes the xai extra to depend on pipecat-ai[websockets-base] since the new service imports the websockets library.	2026-04-21 15:48:26 -04:00
Mark Backman	58a17c7b1b	Include examples in type checking Remove `examples/` from the `pyrightconfig.json` ignore list and fix the resulting type errors across all example files. Common fixes: - Required API keys: `os.getenv("X")` -> `os.environ["X"]` so the return type is `str` rather than `str \| None`, and misconfiguration fails fast. - Narrow `LLMContextMessage` union members with `isinstance(..., dict)` before dict-style access. - `assert isinstance(params.llm, ...)` before calling service-specific methods that aren't on the base `LLMService`. - Guard optional frame fields (e.g. `LLMSearchResponseFrame.search_result`) before use.	2026-04-21 15:43:31 -04:00
Mark Backman	c091232f2f	Add xAI streaming STT service New `XAISTTService` wraps xAI's real-time speech-to-text WebSocket (`wss://api.x.ai/v1/stt`). It extends `WebsocketSTTService`, authenticates with the `XAI_API_KEY` as a Bearer token on the WS handshake, and streams raw audio (PCM/mu-law/A-law) with configurable interim results, endpointing, language, multichannel, and diarization settings. - `src/pipecat/services/xai/stt.py`: new service, settings dataclass, and `language_to_xai_stt_language` helper. - `src/pipecat/services/stt_latency.py`: `XAI_TTFS_P99` default. - `pyproject.toml` / `uv.lock`: `xai` extra now pulls in `websockets-base`. - `README.md`: link to xAI STT in the services table. - `examples/voice/voice-xai.py`: swap DeepgramSTTService for XAISTTService so the xAI voice example is fully xAI. - `examples/transcription/transcription-xai.py`: new transcription-only example using the new service.	2026-04-21 13:45:34 -04:00
Garegin Harutyunyan	4c19f5584c	VIVA SDK TT v3 support (#4252 ) * VIVA SDK TT v3 support * Format fix. * Renamed the API naming, removed '3' from the name. * Implementation of User turn start strategy using Krisp VIVA Interruption Prediction in scope of TT v3 support. * Typo fix in voice-krisp-viva example to use KrispVivaFilter class * style fix. * test run error fixes. * some test related changes. * Fixed tests * Stule fixes.	2026-04-17 07:53:41 -04:00
Mark Backman	68a3070ad4	Add Mistral Voxtral Realtime STT service	2026-04-07 15:26:56 -04:00
Mark Backman	aa7a014518	Add mistral voice example	2026-04-07 12:32:06 -04:00
Mark Backman	0c59819682	Remove allow_interruptions from voice-sarvam example This was missed from the allow_interruptions removal commit.	2026-04-02 11:32:44 -04:00
Harshita Jain	bd6cbd7fe7	feat: add Smallest AI STT service integration (#4162 ) Add SmallestSTTService using the Pulse WebSocket API for real-time transcription. Includes SmallestSTTSettings dataclass, 32-language support with resolve_language fallback, VAD-driven finalize signal, and SMALLEST_TTFS_P99 latency constant. Also adds X-Source and X-Pipecat-Version headers to Smallest STT and TTS WebSocket connections.	2026-04-01 13:44:04 -04:00
Mark Backman	d3021b4590	Rename example files to prepend parent folder name, preventing package shadowing Example files like openai.py shadow installed packages when Python adds the script directory to sys.path. Prepend the parent folder name to each example file (e.g. openai.py -> function-calling-openai.py). Also split thinking-and-mcp/ into separate mcp/ and thinking/ directories.	2026-03-31 22:06:01 -04:00
Mark Backman	47b41a0ff7	Rename services/ to voice/ and function-calling/, flatten to top level Replace the nested services/speech/ and services/function-calling/ with top-level voice/ and function-calling/ directories. Update eval script paths and README to match.	2026-03-31 15:20:03 -04:00

13 Commits