Merge pull request #4423 from joycech333/feat/inception-llm-service

feat: add Inception LLM service with Mercury 2 support
Code review fixes
2026-05-21 12:02:27 -04:00 · 2026-05-21 11:45:17 -04:00 · 2026-05-21 11:23:23 -04:00 · 2026-05-21 08:35:46 -04:00 · 2026-05-21 08:35:31 -04:00 · 2026-05-21 08:35:15 -04:00
30 changed files with 513 additions and 31 deletions
--- a/.claude/skills/squash-commits/SKILL.md
+++ b/.claude/skills/squash-commits/SKILL.md
@@ -0,0 +1,91 @@
+---
+name: squash-commits
+description: "Reorganize messy branch commits into a small set of logical, meaningful commits without changing any content. Drops merge-from-main commits. Safe: creates a backup branch first."
+---
+
+Reorganize the commits on the current branch into a small number of logical commits. Do NOT change any file content — only the commit structure changes.
+
+## Instructions
+
+### 1. Safety check
+
+```bash
+git status --short
+```
+
+If there are uncommitted changes, stop and tell the user to commit or stash them first.
+
+### 2. Inspect the branch
+
+```bash
+git log main..HEAD --oneline --no-merges
+git diff main..HEAD --name-only
+```
+
+List every file changed vs `main` and every commit on the branch (excluding merge commits from main).
+
+### 3. Create a backup branch
+
+```bash
+git branch backup/<current-branch-name>
+```
+
+Tell the user the backup exists so they can recover if needed.
+
+### 4. Soft-reset to main and unstage everything
+
+```bash
+git reset --soft main
+git restore --staged .
+```
+
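+All branch changes are now in the working tree, unstaged. No content has changed. This assumes the branch already contains the latest `main`; if `main` has moved ahead, reset to the commit printed by `git merge-base main HEAD` instead, so the newer `main` commits are not recorded as reverts.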
+
+### 5. Plan the logical groups
+
+Read the changed files and the original commit messages to understand what the work covers. Group related files into logical commits. Typical groups:
+
+- Core feature or fix (new source files + modified core files)
+- Secondary features or fixes (each as its own commit if distinct)
+- Refactoring or renames
+- Tests
+- Changelogs / docs
+
+Use the changelog files (if any) as a strong hint — each changelog entry often maps to one commit.
+
+Present the proposed grouping to the user and ask for confirmation before committing.
+
+### 6. Commit in logical groups
+
+For each group, stage only the relevant files and commit with a clear message following the project's conventions:
+
+```bash
+git add <file1> <file2> ...
+git commit -m "..."
+```
+
+Use conventional commit prefixes if the project uses them (`feat:`, `fix:`, `refactor:`, `test:`, `chore:`).
+
+### 7. Verify
+
+```bash
+git log main..HEAD --oneline
+git diff main..HEAD --name-only
+git status --short
+```
+
+Confirm:
+- Commit count is small and each message is meaningful
+- The set of changed files vs `main` is identical to before
+- Working tree is clean
+
+### 8. Remind about force-push
+
+The branch history has been rewritten. Tell the user they will need to `git push --force-with-lease` when they are ready to update the remote. Do NOT push automatically.
+
+## Rules
+
+- Never change file contents. If you find yourself editing a file, stop.
+- Never skip the backup branch step.
+- Never force-push without explicit user instruction.
+- If any step fails or the result looks wrong, tell the user and suggest restoring from the backup: `git reset --hard backup/<branch-name>`.
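
Review note on step 5 of the skill: the grouping it describes could be prototyped as a small path classifier. The sketch below is only an illustration under assumptions. The group names, the classification rules, and the `tests/test_inception_llm.py` path are hypothetical and not part of this PR; the other paths are taken from the diff.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical group names, in the order the commits would be created.
GROUP_ORDER = ["core", "tests", "docs"]


def classify(path: str) -> str:
    """Assign one changed file to a hypothetical commit group."""
    p = PurePosixPath(path)
    if "tests" in p.parts or p.name.startswith("test_"):
        return "tests"
    if p.parts[:1] == ("changelog",) or p.suffix == ".md":
        return "docs"
    return "core"


def plan_groups(changed_files: list[str]) -> dict[str, list[str]]:
    """Group changed files in GROUP_ORDER, dropping groups with no files."""
    buckets: dict[str, list[str]] = defaultdict(list)
    for path in changed_files:
        buckets[classify(path)].append(path)
    return {name: sorted(buckets[name]) for name in GROUP_ORDER if buckets[name]}


if __name__ == "__main__":
    files = [
        "examples/function-calling/function-calling-inception.py",
        "changelog/4423.added.md",
        "README.md",
        "pyproject.toml",
        "tests/test_inception_llm.py",  # hypothetical path, for illustration
    ]
    for group, paths in plan_groups(files).items():
        print(group, paths)
```

A classifier like this only produces a first proposal. The skill still requires presenting the grouping to the user for confirmation before any commit is made.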
--- a/README.md
+++ b/README.md
@@ -92,7 +92,7 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
 | Category            | Services                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                             |
 | ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | Speech-to-Text      | [AssemblyAI](https://docs.pipecat.ai/api-reference/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/api-reference/server/services/stt/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/api-reference/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/api-reference/server/services/stt/gladia), [Google](https://docs.pipecat.ai/api-reference/server/services/stt/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/stt/mistral), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/stt/nvidia), [OpenAI (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/api-reference/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/api-reference/server/services/stt/whisper), [xAI](https://docs.pipecat.ai/api-reference/server/services/stt/xai)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                             |
-| LLMs                | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together)                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                             |
+| LLMs                | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Inception](https://docs.pipecat.ai/api-reference/server/services/llm/inception), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together)                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                               |
 | Text-to-Speech      | [Async](https://docs.pipecat.ai/api-reference/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/api-reference/server/services/tts/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/api-reference/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/api-reference/server/services/tts/fish), [Google](https://docs.pipecat.ai/api-reference/server/services/tts/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/api-reference/server/services/tts/groq), [Hume](https://docs.pipecat.ai/api-reference/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/api-reference/server/services/tts/inworld), [Kokoro](https://docs.pipecat.ai/api-reference/server/services/tts/kokoro), [LMNT](https://docs.pipecat.ai/api-reference/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/api-reference/server/services/tts/minimax), [Mistral](https://docs.pipecat.ai/api-reference/server/services/tts/mistral), [Neuphonic](https://docs.pipecat.ai/api-reference/server/services/tts/neuphonic), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/tts/nvidia), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/tts/openai), [Piper](https://docs.pipecat.ai/api-reference/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/api-reference/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/api-reference/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/api-reference/server/services/tts/smallest), 
[Soniox](https://docs.pipecat.ai/api-reference/server/services/tts/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/api-reference/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/api-reference/server/services/tts/xtts) |
 | Speech-to-Speech    | [AWS Nova Sonic](https://docs.pipecat.ai/api-reference/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/api-reference/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/api-reference/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/api-reference/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/api-reference/server/services/s2s/ultravox),                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                             |
 | Transport           | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/api-reference/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/api-reference/server/services/transport/small-webrtc), [Vonage (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/vonage), [WebSocket Server](https://docs.pipecat.ai/api-reference/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/api-reference/server/services/transport/whatsapp), Local                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                                                                                        |
--- a/changelog/4423.added.md
+++ b/changelog/4423.added.md
@@ -0,0 +1 @@
+- Added `InceptionLLMService` for Inception's Mercury 2 diffusion reasoning model, with support for `reasoning_effort` and `realtime` settings.
--- a/changelog/4514.fixed.md
+++ b/changelog/4514.fixed.md
@@ -0,0 +1 @@
+- Fixed websocket STT connection setup failures so services clear stale websocket state and emit non-fatal error frames, allowing `ServiceSwitcher` failover to keep agents running.
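
Review note: the behavior this entry describes (clear the stale websocket, report a non-fatal error, let a switcher move on) can be modeled with a toy example. This is a sketch under assumptions, not Pipecat code: `FlakySTT` and `connect_first_available` are hypothetical names, and the real `ServiceSwitcher` works on frames rather than a simple loop.

```python
import asyncio


class FlakySTT:
    """Toy stand-in for a websocket STT service (not a Pipecat class)."""

    def __init__(self, name: str, fail: bool):
        self.name = name
        self._fail = fail
        self._websocket = None
        self.errors: list[str] = []

    async def connect(self) -> bool:
        try:
            if self._fail:
                raise ConnectionError("handshake refused")
            self._websocket = object()  # placeholder for a real connection
            return True
        except Exception as e:
            # Clear stale state and record a non-fatal error instead of re-raising,
            # so the caller can fall back to another service.
            self._websocket = None
            self.errors.append(f"Unable to connect to {self.name}: {e}")
            return False


async def connect_first_available(services):
    """Return the first service that connects, or None if all of them fail."""
    for service in services:
        if await service.connect():
            return service
    return None


async def main():
    primary = FlakySTT("primary", fail=True)
    backup = FlakySTT("backup", fail=False)
    active = await connect_first_available([primary, backup])
    print(active.name, primary.errors)


if __name__ == "__main__":
    asyncio.run(main())
```

If `connect` re-raised instead of returning `False`, the loop would stop at the first failing service, which is the situation the removed `raise` in the AssemblyAI hunk avoids.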
--- a/changelog/4521.added.md
+++ b/changelog/4521.added.md
@@ -0,0 +1 @@
+- Added `max_endpoint_delay_ms` to `SonioxSTTService.Settings`, controlling the maximum delay (500-3000 ms) before endpoint detection finalizes a turn.
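
Review note: the 500-3000 ms range quoted above could be enforced with a simple bounds check. The sketch below is hypothetical. `EndpointSettings` is an illustrative stand-in rather than `SonioxSTTService.Settings`, and this diff does not show whether the real service validates the value on the client or leaves it to the server.

```python
from dataclasses import dataclass
from typing import Optional

# Bounds quoted in the changelog entry.
MIN_ENDPOINT_DELAY_MS = 500
MAX_ENDPOINT_DELAY_MS = 3000


@dataclass
class EndpointSettings:
    """Hypothetical stand-in for illustration; not SonioxSTTService.Settings."""

    # Assumption: None means the value is simply not sent.
    max_endpoint_delay_ms: Optional[int] = None

    def __post_init__(self) -> None:
        value = self.max_endpoint_delay_ms
        if value is None:
            return
        if not MIN_ENDPOINT_DELAY_MS <= value <= MAX_ENDPOINT_DELAY_MS:
            raise ValueError(
                f"max_endpoint_delay_ms must be between {MIN_ENDPOINT_DELAY_MS} "
                f"and {MAX_ENDPOINT_DELAY_MS} ms, got {value}"
            )


if __name__ == "__main__":
    print(EndpointSettings(max_endpoint_delay_ms=1500))
    try:
        EndpointSettings(max_endpoint_delay_ms=100)
    except ValueError as exc:
        print(f"rejected: {exc}")
```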
--- a/changelog/4521.changed.md
+++ b/changelog/4521.changed.md
@@ -0,0 +1 @@
+- `SonioxSTTService` now applies settings updates (e.g. via `STTUpdateSettingsFrame`) using a graceful reconnect instead of a hard disconnect/reconnect, preserving the service's reconnect retry behavior.
--- a/changelog/4521.removed.md
+++ b/changelog/4521.removed.md
@@ -0,0 +1 @@
+- Removed the unsupported Georgian (`Language.KA`) language mapping from `SonioxSTTService`.
--- a/changelog/4531.changed.md
+++ b/changelog/4531.changed.md
@@ -0,0 +1 @@
+- Bumped `pipecat-ai-prebuilt` to 1.0.1 in the `runner` extra, updating the prebuilt client UI served by the development runner.
--- a/env.example
+++ b/env.example
@@ -91,6 +91,9 @@ HEYGEN_LIVE_AVATAR_API_KEY=...
 HUME_API_KEY=...
 HUME_VOICE_ID=...

+# Inception
+INCEPTION_API_KEY=...
+
 # Inworld
 INWORLD_API_KEY=...

--- a/examples/function-calling/function-calling-inception.py
+++ b/examples/function-calling/function-calling-inception.py
@@ -0,0 +1,177 @@
+#
+# Copyright (c) 2024-2026, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.frames.frames import LLMRunFrame, TTSSpeakFrame
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import (
+    LLMContextAggregatorPair,
+    LLMUserAggregatorParams,
+)
+from pipecat.runner.types import RunnerArguments
+from pipecat.runner.utils import create_transport
+from pipecat.services.cartesia.tts import CartesiaTTSService
+from pipecat.services.deepgram.stt import DeepgramSTTService
+from pipecat.services.inception.llm import InceptionLLMService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.transports.base_transport import BaseTransport, TransportParams
+from pipecat.transports.daily.transport import DailyParams
+from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
+
+load_dotenv(override=True)
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    await params.result_callback({"conditions": "nice", "temperature": "75"})
+
+
+async def fetch_restaurant_recommendation(params: FunctionCallParams):
+    await params.result_callback({"name": "The Golden Dragon"})
+
+
+# We use lambdas to defer transport parameter creation until the transport
+# type is selected at runtime.
+transport_params = {
+    "daily": lambda: DailyParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+    ),
+    "webrtc": lambda: TransportParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+    ),
+}
+
+
+async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+    logger.info("Starting bot")
+
+    stt = DeepgramSTTService(api_key=os.environ["DEEPGRAM_API_KEY"])
+
+    tts = CartesiaTTSService(
+        api_key=os.environ["CARTESIA_API_KEY"],
+        settings=CartesiaTTSService.Settings(
+            voice="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
+        ),
+    )
+
+    llm = InceptionLLMService(
+        api_key=os.environ["INCEPTION_API_KEY"],
+        settings=InceptionLLMService.Settings(
+            reasoning_effort="instant",
+            system_instruction="You are a helpful assistant in a voice conversation. Your responses will be spoken aloud, so avoid emojis, bullet points, or other formatting that can't be spoken. Respond to what the user said in a creative, helpful, and brief way.",
+        ),
+    )
+    # You can also register a function_name of None to get all functions
+    # sent to the same callback with an additional function_name parameter.
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+    llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
+
+    @llm.event_handler("on_function_calls_started")
+    async def on_function_calls_started(service, function_calls):
+        await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
+
+    weather_function = FunctionSchema(
+        name="get_current_weather",
+        description="Get the current weather",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+            "format": {
+                "type": "string",
+                "enum": ["celsius", "fahrenheit"],
+                "description": "The temperature unit to use. Infer this from the user's location.",
+            },
+        },
+        required=["location", "format"],
+    )
+
+    restaurant_function = FunctionSchema(
+        name="get_restaurant_recommendation",
+        description="Get a restaurant recommendation",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+        },
+        required=["location"],
+    )
+    tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
+
+    context = LLMContext(tools=tools)
+    user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
+        context,
+        user_params=LLMUserAggregatorParams(vad_analyzer=SileroVADAnalyzer()),
+    )
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            stt,
+            user_aggregator,
+            llm,
+            tts,
+            transport.output(),
+            assistant_aggregator,
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            enable_metrics=True,
+            enable_usage_metrics=True,
+        ),
+        idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info("Client connected")
+        # Kick off the conversation.
+        context.add_message(
+            {"role": "developer", "content": "Please introduce yourself to the user."}
+        )
+        await task.queue_frames([LLMRunFrame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info("Client disconnected")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+
+    await runner.run(task)
+
+
+async def bot(runner_args: RunnerArguments):
+    """Main bot entry point compatible with Pipecat Cloud."""
+    transport = await create_transport(runner_args, transport_params)
+    await run_bot(transport, runner_args)
+
+
+if __name__ == "__main__":
+    from pipecat.runner.run import main
+
+    main()
--- a/examples/update-settings/stt/stt-soniox.py
+++ b/examples/update-settings/stt/stt-soniox.py
@@ -22,9 +22,9 @@ from pipecat.processors.aggregators.llm_response_universal import (
 )
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.services.soniox.stt import SonioxSTTService
+from pipecat.services.soniox.tts import SonioxTTSService
 from pipecat.transcriptions.language import Language
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -53,12 +53,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):

    stt = SonioxSTTService(api_key=os.environ["SONIOX_API_KEY"])

-    tts = CartesiaTTSService(
-        api_key=os.environ["CARTESIA_API_KEY"],
-        settings=CartesiaTTSService.Settings(
-            voice="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
-        ),
-    )
+    tts = SonioxTTSService(api_key=os.environ["SONIOX_API_KEY"])

    llm = OpenAILLMService(
        api_key=os.environ["OPENAI_API_KEY"],
@@ -103,9 +98,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
        await task.queue_frames([LLMRunFrame()])

        await asyncio.sleep(10)
-        logger.info("Updating Soniox STT settings: language=es")
+        logger.info("Updating Soniox STT settings: language_hints=[es]")
        await task.queue_frame(
-            STTUpdateSettingsFrame(delta=SonioxSTTService.Settings(language=Language.ES))
+            STTUpdateSettingsFrame(delta=SonioxSTTService.Settings(language_hints=[Language.ES]))
        )

    @transport.event_handler("on_client_disconnected")
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -77,6 +77,7 @@ groq = [ "groq>=0.23.0,<2" ]
 gstreamer = [ "pygobject~=3.50.0" ]
 heygen = [ "livekit>=1.0.13,<2", "pipecat-ai[websockets-base]" ]
 hume = [ "hume>=0.11.2,<1" ]
+inception = []
 inworld = [ "pipecat-ai[websockets-base]" ]
 koala = [ "pvkoala~=2.0.3" ]
 kokoro = [ "kokoro-onnx>=0.5.0,<1", "requests>=2.32.5,<3" ]
@@ -103,7 +104,7 @@ piper = [ "piper-tts>=1.3.0,<2", "requests>=2.32.5,<3" ]
 qwen = []
 resembleai = [ "pipecat-ai[websockets-base]" ]
 rime = [ "pipecat-ai[websockets-base]" ]
-runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<1", "pipecat-ai-prebuilt>=1.0.0"]
+runner = [ "python-dotenv>=1.0.0,<2.0.0", "uvicorn>=0.32.0,<1.0.0", "fastapi>=0.115.6,<1", "pipecat-ai-prebuilt>=1.0.1"]
 sagemaker = ["aws_sdk_sagemaker_runtime_http2; python_version>='3.12'"]
 sambanova = []
 sarvam = [ "sarvamai==0.1.28", "pipecat-ai[websockets-base]" ]
--- a/scripts/evals/run-release-evals.py
+++ b/scripts/evals/run-release-evals.py
@@ -198,6 +198,7 @@ TESTS_FUNCTION_CALLING = [
    ("function-calling/function-calling-sarvam.py", EVAL_WEATHER),
    ("function-calling/function-calling-novita.py", EVAL_WEATHER),
    ("function-calling/function-calling-deepseek.py", EVAL_WEATHER),
+    ("function-calling/function-calling-inception.py", EVAL_WEATHER),
    # Video
    ("function-calling/function-calling-anthropic-video.py", EVAL_VISION_CAMERA),
    ("function-calling/function-calling-aws-video.py", EVAL_VISION_CAMERA),
--- a/src/pipecat/processors/frameworks/rtvi/observer.py
+++ b/src/pipecat/processors/frameworks/rtvi/observer.py
@@ -529,7 +529,7 @@ class RTVIObserver(BaseObserver):

        isTTS = isinstance(frame, TTSTextFrame)
        if agg_type is not AggregationType.WORD:
-            logger.debug(f"{self} Aggregated LLM text: {text}, {agg_type} spoken:{isTTS}")
+            logger.trace(f"{self} Aggregated LLM text: {text}, {agg_type} spoken:{isTTS}")

        if self._params.bot_output_enabled:
            message = RTVI.BotOutputMessage(
--- a/src/pipecat/services/assemblyai/stt.py
+++ b/src/pipecat/services/assemblyai/stt.py
@@ -586,9 +586,9 @@ class AssemblyAISTTService(WebsocketSTTService):
            await self._call_event_handler("on_connected")
            logger.debug(f"{self} Connected to AssemblyAI WebSocket")
        except Exception as e:
+            self._websocket = None
            self._connected = False
            await self.push_error(error_msg=f"Unable to connect to AssemblyAI: {e}", exception=e)
-            raise

    async def _disconnect_websocket(self):
        """Close the websocket connection to AssemblyAI."""
--- a/src/pipecat/services/aws/stt.py
+++ b/src/pipecat/services/aws/stt.py
@@ -339,10 +339,10 @@ class AWSTranscribeSTTService(WebsocketSTTService):
            await self._call_event_handler("on_connected")
            logger.info(f"{self} Successfully connected to AWS Transcribe")
        except Exception as e:
+            self._websocket = None
            await self.push_error(
                error_msg=f"Unable to connect to AWS Transcribe: {e}", exception=e
            )
-            raise

    async def _disconnect_websocket(self):
        """Close the websocket connection to AWS Transcribe."""
--- a/src/pipecat/services/cartesia/stt.py
+++ b/src/pipecat/services/cartesia/stt.py
@@ -354,7 +354,8 @@ class CartesiaSTTService(WebsocketSTTService):
            self._websocket = await websocket_connect(ws_url, additional_headers=headers)
            await self._call_event_handler("on_connected")
        except Exception as e:
-            await self.push_error(error_msg=f"Unknown error occurred: {e}", exception=e)
+            self._websocket = None
+            await self.push_error(error_msg=f"Unable to connect to Cartesia: {e}", exception=e)

    async def _disconnect_websocket(self):
        ws = self._websocket
--- a/src/pipecat/services/elevenlabs/stt.py
+++ b/src/pipecat/services/elevenlabs/stt.py
@@ -823,6 +823,7 @@ class ElevenLabsRealtimeSTTService(WebsocketSTTService):
            await self._call_event_handler("on_connected")
            logger.debug("Connected to ElevenLabs Realtime STT")
        except Exception as e:
+            self._websocket = None
            await self.push_error(
                error_msg=f"Unable to connect to ElevenLabs Realtime STT: {e}", exception=e
            )
--- a/src/pipecat/services/gladia/stt.py
+++ b/src/pipecat/services/gladia/stt.py
@@ -558,8 +558,9 @@ class GladiaSTTService(WebsocketSTTService):

            logger.debug(f"{self} Connected to Gladia WebSocket")
        except Exception as e:
+            self._websocket = None
+            self._connection_active = False
            await self.push_error(error_msg=f"Unable to connect to Gladia: {e}", exception=e)
-            raise

    async def _disconnect_websocket(self):
        """Close the websocket connection to Gladia."""
--- a/src/pipecat/services/gradium/stt.py
+++ b/src/pipecat/services/gradium/stt.py
@@ -423,8 +423,8 @@ class GradiumSTTService(WebsocketSTTService):
            logger.debug("Connected to Gradium STT")

        except Exception as e:
-            await self.push_error(error_msg=f"Unknown error occurred: {e}", exception=e)
-            raise
+            self._websocket = None
+            await self.push_error(error_msg=f"Unable to connect to Gradium: {e}", exception=e)

    async def _disconnect(self):
        await super()._disconnect()
--- a/src/pipecat/services/inception/__init__.py
+++ b/src/pipecat/services/inception/__init__.py
--- a/src/pipecat/services/inception/llm.py
+++ b/src/pipecat/services/inception/llm.py
@@ -0,0 +1,124 @@
+#
+# Copyright (c) 2024-2026, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+"""Inception LLM service implementation using OpenAI-compatible interface."""
+
+from dataclasses import dataclass, field
+from typing import Literal
+
+from loguru import logger
+
+from pipecat.adapters.services.open_ai_adapter import OpenAILLMInvocationParams
+from pipecat.services.openai.base_llm import BaseOpenAILLMService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.services.settings import NOT_GIVEN as _NOT_GIVEN
+from pipecat.services.settings import _NotGiven, is_given
+
+
+@dataclass
+class InceptionLLMSettings(BaseOpenAILLMService.Settings):
+    """Settings for InceptionLLMService.
+
+    Parameters:
+        reasoning_effort: Controls how much reasoning the model applies.
+            One of "instant", "low", "medium", or "high". When unset, the
+            parameter is omitted and Inception's server-side default applies.
+        realtime: When True, reduces time to first diffusion block (TTFT).
+    """
+
+    reasoning_effort: Literal["instant", "low", "medium", "high"] | None | _NotGiven = field(
+        default_factory=lambda: _NOT_GIVEN
+    )
+    realtime: bool | None | _NotGiven = field(default_factory=lambda: _NOT_GIVEN)
+
+
+class InceptionLLMService(OpenAILLMService):
+    """A service for interacting with Inception's API using the OpenAI-compatible interface.
+
+    This service extends OpenAILLMService to connect to Inception's API endpoint while
+    maintaining full compatibility with OpenAI's interface and functionality.
+    Supports Mercury-2, Inception's diffusion-based reasoning model.
+    """
+
+    # Inception doesn't support the "developer" message role.
+    supports_developer_role = False
+
+    Settings = InceptionLLMSettings
+    _settings: Settings
+
+    def __init__(
+        self,
+        *,
+        api_key: str,
+        base_url: str = "https://api.inceptionlabs.ai/v1",
+        settings: Settings | None = None,
+        **kwargs,
+    ):
+        """Initialize the Inception LLM service.
+
+        Args:
+            api_key: The API key for accessing Inception's API.
+            base_url: The base URL for Inception API. Defaults to "https://api.inceptionlabs.ai/v1".
+            settings: Runtime-updatable settings.
+            **kwargs: Additional keyword arguments passed to OpenAILLMService.
+        """
+        default_settings = self.Settings(
+            model="mercury-2",
+            reasoning_effort=None,
+            realtime=None,
+        )
+
+        if settings is not None:
+            default_settings.apply_update(settings)
+
+        super().__init__(api_key=api_key, base_url=base_url, settings=default_settings, **kwargs)
+
+    def create_client(self, api_key=None, base_url=None, **kwargs):
+        """Create OpenAI-compatible client for Inception API endpoint.
+
+        Args:
+            api_key: The API key for authentication. If None, uses instance default.
+            base_url: The base URL for the API. If None, uses instance default.
+            **kwargs: Additional keyword arguments for client configuration.
+
+        Returns:
+            An OpenAI-compatible client configured for Inception's API.
+        """
+        logger.debug(f"Creating Inception client with api {base_url}")
+        return super().create_client(api_key, base_url, **kwargs)
+
+    def build_chat_completion_params(self, params_from_context: OpenAILLMInvocationParams) -> dict:
+        """Build parameters for Inception chat completion request.
+
+        Extends the base OpenAI parameters with Inception-specific options
+        such as reasoning_effort and realtime.
+
+        Args:
+            params_from_context: Parameters, derived from the LLM context, to
+                use for the chat completion. Contains messages, tools, and tool
+                choice.
+
+        Returns:
+            Dictionary of parameters for the chat completion request.
+        """
+        params = super().build_chat_completion_params(params_from_context)
+
+        if (
+            is_given(self._settings.reasoning_effort)
+            and self._settings.reasoning_effort is not None
+        ):
+            params["reasoning_effort"] = self._settings.reasoning_effort
+
+        # realtime is Inception-specific and unknown to the OpenAI SDK,
+        # so it must be passed via extra_body to avoid validation errors.
+        extra_body = {}
+        if is_given(self._settings.realtime) and self._settings.realtime is not None:
+            extra_body["realtime"] = self._settings.realtime
+
+        if extra_body:
+            params["extra_body"] = extra_body
+
+        return params
--- a/src/pipecat/services/soniox/stt.py
+++ b/src/pipecat/services/soniox/stt.py
@@ -155,7 +155,6 @@ def language_to_soniox_language(language: Language) -> str:
        Language.ID: "id",
        Language.IT: "it",
        Language.JA: "ja",
-        Language.KA: "ka",
        Language.KK: "kk",
        Language.KN: "kn",
        Language.KO: "ko",
@@ -232,6 +231,7 @@ class SonioxSTTSettings(STTSettings):
            context_version 2.
        enable_speaker_diarization: Whether to enable speaker diarization.
        enable_language_identification: Whether to enable language identification.
+        max_endpoint_delay_ms: Max ms before endpoint detection finalizes the turn (500-3000).
        client_reference_id: Client reference ID to use for transcription.
    """

@@ -242,6 +242,7 @@ class SonioxSTTSettings(STTSettings):
    enable_language_identification: bool | None | _NotGiven = field(
        default_factory=lambda: NOT_GIVEN
    )
+    max_endpoint_delay_ms: int | None | _NotGiven = field(default_factory=lambda: NOT_GIVEN)
    client_reference_id: str | None | _NotGiven = field(default_factory=lambda: NOT_GIVEN)


@@ -309,6 +310,7 @@ class SonioxSTTService(WebsocketSTTService):
            context=None,
            enable_speaker_diarization=False,
            enable_language_identification=False,
+            max_endpoint_delay_ms=None,
            client_reference_id=None,
        )

@@ -390,8 +392,7 @@ class SonioxSTTService(WebsocketSTTService):
        changed = await super()._update_settings(delta)

        if changed:
-            await self._disconnect()
-            await self._connect()
+            await self._request_reconnect()

        return changed

@@ -522,6 +523,7 @@ class SonioxSTTService(WebsocketSTTService):
                "audio_format": self._audio_format,
                "num_channels": self._num_channels,
                "enable_endpoint_detection": enable_endpoint_detection,
+                "max_endpoint_delay_ms": s.max_endpoint_delay_ms,
                "sample_rate": self.sample_rate,
                "language_hints": _prepare_language_hints(assert_given(s.language_hints)),
                "language_hints_strict": s.language_hints_strict,
@@ -537,8 +539,8 @@ class SonioxSTTService(WebsocketSTTService):
            await self._call_event_handler("on_connected")
            logger.debug("Connected to Soniox STT")
        except Exception as e:
+            self._websocket = None
            await self.push_error(error_msg=f"Unable to connect to Soniox: {e}", exception=e)
-            raise

    async def _disconnect_websocket(self):
        """Close the websocket connection to Soniox."""
--- a/src/pipecat/services/websocket_service.py
+++ b/src/pipecat/services/websocket_service.py
@@ -76,7 +76,9 @@ class WebsocketService(ABC):
        logger.warning(f"{self} reconnecting (attempt: {attempt_number})")
        await self._disconnect_websocket()
        await self._connect_websocket()
-        return await self._verify_connection()
+        if not await self._verify_connection():
+            raise ConnectionError(f"{self} websocket reconnection failed verification")
+        return True

    async def _try_reconnect(
        self,
--- a/src/pipecat/services/xai/stt.py
+++ b/src/pipecat/services/xai/stt.py
@@ -293,8 +293,9 @@ class XAISTTService(WebsocketSTTService):
            await self._call_event_handler("on_connected")
            logger.debug(f"{self} connected to xAI STT WebSocket")
        except Exception as e:
+            self._websocket = None
+            self._session_ready.clear()
            await self.push_error(error_msg=f"Unable to connect to xAI STT: {e}", exception=e)
-            raise

    async def _disconnect_websocket(self):
        """Close the WebSocket connection."""
--- a/src/pipecat/utils/context/word_completion_tracker.py
+++ b/src/pipecat/utils/context/word_completion_tracker.py
@@ -86,7 +86,6 @@ class WordCompletionTracker:
        self._overflow_word: str | None = None
        self._llm_consumed: str | None = None
        self._frame_word: str | None = None
-        logger.debug(f"WordCompletionTracker: {self._tts_normalized}")

    @staticmethod
    def _normalize(text: str) -> str:
--- a/tests/test_cartesia_stt.py
+++ b/tests/test_cartesia_stt.py
@@ -0,0 +1,45 @@
+#
+# Copyright (c) 2024-2026, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+from unittest.mock import AsyncMock
+
+import pytest
+from websockets.protocol import State
+
+from pipecat.services.cartesia.stt import CartesiaSTTService
+
+
+class _FakeWebsocket:
+    def __init__(self, *, state=State.OPEN, send_side_effect=None):
+        self.state = state
+        self.send = AsyncMock(side_effect=send_side_effect)
+
+
+@pytest.mark.asyncio
+async def test_cartesia_connect_failure_clears_stale_websocket(monkeypatch):
+    async def fake_websocket_connect(*args, **kwargs):
+        raise RuntimeError("connection failed")
+
+    monkeypatch.setattr("pipecat.services.cartesia.stt.websocket_connect", fake_websocket_connect)
+
+    service = CartesiaSTTService(api_key="test-key", sample_rate=16000)
+    service._websocket = _FakeWebsocket(state=State.CLOSED)
+
+    await service._connect_websocket()
+
+    assert service._websocket is None
+
+
+@pytest.mark.asyncio
+async def test_cartesia_run_stt_logs_send_failure_without_clearing_websocket():
+    service = CartesiaSTTService(api_key="test-key", sample_rate=16000)
+    websocket = _FakeWebsocket(send_side_effect=RuntimeError("websocket closed"))
+    service._websocket = websocket
+
+    async for _ in service.run_stt(b"\x00" * 160):
+        pass
+
+    assert service._websocket is websocket
--- a/tests/test_soniox_stt.py
+++ b/tests/test_soniox_stt.py
@@ -5,8 +5,10 @@
 #

 import json
+from unittest.mock import AsyncMock

 import pytest
+from websockets.protocol import State

 from pipecat.frames.frames import TranscriptionFrame
 from pipecat.services.soniox.stt import END_TOKEN, SonioxSTTService, _language_from_tokens
@@ -14,8 +16,10 @@ from pipecat.transcriptions.language import Language


 class _FakeWebsocket:
-    def __init__(self, messages):
+    def __init__(self, messages, *, state=State.OPEN, send_side_effect=None):
        self._messages = messages
+        self.state = state
+        self.send = AsyncMock(side_effect=send_side_effect)

    def __aiter__(self):
        return self._iter_messages()
@@ -25,6 +29,21 @@ class _FakeWebsocket:
            yield message


+@pytest.mark.asyncio
+async def test_connect_failure_clears_stale_websocket_without_raising(monkeypatch):
+    async def fake_websocket_connect(*args, **kwargs):
+        raise RuntimeError("connection failed")
+
+    monkeypatch.setattr("pipecat.services.soniox.stt.websocket_connect", fake_websocket_connect)
+
+    service = SonioxSTTService(api_key="test-key")
+    service._websocket = _FakeWebsocket([], state=State.CLOSED)
+
+    await service._connect_websocket()
+
+    assert service._websocket is None
+
+
 def test_language_from_tokens_uses_single_recognized_language():
    tokens = [
        {"text": "Hello", "language": "en"},
--- a/tests/test_websocket_service.py
+++ b/tests/test_websocket_service.py
@@ -165,6 +165,19 @@ async def test_reconnect_exhausted_emits_non_fatal_error(service, report_error):
    assert "Connection refused" in final_error.error


+@pytest.mark.asyncio
+async def test_reconnect_exhausted_when_connect_does_not_raise(service, report_error):
+    """A non-raising failed connect is treated as a failed reconnect attempt."""
+    result = await service._try_reconnect(report_error=report_error)
+
+    assert result is False
+    assert report_error.call_count == 4
+    final_error = report_error.call_args_list[-1][0][0]
+    assert isinstance(final_error, ErrorFrame)
+    assert final_error.fatal is False
+    assert "websocket reconnection failed verification" in final_error.error
+
+
 # ---------------------------------------------------------------------------
 # Quick failure detection — accept then immediately close
 # ---------------------------------------------------------------------------
--- a/uv.lock
+++ b/uv.lock
@@ -4539,7 +4539,7 @@ requires-dist = [
    { name = "pipecat-ai", extras = ["websockets-base"], marker = "extra == 'ultravox'" },
    { name = "pipecat-ai", extras = ["websockets-base"], marker = "extra == 'websocket'" },
    { name = "pipecat-ai", extras = ["websockets-base"], marker = "extra == 'xai'" },
-    { name = "pipecat-ai-prebuilt", marker = "extra == 'runner'", specifier = ">=1.0.0" },
+    { name = "pipecat-ai-prebuilt", marker = "extra == 'runner'", specifier = ">=1.0.1" },
    { name = "piper-tts", marker = "extra == 'piper'", specifier = ">=1.3.0,<2" },
    { name = "protobuf", specifier = ">=5.29.6,<7" },
    { name = "protobuf", marker = "extra == 'nvidia'", specifier = ">=6.31.1,<7" },
@@ -4574,7 +4574,7 @@ requires-dist = [
    { name = "wait-for2", marker = "python_full_version < '3.12'", specifier = ">=0.4.1,<1" },
    { name = "websockets", marker = "extra == 'websockets-base'", specifier = ">=13.1,<16.0" },
 ]
-provides-extras = ["aic", "anthropic", "assemblyai", "asyncai", "aws", "aws-nova-sonic", "azure", "cartesia", "camb", "cerebras", "daily", "deepgram", "deepseek", "elevenlabs", "fal", "fireworks", "fish", "gladia", "google", "gradium", "grok", "groq", "gstreamer", "heygen", "hume", "inworld", "koala", "kokoro", "langchain", "lemonslice", "livekit", "lmnt", "local", "local-smart-turn", "mcp", "mem0", "mistral", "mlx-whisper", "moondream", "nebius", "neuphonic", "novita", "nvidia", "openai", "rnnoise", "openrouter", "perplexity", "piper", "qwen", "resembleai", "rime", "runner", "sagemaker", "sambanova", "sarvam", "sentry", "silero", "simli", "smallest", "soniox", "soundfile", "speechmatics", "strands", "tavus", "together", "tracing", "ultravox", "vonage-video-connector", "webrtc", "websocket", "websockets-base", "whisper", "xai"]
+provides-extras = ["aic", "anthropic", "assemblyai", "asyncai", "aws", "aws-nova-sonic", "azure", "cartesia", "camb", "cerebras", "daily", "deepgram", "deepseek", "elevenlabs", "fal", "fireworks", "fish", "gladia", "google", "gradium", "grok", "groq", "gstreamer", "heygen", "hume", "inception", "inworld", "koala", "kokoro", "langchain", "lemonslice", "livekit", "lmnt", "local", "local-smart-turn", "mcp", "mem0", "mistral", "mlx-whisper", "moondream", "nebius", "neuphonic", "novita", "nvidia", "openai", "rnnoise", "openrouter", "perplexity", "piper", "qwen", "resembleai", "rime", "runner", "sagemaker", "sambanova", "sarvam", "sentry", "silero", "simli", "smallest", "soniox", "soundfile", "speechmatics", "strands", "tavus", "together", "tracing", "ultravox", "vonage-video-connector", "webrtc", "websocket", "websockets-base", "whisper", "xai"]

 [package.metadata.requires-dev]
 dev = [
@@ -4603,14 +4603,14 @@ docs = [

 [[package]]
 name = "pipecat-ai-prebuilt"
-version = "1.0.0"
+version = "1.0.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "fastapi", extra = ["all"] },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/d0/86/7527474a324e3da787468133a1dba877e06576edc502e1bc7dd84ba7c9f7/pipecat_ai_prebuilt-1.0.0.tar.gz", hash = "sha256:dc66df541f17620eef5dedb2fd44737eb97232899779afb66dcca5aaa9317512", size = 601709, upload-time = "2026-05-14T21:15:26.575Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/fa/27/91857cd93661922687e51f4141583dbeb71f9a6c8d0d6379bae1aa467522/pipecat_ai_prebuilt-1.0.1.tar.gz", hash = "sha256:9453136fcb994802f9b650b5175f3ce1d0476849a9e609fefe52ecc1c3299680", size = 601771, upload-time = "2026-05-20T16:08:14.485Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/89/b1/648122d5e418d3e0c8f797028bc53a22229ffc07a2406712b13b76735f38/pipecat_ai_prebuilt-1.0.0-py3-none-any.whl", hash = "sha256:6b7057920d3d00e5687adb26e032634ba1f6d924eb9079b1804d031620a1e854", size = 601949, upload-time = "2026-05-14T21:15:24.666Z" },
+    { url = "https://files.pythonhosted.org/packages/ec/4f/a636e47967c3aa885ae912502d73a46d1e824a67992e405ea1e94b78bd94/pipecat_ai_prebuilt-1.0.1-py3-none-any.whl", hash = "sha256:45d78d3fd2ac8193626a5dabb5f45d0ff2d35bfc92098b4bcea308ae612196aa", size = 601994, upload-time = "2026-05-20T16:08:12.4Z" },
 ]

 [[package]]
Author	SHA1	Message	Date
Mark Backman	780c004168	Merge pull request #4423 from joycech333/feat/inception-llm-service feat: add Inception LLM service with Mercury 2 support	2026-05-21 12:02:27 -04:00
Mark Backman	28f9203401	Code review fixes	2026-05-21 11:45:17 -04:00
joycech333	77cc314a08	feat: add Inception LLM service with Mercury-2 support Adds InceptionLLMService, an OpenAI-compatible service for Inception's Mercury-2 diffusion-based reasoning model. Supports reasoning_effort (instant/low/medium/high) and realtime mode for reduced TTFT.	2026-05-21 11:23:23 -04:00
Mark Backman	4a8d1d0b5e	Merge pull request #4532 from pipecat-ai/mb/cleanup-logging-after-smart-text-handling Clean up smart text logging	2026-05-21 08:35:46 -04:00
Mark Backman	87f5d60693	Merge pull request #4531 from pipecat-ai/mb/pipecat-prebuilt-1.0.1 chore: bump pipecat-ai-prebuilt to 1.0.1	2026-05-21 08:35:31 -04:00
Mark Backman	c699b31daa	Merge pull request #4534 from pipecat-ai/mb/changelog-4521 Add changelog for #4521	2026-05-21 08:35:15 -04:00
Mark Backman	ee674ffb01	Add changelog for #4521	2026-05-20 17:57:43 -04:00
mihafabcic-soniox	86a5710801	Add max_endpoint_delay_ms and clean up Sonoix STT settings (#4521)	2026-05-20 17:54:48 -04:00
Mark Backman	4a96b2a9e6	Clean up smart text logging	2026-05-20 15:38:59 -04:00
Mark Backman	105d6f27da	Merge pull request #4514 from pipecat-ai/mb/websocket-stt-service-exception-handling Align websocket STT connection failures	2026-05-20 15:15:35 -04:00
Filipi da Silva Fuchter	e0e3cd336a	Merge pull request #4529 from pipecat-ai/filipi/squash_skill New skill to squash commits.	2026-05-20 16:06:23 -03:00
Mark Backman	9586db5b50	Preserve websocket reconnect failure retries	2026-05-20 14:45:29 -04:00
Mark Backman	a890ab7b21	Add changelog for PR #4531	2026-05-20 12:18:03 -04:00
Mark Backman	c1bf7dbb4a	chore: bump pipecat-ai-prebuilt to 1.0.1	2026-05-20 12:15:09 -04:00
filipi87	c321f50e76	New skill to squash commits.	2026-05-20 10:29:03 -03:00
Mark Backman	e298491068	Add changelog for websocket STT failure handling	2026-05-18 12:41:56 -04:00
Mark Backman	97b00042df	Align websocket STT connection failures	2026-05-18 12:35:01 -04:00
@@ -0,0 +1 @@
+- Added `InceptionLLMService` for Inception's Mercury 2 diffusion reasoning model, with support for `reasoning_effort` and `realtime` settings.
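
The way these two settings reach the request can be sketched without pipecat installed. The sketch below is a standalone approximation of the gating in `build_chat_completion_params` from this change: `reasoning_effort` is added as a top-level parameter, while `realtime` goes under `extra_body`. The names `SketchSettings` and `build_params`, and the local `NOT_GIVEN` sentinel, are stand-ins invented for illustration, not pipecat APIs.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


class _NotGiven:
    """Hypothetical stand-in for a 'not given' sentinel type."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


def is_given(value: Any) -> bool:
    """Return True unless the value is the NOT_GIVEN sentinel."""
    return not isinstance(value, _NotGiven)


@dataclass
class SketchSettings:
    """Hypothetical settings holder; field names follow the diff."""

    model: str = "mercury-2"
    reasoning_effort: Any = field(default_factory=lambda: NOT_GIVEN)
    realtime: Any = field(default_factory=lambda: NOT_GIVEN)


def build_params(settings: SketchSettings, base: Optional[dict] = None) -> dict:
    """Merge Inception-specific options into a base parameter dict."""
    params = dict(base or {})
    params["model"] = settings.model

    # Omit reasoning_effort entirely when unset, so the server default applies.
    if is_given(settings.reasoning_effort) and settings.reasoning_effort is not None:
        params["reasoning_effort"] = settings.reasoning_effort

    # realtime is carried in extra_body rather than as a top-level parameter.
    extra_body = {}
    if is_given(settings.realtime) and settings.realtime is not None:
        extra_body["realtime"] = settings.realtime
    if extra_body:
        params["extra_body"] = extra_body

    return params
```

With both settings unset, `build_params(SketchSettings())` contains only the model; setting `realtime=True` adds an `extra_body` entry and nothing else.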
@@ -0,0 +1 @@
+- Fixed websocket STT connection setup failures so services clear stale websocket state and emit non-fatal error frames, allowing `ServiceSwitcher` failover to keep agents running.
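
A minimal sketch of that pattern, assuming a service that keeps its socket in `_websocket` and reports problems through a `push_error` coroutine, as the STT services in this diff do. `SketchSTTService`, `failing_connect`, and the `errors` list are hypothetical and exist only to make the example runnable on its own.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class SketchSTTService:
    """Hypothetical service showing the clear-state, report, do-not-raise pattern."""

    def __init__(self, connect: Callable[[], Awaitable[object]]):
        self._connect = connect
        self._websocket: Optional[object] = None
        self.errors = []  # recorded error reports, for inspection only

    async def push_error(self, error_msg: str, fatal: bool = False) -> None:
        """Record an error report; non-fatal by default."""
        self.errors.append({"error": error_msg, "fatal": fatal})

    async def connect_websocket(self) -> None:
        try:
            self._websocket = await self._connect()
        except Exception as e:
            # Drop any stale handle so later checks do not see a dead socket.
            self._websocket = None
            # Report the failure instead of re-raising it.
            await self.push_error(f"Unable to connect: {e}")


async def failing_connect() -> object:
    raise RuntimeError("connection failed")


async def demo() -> SketchSTTService:
    service = SketchSTTService(failing_connect)
    service._websocket = object()  # simulate a stale handle from an earlier session
    await service.connect_websocket()
    return service
```

After `asyncio.run(demo())`, the stale handle is gone and exactly one non-fatal error has been recorded, which is the state the new Cartesia and Soniox tests assert on.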
@@ -0,0 +1 @@
+- Added `max_endpoint_delay_ms` to `SonioxSTTService.Settings`, controlling the maximum delay (500-3000 ms) before endpoint detection finalizes a turn.
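
The diff passes this value through to the Soniox config as-is, so a caller may want to check the documented range before building settings. The helper below is a hypothetical pre-flight check, not part of pipecat or Soniox's API; it assumes that `None` means "use the server default".

```python
from typing import Optional

# Range taken from the changelog entry above.
MIN_DELAY_MS = 500
MAX_DELAY_MS = 3000


def check_max_endpoint_delay_ms(value: Optional[int]) -> Optional[int]:
    """Return the value unchanged if it is None or within the documented range."""
    if value is None:
        return None
    if not MIN_DELAY_MS <= value <= MAX_DELAY_MS:
        raise ValueError(
            f"max_endpoint_delay_ms must be between {MIN_DELAY_MS} and "
            f"{MAX_DELAY_MS}, got {value}"
        )
    return value
```

A checked value could then be passed as `max_endpoint_delay_ms` when constructing `SonioxSTTService.Settings`.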
@@ -0,0 +1 @@
+- `SonioxSTTService` now applies settings updates (e.g. via `STTUpdateSettingsFrame`) using a graceful reconnect instead of a hard disconnect/reconnect, preserving the service's reconnect retry behavior.
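
The retry behavior this entry refers to can be sketched as a bounded loop in which an attempt that connects without raising, but then fails verification, still counts as a failed attempt. That matches the `ConnectionError` added to `websocket_service.py` in this change. The class, its method names, and the attempt limit of 3 are assumptions made for illustration.

```python
import asyncio


class SketchReconnector:
    """Hypothetical reconnect loop; not the pipecat WebsocketService API."""

    def __init__(self, max_attempts: int = 3):
        self.max_attempts = max_attempts
        self.websocket = None
        self.connect_calls = 0

    async def connect_websocket(self) -> None:
        # Simulates a connect that fails quietly: no exception, state left cleared.
        self.connect_calls += 1
        self.websocket = None

    async def verify_connection(self) -> bool:
        return self.websocket is not None

    async def reconnect_once(self) -> bool:
        await self.connect_websocket()
        if not await self.verify_connection():
            # Turn a silent failure into an exception so the retry loop sees it.
            raise ConnectionError("websocket reconnection failed verification")
        return True

    async def try_reconnect(self) -> bool:
        for _ in range(self.max_attempts):
            try:
                return await self.reconnect_once()
            except Exception:
                continue  # count the failure and move on to the next attempt
        return False
```

Without the explicit `ConnectionError`, a quiet failure would return normally from the first attempt and the loop would never retry.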
@@ -0,0 +1 @@
+- Removed the unsupported Georgian (`Language.KA`) language mapping from `SonioxSTTService`.
@@ -0,0 +1 @@
+- Bumped `pipecat-ai-prebuilt` to 1.0.1 in the `runner` extra, updating the prebuilt client UI served by the development runner.