Merge pull request #4185 from pipecat-ai/changelog-0.0.108

Release 0.0.108 - Changelog Update
Update changelog for version 0.0.108
2026-03-27 21:47:53 -07:00 · 2026-03-27 21:43:37 -07:00 · 2026-03-27 21:40:21 -07:00 · 2026-03-27 21:36:03 -07:00 · 2026-03-28 00:02:44 -04:00 · 2026-03-28 00:01:25 -04:00
65 changed files with 501 additions and 85 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,308 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 <!-- towncrier release notes start -->

+## [0.0.108] - 2026-03-27
+
+### Added
+
+- Added `SarvamLLMService` with support for `sarvam-30b`, `sarvam-30b-16k`,
+  `sarvam-105b` and `sarvam-105b-32k`.
+  (PR [#3978](https://github.com/pipecat-ai/pipecat/pull/3978))
+
+- Added `on_turn_context_created(context_id)` hook to `TTSService`. Override
+  this to perform provider-specific setup (e.g. eagerly opening a server-side
+  context) before text starts flowing. Called each time a new turn context ID
+  is created.
+  (PR [#4013](https://github.com/pipecat-ai/pipecat/pull/4013))
+
+- Added `XAIHttpTTSService` for text-to-speech using xAI's HTTP TTS API.
+  (PR [#4031](https://github.com/pipecat-ai/pipecat/pull/4031))
+
+- Added support for "developer" role messages in conversation context across
+  all LLM adapters. For non-OpenAI services (Anthropic, Google, AWS Bedrock),
+  "developer" messages are converted to "user" messages (use
+  `system_instruction` to set the system instruction). For OpenAI services,
+  "developer" messages pass through in conversation history. For the Responses
+  API, they are kept as "developer" role (matching the existing "system" →
+  "developer" conversion).
+  (PR [#4089](https://github.com/pipecat-ai/pipecat/pull/4089))
+
+- Added `SmallestTTSService`, a WebSocket-based TTS service integration with
+  Smallest AI's Waves API. Supports the Lightning v2 and v3.1 models with
+  configurable voice, language, speed, consistency, similarity, and enhancement
+  settings.
+  (PR [#4092](https://github.com/pipecat-ai/pipecat/pull/4092))
+
+- Added warnings in turn stop strategies when `VADParams.stop_secs` differs
+  from the recommended default (0.2s) or when `stop_secs >= STT p99 latency`,
+  which collapses the STT wait timeout to 0s and may cause delayed turn
+  detection. The warnings guide developers to re-run the
+  [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark) with their VAD
+  settings.
+  (PR [#4115](https://github.com/pipecat-ai/pipecat/pull/4115))
+
+- Added `domain` parameter to `AssemblyAISTTSettings` for specialized
+  recognition modes such as Medical Mode (`domain="medical-v1"`).
+  (PR [#4117](https://github.com/pipecat-ai/pipecat/pull/4117))
+
+- Added `NovitaLLMService` for using Novita AI's LLM models via their
+  OpenAI-compatible API.
+  (PR [#4119](https://github.com/pipecat-ai/pipecat/pull/4119))
+
+- Added `cleanup()` method to `VADAnalyzer` and `VADController` so VAD analyzer
+  resources are properly released when no longer needed. Custom `VADAnalyzer`
+  subclasses can override `cleanup()` to free any held resources.
+  (PR [#4120](https://github.com/pipecat-ai/pipecat/pull/4120))
+
+- Added `on_end_of_turn` event handler to `AssemblyAISTTService`. This fires
+  after the final transcript is pushed, providing a reliable hook for
+  end-of-turn logic that doesn't race with `TranscriptionFrame`. Works in both
+  Pipecat and AssemblyAI turn detection modes.
+  (PR [#4128](https://github.com/pipecat-ai/pipecat/pull/4128))
+
+- Added `DeepgramFluxSageMakerSTTService` for running Deepgram Flux
+  speech-to-text on AWS SageMaker endpoints. Use with
+  `ExternalUserTurnStrategies` to take advantage of Flux's turn detection.
+  (PR [#4143](https://github.com/pipecat-ai/pipecat/pull/4143))
+
+- Added `Mem0MemoryService.get_memories()` convenience method for retrieving
+  all stored memories outside the pipeline (e.g. to build a personalized
+  greeting at connection time). This avoids the need to manually handle client
+  type branching, filter construction, and async wrapping.
+  (PR [#4156](https://github.com/pipecat-ai/pipecat/pull/4156))
+
+### Changed
+
+- Added context prewarming path for `InworldTTSService` to improve first audio
+  latency.
+  (PR [#4013](https://github.com/pipecat-ai/pipecat/pull/4013))
+
+- Added `KrispVivaVadAnalyzer` for Voice Activity Detection using the Krisp
+  VIVA SDK (requires `krisp_audio`).
+  (PR [#4022](https://github.com/pipecat-ai/pipecat/pull/4022))
+
+- Modified `InworldTTSService` to close context at end of turn instead of
+  relying on idle timeout.
+  (PR [#4028](https://github.com/pipecat-ai/pipecat/pull/4028))
+
+- Added Gemini 3 support to the Gemini Live service.
+  (PR [#4078](https://github.com/pipecat-ai/pipecat/pull/4078))
+
+- `TTSService`: the default `stop_frame_timeout_s` (idle time before an
+  automatic `TTSStoppedFrame` is pushed when `push_stop_frames=True`) has
+  changed from `2.0` to `3.0` seconds.
+  (PR [#4084](https://github.com/pipecat-ai/pipecat/pull/4084))
+
+- ⚠️ `GeminiLLMAdapter` now only treats `messages[0]` as the initial system
+  message, matching all other adapters. Previously it searched for the first
+  "system" message anywhere in the conversation history. A "system" message
+  appearing later in the list will now be converted to "user" instead of being
+  extracted as the system instruction.
+  (PR [#4089](https://github.com/pipecat-ai/pipecat/pull/4089))
+
+- Fixed `InworldTTSService` to fall back to full text when TTS timestamps are
+  not received.
+  (PR [#4113](https://github.com/pipecat-ai/pipecat/pull/4113))
+
+- ⚠️ Realtime services (Gemini Live, OpenAI Realtime, Grok Realtime, Nova
+  Sonic) now prefer `system_instruction` from service settings over an initial
+  system message in the LLM context, matching the behavior of non-realtime
+  services. Previously, context-provided system instructions took precedence. A
+  warning is now logged when both are set.
+  (PR [#4130](https://github.com/pipecat-ai/pipecat/pull/4130))
+
+- Bumped `nvidia-riva-client` minimum version to `>=2.25.1`.
+  (PR [#4136](https://github.com/pipecat-ai/pipecat/pull/4136))
+
+- Upgraded `protobuf` from 5.x to 6.x (`>=6.31.1,<7`).
+  (PR [#4136](https://github.com/pipecat-ai/pipecat/pull/4136))
+
+- Unrecognized language strings (e.g. Deepgram's `"multi"`) no longer produce a
+  warning at startup. The log message has been downgraded to debug level since
+  these are valid service-specific values that are passed through correctly.
+  (PR [#4137](https://github.com/pipecat-ai/pipecat/pull/4137))
+
+- `GrokLLMService` and `GrokRealtimeLLMService` now live in the
+  `pipecat.services.xai` module alongside `XAIHttpTTSService`, since all three
+  use the same xAI API. Update imports from `pipecat.services.grok.*` to
+  `pipecat.services.xai.*` (e.g. `from pipecat.services.xai.llm import
+  GrokLLMService`).
+  (PR [#4142](https://github.com/pipecat-ai/pipecat/pull/4142))
+
+- ⚠️ Bumped `mem0ai` dependency from `~=0.1.94` to `>=1.0.8,<2`. Users of the
+  `mem0` extra will need to update their mem0ai package.
+  (PR [#4156](https://github.com/pipecat-ai/pipecat/pull/4156))
+
+### Deprecated
+
+- `pipecat.services.grok.llm`, `pipecat.services.grok.realtime.llm`, and
+  `pipecat.services.grok.realtime.events` are deprecated. The old import paths
+  still work but emit a `DeprecationWarning`; use `pipecat.services.xai.llm`,
+  `pipecat.services.xai.realtime.llm`, and
+  `pipecat.services.xai.realtime.events` instead.
+  (PR [#4142](https://github.com/pipecat-ai/pipecat/pull/4142))
+
+### Removed
+
+- ⚠️ `TTSService.add_word_timestamps()` no longer supports the `"Reset"` and
+  `"TTSStoppedFrame"` sentinel strings. If you have a custom TTS service that
+  called `await self.add_word_timestamps([("Reset", 0)])` or `await
+  self.add_word_timestamps([("TTSStoppedFrame", 0), ("Reset", 0)], ctx_id)`,
+  replace them with `await self.append_to_audio_context(ctx_id,
+  TTSStoppedFrame(context_id=ctx_id))` and let `_handle_audio_context` manage
+  the word-timestamp reset automatically.
+  (PR [#4145](https://github.com/pipecat-ai/pipecat/pull/4145))
+
+- Removed `SambaNovaSTTService`. SambaNova no longer offers speech-to-text
+  audio models. Use another STT provider instead.
+  (PR [#4154](https://github.com/pipecat-ai/pipecat/pull/4154))
+
+### Fixed
+
+- Fixed Gemini Live (`GoogleGeminiLiveLLMService`) not honoring
+  `settings.system_instruction`. The system instruction was being read from a
+  deprecated constructor parameter instead of the settings object, causing it
+  to be silently ignored.
+  (PR [#4089](https://github.com/pipecat-ai/pipecat/pull/4089))
+
+- Fixed `AWSBedrockLLMAdapter` sending an empty message list to the API when
+  the only message in context was a system message. The lone system message is
+  now converted to "user" role instead of being extracted, matching the
+  existing Anthropic adapter behavior.
+  (PR [#4089](https://github.com/pipecat-ai/pipecat/pull/4089))
+
+- Fixed Gemini Live pipeline hanging indefinitely when an `EndFrame` was
+  deferred while waiting for the bot to finish responding and `turn_complete`
+  never arrived. As a possible root-cause fix, `turn_complete` messages are now
+  handled even if they lack `usage_metadata`. As a fallback, the deferred
+  `EndFrame` now has a 30-second safety timeout.
+  (PR [#4125](https://github.com/pipecat-ai/pipecat/pull/4125))
+
+- Fixed ElevenLabs WebSocket disconnections (1008 "Maximum simultaneous
+  contexts exceeded") caused by rapid user interruptions. When interruptions
+  arrived before any TTS text was generated, phantom contexts were created on
+  the ElevenLabs server that were never closed, eventually exceeding the
+  5-context limit.
+  (PR [#4126](https://github.com/pipecat-ai/pipecat/pull/4126))
+
+- Fixed the final sentence being dropped from the conversation context when
+  using RTVI text input with non-word-timestamp TTS services. The
+  `LLMFullResponseEndFrame` was racing ahead of the last `TTSTextFrame`,
+  causing the `LLMAssistantAggregator` to finalize the context before the final
+  sentence arrived.
+  (PR [#4127](https://github.com/pipecat-ai/pipecat/pull/4127))
+
+- Fixed audio crackling and popping in recordings when both user and bot are
+  speaking. `AudioBufferProcessor` no longer injects silence into a track's
+  buffer while that track is actively producing audio, preventing mid-utterance
+  interruptions in the recorded output.
+  (PR [#4135](https://github.com/pipecat-ai/pipecat/pull/4135))
+
+- Fixed websocket TTS word timestamps so interrupted contexts cannot leak stale
+  words or backward PTS values into later turns.
+  (PR [#4145](https://github.com/pipecat-ai/pipecat/pull/4145))
+
+- Fixed a race condition in `InterruptibleTTSService` where, if `run_tts` had
+  been invoked but `BotStartedSpeakingFrame` had not yet been received, a user
+  interruption could allow stale audio to leak through.
+  (PR [#4145](https://github.com/pipecat-ai/pipecat/pull/4145))
+
+- Fixed Gemini Live local VAD mode (`GeminiVADParams(disabled=True)` with
+  external VAD) not working. The bot now correctly detects user speech and
+  signals turn boundaries to the Gemini API.
+  (PR [#4146](https://github.com/pipecat-ai/pipecat/pull/4146))
+
+- Fixed Gemini Live message handling to process all `server_content` fields
+  independently. Gemini 3.x can bundle multiple fields (e.g. `model_turn` and
+  `output_transcription`) on the same message, but the previous `elif` chain
+  only processed the first match, silently dropping the rest.
+  (PR [#4147](https://github.com/pipecat-ai/pipecat/pull/4147))
+
+- Fixed `ServiceSwitcher` with `ServiceSwitcherStrategyFailover` incorrectly
+  triggering failover when `ErrorFrame`s from other pipeline stages (e.g. TTS)
+  propagated upstream through the switcher. Previously, any non-fatal error
+  passing through would be misattributed to the active service and trigger an
+  unwanted service switch. Now only errors originating from the switcher's own
+  managed services trigger failover.
+  (PR [#4149](https://github.com/pipecat-ai/pipecat/pull/4149))
+
+- Fixed `LiveKitOutputTransport` not clearing the `rtc.AudioSource` internal
+  buffer on interruption, causing the bot to continue speaking for several
+  seconds after being interrupted.
+  (PR [#4151](https://github.com/pipecat-ai/pipecat/pull/4151))
+
+- Fixed a crash in OpenAI LLM processing when the provider returns
+  `chunk.choices[0].delta.audio = None`, which caused `'NoneType' object has no
+  attribute 'get'` errors during audio transcript handling.
+  (PR [#4152](https://github.com/pipecat-ai/pipecat/pull/4152))
+
+- Fixed error floods in `DeepgramSTTService` when the WebSocket connection
+  drops. With Deepgram SDK 6.x, `send_media()` raises exceptions on a dead
+  connection instead of silently failing, causing every queued audio frame to
+  log an error. Now `send_media()` failures are caught gracefully — a single
+  warning is logged and audio frames are skipped until the existing
+  reconnection logic restores the connection.
+  (PR [#4153](https://github.com/pipecat-ai/pipecat/pull/4153))
+
+- `Mem0MemoryService` no longer blocks the event loop during memory storage and
+  retrieval. All Mem0 API calls now run in a background thread, and message
+  storage is fire-and-forget so it doesn't delay downstream processing.
+  (PR [#4156](https://github.com/pipecat-ai/pipecat/pull/4156))
+
+- Fixed `Mem0MemoryService` failing to store messages when the context
+  contained system or developer role messages. The Mem0 API only accepts user
+  and assistant roles, so other roles are now filtered out before storing.
+  (PR [#4156](https://github.com/pipecat-ai/pipecat/pull/4156))
+
+- Added missing `on_dtmf_event` callback to `LemonSliceTransportClient.setup()`
+  `DailyCallbacks` construction, fixing a `ValidationError` at pipeline setup
+  time.
+  (PR [#4161](https://github.com/pipecat-ai/pipecat/pull/4161))
+
+- Fixed an issue in `InworldTTSService` where, in cases of fast interruption,
+  we would continue receiving audio from the previous context.
+  (PR [#4167](https://github.com/pipecat-ai/pipecat/pull/4167))
+
+- Fixed a word timestamp interleaving issue in `InworldTTSService` when
+  processing multiple sentences.
+  (PR [#4167](https://github.com/pipecat-ai/pipecat/pull/4167))
+
+- Fixed duplicate `TTSStoppedFrame` being pushed in TTS services using
+  `push_stop_frames=True`. When the stop-frame timeout fired, a second
+  `TTSStoppedFrame` could be pushed after the normal one at context completion.
+  (PR [#4172](https://github.com/pipecat-ai/pipecat/pull/4172))
+
+- ⚠️ Fixed `DeepgramSTTService` compatibility with deepgram-sdk 6.1.0. The SDK
+  now requires explicit message objects for `send_keep_alive()`,
+  `send_close_stream()`, and `send_finalize()`. The minimum deepgram-sdk
+  version is now 6.1.0.
+  (PR [#4174](https://github.com/pipecat-ai/pipecat/pull/4174))
+
+- Fixed RTVI events not being delivered to clients when using WebSocket
+  transports. `ProtobufFrameSerializer` now sets `ignore_rtvi_messages=False`
+  by default.
+  (PR [#4176](https://github.com/pipecat-ai/pipecat/pull/4176))
+
+- Fixed a timing issue where turn detection timer tasks (idle controller,
+  speech timeout, turn analyzer, and turn completion) could miss their first
+  tick because the newly created asyncio task was not yet scheduled when the
+  caller continued.
+  (PR [#4183](https://github.com/pipecat-ai/pipecat/pull/4183))
+
+- Fixed `FastAPIWebsocketTransport` intermittently hanging on shutdown when the
+  remote side (e.g. Twilio) disconnects while audio is being sent. A race
+  condition between the send and receive paths could cause the
+  `on_client_disconnected` callback to be skipped, leaving the pipeline waiting
+  for a disconnect signal that never came.
+  (PR [#4186](https://github.com/pipecat-ai/pipecat/pull/4186))
+
+### Performance
+
+- `RimeTTSService` now handles Rime's `done` WebSocket message to complete
+  audio contexts immediately, eliminating the 3-second idle timeout that
+  previously added latency at the end of each utterance.
+  (PR [#4172](https://github.com/pipecat-ai/pipecat/pull/4172))
+
 ## [0.0.107] - 2026-03-23

 ### Added
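The role handling described in the 0.0.108 notes above (only `messages[0]` is treated as the system message; later "system" messages and all "developer" messages become "user" for non-OpenAI adapters; a lone system message is kept as "user" so the list is never empty) can be sketched as follows. This is a minimal illustration, not pipecat's adapter code: `normalize_roles` and the plain-dict message shape are hypothetical, and applying the lone-system-message rule uniformly is an assumption (the notes state it only for the Bedrock and Anthropic adapters).

```python
from typing import Dict, List, Optional, Tuple

Message = Dict[str, str]


def normalize_roles(messages: List[Message]) -> Tuple[Optional[str], List[Message]]:
    """Hypothetical helper illustrating the role rules; not a pipecat API."""
    system_instruction: Optional[str] = None
    rest = list(messages)

    # Only the first message may be extracted as the system instruction,
    # and only if other messages remain afterwards.
    if len(rest) > 1 and rest[0]["role"] == "system":
        system_instruction = rest[0]["content"]
        rest = rest[1:]

    converted: List[Message] = []
    for message in rest:
        # Later "system" messages and all "developer" messages become "user".
        role = "user" if message["role"] in ("system", "developer") else message["role"]
        converted.append({**message, "role": role})
    return system_instruction, converted
```

With this sketch, a "system" message that appears mid-conversation is returned as a "user" message rather than replacing the extracted instruction.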
--- a/changelog/3978.added.md
+++ b/changelog/3978.added.md
@@ -1 +0,0 @@
- Added `SarvamLLMService` with support for `sarvam-30b`, `sarvam-30b-16k`, `sarvam-105b` and `sarvam-105b-32k`
--- a/changelog/4013.added.md
+++ b/changelog/4013.added.md
@@ -1 +0,0 @@
- Added `on_turn_context_created(context_id)` hook to `TTSService`. Override this to perform provider-specific setup (e.g. eagerly opening a server-side context) before text starts flowing. Called each time a new turn context ID is created.
--- a/changelog/4013.changed.md
+++ b/changelog/4013.changed.md
@@ -1 +0,0 @@
- Added context prewarming path for `InworldTTSService` to improve first audio latency
--- a/changelog/4022.changed.md
+++ b/changelog/4022.changed.md
@@ -1 +0,0 @@
- Added `KrispVivaVadAnalyzer` for Voice Activity Detection using the Krisp VIVA SDK (requires `krisp_audio`).
--- a/changelog/4028.changed.md
+++ b/changelog/4028.changed.md
@@ -1 +0,0 @@
- Modeified `InworldTTSService` to close context at end of turn instead of relying on idle timeout
--- a/changelog/4031.added.md
+++ b/changelog/4031.added.md
@@ -1 +0,0 @@
- Added `XAIHttpTTSService` for text-to-speech using xAI's HTTP TTS API.
--- a/changelog/4078.changed.md
+++ b/changelog/4078.changed.md
@@ -1 +0,0 @@
- Added Gemini 3 support to the Gemini Live service.
--- a/changelog/4084.changed.md
+++ b/changelog/4084.changed.md
@@ -1 +0,0 @@
- `TTSService`: the default `stop_frame_timeout_s` (idle time before an automatic `TTSStoppedFrame` is pushed when `push_stop_frames=True`) has changed from `2.0` to `3.0` seconds.
--- a/changelog/4089.added.md
+++ b/changelog/4089.added.md
@@ -1 +0,0 @@
- Added support for "developer" role messages in conversation context across all LLM adapters. For non-OpenAI services (Anthropic, Google, AWS Bedrock), "developer" messages are converted to "user" messages (use `system_instruction` to set the system instruction). For OpenAI services, "developer" messages pass through in conversation history. For the Responses API, they are kept as "developer" role (matching the existing "system" → "developer" conversion).
--- a/changelog/4089.changed.md
+++ b/changelog/4089.changed.md
@@ -1 +0,0 @@
- ⚠️ `GeminiLLMAdapter` now only treats `messages[0]` as the initial system message, matching all other adapters. Previously it searched for the first "system" message anywhere in the conversation history. A "system" message appearing later in the list will now be converted to "user" instead of being extracted as the system instruction.
--- a/changelog/4089.fixed.2.md
+++ b/changelog/4089.fixed.2.md
@@ -1 +0,0 @@
- Fixed Gemini Live (`GoogleGeminiLiveLLMService`) not honoring `settings.system_instruction`. The system instruction was being read from a deprecated constructor parameter instead of the settings object, causing it to be silently ignored.
--- a/changelog/4089.fixed.md
+++ b/changelog/4089.fixed.md
@@ -1 +0,0 @@
- Fixed `AWSBedrockLLMAdapter` sending an empty message list to the API when the only message in context was a system message. The lone system message is now converted to "user" role instead of being extracted, matching the existing Anthropic adapter behavior.
--- a/changelog/4092.added.md
+++ b/changelog/4092.added.md
@@ -1 +0,0 @@
- Added `SmallestTTSService`, a WebSocket-based TTS service integration with Smallest AI's Waves API. Supports the Lightning v2 and v3.1 models with configurable voice, language, speed, consistency, similarity, and enhancement settings.
--- a/changelog/4113.changed.md
+++ b/changelog/4113.changed.md
@@ -1 +0,0 @@
- Fixed `InworldTtsService` to fallback to full text when TTS timestamps are not received
--- a/changelog/4115.added.md
+++ b/changelog/4115.added.md
@@ -1 +0,0 @@
- Added warnings in turn stop strategies when `VADParams.stop_secs` differs from the recommended default (0.2s) or when `stop_secs >= STT p99 latency`, which collapses the STT wait timeout to 0s and may cause delayed turn detection. The warnings guide developers to re-run the [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark) with their VAD settings.
--- a/changelog/4117.added.md
+++ b/changelog/4117.added.md
@@ -1 +0,0 @@
- Added `domain` parameter to `AssemblyAISTTSettings` for specialized recognition modes such as Medical Mode (`domain="medical-v1"`).
--- a/changelog/4119.added.md
+++ b/changelog/4119.added.md
@@ -1 +0,0 @@
- Added `NovitaLLMService` for using Novita AI's LLM models via their OpenAI-compatible API.
--- a/changelog/4120.added.md
+++ b/changelog/4120.added.md
@@ -1 +0,0 @@
- Added `cleanup()` method to `VADAnalyzer` and `VADController` so VAD analyzer resources are properly released when no longer needed. Custom `VADAnalyzer` subclasses can override `cleanup()` to free any held resources.
--- a/changelog/4125.fixed.md
+++ b/changelog/4125.fixed.md
@@ -1 +0,0 @@
- Fixed Gemini Live pipeline hanging indefinitely when an `EndFrame` was deferred while waiting for the bot to finish responding and `turn_complete` never arrived. As a possible root-cause fix, `turn_complete` messages are now handled even if they lack `usage_metadata`. As a fallback, the deferred `EndFrame` now has a 30-second safety timeout.
--- a/changelog/4126.fixed.md
+++ b/changelog/4126.fixed.md
@@ -1 +0,0 @@
- Fixed ElevenLabs WebSocket disconnections (1008 "Maximum simultaneous contexts exceeded") caused by rapid user interruptions. When interruptions arrived before any TTS text was generated, phantom contexts were created on the ElevenLabs server that were never closed, eventually exceeding the 5-context limit.
--- a/changelog/4127.fixed.md
+++ b/changelog/4127.fixed.md
@@ -1 +0,0 @@
- Fixed the final sentence being dropped from the conversation context when using RTVI text input with non-word-timestamp TTS services. The `LLMFullResponseEndFrame` was racing ahead of the last `TTSTextFrame`, causing the `LLMAssistantAggregator` to finalize the context before the final sentence arrived.
--- a/changelog/4128.added.md
+++ b/changelog/4128.added.md
@@ -1 +0,0 @@
- Added `on_end_of_turn` event handler to `AssemblyAISTTService`. This fires after the final transcript is pushed, providing a reliable hook for end-of-turn logic that doesn't race with `TranscriptionFrame`. Works in both Pipecat and AssemblyAI turn detection modes.
--- a/changelog/4130.changed.md
+++ b/changelog/4130.changed.md
@@ -1 +0,0 @@
- ⚠️ Realtime services (Gemini Live, OpenAI Realtime, Grok Realtime, Nova Sonic) now prefer `system_instruction` from service settings over an initial system message in the LLM context, matching the behavior of non-realtime services. Previously, context-provided system instructions took precedence. A warning is now logged when both are set.
--- a/changelog/4135.fixed.md
+++ b/changelog/4135.fixed.md
@@ -1 +0,0 @@
- Fixed audio crackling and popping in recordings when both user and bot are speaking. `AudioBufferProcessor` no longer injects silence into a track's buffer while that track is actively producing audio, preventing mid-utterance interruptions in the recorded output.
--- a/changelog/4136.changed.2.md
+++ b/changelog/4136.changed.2.md
@@ -1 +0,0 @@
- Bumped `nvidia-riva-client` minimum version to `>=2.25.1`.
--- a/changelog/4136.changed.md
+++ b/changelog/4136.changed.md
@@ -1 +0,0 @@
- Upgraded `protobuf` from 5.x to 6.x (`>=6.31.1,<7`).
--- a/changelog/4137.changed.md
+++ b/changelog/4137.changed.md
@@ -1 +0,0 @@
- Unrecognized language strings (e.g. Deepgram's `"multi"`) no longer produce a warning at startup. The log message has been downgraded to debug level since these are valid service-specific values that are passed through correctly.
--- a/changelog/4142.changed.md
+++ b/changelog/4142.changed.md
@@ -1 +0,0 @@
- `GrokLLMService` and `GrokRealtimeLLMService` now live in the `pipecat.services.xai` module alongside `XAIHttpTTSService`, since all three use the same xAI API. Update imports from `pipecat.services.grok.*` to `pipecat.services.xai.*` (e.g. `from pipecat.services.xai.llm import GrokLLMService`).
--- a/changelog/4142.deprecated.md
+++ b/changelog/4142.deprecated.md
@@ -1 +0,0 @@
- `pipecat.services.grok.llm`, `pipecat.services.grok.realtime.llm`, and `pipecat.services.grok.realtime.events` are deprecated. The old import paths still work but emit a `DeprecationWarning`; use `pipecat.services.xai.llm`, `pipecat.services.xai.realtime.llm`, and `pipecat.services.xai.realtime.events` instead.
--- a/changelog/4143.added.md
+++ b/changelog/4143.added.md
@@ -1 +0,0 @@
- Added `DeepgramFluxSageMakerSTTService` for running Deepgram Flux speech-to-text on AWS SageMaker endpoints.  Use with `ExternalUserTurnStrategies` to take advantage of Flux's turn detection.
--- a/changelog/4145.fixed.2.md
+++ b/changelog/4145.fixed.2.md
@@ -1 +0,0 @@
- Fixed websocket TTS word timestamps so interrupted contexts cannot leak stale words or backward PTS values into later turns.
--- a/changelog/4145.fixed.md
+++ b/changelog/4145.fixed.md
@@ -1 +0,0 @@
- Fixed a race condition in `InterruptibleTTSService` where, if `run_tts` had been invoked but `BotStartedSpeakingFrame` had not yet been received, a user interruption could allow stale audio to leak through.
--- a/changelog/4145.removed.md
+++ b/changelog/4145.removed.md
@@ -1 +0,0 @@
- ⚠️ `TTSService.add_word_timestamps()` no longer supports the `"Reset"` and `"TTSStoppedFrame"` sentinel strings. If you have a custom TTS service that called `await self.add_word_timestamps([("Reset", 0)])` or `await self.add_word_timestamps([("TTSStoppedFrame", 0), ("Reset", 0)], ctx_id)`, replace them with `await self.append_to_audio_context(ctx_id, TTSStoppedFrame(context_id=ctx_id))` and let `_handle_audio_context` manage the word-timestamp reset automatically.
--- a/changelog/4146.fixed.md
+++ b/changelog/4146.fixed.md
@@ -1 +0,0 @@
- Fixed Gemini Live local VAD mode (`GeminiVADParams(disabled=True)` with external VAD) not working. The bot now correctly detects user speech and signals turn boundaries to the Gemini API.
--- a/changelog/4147.fixed.md
+++ b/changelog/4147.fixed.md
@@ -1 +0,0 @@
- Fixed Gemini Live message handling to process all `server_content` fields independently. Gemini 3.x can bundle multiple fields (e.g. `model_turn` and `output_transcription`) on the same message, but the previous `elif` chain only processed the first match, silently dropping the rest.
--- a/changelog/4149.fixed.md
+++ b/changelog/4149.fixed.md
@@ -1 +0,0 @@
- Fixed `ServiceSwitcher` with `ServiceSwitcherStrategyFailover` incorrectly triggering failover when `ErrorFrame`s from other pipeline stages (e.g. TTS) propagated upstream through the switcher. Previously, any non-fatal error passing through would be misattributed to the active service and trigger an unwanted service switch. Now only errors originating from the switcher's own managed services trigger failover.
--- a/changelog/4151.fixed.md
+++ b/changelog/4151.fixed.md
@@ -1 +0,0 @@
- Fixed `LiveKitOutputTransport` not clearing the `rtc.AudioSource` internal buffer on interruption, causing the bot to continue speaking for several seconds after being interrupted.
--- a/changelog/4152.fixed.md
+++ b/changelog/4152.fixed.md
@@ -1 +0,0 @@
- Fixed a crash in OpenAI LLM processing when the provider returns `chunk.choices[0].delta.audio = None`, which caused `'NoneType' object has no attribute 'get'` errors during audio transcript handling.
--- a/changelog/4153.fixed.md
+++ b/changelog/4153.fixed.md
@@ -1 +0,0 @@
- Fixed error floods in `DeepgramSTTService` when the WebSocket connection drops. With Deepgram SDK 6.x, `send_media()` raises exceptions on a dead connection instead of silently failing, causing every queued audio frame to log an error. Now `send_media()` failures are caught gracefully — a single warning is logged and audio frames are skipped until the existing reconnection logic restores the connection.
--- a/changelog/4154.removed.md
+++ b/changelog/4154.removed.md
@@ -1 +0,0 @@
- Removed `SambaNovaSTTService`. SambaNova no longer offers speech-to-text audio models. Use another STT provider instead.
--- a/changelog/4156.added.md
+++ b/changelog/4156.added.md
@@ -1 +0,0 @@
- Added `Mem0MemoryService.get_memories()` convenience method for retrieving all stored memories outside the pipeline (e.g. to build a personalized greeting at connection time). This avoids the need to manually handle client type branching, filter construction, and async wrapping.
--- a/changelog/4156.changed.md
+++ b/changelog/4156.changed.md
@@ -1 +0,0 @@
- ⚠️ Bumped `mem0ai` dependency from `~=0.1.94` to `>=1.0.8,<2`. Users of the `mem0` extra will need to update their mem0ai package.
--- a/changelog/4156.fixed.2.md
+++ b/changelog/4156.fixed.2.md
@@ -1 +0,0 @@
- Fixed `Mem0MemoryService` failing to store messages when the context contained system or developer role messages. The Mem0 API only accepts user and assistant roles, so other roles are now filtered out before storing.
--- a/changelog/4156.fixed.md
+++ b/changelog/4156.fixed.md
@@ -1 +0,0 @@
- `Mem0MemoryService` no longer blocks the event loop during memory storage and retrieval. All Mem0 API calls now run in a background thread, and message storage is fire-and-forget so it doesn't delay downstream processing.
--- a/changelog/4161.fixed.md
+++ b/changelog/4161.fixed.md
@@ -1 +0,0 @@
- Added missing `on_dtmf_event` callback to `LemonSliceTransportClient.setup()` `DailyCallbacks` construction, fixing a `ValidationError` at pipeline setup time.
--- a/changelog/4167.fixed.2.md
+++ b/changelog/4167.fixed.2.md
@@ -1 +0,0 @@
- Fixed an issue in `InworldTTSService` where, in cases of fast interruption, we would continue receiving audio from the previous context.
--- a/changelog/4167.fixed.md
+++ b/changelog/4167.fixed.md
@@ -1 +0,0 @@
- Fixed a word timestamp interleaving issue in `InworldTTSService` when processing multiple sentences.
--- a/changelog/4172.fixed.md
+++ b/changelog/4172.fixed.md
@@ -1 +0,0 @@
- Fixed duplicate `TTSStoppedFrame` being pushed in TTS services using `push_stop_frames=True`. When the stop-frame timeout fired, a second `TTSStoppedFrame` could be pushed after the normal one at context completion.
--- a/changelog/4172.performance.md
+++ b/changelog/4172.performance.md
@@ -1 +0,0 @@
- `RimeTTSService` now handles Rime's `done` WebSocket message to complete audio contexts immediately, eliminating the 3-second idle timeout that previously added latency at the end of each utterance.
--- a/changelog/4174.fixed.md
+++ b/changelog/4174.fixed.md
@@ -1 +0,0 @@
- ⚠️ Fixed `DeepgramSTTService` compatibility with deepgram-sdk 6.1.0. The SDK now requires explicit message objects for `send_keep_alive()`, `send_close_stream()`, and `send_finalize()`. The minimum deepgram-sdk version is now 6.1.0.
--- a/examples/foundational/07z-interruptible-sarvam-http.py
+++ b/examples/foundational/07z-interruptible-sarvam-http.py
@@ -111,7 +111,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
            logger.info(f"Client connected")
            # Kick off the conversation.
            context.add_message(
-                {"role": "developer", "content": "Please introduce yourself to the user."}
+                {"role": "user", "content": "Please introduce yourself to the user."}
            )
            await task.queue_frames([LLMRunFrame()])

--- a/examples/foundational/07z-interruptible-sarvam.py
+++ b/examples/foundational/07z-interruptible-sarvam.py
@@ -104,9 +104,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    async def on_client_connected(transport, client):
        logger.info(f"Client connected")
        # Kick off the conversation.
-        context.add_message(
-            {"role": "developer", "content": "Please introduce yourself to the user."}
-        )
+        context.add_message({"role": "user", "content": "Please introduce yourself to the user."})
        await task.queue_frames([LLMRunFrame()])

        # Optionally, you can wait for 30 seconds and then change the voice.
--- a/examples/foundational/26i-gemini-live-graceful-end.py
+++ b/examples/foundational/26i-gemini-live-graceful-end.py
@@ -51,6 +51,12 @@ async def end_conversation(params: FunctionCallParams):
    await params.llm.push_frame(EndTaskFrame(), FrameDirection.UPSTREAM)


+# NOTE: we can ask the model to say something *after* the call to
+# end_conversation because GeminiLiveLLMService defers processing EndFrames
+# until after the bot finishes its current turn. With Gemini 3.1 Flash Live,
+# the model won't reliably report ending its turn until after it says something
+# following the tool call, which is why the system instruction is structured
+# the way it is.
 system_instruction = """
 You are a helpful assistant who can answer questions and use tools.

@@ -59,9 +65,10 @@ You have three tools available to you:
 2. get_restaurant_recommendation: Use this tool to get a restaurant recommendation in a specific location.
 3. end_conversation: Use this tool to gracefully end the conversation.

-After you've responded to the user three times, do two things, in order:
-1. Politely let them know that that's all the time you have today and say goodbye.
-2. *WITHOUT WAITING FOR THE USER TO RESPOND*, call the end_conversation tool to gracefully end the conversation.
+After you've responded to the user three times, do the following:
+1. Politely let them know that that's all the time you have today (but don't say "goodbye" yet).
+2. Then immediately call the end_conversation function. *DO NOT FORGET TO DO THIS STEP.*
+3. After the tool reports success, say goodbye.
 """


--- a/pyproject.toml
+++ b/pyproject.toml
@@ -63,7 +63,7 @@ cartesia = [ "pipecat-ai[websockets-base]" ]
 camb = [ "camb-sdk>=1.5.4,<2" ]
 cerebras = []
 daily = [ "daily-python~=0.27.0" ]
-deepgram = [ "deepgram-sdk>=6.1.0,<7", "pipecat-ai[websockets-base]" ]
+deepgram = [ "deepgram-sdk>=6.1.1,<7", "pipecat-ai[websockets-base]" ]
 deepseek = []
 elevenlabs = [ "pipecat-ai[websockets-base]" ]
 fal = []
@@ -81,7 +81,7 @@ inworld = []
 koala = [ "pvkoala~=2.0.3" ]
 kokoro = [ "kokoro-onnx>=0.5.0,<1", "requests>=2.32.5,<3" ]
 krisp = [ "pipecat-ai-krisp~=0.4.0" ]
-langchain = [ "langchain~=0.3.20", "langchain-community~=0.3.20", "langchain-openai~=0.3.9" ]
+langchain = [ "langchain~=0.3.28", "langchain-community~=0.3.31", "langchain-openai~=0.3.29" ]
 lemonslice = [ "pipecat-ai[daily]" ]
 livekit = [ "livekit>=1.0.13,<2", "livekit-api>=1.0.5,<2", "tenacity>=8.2.3,<10.0.0", "pyjwt>=2.12.0,<3" ]
 lmnt = [ "pipecat-ai[websockets-base]" ]
--- a/src/pipecat/serializers/protobuf.py
+++ b/src/pipecat/serializers/protobuf.py
@@ -68,6 +68,11 @@ class ProtobufFrameSerializer(FrameSerializer):
            params: Configuration parameters.
        """
        super().__init__(params)
+        # The base serializer defaults to filtering out RTVI protocol messages
+        # to avoid sending them over telephony media streams. ProtobufFrameSerializer
+        # is used by WebSocket transports, which are the delivery channel for
+        # these messages, so we disable the filter.
+        self._params.ignore_rtvi_messages = False

    async def serialize(self, frame: Frame) -> str | bytes | None:
        """Serialize a frame to Protocol Buffer binary format.
--- a/src/pipecat/services/google/gemini_live/llm.py
+++ b/src/pipecat/services/google/gemini_live/llm.py
@@ -1238,6 +1238,14 @@ class GeminiLiveLLMService(LLMService):
            self._end_frame_deferral_timeout_task.cancel()
        self._end_frame_deferral_timeout_task = None

+    def _get_history_config(self) -> Optional[HistoryConfig]:
+        """Return the history config for the Live API connection.
+
+        Subclasses can override this to disable history config (e.g. Vertex AI
+        does not support it).
+        """
+        return HistoryConfig(initial_history_in_client_content=True)
+
    async def _connect(self, session_resumption_handle: Optional[str] = None):
        """Establish client connection to Gemini Live API."""
        if self._session:
@@ -1273,9 +1281,13 @@ class GeminiLiveLLMService(LLMService):
                input_audio_transcription=AudioTranscriptionConfig(),
                output_audio_transcription=AudioTranscriptionConfig(),
                session_resumption=SessionResumptionConfig(handle=session_resumption_handle),
-                history_config=HistoryConfig(initial_history_in_client_content=True),
            )

+            # Add history config, if supported (not supported by Vertex)
+            history_config = self._get_history_config()
+            if history_config:
+                config.history_config = history_config
+
            # Add context window compression to configuration, if enabled
            cwc = self._settings.context_window_compression or {}
            if cwc.get("enabled", False):
--- a/src/pipecat/services/google/gemini_live/vertex/llm.py
+++ b/src/pipecat/services/google/gemini_live/vertex/llm.py
@@ -212,6 +212,10 @@ class GeminiLiveVertexLLMService(GeminiLiveLLMService):
            **kwargs,
        )

+    def _get_history_config(self):
+        """Vertex AI does not support history_config."""
+        return None
+
    def create_client(self):
        """Create the Gemini client instance."""
        self._client = Client(
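The two hunks above replace a hard-coded `history_config` argument with an overridable hook, so a subclass can opt out by returning `None`. A minimal sketch of that pattern follows. The class names, the `build_config` method, and the plain-dict config are hypothetical stand-ins for illustration only; they are not the Pipecat or Google GenAI types.

```python
from typing import Optional


class BaseLiveService:
    """Hypothetical base service that builds a connection config."""

    def _get_history_config(self) -> Optional[dict]:
        # Default: send the history config (a dict stands in for HistoryConfig).
        return {"initial_history_in_client_content": True}

    def build_config(self) -> dict:
        config = {"session_resumption": None}
        # Attach the history config only when the hook returns one.
        history_config = self._get_history_config()
        if history_config:
            config["history_config"] = history_config
        return config


class VertexLikeService(BaseLiveService):
    """Hypothetical subclass for a backend that rejects history_config."""

    def _get_history_config(self) -> Optional[dict]:
        return None


if __name__ == "__main__":
    print(BaseLiveService().build_config())
    print(VertexLikeService().build_config())
```

With this shape the base class keeps a single connection path, and the backend difference is confined to one small override.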
--- a/src/pipecat/transports/websocket/fastapi.py
+++ b/src/pipecat/transports/websocket/fastapi.py
@@ -150,17 +150,9 @@ class FastAPIWebsocketClient:
                else:
                    await self._websocket.send_text(data)
        except Exception as e:
-            logger.error(
+            logger.warning(
                f"{self} exception sending data: {e.__class__.__name__} ({e}), application_state: {self._websocket.application_state}"
            )
-            # For some reason the websocket is disconnected, and we are not able to send data
-            # So let's properly handle it and disconnect the transport if it is not already disconnecting
-            if (
-                self._websocket.application_state == WebSocketState.DISCONNECTED
-                and not self.is_closing
-            ):
-                logger.warning("Closing already disconnected websocket!")
-                self._closing = True

    async def disconnect(self):
        """Disconnect the WebSocket client."""
@@ -189,7 +181,11 @@ class FastAPIWebsocketClient:

    def _can_send(self):
        """Check if data can be sent through the WebSocket."""
-        return self.is_connected and not self.is_closing
+        return (
+            self.is_connected
+            and not self.is_closing
+            and self._websocket.application_state != WebSocketState.DISCONNECTED
+        )

    @property
    def is_connected(self) -> bool:
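The change above stops `send()` from setting the closing flag on failure and instead makes `_can_send()` consult the socket's `application_state`. A rough, self-contained model of that behavior is sketched below. `State`, `FakeSocket`, and `Client` are hypothetical names invented for this sketch; the fake socket flips its own state on failure to imitate what the test docstring in this PR describes Starlette doing.

```python
import asyncio
from enum import Enum


class State(Enum):
    CONNECTED = 1
    DISCONNECTED = 2


class FakeSocket:
    """Hypothetical socket whose peer has already gone away."""

    def __init__(self) -> None:
        self.application_state = State.CONNECTED
        self.send_calls = 0

    async def send_bytes(self, data: bytes) -> None:
        self.send_calls += 1
        # Imitate the transition to DISCONNECTED on a failed send.
        self.application_state = State.DISCONNECTED
        raise ConnectionError("connection closed")


class Client:
    def __init__(self, ws: FakeSocket) -> None:
        self._ws = ws
        # Reserved for "we initiated the close"; send() never touches it.
        self._closing = False

    def _can_send(self) -> bool:
        return (
            not self._closing
            and self._ws.application_state != State.DISCONNECTED
        )

    async def send(self, data: bytes) -> None:
        if not self._can_send():
            return
        try:
            await self._ws.send_bytes(data)
        except Exception:
            # Log-and-continue in the real code; the flag stays untouched.
            pass


async def demo() -> tuple:
    ws = FakeSocket()
    client = Client(ws)
    await client.send(b"audio data")  # fails, socket becomes DISCONNECTED
    await client.send(b"more audio")  # suppressed by _can_send()
    return ws.send_calls, client._closing


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the flag is left alone, a receive loop that checks it afterwards can still report the remote disconnect.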
--- a/src/pipecat/turns/user_idle_controller.py
+++ b/src/pipecat/turns/user_idle_controller.py
@@ -143,6 +143,8 @@ class UserIdleController(BaseObject):
            self._idle_timer_expired(),
            f"{self}::idle_timer",
        )
+        # Make sure the task is scheduled.
+        await asyncio.sleep(0)

    async def _cancel_idle_timer(self):
        """Cancel the idle timer if running."""
--- a/src/pipecat/turns/user_stop/speech_timeout_user_turn_stop_strategy.py
+++ b/src/pipecat/turns/user_stop/speech_timeout_user_turn_stop_strategy.py
@@ -153,6 +153,8 @@ class SpeechTimeoutUserTurnStopStrategy(BaseUserTurnStopStrategy):
        self._timeout_task = self.task_manager.create_task(
            self._timeout_handler(timeout), f"{self}::_timeout_handler"
        )
+        # Make sure the task is scheduled.
+        await asyncio.sleep(0)

    async def _handle_transcription(self, frame: TranscriptionFrame):
        """Handle user transcription."""
@@ -174,6 +176,8 @@ class SpeechTimeoutUserTurnStopStrategy(BaseUserTurnStopStrategy):
            self._timeout_task = self.task_manager.create_task(
                self._timeout_handler(timeout), f"{self}::_timeout_handler"
            )
+            # Make sure the task is scheduled.
+            await asyncio.sleep(0)

    def _calculate_timeout(self) -> float:
        """Calculate the timeout value based on current state.
--- a/src/pipecat/turns/user_stop/turn_analyzer_user_turn_stop_strategy.py
+++ b/src/pipecat/turns/user_stop/turn_analyzer_user_turn_stop_strategy.py
@@ -193,6 +193,8 @@ class TurnAnalyzerUserTurnStopStrategy(BaseUserTurnStopStrategy):
        self._timeout_task = self.task_manager.create_task(
            self._timeout_handler(timeout), f"{self}::_timeout_handler"
        )
+        # Make sure the task is scheduled.
+        await asyncio.sleep(0)

    async def _handle_transcription(self, frame: TranscriptionFrame):
        """Handle user transcription."""
@@ -217,6 +219,8 @@ class TurnAnalyzerUserTurnStopStrategy(BaseUserTurnStopStrategy):
            self._timeout_task = self.task_manager.create_task(
                self._timeout_handler(timeout), f"{self}::_timeout_handler"
            )
+            # Make sure the task is scheduled.
+            await asyncio.sleep(0)

    async def _handle_prediction_result(self, result: Optional[MetricsData]):
        """Handle a prediction result event from the turn analyzer."""
--- a/src/pipecat/turns/user_turn_completion_mixin.py
+++ b/src/pipecat/turns/user_turn_completion_mixin.py
@@ -254,6 +254,8 @@ class UserTurnCompletionLLMServiceMixin:
            self._incomplete_timeout_handler(incomplete_type, timeout),
            f"_incomplete_timeout_{incomplete_type}",
        )
+        # Make sure the task is scheduled.
+        await asyncio.sleep(0)

    async def _cancel_incomplete_timeout(self):
        """Cancel any pending incomplete timeout task."""
--- a/tests/test_fastapi_websocket.py
+++ b/tests/test_fastapi_websocket.py
@@ -4,10 +4,17 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+import asyncio
 import unittest
-from unittest.mock import AsyncMock
+from unittest.mock import AsyncMock, PropertyMock

-from pipecat.transports.websocket.fastapi import _WebSocketMessageIterator
+from starlette.websockets import WebSocketState
+
+from pipecat.transports.websocket.fastapi import (
+    FastAPIWebsocketCallbacks,
+    FastAPIWebsocketClient,
+    _WebSocketMessageIterator,
+)


 class TestWebSocketMessageIterator(unittest.IsolatedAsyncioTestCase):
@@ -66,5 +73,134 @@ class TestWebSocketMessageIterator(unittest.IsolatedAsyncioTestCase):
        self.assertEqual(len(messages), 0)


+class TestSendDisconnectRace(unittest.IsolatedAsyncioTestCase):
+    """Tests for the race condition in issue #3912.
+
+    When the remote side disconnects while send() is in flight, send() should
+    not set _closing = True, because that flag means "we initiated the close."
+    Setting it from send() prevents the receive loop from firing
+    on_client_disconnected, which can cause the pipeline to hang.
+    """
+
+    def _make_client(self, mock_ws):
+        callbacks = FastAPIWebsocketCallbacks(
+            on_client_connected=AsyncMock(),
+            on_client_disconnected=AsyncMock(),
+            on_session_timeout=AsyncMock(),
+        )
+        client = FastAPIWebsocketClient(mock_ws, callbacks)
+        return client, callbacks
+
+    async def test_send_disconnect_does_not_set_closing(self):
+        """send() should not set _closing when the remote side disconnects."""
+        mock_ws = AsyncMock()
+        type(mock_ws).client_state = PropertyMock(return_value=WebSocketState.CONNECTED)
+        type(mock_ws).application_state = PropertyMock(return_value=WebSocketState.DISCONNECTED)
+        mock_ws.send_bytes.side_effect = Exception("connection closed")
+
+        client, _ = self._make_client(mock_ws)
+
+        await client.send(b"audio data")
+
+        self.assertFalse(client.is_closing)
+
+    async def test_send_suppressed_after_disconnect(self):
+        """After a failed send, _can_send() returns False via application_state.
+
+        Simulates real Starlette behavior: application_state starts CONNECTED,
+        transitions to DISCONNECTED when send_bytes raises (Starlette does this
+        internally on OSError before re-raising as WebSocketDisconnect).
+        """
+        mock_ws = AsyncMock()
+        type(mock_ws).client_state = PropertyMock(return_value=WebSocketState.CONNECTED)
+
+        # application_state transitions from CONNECTED → DISCONNECTED on send failure
+        app_state = {"state": WebSocketState.CONNECTED}
+        type(mock_ws).application_state = PropertyMock(side_effect=lambda: app_state["state"])
+
+        def fail_and_transition(data):
+            app_state["state"] = WebSocketState.DISCONNECTED
+            raise Exception("connection closed")
+
+        mock_ws.send_bytes.side_effect = fail_and_transition
+
+        client, _ = self._make_client(mock_ws)
+
+        # First send: _can_send() passes (app_state CONNECTED), send_bytes raises,
+        # Starlette sets app_state to DISCONNECTED
+        await client.send(b"audio data")
+        # Second send: _can_send() returns False (app_state now DISCONNECTED)
+        await client.send(b"more audio")
+
+        # send_bytes was only called once (the first attempt)
+        mock_ws.send_bytes.assert_called_once()
+
+    async def test_disconnect_callback_fires_when_send_races_receive(self):
+        """Regression test for issue #3912.
+
+        The receive loop is blocked waiting for the next message. Meanwhile,
+        send() is called and hits an exception because the remote side closed.
+        Then the receive loop unblocks and sees the disconnect.
+
+        on_client_disconnected must still fire, because the remote side
+        initiated the close — not us.
+        """
+        send_done = asyncio.Event()
+
+        mock_ws = AsyncMock()
+        type(mock_ws).client_state = PropertyMock(return_value=WebSocketState.CONNECTED)
+        type(mock_ws).application_state = PropertyMock(return_value=WebSocketState.DISCONNECTED)
+        mock_ws.send_bytes.side_effect = Exception("connection closed")
+
+        # receive() blocks until send has completed, then returns disconnect.
+        # This enforces the exact ordering that causes the bug.
+        async def mock_receive():
+            await send_done.wait()
+            return {"type": "websocket.disconnect"}
+
+        mock_ws.receive = mock_receive
+
+        client, callbacks = self._make_client(mock_ws)
+
+        # Simulate the _receive_messages logic from FastAPIWebsocketInputTransport
+        async def receive_loop():
+            try:
+                async for _ in _WebSocketMessageIterator(mock_ws):
+                    pass
+            except Exception:
+                pass
+            if not client.is_closing:
+                await client.trigger_client_disconnected()
+
+        recv_task = asyncio.create_task(receive_loop())
+
+        # Let the receive loop start and block on receive()
+        await asyncio.sleep(0)
+
+        # send() races — hits exception but does NOT set _closing
+        await client.send(b"audio data")
+        self.assertFalse(client.is_closing)
+
+        # Unblock the receive loop — it sees the disconnect
+        send_done.set()
+        await recv_task
+
+        # The callback fires because _closing was not poisoned by send()
+        callbacks.on_client_disconnected.assert_called_once()
+
+    async def test_send_text_disconnect_does_not_set_closing(self):
+        """Same as test_send_disconnect_does_not_set_closing but with text data."""
+        mock_ws = AsyncMock()
+        type(mock_ws).client_state = PropertyMock(return_value=WebSocketState.CONNECTED)
+        type(mock_ws).application_state = PropertyMock(return_value=WebSocketState.DISCONNECTED)
+        mock_ws.send_text.side_effect = Exception("connection closed")
+
+        client, _ = self._make_client(mock_ws)
+
+        await client.send("text data")
+
+        self.assertFalse(client.is_closing)
+
+
 if __name__ == "__main__":
    unittest.main()
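Several hunks in this change add `await asyncio.sleep(0)` immediately after creating a task, and the regression test above uses the same call to let the receive loop start. A minimal sketch of why that yield matters, assuming the task manager ultimately schedules work the way plain `asyncio.create_task` does with the default (non-eager) task factory. That is an assumption; the wrapper's internals are not shown in this diff.

```python
import asyncio
from typing import List


async def demo() -> List[str]:
    events: List[str] = []

    async def timer() -> None:
        events.append("timer started")

    # create_task only schedules the coroutine; it has not run yet.
    task = asyncio.create_task(timer())
    events.append("after create_task")

    # Yield to the event loop once so the new task gets its first step.
    await asyncio.sleep(0)
    events.append("after sleep(0)")

    await task
    return events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The recorded order shows that the task body runs only once the creating coroutine yields, so code that cancels or inspects the task right after creation sees a task that has not started unless it yields first.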
--- a/uv.lock
+++ b/uv.lock
@@ -1474,7 +1474,7 @@ wheels = [

 [[package]]
 name = "deepgram-sdk"
-version = "6.1.0"
+version = "6.1.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "httpx" },
@@ -1483,9 +1483,9 @@ dependencies = [
    { name = "typing-extensions" },
    { name = "websockets" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/ef/5c/3cea6888343ae4c90ddb2640371e270b2406e3cc66f76e33c67282559ef5/deepgram_sdk-6.1.0.tar.gz", hash = "sha256:2eb124fd120be733e8297173b7201aa4375a86cd174854aaa5e2d036ba122515", size = 201973, upload-time = "2026-03-26T20:06:06.961Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/29/91/85360d998685fee6b570e12250e5ac74ca43420d5d22b44a865b6c3f2469/deepgram_sdk-6.1.1.tar.gz", hash = "sha256:78726f42b2386f80d9fdd92a22ac59ad5558b6c0475b29bb57273ddaaa344794", size = 202224, upload-time = "2026-03-27T19:59:57.202Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/1b/85/4ad495bc08de200618b9e9eb17930e1863032ee3d2cfec1c4b0c7e5b8bbf/deepgram_sdk-6.1.0-py3-none-any.whl", hash = "sha256:5501c77ae64b7e496fef7d17111f8323dfb46af42b4e511eec9d998d593a9769", size = 576143, upload-time = "2026-03-26T20:06:05.407Z" },
+    { url = "https://files.pythonhosted.org/packages/6f/7f/818ac35b40e38f225858cea38808c7fdd57292553479cabf0bebf7fcdc91/deepgram_sdk-6.1.1-py3-none-any.whl", hash = "sha256:329f45f2a084e8786d3e90d4a4aee630b85cf8770a3b2ee6323bedb176c47113", size = 576299, upload-time = "2026-03-27T19:59:55.949Z" },
 ]

 [[package]]
@@ -2169,7 +2169,6 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/38/3f/9859f655d11901e7b2996c6e3d33e0caa9a1d4572c3bc61ed0faa64b2f4c/greenlet-3.3.2-cp310-cp310-macosx_11_0_universal2.whl", hash = "sha256:9bc885b89709d901859cf95179ec9f6bb67a3d2bb1f0e88456461bd4b7f8fd0d", size = 277747, upload-time = "2026-02-20T20:16:21.325Z" },
    { url = "https://files.pythonhosted.org/packages/fb/07/cb284a8b5c6498dbd7cba35d31380bb123d7dceaa7907f606c8ff5993cbf/greenlet-3.3.2-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b568183cf65b94919be4438dc28416b234b678c608cafac8874dfeeb2a9bbe13", size = 579202, upload-time = "2026-02-20T20:47:28.955Z" },
    { url = "https://files.pythonhosted.org/packages/ed/45/67922992b3a152f726163b19f890a85129a992f39607a2a53155de3448b8/greenlet-3.3.2-cp310-cp310-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:527fec58dc9f90efd594b9b700662ed3fb2493c2122067ac9c740d98080a620e", size = 590620, upload-time = "2026-02-20T20:55:55.581Z" },
-    { url = "https://files.pythonhosted.org/packages/03/5f/6e2a7d80c353587751ef3d44bb947f0565ec008a2e0927821c007e96d3a7/greenlet-3.3.2-cp310-cp310-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:508c7f01f1791fbc8e011bd508f6794cb95397fdb198a46cb6635eb5b78d85a7", size = 602132, upload-time = "2026-02-20T21:02:43.261Z" },
    { url = "https://files.pythonhosted.org/packages/ad/55/9f1ebb5a825215fadcc0f7d5073f6e79e3007e3282b14b22d6aba7ca6cb8/greenlet-3.3.2-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ad0c8917dd42a819fe77e6bdfcb84e3379c0de956469301d9fd36427a1ca501f", size = 591729, upload-time = "2026-02-20T20:20:58.395Z" },
    { url = "https://files.pythonhosted.org/packages/24/b4/21f5455773d37f94b866eb3cf5caed88d6cea6dd2c6e1f9c34f463cba3ec/greenlet-3.3.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:97245cc10e5515dbc8c3104b2928f7f02b6813002770cfaffaf9a6e0fc2b94ef", size = 1551946, upload-time = "2026-02-20T20:49:31.102Z" },
    { url = "https://files.pythonhosted.org/packages/00/68/91f061a926abead128fe1a87f0b453ccf07368666bd59ffa46016627a930/greenlet-3.3.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:8c1fdd7d1b309ff0da81d60a9688a8bd044ac4e18b250320a96fc68d31c209ca", size = 1618494, upload-time = "2026-02-20T20:21:06.541Z" },
@@ -2177,7 +2176,6 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/f3/47/16400cb42d18d7a6bb46f0626852c1718612e35dcb0dffa16bbaffdf5dd2/greenlet-3.3.2-cp311-cp311-macosx_11_0_universal2.whl", hash = "sha256:c56692189a7d1c7606cb794be0a8381470d95c57ce5be03fb3d0ef57c7853b86", size = 278890, upload-time = "2026-02-20T20:19:39.263Z" },
    { url = "https://files.pythonhosted.org/packages/a3/90/42762b77a5b6aa96cd8c0e80612663d39211e8ae8a6cd47c7f1249a66262/greenlet-3.3.2-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1ebd458fa8285960f382841da585e02201b53a5ec2bac6b156fc623b5ce4499f", size = 581120, upload-time = "2026-02-20T20:47:30.161Z" },
    { url = "https://files.pythonhosted.org/packages/bf/6f/f3d64f4fa0a9c7b5c5b3c810ff1df614540d5aa7d519261b53fba55d4df9/greenlet-3.3.2-cp311-cp311-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a443358b33c4ec7b05b79a7c8b466f5d275025e750298be7340f8fc63dff2a55", size = 594363, upload-time = "2026-02-20T20:55:56.965Z" },
-    { url = "https://files.pythonhosted.org/packages/9c/8b/1430a04657735a3f23116c2e0d5eb10220928846e4537a938a41b350bed6/greenlet-3.3.2-cp311-cp311-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:4375a58e49522698d3e70cc0b801c19433021b5c37686f7ce9c65b0d5c8677d2", size = 605046, upload-time = "2026-02-20T21:02:45.234Z" },
    { url = "https://files.pythonhosted.org/packages/72/83/3e06a52aca8128bdd4dcd67e932b809e76a96ab8c232a8b025b2850264c5/greenlet-3.3.2-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8e2cd90d413acbf5e77ae41e5d3c9b3ac1d011a756d7284d7f3f2b806bbd6358", size = 594156, upload-time = "2026-02-20T20:20:59.955Z" },
    { url = "https://files.pythonhosted.org/packages/70/79/0de5e62b873e08fe3cef7dbe84e5c4bc0e8ed0c7ff131bccb8405cd107c8/greenlet-3.3.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:442b6057453c8cb29b4fb36a2ac689382fc71112273726e2423f7f17dc73bf99", size = 1554649, upload-time = "2026-02-20T20:49:32.293Z" },
    { url = "https://files.pythonhosted.org/packages/5a/00/32d30dee8389dc36d42170a9c66217757289e2afb0de59a3565260f38373/greenlet-3.3.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:45abe8eb6339518180d5a7fa47fa01945414d7cca5ecb745346fc6a87d2750be", size = 1619472, upload-time = "2026-02-20T20:21:07.966Z" },
@@ -2186,7 +2184,6 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/ea/ab/1608e5a7578e62113506740b88066bf09888322a311cff602105e619bd87/greenlet-3.3.2-cp312-cp312-macosx_11_0_universal2.whl", hash = "sha256:ac8d61d4343b799d1e526db579833d72f23759c71e07181c2d2944e429eb09cd", size = 280358, upload-time = "2026-02-20T20:17:43.971Z" },
    { url = "https://files.pythonhosted.org/packages/a5/23/0eae412a4ade4e6623ff7626e38998cb9b11e9ff1ebacaa021e4e108ec15/greenlet-3.3.2-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3ceec72030dae6ac0c8ed7591b96b70410a8be370b6a477b1dbc072856ad02bd", size = 601217, upload-time = "2026-02-20T20:47:31.462Z" },
    { url = "https://files.pythonhosted.org/packages/f8/16/5b1678a9c07098ecb9ab2dd159fafaf12e963293e61ee8d10ecb55273e5e/greenlet-3.3.2-cp312-cp312-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a2a5be83a45ce6188c045bcc44b0ee037d6a518978de9a5d97438548b953a1ac", size = 611792, upload-time = "2026-02-20T20:55:58.423Z" },
-    { url = "https://files.pythonhosted.org/packages/5c/c5/cc09412a29e43406eba18d61c70baa936e299bc27e074e2be3806ed29098/greenlet-3.3.2-cp312-cp312-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:ae9e21c84035c490506c17002f5c8ab25f980205c3e61ddb3a2a2a2e6c411fcb", size = 626250, upload-time = "2026-02-20T21:02:46.596Z" },
    { url = "https://files.pythonhosted.org/packages/50/1f/5155f55bd71cabd03765a4aac9ac446be129895271f73872c36ebd4b04b6/greenlet-3.3.2-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:43e99d1749147ac21dde49b99c9abffcbc1e2d55c67501465ef0930d6e78e070", size = 613875, upload-time = "2026-02-20T20:21:01.102Z" },
    { url = "https://files.pythonhosted.org/packages/fc/dd/845f249c3fcd69e32df80cdab059b4be8b766ef5830a3d0aa9d6cad55beb/greenlet-3.3.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:4c956a19350e2c37f2c48b336a3afb4bff120b36076d9d7fb68cb44e05d95b79", size = 1571467, upload-time = "2026-02-20T20:49:33.495Z" },
    { url = "https://files.pythonhosted.org/packages/2a/50/2649fe21fcc2b56659a452868e695634722a6655ba245d9f77f5656010bf/greenlet-3.3.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:6c6f8ba97d17a1e7d664151284cb3315fc5f8353e75221ed4324f84eb162b395", size = 1640001, upload-time = "2026-02-20T20:21:09.154Z" },
@@ -2195,7 +2192,6 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/ac/48/f8b875fa7dea7dd9b33245e37f065af59df6a25af2f9561efa8d822fde51/greenlet-3.3.2-cp313-cp313-macosx_11_0_universal2.whl", hash = "sha256:aa6ac98bdfd716a749b84d4034486863fd81c3abde9aa3cf8eff9127981a4ae4", size = 279120, upload-time = "2026-02-20T20:19:01.9Z" },
    { url = "https://files.pythonhosted.org/packages/49/8d/9771d03e7a8b1ee456511961e1b97a6d77ae1dea4a34a5b98eee706689d3/greenlet-3.3.2-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ab0c7e7901a00bc0a7284907273dc165b32e0d109a6713babd04471327ff7986", size = 603238, upload-time = "2026-02-20T20:47:32.873Z" },
    { url = "https://files.pythonhosted.org/packages/59/0e/4223c2bbb63cd5c97f28ffb2a8aee71bdfb30b323c35d409450f51b91e3e/greenlet-3.3.2-cp313-cp313-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:d248d8c23c67d2291ffd47af766e2a3aa9fa1c6703155c099feb11f526c63a92", size = 614219, upload-time = "2026-02-20T20:55:59.817Z" },
-    { url = "https://files.pythonhosted.org/packages/94/2b/4d012a69759ac9d77210b8bfb128bc621125f5b20fc398bce3940d036b1c/greenlet-3.3.2-cp313-cp313-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:ccd21bb86944ca9be6d967cf7691e658e43417782bce90b5d2faeda0ff78a7dd", size = 628268, upload-time = "2026-02-20T21:02:48.024Z" },
    { url = "https://files.pythonhosted.org/packages/7a/34/259b28ea7a2a0c904b11cd36c79b8cef8019b26ee5dbe24e73b469dea347/greenlet-3.3.2-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b6997d360a4e6a4e936c0f9625b1c20416b8a0ea18a8e19cabbefc712e7397ab", size = 616774, upload-time = "2026-02-20T20:21:02.454Z" },
    { url = "https://files.pythonhosted.org/packages/0a/03/996c2d1689d486a6e199cb0f1cf9e4aa940c500e01bdf201299d7d61fa69/greenlet-3.3.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:64970c33a50551c7c50491671265d8954046cb6e8e2999aacdd60e439b70418a", size = 1571277, upload-time = "2026-02-20T20:49:34.795Z" },
    { url = "https://files.pythonhosted.org/packages/d9/c4/2570fc07f34a39f2caf0bf9f24b0a1a0a47bc2e8e465b2c2424821389dfc/greenlet-3.3.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:1a9172f5bf6bd88e6ba5a84e0a68afeac9dc7b6b412b245dd64f52d83c81e55b", size = 1640455, upload-time = "2026-02-20T20:21:10.261Z" },
@@ -2204,7 +2200,6 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/3f/ae/8bffcbd373b57a5992cd077cbe8858fff39110480a9d50697091faea6f39/greenlet-3.3.2-cp314-cp314-macosx_11_0_universal2.whl", hash = "sha256:8d1658d7291f9859beed69a776c10822a0a799bc4bfe1bd4272bb60e62507dab", size = 279650, upload-time = "2026-02-20T20:18:00.783Z" },
    { url = "https://files.pythonhosted.org/packages/d1/c0/45f93f348fa49abf32ac8439938726c480bd96b2a3c6f4d949ec0124b69f/greenlet-3.3.2-cp314-cp314-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:18cb1b7337bca281915b3c5d5ae19f4e76d35e1df80f4ad3c1a7be91fadf1082", size = 650295, upload-time = "2026-02-20T20:47:34.036Z" },
    { url = "https://files.pythonhosted.org/packages/b3/de/dd7589b3f2b8372069ab3e4763ea5329940fc7ad9dcd3e272a37516d7c9b/greenlet-3.3.2-cp314-cp314-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:c2e47408e8ce1c6f1ceea0dffcdf6ebb85cc09e55c7af407c99f1112016e45e9", size = 662163, upload-time = "2026-02-20T20:56:01.295Z" },
-    { url = "https://files.pythonhosted.org/packages/cd/ac/85804f74f1ccea31ba518dcc8ee6f14c79f73fe36fa1beba38930806df09/greenlet-3.3.2-cp314-cp314-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:e3cb43ce200f59483eb82949bf1835a99cf43d7571e900d7c8d5c62cdf25d2f9", size = 675371, upload-time = "2026-02-20T21:02:49.664Z" },
    { url = "https://files.pythonhosted.org/packages/d2/d8/09bfa816572a4d83bccd6750df1926f79158b1c36c5f73786e26dbe4ee38/greenlet-3.3.2-cp314-cp314-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:63d10328839d1973e5ba35e98cccbca71b232b14051fd957b6f8b6e8e80d0506", size = 664160, upload-time = "2026-02-20T20:21:04.015Z" },
    { url = "https://files.pythonhosted.org/packages/48/cf/56832f0c8255d27f6c35d41b5ec91168d74ec721d85f01a12131eec6b93c/greenlet-3.3.2-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:8e4ab3cfb02993c8cc248ea73d7dae6cec0253e9afa311c9b37e603ca9fad2ce", size = 1619181, upload-time = "2026-02-20T20:49:36.052Z" },
    { url = "https://files.pythonhosted.org/packages/0a/23/b90b60a4aabb4cec0796e55f25ffbfb579a907c3898cd2905c8918acaa16/greenlet-3.3.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:94ad81f0fd3c0c0681a018a976e5c2bd2ca2d9d94895f23e7bb1af4e8af4e2d5", size = 1687713, upload-time = "2026-02-20T20:21:11.684Z" },
@@ -2213,7 +2208,6 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/98/6d/8f2ef704e614bcf58ed43cfb8d87afa1c285e98194ab2cfad351bf04f81e/greenlet-3.3.2-cp314-cp314t-macosx_11_0_universal2.whl", hash = "sha256:e26e72bec7ab387ac80caa7496e0f908ff954f31065b0ffc1f8ecb1338b11b54", size = 286617, upload-time = "2026-02-20T20:19:29.856Z" },
    { url = "https://files.pythonhosted.org/packages/5e/0d/93894161d307c6ea237a43988f27eba0947b360b99ac5239ad3fe09f0b47/greenlet-3.3.2-cp314-cp314t-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8b466dff7a4ffda6ca975979bab80bdadde979e29fc947ac3be4451428d8b0e4", size = 655189, upload-time = "2026-02-20T20:47:35.742Z" },
    { url = "https://files.pythonhosted.org/packages/f5/2c/d2d506ebd8abcb57386ec4f7ba20f4030cbe56eae541bc6fd6ef399c0b41/greenlet-3.3.2-cp314-cp314t-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:b8bddc5b73c9720bea487b3bffdb1840fe4e3656fba3bd40aa1489e9f37877ff", size = 658225, upload-time = "2026-02-20T20:56:02.527Z" },
-    { url = "https://files.pythonhosted.org/packages/d1/67/8197b7e7e602150938049d8e7f30de1660cfb87e4c8ee349b42b67bdb2e1/greenlet-3.3.2-cp314-cp314t-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:59b3e2c40f6706b05a9cd299c836c6aa2378cabe25d021acd80f13abf81181cf", size = 666581, upload-time = "2026-02-20T21:02:51.526Z" },
    { url = "https://files.pythonhosted.org/packages/8e/30/3a09155fbf728673a1dea713572d2d31159f824a37c22da82127056c44e4/greenlet-3.3.2-cp314-cp314t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b26b0f4428b871a751968285a1ac9648944cea09807177ac639b030bddebcea4", size = 657907, upload-time = "2026-02-20T20:21:05.259Z" },
    { url = "https://files.pythonhosted.org/packages/f3/fd/d05a4b7acd0154ed758797f0a43b4c0962a843bedfe980115e842c5b2d08/greenlet-3.3.2-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:1fb39a11ee2e4d94be9a76671482be9398560955c9e568550de0224e41104727", size = 1618857, upload-time = "2026-02-20T20:49:37.309Z" },
    { url = "https://files.pythonhosted.org/packages/6f/e1/50ee92a5db521de8f35075b5eff060dd43d39ebd46c2181a2042f7070385/greenlet-3.3.2-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:20154044d9085151bc309e7689d6f7ba10027f8f5a8c0676ad398b951913d89e", size = 1680010, upload-time = "2026-02-20T20:21:13.427Z" },
@@ -4902,7 +4896,7 @@ requires-dist = [
    { name = "camb-sdk", marker = "extra == 'camb'", specifier = ">=1.5.4,<2" },
    { name = "coremltools", marker = "extra == 'local-smart-turn'", specifier = ">=8.0" },
    { name = "daily-python", marker = "extra == 'daily'", specifier = "~=0.27.0" },
-    { name = "deepgram-sdk", marker = "extra == 'deepgram'", specifier = ">=6.1.0,<7" },
+    { name = "deepgram-sdk", marker = "extra == 'deepgram'", specifier = ">=6.1.1,<7" },
    { name = "docstring-parser", specifier = ">=0.16,<1" },
    { name = "einops", marker = "extra == 'moondream'", specifier = "~=0.8.0" },
    { name = "fastapi", marker = "extra == 'runner'", specifier = ">=0.115.6,<1" },
@@ -4914,9 +4908,9 @@ requires-dist = [
    { name = "groq", marker = "extra == 'groq'", specifier = ">=0.23.0,<2" },
    { name = "hume", marker = "extra == 'hume'", specifier = ">=0.11.2,<1" },
    { name = "kokoro-onnx", marker = "extra == 'kokoro'", specifier = ">=0.5.0,<1" },
-    { name = "langchain", marker = "extra == 'langchain'", specifier = "~=0.3.20" },
-    { name = "langchain-community", marker = "extra == 'langchain'", specifier = "~=0.3.20" },
-    { name = "langchain-openai", marker = "extra == 'langchain'", specifier = "~=0.3.9" },
+    { name = "langchain", marker = "extra == 'langchain'", specifier = "~=0.3.28" },
+    { name = "langchain-community", marker = "extra == 'langchain'", specifier = "~=0.3.31" },
+    { name = "langchain-openai", marker = "extra == 'langchain'", specifier = "~=0.3.29" },
    { name = "livekit", marker = "extra == 'heygen'", specifier = ">=1.0.13,<2" },
    { name = "livekit", marker = "extra == 'livekit'", specifier = ">=1.0.13,<2" },
    { name = "livekit-api", marker = "extra == 'livekit'", specifier = ">=1.0.5,<2" },
Author	SHA1	Message	Date
Aleix Conchillo Flaqué	a84c69858e	Merge pull request #4185 from pipecat-ai/changelog-0.0.108 Release 0.0.108 - Changelog Update	2026-03-27 21:47:53 -07:00
aconchillo	ca224219dc	Update changelog for version 0.0.108	2026-03-27 21:43:37 -07:00
Aleix Conchillo Flaqué	83dc979d19	Merge pull request #4186 from pipecat-ai/mb/fix-websocket-disconnect-race-condition Fix FastAPI WebSocket disconnect race condition	2026-03-27 21:40:21 -07:00
Aleix Conchillo Flaqué	fc76b3f2fb	update pyproject.toml and uv.lock	2026-03-27 21:36:03 -07:00
Mark Backman	4670370dbb	Add changelog for #4186	2026-03-28 00:02:44 -04:00
Mark Backman	47e53890e3	Fix FastAPI WebSocket disconnect race condition causing pipeline hang. When the remote side disconnects while send() is in flight, send() was setting _closing=True. This prevented the receive loop from firing on_client_disconnected, causing the pipeline to hang waiting for a disconnect signal that never came. The fix removes _closing from send() (that flag means we initiated the close) and instead checks Starlette application_state in _can_send() to suppress subsequent sends after a failure. Fixes #3912	2026-03-28 00:01:25 -04:00
Aleix Conchillo Flaqué	195180b6f4	Merge pull request #4184 from pipecat-ai/aleix/fix-sarvam-examples-role Fix Sarvam examples to use 'user' role instead of 'developer'	2026-03-27 20:34:59 -07:00
Aleix Conchillo Flaqué	8b64166bb7	Fix Sarvam examples to use 'user' role instead of 'developer'. Sarvam uses the OpenAI-compatible API but does not support the 'developer' role, causing errors. Use 'user' role instead.	2026-03-27 20:33:25 -07:00
Aleix Conchillo Flaqué	1d18995435	Merge pull request #4183 from pipecat-ai/aleix/fix-task-scheduling Yield after create_task to ensure timer tasks are scheduled	2026-03-27 20:32:32 -07:00
Aleix Conchillo Flaqué	ea7324b2ba	Add changelog for #4183	2026-03-27 19:03:55 -07:00
Aleix Conchillo Flaqué	52ed7137af	Yield after create_task to ensure timer tasks are scheduled. Add `await asyncio.sleep(0)` after `create_task()` calls in UserIdleController, SpeechTimeoutUserTurnStopStrategy, TurnAnalyzerUserTurnStopStrategy, and UserTurnCompletionLLMServiceMixin so the event loop schedules the newly created timer tasks before the caller continues.	2026-03-27 19:03:23 -07:00
kompfner	b33df03724	Merge pull request #4179 from pipecat-ai/pk/fix-gemini-live-vertex Don't send history_config for Gemini Live Vertex (unsupported)	2026-03-27 17:34:29 -04:00
Paul Kompfner	28fbe1db08	Don't send history_config for Gemini Live Vertex (unsupported)	2026-03-27 17:30:47 -04:00
kompfner	9240e92d9f	Merge pull request #4177 from pipecat-ai/pk/tweak-26i-for-gemini-3.1-flash-live-support Tweak 26i example system instruction for Gemini 3.1 Flash Live compat…	2026-03-27 17:20:06 -04:00
Paul Kompfner	5caf53f086	Tweak 26i example system instruction for Gemini 3.1 Flash Live compatibility. Gemini 3.1 Flash Live won't reliably report ending its turn until after it says something following a tool call. Restructure the system instruction so the model says goodbye after calling end_conversation, and add a comment explaining the deferred EndFrame behavior that makes this work.	2026-03-27 17:13:17 -04:00
Mark Backman	ac2716811c	Merge pull request #4176 from pipecat-ai/mb/fix-websocket-rtvi-messages Fix RTVI events not delivered over WebSocket transports	2026-03-27 16:50:37 -04:00
Mark Backman	d313d56776	Fix RTVI events not delivered over WebSocket transports. The base serializer filters out RTVI protocol messages by default (ignore_rtvi_messages=True) to prevent them from being sent over telephony media streams. ProtobufFrameSerializer is used by WebSocket transports, which are the delivery channel for these messages, so disable the filter there.	2026-03-27 16:47:11 -04:00
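
Commit 52ed7137af above relies on a general asyncio property: `create_task()` only queues the new task, and its body does not start running until the caller next yields to the event loop. The minimal sketch below illustrates that ordering. `start_timer`, `demo`, and the event labels are hypothetical names chosen for illustration and are not Pipecat code.

```python
import asyncio


async def start_timer(events, yield_after_create):
    """Create a timer task and optionally yield so it starts right away."""

    async def timer():
        events.append("timer started")
        await asyncio.sleep(0.01)
        events.append("timer fired")

    task = asyncio.create_task(timer())
    if yield_after_create:
        # Hand control back to the event loop once so the queued
        # timer task runs up to its first await before we continue.
        await asyncio.sleep(0)
    events.append("caller continues")
    return task


async def demo(yield_after_create):
    """Return the order of events observed with or without the yield."""
    events = []
    task = await start_timer(events, yield_after_create)
    await task
    return events


if __name__ == "__main__":
    print(asyncio.run(demo(True)))
    print(asyncio.run(demo(False)))
```

With the yield, the timer has already started by the time the caller continues. Without it, the timer only begins once the caller reaches its next `await`, which may be later than the caller assumes.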
				`@@ -1 +0,0 @@`
				- Added `SarvamLLMService` with support for `sarvam-30b`, `sarvam-30b-16k`, `sarvam-105b` and `sarvam-105b-32k`
				`@@ -1 +0,0 @@`
				- Added `on_turn_context_created(context_id)` hook to `TTSService`. Override this to perform provider-specific setup (e.g. eagerly opening a server-side context) before text starts flowing. Called each time a new turn context ID is created.
				`@@ -1 +0,0 @@`
				- Added context prewarming path for `InworldTTSService` to improve first audio latency
				`@@ -1 +0,0 @@`
				- Added `KrispVivaVadAnalyzer` for Voice Activity Detection using the Krisp VIVA SDK (requires `krisp_audio`).
				`@@ -1 +0,0 @@`
				- Modified `InworldTTSService` to close context at end of turn instead of relying on idle timeout
				`@@ -1 +0,0 @@`
				- Added `XAIHttpTTSService` for text-to-speech using xAI's HTTP TTS API.
				`@@ -1 +0,0 @@`
				- Added Gemini 3 support to the Gemini Live service.
				`@@ -1 +0,0 @@`
				- `TTSService`: the default `stop_frame_timeout_s` (idle time before an automatic `TTSStoppedFrame` is pushed when `push_stop_frames=True`) has changed from `2.0` to `3.0` seconds.
				`@@ -1 +0,0 @@`
				- Added support for "developer" role messages in conversation context across all LLM adapters. For non-OpenAI services (Anthropic, Google, AWS Bedrock), "developer" messages are converted to "user" messages (use `system_instruction` to set the system instruction). For OpenAI services, "developer" messages pass through in conversation history. For the Responses API, they are kept as "developer" role (matching the existing "system" → "developer" conversion).
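
As a rough illustration of the "developer" to "user" conversion described above for non-OpenAI services, the sketch below remaps roles in a list of message dicts. The function name `convert_developer_messages` and the message shape are assumptions made for illustration; the actual adapter code in Pipecat may differ.

```python
from typing import Any, Dict, List


def convert_developer_messages(
    messages: List[Dict[str, Any]], target_role: str = "user"
) -> List[Dict[str, Any]]:
    """Return a copy of `messages` with "developer" roles remapped.

    Hypothetical helper: messages are assumed to be dicts with a
    "role" key, as in OpenAI-style chat history.
    """
    converted = []
    for message in messages:
        if message.get("role") == "developer":
            # Copy the message and swap only the role.
            converted.append({**message, "role": target_role})
        else:
            converted.append(dict(message))
    return converted


if __name__ == "__main__":
    history = [
        {"role": "developer", "content": "Answer briefly."},
        {"role": "user", "content": "Hello"},
    ]
    print(convert_developer_messages(history))
```

The input list is left unmodified, so the same history could still be passed unchanged to a service that accepts the "developer" role.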