Merge pull request #3906 from pipecat-ai/changelog-0.0.104

Release 0.0.104 - Changelog Update
2026-03-02 21:24:05 -08:00
parent d1ad7a9580 62260454a2
commit 5940731dd0
68 changed files with 383 additions and 94 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,389 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 <!-- towncrier release notes start -->

+## [0.0.104] - 2026-03-02
+
+### Added
+
+- Added `TextAggregationMetricsData` metric measuring the time from the first
+  LLM token to the first complete sentence, representing the latency cost of
+  sentence aggregation in the TTS pipeline.
+  (PR [#3696](https://github.com/pipecat-ai/pipecat/pull/3696))
+
+- Added support for using strongly-typed objects instead of dicts for updating
+  service settings at runtime.
+
+    Instead of, say:
+
+    ```python
+    await task.queue_frame(
+        STTUpdateSettingsFrame(settings={"language": Language.ES})
+    )
+    ```
+
+    you'd do:
+
+    ```python
+    await task.queue_frame(
+        STTUpdateSettingsFrame(delta=DeepgramSTTSettings(language=Language.ES))
+    )
+    ```
+
+  Each service now vends strongly-typed classes like `DeepgramSTTSettings`
+  representing the service's runtime-updatable settings.
+  (PR [#3714](https://github.com/pipecat-ai/pipecat/pull/3714))
+
+- Added support for specifying private endpoints for Azure Speech-to-Text,
+  enabling use in private networks behind firewalls.
+  (PR [#3764](https://github.com/pipecat-ai/pipecat/pull/3764))
+
+- Added `LemonSliceTransport` and `LemonSliceApi` to support adding real-time
+  LemonSlice Avatars to any Daily room.
+  (PR [#3791](https://github.com/pipecat-ai/pipecat/pull/3791))
+
+- Added `output_medium` parameter to `AgentInputParams` and
+  `OneShotInputParams` in Ultravox service to control initial output medium
+  (text or voice) at call creation time.
+  (PR [#3806](https://github.com/pipecat-ai/pipecat/pull/3806))
+
+- Added `TurnMetricsData` as a generic metrics class for turn detection, with
+  e2e processing time measurement. `KrispVivaTurn` now emits `TurnMetricsData`
+  with `e2e_processing_time_ms` tracking the interval from VAD
+  speech-to-silence transition to turn completion.
+  (PR [#3809](https://github.com/pipecat-ai/pipecat/pull/3809))
+
+- Added `on_audio_context_interrupted()` and `on_audio_context_completed()`
+  callbacks to `AudioContextTTSService`. Subclasses can override these to
+  perform provider-specific cleanup instead of overriding
+  `_handle_interruption()`.
+  (PR [#3814](https://github.com/pipecat-ai/pipecat/pull/3814))
+
+- Added `on_summary_applied` event to `LLMContextSummarizer` for observability,
+  providing message counts before and after context summarization.
+  (PR [#3855](https://github.com/pipecat-ai/pipecat/pull/3855))
+
+- Added `summary_message_template` to `LLMContextSummarizationConfig` for
+  customizing how summaries are formatted when injected into context (e.g.,
+  wrapping in XML tags).
+  (PR [#3855](https://github.com/pipecat-ai/pipecat/pull/3855))
+
+- Added `summarization_timeout` to `LLMContextSummarizationConfig` (default
+  120s) to prevent hung LLM calls from permanently blocking future
+  summarizations.
+  (PR [#3855](https://github.com/pipecat-ai/pipecat/pull/3855))
+
+- Added optional `llm` field to `LLMContextSummarizationConfig` for routing
+  summarization to a dedicated LLM service (e.g., a cheaper/faster model)
+  instead of the pipeline's primary model.
+  (PR [#3855](https://github.com/pipecat-ai/pipecat/pull/3855))
+
+- Added AssemblyAI u3-rt-pro model support with built-in turn detection mode.
+  (PR [#3856](https://github.com/pipecat-ai/pipecat/pull/3856))
+
+- Added `LLMSummarizeContextFrame` to trigger on-demand context summarization
+  from anywhere in the pipeline (e.g. a function call tool). Accepts an
+  optional `config: LLMContextSummaryConfig` to override summary generation
+  settings per request.
+  (PR [#3863](https://github.com/pipecat-ai/pipecat/pull/3863))
+
+- Added `LLMContextSummaryConfig` (summary generation params:
+  `target_context_tokens`, `min_messages_after_summary`,
+  `summarization_prompt`) and `LLMAutoContextSummarizationConfig` (auto-trigger
+  thresholds: `max_context_tokens`, `max_unsummarized_messages`, plus a nested
+  `summary_config`). These replace the monolithic
+  `LLMContextSummarizationConfig`.
+  (PR [#3863](https://github.com/pipecat-ai/pipecat/pull/3863))
+
+- Added support for the `speed_alpha` parameter to the `arcana` model in
+  `RimeTTSService`.
+  (PR [#3873](https://github.com/pipecat-ai/pipecat/pull/3873))
+
+- Added `ClientConnectedFrame`, a new `SystemFrame` pushed by all transports
+  (Daily, LiveKit, FastAPI WebSocket, WebSocket Server, SmallWebRTC, HeyGen,
+  Tavus) when a client connects. Enables observers to track transport readiness
+  timing.
+  (PR [#3881](https://github.com/pipecat-ai/pipecat/pull/3881))
+
+- Added `StartupTimingObserver` for measuring how long each processor's
+  `start()` method takes during pipeline startup. Also measures transport
+  readiness — the time from `StartFrame` to first client connection — via the
+  `on_transport_timing_report` event.
+  (PR [#3881](https://github.com/pipecat-ai/pipecat/pull/3881))
+
+- Added `BotConnectedFrame` for SFU transports and `on_transport_timing_report`
+  event to `StartupTimingObserver` with bot and client connection timing.
+  (PR [#3881](https://github.com/pipecat-ai/pipecat/pull/3881))
+
+- Added optional `direction` parameter to `PipelineTask.queue_frame()` and
+  `PipelineTask.queue_frames()`, allowing frames to be pushed upstream from the
+  end of the pipeline.
+  (PR [#3883](https://github.com/pipecat-ai/pipecat/pull/3883))
+
+- Added `on_latency_breakdown` event to `UserBotLatencyObserver` providing
+  per-service TTFB, text aggregation, user turn duration, and function call
+  latency metrics for each user-to-bot response cycle.
+  (PR [#3885](https://github.com/pipecat-ai/pipecat/pull/3885))
+
+- Added `on_first_bot_speech_latency` event to `UserBotLatencyObserver`
+  measuring the time from client connection to first bot speech. An
+  `on_latency_breakdown` is also emitted for this first speech event.
+  (PR [#3885](https://github.com/pipecat-ai/pipecat/pull/3885))
+
+- Added `broadcast_interruption()` to `FrameProcessor`. This method pushes an
+  `InterruptionFrame` both upstream and downstream directly from the calling
+  processor, avoiding the round-trip through the pipeline task that
+  `push_interruption_task_frame_and_wait()` required.
+  (PR [#3896](https://github.com/pipecat-ai/pipecat/pull/3896))
+
+### Changed
+
+- Added `text_aggregation_mode` parameter to `TTSService` and all TTS
+  subclasses with a new `TextAggregationMode` enum (`SENTENCE`, `TOKEN`). All
+  text now flows through text aggregators regardless of mode, enabling pattern
+  detection and tag handling in TOKEN mode.
+  (PR [#3696](https://github.com/pipecat-ai/pipecat/pull/3696))
+
+- ⚠️ Refactored runtime-updatable service settings to use strongly-typed
+  classes (`TTSSettings`, `STTSettings`, `LLMSettings`, and service-specific
+  subclasses) instead of plain dicts. Each service's `_settings` now holds
+  these strongly-typed objects. For service maintainers, see changes in
+  COMMUNITY_INTEGRATIONS.md.
+  (PR [#3714](https://github.com/pipecat-ai/pipecat/pull/3714))
+
+- Word timestamp support has been moved from `WordTTSService` into `TTSService`
+  via a new `supports_word_timestamps` parameter. Services that previously
+  extended `WordTTSService`, `AudioContextWordTTSService`, or
+  `WebsocketWordTTSService` now pass `supports_word_timestamps=True` to their
+  parent `__init__` instead.
+  (PR [#3786](https://github.com/pipecat-ai/pipecat/pull/3786))
+
+- Improved Ultravox TTFB measurement accuracy by using VAD speech end time
+  instead of `UserStoppedSpeakingFrame` timing.
+  (PR [#3806](https://github.com/pipecat-ai/pipecat/pull/3806))
+
+- Aligned `UltravoxRealtimeLLMService` frame handling with OpenAI/Gemini
+  realtime services: added `InterruptionFrame` handling with metrics cleanup,
+  processing metrics at response boundaries, and improved agent transcript
+  handling for both voice and text output modalities.
+  (PR [#3806](https://github.com/pipecat-ai/pipecat/pull/3806))
+
+- Updated `OpenAIRealtimeLLMService` default model to `gpt-realtime-1.5`.
+  (PR [#3807](https://github.com/pipecat-ai/pipecat/pull/3807))
+
+- Added `api_key` parameter to `KrispVivaSDKManager`, `KrispVivaTurn`, and
+  `KrispVivaFilter` for Krisp SDK v1.6.1+ licensing. Falls back to
+  `KRISP_VIVA_API_KEY` environment variable.
+  (PR [#3809](https://github.com/pipecat-ai/pipecat/pull/3809))
+
+- Bumped `nltk` minimum version from 3.9.1 to 3.9.3 to resolve a security
+  vulnerability.
+  (PR [#3811](https://github.com/pipecat-ai/pipecat/pull/3811))
+
+- `ServiceSettingsUpdateFrame`s are now `UninterruptibleFrame`s. Generally
+  speaking, you don't want a user interruption to prevent a service setting
+  change from going into effect. Note that you usually don't use
+  `ServiceSettingsUpdateFrame` directly; you use one of its subclasses:
+    - `LLMUpdateSettingsFrame`
+    - `TTSUpdateSettingsFrame`
+    - `STTUpdateSettingsFrame`
+  (PR [#3819](https://github.com/pipecat-ai/pipecat/pull/3819))
+
+- Updated context summarization to use `user` role instead of `assistant` for
+  summary messages.
+  (PR [#3855](https://github.com/pipecat-ai/pipecat/pull/3855))
+
+- Renamed the `AssemblyAISTTService` parameter
+  `min_end_of_turn_silence_when_confident` to `min_turn_silence` (the old name
+  is still supported with a deprecation warning).
+  (PR [#3856](https://github.com/pipecat-ai/pipecat/pull/3856))
+
+- ⚠️ Renamed `LLMAssistantAggregatorParams` fields:
+  `enable_context_summarization` → `enable_auto_context_summarization` and
+  `context_summarization_config` → `auto_context_summarization_config` (now
+  accepts `LLMAutoContextSummarizationConfig`). The old names still work with a
+  `DeprecationWarning` for one release cycle.
+  (PR [#3863](https://github.com/pipecat-ai/pipecat/pull/3863))
+
+- `ElevenLabsRealtimeSTTService` now sets `TranscriptionFrame.finalized` to
+  `True` when using `CommitStrategy.MANUAL`.
+  (PR [#3865](https://github.com/pipecat-ai/pipecat/pull/3865))
+
+- Relaxed the `numba` version pin from `==` to `>=0.61.2`.
+  (PR [#3868](https://github.com/pipecat-ai/pipecat/pull/3868))
+
+- Updated tracing code to use `ServiceSettings` dataclass API
+  (`given_fields()`, attribute access) instead of dict-style access
+  (`.items()`, `in`, subscript).
+  (PR [#3879](https://github.com/pipecat-ai/pipecat/pull/3879))
+
+- ⚠️ Removed `event` field and `complete()` method from `InterruptionFrame`.
+  Removed `event` field from `InterruptionTaskFrame`. These are no longer
+  needed since `broadcast_interruption()` does not require a round-trip
+  completion signal.
+  (PR [#3896](https://github.com/pipecat-ai/pipecat/pull/3896))
+
+- Moved `pipecat.services.deepgram.stt_sagemaker` and
+  `pipecat.services.deepgram.tts_sagemaker` to
+  `pipecat.services.deepgram.sagemaker.stt` and
+  `pipecat.services.deepgram.sagemaker.tts`. The old import paths still work
+  but emit a `DeprecationWarning`.
+  (PR [#3902](https://github.com/pipecat-ai/pipecat/pull/3902))
+
+### Deprecated
+
+- ⚠️ Deprecated `aggregate_sentences` parameter on `TTSService` and all TTS
+  subclasses. Use `text_aggregation_mode=TextAggregationMode.SENTENCE` or
+  `text_aggregation_mode=TextAggregationMode.TOKEN` instead.
+  (PR [#3696](https://github.com/pipecat-ai/pipecat/pull/3696))
+
+- Deprecated `set_model()`, `set_voice()`, and `set_language()` on AI services
+  in favor of runtime updates via `TTSUpdateSettingsFrame`,
+  `STTUpdateSettingsFrame`, and `LLMUpdateSettingsFrame`.
+
+  ⚠️ Note, too, a subtle behavior change in these deprecated methods. Whereas
+  previously only `set_language()` caused the service to actually react to the
+  update (e.g. by reconnecting to a remote service so it can pick up the
+  change), now all these methods do. This change was made as part of a refactor
+  making them all work the same way under the hood.
+  (PR [#3714](https://github.com/pipecat-ai/pipecat/pull/3714))
+
+- Dict-based `*UpdateSettingsFrame(settings={...})` is deprecated in favor of
+  passing typed settings delta objects with
+  `*UpdateSettingsFrame(delta=...)`.
+  (PR [#3714](https://github.com/pipecat-ai/pipecat/pull/3714))
+
+- Deprecated `WordTTSService`, `WebsocketWordTTSService`,
+  `AudioContextWordTTSService`, and `InterruptibleWordTTSService`. Use their
+  non-word counterparts with `supports_word_timestamps=True` instead:
+    - `WordTTSService` → `TTSService(supports_word_timestamps=True)`
+    - `WebsocketWordTTSService` →
+      `WebsocketTTSService(supports_word_timestamps=True)`
+    - `AudioContextWordTTSService` →
+      `AudioContextTTSService(supports_word_timestamps=True)`
+    - `InterruptibleWordTTSService` →
+      `InterruptibleTTSService(supports_word_timestamps=True)`
+  (PR [#3786](https://github.com/pipecat-ai/pipecat/pull/3786))
+
+- Deprecated `SmartTurnMetricsData` in favor of `TurnMetricsData`.
+  `BaseSmartTurn` now emits `TurnMetricsData` directly.
+  (PR [#3809](https://github.com/pipecat-ai/pipecat/pull/3809))
+
+- Deprecated `LLMContextSummarizationConfig`. Use
+  `LLMAutoContextSummarizationConfig` with a nested `LLMContextSummaryConfig`
+  instead. The old class emits a `DeprecationWarning`.
+  (PR [#3863](https://github.com/pipecat-ai/pipecat/pull/3863))
+
+- Deprecated `push_interruption_task_frame_and_wait()` in `FrameProcessor`. Use
+  `broadcast_interruption()` instead. The old method now delegates to
+  `broadcast_interruption()` and logs a deprecation warning.
+  (PR [#3896](https://github.com/pipecat-ai/pipecat/pull/3896))
+
+### Removed
+
+- Removed `local-smart-turn-v3` optional extra from `pyproject.toml`. The
+  `transformers` and `onnxruntime` packages are now always installed as core
+  dependencies since they are required by the default turn stop strategy,
+  `TurnAnalyzerUserTurnStopStrategy`, which uses `LocalSmartTurnAnalyzerV3`.
+  (PR [#3803](https://github.com/pipecat-ai/pipecat/pull/3803))
+
+- ⚠️ Removed `PlayHTTTSService` and `PlayHTHttpTTSService`. PlayHT has been
+  shut down and is no longer available.
+  (PR [#3838](https://github.com/pipecat-ai/pipecat/pull/3838))
+
+### Fixed
+
+- Added `LLMSpecificMessage` handling in `LLMContextSummarizationUtil` to skip
+  provider-specific messages during context summarization.
+  (PR [#3794](https://github.com/pipecat-ai/pipecat/pull/3794))
+
+- Treated `response_cancel_not_active` as a non-fatal error in realtime
+  services (`OpenAIRealtimeLLMService`, `GrokRealtimeLLMService`,
+  `OpenAIRealtimeBetaLLMService`) to prevent WebSocket disconnection when
+  cancelling an inactive response.
+  (PR [#3795](https://github.com/pipecat-ai/pipecat/pull/3795))
+
+- Fixed Poetry compatibility by inlining `local-smart-turn-v3` dependencies
+  (`transformers`, `onnxruntime`) into core dependencies instead of using a
+  self-referential extra.
+  (PR [#3803](https://github.com/pipecat-ai/pipecat/pull/3803))
+
+- Fixed `SentryMetrics` method signatures to match updated
+  `FrameProcessorMetrics` base class, resolving `TypeError` when using
+  `start_time`/`end_time` keyword arguments.
+  (PR [#3808](https://github.com/pipecat-ai/pipecat/pull/3808))
+
+- Fixed STT TTFB metrics not being reported for `SonioxSTTService` and
+  `AWSTranscribeSTTService` due to missing `can_generate_metrics()` override.
+  (PR [#3813](https://github.com/pipecat-ai/pipecat/pull/3813))
+
+- Fixed an issue where `AudioContextTTSService`-based providers (AsyncAI,
+  ElevenLabs, Inworld, Rime) did not close or clean up their server-side audio
+  contexts after normal speech completion, only on interruption.
+  (PR [#3814](https://github.com/pipecat-ai/pipecat/pull/3814))
+
+- Fixed STT TTFB metrics measuring timeout expiry time instead of actual
+  transcript arrival time.
+  (PR [#3822](https://github.com/pipecat-ai/pipecat/pull/3822))
+
+- Fixed `InterimTranscriptionFrame` and `TranslationFrame` being
+  unintentionally pushed downstream in `LLMUserAggregator`. They are now
+  consumed like `TranscriptionFrame`.
+  (PR [#3825](https://github.com/pipecat-ai/pipecat/pull/3825))
+
+- Fixed misleading "Empty audio frame received for STT service" warnings when
+  using audio filters (e.g. `RNNoiseFilter`, `KrispVivaFilter`, `AICFilter`)
+  that buffer audio internally.
+  (PR [#3828](https://github.com/pipecat-ai/pipecat/pull/3828))
+
+- Fixed an issue in `RimeNonJsonTTSService` where trailing punctuation was
+  sometimes vocalized.
+  (PR [#3837](https://github.com/pipecat-ai/pipecat/pull/3837))
+
+- Fixed `TTSSpeakFrame` not committing spoken text to the conversation context
+  when used outside of an LLM response (e.g., bot greetings or injected
+  speech).
+  (PR [#3845](https://github.com/pipecat-ai/pipecat/pull/3845))
+
+- Removed verbose per-chunk audio logging from `GenesysAudioHookSerializer`
+  that flooded production logs.
+  (PR [#3850](https://github.com/pipecat-ai/pipecat/pull/3850))
+
+- Added a beta feature warning when using custom prompts with AssemblyAI.
+  (PR [#3856](https://github.com/pipecat-ai/pipecat/pull/3856))
+
+- Fixed `LocalSmartTurnAnalyzerV3` producing incorrect end-of-turn predictions
+  at non-16kHz sample rates (e.g. 8kHz Twilio telephony) by adding automatic
+  resampling to 16kHz before Whisper feature extraction.
+  (PR [#3857](https://github.com/pipecat-ai/pipecat/pull/3857))
+
+- Fixed `PipelineTask` double-inserting `RTVIProcessor` into the frame chain
+  when the user provides both an `RTVIProcessor` in the pipeline and a custom
+  `RTVIObserver` subclass in observers.
+  (PR [#3867](https://github.com/pipecat-ai/pipecat/pull/3867))
+
+- Fixed turn completion instructions being lost when `LLMMessagesUpdateFrame`
+  replaces the LLM context. When `filter_incomplete_user_turns` is enabled, the
+  turn completion system message is now re-injected after context replacement.
+  (PR [#3888](https://github.com/pipecat-ai/pipecat/pull/3888))
+
+- Fixed Azure TTS and STT services silently swallowing cancellation errors
+  (invalid API key, network failures, rate limiting) instead of propagating
+  them as `ErrorFrame`s to the pipeline.
+  (PR [#3893](https://github.com/pipecat-ai/pipecat/pull/3893))
+
+### Performance
+
+- Switched `GradiumTTSService` from `InterruptibleWordTTSService` to
+  `AudioContextWordTTSService`, eliminating websocket disconnect/reconnect on
+  every interruption by using `client_req_id`-based multiplexing.
+  (PR [#3759](https://github.com/pipecat-ai/pipecat/pull/3759))
+
+### Other
+
+- Standardized Sarvam STT/TTS User-Agent header handling to consistently send
+  Pipecat SDK identity in websocket requests.
+  (PR [#3886](https://github.com/pipecat-ai/pipecat/pull/3886))
+
 ## [0.0.103] - 2026-02-20

 ### Added
--- a/changelog/3696.added.md
+++ b/changelog/3696.added.md
@@ -1 +0,0 @@
- Added `TextAggregationMetricsData` metric measuring the time from the first LLM token to the first complete sentence, representing the latency cost of sentence aggregation in the TTS pipeline.
--- a/changelog/3696.changed.md
+++ b/changelog/3696.changed.md
@@ -1 +0,0 @@
- Added `text_aggregation_mode` parameter to `TTSService` and all TTS subclasses with a new `TextAggregationMode` enum (`SENTENCE`, `TOKEN`). All text now flows through text aggregators regardless of mode, enabling pattern detection and tag handling in TOKEN mode.
--- a/changelog/3696.deprecated.md
+++ b/changelog/3696.deprecated.md
@@ -1 +0,0 @@
- ⚠️ Deprecated `aggregate_sentences` parameter on `TTSService` and all TTS subclasses. Use `text_aggregation_mode=TextAggregationMode.SENTENCE` or `text_aggregation_mode=TextAggregationMode.TOKEN` instead.
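Note on the 3696 entries: the difference between the two aggregation modes can be sketched as below. This is a minimal, self-contained illustration, not Pipecat's implementation. The `TextAggregationMode` enum here is a local stand-in that only reuses the member names from the changelog; `mode_from_legacy_flag()` and `aggregate()` are hypothetical helpers, the mapping from the deprecated boolean is an assumption, and the sentence splitting is deliberately naive.

```python
from enum import Enum
from typing import Iterable, Iterator


class TextAggregationMode(Enum):
    """Local stand-in; member names are taken from the changelog entry."""

    SENTENCE = "sentence"
    TOKEN = "token"


def mode_from_legacy_flag(aggregate_sentences: bool) -> TextAggregationMode:
    # Assumed mapping from the deprecated boolean to the new enum.
    if aggregate_sentences:
        return TextAggregationMode.SENTENCE
    return TextAggregationMode.TOKEN


def aggregate(tokens: Iterable[str], mode: TextAggregationMode) -> Iterator[str]:
    """Yield text chunks for TTS: one per token, or one per naive sentence."""
    if mode is TextAggregationMode.TOKEN:
        yield from tokens
        return
    buffer = ""
    for token in tokens:
        buffer += token
        # Naive sentence boundary: flush when the buffer ends in ., ! or ?
        if buffer.rstrip().endswith((".", "!", "?")):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        # Flush any trailing text that never reached a sentence boundary.
        yield buffer.strip()


tokens = ["Hel", "lo", " there", ".", " How", " are", " you", "?"]
sentence_chunks = list(aggregate(tokens, TextAggregationMode.SENTENCE))
token_chunks = list(aggregate(tokens, TextAggregationMode.TOKEN))
```

In this sketch, sentence mode waits for a full sentence before emitting anything, which is the latency that `TextAggregationMetricsData` is described as measuring; token mode forwards each token as it arrives.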
--- a/changelog/3714.added.md
+++ b/changelog/3714.added.md
@@ -1,19 +0,0 @@
- Added support for using strongly-typed objects instead of dicts for updating service settings at runtime.
-
-  Instead of, say:
-
-  ```python
-  await task.queue_frame(
-      STTUpdateSettingsFrame(settings={"language": Language.ES})
-  )
-  ```
-
-  you'd do:
-
-  ```python
-  await task.queue_frame(
-      STTUpdateSettingsFrame(delta=DeepgramSTTSettings(language=Language.ES))
-  )
-  ```
-
-  Each service now vends strongly-typed classes like `DeepgramSTTSettings` representing the service's runtime-updatable settings.
--- a/changelog/3714.changed.md
+++ b/changelog/3714.changed.md
@@ -1 +0,0 @@
- ⚠️ Refactored runtime-updatable service settings to use strongly-typed classes (`TTSSettings`, `STTSettings`, `LLMSettings`, and service-specific subclasses) instead of plain dicts. Each service's `_settings` now holds these strongly-typed objects. For service maintainers, see changes in COMMUNITY_INTEGRATIONS.md.
--- a/changelog/3714.deprecated.2.md
+++ b/changelog/3714.deprecated.2.md
@@ -1 +0,0 @@
- Dict-based `*UpdateSettingsFrame(settings={...})` is deprecated in favor of passing typed settings delta objects with `*UpdateSettingsFrame(delta={...})`.
--- a/changelog/3714.deprecated.md
+++ b/changelog/3714.deprecated.md
@@ -1,3 +0,0 @@
- Deprecated `set_model()`, `set_voice()`, and `set_language()` on AI services in favor of runtime updates via `TTSUpdateSettingsFrame`, `STTUpdateSettingsFrame`, and `LLMUpdateSettingsFrame`.
-
-  ⚠️ Note, too, a subtle behavior change in these deprecated methods. Whereas previously only `set_language()` caused the service to actually react to the update (e.g. by reconnecting to a remote service so it can pick up the change), now all these methods do. This change was made as part of a refactor making them all work the same way under the hood.
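Note on the 3714 entries: one way to picture a typed settings delta is a dataclass whose unset fields carry a sentinel, so that only explicitly given fields are applied. The sketch below is an assumption-laden illustration, not Pipecat's code: `ExampleSTTSettings`, `NOT_GIVEN`, and `apply_delta()` are hypothetical names, and `given_fields()` only borrows its name from the tracing entry in this changelog; its behaviour here is assumed.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


class _NotGiven:
    """Sentinel meaning 'this field was not set in the delta'."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


@dataclass
class ExampleSTTSettings:
    """Hypothetical stand-in for a typed class such as DeepgramSTTSettings."""

    model: Any = NOT_GIVEN
    language: Any = NOT_GIVEN

    def given_fields(self) -> Dict[str, Any]:
        # Return only the fields that were explicitly provided.
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not NOT_GIVEN
        }

    def apply_delta(self, delta: "ExampleSTTSettings") -> Dict[str, Any]:
        """Copy given fields from delta and return the ones that changed."""
        changed: Dict[str, Any] = {}
        for name, value in delta.given_fields().items():
            if getattr(self, name) != value:
                setattr(self, name, value)
                changed[name] = value
        return changed


current = ExampleSTTSettings(model="example-model", language="en")
changed = current.apply_delta(ExampleSTTSettings(language="es"))
```

Compared with a plain dict, a typed delta lets a type checker catch misspelled field names, and a service could use the returned `changed` mapping to decide whether it needs to react (for example by reconnecting).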
--- a/changelog/3759.performance.md
+++ b/changelog/3759.performance.md
@@ -1 +0,0 @@
- Switched `GradiumTTSService` from `InterruptibleWordTTSService` to `AudioContextWordTTSService`, eliminating websocket disconnect/reconnect on every interruption by using `client_req_id`-based multiplexing.
--- a/changelog/3764.added.md
+++ b/changelog/3764.added.md
@@ -1 +0,0 @@
- Added support for specifying private endpoints for Azure Speech-to-Text, enabling use in private networks behind firewalls.
--- a/changelog/3786.changed.md
+++ b/changelog/3786.changed.md
@@ -1 +0,0 @@
- Word timestamp support has been moved from `WordTTSService` into `TTSService` via a new `supports_word_timestamps` parameter. Services that previously extended `WordTTSService`, `AudioContextWordTTSService`, or `WebsocketWordTTSService` now pass `supports_word_timestamps=True` to their parent `__init__` instead.
--- a/changelog/3786.deprecated.md
+++ b/changelog/3786.deprecated.md
@@ -1,5 +0,0 @@
- Deprecated `WordTTSService`, `WebsocketWordTTSService`, `AudioContextWordTTSService`, and `InterruptibleWordTTSService`. Use their non-word counterparts with `supports_word_timestamps=True` instead:
-  - `WordTTSService` → `TTSService(supports_word_timestamps=True)`
-  - `WebsocketWordTTSService` → `WebsocketTTSService(supports_word_timestamps=True)`
-  - `AudioContextWordTTSService` → `AudioContextTTSService(supports_word_timestamps=True)`
-  - `InterruptibleWordTTSService` → `InterruptibleTTSService(supports_word_timestamps=True)`
--- a/changelog/3791.added.md
+++ b/changelog/3791.added.md
@@ -1 +0,0 @@
- Added `LemonSliceTransport` and `LemonSliceApi` to support adding real-time LemonSlice Avatars to any Daily room.
--- a/changelog/3794.fixed.md
+++ b/changelog/3794.fixed.md
@@ -1 +0,0 @@
- Added `LLMSpecificMessage` handling in `LLMContextSummarizationUtil` to skip provider-specific messages during context summarization.
--- a/changelog/3795.fixed.md
+++ b/changelog/3795.fixed.md
@@ -1 +0,0 @@
- Treated `response_cancel_not_active` as a non-fatal error in realtime services (`OpenAIRealtimeLLMService`, `GrokRealtimeLLMService`, `OpenAIRealtimeBetaLLMService`) to prevent WebSocket disconnection when cancelling an inactive response.
--- a/changelog/3803.fixed.md
+++ b/changelog/3803.fixed.md
@@ -1 +0,0 @@
- Fixed Poetry compatibility by inlining `local-smart-turn-v3` dependencies (`transformers`, `onnxruntime`) into core dependencies instead of using a self-referential extra.
--- a/changelog/3803.removed.md
+++ b/changelog/3803.removed.md
@@ -1 +0,0 @@
- Removed `local-smart-turn-v3` optional extra from `pyproject.toml`. The `transformers` and `onnxruntime` packages are now always installed as core dependencies since they are required by the default turn stop strategy, `TurnAnalyzerUserTurnStopStrategy` which uses `LocalSmartTurnAnalyzerV3`.
--- a/changelog/3806.added.md
+++ b/changelog/3806.added.md
@@ -1 +0,0 @@
- Added `output_medium` parameter to `AgentInputParams` and `OneShotInputParams` in Ultravox service to control initial output medium (text or voice) at call creation time.
--- a/changelog/3806.changed.2.md
+++ b/changelog/3806.changed.2.md
@@ -1 +0,0 @@
- Improved Ultravox TTFB measurement accuracy by using VAD speech end time instead of `UserStoppedSpeakingFrame` timing.
--- a/changelog/3806.changed.md
+++ b/changelog/3806.changed.md
@@ -1 +0,0 @@
- Aligned `UltravoxRealtimeLLMService` frame handling with OpenAI/Gemini realtime services: added `InterruptionFrame` handling with metrics cleanup, processing metrics at response boundaries, and improved agent transcript handling for both voice and text output modalities.
--- a/changelog/3807.changed.md
+++ b/changelog/3807.changed.md
@@ -1 +0,0 @@
- Updated `OpenAIRealtimeLLMService` default model to `gpt-realtime-1.5`.
--- a/changelog/3808.fixed.md
+++ b/changelog/3808.fixed.md
@@ -1 +0,0 @@
- Fixed `SentryMetrics` method signatures to match updated `FrameProcessorMetrics` base class, resolving `TypeError` when using `start_time`/`end_time` keyword arguments.
--- a/changelog/3809.added.md
+++ b/changelog/3809.added.md
@@ -1 +0,0 @@
- Added `TurnMetricsData` as a generic metrics class for turn detection, with e2e processing time measurement. `KrispVivaTurn` now emits `TurnMetricsData` with `e2e_processing_time_ms` tracking the interval from VAD speech-to-silence transition to turn completion.
--- a/changelog/3809.changed.md
+++ b/changelog/3809.changed.md
@@ -1 +0,0 @@
- Added `api_key` parameter to `KrispVivaSDKManager`, `KrispVivaTurn`, and `KrispVivaFilter` for Krisp SDK v1.6.1+ licensing. Falls back to `KRISP_VIVA_API_KEY` environment variable.
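Note on the 3809 `api_key` entry: the fallback order described above (explicit argument first, then the `KRISP_VIVA_API_KEY` environment variable) could look roughly like this. `resolve_api_key()` is a hypothetical helper written for this note, and raising when no key is found is an assumption rather than documented Krisp or Pipecat behaviour.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None,
                    env_var: str = "KRISP_VIVA_API_KEY") -> str:
    """Prefer the explicit argument, then fall back to the environment."""
    key = api_key or os.environ.get(env_var)
    if not key:
        # Assumed behaviour for this sketch: fail loudly when no key is found.
        raise ValueError(f"No API key given and {env_var} is not set")
    return key


# Demonstration only: simulate a key provided via the environment.
os.environ["KRISP_VIVA_API_KEY"] = "key-from-env"
from_env = resolve_api_key()
explicit = resolve_api_key("key-from-argument")
```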
--- a/changelog/3809.deprecated.md
+++ b/changelog/3809.deprecated.md
@@ -1 +0,0 @@
- Deprecated `SmartTurnMetricsData` in favor of `TurnMetricsData`. `BaseSmartTurn` now emits `TurnMetricsData` directly.
--- a/changelog/3811.changed.md
+++ b/changelog/3811.changed.md
@@ -1 +0,0 @@
- Bumped `nltk` minimum version from 3.9.1 to 3.9.3 to resolve a security vulnerability.
--- a/changelog/3813.fixed.md
+++ b/changelog/3813.fixed.md
@@ -1 +0,0 @@
- Fixed STT TTFB metrics not being reported for `SonioxSTTService` and `AWSTranscribeSTTService` due to missing `can_generate_metrics()` override.
--- a/changelog/3814.added.md
+++ b/changelog/3814.added.md
@@ -1 +0,0 @@
- Added `on_audio_context_interrupted()` and `on_audio_context_completed()` callbacks to `AudioContextTTSService`. Subclasses can override these to perform provider-specific cleanup instead of overriding `_handle_interruption()`.
--- a/changelog/3814.fixed.md
+++ b/changelog/3814.fixed.md
@@ -1 +0,0 @@
- Fixed an issue where `AudioContextTTSService`-based providers (AsyncAI, ElevenLabs, Inworld, Rime) did not close or clean up their server-side audio contexts after normal speech completion, only on interruption.
--- a/changelog/3819.changed.md
+++ b/changelog/3819.changed.md
@@ -1,4 +0,0 @@
- `ServiceSettingsUpdateFrame`s are now `UninterruptibleFrame`s. Generally speaking, you don't want a user interruption to prevent a service setting change from going into effect. Note that you usually don't use `ServiceSettingsUpdateFrame` directly, you use one of its subclasses:
-  - `LLMUpdateSettingsFrame`
-  - `TTSUpdateSettingsFrame`
-  - `STTUpdateSettingsFrame`
--- a/changelog/3822.fixed.md
+++ b/changelog/3822.fixed.md
@@ -1 +0,0 @@
- Fixed STT TTFB metrics measuring timeout expiry time instead of actual transcript arrival time.
--- a/changelog/3825.fixed.md
+++ b/changelog/3825.fixed.md
@@ -1 +0,0 @@
- Fixed `InterimTranscriptionFrame` and `TranslationFrame` being unintentionally pushed downstream in `LLMUserAggregator`. They are now consumed like `TranscriptionFrame`.
--- a/changelog/3828.fixed.md
+++ b/changelog/3828.fixed.md
@@ -1 +0,0 @@
- Fixed misleading "Empty audio frame received for STT service" warnings when using audio filters (e.g. `RNNoiseFilter`, `KrispVivaFilter`, `AICFilter`) that buffer audio internally.
--- a/changelog/3837.fixed.md
+++ b/changelog/3837.fixed.md
@@ -1 +0,0 @@
- Fixed issues with `RimeNonJsonTTSService` where trailing punctuation is sometimes vocalized

--- a/changelog/3838.removed.md
+++ b/changelog/3838.removed.md
@@ -1 +0,0 @@
- ⚠️ Removed `PlayHTTTSService` and `PlayHTHttpTTSService`. PlayHT has been shut down and is no longer available.
--- a/changelog/3845.fixed.md
+++ b/changelog/3845.fixed.md
@@ -1 +0,0 @@
- Fixed `TTSSpeakFrame` not committing spoken text to the conversation context when used outside of an LLM response (e.g., bot greetings or injected speech).
--- a/changelog/3850.fixed.md
+++ b/changelog/3850.fixed.md
@@ -1 +0,0 @@
- Removed verbose per-chunk audio logging from `GenesysAudioHookSerializer` that flooded production logs.
--- a/changelog/3855.added.2.md
+++ b/changelog/3855.added.2.md
@@ -1 +0,0 @@
- Added optional `llm` field to `LLMContextSummarizationConfig` for routing summarization to a dedicated LLM service (e.g., a cheaper/faster model) instead of the pipeline's primary model.
--- a/changelog/3855.added.3.md
+++ b/changelog/3855.added.3.md
@@ -1 +0,0 @@
- Added `summarization_timeout` to `LLMContextSummarizationConfig` (default 120s) to prevent hung LLM calls from permanently blocking future summarizations.
--- a/changelog/3855.added.4.md
+++ b/changelog/3855.added.4.md
@@ -1 +0,0 @@
- Added `on_summary_applied` event to `LLMContextSummarizer` for observability, providing message counts before and after context summarization.
--- a/changelog/3855.added.md
+++ b/changelog/3855.added.md
@@ -1 +0,0 @@
- Added `summary_message_template` to `LLMContextSummarizationConfig` for customizing how summaries are formatted when injected into context (e.g., wrapping in XML tags).
--- a/changelog/3855.changed.md
+++ b/changelog/3855.changed.md
@@ -1 +0,0 @@
- Updated context summarization to use `user` role instead of `assistant` for summary messages.
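Note on the 3855 entries: the idea behind `summarization_timeout` and `summary_message_template` can be sketched with `asyncio.wait_for`. Everything below is illustrative: `summarize_with_timeout()`, the two fake LLM coroutines, and the `{summary}` placeholder name are assumptions made for this note, not Pipecat's API.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def summarize_with_timeout(
    summarize: Callable[[], Awaitable[str]],
    timeout_s: float,
    template: str = "<summary>{summary}</summary>",
) -> Optional[str]:
    """Run a summarization call, giving up after timeout_s seconds."""
    try:
        summary = await asyncio.wait_for(summarize(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Returning None lets the caller attempt summarization again later
        # instead of staying blocked on a hung call.
        return None
    # Wrap the raw summary before it is injected into the context.
    return template.format(summary=summary)


async def fast_llm() -> str:
    return "User asked about pricing."


async def hung_llm() -> str:
    await asyncio.sleep(10)
    return "never reached"


ok = asyncio.run(summarize_with_timeout(fast_llm, timeout_s=1.0))
timed_out = asyncio.run(summarize_with_timeout(hung_llm, timeout_s=0.05))
```

The short timeout in the second call is only there to keep the demonstration fast; the changelog states a 120s default for the real setting.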
--- a/changelog/3856.added.md
+++ b/changelog/3856.added.md
@@ -1 +0,0 @@
- Added AssemblyAI u3-rt-pro model support with built-in turn detection mode.
--- a/changelog/3856.changed.md
+++ b/changelog/3856.changed.md
@@ -1 +0,0 @@
- Renamed the `AssemblyAISTTService` parameter `min_end_of_turn_silence_when_confident` to `min_turn_silence` (the old name is still supported with a deprecation warning).
--- a/changelog/3856.fixed.md
+++ b/changelog/3856.fixed.md
@@ -1 +0,0 @@
- Added a beta feature warning when using custom prompts with AssemblyAI.
--- a/changelog/3857.fixed.md
+++ b/changelog/3857.fixed.md
@@ -1 +0,0 @@
- Fixed `LocalSmartTurnAnalyzerV3` producing incorrect end-of-turn predictions at non-16kHz sample rates (e.g. 8kHz Twilio telephony) by adding automatic resampling to 16kHz before Whisper feature extraction.
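Note on the 3857 fix: converting 8kHz telephony audio to 16kHz before feature extraction is a sample-rate conversion. The sketch below uses plain linear interpolation purely to show the idea; `resample_linear()` is a hypothetical helper, it is not the resampler Pipecat uses, and production resamplers also apply filtering.

```python
from typing import List


def resample_linear(samples: List[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample by linear interpolation (illustrative only, no filtering)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    out: List[float] = []
    for i in range(out_len):
        pos = i * src_rate / dst_rate  # fractional position in the source
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


eight_khz = [0.0, 1.0, 0.0, -1.0]
sixteen_khz = resample_linear(eight_khz, 8000, 16000)
```

Doubling the rate doubles the number of samples, with each new sample placed halfway between its neighbours; the final sample is repeated because there is nothing after it to interpolate toward.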
--- a/changelog/3863.added.2.md
+++ b/changelog/3863.added.2.md
@@ -1 +0,0 @@
- Added `LLMContextSummaryConfig` (summary generation params: `target_context_tokens`, `min_messages_after_summary`, `summarization_prompt`) and `LLMAutoContextSummarizationConfig` (auto-trigger thresholds: `max_context_tokens`, `max_unsummarized_messages`, plus a nested `summary_config`). These replace the monolithic `LLMContextSummarizationConfig`.
--- a/changelog/3863.added.md
+++ b/changelog/3863.added.md
@@ -1 +0,0 @@
- Added `LLMSummarizeContextFrame` to trigger on-demand context summarization from anywhere in the pipeline (e.g. a function call tool). Accepts an optional `config: LLMContextSummaryConfig` to override summary generation settings per request.
--- a/changelog/3863.changed.md
+++ b/changelog/3863.changed.md
@@ -1 +0,0 @@
- ⚠️ Renamed `LLMAssistantAggregatorParams` fields: `enable_context_summarization` → `enable_auto_context_summarization` and `context_summarization_config` → `auto_context_summarization_config` (now accepts `LLMAutoContextSummarizationConfig`). The old names still work with a `DeprecationWarning` for one release cycle.
--- a/changelog/3863.deprecated.md
+++ b/changelog/3863.deprecated.md
@@ -1 +0,0 @@
- Deprecated `LLMContextSummarizationConfig`. Use `LLMAutoContextSummarizationConfig` with a nested `LLMContextSummaryConfig` instead. The old class emits a `DeprecationWarning`.
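
The split described in the 3863 entries can be pictured as two nested dataclasses. The sketch below uses stand-in class names and only the field names listed in those entries. All types and default values are placeholders chosen for illustration and are assumptions, not Pipecat's actual defaults.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SummaryConfigSketch:
    """Stand-in for `LLMContextSummaryConfig`: how a summary is generated."""

    target_context_tokens: int = 1000  # placeholder default (assumption)
    min_messages_after_summary: int = 2  # placeholder default (assumption)
    summarization_prompt: Optional[str] = None


@dataclass
class AutoSummarizationConfigSketch:
    """Stand-in for `LLMAutoContextSummarizationConfig`: when to auto-trigger."""

    max_context_tokens: int = 8000  # placeholder default (assumption)
    max_unsummarized_messages: int = 20  # placeholder default (assumption)
    summary_config: SummaryConfigSketch = field(default_factory=SummaryConfigSketch)


# Trigger thresholds sit on the outer object, generation settings on the nested one.
cfg = AutoSummarizationConfigSketch(
    max_context_tokens=4000,
    summary_config=SummaryConfigSketch(target_context_tokens=500),
)
```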
--- a/changelog/3865.changed.md
+++ b/changelog/3865.changed.md
@@ -1 +0,0 @@
- `ElevenLabsRealtimeSTTService` now sets `TranscriptionFrame.finalized` to `True` when using `CommitStrategy.MANUAL`.
--- a/changelog/3867.fixed.md
+++ b/changelog/3867.fixed.md
@@ -1 +0,0 @@
- Fixed `PipelineTask` double-inserting `RTVIProcessor` into the frame chain when the user provides both an `RTVIProcessor` in the pipeline and a custom `RTVIObserver` subclass in observers.
--- a/changelog/3868.changed.md
+++ b/changelog/3868.changed.md
@@ -1 +0,0 @@
- Updated the `numba` version pin from `==` to `>=0.61.2`.
--- a/changelog/3873.added.md
+++ b/changelog/3873.added.md
@@ -1 +0,0 @@
- Added support for the `speed_alpha` parameter to the `arcana` model in `RimeTTSService`.
--- a/changelog/3879.changed.md
+++ b/changelog/3879.changed.md
@@ -1 +0,0 @@
- Updated tracing code to use `ServiceSettings` dataclass API (`given_fields()`, attribute access) instead of dict-style access (`.items()`, `in`, subscript).
--- a/changelog/3881.added.2.md
+++ b/changelog/3881.added.2.md
@@ -1 +0,0 @@
- Added `ClientConnectedFrame`, a new `SystemFrame` pushed by all transports (Daily, LiveKit, FastAPI WebSocket, WebSocket Server, SmallWebRTC, HeyGen, Tavus) when a client connects. Enables observers to track transport readiness timing.
--- a/changelog/3881.added.3.md
+++ b/changelog/3881.added.3.md
@@ -1 +0,0 @@
- Added `BotConnectedFrame` for SFU transports and `on_transport_timing_report` event to `StartupTimingObserver` with bot and client connection timing.
--- a/changelog/3881.added.md
+++ b/changelog/3881.added.md
@@ -1 +0,0 @@
- Added `StartupTimingObserver` for measuring how long each processor's `start()` method takes during pipeline startup. Also measures transport readiness — the time from `StartFrame` to first client connection — via the `on_transport_timing_report` event.
--- a/changelog/3883.added.md
+++ b/changelog/3883.added.md
@@ -1 +0,0 @@
- Added optional `direction` parameter to `PipelineTask.queue_frame()` and `PipelineTask.queue_frames()`, allowing frames to be pushed upstream from the end of the pipeline.
--- a/changelog/3885.added.2.md
+++ b/changelog/3885.added.2.md
@@ -1 +0,0 @@
- Added `on_first_bot_speech_latency` event to `UserBotLatencyObserver` measuring the time from client connection to first bot speech. An `on_latency_breakdown` event is also emitted for this first speech event.
--- a/changelog/3885.added.md
+++ b/changelog/3885.added.md
@@ -1 +0,0 @@
- Added `on_latency_breakdown` event to `UserBotLatencyObserver` providing per-service TTFB, text aggregation, user turn duration, and function call latency metrics for each user-to-bot response cycle.
--- a/changelog/3886.other.md
+++ b/changelog/3886.other.md
@@ -1 +0,0 @@
- Standardized Sarvam STT/TTS User-Agent header handling to consistently send Pipecat SDK identity in websocket requests.
--- a/changelog/3888.fixed.md
+++ b/changelog/3888.fixed.md
@@ -1 +0,0 @@
- Fixed turn completion instructions being lost when `LLMMessagesUpdateFrame` replaces the LLM context. When `filter_incomplete_user_turns` is enabled, the turn completion system message is now re-injected after context replacement.
--- a/changelog/3893.fixed.md
+++ b/changelog/3893.fixed.md
@@ -1 +0,0 @@
- Fixed Azure TTS and STT services silently swallowing cancellation errors (invalid API key, network failures, rate limiting) instead of propagating them as `ErrorFrame`s to the pipeline.
--- a/changelog/3896.added.md
+++ b/changelog/3896.added.md
@@ -1 +0,0 @@
- Added `broadcast_interruption()` to `FrameProcessor`. This method pushes an `InterruptionFrame` both upstream and downstream directly from the calling processor, avoiding the round-trip through the pipeline task that `push_interruption_task_frame_and_wait()` required.
--- a/changelog/3896.changed.md
+++ b/changelog/3896.changed.md
@@ -1 +0,0 @@
- ⚠️ Removed `event` field and `complete()` method from `InterruptionFrame`. Removed `event` field from `InterruptionTaskFrame`. These are no longer needed since `broadcast_interruption()` does not require a round-trip completion signal.
--- a/changelog/3896.deprecated.md
+++ b/changelog/3896.deprecated.md
@@ -1 +0,0 @@
- Deprecated `push_interruption_task_frame_and_wait()` in `FrameProcessor`. Use `broadcast_interruption()` instead. The old method now delegates to `broadcast_interruption()` and logs a deprecation warning.
--- a/changelog/3902.changed.md
+++ b/changelog/3902.changed.md
@@ -1 +0,0 @@
- Moved `pipecat.services.deepgram.stt_sagemaker` and `pipecat.services.deepgram.tts_sagemaker` to `pipecat.services.deepgram.sagemaker.stt` and `pipecat.services.deepgram.sagemaker.tts`. The old import paths still work but emit a `DeprecationWarning`.
@@ -1 +0,0 @@
- Added `TextAggregationMetricsData` metric measuring the time from the first LLM token to the first complete sentence, representing the latency cost of sentence aggregation in the TTS pipeline.
@@ -1 +0,0 @@
- Added `text_aggregation_mode` parameter to `TTSService` and all TTS subclasses with a new `TextAggregationMode` enum (`SENTENCE`, `TOKEN`). All text now flows through text aggregators regardless of mode, enabling pattern detection and tag handling in TOKEN mode.
@@ -1 +0,0 @@
- ⚠️ Deprecated `aggregate_sentences` parameter on `TTSService` and all TTS subclasses. Use `text_aggregation_mode=TextAggregationMode.SENTENCE` or `text_aggregation_mode=TextAggregationMode.TOKEN` instead.
@@ -1 +0,0 @@
- ⚠️ Refactored runtime-updatable service settings to use strongly-typed classes (`TTSSettings`, `STTSettings`, `LLMSettings`, and service-specific subclasses) instead of plain dicts. Each service's `_settings` now holds these strongly-typed objects. For service maintainers, see changes in COMMUNITY_INTEGRATIONS.md.
@@ -1 +0,0 @@
- Dict-based `UpdateSettingsFrame(settings={...})` is deprecated in favor of passing typed settings delta objects with `UpdateSettingsFrame(delta=...)`.
@@ -1 +0,0 @@
- Switched `GradiumTTSService` from `InterruptibleWordTTSService` to `AudioContextWordTTSService`, eliminating websocket disconnect/reconnect on every interruption by using `client_req_id`-based multiplexing.
@@ -1 +0,0 @@
- Added support for specifying private endpoints for Azure Speech-to-Text, enabling use in private networks behind firewalls.
@@ -1 +0,0 @@
- Word timestamp support has been moved from `WordTTSService` into `TTSService` via a new `supports_word_timestamps` parameter. Services that previously extended `WordTTSService`, `AudioContextWordTTSService`, or `WebsocketWordTTSService` now pass `supports_word_timestamps=True` to their parent `__init__` instead.
@@ -1 +0,0 @@
- Added `LemonSliceTransport` and `LemonSliceApi` to support adding real-time LemonSlice Avatars to any Daily room.
@@ -1 +0,0 @@
- Added `LLMSpecificMessage` handling in `LLMContextSummarizationUtil` to skip provider-specific messages during context summarization.