From 4df0a9bf7398cc5edb1f796eff3850349aa0e267 Mon Sep 17 00:00:00 2001
From: markbackman <1924426+markbackman@users.noreply.github.com>
Date: Fri, 5 Dec 2025 23:45:12 +0000
Subject: [PATCH] Update changelog for version 0.0.97

---
 CHANGELOG.md                 | 108 +++++++++++++++++++++++++++++++++++
 changelog/3072.changed.md    |   1 -
 changelog/3130.deprecated.md |   7 ---
 changelog/3132.changed.2.md  |   9 ---
 changelog/3132.changed.md    |   1 -
 changelog/3132.fixed.2.md    |   1 -
 changelog/3132.fixed.md      |   1 -
 changelog/3153.fixed.md      |   1 -
 changelog/3155.fixed.md      |   1 -
 changelog/3156.fixed.md      |   1 -
 changelog/3162.changed.md    |   1 -
 changelog/3162.deprecated.md |   1 -
 changelog/3168.fixed.md      |   1 -
 changelog/3176.added.md      |   1 -
 changelog/3176.changed.md    |   1 -
 changelog/3177.changed.md    |   1 -
 changelog/3181.deprecated.md |   1 -
 changelog/3183.changed.md    |   1 -
 changelog/3184.added.md      |   6 --
 changelog/3185.fixed.md      |   1 -
 changelog/3186.fixed.md      |   1 -
 changelog/3187.added.md      |   1 -
 changelog/3192.changed.md    |   1 -
 changelog/3195.changed.md    |   7 ---
 24 files changed, 108 insertions(+), 48 deletions(-)
 delete mode 100644 changelog/3072.changed.md
 delete mode 100644 changelog/3130.deprecated.md
 delete mode 100644 changelog/3132.changed.2.md
 delete mode 100644 changelog/3132.changed.md
 delete mode 100644 changelog/3132.fixed.2.md
 delete mode 100644 changelog/3132.fixed.md
 delete mode 100644 changelog/3153.fixed.md
 delete mode 100644 changelog/3155.fixed.md
 delete mode 100644 changelog/3156.fixed.md
 delete mode 100644 changelog/3162.changed.md
 delete mode 100644 changelog/3162.deprecated.md
 delete mode 100644 changelog/3168.fixed.md
 delete mode 100644 changelog/3176.added.md
 delete mode 100644 changelog/3176.changed.md
 delete mode 100644 changelog/3177.changed.md
 delete mode 100644 changelog/3181.deprecated.md
 delete mode 100644 changelog/3183.changed.md
 delete mode 100644 changelog/3184.added.md
 delete mode 100644 changelog/3185.fixed.md
 delete mode 100644 changelog/3186.fixed.md
 delete mode 100644 changelog/3187.added.md
 delete mode 100644 changelog/3192.changed.md
 delete mode 100644 changelog/3195.changed.md

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7d7979bef..11bafc5b3 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,114 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
+## [0.0.97] - 2025-12-05
+
+### Added
+
+- Added new Gradium services, `GradiumSTTService` and `GradiumTTSService`, for
+  speech-to-text and text-to-speech functionality using Gradium's API.
+
+- Additions for `AsyncAITTSService` and `AsyncAIHttpTTSService`:
+
+  - Added new `languages`: `pt`, `nl`, `ar`, `ru`, `ro`, `ja`, `he`, `hy`,
+    `tr`, `hi`, `zh`.
+  - Updated the default model to `asyncflow_multilingual_v1.0` for improved
+    accuracy and broader language coverage.
+
+- Added optional tool and tool output filters for MCP services.
+
+### Changed
+
+- Updated Deepgram logging to include Deepgram request IDs for improved
+  debugging.
+
+- Text Aggregation Improvements:
+
+  - **Breaking Change**: `BaseTextAggregator.aggregate()` now returns
+    `AsyncIterator[Aggregation]` instead of `Optional[Aggregation]`. This
+    enables the aggregator to return multiple results based on the provided
+    text.
+  - Refactored text aggregators to use inheritance: `SkipTagsAggregator` and
+    `PatternPairAggregator` now inherit from `SimpleTextAggregator`, reusing
+    the base class's sentence detection logic.
+
+- Improved interruption handling to prevent bots from repeating themselves. LLM
+  services that return multiple sentences in a single response (e.g.,
+  `GoogleLLMService`) are now split into individual sentences before being sent
+  to TTS. This ensures interruptions occur at sentence boundaries, preventing
+  the bot from repeating content after being interrupted during long responses.
+
+- Updated `AICFilter` to use Quail STT as the default model
+  (`AICModelType.QUAIL_STT`). Quail STT is optimized for human-to-machine
+  interaction (e.g., voice agents, speech-to-text) and operates at a native
+  sample rate of 16 kHz with fixed enhancement parameters.
+
+- If an unexpected exception is caught, or if `FrameProcessor.push_error()` is
+  called with an exception, the file name and line number where the exception
+  occurred are now logged.
+
+- Updated Smart Turn model weights to v3.1.
+
+- Smart Turn analyzer now uses the full context of the turn rather than just
+  the audio since VAD last triggered.
+
+- Updated `CartesiaSTTService` to return the full transcription `result` in the
+  `TranscriptionFrame` and `InterimTranscriptionFrame`. This provides access to
+  word timestamp data.
+
+- `HumeTTSService` changes:
+
+  - Added tracking headers (`X-Hume-Client-Name` and `X-Hume-Client-Version`)
+    to all requests made by `HumeTTSService` to the Hume API for better usage
+    tracking and analytics.
+  - Added `stop()` and `cancel()` cleanup methods to `HumeTTSService` to
+    properly close the HTTP client and prevent resource leaks.
+
+### Deprecated
+
+- NVIDIA Services name changes (all functionality is unchanged):
+
+  - `NimLLMService` is now deprecated, use `NvidiaLLMService` instead.
+  - `RivaSTTService` is now deprecated, use `NvidiaSTTService` instead.
+  - `RivaTTSService` is now deprecated, use `NvidiaTTSService` instead.
+  - Use `uv pip install pipecat-ai[nvidia]` instead of
+    `uv pip install pipecat-ai[riva]`
+
+- The `noise_gate_enable` parameter in `AICFilter` is deprecated and no longer
+  has any effect. Noise gating is now handled automatically by the AIC VAD
+  system. Use `AICFilter.create_vad_analyzer()` for VAD functionality instead.
+
+- Package `pipecat.sync` is deprecated, use `pipecat.utils.sync` instead.
+
+### Fixed
+
+- Fixed bug in `PatternPairAggregator` where pattern handlers could be called
+  multiple times for `KEEP` or `AGGREGATE` patterns.
+
+- Fixed sentence aggregation to correctly handle ambiguous punctuation in
+  streaming text, such as currency ("$29.95") and abbreviations ("Mr. Smith").
+
+- Fixed an issue in `AWSTranscribeSTTService` where the `region` arg was always
+  set to `us-east-1` when providing an AWS_REGION env var.
+
+- Fixed an issue in `SarvamTTSService` where the last sentence was not being
+  spoken. Now, audio is flushed when the TTS service receives the
+  `LLMFullResponseEndFrame` or `EndFrame`.
+
+- Fixed an issue in `DeepgramTTSService` where a `TTSStoppedFrame` was
+  incorrectly pushed after a function call. This caused an issue with the
+  voice-ui-kit's conversational panel rendering of the LLM output after a
+  function call.
+
+- Fixed an issue where `LLMTextFrame.skip_tts` was being overwritten by LLM
+  services.
+
+- Fixed an issue that caused `WebsocketService` instances to attempt
+  reconnection during shutdown.
+
+- Fixed an issue in `ElevenLabsTTSService` where character usage metrics were
+  only reported on the first TTS generation per turn.
+
 ## [0.0.96] - 2025-11-26 🦃 "Happy Thanksgiving!" 🦃
 
 ### Added
diff --git a/changelog/3072.changed.md b/changelog/3072.changed.md
deleted file mode 100644
index e81058b29..000000000
--- a/changelog/3072.changed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Updated Deepgram logging to include Deepgram request IDs for improved debugging.
diff --git a/changelog/3130.deprecated.md b/changelog/3130.deprecated.md
deleted file mode 100644
index c7a1c66f6..000000000
--- a/changelog/3130.deprecated.md
+++ /dev/null
@@ -1,7 +0,0 @@
-- NVIDIA Services name changes (all functionality is unchanged):
-
-  - `NimLLMService` is now deprecated, use `NvidiaLLMService` instead.
-  - `RivaSTTService` is now deprecated, use `NvidiaSTTService` instead.
-  - `RivaTTSService` is now deprecated, use `NvidiaTTSService` instead.
-  - Use `uv pip install pipecat-ai[nvidia]` instead of
-    `uv pip install pipecat-ai[riva]`
diff --git a/changelog/3132.changed.2.md b/changelog/3132.changed.2.md
deleted file mode 100644
index d7e9c0a5c..000000000
--- a/changelog/3132.changed.2.md
+++ /dev/null
@@ -1,9 +0,0 @@
-- Text Aggregation Improvements:
-
-  - **Breaking Change**: `BaseTextAggregator.aggregate()` now returns
-    `AsyncIterator[Aggregation]` instead of `Optional[Aggregation]`. This
-    enables the aggregator to return multiple results based on the provided
-    text.
-  - Refactored text aggregators to use inheritance: `SkipTagsAggregator` and
-    `PatternPairAggregator` now inherit from `SimpleTextAggregator`, reusing
-    the base class's sentence detection logic.
diff --git a/changelog/3132.changed.md b/changelog/3132.changed.md
deleted file mode 100644
index 4df044f58..000000000
--- a/changelog/3132.changed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Improved interruption handling to prevent bots from repeating themselves. LLM services that return multiple sentences in a single response (e.g., `GoogleLLMService`) are now split into individual sentences before being sent to TTS. This ensures interruptions occur at sentence boundaries, preventing the bot from repeating content after being interrupted during long responses.
diff --git a/changelog/3132.fixed.2.md b/changelog/3132.fixed.2.md
deleted file mode 100644
index 4124deee1..000000000
--- a/changelog/3132.fixed.2.md
+++ /dev/null
@@ -1 +0,0 @@
-- Fixed bug in `PatternPairAggregator` where pattern handlers could be called multiple times for `KEEP` or `AGGREGATE` patterns.
diff --git a/changelog/3132.fixed.md b/changelog/3132.fixed.md
deleted file mode 100644
index 7f063d281..000000000
--- a/changelog/3132.fixed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Fixed sentence aggregation to correctly handle ambiguous punctuation in streaming text, such as currency ("$29.95") and abbreviations ("Mr. Smith").
diff --git a/changelog/3153.fixed.md b/changelog/3153.fixed.md
deleted file mode 100644
index f6ffbe541..000000000
--- a/changelog/3153.fixed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Fixed an issue in `AWSTranscribeSTTService` where the `region` arg was always set to `us-east-1` when providing an AWS_REGION env var.
diff --git a/changelog/3155.fixed.md b/changelog/3155.fixed.md
deleted file mode 100644
index d45c7884f..000000000
--- a/changelog/3155.fixed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Fixed an issue in `SarvamTTSService` where the last sentence was not being spoken. Now, audio is flushed when the TTS services receives the `LLMFullResponseEndFrame` or `EndFrame`.
diff --git a/changelog/3156.fixed.md b/changelog/3156.fixed.md
deleted file mode 100644
index 0704fb154..000000000
--- a/changelog/3156.fixed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Fixed an issue in `DeepgramTTSService` where a `TTSStoppedFrame` was incorrectly pushed after a functional call. This caused an issue with the voice-ui-kit's conversational panel rending of the LLM output after a function call.
diff --git a/changelog/3162.changed.md b/changelog/3162.changed.md
deleted file mode 100644
index ee0483bd1..000000000
--- a/changelog/3162.changed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Updated `AICFilter` to use Quail STT as the default model (`AICModelType.QUAIL_STT`). Quail STT is optimized for human-to-machine interaction (e.g., voice agents, speech-to-text) and operates at a native sample rate of 16 kHz with fixed enhancement parameters.
diff --git a/changelog/3162.deprecated.md b/changelog/3162.deprecated.md
deleted file mode 100644
index 7d7c97810..000000000
--- a/changelog/3162.deprecated.md
+++ /dev/null
@@ -1 +0,0 @@
-- The `noise_gate_enable` parameter in `AICFilter` is deprecated and no longer has any effect. Noise gating is now handled automatically by the AIC VAD system. Use `AICFilter.create_vad_analyzer()` for VAD functionality instead.
diff --git a/changelog/3168.fixed.md b/changelog/3168.fixed.md
deleted file mode 100644
index 0a0826a82..000000000
--- a/changelog/3168.fixed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Fixed an issue where `LLMTextFrame.skip_tts` was being overwritten by LLM services.
diff --git a/changelog/3176.added.md b/changelog/3176.added.md
deleted file mode 100644
index ed3c87886..000000000
--- a/changelog/3176.added.md
+++ /dev/null
@@ -1 +0,0 @@
-- Added new Gradium services, `GradiumSTTService` and `GradiumTTSService`, for speech-to-text and text-to-speech functionality using Gradium's API.
diff --git a/changelog/3176.changed.md b/changelog/3176.changed.md
deleted file mode 100644
index 618837677..000000000
--- a/changelog/3176.changed.md
+++ /dev/null
@@ -1 +0,0 @@
-- If an unexpected exception is caught, or if `FrameProcessor.push_error()` is called with an exception, the file name and line number where the exception occured are now logged.
diff --git a/changelog/3177.changed.md b/changelog/3177.changed.md
deleted file mode 100644
index 5f0ad3e94..000000000
--- a/changelog/3177.changed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Updated Smart Turn model weights to v3.1.
diff --git a/changelog/3181.deprecated.md b/changelog/3181.deprecated.md
deleted file mode 100644
index ee6994373..000000000
--- a/changelog/3181.deprecated.md
+++ /dev/null
@@ -1 +0,0 @@
-- Package `pipecat.sync` is deprecated, use `pipecat.utils.sync` instead.
diff --git a/changelog/3183.changed.md b/changelog/3183.changed.md
deleted file mode 100644
index 6c152022d..000000000
--- a/changelog/3183.changed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Smart Turn analyzer now uses the full context of the turn rather than just the audio since VAD last triggered.
diff --git a/changelog/3184.added.md b/changelog/3184.added.md
deleted file mode 100644
index ab297e631..000000000
--- a/changelog/3184.added.md
+++ /dev/null
@@ -1,6 +0,0 @@
-- Additions for `AsyncAITTSService` and `AsyncAIHttpTTSService`:
-
-  - Added new `languages`: `pt`, `nl`, `ar`, `ru`, `ro`, `ja`, `he`, `hy`,
-    `tr`, `hi`, `zh`.
-  - Updated the default model to `asyncflow_multilingual_v1.0` for improved
-    accuracy and broader language coverage.
diff --git a/changelog/3185.fixed.md b/changelog/3185.fixed.md
deleted file mode 100644
index bef616aea..000000000
--- a/changelog/3185.fixed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Fixed an issue that caused `WebsocketService` instances to attempt reconnection during shutdown.
diff --git a/changelog/3186.fixed.md b/changelog/3186.fixed.md
deleted file mode 100644
index 8c3dfc526..000000000
--- a/changelog/3186.fixed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Fixed an issue in `ElevenLabsTTSService` where character usage metrics were only reported on the first TTS generation per turn.
diff --git a/changelog/3187.added.md b/changelog/3187.added.md
deleted file mode 100644
index 967655a3c..000000000
--- a/changelog/3187.added.md
+++ /dev/null
@@ -1 +0,0 @@
-- Added optional tool and tool output filters for MCP services.
\ No newline at end of file
diff --git a/changelog/3192.changed.md b/changelog/3192.changed.md
deleted file mode 100644
index 1770be384..000000000
--- a/changelog/3192.changed.md
+++ /dev/null
@@ -1 +0,0 @@
-- Updated `CartesiaSTTService` to return the full transcription `result` in the `TranscriptionFrame` and `InterimTranscriptionFrame`. This provides access to word timestamp data.
diff --git a/changelog/3195.changed.md b/changelog/3195.changed.md
deleted file mode 100644
index 168f3857d..000000000
--- a/changelog/3195.changed.md
+++ /dev/null
@@ -1,7 +0,0 @@
-- `HumeTTSService` changes:
-
-  - Added tracking headers (`X-Hume-Client-Name` and `X-Hume-Client-Version`)
-    to all requests made by `HumeTTSService` to the Hume API for better usage
-    tracking and analytics.
-  - Added `stop()` and `cancel()` cleanup methods to `HumeTTSService` to
-    properly close the HTTP client and prevent resource leaks.
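
Migration note for the breaking `BaseTextAggregator.aggregate()` change listed under "Changed": the sketch below shows, under stated assumptions, what iterating an `AsyncIterator[Aggregation]` looks like for a caller that previously handled a single optional result. It does not import Pipecat. `Aggregation` is a stand-in dataclass and `ToySentenceAggregator` and `collect` are hypothetical names invented for illustration; only the return-type change itself comes from the changelog entry.

```python
import asyncio
import re
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class Aggregation:
    """Stand-in for Pipecat's Aggregation type (assumed shape: text only)."""

    text: str


class ToySentenceAggregator:
    """Hypothetical aggregator for illustration; NOT Pipecat's implementation."""

    def __init__(self) -> None:
        self._buffer = ""

    async def aggregate(self, text: str) -> AsyncIterator[Aggregation]:
        # An async generator: one input chunk may yield zero, one, or
        # several aggregations, which a single Optional return cannot express.
        self._buffer += text
        while True:
            # Naive rule: a sentence ends at '.', '!' or '?' plus whitespace.
            match = re.search(r"[.!?]\s+", self._buffer)
            if match is None:
                break
            sentence = self._buffer[: match.end()].strip()
            self._buffer = self._buffer[match.end():]
            yield Aggregation(text=sentence)


async def collect(chunks: List[str]) -> List[str]:
    """Feed streamed chunks to the aggregator and gather completed sentences."""
    aggregator = ToySentenceAggregator()
    sentences: List[str] = []
    for chunk in chunks:
        # New calling pattern: iterate instead of checking one optional result.
        async for aggregation in aggregator.aggregate(chunk):
            sentences.append(aggregation.text)
    return sentences


if __name__ == "__main__":
    print(asyncio.run(collect(["Hello there. How are", " you? I am fine. Bye"])))
```

With the two chunks above, the second chunk completes two sentences at once, so the caller receives `Hello there.`, then `How are you?` and `I am fine.`; the trailing `Bye` stays buffered because no terminator has arrived. The naive regular expression here would mis-split abbreviations such as "Mr. Smith", which is the kind of ambiguous punctuation the "Fixed" section says the real sentence aggregation now handles.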