diff --git a/changelog/3583.added.2.md b/changelog/3583.added.2.md
new file mode 100644
index 000000000..f0c5c7d1e
--- /dev/null
+++ b/changelog/3583.added.2.md
@@ -0,0 +1 @@
+- Added `VADController` for managing voice activity detection state and emitting speech events independently of transport or pipeline processors.
diff --git a/changelog/3583.added.3.md b/changelog/3583.added.3.md
new file mode 100644
index 000000000..f5da90407
--- /dev/null
+++ b/changelog/3583.added.3.md
@@ -0,0 +1 @@
+- Added `VADProcessor` for detecting speech in audio streams within a pipeline. Pushes `VADUserStartedSpeakingFrame`, `VADUserStoppedSpeakingFrame`, and `UserSpeakingFrame` downstream based on VAD state changes.
diff --git a/changelog/3583.added.md b/changelog/3583.added.md
new file mode 100644
index 000000000..47048e226
--- /dev/null
+++ b/changelog/3583.added.md
@@ -0,0 +1,10 @@
+- Added `vad_analyzer` parameter to `LLMUserAggregatorParams`. VAD analysis is now handled inside the `LLMUserAggregator` rather than in the transport, keeping voice activity detection closer to where it is consumed. The `vad_analyzer` on `BaseInputTransport` is now deprecated.
+
+  ```python
+  context_aggregator = LLMContextAggregatorPair(
+      context,
+      user_params=LLMUserAggregatorParams(
+          vad_analyzer=SileroVADAnalyzer(),
+      ),
+  )
+  ```
diff --git a/changelog/3583.deprecated.md b/changelog/3583.deprecated.md
new file mode 100644
index 000000000..d4ea492eb
--- /dev/null
+++ b/changelog/3583.deprecated.md
@@ -0,0 +1 @@
+- ⚠️ Deprecated `vad_analyzer` parameter on `BaseInputTransport`. Pass `vad_analyzer` to `LLMUserAggregatorParams` instead or use `VADProcessor` in the pipeline.