diff --git a/changelog/4344.fixed.md b/changelog/4344.fixed.md
new file mode 100644
index 000000000..4e786f65c
--- /dev/null
+++ b/changelog/4344.fixed.md
@@ -0,0 +1 @@
+- Fixed `ElevenLabsTTSService` and `ElevenLabsHttpTTSService` emitting word timestamps and `TTSTextFrame` content that matched the input text instead of the spoken audio when a pronunciation dictionary (`pronunciation_dictionary_locators`) or text normalization rewrote the input. Both services now consume ElevenLabs' normalized alignment, so downstream consumers (captions, transcripts, context aggregation) reflect what the listener actually hears.