Remove unnecessary SpeechStarted fallback in STT mode

u3-rt-pro guarantees SpeechStarted is always sent before transcripts,
so the fallback UserStartedSpeakingFrame broadcast is never needed.

This ensures UserStartedSpeakingFrame and UserStoppedSpeakingFrame are always paired:
- Start: always from _handle_speech_started
- Stop: always from _handle_transcription on the final turn
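
A minimal sketch of the pairing described above, assuming SpeechStarted always precedes transcripts. TurnTracker and its methods are hypothetical stand-ins used only for illustration; they are not the AssemblyAISTTService implementation, and frame names are recorded as plain strings rather than broadcast:

```python
import asyncio


class TurnTracker:
    """Hypothetical stand-in for the STT service (illustration only)."""

    def __init__(self):
        self.frames = []            # frame names, in the order they were emitted
        self.user_speaking = False

    async def handle_speech_started(self):
        # Start: the only place a start frame is emitted.
        self.user_speaking = True
        self.frames.append("UserStartedSpeakingFrame")

    async def handle_transcription(self, text, is_final_turn):
        # No fallback start here: SpeechStarted is assumed to have
        # arrived before any transcript.
        self.frames.append(f"Transcript({text})")
        if is_final_turn:
            # Stop: the only place a stop frame is emitted.
            self.user_speaking = False
            self.frames.append("UserStoppedSpeakingFrame")


async def demo():
    tracker = TurnTracker()
    await tracker.handle_speech_started()
    await tracker.handle_transcription("hello", is_final_turn=False)
    await tracker.handle_transcription("hello world", is_final_turn=True)
    return tracker.frames


frames = asyncio.run(demo())
print(frames)
```

With a single emitter for each frame type, every turn yields exactly one start and one stop, provided the ordering assumption holds.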
zack
2026-02-27 15:00:38 -05:00
parent aa7e9a17d5
commit 6ba9f780b0

@@ -752,14 +752,9 @@ class AssemblyAISTTService(WebsocketSTTService):
)
else:
# --- Mode 2: STT turn detection ---
# SpeechStarted handles UserStartedSpeakingFrame + interruption.
# If SpeechStarted hasn't fired yet (shouldn't happen, but guard),
# broadcast here as fallback.
# SpeechStarted always arrives before transcripts with u3-rt-pro,
# so UserStartedSpeakingFrame is guaranteed to be broadcast first.
logger.debug(f"{self} Transcript received in STT mode (_user_speaking={self._user_speaking})")
if not self._user_speaking:
logger.warning(f"{self} Transcript arrived before SpeechStarted, broadcasting fallback UserStartedSpeakingFrame")
await self.broadcast_frame(UserStartedSpeakingFrame)
self._user_speaking = True
if is_final_turn:
# STT mode: AssemblyAI controls finalization, just mark as finalized