* initial config
* skeleton
* Added a README (to be added to).
* Payloads coming from the ASR.
* doc update
* handle the partials and finals
* enable diarization in the example
* support sending messages to pipecat pipeline
* requirements fix in README
* updated example (with amusement)
* updated example to match master
* updated docs
* support for diarization tags
* logic fix for wrapper
* Use an internal SpeechFrame for speaker_id (not user_id).
* only include speaker tags on finalised transcript (as this may skew end of utterance detection)
* updated docs
* correction to docs and updated example
* updated requirement
* Fix for using default EU server.
* Updates from PR comments.
* Refactor based on comments in the original PR.
Primary focus on documentation, naming conventions and how `user_id` is used.
* Check for SMX installed when importing.
* Variable name change
* Comment correction.
* Support for Esperanto and Uyghur
* Improved language support
* function name change
* Locale fix
* intercept
* interim changes
* pass the pipeline task to the module for adding events to the top of the pipeline
* logging for the pipeline
* Reduce timeout for content aggregator.
* staged update
* testing with Azure
* Updated context (Azure was dropping punctuation) and using better ElevenLabs model.
* Updated to RT 0.3.0 and use OpenAI (not Azure).
* Missing OpenAI import; parameter name change for output locale validation.
* Revert to `0.2.0` of RT SDK.
* fix for assignment of `output_locale_code`.
* update Speechmatics library to 0.3.1
* new transcription example
* updated asyncio task handling
* Updated doc strings
* enable OpenTelemetry logging
* removed import from stt for __init__
* updated examples and default values
* updated examples
* prevent lock up when closing the STT connection
The parameter `video_in_enabled=True` was missing from `DailyParams`, which prevented image capture
from working. Without it, a `UserImageRequestFrame` would be sent, but no image data would be captured from participants.
This fix enables the "Let me take a look" functionality to work as
intended by allowing the transport to capture video frames for vision processing with Moondream.
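
The failure mode described above can be illustrated with a minimal, self-contained sketch. It does not use the real transport classes: `TransportParamsSketch` and `handle_image_request` are hypothetical stand-ins, and the default of `False` for `video_in_enabled` is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransportParamsSketch:
    """Hypothetical stand-in for the transport params; not the real DailyParams."""

    audio_out_enabled: bool = True
    video_in_enabled: bool = False  # assumed default for this sketch


def handle_image_request(
    params: TransportParamsSketch, latest_frame: Optional[bytes]
) -> Optional[bytes]:
    """Return the participant's latest video frame, or None if capture is off."""
    if not params.video_in_enabled:
        # The request is accepted, but no frame is ever made available.
        return None
    return latest_frame


# Before the fix: the flag is left at its default, so the request yields nothing.
broken = TransportParamsSketch()
# After the fix: the flag is set explicitly, so frames reach the vision step.
fixed = TransportParamsSketch(video_in_enabled=True)
```

With `broken`, an image request returns `None` even though a frame exists; with `fixed`, the same request returns the frame.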