* Added Sarvam TTS WebSocket implementation
* Addressed some of the comments on PR
* added change voice logic
* added changes from main
* pushing text frames and added flush audio
* updated docstrings for better docs
* Addressed comments and added some improvements
* pushed optional args down
* removed new line
* made aiohttp session mandatory in http service
* added push frame and removed unused function
* removed pong message
* added disconnecting logic
---------
Co-authored-by: vinayak-sarvam <vinayak@sarvam.ai>
This patch uses the `wait_for2` package to implement `asyncio.wait_for()` for
Python < 3.12.
In Python 3.12, `asyncio.wait_for()` is implemented in terms of
`asyncio.timeout()`, which fixed a number of issues. However, this was never
backported (because of the lack of `asyncio.timeout()`) and there are still many
remaining issues, especially in Python 3.10, in `asyncio.wait_for()`.
See https://github.com/python/cpython/pull/98518
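A minimal sketch of the version switch described above, not necessarily the code in this patch. It assumes `wait_for2` exposes a drop-in `wait_for()` function. The fallback to the standard library when the package is missing, and the `slow()` and `call_with_timeout()` helpers, are illustrative additions.

```python
import asyncio
import sys

# Assumption: `wait_for2.wait_for` is a drop-in replacement for
# `asyncio.wait_for`. On Python >= 3.12 the standard library version is used.
if sys.version_info < (3, 12):
    try:
        from wait_for2 import wait_for
    except ImportError:
        # Illustrative fallback so the sketch runs without the package.
        wait_for = asyncio.wait_for
else:
    wait_for = asyncio.wait_for


async def slow() -> str:
    """Hypothetical long-running operation."""
    await asyncio.sleep(10)
    return "done"


async def call_with_timeout() -> str:
    """Run `slow()` with a short timeout and report what happened."""
    try:
        return await wait_for(slow(), timeout=0.05)
    except asyncio.TimeoutError:
        return "timed out"
```

Callers import the selected `wait_for` from one place, so the rest of the code base does not need to know which implementation is in use.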
Changes
Split out module attributes to make engine settings clearer
Removed internal audio buffer to use latest Speechmatics python SDK (0.4.0)
Use diarization for improved VAD in multi-speaker situations
Support custom dictionary / vocabulary with attributes
Deprecated attributes superseded by re-organised attributes
Diarization Enhancements
Focus on specific speakers (using speaker labels)
Ignore specific speakers (using speaker labels)
Separate transcription formats for active and inactive speakers
Support for known speakers
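The focus and ignore behaviour listed above can be pictured with a small sketch. This is not the service's implementation: `Segment`, `partition_segments`, and the `S1`-style labels are hypothetical names chosen for illustration, and the exact rules (ignored speakers dropped, non-focused speakers treated as inactive) are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass
class Segment:
    """Hypothetical transcript segment tagged with a diarization speaker label."""

    speaker: str
    text: str


def partition_segments(
    segments: Iterable[Segment],
    focus: Optional[Set[str]] = None,
    ignore: Optional[Set[str]] = None,
) -> Tuple[List[Segment], List[Segment]]:
    """Split segments into (active, inactive) lists by speaker label.

    Assumed rules: ignored speakers are dropped entirely; when `focus` is
    set, only those speakers are active and everyone else is inactive.
    """
    active: List[Segment] = []
    inactive: List[Segment] = []
    for seg in segments:
        if ignore and seg.speaker in ignore:
            continue  # dropped: this speaker is explicitly ignored
        if not focus or seg.speaker in focus:
            active.append(seg)
        else:
            inactive.append(seg)
    return active, inactive
```

Keeping the two lists separate is what would allow active and inactive speakers to be formatted differently downstream.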
* Adds pipecat.runner.run - FastAPI-based development server with automatic bot discovery
* Adds new RunnerArguments types for different transports
* Adds new runner utils for creating transports and parsing data
* Adds new Daily and LiveKit utils for setup
* initial config
* skeleton
* Added a README (to be added to).
* Payloads coming from the ASR.
* doc update
* handle the partials and finals
* enable diarization in the example
* support sending messages to pipecat pipeline
* requirements fix in README
* updated example (with amusement)
* updated example to match master
* updated docs
* support for diarization tags
* logic fix for wrapper
* Use an internal SpeechFrame for speaker_id (not user_id).
* only include speaker tags on finalised transcript (as this may skew end of utterance detection)
* updated docs
* correction to docs and updated example
* updated requirement
* Fix for using default EU server.
* Updates from PR comments.
* Refactor based on comments in the original PR.
Primary focus on documentation, naming conventions and how `user_id` is used.
* Check for SMX installed when importing.
* Variable name change
* Comment correction.
* Support for Esperanto and Uyghur
* Improved language support
* function name change
* Locale fix
* intercept
* interim changes
* pass the pipeline task to the module for adding events to the top of the pipeline
* logging for the pipeline
* Reduce timeout for content aggregator.
* staged update
* testing with Azure
* Updated context (Azure was dropping punctuation) and using better ElevenLabs model.
* Updated to RT 0.3.0 and use OpenAI (not Azure).
* Missing OpenAI import; parameter name change for output locale validation.
* Revert to `0.2.0` of RT SDK.
* fix for assignment of `output_locale_code`.
* update Speechmatics library to 0.3.1
* new transcription example
* updated asyncio task handling
* Updated docstrings
* enable OpenTelemetry logging
* removed import from stt for __init__
* updated examples and default values
* updated examples
* prevent lock up when closing the STT connection