diff --git a/CHANGELOG.md b/CHANGELOG.md
index 6d55b97c3..42fad1c44 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -128,8 +128,60 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   Gemini models. Added foundational example
   `14p-function-calling-gemini-vertex-ai.py`.
 
+- Added support in `OpenAIRealtimeBetaLLMService` for the
+  `conversation.item.input_audio_transcription.delta` server message.
+
+- Added error handling in `OpenAIRealtimeBetaLLMService` for the
+  `response.done` server message reporting a failure.
+
+- Added support in `OpenAIRealtimeBetaLLMService` for the new
+  `gpt-4o-transcribe-latest` input audio transcription model.
+
+- Added support in `OpenAIRealtimeBetaLLMService` for the new
+  `input_audio_noise_reduction` session property.
+
+  ```python
+  session_properties = SessionProperties(
+    # ...
+    input_audio_noise_reduction=InputAudioNoiseReduction(
+      type="near_field" # also supported: "far_field"
+    )
+    # ...
+  )
+  ```
+
+- Added support in `OpenAIRealtimeBetaLLMService` for the new
+  `semantic_vad` `turn_detection` session property, which is a more
+  sophisticated model for detecting when the user has stopped speaking.
+
+- Added `on_conversation_item_created` and `on_conversation_item_updated`
+  events to `OpenAIRealtimeBetaLLMService`.
+
+  ```python
+  @llm.event_handler("on_conversation_item_created")
+  async def on_conversation_item_created(llm, item_id, item):
+    # ...
+
+  @llm.event_handler("on_conversation_item_updated")
+  async def on_conversation_item_updated(llm, item_id, item):
+    # `item` may not always be available here
+    # ...
+  ```
+
+- Added `retrieve_conversation_item(item_id)` to `OpenAIRealtimeBetaLLMService`.
+
+  ```python
+  item = await llm.retrieve_conversation_item(item_id)
+  ```
+
 ### Changed
 
+- Updated the default model for `CartesiaTTSService` and
+  `CartesiaHttpTTSService` to `sonic-2`.
+
+- Updated the default model for `OpenAIRealtimeBetaLLMService` to
+  `gpt-4o-realtime-preview-latest`.
+
 - Function calls are now executed in tasks. This means that the pipeline
   will not be blocked while the function call is being executed.
 
@@ -216,6 +268,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Fixed an issue in `RimeTTSService` where the last line of text sent didn't
   result in an audio output being generated.
 
+- Fixed `OpenAIRealtimeBetaLLMService` by adding support for the
+  `conversation.item.input_audio_transcription.delta` server message, which was
+  added server-side at some point and not handled client-side.
+
 ### Other
 
 - Add foundational example `07w-interruptible-fal.py`, showing `FalSTTService`.