diff --git a/changelog/4217.added.2.md b/changelog/4217.added.2.md
new file mode 100644
index 000000000..9621d43c0
--- /dev/null
+++ b/changelog/4217.added.2.md
@@ -0,0 +1 @@
+- Added `group_parallel_tools` parameter to `LLMService` (default `True`). When `True`, all function calls from the same LLM response batch share a group ID and the LLM is triggered exactly once after the last call completes. Set to `False` to trigger inference independently for each function call result as it arrives.
diff --git a/changelog/4217.added.md b/changelog/4217.added.md
new file mode 100644
index 000000000..a14d1eb9e
--- /dev/null
+++ b/changelog/4217.added.md
@@ -0,0 +1 @@
+- Added async function call support to `register_function()` and `register_direct_function()` via `cancel_on_interruption=False`. When set to `False`, the LLM continues the conversation immediately without waiting for the function result. The result is injected back into the context as a `developer` message once available, triggering a new LLM inference at that point.
diff --git a/changelog/4217.changed.md b/changelog/4217.changed.md
new file mode 100644
index 000000000..af8831c3b
--- /dev/null
+++ b/changelog/4217.changed.md
@@ -0,0 +1 @@
+- When multiple function calls are returned in a single LLM response, the LLM is now triggered exactly once after the last call in the batch completes, rather than waiting for all function calls.
diff --git a/changelog/4217.fixed.2.md b/changelog/4217.fixed.2.md
new file mode 100644
index 000000000..af7bff8fa
--- /dev/null
+++ b/changelog/4217.fixed.2.md
@@ -0,0 +1 @@
+- Fixed `BaseOutputTransport` discarding pending `UninterruptibleFrame` items (e.g. function-call context updates) when an interruption arrived. The audio task is now kept alive and only interruptible frames are drained when uninterruptible frames are present in the queue.
diff --git a/changelog/4217.fixed.md b/changelog/4217.fixed.md
new file mode 100644
index 000000000..234effdc0
--- /dev/null
+++ b/changelog/4217.fixed.md
@@ -0,0 +1 @@
+- Fixed spurious LLM inference being triggered when a function call result arrived while the user was actively speaking. The context frame is now suppressed until the user stops speaking.
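
Reviewer note: the grouping behavior described in `4217.added.2.md` can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: it uses plain `asyncio` and does not call the library. `run_batch`, `make_call`, and `demo` are hypothetical helpers invented for illustration; only the flag name `group_parallel_tools` comes from the changelog entry. Each inner list is the set of results that would be visible when an inference is triggered.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_batch(
    calls: List[Callable[[], Awaitable[str]]],
    group_parallel_tools: bool = True,
) -> List[List[str]]:
    """Run one batch of tool calls and record the results seen at each trigger.

    Hypothetical model of the changelog text, not the library's implementation.
    """
    results: List[str] = []
    triggers: List[List[str]] = []
    remaining = len(calls)

    async def run_one(call: Callable[[], Awaitable[str]]) -> None:
        nonlocal remaining
        results.append(await call())
        remaining -= 1
        # Grouped: trigger only once the whole batch has completed.
        # Ungrouped: trigger as each individual result arrives.
        if not group_parallel_tools or remaining == 0:
            triggers.append(list(results))

    await asyncio.gather(*(run_one(c) for c in calls))
    return triggers


def make_call(name: str, delay: float) -> Callable[[], Awaitable[str]]:
    """Build a fake tool call that returns its own name after `delay` seconds."""

    async def call() -> str:
        await asyncio.sleep(delay)
        return name

    return call


def demo(group_parallel_tools: bool) -> List[List[str]]:
    # "time" finishes before "weather" because its delay is shorter.
    calls = [make_call("weather", 0.05), make_call("time", 0.01)]
    return asyncio.run(run_batch(calls, group_parallel_tools))
```

With `demo(True)` the simulated batch produces a single trigger after both calls finish; with `demo(False)` it produces one trigger per result, so the first inference sees only the faster call's result.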