Realtime Agent Studio Engine
This repo contains a Python 3.11+ codebase for building low-latency, realtime human-agent voice pipelines (capture, stream, and process audio) using WebSockets or WebRTC. It is currently in an early, experimental stage.
Usage
Start
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
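Notes
- Agent YAML is no longer loaded through startup arguments.
- During a session, the runtime configuration is always fetched by assistant_id:
  - With BACKEND_URL set: fetched from the backend API.
  - Without BACKEND_URL (or with BACKEND_MODE=disabled): read from ASSISTANT_LOCAL_CONFIG_DIR/<assistant_id>.yaml.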
Backend Integration
Engine runtime now supports adapter-based backend integration:
- BACKEND_MODE=auto|http|disabled
- BACKEND_URL + BACKEND_TIMEOUT_SEC
- ASSISTANT_LOCAL_CONFIG_DIR (default engine/config/agents)
- HISTORY_ENABLED=true|false
Behavior:
- auto: use HTTP backend only when BACKEND_URL is set, otherwise engine-only mode.
- http: force HTTP backend; falls back to engine-only mode when URL is missing.
- disabled: force engine-only mode (no backend calls).
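The mode rules above can be sketched as a small pure function. This is an illustration only, not the engine's actual code: the function name resolve_backend_mode, the return labels, and the default of auto when BACKEND_MODE is unset are assumptions.

```python
import os
from typing import Mapping, Optional


def resolve_backend_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "http" or "engine-only" following the documented rules.

    Hypothetical helper; the default of "auto" for an unset BACKEND_MODE
    is an assumption, not something the README states.
    """
    env = os.environ if env is None else env
    mode = env.get("BACKEND_MODE", "auto").strip().lower()
    url = env.get("BACKEND_URL", "").strip()

    if mode == "disabled":
        # disabled: never call the backend, even if a URL is present.
        return "engine-only"
    if mode in ("auto", "http"):
        # auto and http both end up engine-only when no URL is configured.
        return "http" if url else "engine-only"
    raise ValueError(f"unsupported BACKEND_MODE: {mode!r}")
```

As documented, auto and http resolve identically; a real implementation might still log a warning when http is forced without a URL.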
Assistant config source behavior:
- If BACKEND_URL is configured and backend mode is enabled, assistant config is loaded from backend API.
- If BACKEND_URL is empty (or backend mode is disabled), assistant config is loaded from local YAML.
Local assistant YAML example:
- File path: engine/config/agents/<assistant_id>.yaml
- Runtime still requires WebSocket query param assistant_id; it must match the local file name.
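To make the file-name rule concrete, here is a minimal sketch of mapping an assistant_id to its local YAML path. The helper name local_config_path is hypothetical, and the rejection of path separators is an added safety assumption rather than documented engine behavior.

```python
from pathlib import Path


def local_config_path(assistant_id: str,
                      config_dir: str = "engine/config/agents") -> Path:
    """Build <config_dir>/<assistant_id>.yaml for a given assistant_id.

    Hypothetical helper. config_dir mirrors the documented default of
    ASSISTANT_LOCAL_CONFIG_DIR.
    """
    # Assumption: ids that could escape the config directory are rejected.
    if (not assistant_id
            or assistant_id in {".", ".."}
            or "/" in assistant_id
            or "\\" in assistant_id):
        raise ValueError(f"invalid assistant_id: {assistant_id!r}")
    return Path(config_dir) / f"{assistant_id}.yaml"
```

For example, local_config_path("demo") points at engine/config/agents/demo.yaml, so a client connecting with assistant_id=demo would need that file to exist.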
History write path is now asynchronous and buffered per session:
- HISTORY_QUEUE_MAX_SIZE
- HISTORY_RETRY_MAX_ATTEMPTS
- HISTORY_RETRY_BACKOFF_SEC
- HISTORY_FINALIZE_DRAIN_TIMEOUT_SEC
This keeps turn processing responsive even when backend history APIs are slow/failing.
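The sketch below shows one way a buffered, asynchronous history writer of this kind could work; it is not the engine's implementation. The class name HistoryBuffer, the send callback, the linear backoff, and all default values are assumptions. The constructor and finalize parameters loosely mirror the four settings listed above.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class HistoryBuffer:
    """Per-session bounded queue drained by a background task (hypothetical)."""

    def __init__(self, send: Callable[[dict], Awaitable[None]],
                 max_size: int = 256, max_attempts: int = 3,
                 backoff_sec: float = 0.2) -> None:
        self._send = send
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=max_size)
        self._max_attempts = max_attempts
        self._backoff_sec = backoff_sec
        self._worker: Optional[asyncio.Task] = None
        self.dropped = 0  # records lost to a full queue or exhausted retries

    def start(self) -> None:
        # Must be called from inside a running event loop.
        self._worker = asyncio.create_task(self._run())

    def enqueue(self, record: dict) -> bool:
        """Non-blocking: the turn loop never waits on the backend."""
        try:
            self._queue.put_nowait(record)
            return True
        except asyncio.QueueFull:
            self.dropped += 1
            return False

    async def _run(self) -> None:
        while True:
            record = await self._queue.get()
            try:
                await self._send_with_retry(record)
            finally:
                self._queue.task_done()

    async def _send_with_retry(self, record: dict) -> None:
        for attempt in range(1, self._max_attempts + 1):
            try:
                await self._send(record)
                return
            except Exception:
                if attempt == self._max_attempts:
                    self.dropped += 1  # give up on this record
                    return
                # Linear backoff between attempts (an assumed policy).
                await asyncio.sleep(self._backoff_sec * attempt)

    async def finalize(self, drain_timeout_sec: float = 2.0) -> None:
        """Wait up to drain_timeout_sec for pending writes, then stop."""
        try:
            await asyncio.wait_for(self._queue.join(), timeout=drain_timeout_sec)
        except asyncio.TimeoutError:
            pass  # leave undelivered records behind rather than block shutdown
        finally:
            if self._worker is not None:
                self._worker.cancel()
                await asyncio.gather(self._worker, return_exceptions=True)
```

A session would call start() when it opens, enqueue() after each turn, and finalize() when the connection closes. Dropping on a full queue trades completeness of history for responsiveness, which matches the stated goal of keeping turn processing unaffected by a slow backend.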
Detailed notes: docs/backend_integration.md.
Testing
python examples/test_websocket.py
python mic_client.py
WS Protocol
/ws uses a strict v1 JSON control protocol with binary PCM audio frames.
See docs/ws_v1_schema.md.