
Realtime Agent Studio Engine

This repo contains a Python 3.11+ codebase for building low-latency realtime voice pipelines for human-agent interaction (capture, stream, and process audio) over WebSockets or WebRTC. It is currently in an early, experimental stage.

Usage

Start

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

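Notes

  • Agent YAML is no longer loaded via startup arguments.
  • During a session, runtime configuration is always fetched by assistant_id:
    • BACKEND_URL set: fetched from the backend API.
    • BACKEND_URL empty (or BACKEND_MODE=disabled): loaded from ASSISTANT_LOCAL_CONFIG_DIR/<assistant_id>.yaml.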

Backend Integration

The engine runtime now supports adapter-based backend integration, configured through the following settings:

  • BACKEND_MODE=auto|http|disabled
  • BACKEND_URL + BACKEND_TIMEOUT_SEC
  • ASSISTANT_LOCAL_CONFIG_DIR (default engine/config/agents)
  • HISTORY_ENABLED=true|false

Behavior:

  • auto: use HTTP backend only when BACKEND_URL is set, otherwise engine-only mode.
  • http: force the HTTP backend; fall back to engine-only mode when BACKEND_URL is missing.
  • disabled: force engine-only mode (no backend calls).
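
The mode rules above can be sketched as a small resolver. This is only an illustration, not the engine's actual code: the function name `resolve_backend_mode`, the "auto" default when BACKEND_MODE is unset, and the error raised for unknown values are all assumptions.

```python
import os
from typing import Mapping, Optional


def resolve_backend_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "http" or "engine-only" according to the rules listed above.

    Hypothetical helper: the "auto" default and the ValueError for unknown
    modes are assumptions, not documented engine behavior.
    """
    env = os.environ if env is None else env
    mode = env.get("BACKEND_MODE", "auto").strip().lower()
    url = env.get("BACKEND_URL", "").strip()

    if mode == "disabled":
        # Never call the backend, even if a URL is configured.
        return "engine-only"
    if mode in ("auto", "http"):
        # Both modes end up engine-only when no URL is available.
        return "http" if url else "engine-only"
    raise ValueError(f"unsupported BACKEND_MODE: {mode!r}")
```

Passing a plain dict instead of reading os.environ keeps the sketch easy to test, for example `resolve_backend_mode({"BACKEND_MODE": "http"})`.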

Assistant config source behavior:

  • If BACKEND_URL is configured and backend mode is enabled, assistant config is loaded from backend API.
  • If BACKEND_URL is empty (or backend mode is disabled), assistant config is loaded from local YAML.

Local assistant YAML location:

  • File path: engine/config/agents/<assistant_id>.yaml
  • Runtime still requires WebSocket query param assistant_id; it must match the local file name.
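
As a rough sketch of how an assistant_id could map to its local file, the helper below builds the path from the default directory mentioned above. The name `local_config_path` and the restriction of ids to letters, digits, `_` and `-` are assumptions made for illustration (the restriction is there to avoid path traversal); the engine's real lookup may differ.

```python
import re
from pathlib import Path

# Assumed id format; the engine may accept a different character set.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def local_config_path(assistant_id: str,
                      config_dir: str = "engine/config/agents") -> Path:
    """Build <config_dir>/<assistant_id>.yaml for a validated assistant_id."""
    if not _SAFE_ID.fullmatch(assistant_id):
        # Reject ids such as "../secrets" that could escape config_dir.
        raise ValueError(f"invalid assistant_id: {assistant_id!r}")
    return Path(config_dir) / f"{assistant_id}.yaml"
```

With this sketch, `local_config_path("demo")` points at `engine/config/agents/demo.yaml`, matching a WebSocket connection that passes `assistant_id=demo`.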

The history write path is now asynchronous and buffered per session, tuned by the following settings:

  • HISTORY_QUEUE_MAX_SIZE
  • HISTORY_RETRY_MAX_ATTEMPTS
  • HISTORY_RETRY_BACKOFF_SEC
  • HISTORY_FINALIZE_DRAIN_TIMEOUT_SEC

This keeps turn processing responsive even when the backend history APIs are slow or failing.
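
One way to picture such a write path is a bounded asyncio queue drained by a background task. The sketch below is not the engine's implementation: the class `HistoryWriter`, its method names, the linear backoff, and the choice to drop records when the queue is full are all assumptions. The constructor arguments only mirror the four settings above.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class HistoryWriter:
    """Hypothetical per-session buffered writer (illustrative only)."""

    def __init__(self, send: Callable[[dict], Awaitable[None]],
                 queue_max_size: int = 256,
                 retry_max_attempts: int = 3,
                 retry_backoff_sec: float = 0.2) -> None:
        self._send = send
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=queue_max_size)
        self._attempts = retry_max_attempts
        self._backoff = retry_backoff_sec
        self._task: Optional[asyncio.Task] = None
        self.dropped = 0  # records lost to a full queue or exhausted retries

    def start(self) -> None:
        self._task = asyncio.create_task(self._worker())

    def enqueue(self, record: dict) -> bool:
        """Non-blocking: the turn loop never waits on the backend."""
        try:
            self._queue.put_nowait(record)
            return True
        except asyncio.QueueFull:
            self.dropped += 1
            return False

    async def _worker(self) -> None:
        while True:
            record = await self._queue.get()
            try:
                for attempt in range(1, self._attempts + 1):
                    try:
                        await self._send(record)
                        break
                    except Exception:
                        if attempt == self._attempts:
                            self.dropped += 1  # give up on this record
                        else:
                            await asyncio.sleep(self._backoff * attempt)
            finally:
                self._queue.task_done()

    async def finalize(self, drain_timeout_sec: float = 2.0) -> bool:
        """Wait up to drain_timeout_sec for pending writes, then stop."""
        try:
            await asyncio.wait_for(self._queue.join(), timeout=drain_timeout_sec)
            drained = True
        except asyncio.TimeoutError:
            drained = False
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
        return drained
```

In this design the session would call `enqueue` after each turn and `finalize` once on disconnect, so a slow backend costs at most the drain timeout at session end rather than latency on every turn.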

Detailed notes: docs/backend_integration.md.

Testing

python examples/test_websocket.py
python mic_client.py

WS Protocol

/ws uses a strict v1 JSON control protocol with binary PCM audio frames.

See docs/ws_v1_schema.md.

Reference