Xin Wang 00b88c5afa Add manual opener tool calls to Assistant model and API
- Introduced `manual_opener_tool_calls` field in the Assistant model to support custom tool calls.
- Updated AssistantBase and AssistantUpdate schemas to include the new field.
- Implemented normalization and migration logic for handling manual opener tool calls in the API.
- Enhanced runtime metadata to include manual opener tool calls in responses.
- Updated tests to validate the new functionality and ensure proper handling of tool calls.
- Refactored tool ID normalization to support legacy tool names for backward compatibility.
2026-03-02 12:34:42 +08:00

py-active-call-cc

Python Active-Call: real-time audio streaming with WebSocket and WebRTC.

This repo contains a Python 3.11+ codebase for building low-latency voice pipelines (capture, stream, and process audio) using WebRTC and WebSockets. It is currently in an early, experimental stage.

Usage

Start

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Using an agent profile (recommended)

python -m app.main --agent-profile default

Using a specific YAML file

python -m app.main --agent-config config/agents/default.yaml

Agent config path precedence

  1. --agent-config
  2. --agent-profile (maps to config/agents/<profile>.yaml)
  3. AGENT_CONFIG_PATH
  4. AGENT_PROFILE
  5. config/agents/default.yaml (if it exists)
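
The precedence order above can be sketched as a small resolver. This is an illustrative sketch, not the engine's actual code: the function name `resolve_agent_config_path` and its parameters are hypothetical, and returning `None` when nothing matches is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

AGENTS_DIR = Path("config/agents")


def resolve_agent_config_path(
    cli_config: Optional[str] = None,
    cli_profile: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    agents_dir: Path = AGENTS_DIR,
) -> Optional[Path]:
    """Pick the agent YAML path; the first matching source wins."""
    env = os.environ if env is None else env
    # 1. --agent-config: an explicit path on the command line.
    if cli_config:
        return Path(cli_config)
    # 2. --agent-profile: maps to config/agents/<profile>.yaml.
    if cli_profile:
        return agents_dir / f"{cli_profile}.yaml"
    # 3. AGENT_CONFIG_PATH environment variable.
    if env.get("AGENT_CONFIG_PATH"):
        return Path(env["AGENT_CONFIG_PATH"])
    # 4. AGENT_PROFILE environment variable.
    if env.get("AGENT_PROFILE"):
        return agents_dir / f"{env['AGENT_PROFILE']}.yaml"
    # 5. config/agents/default.yaml, only if the file exists.
    default = agents_dir / "default.yaml"
    # Assumption: no match is reported as None rather than raised here.
    return default if default.exists() else None
```

Passing `env` explicitly keeps the function easy to test; in normal use it reads `os.environ`.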

Notes

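  • Agent configuration is strict: if a required key is missing from the YAML, loading fails immediately. There is no fallback to .env or to code defaults.
  • To reference an environment variable, write ${ENV_VAR} explicitly in the YAML.
  • The standalone siliconflow section has been removed. Configure provider, api_key, api_url, and model under agent.llm / agent.tts / agent.asr instead.
  • agent.tts.provider now supports dashscope (Realtime protocol, not OpenAI-compatible). The default URL is wss://dashscope.aliyuncs.com/api-ws/v1/realtime and the default model is qwen3-tts-flash-realtime.
  • agent.tts.dashscope_mode (the legacy spelling agent.tts.mode is still accepted) supports commit | server_commit and takes effect only when provider=dashscope:
    • commit: the Engine splits the text into sentences first, then submits them to DashScope one sentence at a time.
    • server_commit: the Engine no longer splits by sentence; DashScope segments the full text itself.
  • agent.tools (a list) can now be set in the Agent YAML to declare the tools that are callable at runtime.
  • For an example tool configuration, see config/agents/tools.yaml.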
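
To illustrate the strict-mode and ${ENV_VAR} notes above, here is a minimal sketch that works on an already-parsed YAML mapping. The helper names `expand_env` and `require_keys` are hypothetical, and two behaviors are assumptions rather than documented facts: that an unset environment variable is treated as an error, and that all four keys (provider, api_key, api_url, model) are required in every section.

```python
import re
from typing import Any, Iterable, Mapping

# Matches ${NAME} references inside string values.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

# Assumption: these four keys are required; the real schema may differ.
REQUIRED_KEYS = ("provider", "api_key", "api_url", "model")


def expand_env(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} in strings, dicts, and lists."""
    if isinstance(value, str):
        def _sub(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                # Assumption: strict mode rejects unset variables.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _ENV_REF.sub(_sub, value)
    if isinstance(value, dict):
        return {k: expand_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env) for v in value]
    return value


def require_keys(section: Mapping[str, Any], name: str, keys: Iterable[str]) -> None:
    """Fail loudly instead of falling back to .env or code defaults."""
    missing = [k for k in keys if k not in section]
    if missing:
        raise ValueError(f"{name} is missing required keys: {', '.join(missing)}")
```

A caller would load the YAML, pass the result through `expand_env(config, os.environ)`, and then call `require_keys` on each of agent.llm, agent.tts, and agent.asr.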

Backend Integration

Engine runtime now supports adapter-based backend integration:

  • BACKEND_MODE=auto|http|disabled
  • BACKEND_URL + BACKEND_TIMEOUT_SEC
  • HISTORY_ENABLED=true|false

Behavior:

  • auto: use HTTP backend only when BACKEND_URL is set, otherwise engine-only mode.
  • http: force HTTP backend; falls back to engine-only mode when URL is missing.
  • disabled: force engine-only mode (no backend calls).
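
The three modes reduce to a small decision function. The sketch below is illustrative only: `resolve_backend` and the return labels "http" and "engine_only" are hypothetical names, and logging a warning when http mode has no URL is an assumption.

```python
import logging
from typing import Optional

logger = logging.getLogger("backend")


def resolve_backend(mode: str, url: Optional[str]) -> str:
    """Return "http" or "engine_only" for a BACKEND_MODE / BACKEND_URL pair."""
    mode = mode.strip().lower()
    has_url = bool(url and url.strip())
    if mode == "disabled":
        # Never call the backend, even if a URL is configured.
        return "engine_only"
    if mode == "auto":
        return "http" if has_url else "engine_only"
    if mode == "http":
        if has_url:
            return "http"
        # Assumption: the fallback is logged so the misconfiguration is visible.
        logger.warning("BACKEND_MODE=http but BACKEND_URL is empty; using engine-only mode")
        return "engine_only"
    raise ValueError(f"unknown BACKEND_MODE: {mode!r}")
```

Note that auto and http produce the same outcome in this sketch; they differ only in whether a missing URL is treated as expected or as worth a warning.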

The history write path is now asynchronous and buffered per session:

  • HISTORY_QUEUE_MAX_SIZE
  • HISTORY_RETRY_MAX_ATTEMPTS
  • HISTORY_RETRY_BACKOFF_SEC
  • HISTORY_FINALIZE_DRAIN_TIMEOUT_SEC

This keeps turn processing responsive even when the backend history APIs are slow or failing.
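
One way to picture how the four settings interact is a per-session writer built on `asyncio.Queue`. This is a sketch under stated assumptions, not the engine's implementation: the `HistoryWriter` class, its default values, the linear backoff, and the choice to drop records when the queue is full or retries are exhausted are all hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class HistoryWriter:
    """Buffers history records and writes them in a background task."""

    def __init__(
        self,
        send: Callable[[Any], Awaitable[None]],
        queue_max_size: int = 256,        # HISTORY_QUEUE_MAX_SIZE (default assumed)
        retry_max_attempts: int = 3,      # HISTORY_RETRY_MAX_ATTEMPTS (default assumed)
        retry_backoff_sec: float = 0.2,   # HISTORY_RETRY_BACKOFF_SEC (default assumed)
    ) -> None:
        self._send = send
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=queue_max_size)
        self._attempts = retry_max_attempts
        self._backoff = retry_backoff_sec
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        self._task = asyncio.create_task(self._run())

    def enqueue(self, record: Any) -> bool:
        """Non-blocking; returns False when the buffer is full."""
        try:
            self._queue.put_nowait(record)
            return True
        except asyncio.QueueFull:
            return False  # Assumption: overflow drops the record.

    async def _run(self) -> None:
        while True:
            record = await self._queue.get()
            try:
                await self._send_with_retry(record)
            finally:
                self._queue.task_done()

    async def _send_with_retry(self, record: Any) -> None:
        for attempt in range(1, self._attempts + 1):
            try:
                await self._send(record)
                return
            except Exception:
                if attempt == self._attempts:
                    return  # Assumption: give up after the last attempt.
                await asyncio.sleep(self._backoff * attempt)

    async def finalize(self, drain_timeout_sec: float) -> None:
        """Wait up to HISTORY_FINALIZE_DRAIN_TIMEOUT_SEC, then stop the worker."""
        try:
            await asyncio.wait_for(self._queue.join(), timeout=drain_timeout_sec)
        except asyncio.TimeoutError:
            pass  # Records still queued are abandoned.
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
```

Because `enqueue` never awaits, the turn-processing path is not blocked by a slow backend; only `finalize` at session end waits, and only up to the drain timeout.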

Detailed notes: docs/backend_integration.md.

Testing

python examples/test_websocket.py
python mic_client.py

WS Protocol

/ws uses a strict v1 JSON control protocol with binary PCM audio frames.

See docs/ws_v1_schema.md.