Refactor backend integration and service architecture
- Removed the backend client compatibility wrapper and associated methods to streamline backend integration. - Updated session management to utilize control plane gateways and runtime configuration providers. - Adjusted TTS service implementations to remove the EdgeTTS service and simplify service dependencies. - Enhanced documentation to reflect changes in backend integration and service architecture. - Updated configuration files to remove deprecated TTS provider options and clarify available settings.
This commit is contained in:
@@ -27,9 +27,8 @@ Assistant config source behavior:
|
||||
|
||||
## Architecture
|
||||
|
||||
- Ports: `core/ports/backend.py`
|
||||
- Ports: `core/ports/control_plane.py`
|
||||
- Adapters: `app/backend_adapters.py`
|
||||
- Compatibility wrappers: `app/backend_client.py`
|
||||
|
||||
`Session` and `DuplexPipeline` receive backend capabilities via injected adapter
|
||||
methods instead of hard-coding backend client imports.
|
||||
|
||||
47
engine/docs/extension_ports.md
Normal file
47
engine/docs/extension_ports.md
Normal file
@@ -0,0 +1,47 @@
|
||||
# Engine Extension Ports (Draft)
|
||||
|
||||
This document defines the draft port set used to keep core runtime extensible.
|
||||
|
||||
## Port Modules
|
||||
|
||||
- `core/ports/control_plane.py`
|
||||
- `AssistantRuntimeConfigProvider`
|
||||
- `ConversationHistoryStore`
|
||||
- `KnowledgeRetriever`
|
||||
- `ToolCatalog`
|
||||
- `ControlPlaneGateway`
|
||||
- `core/ports/llm.py`
|
||||
- `LLMServiceSpec`
|
||||
- `LLMPort`
|
||||
- optional extensions: `LLMCancellable`, `LLMRuntimeConfigurable`
|
||||
- `core/ports/tts.py`
|
||||
- `TTSServiceSpec`
|
||||
- `TTSPort`
|
||||
- `core/ports/asr.py`
|
||||
- `ASRServiceSpec`
|
||||
- `ASRPort`
|
||||
- optional extensions: `ASRInterimControl`, `ASRBufferControl`
|
||||
- `core/ports/service_factory.py`
|
||||
- `RealtimeServiceFactory`
|
||||
|
||||
## Adapter Layer
|
||||
|
||||
- `app/service_factory.py` provides `DefaultRealtimeServiceFactory`.
|
||||
- It maps resolved provider specs to concrete adapters.
|
||||
- Core orchestration (`core/duplex_pipeline.py`) depends on the factory port/specs, not concrete provider classes.
|
||||
|
||||
## Provider Behavior (Current)
|
||||
|
||||
- LLM:
|
||||
- supported providers: `openai`, `openai_compatible`, `openai-compatible`, `siliconflow`
|
||||
- fallback: `MockLLMService`
|
||||
- TTS:
|
||||
- supported providers: `dashscope`, `openai_compatible`, `openai-compatible`, `siliconflow`
|
||||
- fallback: `MockTTSService`
|
||||
- ASR:
|
||||
- supported providers: `openai_compatible`, `openai-compatible`, `siliconflow`
|
||||
- fallback: `BufferedASRService`
|
||||
|
||||
## Notes
|
||||
|
||||
- This is a draft contract set; follow-up work can add explicit capability negotiation and contract-version fields.
|
||||
129
engine/docs/high_level_architecture.md
Normal file
129
engine/docs/high_level_architecture.md
Normal file
@@ -0,0 +1,129 @@
|
||||
# Engine High-Level Architecture
|
||||
|
||||
This document describes the runtime architecture of `engine` for realtime voice/text assistant interactions.
|
||||
|
||||
## Goals
|
||||
|
||||
- Low-latency duplex interaction (user speaks while assistant can respond)
|
||||
- Clear separation between transport, orchestration, and model/service integrations
|
||||
- Backend-optional runtime (works with or without external backend)
|
||||
- Protocol-first interoperability through strict WS v1 control messages
|
||||
|
||||
## Top-Level Components
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
C[Client\nWeb / Mobile / Device] <-- WS v1 + PCM --> A[FastAPI App\napp/main.py]
|
||||
A --> S[Session\ncore/session.py]
|
||||
S --> D[Duplex Pipeline\ncore/duplex_pipeline.py]
|
||||
|
||||
D --> P[Processors\nVAD / EOU / Tracks]
|
||||
D --> R[Workflow Runner\ncore/workflow_runner.py]
|
||||
D --> E[Event Bus + Models\ncore/events.py + models/*]
|
||||
|
||||
R --> SV[Service Layer\nservices/asr.py\nservices/llm.py\nservices/tts.py]
|
||||
R --> TE[Tool Executor\ncore/tool_executor.py]
|
||||
|
||||
S --> HB[History Bridge\ncore/history_bridge.py]
|
||||
S --> BA[Control Plane Port\ncore/ports/control_plane.py]
|
||||
BA --> AD[Adapters\napp/backend_adapters.py]
|
||||
|
||||
AD --> B[(External Backend API\noptional)]
|
||||
SV --> M[(ASR/LLM/TTS Providers)]
|
||||
```
|
||||
|
||||
## Request Lifecycle (Simplified)
|
||||
|
||||
1. Client connects to `/ws?assistant_id=<id>` and sends `session.start`.
|
||||
2. App creates a `Session` with resolved assistant config (backend or local YAML).
|
||||
3. Binary PCM frames enter the duplex pipeline.
|
||||
4. `VAD`/`EOU` processors detect speech segments and trigger ASR finalization.
|
||||
5. ASR text is routed into workflow + LLM generation.
|
||||
6. Optional tool calls are executed (server-side or client-side result return).
|
||||
7. LLM output streams as text deltas; TTS produces audio chunks for playback.
|
||||
8. Session emits structured events (`transcript.*`, `assistant.*`, `output.audio.*`, `error`).
|
||||
9. History bridge persists conversation data asynchronously.
|
||||
10. On `session.stop` (or disconnect), session finalizes and drains pending writes.
|
||||
|
||||
## Layering and Responsibilities
|
||||
|
||||
### 1) Transport / API Layer
|
||||
|
||||
- Entry point: `app/main.py`
|
||||
- Responsibilities:
|
||||
- WebSocket lifecycle management
|
||||
- WS v1 message validation and order guarantees
|
||||
- Session creation and teardown
|
||||
- Converting raw WS frames into internal events
|
||||
|
||||
### 2) Session + Orchestration Layer
|
||||
|
||||
- Core: `core/session.py`, `core/duplex_pipeline.py`, `core/conversation.py`
|
||||
- Responsibilities:
|
||||
- Per-session state machine
|
||||
- Turn boundaries and interruption/cancel handling
|
||||
- Event sequencing (`seq`) and envelope consistency
|
||||
- Bridging input/output tracks (`audio_in`, `audio_out`, `control`)
|
||||
|
||||
### 3) Processing Layer
|
||||
|
||||
- Modules: `processors/vad.py`, `processors/eou.py`, `processors/tracks.py`
|
||||
- Responsibilities:
|
||||
- Speech activity detection
|
||||
- End-of-utterance decisioning
|
||||
- Track-oriented routing and timing-sensitive pre/post processing
|
||||
|
||||
### 4) Workflow + Tooling Layer
|
||||
|
||||
- Modules: `core/workflow_runner.py`, `core/tool_executor.py`
|
||||
- Responsibilities:
|
||||
- Assistant workflow execution
|
||||
- Tool call planning/execution and timeout handling
|
||||
- Tool result normalization into protocol events
|
||||
|
||||
### 5) Service Integration Layer
|
||||
|
||||
- Modules: `services/*`
|
||||
- Responsibilities:
|
||||
- Abstracting ASR/LLM/TTS provider differences
|
||||
- Streaming token/audio adaptation
|
||||
- Provider-specific adapters (OpenAI-compatible, DashScope, SiliconFlow, etc.)
|
||||
|
||||
### 6) Backend Integration Layer (Optional)
|
||||
|
||||
- Port: `core/ports/control_plane.py`
|
||||
- Adapters: `app/backend_adapters.py`
|
||||
- Responsibilities:
|
||||
- Fetching assistant runtime config
|
||||
- Persisting call/session metadata and history
|
||||
- Supporting `BACKEND_MODE=auto|http|disabled`
|
||||
|
||||
### 7) Persistence / Reliability Layer
|
||||
|
||||
- Module: `core/history_bridge.py`
|
||||
- Responsibilities:
|
||||
- Non-blocking queue-based history writes
|
||||
- Retry with backoff on backend failures
|
||||
- Best-effort drain on session finalize
|
||||
|
||||
## Key Design Principles
|
||||
|
||||
- Dependency inversion for backend: session/pipeline depend on port interfaces, not concrete clients.
|
||||
- Streaming-first: text/audio are emitted incrementally to minimize perceived latency.
|
||||
- Fail-soft behavior: backend/history failures should not block realtime interaction paths.
|
||||
- Protocol strictness: WS v1 rejects malformed/out-of-order control traffic early.
|
||||
- Explicit event model: all client-observable state changes are represented as typed events.
|
||||
|
||||
## Configuration Boundaries
|
||||
|
||||
- Runtime environment settings live in `app/config.py`.
|
||||
- Assistant-specific behavior is loaded by `assistant_id`:
|
||||
- backend mode: from backend API
|
||||
- engine-only mode: local `engine/config/agents/<assistant_id>.yaml`
|
||||
- Client-provided `metadata.overrides` and `dynamicVariables` can alter runtime behavior within protocol constraints.
|
||||
|
||||
## Related Docs
|
||||
|
||||
- WS protocol: `engine/docs/ws_v1_schema.md`
|
||||
- Backend integration details: `engine/docs/backend_integration.md`
|
||||
- Duplex interaction diagram: `engine/docs/duplex_interaction.svg`
|
||||
Reference in New Issue
Block a user