Add Mermaid diagram support and update architecture documentation

- Included a new JavaScript file for Mermaid configuration to ensure consistent diagram sizing across documentation.
- Enhanced architecture documentation to reflect the updated pipeline engine structure, including VAD, ASR, TD, LLM, and TTS components.
- Updated various sections to clarify the integration of external services and tools within the architecture.
- Improved styling for Mermaid diagrams to enhance visual consistency and usability.
Xin Wang
2026-03-05 11:01:56 +08:00
parent 4748f3b5f1
commit ac9b0047ee
7 changed files with 275 additions and 80 deletions


@@ -38,7 +38,7 @@ Realtime Agent Studio (RAS) is centered on large language models, building real-time
---
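Pipeline-based full-duplex architecture with ASR/LLM/TTS pipeline processing, intelligent barge-in support, and end-to-end latency < 500ms
Pipeline-based full-duplex architecture with VAD/ASR/TD/LLM/TTS pipeline processing, intelligent barge-in support, and end-to-end latency < 500ms
- :brain: **Multimodal model support**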
@@ -76,38 +76,152 @@ Realtime Agent Studio (RAS) is centered on large language models, building real-time
## System Architecture
Platform architecture layers:
```mermaid
flowchart TB
%% ================= ACCESS =================
subgraph Access["Access Layer"]
direction TB
API[API]
SDK[SDK]
Browser[Browser UI]
Embed[Web Embed]
end
%% ================= REALTIME ENGINE =================
subgraph Runtime["Realtime Interaction Engine"]
direction LR
%% -------- Duplex Engine --------
subgraph Duplex["Duplex Interaction Engine"]
direction LR
subgraph Pipeline["Pipeline Engine"]
direction LR
VAD[VAD]
ASR[ASR]
TD[Turn Detection]
LLM[LLM]
TTS[TTS]
end
subgraph Multi["Realtime Engine"]
MM[Realtime Model]
end
end
%% -------- Capabilities --------
subgraph Capability["Agent Capabilities"]
subgraph Tools["Tool System"]
Webhook[Webhook]
ClientTool[Client Tools]
Builtin[Builtin Tools]
end
subgraph KB["Knowledge System"]
Docs[Documents]
Vector[(Vector Index)]
Retrieval[Retrieval]
end
end
end
%% ================= PLATFORM =================
subgraph Platform["Platform Services"]
direction TB
Backend[Backend Service]
Frontend[Frontend Console]
DB[(Database)]
end
%% ================= CONNECTIONS =================
Access --> Runtime
Runtime <--> Backend
Backend <--> DB
Backend <--> Frontend
LLM --> Tools
MM --> Tools
LLM <--> KB
MM <--> KB
```
Conversation flow of the pipeline-based interaction engine:
```mermaid
flowchart LR
subgraph Client["Client"]
Web[Web Browser]
App[Mobile App]
SDK[SDK]
end
subgraph RAS["Realtime Agent Studio"]
Engine[Realtime Interaction Engine]
API[API Service]
DB[(Database)]
end
User((User Speech))
Audio[Audio Stream]
subgraph Pipeline["Pipeline Engine"]
ASR[Speech Recognition]
LLM[Large Language Model]
TTS[Speech Synthesis]
end
VAD[VAD\nVoice Activity Detection]
ASR[ASR\nSpeech Recognition]
subgraph External["External Services"]
OpenAI[OpenAI]
Azure[Azure]
Local[Local Model]
end
TD[Turn Detection]
Client -->|WebSocket| Engine
Client -->|REST| API
Engine --> Pipeline
Engine <--> API
API <--> DB
Pipeline --> External
LLM[LLM\nReasoning]
Tools[Tools / APIs]
TTS[TTS\nSpeech Synthesis]
AudioOut[Audio Stream Out]
User --> Audio
Audio --> VAD
VAD --> ASR
ASR --> TD
TD --> LLM
LLM --> Tools
Tools --> LLM
LLM --> TTS
TTS --> AudioOut
AudioOut --> User
```
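The stage order in the diagram above (VAD, ASR, Turn Detection, LLM with tool calls, TTS) can be sketched as a chain of coroutines. This is a minimal illustration only: `Turn`, `TOOLS`, `run_turn`, and every stage body are hypothetical placeholders, not code from the RAS engine, and audio frames are treated as UTF-8 text so the example runs without any speech models.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Turn:
    """State passed from stage to stage (hypothetical structure)."""
    frames: List[bytes]                  # raw frames from the client
    transcript: str = ""                 # filled by ASR
    complete: bool = False               # set by turn detection
    reply: str = ""                      # filled by the LLM stage
    audio_out: bytes = b""               # filled by TTS
    trace: List[str] = field(default_factory=list)


# Hypothetical tool registry standing in for "Tools / APIs".
TOOLS: Dict[str, Callable[[str], str]] = {"echo": lambda text: text.upper()}


async def vad(turn: Turn) -> Turn:
    # Placeholder VAD: drop empty frames, which stand in for silence.
    turn.frames = [f for f in turn.frames if f]
    turn.trace.append("VAD")
    return turn


async def asr(turn: Turn) -> Turn:
    # Placeholder ASR: frames are assumed to already hold UTF-8 text.
    turn.transcript = b" ".join(turn.frames).decode("utf-8")
    turn.trace.append("ASR")
    return turn


async def turn_detection(turn: Turn) -> Turn:
    # Placeholder TD: the turn ends on sentence-final punctuation.
    turn.complete = turn.transcript.endswith((".", "?", "!"))
    turn.trace.append("TD")
    return turn


async def llm(turn: Turn) -> Turn:
    # Placeholder LLM: only answers a completed turn, via one tool call.
    if turn.complete:
        turn.reply = TOOLS["echo"](turn.transcript)
        turn.trace.append("LLM")
    return turn


async def tts(turn: Turn) -> Turn:
    # Placeholder TTS: encode the reply text as the outgoing "audio".
    if turn.reply:
        turn.audio_out = turn.reply.encode("utf-8")
        turn.trace.append("TTS")
    return turn


STAGES = [vad, asr, turn_detection, llm, tts]


async def run_turn(frames: List[bytes]) -> Turn:
    """Run one user turn through every stage in diagram order."""
    turn = Turn(frames=list(frames))
    for stage in STAGES:
        turn = await stage(turn)
    return turn
```

Calling `asyncio.run(run_turn([b"hello", b"", b"world."]))` passes through all five stages, while an utterance without final punctuation stops after Turn Detection. A real engine would stream frames continuously and handle interruption rather than process one batch per turn.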
Conversation flow based on the realtime interaction model:
```mermaid
flowchart LR
User((User))
Input[Audio / Video / Text]
MM[Multimodal Model]
Tools[Tools / APIs]
KB[Knowledge Base]
Output[Audio / Video / Text]
User --> Input
Input --> MM
MM --> Tools
Tools --> MM
MM --> KB
KB --> MM
MM --> Output
Output --> User
```
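The round trips in the diagram above (model to knowledge base and back, model to tools and back, then output) can be sketched as a single function. Everything here is an assumption made for illustration: `KNOWLEDGE_BASE`, `TOOLS`, `retrieve`, and `respond` are hypothetical names, the "model" is a plain function, and keyword matching stands in for vector retrieval.

```python
from typing import Callable, Dict, List

# Hypothetical knowledge base: keyword -> stored passage.
KNOWLEDGE_BASE: Dict[str, str] = {
    "ras": "RAS stands for Realtime Agent Studio.",
}

# Hypothetical tool registry standing in for "Tools / APIs".
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}


def retrieve(query: str) -> List[str]:
    """Return passages whose keyword appears in the query (MM --> KB --> MM)."""
    words = query.lower().split()
    return [text for key, text in KNOWLEDGE_BASE.items() if key in words]


def respond(user_input: str) -> str:
    """Stand-in for the multimodal model: consult KB, call a tool, answer."""
    context = retrieve(user_input)                    # knowledge lookup
    tool_result = TOOLS["word_count"](user_input)     # MM --> Tools --> MM
    parts = context + [f"Your message had {tool_result} words."]
    return " ".join(parts)                            # MM --> Output
```

For example, `respond("what is ras")` combines the retrieved passage with the tool result. In a real deployment the model itself would decide whether to consult the knowledge base or call a tool; this sketch always does both.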
---
@@ -119,9 +233,9 @@ flowchart LR
| **Frontend** | React 18, TypeScript, Tailwind CSS, Zustand |
| **Backend** | FastAPI (Python 3.10+) |
| **Engine** | Python, WebSocket, asyncio |
| **Database** | SQLite / PostgreSQL |
| **Database** | SQLite |
| **Knowledge Base** | chroma |
| **Deployment** | Docker, Nginx |
| **Deployment** | Docker |
---
@@ -204,16 +318,6 @@ ws.onopen = () => {
---
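## Contributing
We welcome community contributions! See the [Contributing Guide](https://github.com/your-org/AI-VideoAssistant/blob/main/CONTRIBUTING.md) to learn how to get involved.
- :star: Star the project to support us
- :bug: Open an Issue to report problems
- :hammer: Submit a PR to contribute code
---
## License
This project is open-sourced under the [MIT License](https://github.com/your-org/AI-VideoAssistant/blob/main/LICENSE).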