Add Mermaid diagram support and update architecture documentation

- Included a new JavaScript file for Mermaid configuration to ensure consistent diagram sizing across documentation.
- Enhanced architecture documentation to reflect the updated pipeline engine structure, including VAD, ASR, TD, LLM, and TTS components.
- Updated various sections to clarify the integration of external services and tools within the architecture.
- Improved styling for Mermaid diagrams to enhance visual consistency and usability.
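@@ -38,7 +38,7 @@ Realtime Agent Studio (RAS) is built around large language models, for building real-time
 
 ---
 
-Pipelined full-duplex architecture with ASR/LLM/TTS pipeline processing, intelligent interruption support, and end-to-end latency < 500 ms
+Pipelined full-duplex architecture with VAD/ASR/TD/LLM/TTS pipeline processing, intelligent interruption support, and end-to-end latency < 500 ms
 
 - :brain: **Multimodal model support**
 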
@@ -76,38 +76,152 @@ Realtime Agent Studio (RAS) is built around large language models, for building real-time
 
 ## System Architecture
 
+Platform architecture layers:
+
 ```mermaid
+flowchart TB
+
+    %% ================= ACCESS =================
+    subgraph Access["Access Layer"]
+        direction TB
+        API[API]
+        SDK[SDK]
+        Browser[Browser UI]
+        Embed[Web Embed]
+    end
+
+
+    %% ================= REALTIME ENGINE =================
+    subgraph Runtime["Realtime Interaction Engine"]
+
+        direction LR
+
+        %% -------- Duplex Engine --------
+        subgraph Duplex["Duplex Interaction Engine"]
+            direction LR
+
+            subgraph Pipeline["Pipeline Engine"]
+                direction LR
+                VAD[VAD]
+                ASR[ASR]
+                TD[Turn Detection]
+                LLM[LLM]
+                TTS[TTS]
+            end
+
+            subgraph Multi["Realtime Engine"]
+                MM[Realtime Model]
+            end
+
+        end
+
+
+        %% -------- Capabilities --------
+        subgraph Capability["Agent Capabilities"]
+
+            subgraph Tools["Tool System"]
+                Webhook[Webhook]
+                ClientTool[Client Tools]
+                Builtin[Builtin Tools]
+            end
+
+            subgraph KB["Knowledge System"]
+                Docs[Documents]
+                Vector[(Vector Index)]
+                Retrieval[Retrieval]
+            end
+
+        end
+
+    end
+
+
+    %% ================= PLATFORM =================
+    subgraph Platform["Platform Services"]
+        direction TB
+        Backend[Backend Service]
+        Frontend[Frontend Console]
+        DB[(Database)]
+    end
+
+
+    %% ================= CONNECTIONS =================
+
+    Access --> Runtime
+
+    Runtime <--> Backend
+    Backend <--> DB
+    Backend <--> Frontend
+
+    LLM --> Tools
+    MM --> Tools
+
+    LLM <--> KB
+    MM <--> KB
+```
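The Knowledge System in the diagram above chains Documents, a Vector Index, and Retrieval. The following is a minimal sketch of that chain, not the project's implementation: `embed`, `cosine`, and `VectorIndex` are hypothetical names, and a bag-of-words count stands in for a real embedding model and vector store.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term count (assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """Hypothetical in-memory index standing in for the vector store."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, doc: str) -> None:
        """Documents -> Vector Index: store the text with its vector."""
        self._items.append((doc, embed(doc)))

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        """Retrieval: return the k documents most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


index = VectorIndex()
index.add("refund policy: refunds are issued within 7 days")
index.add("shipping takes 3 to 5 business days")
top = index.retrieve("how long does shipping take", k=1)
```

In this toy setup the query shares only the word "shipping" with the second document, so that document ranks first. A production setup would swap `embed` for a real embedding model and `VectorIndex` for the vector database in use.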
+
+Conversation flow of the pipeline-based interaction engine:
+
+```mermaid
 flowchart LR
-    subgraph Client["Client"]
-        Web[Web browser]
-        App[Mobile app]
-        SDK[SDK]
-    end
 
-    subgraph RAS["Realtime Agent Studio"]
-        Engine[Realtime interaction engine]
-        API[API service]
-        DB[(Database)]
-    end
+    User((User Speech))
+    Audio[Audio Stream]
 
-    subgraph Pipeline["Pipeline engine"]
-        ASR[Speech recognition]
-        LLM[Large language model]
-        TTS[Speech synthesis]
-    end
+    VAD[VAD\nVoice Activity Detection]
+    ASR[ASR\nSpeech Recognition]
 
-    subgraph External["External services"]
-        OpenAI[OpenAI]
-        Azure[Azure]
-        Local[Local model]
-    end
+    TD[Turn Detection]
 
-    Client -->|WebSocket| Engine
-    Client -->|REST| API
-    Engine --> Pipeline
-    Engine <--> API
-    API <--> DB
-    Pipeline --> External
+    LLM[LLM\nReasoning]
+
+    Tools[Tools / APIs]
+
+    TTS[TTS\nSpeech Synthesis]
+
+    AudioOut[Audio Stream Out]
+
+    User --> Audio
+    Audio --> VAD
+    VAD --> ASR
+    ASR --> TD
+    TD --> LLM
+
+    LLM --> Tools
+    Tools --> LLM
+
+    LLM --> TTS
+    TTS --> AudioOut
+    AudioOut --> User
+```
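The VAD, ASR, Turn Detection, LLM, TTS chain in this flowchart can be pictured as a sequence of streaming stages. Below is a minimal sketch using chained `asyncio` generators. Every stage is a stub, and the function names, the empty-string silence marker, and the echo reply are assumptions made for illustration rather than the engine's actual code.

```python
import asyncio
from typing import AsyncIterator, Optional

SILENCE = ""  # assumption: an empty frame stands in for silent audio


async def audio_in(frames: list[str]) -> AsyncIterator[str]:
    """Audio Stream: yield one frame at a time."""
    for frame in frames:
        yield frame
        await asyncio.sleep(0)  # let other tasks run between frames


async def vad(frames: AsyncIterator[str]) -> AsyncIterator[Optional[str]]:
    """VAD stub: pass speech frames through, map silence to None."""
    async for frame in frames:
        yield frame if frame != SILENCE else None


async def asr(frames: AsyncIterator[Optional[str]]) -> AsyncIterator[Optional[str]]:
    """ASR stub: 'transcribe' a speech frame by lower-casing it."""
    async for frame in frames:
        yield frame.lower() if frame is not None else None


async def turn_detection(words: AsyncIterator[Optional[str]]) -> AsyncIterator[str]:
    """TD stub: emit an utterance when silence follows speech, and at end of stream."""
    buffer: list[str] = []
    async for word in words:
        if word is not None:
            buffer.append(word)
        elif buffer:
            yield " ".join(buffer)
            buffer = []
    if buffer:
        yield " ".join(buffer)


async def llm(utterances: AsyncIterator[str]) -> AsyncIterator[str]:
    """LLM stub: echo the user turn; a real model call would go here."""
    async for text in utterances:
        yield f"You said: {text}"


async def tts(replies: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """TTS stub: encode text as bytes in place of synthesized audio."""
    async for reply in replies:
        yield reply.encode("utf-8")


async def run_pipeline(frames: list[str]) -> list[bytes]:
    """Compose the stages in the order shown in the flowchart."""
    stages = tts(llm(turn_detection(asr(vad(audio_in(frames))))))
    return [chunk async for chunk in stages]


audio_out = asyncio.run(run_pipeline(["Hello", "there", SILENCE, "Bye"]))
```

Because each stage consumes its input lazily, the first reply is produced as soon as the first turn ends, before the rest of the audio has been read. The sketch leaves out interruption handling and the Tools round trip.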
+
+Conversation flow based on a realtime interaction model:
+
+```mermaid
+flowchart LR
+
+    User((User))
+
+    Input[Audio / Video / Text]
+
+    MM[Multimodal Model]
+
+    Tools[Tools / APIs]
+    KB[Knowledge Base]
+
+    Output[Audio / Video / Text]
+
+    User --> Input
+    Input --> MM
+
+    MM --> Tools
+    Tools --> MM
+
+    MM --> KB
+    KB --> MM
+
+    MM --> Output
+    Output --> User
 ```
 
 ---
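In the realtime-model flow above, the model may consult Tools and the Knowledge Base before it produces output. The loop below is a rough sketch of that round trip under stated assumptions: `fake_model`, `TOOLS`, `KNOWLEDGE`, and the action dictionary format are all hypothetical stand-ins, not an API from the project.

```python
from typing import Callable

# Hypothetical tool registry and knowledge base, for illustration only.
TOOLS: dict[str, Callable[[str], str]] = {
    "get_time": lambda _arg: "12:00",
}
KNOWLEDGE: dict[str, str] = {
    "opening hours": "Open 9:00 to 18:00 on weekdays.",
}


def fake_model(user_input: str, context: list[str]) -> dict:
    """Stand-in for the multimodal model: decide the next action."""
    if "time" in user_input and not any(c.startswith("tool:") for c in context):
        return {"action": "tool", "name": "get_time", "arg": ""}
    if "hours" in user_input and not any(c.startswith("kb:") for c in context):
        return {"action": "kb", "query": "opening hours"}
    return {"action": "answer", "text": " | ".join(context) or "Hello!"}


def respond(user_input: str, max_steps: int = 4) -> str:
    """Loop until the model answers, feeding tool and KB results back in."""
    context: list[str] = []
    for _ in range(max_steps):
        step = fake_model(user_input, context)
        if step["action"] == "tool":
            context.append("tool:" + TOOLS[step["name"]](step["arg"]))
        elif step["action"] == "kb":
            context.append("kb:" + KNOWLEDGE.get(step["query"], ""))
        else:
            return step["text"]
    return "Sorry, I could not complete the request."
```

The `max_steps` cap is a design choice of this sketch: it keeps a misbehaving model from looping on tool calls indefinitely.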
@@ -119,9 +233,9 @@ flowchart LR
 | **Frontend** | React 18, TypeScript, Tailwind CSS, Zustand |
 | **Backend** | FastAPI (Python 3.10+) |
 | **Engine** | Python, WebSocket, asyncio |
-| **Database** | SQLite / PostgreSQL |
+| **Database** | SQLite |
 | **Knowledge base** | chroma |
-| **Deployment** | Docker, Nginx |
+| **Deployment** | Docker |
 
 ---
 
@@ -204,16 +318,6 @@ ws.onopen = () => {
 
 ---
 
-## Contributing
-
-Community contributions are welcome! See the [contributing guide](https://github.com/your-org/AI-VideoAssistant/blob/main/CONTRIBUTING.md) to learn how to get involved.
-
-- :star: Star the project to support us
-- :bug: Open an issue to report a problem
-- :hammer: Submit a PR to contribute code
-
----
-
 ## License
 
 This project is open source under the [MIT License](https://github.com/your-org/AI-VideoAssistant/blob/main/LICENSE).