Enhance API documentation by adding new endpoints for ASR preview, assistant configuration retrieval, and knowledge base management. Update existing assistant and tool definitions for improved clarity and functionality. Remove outdated sections from history records documentation, ensuring a streamlined reference for users.

Xin Wang
2026-03-02 01:56:38 +08:00
parent 1561056a3d
commit 531688aa6b
9 changed files with 829 additions and 135 deletions

api/docs/knowledge.md Normal file

@@ -0,0 +1,420 @@
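# Knowledge Base API
The Knowledge Base API manages knowledge bases and their documents, including creation, indexing, and search.
## Basic Information
| Item | Value |
|------|-----|
| Base URL | `/api/v1/knowledge` |
| Authentication | Bearer Token (reserved for future use) |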
---
## Data Models
### KnowledgeBase
```typescript
interface KnowledgeBase {
  id: string;                     // Unique knowledge base ID (8-character UUID)
  user_id: number;                // ID of the owning user
  name: string;                   // Knowledge base name
  description: string;            // Knowledge base description
  embeddingModel: string;         // Embedding model name
  chunkSize: number;              // Document chunk size
  chunkOverlap: number;           // Overlap between adjacent chunks
  docCount: number;               // Number of documents
  chunkCount: number;             // Number of text chunks after splitting
  status: string;                 // Status: "active" | "inactive"
  createdAt: string;              // Creation time
  updatedAt: string;              // Last update time
  documents: KnowledgeDocument[]; // Associated documents
}
```
### KnowledgeDocument
```typescript
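interface KnowledgeDocument {
  id: string;                  // Unique document ID
  kb_id: string;               // ID of the owning knowledge base
  name: string;                // Document name
  size: string;                // File size, e.g. "1.2 MB"
  fileType: string;            // File type
  storageUrl: string;          // Storage location
  status: string;              // Status: "pending" | "processing" | "completed" | "failed"
  chunkCount: number;          // Number of text chunks after splitting
  errorMessage: string | null; // Error message; null in the sample response for a completed document
  uploadDate: string;          // Upload time
  createdAt: string;           // Creation time
  processedAt: string;         // Time processing finished
}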
```
---
## API Endpoints
### 1. List Knowledge Bases
```http
GET /api/v1/knowledge/bases
```
**Query Parameters:**
| Parameter | Type | Required | Default | Description |
|------|------|------|--------|------|
| user_id | int | No | 1 | User ID |
| page | int | No | 1 | Page number |
| limit | int | No | 50 | Items per page |
**Response:**
```json
{
  "total": 2,
  "page": 1,
  "limit": 50,
  "list": [
    {
      "id": "kb_001",
      "user_id": 1,
      "name": "Product Knowledge Base",
      "description": "Product documentation and FAQ",
      "embeddingModel": "text-embedding-3-small",
      "chunkSize": 500,
      "chunkOverlap": 50,
      "docCount": 10,
      "chunkCount": 150,
      "status": "active",
      "createdAt": "2024-01-15T10:30:00",
      "updatedAt": "2024-01-15T10:30:00",
      "documents": [...]
    }
  ]
}
```
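As a rough illustration of how a client might page through this endpoint, the sketch below builds the request URL and works out how many pages a given `total` implies. The host `http://localhost:8000` and the helper names `list_bases_url` and `page_count` are assumptions made for the example; only the path and the `user_id`, `page`, and `limit` parameters come from the table above.

```python
import math
from urllib.parse import urlencode

# Assumed host for illustration; only the /api/v1/knowledge path is documented.
BASE_URL = "http://localhost:8000/api/v1/knowledge"


def list_bases_url(user_id: int = 1, page: int = 1, limit: int = 50) -> str:
    """Build the GET URL for one page of the knowledge base list."""
    query = urlencode({"user_id": user_id, "page": page, "limit": limit})
    return f"{BASE_URL}/bases?{query}"


def page_count(total: int, limit: int) -> int:
    """Number of pages needed to read `total` items at `limit` items per page."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return math.ceil(total / limit)
```

A caller could fetch page 1, read `total` and `limit` from the response, and then request pages 2 through `page_count(total, limit)`.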
---
### 2. Get Knowledge Base Details
```http
GET /api/v1/knowledge/bases/{kb_id}
```
**Response:**
```json
{
  "id": "kb_001",
  "user_id": 1,
  "name": "Product Knowledge Base",
  "description": "Product documentation and FAQ",
  "embeddingModel": "text-embedding-3-small",
  "chunkSize": 500,
  "chunkOverlap": 50,
  "docCount": 10,
  "chunkCount": 150,
  "status": "active",
  "createdAt": "2024-01-15T10:30:00",
  "updatedAt": "2024-01-15T10:30:00",
  "documents": [
    {
      "id": "doc_001",
      "kb_id": "kb_001",
      "name": "product-manual.pdf",
      "size": "1.2 MB",
      "fileType": "application/pdf",
      "storageUrl": "",
      "status": "completed",
      "chunkCount": 45,
      "errorMessage": null,
      "uploadDate": "2024-01-15T10:30:00",
      "createdAt": "2024-01-15T10:30:00",
      "processedAt": "2024-01-15T10:30:05"
    }
  ]
}
```
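One possible use of the `documents` array in this response is to check processing progress. The following sketch, with the hypothetical helper name `summarize_documents`, counts documents per `status` value and collects the names and error messages of failed ones. It assumes the response has already been parsed into a Python dict.

```python
from collections import Counter
from typing import Any, Dict


def summarize_documents(kb: Dict[str, Any]) -> Dict[str, Any]:
    """Count documents by status and list (name, errorMessage) for failed ones."""
    docs = kb.get("documents", [])
    by_status = Counter(doc["status"] for doc in docs)
    failed = [
        (doc["name"], doc.get("errorMessage"))
        for doc in docs
        if doc["status"] == "failed"
    ]
    return {"by_status": dict(by_status), "failed": failed}
```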
---
### 3. Create a Knowledge Base
```http
POST /api/v1/knowledge/bases
```
**Request Body:**
```json
{
  "name": "Product Knowledge Base",
  "description": "Product documentation and FAQ",
  "embeddingModel": "text-embedding-3-small",
  "chunkSize": 500,
  "chunkOverlap": 50
}
```
**Fields:**
| Field | Type | Required | Description |
|------|------|------|------|
| name | string | Yes | Knowledge base name |
| description | string | No | Knowledge base description |
| embeddingModel | string | No | Embedding model name; defaults to "text-embedding-3-small" |
| chunkSize | int | No | Document chunk size; defaults to 500 |
| chunkOverlap | int | No | Overlap between adjacent chunks; defaults to 50 |
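To make the relationship between `chunkSize` and `chunkOverlap` concrete, here is a minimal sliding-window sketch. It assumes character-based splitting, which this document does not confirm; the service may split by tokens, sentences, or some other unit. The function name `split_into_chunks` is hypothetical.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

Under this assumption, a 1000-character text with the default settings yields three chunks: two full 500-character windows and one 100-character remainder.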
---
### 4. Update a Knowledge Base
```http
PUT /api/v1/knowledge/bases/{kb_id}
```
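**Request Body:** (partial update)
```json
{
  "name": "Updated knowledge base name",
  "description": "New description",
  "chunkSize": 800
}
```
**Note:** `embeddingModel` cannot be changed while the knowledge base contains indexed documents. To change it, delete all documents first.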
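The partial-update behavior and the `embeddingModel` restriction can be sketched as a small pure function. This is an illustration under stated assumptions, not the server's actual logic: it treats `chunkCount > 0` as the sign that indexed documents exist, and the name `apply_update` is hypothetical.

```python
from typing import Any, Dict


def apply_update(kb: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kb with the non-None fields of update applied."""
    # Fields left as None are treated as "not provided" and are skipped.
    changes = {key: value for key, value in update.items() if value is not None}
    new_model = changes.get("embeddingModel")
    has_indexed_docs = kb.get("chunkCount", 0) > 0  # assumption: chunks imply indexed documents
    if new_model is not None and new_model != kb.get("embeddingModel") and has_indexed_docs:
        raise ValueError("embeddingModel cannot change while indexed documents exist")
    return {**kb, **changes}
```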
---
### 5. Delete a Knowledge Base
```http
DELETE /api/v1/knowledge/bases/{kb_id}
```
**Response:**
```json
{
  "message": "Deleted successfully"
}
```
**Note:** Deleting a knowledge base also deletes its data from the vector database.
---
### 6. Upload a Document
```http
POST /api/v1/knowledge/bases/{kb_id}/documents
```
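Two upload methods are supported:
**Method 1: File upload (multipart/form-data)**
| Parameter | Type | Required | Description |
|------|------|------|------|
| file | file | Yes | The document file to upload |
Supported file types: `.txt`, `.md`, `.csv`, `.json`, `.pdf`, `.docx`
**Method 2: Create a document record only (application/json)**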
```json
{
  "name": "document.pdf",
  "size": "1.2 MB",
  "fileType": "application/pdf",
  "storageUrl": "https://storage.example.com/doc.pdf"
}
}
```
**Response (file upload):**
```json
{
  "id": "doc_001",
  "name": "product-manual.pdf",
  "size": "1.2 MB",
  "fileType": "application/pdf",
  "storageUrl": "",
  "status": "completed",
  "chunkCount": 45,
  "message": "Document uploaded and indexed"
}
```
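For the file upload method, a client needs to send the file under the `file` form field as `multipart/form-data`. The sketch below builds such a request with only the Python standard library, without sending it. The host `http://localhost:8000` and the helper names are assumptions for illustration; the path and the `file` field name come from this section.

```python
import uuid
from typing import Tuple
from urllib.request import Request

# Assumed host for illustration; only the /api/v1/knowledge path is documented.
BASE_URL = "http://localhost:8000/api/v1/knowledge"


def encode_multipart_file(filename: str, content: bytes, content_type: str) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body under the `file` field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(kb_id: str, filename: str, content: bytes,
                         content_type: str = "text/plain") -> Request:
    """Build (but do not send) the POST request that uploads a file to a knowledge base."""
    body, content_type_header = encode_multipart_file(filename, content, content_type)
    return Request(
        f"{BASE_URL}/bases/{kb_id}/documents",
        data=body,
        headers={"Content-Type": content_type_header},
        method="POST",
    )
```

The resulting object can be passed to `urllib.request.urlopen` to perform the upload. An `Authorization: Bearer ...` header could be added to `headers` in the same way once authentication is enabled.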
---
### 7. Index Document Content
```http
POST /api/v1/knowledge/bases/{kb_id}/documents/{doc_id}/index
```
Indexes text content directly into the vector database, with no file upload required.
**Request Body:**
```json
{
  "content": "Text content to index..."
}
```
**Response:**
```json
{
  "message": "Document indexed",
  "chunkCount": 10
}
```
---
### 8. Delete a Document
```http
DELETE /api/v1/knowledge/bases/{kb_id}/documents/{doc_id}
```
**Response:**
```json
{
  "message": "Deleted successfully"
}
```
---
### 9. Search a Knowledge Base
```http
POST /api/v1/knowledge/search
```
**Request Body:**
```json
{
  "kb_id": "kb_001",
  "query": "product return policy",
  "nResults": 5
}
```
**Fields:**
| Field | Type | Required | Description |
|------|------|------|------|
| kb_id | string | Yes | Knowledge base ID |
| query | string | Yes | Search query text |
| nResults | int | No | Number of results to return; defaults to 5 |
**Response:**
```json
{
  "results": [
    {
      "id": "doc_001",
      "text": "Our return policy is...",
      "score": 0.85,
      "metadata": {
        "document_name": "return-policy.pdf",
        "chunk_index": 3
      }
    }
  ]
}
```
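A typical next step after a search is to turn the returned chunks into a context string, for example to hand to an assistant. The sketch below does that for a parsed response. It assumes a higher `score` means a closer match, which this document does not state; the helper name `build_context` and the 0.5 threshold are hypothetical.

```python
from typing import Any, Dict


def build_context(response: Dict[str, Any], min_score: float = 0.5) -> str:
    """Join result texts scoring at least min_score, best first, each tagged with its source."""
    hits = [r for r in response.get("results", []) if r["score"] >= min_score]
    hits.sort(key=lambda r: r["score"], reverse=True)  # assumption: higher score is better
    lines = []
    for r in hits:
        meta = r.get("metadata", {})
        source = f'{meta.get("document_name", "unknown")}#{meta.get("chunk_index", "?")}'
        lines.append(f"[{source}] {r['text']}")
    return "\n".join(lines)
```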
---
### 10. Get Knowledge Base Statistics
```http
GET /api/v1/knowledge/bases/{kb_id}/stats
```
**Response:**
```json
{
  "kb_id": "kb_001",
  "docCount": 10,
  "chunkCount": 150
}
```
---
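## Supported File Types
| File type | Extension | Description |
|----------|--------|------|
| Plain text | .txt | Plain text file |
| Markdown | .md | Markdown document |
| CSV | .csv | CSV tabular data |
| JSON | .json | JSON data |
| PDF | .pdf | PDF document (requires pypdf) |
| Word | .docx | Word document (requires python-docx) |
**Note:** The legacy .doc format is not supported. Convert such files to .docx or another supported format.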
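A client can reject unsupported files before uploading them. The following sketch checks a filename against the extensions in the table above. Matching extensions case-insensitively is an assumption, since this document does not say how the server treats `.PDF` versus `.pdf`; the name `check_uploadable` is hypothetical.

```python
from pathlib import Path

# Extensions taken from the supported file types table.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".csv", ".json", ".pdf", ".docx"}


def check_uploadable(filename: str) -> None:
    """Raise ValueError if the file extension is not in the supported list."""
    ext = Path(filename).suffix.lower()  # assumption: extension matching ignores case
    if ext == ".doc":
        raise ValueError("Legacy .doc is not supported; convert the file to .docx first")
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext or '(no extension)'}")
```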
---
## Schema Definitions
```python
from pydantic import BaseModel
from typing import Optional, List


class KnowledgeBaseCreate(BaseModel):
    """Request body for creating a knowledge base."""
    name: str
    description: Optional[str] = None
    embeddingModel: Optional[str] = "text-embedding-3-small"
    chunkSize: Optional[int] = 500
    chunkOverlap: Optional[int] = 50


class KnowledgeBaseUpdate(BaseModel):
    """Request body for a partial update; every field is optional."""
    name: Optional[str] = None
    description: Optional[str] = None
    embeddingModel: Optional[str] = None
    chunkSize: Optional[int] = None
    chunkOverlap: Optional[int] = None


class KnowledgeSearchQuery(BaseModel):
    """Request body for searching a knowledge base."""
    kb_id: str
    query: str
    nResults: Optional[int] = 5


class DocumentIndexRequest(BaseModel):
    """Request body for indexing raw text content."""
    content: str
```
---
## Unit Tests
The project includes a complete unit test suite, located at `api/tests/test_knowledge.py`.
### Running the Tests
```bash
# Run the knowledge base tests
pytest api/tests/test_knowledge.py -v
# Run all tests
pytest api/tests/ -v
```