Enhance API documentation by adding new endpoints for ASR preview, assistant configuration retrieval, and knowledge base management. Update existing assistant and tool definitions for improved clarity and functionality. Remove outdated sections from history records documentation, ensuring a streamlined reference for users.
420
api/docs/knowledge.md
Normal file
@@ -0,0 +1,420 @@
# Knowledge Base API

The Knowledge Base API manages knowledge bases and their documents, covering creation, indexing, and search.

## Basic Information

| Item | Value |
|------|-------|
| Base URL | `/api/v1/knowledge` |
| Authentication | Bearer Token (reserved for future use) |

---

## Data Models

### KnowledgeBase

```typescript
interface KnowledgeBase {
  id: string;                     // Unique knowledge base identifier (8-character UUID)
  user_id: number;                // ID of the owning user
  name: string;                   // Knowledge base name
  description: string;            // Knowledge base description
  embeddingModel: string;         // Embedding model name
  chunkSize: number;              // Document chunk size
  chunkOverlap: number;           // Overlap between adjacent chunks
  docCount: number;               // Number of documents
  chunkCount: number;             // Number of text chunks after splitting
  status: string;                 // Status: "active" | "inactive"
  createdAt: string;              // Creation time
  updatedAt: string;              // Last update time
  documents: KnowledgeDocument[]; // Associated documents
}
```

### KnowledgeDocument

```typescript
interface KnowledgeDocument {
  id: string;                  // Unique document identifier
  kb_id: string;               // ID of the owning knowledge base
  name: string;                // Document name
  size: string;                // File size
  fileType: string;            // File type
  storageUrl: string;          // Storage location
  status: string;              // Status: "pending" | "processing" | "completed" | "failed"
  chunkCount: number;          // Number of text chunks after splitting
  errorMessage: string | null; // Error message (null when there is none)
  uploadDate: string;          // Upload time
  createdAt: string;           // Creation time
  processedAt: string;         // Time processing finished
}
```

---

## API Endpoints

### 1. List Knowledge Bases

```http
GET /api/v1/knowledge/bases
```

**Query Parameters:**

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| user_id | int | No | 1 | User ID |
| page | int | No | 1 | Page number |
| limit | int | No | 50 | Items per page |

**Response:**

```json
{
  "total": 2,
  "page": 1,
  "limit": 50,
  "list": [
    {
      "id": "kb_001",
      "user_id": 1,
      "name": "Product Knowledge Base",
      "description": "Product documentation and FAQ",
      "embeddingModel": "text-embedding-3-small",
      "chunkSize": 500,
      "chunkOverlap": 50,
      "docCount": 10,
      "chunkCount": 150,
      "status": "active",
      "createdAt": "2024-01-15T10:30:00",
      "updatedAt": "2024-01-15T10:30:00",
      "documents": [...]
    }
  ]
}
```
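
The query parameters and the `total` field are enough to walk through every page of results. The sketch below builds the request URL and works out how many pages a listing spans. The host and port in `BASE_URL` and the helper names `list_bases_url` and `page_count` are assumptions made for illustration; they are not part of the API.

```python
import math
from urllib.parse import urlencode

# Assumed deployment address; only the /api/v1/knowledge path comes from this document.
BASE_URL = "http://localhost:8000/api/v1/knowledge"


def list_bases_url(user_id: int = 1, page: int = 1, limit: int = 50) -> str:
    """Build the GET URL for one page of knowledge bases."""
    query = urlencode({"user_id": user_id, "page": page, "limit": limit})
    return f"{BASE_URL}/bases?{query}"


def page_count(total: int, limit: int) -> int:
    """Number of pages needed to cover `total` items at `limit` items per page."""
    return math.ceil(total / limit) if total else 0


# The sample response reports total=2 with limit=50, so one page covers it.
url = list_bases_url(page=1)
pages = page_count(total=2, limit=50)
```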

---

### 2. Get Knowledge Base Details

```http
GET /api/v1/knowledge/bases/{kb_id}
```

**Response:**

```json
{
  "id": "kb_001",
  "user_id": 1,
  "name": "Product Knowledge Base",
  "description": "Product documentation and FAQ",
  "embeddingModel": "text-embedding-3-small",
  "chunkSize": 500,
  "chunkOverlap": 50,
  "docCount": 10,
  "chunkCount": 150,
  "status": "active",
  "createdAt": "2024-01-15T10:30:00",
  "updatedAt": "2024-01-15T10:30:00",
  "documents": [
    {
      "id": "doc_001",
      "kb_id": "kb_001",
      "name": "Product Manual.pdf",
      "size": "1.2 MB",
      "fileType": "application/pdf",
      "storageUrl": "",
      "status": "completed",
      "chunkCount": 45,
      "errorMessage": null,
      "uploadDate": "2024-01-15T10:30:00",
      "createdAt": "2024-01-15T10:30:00",
      "processedAt": "2024-01-15T10:30:05"
    }
  ]
}
```

---

### 3. Create Knowledge Base

```http
POST /api/v1/knowledge/bases
```

**Request Body:**

```json
{
  "name": "Product Knowledge Base",
  "description": "Product documentation and FAQ",
  "embeddingModel": "text-embedding-3-small",
  "chunkSize": 500,
  "chunkOverlap": 50
}
```
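
A client can assemble this request body with a small helper that fills in the defaults documented for this endpoint. This is a minimal sketch: `build_create_payload` is a hypothetical name, and the check that `chunkOverlap` stays below `chunkSize` is an assumed client-side sanity check, since the server's own validation rules are not described here.

```python
import json
from typing import Optional

# Defaults as documented for this endpoint.
DEFAULT_EMBEDDING_MODEL = "text-embedding-3-small"
DEFAULT_CHUNK_SIZE = 500
DEFAULT_CHUNK_OVERLAP = 50


def build_create_payload(
    name: str,
    description: Optional[str] = None,
    embedding_model: str = DEFAULT_EMBEDDING_MODEL,
    chunk_size: int = DEFAULT_CHUNK_SIZE,
    chunk_overlap: int = DEFAULT_CHUNK_OVERLAP,
) -> dict:
    """Return a JSON-serializable body for POST /api/v1/knowledge/bases."""
    if not name.strip():
        raise ValueError("name is required")
    # Assumption: an overlap as large as the chunk itself makes no sense.
    if chunk_overlap >= chunk_size:
        raise ValueError("chunkOverlap should be smaller than chunkSize")
    payload = {
        "name": name,
        "embeddingModel": embedding_model,
        "chunkSize": chunk_size,
        "chunkOverlap": chunk_overlap,
    }
    # description is optional, so it is only sent when provided.
    if description is not None:
        payload["description"] = description
    return payload


body = json.dumps(build_create_payload("Product Knowledge Base"))
```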

**Field descriptions:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | Yes | Knowledge base name |
| description | string | No | Knowledge base description |
| embeddingModel | string | No | Embedding model name, default "text-embedding-3-small" |
| chunkSize | int | No | Document chunk size, default 500 |
| chunkOverlap | int | No | Chunk overlap size, default 50 |

---

### 4. Update Knowledge Base

```http
PUT /api/v1/knowledge/bases/{kb_id}
```

**Request Body:** (partial update)

```json
{
  "name": "Updated knowledge base name",
  "description": "New description",
  "chunkSize": 800
}
```
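
Because the body is a partial update, a client only needs to send the fields it wants to change. The sketch below keeps the supplied fields and drops unset ones. It also refuses an `embeddingModel` change when the caller reports indexed documents, mirroring the restriction documented for this endpoint. The helper name `build_update_payload` and its `indexed_docs` argument are hypothetical.

```python
from typing import Any, Dict

# Fields accepted by the update endpoint, per the KnowledgeBaseUpdate schema.
UPDATABLE_FIELDS = ("name", "description", "embeddingModel", "chunkSize", "chunkOverlap")


def build_update_payload(changes: Dict[str, Any], indexed_docs: int = 0) -> Dict[str, Any]:
    """Return a partial-update body containing only the fields being changed."""
    unknown = set(changes) - set(UPDATABLE_FIELDS)
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    # Client-side mirror of the documented rule: the embedding model is fixed
    # once documents have been indexed.
    if changes.get("embeddingModel") is not None and indexed_docs > 0:
        raise ValueError("embeddingModel cannot change while documents are indexed")
    # Drop None values so untouched fields are omitted from the request.
    return {key: value for key, value in changes.items() if value is not None}


payload = build_update_payload({"name": "Updated knowledge base name", "chunkSize": 800})
```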

**Note:** If the knowledge base already contains indexed documents, `embeddingModel` cannot be changed. To change it, delete all documents first.

---

### 5. Delete Knowledge Base

```http
DELETE /api/v1/knowledge/bases/{kb_id}
```

**Response:**

```json
{
  "message": "Deleted successfully"
}
```

**Note:** Deleting a knowledge base also deletes its associated data in the vector database.

---

### 6. Upload Document

```http
POST /api/v1/knowledge/bases/{kb_id}/documents
```

Two upload methods are supported:

**Method 1: File upload (multipart/form-data)**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| file | file | Yes | The document file to upload |

Supported file types: `.txt`, `.md`, `.csv`, `.json`, `.pdf`, `.docx`

**Method 2: Create a document record only (application/json)**

```json
{
  "name": "document.pdf",
  "size": "1.2 MB",
  "fileType": "application/pdf",
  "storageUrl": "https://storage.example.com/doc.pdf"
}
```

**Response (file upload):**

```json
{
  "id": "doc_001",
  "name": "Product Manual.pdf",
  "size": "1.2 MB",
  "fileType": "application/pdf",
  "storageUrl": "",
  "status": "completed",
  "chunkCount": 45,
  "message": "Document uploaded and indexed"
}
```
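
For the file-upload method, a client can reject unsupported files before sending anything and then encode the file as `multipart/form-data` under the `file` field. The following is a minimal sketch using only the standard library. The helper names `check_uploadable` and `encode_multipart` are hypothetical, and case-insensitive extension matching is an assumption about how the server compares extensions.

```python
import uuid
from pathlib import Path
from typing import Tuple

# Extensions listed as supported by the upload endpoint.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".csv", ".json", ".pdf", ".docx"}


def check_uploadable(filename: str) -> str:
    """Return the lowercased extension, or raise ValueError if it is unsupported."""
    ext = Path(filename).suffix.lower()
    if ext == ".doc":
        raise ValueError("legacy .doc is not supported; convert to .docx first")
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    return ext


def encode_multipart(filename: str, data: bytes,
                     content_type: str = "application/octet-stream") -> Tuple[bytes, str]:
    """Encode one file part named "file"; return (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


check_uploadable("notes.md")
body, header = encode_multipart("notes.md", b"# Notes\n", "text/markdown")
```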

---

### 7. Index Document Content

```http
POST /api/v1/knowledge/bases/{kb_id}/documents/{doc_id}/index
```

Indexes text content directly into the vector database, without uploading a file.

**Request Body:**

```json
{
  "content": "Text content to index..."
}
```

**Response:**

```json
{
  "message": "Document indexed",
  "chunkCount": 10
}
```
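
The returned `chunkCount` depends on how the content is split according to the knowledge base's `chunkSize` and `chunkOverlap`. This document does not describe the service's actual splitter, so the sketch below is only one plausible reading: a character-based sliding window in which each chunk starts `chunkSize - chunkOverlap` characters after the previous one. The function name `split_into_chunks` is hypothetical.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows (assumed strategy, not the service's)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # distance between consecutive chunk starts
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# A 1000-character text with the default settings yields windows starting at 0, 450, and 900.
sample_text = "".join(str(i % 10) for i in range(1000))
chunks = split_into_chunks(sample_text)
```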

---

### 8. Delete Document

```http
DELETE /api/v1/knowledge/bases/{kb_id}/documents/{doc_id}
```

**Response:**

```json
{
  "message": "Deleted successfully"
}
```

---

### 9. Search Knowledge Base

```http
POST /api/v1/knowledge/search
```

**Request Body:**

```json
{
  "kb_id": "kb_001",
  "query": "product return policy",
  "nResults": 5
}
```

**Field descriptions:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| kb_id | string | Yes | Knowledge base ID |
| query | string | Yes | Search query text |
| nResults | int | No | Number of results to return, default 5 |

**Response:**

```json
{
  "results": [
    {
      "id": "doc_001",
      "text": "Our return policy is...",
      "score": 0.85,
      "metadata": {
        "document_name": "Return Policy.pdf",
        "chunk_index": 3
      }
    }
  ]
}
```
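
A client typically builds the search body and then post-processes `results`. The sketch below does both with the standard library. It assumes that a higher `score` means a closer match, which this document does not state, and the helper names `build_search_body` and `top_hits` are hypothetical.

```python
import json
from typing import Any, Dict, List


def build_search_body(kb_id: str, query: str, n_results: int = 5) -> str:
    """Serialize the request body for POST /api/v1/knowledge/search."""
    if n_results < 1:
        raise ValueError("nResults must be at least 1")
    # ensure_ascii=False keeps non-ASCII queries readable in the payload.
    return json.dumps({"kb_id": kb_id, "query": query, "nResults": n_results},
                      ensure_ascii=False)


def top_hits(response: Dict[str, Any], min_score: float = 0.0) -> List[Dict[str, Any]]:
    """Keep results at or above min_score, best first (assumes higher score is better)."""
    hits = [r for r in response.get("results", []) if r.get("score", 0.0) >= min_score]
    return sorted(hits, key=lambda r: r.get("score", 0.0), reverse=True)


# Response shaped like the sample above.
sample_response = {
    "results": [
        {
            "id": "doc_001",
            "text": "Our return policy is...",
            "score": 0.85,
            "metadata": {"document_name": "Return Policy.pdf", "chunk_index": 3},
        }
    ]
}
request_body = build_search_body("kb_001", "product return policy")
hits = top_hits(sample_response, min_score=0.5)
```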

---

### 10. Get Knowledge Base Statistics

```http
GET /api/v1/knowledge/bases/{kb_id}/stats
```

**Response:**

```json
{
  "kb_id": "kb_001",
  "docCount": 10,
  "chunkCount": 150
}
```

---

## Supported File Types

| File type | Extension | Description |
|-----------|-----------|-------------|
| Plain text | .txt | Plain text file |
| Markdown | .md | Markdown document |
| CSV | .csv | CSV tabular data |
| JSON | .json | JSON data |
| PDF | .pdf | PDF document (requires pypdf) |
| Word | .docx | Word document (requires python-docx) |

**Note:** The legacy .doc format is not supported; convert such files to .docx or another supported format.

---

## Schema Definitions

```python
from pydantic import BaseModel
from typing import Optional, List


class KnowledgeBaseCreate(BaseModel):
    name: str
    description: Optional[str] = None
    embeddingModel: Optional[str] = "text-embedding-3-small"
    chunkSize: Optional[int] = 500
    chunkOverlap: Optional[int] = 50


class KnowledgeBaseUpdate(BaseModel):
    name: Optional[str] = None
    description: Optional[str] = None
    embeddingModel: Optional[str] = None
    chunkSize: Optional[int] = None
    chunkOverlap: Optional[int] = None


class KnowledgeSearchQuery(BaseModel):
    kb_id: str
    query: str
    nResults: Optional[int] = 5


class DocumentIndexRequest(BaseModel):
    content: str
```

---

## Unit Tests

The project includes a full suite of unit tests, located in `api/tests/test_knowledge.py`.

### Running the Tests

```bash
# Run the knowledge base tests
pytest api/tests/test_knowledge.py -v

# Run all tests
pytest api/tests/ -v
```