# Speech Recognition (ASR Model) API

The Speech Recognition API manages the configuration and invocation of speech recognition (ASR) models.

## Basic Information

| Item | Value |
|------|-------|
| Base URL | `/api/v1/asr` |
| Authentication | Bearer Token (reserved) |

---

## Data Model

### ASRModel

```typescript
interface ASRModel {
  id: string;                    // Unique model ID (8-character UUID)
  user_id: number;               // ID of the owning user
  name: string;                  // Display name of the model
  vendor: string;                // Vendor: "OpenAI Compatible" | "Paraformer" | etc.
  language: string;              // Recognition language: "zh" | "en" | "Multi-lingual"
  base_url: string;              // API Base URL
  api_key: string;               // API Key
  model_name?: string;           // Model name, e.g. "whisper-1" | "paraformer-v2"
  hotwords?: string[];           // Hotword list
  enable_punctuation: boolean;   // Whether punctuation is enabled
  enable_normalization: boolean; // Whether text normalization is enabled
  enabled: boolean;              // Whether the model is enabled
  created_at: string;
}
```

---

## API Endpoints

### 1. List ASR models

```http
GET /api/v1/asr
```

**Query Parameters:**

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| language | string | No | - | Filter by language: "zh" \| "en" \| "Multi-lingual" |
| enabled | boolean | No | - | Filter by enabled state |
| page | int | No | 1 | Page number |
| limit | int | No | 50 | Items per page |

**Response:**

```json
{
  "total": 3,
  "page": 1,
  "limit": 50,
  "list": [
    {
      "id": "abc12345",
      "user_id": 1,
      "name": "Whisper Multilingual Recognition",
      "vendor": "OpenAI Compatible",
      "language": "Multi-lingual",
      "base_url": "https://api.openai.com/v1",
      "api_key": "sk-***",
      "model_name": "whisper-1",
      "enable_punctuation": true,
      "enable_normalization": true,
      "enabled": true,
      "created_at": "2024-01-15T10:30:00Z"
    },
    {
      "id": "def67890",
      "user_id": 1,
      "name": "SenseVoice Chinese Recognition",
      "vendor": "OpenAI Compatible",
      "language": "zh",
      "base_url": "https://api.siliconflow.cn/v1",
      "api_key": "sf-***",
      "model_name": "paraformer-v2",
      "hotwords": ["小助手", "帮我"],
      "enable_punctuation": true,
      "enable_normalization": true,
      "enabled": true,
      "created_at": "2024-01-15T10:30:00Z"
    }
  ]
}
```

---

### 2. Get a single ASR model

```http
GET /api/v1/asr/{id}
```

**Path Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| id | string | Model ID |

**Response:**

```json
{
  "id": "abc12345",
  "user_id": 1,
  "name": "Whisper Multilingual Recognition",
  "vendor": "OpenAI Compatible",
  "language": "Multi-lingual",
  "base_url": "https://api.openai.com/v1",
  "api_key": "sk-***",
  "model_name": "whisper-1",
  "hotwords": [],
  "enable_punctuation": true,
  "enable_normalization": true,
  "enabled": true,
  "created_at": "2024-01-15T10:30:00Z"
}
```

---

### 3. Create an ASR model

```http
POST /api/v1/asr
```

**Request Body:**

```json
{
  "name": "SenseVoice Chinese Recognition",
  "vendor": "OpenAI Compatible",
  "language": "zh",
  "base_url": "https://api.siliconflow.cn/v1",
  "api_key": "sk-your-api-key",
  "model_name": "paraformer-v2",
  "hotwords": ["小助手", "帮我"],
  "enable_punctuation": true,
  "enable_normalization": true,
  "enabled": true
}
```

**Field descriptions:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | Yes | Display name of the model |
| vendor | string | Yes | Vendor: "OpenAI Compatible" / "Paraformer" |
| language | string | Yes | Language: "zh" / "en" / "Multi-lingual" |
| base_url | string | Yes | API Base URL |
| api_key | string | Yes | API Key |
| model_name | string | No | Model name |
| hotwords | string[] | No | Hotword list, used to improve recognition accuracy |
| enable_punctuation | boolean | No | Whether to output punctuation; defaults to true |
| enable_normalization | boolean | No | Whether to normalize text; defaults to true |
| enabled | boolean | No | Whether the model is enabled; defaults to true |
| id | string | No | Explicit model ID; generated automatically if omitted |

---

### 4. Update an ASR model

```http
PUT /api/v1/asr/{id}
```

**Request Body:** (partial update)

```json
{
  "name": "Whisper-1 Optimized",
  "language": "zh",
  "enable_punctuation": true,
  "hotwords": ["新词1", "新词2"]
}
```

---

### 5. Delete an ASR model

```http
DELETE /api/v1/asr/{id}
```

**Response:**

```json
{
  "message": "Deleted successfully"
}
```

---

### 6. Test an ASR model

```http
POST /api/v1/asr/{id}/test
```

**Request Body:**

```json
{
  "audio_url": "https://example.com/test-audio.wav"
}
```

Alternatively, send Base64-encoded audio data:

```json
{
  "audio_data": "UklGRi..."
}
```

**Response (success):**

```json
{
  "success": true,
  "transcript": "您好,请问有什么可以帮助您?",
  "language": "zh",
  "confidence": 0.95,
  "latency_ms": 500
}
```

**Response (failure):**

```json
{
  "success": false,
  "error": "HTTP Error: 401 - Unauthorized"
}
```

---

### 7. Transcribe audio

```http
POST /api/v1/asr/{id}/transcribe
```

**Query Parameters:**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| audio_url | string | No* | URL of the audio file |
| audio_data | string | No* | Base64-encoded audio data |
| hotwords | string[] | No | Hotword list |

*Provide either `audio_url` or `audio_data`; at least one is required.

**Response:**

```json
{
  "success": true,
  "transcript": "您好,请问有什么可以帮助您?",
  "language": "zh",
  "confidence": 0.95
}
```

---

## Schema Definitions

```python
from enum import Enum
from pydantic import BaseModel
from typing import Optional, List
from datetime import datetime


class ASRLanguage(str, Enum):
    ZH = "zh"
    EN = "en"
    MULTILINGUAL = "Multi-lingual"


class ASRModelBase(BaseModel):
    name: str
    vendor: str
    language: str  # "zh" | "en" | "Multi-lingual"
    base_url: str
    api_key: str
    model_name: Optional[str] = None
    hotwords: List[str] = []
    enable_punctuation: bool = True
    enable_normalization: bool = True
    enabled: bool = True


class ASRModelCreate(ASRModelBase):
    id: Optional[str] = None


class ASRModelUpdate(BaseModel):
    name: Optional[str] = None
    language: Optional[str] = None
    base_url: Optional[str] = None
    api_key: Optional[str] = None
    model_name: Optional[str] = None
    hotwords: Optional[List[str]] = None
    enable_punctuation: Optional[bool] = None
    enable_normalization: Optional[bool] = None
    enabled: Optional[bool] = None


class ASRModelOut(ASRModelBase):
    id: str
    user_id: int
    created_at: datetime

    class Config:
        from_attributes = True


class ASRTestRequest(BaseModel):
    audio_url: Optional[str] = None
    audio_data: Optional[str] = None  # base64 encoded


class ASRTestResponse(BaseModel):
    success: bool
    transcript: Optional[str] = None
    language: Optional[str] = None
    confidence: Optional[float] = None
    latency_ms: Optional[int] = None
    error: Optional[str] = None
```

---

## Vendor Configuration Examples

### OpenAI Whisper

```json
{
  "vendor": "OpenAI Compatible",
  "base_url": "https://api.openai.com/v1",
  "api_key": "sk-xxx",
  "model_name": "whisper-1",
  "language": "Multi-lingual",
  "enable_punctuation": true,
  "enable_normalization": true
}
```

### OpenAI Compatible Paraformer

```json
{
  "vendor": "OpenAI Compatible",
  "base_url": "https://api.siliconflow.cn/v1",
  "api_key": "sf-xxx",
  "model_name": "paraformer-v2",
  "language": "zh",
  "hotwords": ["产品名称", "公司名"],
  "enable_punctuation": true,
  "enable_normalization": true
}
```

---

## Unit Tests

The project's unit tests for this API are located in `api/tests/test_asr.py`.

### Test Case Overview

| Test method | Description |
|-------------|-------------|
| test_get_asr_models_empty | List models from an empty database |
| test_create_asr_model | Create a model |
| test_create_asr_model_minimal | Create a model with the minimal set of fields |
| test_get_asr_model_by_id | Get a single model |
| test_get_asr_model_not_found | Get a model that does not exist |
| test_update_asr_model | Update a model |
| test_delete_asr_model | Delete a model |
| test_list_asr_models_with_pagination | Pagination |
| test_filter_asr_models_by_language | Filter by language |
| test_filter_asr_models_by_enabled | Filter by enabled state |
| test_create_asr_model_with_hotwords | Hotword configuration |
| test_test_asr_model_siliconflow | Test endpoint with the OpenAI Compatible vendor |
| test_test_asr_model_openai | Test endpoint with the OpenAI vendor |
| test_different_asr_languages | Multiple languages |
| test_different_asr_vendors | Multiple vendors |

### Running the Tests

```bash
# Run the ASR-related tests
pytest api/tests/test_asr.py -v

# Run all tests
pytest api/tests/ -v
```
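
As a complement to the endpoint reference above, here is a minimal sketch of how a client might assemble the request for `POST /api/v1/asr/{id}/test`. The helper names `encode_audio` and `build_test_request` are hypothetical and not part of this project. The "at least one of `audio_url` / `audio_data`" check mirrors the rule documented for the transcribe endpoint and is only assumed to apply to the test endpoint as well. The sketch builds the request description only and leaves the HTTP call to whichever client library you use.

```python
import base64
from typing import Any, Dict, Optional

BASE_PATH = "/api/v1/asr"  # Base URL taken from the "Basic Information" table


def encode_audio(raw: bytes) -> str:
    """Return raw audio bytes as Base64 text for the `audio_data` field."""
    return base64.b64encode(raw).decode("ascii")


def build_test_request(
    model_id: str,
    audio_url: Optional[str] = None,
    audio_data: Optional[str] = None,
) -> Dict[str, Any]:
    """Describe a call to the test endpoint (hypothetical helper).

    Assumption: like the transcribe endpoint, the test endpoint needs
    at least one of `audio_url` or `audio_data`.
    """
    if audio_url is None and audio_data is None:
        raise ValueError("provide audio_url or audio_data")

    # Only include the fields that were actually supplied.
    body: Dict[str, Any] = {}
    if audio_url is not None:
        body["audio_url"] = audio_url
    if audio_data is not None:
        body["audio_data"] = audio_data

    return {
        "method": "POST",
        "path": f"{BASE_PATH}/{model_id}/test",
        "json": body,
    }


if __name__ == "__main__":
    request = build_test_request(
        "abc12345",
        audio_url="https://example.com/test-audio.wav",
    )
    print(request["method"], request["path"], request["json"])
```

The returned dictionary can be passed to an HTTP client of your choice, adding the Bearer token header once authentication is enabled. Keeping request construction separate from transport also makes the input check easy to unit test without a running server.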