Xin Wang aeeeee20d1 Add Volcengine support for TTS and ASR services

- Introduced Volcengine as a new provider for both TTS and ASR services.
- Updated configuration files to include Volcengine-specific parameters such as app_id, resource_id, and uid.
- Enhanced the ASR service to support streaming mode with Volcengine's API.
- Modified existing tests to validate the integration of Volcengine services.
- Updated documentation to reflect the addition of Volcengine as a supported provider for TTS and ASR.
- Refactored service factory to accommodate Volcengine alongside existing providers.

2026-03-08 23:09:50 +08:00

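Speech Recognition

Speech recognition (ASR) transcribes the user's audio into text in real time so that the dialogue engine can interpret it.

Modes

- `offline`: the engine buffers audio locally and then triggers recognition (used with OpenAI-compatible / SiliconFlow).
- `streaming`: audio chunks are sent to the server in real time, and the server continuously returns transcription events (used with DashScope Realtime ASR and Volcengine BigASR).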
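The difference between the two modes can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `recognize_offline`, `recognize_streaming`, and the two `fake_*` stand-ins are hypothetical names, and no real provider API is called.

```python
from typing import Callable, Iterable, Iterator, List


def recognize_offline(chunks: Iterable[bytes],
                      transcribe: Callable[[bytes], str]) -> str:
    """Offline mode: buffer all audio locally, then run one recognition call."""
    buffered = b"".join(chunks)
    return transcribe(buffered)


def recognize_streaming(chunks: Iterable[bytes],
                        send_chunk: Callable[[bytes], List[str]]) -> Iterator[str]:
    """Streaming mode: forward each chunk as it arrives and yield
    whatever transcription events come back for it."""
    for chunk in chunks:
        for event in send_chunk(chunk):
            yield event


# Hypothetical stand-ins for a provider backend (assumptions for illustration).
def fake_transcribe(audio: bytes) -> str:
    return f"final:{len(audio)} bytes"


def fake_send_chunk(chunk: bytes) -> List[str]:
    return [f"partial:{len(chunk)} bytes"]


audio_chunks = [b"\x00" * 320, b"\x00" * 320, b"\x00" * 160]
offline_result = recognize_offline(audio_chunks, fake_transcribe)
streaming_events = list(recognize_streaming(audio_chunks, fake_send_chunk))
```

In the offline path nothing is produced until all audio has been buffered, while the streaming path yields an event per chunk, which is why streaming tends to suit latency-sensitive use.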

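Configuration

| Option | Description |
| --- | --- |
| ASR engine | Selects the speech recognition service provider |
| Model | Name of the recognition model |
| `enable_interim` | Whether to enable interim results for offline ASR (default `false`; only takes effect in offline mode) |
| `app_id` / `resource_id` | Application ID and resource ID for vendors such as Volcengine |
| `request_params` | Vendor-native request parameters passed through as-is, for example `end_window_size`, `force_to_speech_time`, `context` |
| Language | Chinese / English / multilingual |
| Hotwords | Improves recognition accuracy for specific terms |
| Punctuation and normalization | Whether to automatically add punctuation and normalize text |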
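To make the options above concrete, here is a hedged sketch of how they might be grouped and sanity-checked. The `AsrConfig` class, the `check_config` helper, the field names other than `enable_interim`, `app_id`, `resource_id`, and `request_params`, and every value shown (model name, IDs, `end_window_size` value, hotwords) are assumptions for illustration. The project's actual configuration schema may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class AsrConfig:
    provider: str                        # ASR engine
    model: str                           # recognition model name
    mode: str = "offline"                # "offline" or "streaming" (assumed field)
    enable_interim: bool = False         # default false, offline mode only
    app_id: Optional[str] = None
    resource_id: Optional[str] = None
    request_params: Dict[str, Any] = field(default_factory=dict)
    language: str = "zh"                 # assumed language code format
    hotwords: List[str] = field(default_factory=list)
    punctuation: bool = True


def check_config(cfg: AsrConfig) -> List[str]:
    """Return human-readable warnings for settings that will not take effect."""
    warnings = []
    if cfg.enable_interim and cfg.mode != "offline":
        warnings.append("enable_interim only takes effect in offline mode")
    # Assumption: Volcengine needs both identifiers to be set.
    if cfg.provider == "volcengine" and not (cfg.app_id and cfg.resource_id):
        warnings.append("volcengine expects app_id and resource_id")
    return warnings


volc = AsrConfig(
    provider="volcengine",
    model="example-model",                   # placeholder, not a real model name
    mode="streaming",
    app_id="your-app-id",
    resource_id="your-resource-id",
    request_params={"end_window_size": 800},  # illustrative value only
    hotwords=["refund", "order number"],
)
misconfigured = AsrConfig(provider="dashscope", model="example-model",
                          mode="streaming", enable_interim=True)

volc_warnings = check_config(volc)
misconfigured_warnings = check_config(misconfigured)
```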

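Recommendations

- For customer service scenarios, enable hotwords and maintain a business vocabulary list.
- For multilingual scenarios, specify the language explicitly at each session entry point.
- For latency-sensitive scenarios, prefer streaming recognition models.
- Currently supported providers: `openai_compatible`, `siliconflow`, `dashscope`, `volcengine`, and `buffered` (fallback).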
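Provider selection with a fallback could look roughly like the sketch below. The provider names come from the list above, and the streaming set follows the mode descriptions. The `resolve_provider` and `pick_mode` helpers are hypothetical, as is the assumption that an unknown or missing name falls back to `buffered` and that `buffered` behaves as an offline provider.

```python
from typing import Optional

SUPPORTED_PROVIDERS = (
    "openai_compatible",
    "siliconflow",
    "dashscope",
    "volcengine",
    "buffered",
)

# Providers described as streaming-capable in the mode descriptions.
STREAMING_PROVIDERS = {"dashscope", "volcengine"}


def resolve_provider(name: Optional[str]) -> str:
    """Normalize a provider name; fall back to 'buffered' when unrecognized
    (assumed fallback behaviour)."""
    normalized = (name or "").strip().lower()
    return normalized if normalized in SUPPORTED_PROVIDERS else "buffered"


def pick_mode(provider: str) -> str:
    """Map a provider to its recognition mode; non-streaming providers
    are treated as offline (assumption for 'buffered')."""
    return "streaming" if provider in STREAMING_PROVIDERS else "offline"


selected = resolve_provider("Volcengine")
selected_mode = pick_mode(selected)
```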
