Commit Graph

229 Commits

Author SHA1 Message Date
Xin Wang
403b4b93c7 Add ASR capture timeout handling in DuplexPipeline and enhance EOU detection logic. Introduce _ASR_CAPTURE_MAX_MS constant and manage capture state timing to ensure timely end of utterance processing, even during silence. Update EouDetector to allow silence-only EOU when VAD state is lost. 2026-02-27 09:59:54 +08:00
Xin Wang
0b308f9bce Remove deprecated agent configuration files: default.yaml, example.yaml, and tools.yaml, streamlining the agent behavior setup and eliminating unused parameters. 2026-02-27 09:39:23 +08:00
Xin Wang
e14eac347f Update default.yaml configuration for speech agent parameters, adjusting min_speech_duration_ms from 100 to 120 ms and eou_threshold_ms from 800 to 1300 ms. Modify audio model parameters: set start_min_speech_ms to 100 ms, pre_speech_ms to 360 ms, and final_tail_ms to 180 ms for improved audio processing. 2026-02-27 09:00:38 +08:00
Xin Wang
0f02de5fc3 Update AssistantsPage to include new icon for audio preview button and adjust text for clarity. Add square icon for stop audio preview button to enhance UI consistency. 2026-02-26 16:35:12 +08:00
Xin Wang
0de6fe529e Add audio preview functionality for assistant opener audio in AssistantsPage. Implement controls for previewing and stopping audio playback, and integrate new API endpoint for fetching PCM buffer. Enhance user interface with updated button states for audio actions. 2026-02-26 16:15:31 +08:00
Xin Wang
fb95e2abe2 Add opener audio functionality to Assistant model and related schemas, enabling audio generation and playback features. Update API routes and frontend components to support opener audio management, including status retrieval and generation controls. 2026-02-26 14:31:50 +08:00
Xin Wang
833cb0d4c4 Add unsaved changes confirmation dialog in AssistantsPage to enhance user experience when opening debug window 2026-02-26 14:17:01 +08:00
Xin Wang
fbbb2e0fee Add assistant snapshot management to track unsaved changes and enhance debug handling 2026-02-26 14:13:47 +08:00
Xin Wang
da83c8ec8a Implement initial greeting emission in DuplexPipeline after session activation, ensuring proper event ordering for frontend notifications. 2026-02-26 14:07:46 +08:00
Xin Wang
cfc8db3fe7 Implement API URL resolution in OpenAICompatibleASRService to ensure correct endpoint handling for transcription requests. 2026-02-26 12:04:59 +08:00
Xin Wang
37b646186d Refactor DebugDrawer to store submitted session metadata and update resolved config view with merged metadata 2026-02-26 11:21:34 +08:00
Xin Wang
14b4b3d966 Implement API URL resolution for OpenAICompatibleTTSService to handle both base and full speech endpoint formats. 2026-02-26 11:07:54 +08:00
Xin Wang
1bcf625f86 Enhance runtime metadata fetching by including assistant and app IDs, and defaulting channel to 'web_debug' 2026-02-26 11:04:54 +08:00
Xin Wang
8bc21c7874 Add detailed logging for session runtime configuration and service resolution 2026-02-26 11:02:15 +08:00
Xin Wang
f77f7c7531 Voice library support dashscope 2026-02-26 03:54:52 +08:00
Xin Wang
b193f91432 Set DashScope TTS default mode to commit 2026-02-26 03:10:07 +08:00
Xin Wang
562341a72c add dashscope tts 2026-02-26 03:02:48 +08:00
Xin Wang
6744646390 Update frontend debug drawer 2026-02-26 02:23:23 +08:00
Xin Wang
72ed7d0512 Unify db api 2026-02-26 01:58:39 +08:00
Xin Wang
56f8aa2191 Fix talking voice error 2026-02-12 19:39:26 +08:00
Xin Wang
81ed89b84f Vendor can show more 2026-02-12 19:29:24 +08:00
Xin Wang
3c7efce80b Consistent library UI 2026-02-12 19:23:30 +08:00
Xin Wang
20afc63a28 Make init db args clean 2026-02-12 19:16:50 +08:00
Xin Wang
da1293e39a Assistants use short generated id too 2026-02-12 19:11:09 +08:00
Xin Wang
28ca003662 Use generated short id for llm asr tts 2026-02-12 19:05:50 +08:00
Xin Wang
14991af1bf Clean migrate code in init db 2026-02-12 18:51:27 +08:00
Xin Wang
ff3a03b1ad Use openai compatible as vendor 2026-02-12 18:44:55 +08:00
Xin Wang
260ff621bf Merge branch 'master' of https://gitea.xiaowang.eu.org/wx44wx/AI-VideoAssistant 2026-02-12 18:01:19 +08:00
Xin Wang
0f9543d8a4 Voice add interface change 2026-02-12 18:01:05 +08:00
Xin Wang
024beeaea3 Make init db slim 2026-02-12 17:54:01 +08:00
Xin Wang
98207936ae Update .env.example 2026-02-12 17:44:38 +08:00
Xin Wang
35bd83767e Cleanup engine 2026-02-12 17:42:21 +08:00
Xin Wang
838c19bf9c Add env example 2026-02-12 17:02:41 +08:00
Xin Wang
aabf2ce8b9 Fix asr begin error 2026-02-12 16:52:42 +08:00
Xin Wang
543528239e Tune engine vad config 2026-02-12 16:29:55 +08:00
Xin Wang
a92a56b845 Persist opener to history 2026-02-12 15:59:36 +08:00
Xin Wang
bbfb5570cc Remove redundant code in init_db 2026-02-12 15:59:21 +08:00
Xin Wang
399c9c97b1 Add tool call log 2026-02-12 15:44:01 +08:00
Xin Wang
6744704c7e Make get time tool use system tool 2026-02-12 15:39:09 +08:00
Xin Wang
39bcd67eac Update voice vad ui 2026-02-12 15:31:45 +08:00
Xin Wang
82521e7b90 Update assistants setting ui 2026-02-12 15:28:10 +08:00
Xin Wang
edcbc2cec7 Add first turn option 2026-02-12 15:23:32 +08:00
Xin Wang
56ca95c200 Improve UI 2026-02-12 15:09:25 +08:00
Xin Wang
cbebfe1c7a Fix opener not triggering when TTS is disabled 2026-02-12 14:55:03 +08:00
Xin Wang
a7ef8858de Fix frontend opener showing 2026-02-12 14:46:16 +08:00
Xin Wang
ef13ddb6b2 Text drawer use generated opener 2026-02-12 14:40:22 +08:00
Xin Wang
a17ef6f182 Remove db migration code to init 2026-02-12 14:29:47 +08:00
Xin Wang
d41db6418c Add bot not interrupt and generated opener 2026-02-12 13:51:27 +08:00
Xin Wang
6179053388 Fix text and tool interleaving with minimal change 2026-02-11 15:32:43 +08:00
Xin Wang
6e63b49a4c Clean overview code 2026-02-11 15:01:59 +08:00