Commit Graph

256 Commits

Author SHA1 Message Date
Xin Wang
bfe165daae Add DashScope ASR model support and enhance related components
- Introduced DashScope as a new ASR model in the database initialization.
- Updated ASRModel schema to include vendor information.
- Enhanced ASR router to support DashScope-specific functionality, including connection testing and preview capabilities.
- Modified frontend components to accommodate DashScope as a selectable vendor with appropriate default settings.
- Added tests to validate DashScope ASR model creation, updates, and connectivity.
- Updated backend API to handle DashScope-specific base URLs and vendor normalization.
2026-03-09 07:37:00 +08:00
Xin Wang
e41d34fe23 Add DashScope agent configuration files for VAD, LLM, TTS, and ASR services
- Introduced new YAML configuration files for DashScope, detailing agent behavior settings for VAD, LLM, TTS, and ASR.
- Configured parameters including model paths, API keys, and service URLs for real-time processing.
- Ensured compatibility with existing agent-side behavior management while providing specific settings for DashScope integration.
2026-03-08 23:28:08 +08:00
Xin Wang
aeeeee20d1 Add Volcengine support for TTS and ASR services
- Introduced Volcengine as a new provider for both TTS and ASR services.
- Updated configuration files to include Volcengine-specific parameters such as app_id, resource_id, and uid.
- Enhanced the ASR service to support streaming mode with Volcengine's API.
- Modified existing tests to validate the integration of Volcengine services.
- Updated documentation to reflect the addition of Volcengine as a supported provider for TTS and ASR.
- Refactored service factory to accommodate Volcengine alongside existing providers.
2026-03-08 23:09:50 +08:00
Xin Wang
3604db21eb Remove obsolete audio example files from the project
2026-03-06 14:43:11 +08:00
Xin Wang
da38157638 Add ASR interim results support in Assistant model and API
- Introduced `asr_interim_enabled` field in the Assistant model to control interim ASR results.
- Updated AssistantBase and AssistantUpdate schemas to include the new field.
- Modified the database schema to add the `asr_interim_enabled` column.
- Enhanced runtime metadata to reflect interim ASR settings.
- Updated API endpoints and tests to validate the new functionality.
- Adjusted documentation to include details about interim ASR results configuration.
2026-03-06 12:58:54 +08:00
Xin Wang
e11c3abb9e Implement DashScope ASR provider and enhance ASR service architecture
- Added DashScope ASR service implementation for real-time streaming.
- Updated ASR provider logic to support DashScope alongside existing providers.
- Enhanced runtime metadata resolution to include DashScope as a valid ASR provider.
- Modified configuration files and documentation to reflect the addition of DashScope.
- Introduced tests to validate DashScope integration and ASR service behavior.
- Refactored ASR service factory to accommodate new provider options and modes.
2026-03-06 11:44:39 +08:00
Xin Wang
7e0b777923 Refactor project structure and enhance backend integration
- Expanded package inclusion in `pyproject.toml` to support new modules.
- Introduced new `adapters` and `protocol` packages for better organization.
- Added backend adapter implementations for control plane integration.
- Updated main application imports to reflect new package structure.
- Removed deprecated core components and adjusted documentation accordingly.
- Enhanced architecture documentation to clarify the new runtime and integration layers.
2026-03-06 09:51:56 +08:00
Xin Wang
4e2450e800 Refactor backend integration and service architecture
- Removed the backend client compatibility wrapper and associated methods to streamline backend integration.
- Updated session management to utilize control plane gateways and runtime configuration providers.
- Adjusted TTS service implementations to remove the EdgeTTS service and simplify service dependencies.
- Enhanced documentation to reflect changes in backend integration and service architecture.
- Updated configuration files to remove deprecated TTS provider options and clarify available settings.
2026-03-06 09:00:43 +08:00
Xin Wang
6b589a1b7c Enhance session management and logging configuration
- Updated .env.example to clarify audio frame size validation and default codec settings.
- Refactored logging setup in main.py to support JSON serialization based on log format configuration.
- Improved session.py to dynamically compute audio frame bytes and include protocol version in session events.
- Added tests to validate session start events and audio frame handling based on chunk size settings.
2026-03-05 21:44:23 +08:00
Xin Wang
1cecbaa172 Update .gitignore and add audio example file
- Removed duplicate entry for Thumbs.db in .gitignore to streamline ignored files.
- Added a new audio example file: three_utterances_simple.wav to the audio_examples directory.
2026-03-05 21:28:17 +08:00
Xin Wang
935f2fbd1f Refactor assistant configuration management and update documentation
- Removed legacy agent profile settings from the .env.example and README, streamlining the configuration process.
- Introduced a new local YAML configuration adapter for assistant settings, allowing for easier management of assistant profiles.
- Updated backend integration documentation to clarify the behavior of assistant config sourcing based on backend URL settings.
- Adjusted various service implementations to directly utilize API keys from the new configuration structure.
- Enhanced test coverage for the new local YAML adapter and its integration with backend services.
2026-03-05 21:24:15 +08:00
Xin Wang
d0a6419990 Remove duplicate entry for Vocode Core from the roadmap documentation, streamlining the list of reference projects
2026-03-05 13:22:21 +08:00
Xin Wang
b8760c24be Merge branch 'master' of https://gitea.xiaowang.eu.org/wx44wx/AI-VideoAssistant
2026-03-05 13:20:40 +08:00
Xin Wang
14abbe6f10 Update roadmap documentation with additional reference projects
- Added new sections for open-source and commercial projects to enhance resource visibility.
- Included links to various relevant projects, expanding the list of resources available for users.
2026-03-05 13:17:37 +08:00
Xin Wang
efdcbe5550 Update roadmap documentation with additional reference projects
- Added new sections for open-source and commercial projects to enhance resource visibility.
- Included links to various relevant projects, expanding the list of resources available for users.
2026-03-05 13:14:22 +08:00
Xin Wang
3b6a2f75ee Add changelog README and update roadmap with reference projects
- Created a new README file for the changelog to outline version history.
- Updated the roadmap documentation to replace the contribution section with a list of reference projects, enhancing resource visibility.
2026-03-05 12:53:18 +08:00
Xin Wang
ac9b0047ee Add Mermaid diagram support and update architecture documentation
- Included a new JavaScript file for Mermaid configuration to ensure consistent diagram sizing across documentation.
- Enhanced architecture documentation to reflect the updated pipeline engine structure, including VAD, ASR, TD, LLM, and TTS components.
- Updated various sections to clarify the integration of external services and tools within the architecture.
- Improved styling for Mermaid diagrams to enhance visual consistency and usability.
2026-03-05 11:01:56 +08:00
Xin Wang
4748f3b5f1 Merge branch 'master' of https://gitea.xiaowang.eu.org/wx44wx/AI-VideoAssistant
2026-03-04 11:21:47 +08:00
Xin Wang
947af3a525 Refactor mkdocs.yml and add new documentation for workflow configuration and voice customization
- Restructured the navigation in mkdocs.yml to improve organization, introducing subcategories for assistant creation and component libraries.
- Added new documentation for workflow configuration options, detailing setup and best practices.
- Introduced new sections for voice recognition and generation, outlining configuration items and recommendations for optimal performance.
2026-03-04 11:21:33 +08:00
Xin Wang
d572e1a7f0 Merge branch 'master' of https://gitea.xiaowang.eu.org/wx44wx/AI-VideoAssistant
2026-03-04 11:08:27 +08:00
Xin Wang
d03b3b0e0c Refactor mkdocs.yml for improved navigation structure
- Adjusted indentation in mkdocs.yml to enhance readability and maintain consistency in the navigation hierarchy.
- Ensured that the sections "功能定制" (Feature Customization) and "数据分析" (Data Analytics) are clearly organized under their respective categories.
2026-03-04 10:57:18 +08:00
Xin Wang
526024d603 Enhance assistant configuration documentation with details on persistence and runtime overrides
- Added a new section explaining the two layers of assistant configuration: database persistence and session-level overrides.
- Included a table listing fields that are stored in the database and those that can be overridden during a session.
- Provided code examples demonstrating the merging of baseline configuration with session overrides for clarity.
2026-03-04 10:57:02 +08:00
Xin Wang
b4c6277d2a Add telephone integration to roadmap documentation
- Included a new item in the roadmap for telephone integration, specifying automatic call handling and batch calling capabilities.
- Updated the existing SDK support section to reflect the addition of this feature.
2026-03-04 10:42:41 +08:00
Xin Wang
a8fa66e9cc Update documentation to reflect changes in WebSocket API message formatting and knowledge base
- Updated the WebSocket API reference to improve clarity by removing unnecessary headings and emphasizing message types.
- Revised the index.md to specify 'chroma' as the knowledge base, enhancing the overview of the platform's architecture.
2026-03-04 10:32:56 +08:00
Xin Wang
aaef370d70 Merge branch 'master' of https://gitea.xiaowang.eu.org/wx44wx/AI-VideoAssistant
2026-03-04 10:01:41 +08:00
Xin Wang
7d4af18815 Add output.audio.played message handling and update documentation
- Introduced `output.audio.played` message type for client acknowledgment of audio playback completion.
- Updated `DuplexPipeline` to track client playback state and handle playback completion events.
- Enhanced session handling to route `output.audio.played` messages to the pipeline.
- Revised API documentation to include details about the new message type and its fields.
- Updated schema documentation to reflect the addition of `output.audio.played` in the message flow.
2026-03-04 10:01:34 +08:00
Xin Wang
530d95eea4 Enhance Docker configuration and update dependencies for Realtime Agent Studio
- Updated the API Dockerfile to include the C++11 build tools required for native extensions.
- Revised requirements.txt to upgrade several dependencies, including FastAPI and SQLAlchemy.
- Expanded docker-compose.yml to add MinIO service for S3-compatible storage and improved health checks for backend and engine services.
- Enhanced README.md in the Docker directory to provide detailed service descriptions and quick start instructions.
- Updated mkdocs.yml to reflect new navigation structure and added deployment overview documentation.
- Introduced new Dockerfiles for the engine and web services, including development configurations for hot reloading.
2026-03-04 10:01:00 +08:00
Xin Wang
4c05131536 Update documentation and configuration for Realtime Agent Studio
- Revised mkdocs.yml to reflect the new site name and description, enhancing clarity for users.
- Added a changelog.md to document important changes and updates for the project.
- Introduced a roadmap.md to outline development plans and progress for future releases.
- Expanded index.md with a comprehensive overview of the platform, including core features and installation instructions.
- Enhanced concepts documentation with detailed explanations of assistants, engines, and their configurations.
- Updated configuration documentation to provide clear guidance on environment setup and service configurations.
- Added extra JavaScript for improved user experience in the documentation site.
2026-03-02 23:35:22 +08:00
Xin Wang
80fff09b76 Merge branch 'master' of https://gitea.xiaowang.eu.org/wx44wx/AI-VideoAssistant
2026-03-02 22:51:03 +08:00
Xin Wang
eecde9f0fb Integrate React Query for data management and enhance Debug Preferences
- Added React Query for managing API calls related to assistants and voices.
- Introduced `useAssistantsQuery` and `useVoicesQuery` hooks for fetching data.
- Implemented mutations for creating, updating, and deleting voices using React Query.
- Integrated a global `QueryClient` for managing query states and configurations.
- Refactored components to utilize the new query hooks, improving data handling and performance.
- Added a Zustand store for managing debug preferences, including WebSocket URL and audio settings.
2026-03-02 22:50:57 +08:00
Xin Wang
7fbf52078f Update documentation to reflect changes in quickstart navigation and API reference
- Replaced the "通过控制台" (Via Console) and "通过 API" (Via API) entries in the quickstart section with "资源库配置" (Resource Library Configuration) for improved clarity.
- Updated the API reference link in index.md to direct users to the main quickstart page instead of the outdated API usage example.
2026-03-02 17:33:32 +08:00
Xin Wang
a003134477 Update documentation to enhance clarity and resource configuration for RAS
- Revised the introduction in index.md to emphasize the need for resource configuration before creating an AI assistant.
- Added a new section detailing the configuration process for ASR, LLM, and TTS resources.
- Updated the quickstart guide to reflect the new resource management steps and included troubleshooting tips for common issues.
- Removed the outdated API guide as it has been integrated into the new resource configuration workflow.
2026-03-02 17:30:48 +08:00
Xin Wang
85315ba6ca Update index.md to clarify RAS's core focus on large voice models
- Revised the description of the Realtime Agent Studio (RAS) to emphasize its foundation on large voice models, enhancing clarity on the platform's capabilities.
2026-03-02 17:01:55 +08:00
Xin Wang
9734b38808 Add task list support and update roadmap in documentation
- Added pymdownx.tasklist extension to mkdocs.yml for enhanced task management.
- Revised the roadmap section in index.md to include additional completed and in-progress tasks, improving project tracking and visibility.
2026-03-02 17:01:24 +08:00
Xin Wang
0a7a3253a6 Add emoji support and enhance documentation in RAS
- Added pymdownx.emoji extension to mkdocs.yml for emoji rendering.
- Updated index.md to include a new dashboard image and revised descriptions for clarity.
- Expanded the features section with detailed descriptions of tools and testing capabilities.
- Introduced a roadmap section outlining completed, in-progress, and to-do features for better project visibility.
2026-03-02 16:50:17 +08:00
Xin Wang
a82100fc79 Merge branch 'master' of https://gitea.xiaowang.eu.org/wx44wx/AI-VideoAssistant
2026-03-02 15:12:04 +08:00
Xin Wang
d0897aca92 Update documentation to reflect rebranding from AI Video Assistant to Realtime Agent Studio (RAS)
- Changed site name and description in mkdocs.yml.
- Revised content in index.md to provide a comprehensive overview of RAS features and capabilities.
- Updated API reference and error documentation to replace AI Video Assistant with RAS.
- Modified deployment and getting started guides to align with the new branding.
- Enhanced quickstart instructions to specify RAS service requirements.
2026-03-02 15:11:33 +08:00
Xin Wang
70b4043f9b Enhance DebugDrawer to support voice prompts in text prompt dialogs
- Added `promptType` and `voiceText` properties to `DebugTextPromptDialogState`.
- Updated state management for text prompt dialogs to handle voice prompts.
- Modified dialog activation logic to play voice prompts when applicable.
- Adjusted UI to reflect the type of prompt being displayed (text or voice).
- Ensured proper handling of prompt closure messages based on prompt type.
2026-03-02 15:10:03 +08:00
Xin Wang
3aa9e0f432 Enhance DuplexPipeline to support follow-up context for manual opener tool calls
- Introduced logic to trigger a follow-up turn when the manual opener greeting is empty.
- Updated `_execute_manual_opener_tool_calls` to return structured tool call and result data.
- Added `_build_manual_opener_follow_up_context` method to construct context for follow-up turns.
- Modified `_handle_turn` to accept system context for improved conversation management.
- Enhanced tests to validate the new follow-up behavior and ensure proper context handling.
2026-03-02 14:27:44 +08:00
Xin Wang
fb017f9952 Refactor `selectedToolSchemas` logic in DebugDrawer to simplify tool ID normalization
- Removed redundant inclusion of `DEBUG_CLIENT_TOOLS`, enhancing code clarity and performance.
2026-03-02 12:40:00 +08:00
Xin Wang
00b88c5afa Add manual opener tool calls to Assistant model and API
- Introduced `manual_opener_tool_calls` field in the Assistant model to support custom tool calls.
- Updated AssistantBase and AssistantUpdate schemas to include the new field.
- Implemented normalization and migration logic for handling manual opener tool calls in the API.
- Enhanced runtime metadata to include manual opener tool calls in responses.
- Updated tests to validate the new functionality and ensure proper handling of tool calls.
- Refactored tool ID normalization to support legacy tool names for backward compatibility.
2026-03-02 12:34:42 +08:00
Xin Wang
b5cdb76e52 Implement initial generated opener logic in DuplexPipeline to utilize tool-capable assistant turns when tools are available
- Update tests to verify the correct behavior of the generated opener under various conditions, ensuring proper handling of user input and task management.
2026-03-02 02:47:30 +08:00
Xin Wang
4d553de34d Refactor assistant greeting logic to conditionally use system prompt for generated openers
- Update related tests to verify new behavior and ensure correct metadata handling in API responses.
- Enhance UI to reflect changes in opener management based on generated opener settings.
2026-03-02 02:38:45 +08:00
Xin Wang
31b3969b96 Enhance ToolLibrary by adding `sourceKey` to `ToolParameterDraft` and updating related functions for improved schema management
- Introduce normalization functions for object schemas and defaults, and refactor `buildToolParameterConfig` to utilize these enhancements.
- Update state management in ToolLibraryPage to accommodate new schema handling and defaults integration.
2026-03-02 02:18:28 +08:00
Xin Wang
3f22e2b875 Merge branch 'master' of https://gitea.xiaowang.eu.org/wx44wx/AI-VideoAssistant
2026-03-02 01:56:47 +08:00
Xin Wang
531688aa6b Enhance API documentation by adding new endpoints for ASR preview, assistant configuration retrieval, and knowledge base management
- Update existing assistant and tool definitions for improved clarity and functionality.
- Remove outdated sections from history records documentation, ensuring a streamlined reference for users.
2026-03-02 01:56:38 +08:00
Xin Wang
3626297211 Implement schema editor functionality in ToolLibrary, allowing users to manage tool parameters with JSON schema validation
- Add a drawer for schema editing, enhance state management for schema-related errors, and integrate schema defaults into tool parameter configuration.
- Update UI to include a button for opening the schema drawer.
2026-03-02 01:54:54 +08:00
Xin Wang
1561056a3d Add `voice_choice_prompt` and `text_choice_prompt` tools to API and UI
- Implement state management and parameter definitions for user selection prompts, enhancing user interaction and experience.
2026-03-02 00:49:31 +08:00
Xin Wang
3a5d27d6c3 Implement runtime configuration debugging in DebugDrawer by adding a new function to format session metadata and WebSocket configuration
- Update the display logic to enhance clarity and user experience, including renaming UI elements for better context.
2026-03-01 23:14:08 +08:00
Xin Wang
3643431565 Enhance WebSocket session configuration by introducing an optional `config.resolved` event, which provides a public snapshot of the session's configuration
- Update the API reference documentation to clarify the conditions under which this event is emitted and the details it includes.
- Modify session management to respect the new setting for emitting configuration details, ensuring sensitive information remains secure.
- Update tests to validate the new behavior and ensure compliance with the updated configuration schema.
2026-03-01 23:08:44 +08:00