39 Commits

Author SHA1 Message Date
Xin Wang
809b634420 Enhance AssistantConfig and pipeline for FastGPT integration
- Add new fields in AssistantConfig for FastGPT connection details, including `fastgpt_api_url`, `fastgpt_api_key`, and `fastgpt_app_id`.
- Update the pipeline to utilize the new FastGPT configuration, ensuring proper integration with external services.
- Introduce type handling for different assistant types, including support for realtime modes and external brain management.
- Refactor frontend components to include hints for FastGPT configuration inputs, improving user guidance during setup.
2026-06-16 16:55:51 +08:00
Xin Wang
43cc188d85 Add audio output silence configuration to prevent abrupt call termination
- Introduce a new parameter `audio_out_end_silence_secs` in the `_base_params` function to control the duration of silence added after the end frame, allowing for smoother call termination.
- Set the default value to 0 to ensure immediate hang-up after the end speech, enhancing user experience during call endings.
2026-06-16 09:39:16 +08:00
Xin Wang
b22a9e1045 Enhance pipeline execution and voice preview handling for graceful call termination
- Introduce mechanisms in the pipeline to ensure that the end call process waits for the completion of the end speech before hanging up, improving user experience during call termination.
- Update the useVoicePreview hook to handle server-initiated call endings gracefully, distinguishing between normal and error disconnections.
- Adjust TTS stop frame timeout settings to optimize the timing of call terminations, ensuring timely responses without unnecessary delays.
- Refactor related components to support the new end call logic, enhancing overall workflow management and user interaction.
2026-06-16 09:24:24 +08:00
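Note on b22a9e1045: the "wait for the end speech before hanging up" behaviour can be pictured as a bounded wait on a completion signal. The sketch below is illustrative only. `end_call`, `speech_done`, `hang_up`, and `timeout_secs` are hypothetical names, not identifiers from this repository, and the real pipeline may coordinate this through its own frame types.

```python
import asyncio


async def end_call(speech_done: asyncio.Event, hang_up, timeout_secs: float = 5.0) -> bool:
    """Wait for the end speech to finish (or time out), then hang up.

    Returns True if the speech completed before the timeout.
    All names here are hypothetical stand-ins for the real pipeline hooks.
    """
    try:
        # Bounded wait: a stuck TTS stream must not keep the call open forever.
        await asyncio.wait_for(speech_done.wait(), timeout=timeout_secs)
        completed = True
    except asyncio.TimeoutError:
        completed = False
    # Hang up in both cases, so the call always terminates.
    await hang_up()
    return completed
```

The timeout keeps termination timely even when the speech-finished signal never arrives, which matches the commit's stated goal of avoiding unnecessary delays.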
Xin Wang
c2ef76620e Enhance workflow engine and frontend components with transition speech support
- Introduce edge transition speech functionality in the WorkflowEngine to provide optional speech during node transitions.
- Update pipeline execution to utilize the new transition speech feature, enhancing user experience by masking delays during transitions.
- Modify frontend components to support transition speech in edge specifications, allowing users to define and edit transition speech for edges.
- Refactor edge handling logic in the WorkflowEditor to accommodate the new transition speech field, improving workflow management capabilities.
2026-06-15 15:57:05 +08:00
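Note on c2ef76620e: a minimal sketch of what "optional speech on an edge" might look like, assuming edges carry a condition label and an optional text field. `Edge`, `WorkflowGraph`, `condition`, and `transition_speech` are hypothetical names chosen for illustration; only the `startCall` node name comes from the log itself, and the actual WorkflowEngine may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Edge:
    source: str
    target: str
    condition: str
    # Optional line to speak while moving to the target node (hypothetical field name).
    transition_speech: Optional[str] = None


@dataclass
class WorkflowGraph:
    edges: list[Edge] = field(default_factory=list)
    active_node: str = "startCall"

    def transition(self, condition: str) -> Optional[str]:
        """Follow the first edge matching the condition; return its speech, if any."""
        for edge in self.edges:
            if edge.source == self.active_node and edge.condition == condition:
                self.active_node = edge.target
                return edge.transition_speech
        raise ValueError(f"no edge from {self.active_node!r} for {condition!r}")
```

A caller would pass any returned text to TTS before the next node's prompt takes over, which is how a short spoken line could mask the delay of the transition.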
Xin Wang
09a5ffbdbc Update node specifications and enhance GenericNode component
- Change the 'addable' property of a specific node type to true, allowing for dynamic addition of nodes.
- Modify the GenericNode component to include a new icon and adjust styles for better visual representation.
- Update node handling logic to prevent deletion of 'startCall' nodes and improve node change handling in the workflow editor.
- Refactor layout and styling in the WorkflowEditor for a more polished user interface.
2026-06-15 15:49:58 +08:00
Xin Wang
aae0342a57 Enhance workflow engine and integration in backend and frontend
- Introduce a new WorkflowEngine class to manage workflow graphs, enabling dynamic node-based interactions.
- Update AssistantConfig to include a graph field for workflow definitions, allowing for flexible configuration.
- Modify pipeline execution to support workflow-driven dialogue, integrating node transitions and system prompts based on active nodes.
- Enhance frontend components to visualize active nodes and provide debugging capabilities, including highlighting the current node during interactions.
- Refactor existing components to accommodate new workflow functionalities and improve overall user experience.
2026-06-15 15:32:10 +08:00
Xin Wang
c2a39257ff Add workflow editor and node types support in frontend and backend
- Introduce a new workflow editor component for visualizing and managing workflows, allowing users to add nodes and define connections.
- Implement backend support for node types, including validation and constraints for workflow graphs.
- Add new API endpoints for retrieving node types and their specifications.
- Enhance the AssistantPage to integrate the workflow editor, enabling users to create and edit workflows directly.
- Update frontend components to support new workflow functionalities, including condition edges and generic nodes.
- Refactor existing code to accommodate the new workflow features and improve overall structure.
2026-06-15 10:12:41 +08:00
Xin Wang
0309c154b5 Implement StepFun Realtime service and enhance AssistantConfig
- Add new fields to AssistantConfig for realtime interface configuration, including types, values, and secrets.
- Introduce StepFunRealtimeService to handle speech-to-speech processing via WebSocket, integrating STT, LLM, and TTS functionalities.
- Refactor pipeline execution to support a new realtime mode, allowing direct text input processing and immediate responses.
- Update model resource testing to include validation for StepFun Realtime connections.
- Enhance service factory to create realtime services based on configuration settings.
- Modify README documentation to reflect new realtime capabilities and usage instructions.
2026-06-14 23:41:40 +08:00
Xin Wang
d55b87cfbf Enhance LLM text streaming and message handling in backend and frontend
- Introduce event handlers in PassthroughLLMAssistantAggregator for managing LLM text streaming, including start, delta, and end events.
- Implement a new method to finalize text streams, ensuring proper handling of interruptions.
- Update useVoicePreview to support new message types for LLM text streaming, allowing real-time updates to chat messages.
- Enhance message sorting logic to maintain order based on timestamps and sequence numbers, improving user experience during voice interactions.
2026-06-14 22:18:21 +08:00
Xin Wang
b749d2e075 Enhance text input processing and LLM interaction in the backend
- Refactor TextInputProcessor to handle immediate and silent text inputs, improving user experience during voice interactions.
- Introduce PassthroughLLMAssistantAggregator to manage LLM responses while preserving context for downstream TTS processing.
- Update event handling for text input and client readiness, ensuring timely updates to the conversation context.
- Modify run_pipeline to integrate new aggregators and streamline message handling, enhancing overall pipeline efficiency.
- Improve message ordering in useVoicePreview to ensure accurate display of chat messages based on timestamps.
2026-06-14 22:12:56 +08:00
Xin Wang
86d9acce78 Refactor microphone selection handling in voice components
- Rename `setSelectedDeviceId` to `selectDevice` in `DebugVoicePanel` and `VoiceSessionControls` for clarity and consistency.
- Update `useVoicePreview` hook to implement the `selectDevice` function, enabling dynamic microphone switching during voice sessions.
- Enhance device selection logic to support real-time audio track replacement without requiring session reconnection.
2026-06-14 21:02:03 +08:00
Xin Wang
90e3e8a0c0 Refactor backend to support interface-definition driven model resources
- Introduce a new model structure for managing interface definitions and model resources, enhancing the backend's capability to handle various service integrations.
- Update the Makefile to reflect changes in database seeding and resource management commands.
- Remove the deprecated credentials management routes and replace them with a unified model registry API.
- Modify existing routes and schemas to align with the new model structure, ensuring seamless integration with the frontend.
- Enhance database seeding scripts to populate new model resources and their configurations.
- Update README documentation to reflect the new architecture and usage instructions for model resources and interface definitions.
2026-06-14 19:36:12 +08:00
Xin Wang
e25dfd4003 Add support for Xfyun ASR and TTS services in the backend
- Introduce new Xfyun ASR and TTS services, enabling integration with iFlytek's voice recognition and synthesis capabilities.
- Update AssistantConfig model to include interface types for STT and TTS.
- Enhance credential testing to validate Xfyun credentials.
- Modify service factory to create Xfyun services based on configuration.
- Update README with new configuration details for Xfyun integration.
- Add new frontend components for visualizing audio streams and managing user interactions.
2026-06-11 10:51:08 +08:00
Xin Wang
c69dec04e0 Implement microphone selection feature in voice preview
- Add audio input selection to DebugVoicePanel, allowing users to choose their microphone device.
- Update useVoicePreview hook to manage available audio inputs and selected device state.
- Enhance device enumeration and selection handling to ensure a seamless user experience during voice interactions.
2026-06-10 15:26:33 +08:00
Xin Wang
2c2af1f2cd Enhance voice interaction and transcript handling in the assistant
- Add a new Docker configuration for the UI in launch.json to facilitate development.
- Refactor pipeline.py to integrate a TranscriptProcessor for managing user and assistant transcripts, including event handlers for real-time updates and message handling.
- Update useVoicePreview.ts to establish a data channel for sending and receiving text messages, improving interaction flow.
- Modify AssistantPage.tsx to support displaying chat messages and sending user input, enhancing the user experience during voice interactions.
- Revise DebugTranscriptPanel to dynamically render chat messages with timestamps, improving the visual representation of conversation history.
2026-06-10 15:11:34 +08:00
Xin Wang
b711350c0c Refactor frontend routing and component structure for improved navigation
- Update CLAUDE.md to reflect changes in the navigation model, emphasizing the use of App Router routes for sidebar sections.
- Refactor layout.tsx to wrap children in AppShell, enhancing the overall layout structure.
- Replace AppShell usage in page.tsx with HomePage component for better separation of concerns.
- Introduce new pages for assistants, components, dashboard, history, profile, and test, each rendering their respective components.
- Revise Sidebar component to utilize Next.js Link for navigation and improve active state handling based on the current pathname.
- Update AssistantPage to support routing-driven modes (list, choose, edit) and streamline form handling for assistant creation and editing.
2026-06-10 14:39:52 +08:00
Xin Wang
0adb3ed8a1 Add initial setup for local HTTPS debugging and Nginx configuration
- Introduce `setup-certs.sh` script for generating trusted local TLS certificates using mkcert.
- Add Nginx configuration files for local and Docker environments to handle HTTPS requests and proxy to backend services.
- Update `docker-compose.yaml` to include Nginx service for unified TLS entry and adjust frontend service ports for local development.
- Create `AGENTS.md` and `README.md` files to document the local HTTPS setup process and usage instructions.
- Modify backend startup commands in `README.md` for consistency with new requirements.
- Add `.gitignore` to exclude generated certificates from version control.
2026-06-10 13:37:24 +08:00
Xin Wang
e94d98e947 fix frontend voice preview fallback
2026-06-10 12:36:18 +08:00
Xin Wang
4a948ee609 fix backend TTS provider compatibility
2026-06-10 12:32:55 +08:00
Xin Wang
ac3f4dd806 Enhance voice interaction features and introduce voice preview functionality
- Update README to reflect the integration of the DebugVoicePanel with WebSocket support for voice interactions.
- Refactor voice_webrtc.py to improve error handling during WebRTC signaling and include assistant_id in the offer payload.
- Add useVoicePreview hook to manage microphone access and WebRTC connections for real-time voice previews.
- Modify AssistantPage to incorporate new visualizer options and pass assistantId to DebugVoicePanel, enhancing user experience during audio interactions.
- Update API model to include new fields for voice, speed, and language, supporting TTS and ASR configurations.
2026-06-10 10:17:46 +08:00
Xin Wang
c839779d87 Add color adaptation functions for theme-based palette adjustments
- Implement rgbToHsl and hslToRgb functions for color space conversions.
- Introduce adaptPalette function to adjust colors based on dark/light themes, enhancing visual consistency.
- Add isDarkTheme function to determine the current theme state, improving theme handling across visual components.
2026-06-10 09:32:53 +08:00
Xin Wang
9327cff364 Refactor SpectrumVisualizer for improved audio visualization and responsiveness
- Update SpectrumVisualizer component to enhance the visual representation of audio frequencies with a new layout and smoother animations.
- Modify prop descriptions for clarity and adjust the number of frequency bars for better performance.
- Implement a refined drawing logic that maintains visual consistency across different themes and improves the overall user experience during audio playback.
2026-06-10 09:32:27 +08:00
Xin Wang
df7ce493f1 Enhance audio visualizers with new NebulaVisualizer and refactor existing components
- Introduce the NebulaVisualizer component, featuring particles that respond to audio input, enhancing the visual experience.
- Refactor AuraVisualizer, SpectrumVisualizer, and WaveVisualizer to utilize the adaptPalette function for improved theme handling.
- Update visualizer logic to enhance responsiveness and visual effects based on audio analysis, ensuring a cohesive user experience across components.
2026-06-10 09:17:14 +08:00
Xin Wang
6e83396d64 Add AssistantIdentity component to AssistantPage for assistant ID display and copy functionality
- Introduce the AssistantIdentity component to show the current assistant ID and provide a copy-to-clipboard feature.
- Update multiple sections of AssistantPage to include the AssistantIdentity component, enhancing user interaction with assistant IDs.
- Ensure the component handles the display of the assistant ID and provides feedback when copied successfully.
2026-06-09 16:48:40 +08:00
Xin Wang
b3fbfac5df Implement audio visualizers and refactor AssistantPage
- Introduce three new audio visualizer components: AuraVisualizer, SpectrumVisualizer, and WaveVisualizer, enhancing the audio interaction experience.
- Replace the deprecated VoiceVisualizer with the new visualizers, ensuring a cohesive visual language across components.
- Update the AssistantPage to support dynamic visualization style switching, improving user engagement during audio interactions.
- Refactor DebugVoicePanel to accommodate the new visualizer props and enhance the overall debugging interface.
2026-06-09 16:28:45 +08:00
Xin Wang
4f0f639e8f Refactor AssistantPage and DebugDrawer components
- Remove debug mode state management from AssistantPage, simplifying the component structure.
- Update DebugDrawer to eliminate mode selection, focusing on voice interaction features.
- Enhance the VoiceVisualizer component with improved visual effects and responsiveness to audio input.
- Adjust styles and layout for better user experience in the debugging interface.
2026-06-09 15:28:24 +08:00
Xin Wang
c64b7dcf99 Enhance credential management and testing functionality
- Introduce new fields for voice, speed, and language in the AssistantConfig and ProviderCredential models to support TTS and ASR configurations.
- Update the database schema and seeding script to accommodate the new fields, ensuring backward compatibility.
- Implement credential testing endpoints and logic to validate OpenAI-compatible credentials, enhancing user experience and reliability.
- Modify frontend components to include new fields in the credential forms and improve connection testing feedback.
- Refactor related services and API interactions to support the new credential testing feature.
2026-06-09 14:42:25 +08:00
Xin Wang
3661dab81c reorder seed data
2026-06-09 13:38:45 +08:00
Xin Wang
be0da3449c Enhance OpenCode form handling in AssistantPage
- Introduce a new model field in the OpenCode form to manage language model selection.
- Refactor the form handling logic to improve data loading and error management for OpenCode assistants.
- Update UI components to utilize ResourceSelectField for model and voice configuration, enhancing user experience.
- Clear form fields when creating new OpenCode entries to ensure a fresh start for users.
2026-06-09 12:59:07 +08:00
Xin Wang
a8b6c09920 Refactor API key handling in AssistantPage and ComponentsModelsPage
- Update AssistantPage to use a stored value mask for API keys, improving security and user experience.
- Modify ComponentsModelsPage to display the current API key contextually, enhancing clarity for users.
- Adjust related components to ensure consistent handling of API key visibility and management.
2026-06-09 12:53:03 +08:00
Xin Wang
044411edc6 Enhance AssistantPage and ComponentsModelsPage with API key management improvements
- Update AssistantPage to handle API key input more securely by removing placeholder values and allowing empty submissions to retain existing keys.
- Introduce a new SecretInputField component for API key entry, improving user experience with visibility toggling and contextual hints.
- Modify ComponentsModelsPage to reflect similar API key handling, ensuring users can manage keys effectively while providing feedback on existing configurations.
- Add EditorBackButton for better navigation within the AssistantPage.
2026-06-09 12:49:18 +08:00
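Note on 044411edc6: the rule "an empty submission retains the existing key" reduces to a small merge step at save time. This is a sketch under assumptions. `resolve_api_key` is a hypothetical helper, and the log does not say whether the project applies this rule in the frontend, the backend, or both.

```python
from typing import Optional


def resolve_api_key(submitted: Optional[str], stored: Optional[str]) -> Optional[str]:
    """Return the key to persist.

    A missing or blank submission keeps the stored key; anything else replaces it.
    (Hypothetical helper, not taken from the repository.)
    """
    if submitted is None or submitted.strip() == "":
        return stored
    return submitted.strip()
```

With this rule the form never needs to echo the real key back to the browser: the field can start empty, and leaving it empty means "no change".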
Xin Wang
6acbac7d3b Update seed_credentials.sql with new model names and API keys
- Change 'DeepSeek-V3' to 'DeepSeek-Chat' and update its API key.
- Rename 'OpenAI TTS' to 'SiliconFlow-CosyVoice2-0.5B' and update its details.
- Add new models: 'SiliconFlow-TeleSpeechASR' and 'SiliconFlow-Qwen3-Embedding-4B' with corresponding API keys and configurations.
- Adjust existing entries to ensure consistency in the database seeding process.
2026-06-09 11:31:49 +08:00
Xin Wang
519cc0fefe Refactor assistant configuration and database seeding
- Update Makefile to include new database seed commands for assistants and credentials.
- Refactor assistant model to use explicit fields instead of a config dictionary, improving data integrity and clarity.
- Implement new seeding SQL script for assistants, ensuring dependencies on credentials are respected.
- Modify backend routes and frontend components to accommodate the new assistant structure, including direct field access for prompt, API URL, and keys.
- Enhance the AssistantPage component to handle the new data structure and streamline the save process for different assistant types.
2026-06-09 10:37:29 +08:00
Xin Wang
23e1cf5d42 Remove WorkflowPage and associated references from AppShell and Sidebar components
2026-06-09 10:27:47 +08:00
Xin Wang
30b96bb3be Add duplicate functionality for assistants and credentials
Implement server-side duplication for both assistants and credentials, allowing users to create copies with unique IDs and modified names. Update the respective API routes and frontend components to handle duplication requests, ensuring sensitive information is securely managed. Enhance the AssistantPage and ComponentsModelsPage to support this new feature, including loading and error handling for the duplication process.
2026-06-09 09:48:43 +08:00
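Note on 30b96bb3be: a rough sketch of server-side duplication, assuming records are plain dictionaries with `id`, `name`, and `api_key` keys. The function names, the field names, and the " (copy)" suffix are all hypothetical. Treating "securely managed" as "the secret is copied on the server and left out of the response" is one possible reading, not something the log confirms.

```python
import copy
import uuid

# Assumption: fields that must never be returned to the client.
SECRET_FIELDS = ("api_key",)


def duplicate_record(record: dict, suffix: str = " (copy)") -> dict:
    """Return a deep copy with a fresh unique ID and a modified name."""
    clone = copy.deepcopy(record)
    clone["id"] = uuid.uuid4().hex
    clone["name"] = f"{record['name']}{suffix}"
    return clone


def public_view(record: dict) -> dict:
    """Strip secret fields before sending a record back to the client."""
    return {k: v for k, v in record.items() if k not in SECRET_FIELDS}
```

Doing the copy on the server means the client only sends the ID of the record to duplicate, so the secret never has to leave the backend.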
Xin Wang
b444ea777c Implement knowledge base management and enhance assistant configuration
Add CRUD functionality for knowledge bases, including routes for listing, creating, updating, and deleting knowledge bases. Update the assistant model to include foreign key references to knowledge bases and modify the assistant configuration to handle external API keys securely. Refactor related services and routes to accommodate these changes, ensuring proper handling of credential resolution and configuration normalization.
2026-06-09 08:31:39 +08:00
Xin Wang
34fba494a3 Add duplicate assistant functionality to AssistantPage component
Implement a new feature that allows users to create a copy of an existing assistant. The duplicate is inserted directly after the original, with "副本" ("copy") appended to its name and its updated timestamp refreshed. Update the state management to handle the list of assistants accordingly, and add a dropdown menu item for triggering the duplication.
2026-06-09 08:31:19 +08:00
Xin Wang
7e8e8624b4 Enhance AI Video Assistant platform with new Makefile for development commands, update CORS origins for local access, and implement API client for credential management. Add seed data for model credentials and refactor ComponentsModelsPage to utilize API for dynamic data loading. Update Next.js configuration for Turbopack compatibility.
2026-06-08 22:39:45 +08:00
Xin Wang
42cab2a6ef Initial commit: AI Video Assistant fullstack platform.
Add pipecat-based backend with WebRTC/WS voice routes, Next.js frontend, and Docker Compose orchestration.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-08 13:51:28 +08:00