ai-video-fullstack

Author	SHA1	Message	Date
Xin Wang	809b634420	Enhance AssistantConfig and pipeline for FastGPT integration - Add new fields in AssistantConfig for FastGPT connection details, including `fastgpt_api_url`, `fastgpt_api_key`, and `fastgpt_app_id`. - Update the pipeline to utilize the new FastGPT configuration, ensuring proper integration with external services. - Introduce type handling for different assistant types, including support for realtime modes and external brain management. - Refactor frontend components to include hints for FastGPT configuration inputs, improving user guidance during setup.	2026-06-16 16:55:51 +08:00
Xin Wang	43cc188d85	Add audio output silence configuration to prevent abrupt call termination - Introduce a new parameter `audio_out_end_silence_secs` in the `_base_params` function to control the duration of silence added after the end frame, allowing for smoother call termination. - Set the default value to 0 to ensure immediate hang-up after the end speech, enhancing user experience during call endings.	2026-06-16 09:39:16 +08:00
Xin Wang	b22a9e1045	Enhance pipeline execution and voice preview handling for graceful call termination - Introduce mechanisms in the pipeline to ensure that the end call process waits for the completion of the end speech before hanging up, improving user experience during call termination. - Update the useVoicePreview hook to handle server-initiated call endings gracefully, distinguishing between normal and error disconnections. - Adjust TTS stop frame timeout settings to optimize the timing of call terminations, ensuring timely responses without unnecessary delays. - Refactor related components to support the new end call logic, enhancing overall workflow management and user interaction.	2026-06-16 09:24:24 +08:00
Xin Wang	c2ef76620e	Enhance workflow engine and frontend components with transition speech support - Introduce edge transition speech functionality in the WorkflowEngine to provide optional speech during node transitions. - Update pipeline execution to utilize the new transition speech feature, enhancing user experience by masking delays during transitions. - Modify frontend components to support transition speech in edge specifications, allowing users to define and edit transition speech for edges. - Refactor edge handling logic in the WorkflowEditor to accommodate the new transition speech field, improving workflow management capabilities.	2026-06-15 15:57:05 +08:00
Xin Wang	09a5ffbdbc	Update node specifications and enhance GenericNode component - Change the 'addable' property of a specific node type to true, allowing for dynamic addition of nodes. - Modify the GenericNode component to include a new icon and adjust styles for better visual representation. - Update node handling logic to prevent deletion of 'startCall' nodes and improve node change handling in the workflow editor. - Refactor layout and styling in the WorkflowEditor for a more polished user interface.	2026-06-15 15:49:58 +08:00
Xin Wang	aae0342a57	Enhance workflow engine and integration in backend and frontend - Introduce a new WorkflowEngine class to manage workflow graphs, enabling dynamic node-based interactions. - Update AssistantConfig to include a graph field for workflow definitions, allowing for flexible configuration. - Modify pipeline execution to support workflow-driven dialogue, integrating node transitions and system prompts based on active nodes. - Enhance frontend components to visualize active nodes and provide debugging capabilities, including highlighting the current node during interactions. - Refactor existing components to accommodate new workflow functionalities and improve overall user experience.	2026-06-15 15:32:10 +08:00
Xin Wang	c2a39257ff	Add workflow editor and node types support in frontend and backend - Introduce a new workflow editor component for visualizing and managing workflows, allowing users to add nodes and define connections. - Implement backend support for node types, including validation and constraints for workflow graphs. - Add new API endpoints for retrieving node types and their specifications. - Enhance the AssistantPage to integrate the workflow editor, enabling users to create and edit workflows directly. - Update frontend components to support new workflow functionalities, including condition edges and generic nodes. - Refactor existing code to accommodate the new workflow features and improve overall structure.	2026-06-15 10:12:41 +08:00
Xin Wang	0309c154b5	Implement StepFun Realtime service and enhance AssistantConfig - Add new fields to AssistantConfig for realtime interface configuration, including types, values, and secrets. - Introduce StepFunRealtimeService to handle speech-to-speech processing via WebSocket, integrating STT, LLM, and TTS functionalities. - Refactor pipeline execution to support a new realtime mode, allowing direct text input processing and immediate responses. - Update model resource testing to include validation for StepFun Realtime connections. - Enhance service factory to create realtime services based on configuration settings. - Modify README documentation to reflect new realtime capabilities and usage instructions.	2026-06-14 23:41:40 +08:00
Xin Wang	d55b87cfbf	Enhance LLM text streaming and message handling in backend and frontend - Introduce event handlers in PassthroughLLMAssistantAggregator for managing LLM text streaming, including start, delta, and end events. - Implement a new method to finalize text streams, ensuring proper handling of interruptions. - Update useVoicePreview to support new message types for LLM text streaming, allowing real-time updates to chat messages. - Enhance message sorting logic to maintain order based on timestamps and sequence numbers, improving user experience during voice interactions.	2026-06-14 22:18:21 +08:00
Xin Wang	b749d2e075	Enhance text input processing and LLM interaction in the backend - Refactor TextInputProcessor to handle immediate and silent text inputs, improving user experience during voice interactions. - Introduce PassthroughLLMAssistantAggregator to manage LLM responses while preserving context for downstream TTS processing. - Update event handling for text input and client readiness, ensuring timely updates to the conversation context. - Modify run_pipeline to integrate new aggregators and streamline message handling, enhancing overall pipeline efficiency. - Improve message ordering in useVoicePreview to ensure accurate display of chat messages based on timestamps.	2026-06-14 22:12:56 +08:00
Xin Wang	86d9acce78	Refactor microphone selection handling in voice components - Rename `setSelectedDeviceId` to `selectDevice` in `DebugVoicePanel` and `VoiceSessionControls` for clarity and consistency. - Update `useVoicePreview` hook to implement the `selectDevice` function, enabling dynamic microphone switching during voice sessions. - Enhance device selection logic to support real-time audio track replacement without requiring session reconnection.	2026-06-14 21:02:03 +08:00
Xin Wang	90e3e8a0c0	Refactor backend to support interface-definition driven model resources - Introduce a new model structure for managing interface definitions and model resources, enhancing the backend's capability to handle various service integrations. - Update the Makefile to reflect changes in database seeding and resource management commands. - Remove the deprecated credentials management routes and replace them with a unified model registry API. - Modify existing routes and schemas to align with the new model structure, ensuring seamless integration with the frontend. - Enhance database seeding scripts to populate new model resources and their configurations. - Update README documentation to reflect the new architecture and usage instructions for model resources and interface definitions.	2026-06-14 19:36:12 +08:00
Xin Wang	e25dfd4003	Add support for Xfyun ASR and TTS services in the backend - Introduce new Xfyun ASR and TTS services, enabling integration with iFlytek's voice recognition and synthesis capabilities. - Update AssistantConfig model to include interface types for STT and TTS. - Enhance credential testing to validate Xfyun credentials. - Modify service factory to create Xfyun services based on configuration. - Update README with new configuration details for Xfyun integration. - Add new frontend components for visualizing audio streams and managing user interactions.	2026-06-11 10:51:08 +08:00
Xin Wang	c69dec04e0	Implement microphone selection feature in voice preview - Add audio input selection to DebugVoicePanel, allowing users to choose their microphone device. - Update useVoicePreview hook to manage available audio inputs and selected device state. - Enhance device enumeration and selection handling to ensure a seamless user experience during voice interactions.	2026-06-10 15:26:33 +08:00
Xin Wang	2c2af1f2cd	Enhance voice interaction and transcript handling in the assistant - Add a new Docker configuration for the UI in launch.json to facilitate development. - Refactor pipeline.py to integrate a TranscriptProcessor for managing user and assistant transcripts, including event handlers for real-time updates and message handling. - Update useVoicePreview.ts to establish a data channel for sending and receiving text messages, improving interaction flow. - Modify AssistantPage.tsx to support displaying chat messages and sending user input, enhancing the user experience during voice interactions. - Revise DebugTranscriptPanel to dynamically render chat messages with timestamps, improving the visual representation of conversation history.	2026-06-10 15:11:34 +08:00
Xin Wang	b711350c0c	Refactor frontend routing and component structure for improved navigation - Update CLAUDE.md to reflect changes in the navigation model, emphasizing the use of App Router routes for sidebar sections. - Refactor layout.tsx to wrap children in AppShell, enhancing the overall layout structure. - Replace AppShell usage in page.tsx with HomePage component for better separation of concerns. - Introduce new pages for assistants, components, dashboard, history, profile, and test, each rendering their respective components. - Revise Sidebar component to utilize Next.js Link for navigation and improve active state handling based on the current pathname. - Update AssistantPage to support routing-driven modes (list, choose, edit) and streamline form handling for assistant creation and editing.	2026-06-10 14:39:52 +08:00
Xin Wang	0adb3ed8a1	Add initial setup for local HTTPS debugging and Nginx configuration - Introduce `setup-certs.sh` script for generating trusted local TLS certificates using mkcert. - Add Nginx configuration files for local and Docker environments to handle HTTPS requests and proxy to backend services. - Update `docker-compose.yaml` to include Nginx service for unified TLS entry and adjust frontend service ports for local development. - Create `AGENTS.md` and `README.md` files to document the local HTTPS setup process and usage instructions. - Modify backend startup commands in `README.md` for consistency with new requirements. - Add `.gitignore` to exclude generated certificates from version control.	2026-06-10 13:37:24 +08:00
Xin Wang	e94d98e947	fix frontend voice preview fallback	2026-06-10 12:36:18 +08:00
Xin Wang	4a948ee609	fix backend TTS provider compatibility	2026-06-10 12:32:55 +08:00
Xin Wang	ac3f4dd806	Enhance voice interaction features and introduce voice preview functionality - Update README to reflect the integration of the DebugVoicePanel with WebSocket support for voice interactions. - Refactor voice_webrtc.py to improve error handling during WebRTC signaling and include assistant_id in the offer payload. - Add useVoicePreview hook to manage microphone access and WebRTC connections for real-time voice previews. - Modify AssistantPage to incorporate new visualizer options and pass assistantId to DebugVoicePanel, enhancing user experience during audio interactions. - Update API model to include new fields for voice, speed, and language, supporting TTS and ASR configurations.	2026-06-10 10:17:46 +08:00
Xin Wang	c839779d87	Add color adaptation functions for theme-based palette adjustments - Implement rgbToHsl and hslToRgb functions for color space conversions. - Introduce adaptPalette function to adjust colors based on dark/light themes, enhancing visual consistency. - Add isDarkTheme function to determine the current theme state, improving theme handling across visual components.	2026-06-10 09:32:53 +08:00
Xin Wang	9327cff364	Refactor SpectrumVisualizer for improved audio visualization and responsiveness - Update SpectrumVisualizer component to enhance the visual representation of audio frequencies with a new layout and smoother animations. - Modify prop descriptions for clarity and adjust the number of frequency bars for better performance. - Implement a refined drawing logic that maintains visual consistency across different themes and improves the overall user experience during audio playback.	2026-06-10 09:32:27 +08:00
Xin Wang	df7ce493f1	Enhance audio visualizers with new NebulaVisualizer and refactor existing components - Introduce the NebulaVisualizer component, featuring particles that respond to audio input, enhancing the visual experience. - Refactor AuraVisualizer, SpectrumVisualizer, and WaveVisualizer to utilize the adaptPalette function for improved theme handling. - Update visualizer logic to enhance responsiveness and visual effects based on audio analysis, ensuring a cohesive user experience across components.	2026-06-10 09:17:14 +08:00
Xin Wang	6e83396d64	Add AssistantIdentity component to AssistantPage for assistant ID display and copy functionality - Introduce the AssistantIdentity component to show the current assistant ID and provide a copy-to-clipboard feature. - Update multiple sections of AssistantPage to include the AssistantIdentity component, enhancing user interaction with assistant IDs. - Ensure the component handles the display of the assistant ID and provides feedback when copied successfully.	2026-06-09 16:48:40 +08:00
Xin Wang	b3fbfac5df	Implement audio visualizers and refactor AssistantPage - Introduce three new audio visualizer components: AuraVisualizer, SpectrumVisualizer, and WaveVisualizer, enhancing the audio interaction experience. - Replace the deprecated VoiceVisualizer with the new visualizers, ensuring a cohesive visual language across components. - Update the AssistantPage to support dynamic visualization style switching, improving user engagement during audio interactions. - Refactor DebugVoicePanel to accommodate the new visualizer props and enhance the overall debugging interface.	2026-06-09 16:28:45 +08:00
Xin Wang	4f0f639e8f	Refactor AssistantPage and DebugDrawer components - Remove debug mode state management from AssistantPage, simplifying the component structure. - Update DebugDrawer to eliminate mode selection, focusing on voice interaction features. - Enhance the VoiceVisualizer component with improved visual effects and responsiveness to audio input. - Adjust styles and layout for better user experience in the debugging interface.	2026-06-09 15:28:24 +08:00
Xin Wang	c64b7dcf99	Enhance credential management and testing functionality - Introduce new fields for voice, speed, and language in the AssistantConfig and ProviderCredential models to support TTS and ASR configurations. - Update the database schema and seeding script to accommodate the new fields, ensuring backward compatibility. - Implement credential testing endpoints and logic to validate OpenAI-compatible credentials, enhancing user experience and reliability. - Modify frontend components to include new fields in the credential forms and improve connection testing feedback. - Refactor related services and API interactions to support the new credential testing feature.	2026-06-09 14:42:25 +08:00
Xin Wang	3661dab81c	reorder seed data	2026-06-09 13:38:45 +08:00
Xin Wang	be0da3449c	Enhance OpenCode form handling in AssistantPage - Introduce a new model field in the OpenCode form to manage language model selection. - Refactor the form handling logic to improve data loading and error management for OpenCode assistants. - Update UI components to utilize ResourceSelectField for model and voice configuration, enhancing user experience. - Clear form fields when creating new OpenCode entries to ensure a fresh start for users.	2026-06-09 12:59:07 +08:00
Xin Wang	a8b6c09920	Refactor API key handling in AssistantPage and ComponentsModelsPage - Update AssistantPage to use a stored value mask for API keys, improving security and user experience. - Modify ComponentsModelsPage to display the current API key contextually, enhancing clarity for users. - Adjust related components to ensure consistent handling of API key visibility and management.	2026-06-09 12:53:03 +08:00
Xin Wang	044411edc6	Enhance AssistantPage and ComponentsModelsPage with API key management improvements - Update AssistantPage to handle API key input more securely by removing placeholder values and allowing empty submissions to retain existing keys. - Introduce a new SecretInputField component for API key entry, improving user experience with visibility toggling and contextual hints. - Modify ComponentsModelsPage to reflect similar API key handling, ensuring users can manage keys effectively while providing feedback on existing configurations. - Add EditorBackButton for better navigation within the AssistantPage.	2026-06-09 12:49:18 +08:00
Xin Wang	6acbac7d3b	Update seed_credentials.sql with new model names and API keys - Change 'DeepSeek-V3' to 'DeepSeek-Chat' and update its API key. - Rename 'OpenAI TTS' to 'SiliconFlow-CosyVoice2-0.5B' and update its details. - Add new models: 'SiliconFlow-TeleSpeechASR' and 'SiliconFlow-Qwen3-Embedding-4B' with corresponding API keys and configurations. - Adjust existing entries to ensure consistency in the database seeding process.	2026-06-09 11:31:49 +08:00
Xin Wang	519cc0fefe	Refactor assistant configuration and database seeding - Update Makefile to include new database seed commands for assistants and credentials. - Refactor assistant model to use explicit fields instead of a config dictionary, improving data integrity and clarity. - Implement new seeding SQL script for assistants, ensuring dependencies on credentials are respected. - Modify backend routes and frontend components to accommodate the new assistant structure, including direct field access for prompt, API URL, and keys. - Enhance the AssistantPage component to handle the new data structure and streamline the save process for different assistant types.	2026-06-09 10:37:29 +08:00
Xin Wang	23e1cf5d42	Remove WorkflowPage and associated references from AppShell and Sidebar components	2026-06-09 10:27:47 +08:00
Xin Wang	30b96bb3be	Add duplicate functionality for assistants and credentials Implement server-side duplication for both assistants and credentials, allowing users to create copies with unique IDs and modified names. Update the respective API routes and frontend components to handle duplication requests, ensuring sensitive information is securely managed. Enhance the AssistantPage and ComponentsModelsPage to support this new feature, including loading and error handling for the duplication process.	2026-06-09 09:48:43 +08:00
Xin Wang	b444ea777c	Implement knowledge base management and enhance assistant configuration Add CRUD functionality for knowledge bases, including routes for listing, creating, updating, and deleting knowledge bases. Update the assistant model to include foreign key references to knowledge bases and modify the assistant configuration to handle external API keys securely. Refactor related services and routes to accommodate these changes, ensuring proper handling of credential resolution and configuration normalization.	2026-06-09 08:31:39 +08:00
Xin Wang	34fba494a3	Add duplicate assistant functionality to AssistantPage component Implement a new feature that allows users to create a copy of an existing assistant. The duplicate is inserted after the original, with the name suffixed by "副本" ("copy") and an updated timestamp. Update the state management to handle the list of assistants accordingly, and add a dropdown menu item for triggering the duplication.	2026-06-09 08:31:19 +08:00
Xin Wang	7e8e8624b4	Enhance AI Video Assistant platform with new Makefile for development commands, update CORS origins for local access, and implement API client for credential management. Add seed data for model credentials and refactor ComponentsModelsPage to utilize API for dynamic data loading. Update Next.js configuration for Turbopack compatibility.	2026-06-08 22:39:45 +08:00
Xin Wang	42cab2a6ef	Initial commit: AI Video Assistant fullstack platform. Add pipecat-based backend with WebRTC/WS voice routes, Next.js frontend, and Docker Compose orchestration. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-08 13:51:28 +08:00

39 Commits