ai-video-fullstack

Author	SHA1	Message	Date
Xin Wang	809b634420	Enhance AssistantConfig and pipeline for FastGPT integration - Add new fields in AssistantConfig for FastGPT connection details, including `fastgpt_api_url`, `fastgpt_api_key`, and `fastgpt_app_id`. - Update the pipeline to utilize the new FastGPT configuration, ensuring proper integration with external services. - Introduce type handling for different assistant types, including support for realtime modes and external brain management. - Refactor frontend components to include hints for FastGPT configuration inputs, improving user guidance during setup.	2026-06-16 16:55:51 +08:00
Xin Wang	43cc188d85	Add audio output silence configuration to prevent abrupt call termination - Introduce a new parameter `audio_out_end_silence_secs` in the `_base_params` function to control the duration of silence added after the end frame, allowing for smoother call termination. - Set the default value to 0 to ensure immediate hang-up after the end speech, enhancing user experience during call endings.	2026-06-16 09:39:16 +08:00
Xin Wang	b22a9e1045	Enhance pipeline execution and voice preview handling for graceful call termination - Introduce mechanisms in the pipeline to ensure that the end call process waits for the completion of the end speech before hanging up, improving user experience during call termination. - Update the useVoicePreview hook to handle server-initiated call endings gracefully, distinguishing between normal and error disconnections. - Adjust TTS stop frame timeout settings to optimize the timing of call terminations, ensuring timely responses without unnecessary delays. - Refactor related components to support the new end call logic, enhancing overall workflow management and user interaction.	2026-06-16 09:24:24 +08:00
Xin Wang	c2ef76620e	Enhance workflow engine and frontend components with transition speech support - Introduce edge transition speech functionality in the WorkflowEngine to provide optional speech during node transitions. - Update pipeline execution to utilize the new transition speech feature, enhancing user experience by masking delays during transitions. - Modify frontend components to support transition speech in edge specifications, allowing users to define and edit transition speech for edges. - Refactor edge handling logic in the WorkflowEditor to accommodate the new transition speech field, improving workflow management capabilities.	2026-06-15 15:57:05 +08:00
Xin Wang	09a5ffbdbc	Update node specifications and enhance GenericNode component - Change the 'addable' property of a specific node type to true, allowing for dynamic addition of nodes. - Modify the GenericNode component to include a new icon and adjust styles for better visual representation. - Update node handling logic to prevent deletion of 'startCall' nodes and improve node change handling in the workflow editor. - Refactor layout and styling in the WorkflowEditor for a more polished user interface.	2026-06-15 15:49:58 +08:00
Xin Wang	aae0342a57	Enhance workflow engine and integration in backend and frontend - Introduce a new WorkflowEngine class to manage workflow graphs, enabling dynamic node-based interactions. - Update AssistantConfig to include a graph field for workflow definitions, allowing for flexible configuration. - Modify pipeline execution to support workflow-driven dialogue, integrating node transitions and system prompts based on active nodes. - Enhance frontend components to visualize active nodes and provide debugging capabilities, including highlighting the current node during interactions. - Refactor existing components to accommodate new workflow functionalities and improve overall user experience.	2026-06-15 15:32:10 +08:00
Xin Wang	c2a39257ff	Add workflow editor and node types support in frontend and backend - Introduce a new workflow editor component for visualizing and managing workflows, allowing users to add nodes and define connections. - Implement backend support for node types, including validation and constraints for workflow graphs. - Add new API endpoints for retrieving node types and their specifications. - Enhance the AssistantPage to integrate the workflow editor, enabling users to create and edit workflows directly. - Update frontend components to support new workflow functionalities, including condition edges and generic nodes. - Refactor existing code to accommodate the new workflow features and improve overall structure.	2026-06-15 10:12:41 +08:00
Xin Wang	0309c154b5	Implement StepFun Realtime service and enhance AssistantConfig - Add new fields to AssistantConfig for realtime interface configuration, including types, values, and secrets. - Introduce StepFunRealtimeService to handle speech-to-speech processing via WebSocket, integrating STT, LLM, and TTS functionalities. - Refactor pipeline execution to support a new realtime mode, allowing direct text input processing and immediate responses. - Update model resource testing to include validation for StepFun Realtime connections. - Enhance service factory to create realtime services based on configuration settings. - Modify README documentation to reflect new realtime capabilities and usage instructions.	2026-06-14 23:41:40 +08:00
Xin Wang	d55b87cfbf	Enhance LLM text streaming and message handling in backend and frontend - Introduce event handlers in PassthroughLLMAssistantAggregator for managing LLM text streaming, including start, delta, and end events. - Implement a new method to finalize text streams, ensuring proper handling of interruptions. - Update useVoicePreview to support new message types for LLM text streaming, allowing real-time updates to chat messages. - Enhance message sorting logic to maintain order based on timestamps and sequence numbers, improving user experience during voice interactions.	2026-06-14 22:18:21 +08:00
Xin Wang	b749d2e075	Enhance text input processing and LLM interaction in the backend - Refactor TextInputProcessor to handle immediate and silent text inputs, improving user experience during voice interactions. - Introduce PassthroughLLMAssistantAggregator to manage LLM responses while preserving context for downstream TTS processing. - Update event handling for text input and client readiness, ensuring timely updates to the conversation context. - Modify run_pipeline to integrate new aggregators and streamline message handling, enhancing overall pipeline efficiency. - Improve message ordering in useVoicePreview to ensure accurate display of chat messages based on timestamps.	2026-06-14 22:12:56 +08:00
Xin Wang	90e3e8a0c0	Refactor backend to support interface-definition driven model resources - Introduce a new model structure for managing interface definitions and model resources, enhancing the backend's capability to handle various service integrations. - Update the Makefile to reflect changes in database seeding and resource management commands. - Remove the deprecated credentials management routes and replace them with a unified model registry API. - Modify existing routes and schemas to align with the new model structure, ensuring seamless integration with the frontend. - Enhance database seeding scripts to populate new model resources and their configurations. - Update README documentation to reflect the new architecture and usage instructions for model resources and interface definitions.	2026-06-14 19:36:12 +08:00
Xin Wang	e25dfd4003	Add support for Xfyun ASR and TTS services in the backend - Introduce new Xfyun ASR and TTS services, enabling integration with iFlytek's voice recognition and synthesis capabilities. - Update AssistantConfig model to include interface types for STT and TTS. - Enhance credential testing to validate Xfyun credentials. - Modify service factory to create Xfyun services based on configuration. - Update README with new configuration details for Xfyun integration. - Add new frontend components for visualizing audio streams and managing user interactions.	2026-06-11 10:51:08 +08:00
Xin Wang	2c2af1f2cd	Enhance voice interaction and transcript handling in the assistant - Add a new Docker configuration for the UI in launch.json to facilitate development. - Refactor pipeline.py to integrate a TranscriptProcessor for managing user and assistant transcripts, including event handlers for real-time updates and message handling. - Update useVoicePreview.ts to establish a data channel for sending and receiving text messages, improving interaction flow. - Modify AssistantPage.tsx to support displaying chat messages and sending user input, enhancing the user experience during voice interactions. - Revise DebugTranscriptPanel to dynamically render chat messages with timestamps, improving the visual representation of conversation history.	2026-06-10 15:11:34 +08:00
Xin Wang	0adb3ed8a1	Add initial setup for local HTTPS debugging and Nginx configuration - Introduce `setup-certs.sh` script for generating trusted local TLS certificates using mkcert. - Add Nginx configuration files for local and Docker environments to handle HTTPS requests and proxy to backend services. - Update `docker-compose.yaml` to include Nginx service for unified TLS entry and adjust frontend service ports for local development. - Create `AGENTS.md` and `README.md` files to document the local HTTPS setup process and usage instructions. - Modify backend startup commands in `README.md` for consistency with new requirements. - Add `.gitignore` to exclude generated certificates from version control.	2026-06-10 13:37:24 +08:00
Xin Wang	e94d98e947	fix frontend voice preview fallback	2026-06-10 12:36:18 +08:00
Xin Wang	4a948ee609	fix backend TTS provider compatibility	2026-06-10 12:32:55 +08:00
Xin Wang	ac3f4dd806	Enhance voice interaction features and introduce voice preview functionality - Update README to reflect the integration of the DebugVoicePanel with WebSocket support for voice interactions. - Refactor voice_webrtc.py to improve error handling during WebRTC signaling and include assistant_id in the offer payload. - Add useVoicePreview hook to manage microphone access and WebRTC connections for real-time voice previews. - Modify AssistantPage to incorporate new visualizer options and pass assistantId to DebugVoicePanel, enhancing user experience during audio interactions. - Update API model to include new fields for voice, speed, and language, supporting TTS and ASR configurations.	2026-06-10 10:17:46 +08:00
Xin Wang	c64b7dcf99	Enhance credential management and testing functionality - Introduce new fields for voice, speed, and language in the AssistantConfig and ProviderCredential models to support TTS and ASR configurations. - Update the database schema and seeding script to accommodate the new fields, ensuring backward compatibility. - Implement credential testing endpoints and logic to validate OpenAI-compatible credentials, enhancing user experience and reliability. - Modify frontend components to include new fields in the credential forms and improve connection testing feedback. - Refactor related services and API interactions to support the new credential testing feature.	2026-06-09 14:42:25 +08:00
Xin Wang	3661dab81c	reorder seed data	2026-06-09 13:38:45 +08:00
Xin Wang	6acbac7d3b	Update seed_credentials.sql with new model names and API keys - Change 'DeepSeek-V3' to 'DeepSeek-Chat' and update its API key. - Rename 'OpenAI TTS' to 'SiliconFlow-CosyVoice2-0.5B' and update its details. - Add new models: 'SiliconFlow-TeleSpeechASR' and 'SiliconFlow-Qwen3-Embedding-4B' with corresponding API keys and configurations. - Adjust existing entries to ensure consistency in the database seeding process.	2026-06-09 11:31:49 +08:00
Xin Wang	519cc0fefe	Refactor assistant configuration and database seeding - Update Makefile to include new database seed commands for assistants and credentials. - Refactor assistant model to use explicit fields instead of a config dictionary, improving data integrity and clarity. - Implement new seeding SQL script for assistants, ensuring dependencies on credentials are respected. - Modify backend routes and frontend components to accommodate the new assistant structure, including direct field access for prompt, API URL, and keys. - Enhance the AssistantPage component to handle the new data structure and streamline the save process for different assistant types.	2026-06-09 10:37:29 +08:00
Xin Wang	30b96bb3be	Add duplicate functionality for assistants and credentials Implement server-side duplication for both assistants and credentials, allowing users to create copies with unique IDs and modified names. Update the respective API routes and frontend components to handle duplication requests, ensuring sensitive information is securely managed. Enhance the AssistantPage and ComponentsModelsPage to support this new feature, including loading and error handling for the duplication process.	2026-06-09 09:48:43 +08:00
Xin Wang	b444ea777c	Implement knowledge base management and enhance assistant configuration Add CRUD functionality for knowledge bases, including routes for listing, creating, updating, and deleting knowledge bases. Update the assistant model to include foreign key references to knowledge bases and modify the assistant configuration to handle external API keys securely. Refactor related services and routes to accommodate these changes, ensuring proper handling of credential resolution and configuration normalization.	2026-06-09 08:31:39 +08:00
Xin Wang	7e8e8624b4	Enhance AI Video Assistant platform with new Makefile for development commands, update CORS origins for local access, and implement API client for credential management. Add seed data for model credentials and refactor ComponentsModelsPage to utilize API for dynamic data loading. Update Next.js configuration for Turbopack compatibility.	2026-06-08 22:39:45 +08:00
Xin Wang	42cab2a6ef	Initial commit: AI Video Assistant fullstack platform. Add pipecat-based backend with WebRTC/WS voice routes, Next.js frontend, and Docker Compose orchestration. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-08 13:51:28 +08:00

25 Commits