	WebSocket Chat Server Documentation
Overview
The WebSocket Chat Server is an intelligent conversation system that provides real-time AI chat capabilities with advanced turn detection, session management, and client-aware interactions. The server supports multiple concurrent clients with individual session tracking and automatic turn completion detection.
Features
- 🔄 Real-time WebSocket communication
- 🧠 AI-powered responses via FastGPT API
- 🎯 Intelligent turn detection using ONNX models
- 📱 Multi-client support with session isolation
- ⏱️ Automatic timeout handling with buffering
- 🔗 Session persistence across reconnections
- 🎨 Professional logging with client tracking
- 🌐 Welcome message system for new sessions
Server Configuration
Environment Variables
Create a .env file in the project root:
# Turn Detection Settings
MAX_INCOMPLETE_SENTENCES=3
MAX_RESPONSE_TIMEOUT=5
# FastGPT API Configuration
CHAT_MODEL_API_URL=http://101.89.151.141:3000/
CHAT_MODEL_API_KEY=your_fastgpt_api_key_here
CHAT_MODEL_APP_ID=your_fastgpt_app_id_here
Default Values
| Variable | Default | Description | 
|---|---|---|
| MAX_INCOMPLETE_SENTENCES | 3 | Maximum buffered sentences before forcing completion | 
| MAX_RESPONSE_TIMEOUT | 5 | Seconds of silence before processing buffered input | 
| CHAT_MODEL_API_URL | None | FastGPT API endpoint URL | 
| CHAT_MODEL_API_KEY | None | FastGPT API authentication key | 
| CHAT_MODEL_APP_ID | None | FastGPT application ID | 
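As an illustration of how these variables might be read at startup, here is a minimal sketch assuming python-dotenv loads the .env file (the server's actual configuration code may differ):

import os
from dotenv import load_dotenv  # assumption: python-dotenv is installed

load_dotenv()  # reads .env from the project root

MAX_INCOMPLETE_SENTENCES = int(os.getenv("MAX_INCOMPLETE_SENTENCES", "3"))
MAX_RESPONSE_TIMEOUT = float(os.getenv("MAX_RESPONSE_TIMEOUT", "5"))

# No defaults for the FastGPT settings; fail fast if they are missing
CHAT_MODEL_API_URL = os.getenv("CHAT_MODEL_API_URL")
CHAT_MODEL_API_KEY = os.getenv("CHAT_MODEL_API_KEY")
CHAT_MODEL_APP_ID = os.getenv("CHAT_MODEL_APP_ID")
if not all((CHAT_MODEL_API_URL, CHAT_MODEL_API_KEY, CHAT_MODEL_APP_ID)):
    raise RuntimeError("CHAT_MODEL_API_URL, CHAT_MODEL_API_KEY and CHAT_MODEL_APP_ID must be set")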
WebSocket Connection
Connection URL Format
ws://localhost:9000?clientId=YOUR_CLIENT_ID
Connection Parameters
| Parameter | Required | Description | 
|---|---|---|
| clientId | Yes | Unique identifier for the client session | 
Connection Example
const ws = new WebSocket('ws://localhost:9000?clientId=user123');
Message Protocol
Message Format
All messages use JSON format with the following structure:
{
  "type": "MESSAGE_TYPE",
  "payload": {
    // Message-specific data
  }
}
Client to Server Messages
USER_INPUT
Sends user text input to the server.
{
  "type": "USER_INPUT",
  "payload": {
    "text": "Hello, how are you?",
    "client_id": "user123"  // Optional, will use URL clientId if not provided
  }
}
Fields:
- text (string, required): The user's input text
- client_id (string, optional): Client identifier (overrides URL parameter)
Server to Client Messages
AI_RESPONSE
AI-generated response to user input.
{
  "type": "AI_RESPONSE",
  "payload": {
    "text": "Hello! I'm doing well, thank you for asking. How can I help you today?",
    "client_id": "user123",
    "estimated_tts_duration": 3.2
  }
}
Fields:
- text (string): AI response content
- client_id (string): Client identifier
- estimated_tts_duration (float): Estimated text-to-speech duration in seconds
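The documentation does not specify how estimated_tts_duration is derived; the speaking-rate heuristic below is purely an illustration (the words_per_second value and the helper itself are assumptions, not the server's actual formula):

def estimate_tts_duration(text: str, words_per_second: float = 2.5) -> float:
    """Rough estimate of speech length in seconds; illustrative only."""
    word_count = len(text.split())
    return round(word_count / max(words_per_second, 0.1), 1)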
ERROR
Error notification from server.
{
  "type": "ERROR",
  "payload": {
    "message": "Error description",
    "client_id": "user123"
  }
}
Fields:
- message (string): Error description
- client_id (string): Client identifier
Session Management
Session Lifecycle
1. Connection: Client connects with a unique clientId
2. Session Creation: A new session is created or an existing session is reused
3. Welcome Message: New sessions automatically receive a welcome message
4. Interaction: Real-time message exchange
5. Disconnection: Session data is preserved for reconnection
Session Data Structure
class SessionData:
    client_id: str                    # Unique client identifier
    incomplete_sentences: List[str]   # Buffered user input
    conversation_history: List[ChatMessage]  # Full conversation history
    last_input_time: float           # Timestamp of last user input
    timeout_task: Optional[Task]     # Current timeout task
    ai_response_playback_ends_at: Optional[float]  # AI response end time
Session Persistence
- Sessions persist across WebSocket disconnections
- Reconnection with the same clientId resumes the existing session (see the sketch below)
- Conversation history maintained throughout session lifetime
- Timeout tasks properly managed during reconnections
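A minimal sketch of the get-or-create behaviour described above, reusing the SessionData structure from the previous subsection (the in-memory sessions dict and the helper name are illustrative):

import time
from typing import Dict

sessions: Dict[str, SessionData] = {}  # in-memory store keyed by client_id

def get_or_create_session(client_id: str) -> SessionData:
    """Reuse an existing session on reconnect, otherwise create a fresh one."""
    session = sessions.get(client_id)
    if session is None:
        # Assumes SessionData behaves like a dataclass with these fields
        session = SessionData(
            client_id=client_id,
            incomplete_sentences=[],
            conversation_history=[],
            last_input_time=time.time(),
            timeout_task=None,
            ai_response_playback_ends_at=None,
        )
        sessions[client_id] = session  # a new session also triggers the welcome message
    return session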
Turn Detection System
How It Works
The server uses an ONNX-based turn detection model to determine when a user's utterance is complete:
1. Input Buffering: User input is buffered during AI response playback
2. Turn Analysis: Model analyzes current + buffered input for completion
3. Decision Making: Determines if the utterance is complete or needs more input
4. Timeout Handling: Processes buffered input after a silence period
Turn Detection Parameters
| Parameter | Value | Description | 
|---|---|---|
| Model | livekit/turn-detector | Pre-trained turn detection model | 
| Threshold | 0.0009 | Probability threshold for completion | 
| Max History | 6 turns | Maximum conversation history for analysis | 
| Max Tokens | 128 | Maximum tokens for model input | 
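A sketch of how these parameters might be combined into a completion check; predict_end_of_turn stands in for the ONNX inference call, and its exact interface is an assumption rather than the real livekit/turn-detector API:

EOU_THRESHOLD = 0.0009   # probability threshold from the table above
MAX_HISTORY_TURNS = 6    # most recent turns passed to the model

def is_turn_complete(history, candidate_text, predict_end_of_turn) -> bool:
    """Return True when the model judges the buffered utterance to be complete."""
    context = history[-MAX_HISTORY_TURNS:] + [{"role": "user", "content": candidate_text}]
    probability = predict_end_of_turn(context)  # assumed wrapper around the ONNX session
    return probability >= EOU_THRESHOLD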
Turn Detection Flow
User Input → Buffer Check → Turn Detection → Complete? → Process/Continue
     ↓              ↓              ↓            ↓           ↓
  AI Speaking?   Add to Buffer   Model Predict   Yes      Send to AI
     ↓              ↓              ↓            ↓           ↓
     No         Schedule Timeout   < Threshold   No      Schedule Timeout
Timeout and Buffering System
Buffering During AI Response
When the AI is generating or "speaking" a response:
- User input is buffered in incomplete_sentences
- New timeout task scheduled for each buffered input
- Timeout waits for AI playback to complete before processing
Silence Timeout
After AI response completes:
- Server waits MAX_RESPONSE_TIMEOUT seconds (default: 5s)
- If no new input is received, processes buffered input
- Forces completion if MAX_INCOMPLETE_SENTENCES is reached
Timeout Configuration
# Wait for AI playback to finish
remaining_playtime = session.ai_response_playback_ends_at - current_time
await asyncio.sleep(remaining_playtime)
# Wait for user silence
await asyncio.sleep(MAX_RESPONSE_TIMEOUT)  # 5 seconds default
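Putting the two waits above into a complete timeout task might look roughly like this; the handler and callback names are illustrative, and cancellation is the normal path when new input arrives before the timer fires:

import asyncio
import time

async def timeout_handler(session, process_buffered_input):
    """Process buffered sentences once AI playback has ended and the user has gone quiet."""
    try:
        if session.ai_response_playback_ends_at:
            await asyncio.sleep(max(0.0, session.ai_response_playback_ends_at - time.time()))
        await asyncio.sleep(MAX_RESPONSE_TIMEOUT)
        if session.incomplete_sentences:
            await process_buffered_input(session)
        session.timeout_task = None
    except asyncio.CancelledError:
        pass  # expected: newer input re-scheduled the timeout

# Scheduling: cancel any previous timer before starting a new one
# if session.timeout_task:
#     session.timeout_task.cancel()
# session.timeout_task = asyncio.create_task(timeout_handler(session, process_buffered_input))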
Error Handling
Connection Errors
- Missing clientId: Connection rejected with close code 1008 (see the sketch after this list)
- Invalid JSON: Error message sent to client
- Unknown message type: Error message sent to client
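Extracting clientId from the connection URL and rejecting the handshake when it is absent could be done along these lines (a sketch; the actual handler in main_uninterruptable2.py may differ):

from urllib.parse import parse_qs, urlparse

def extract_client_id(path: str):
    """Pull clientId out of a request path such as '/?clientId=user123'."""
    values = parse_qs(urlparse(path).query).get("clientId")
    return values[0] if values else None

# Inside the connection handler:
# client_id = extract_client_id(path)
# if not client_id:
#     await websocket.close(code=1008, reason="clientId query parameter is required")
#     return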
API Errors
- FastGPT API failures: Error logged, user message reverted
- Network errors: Comprehensive error logging with context
- Model errors: Graceful degradation with error reporting
Timeout Errors
- Task cancellation: Normal during new input arrival
- Exception handling: Errors logged, timeout task cleared
Logging System
Log Levels
The server uses a comprehensive logging system with colored output:
- ℹ️ INFO (green): General information
- 🐛 DEBUG (cyan): Detailed debugging information
- ⚠️ WARNING (yellow): Warning messages
- ❌ ERROR (red): Error messages
- ⏱️ TIMEOUT (blue): Timeout-related events
- 💬 USER_INPUT (purple): User input processing
- 🤖 AI_RESPONSE (blue): AI response generation
- 🔗 SESSION (bold): Session management events
Log Format
2024-01-15 14:30:25.123 [LEVEL] 🎯 (client_id): Message | key=value | key2=value2
Example Logs
2024-01-15 14:30:25.123 [SESSION] 🔗 (user123): NEW SESSION: Creating session | total_sessions_before=0
2024-01-15 14:30:25.456 [USER_INPUT] 💬 (user123): AI speaking. Buffering: 'Hello' | current_buffer_size=1
2024-01-15 14:30:26.789 [AI_RESPONSE] 🤖 (user123): Response sent: 'Hello! How can I help you?' | tts_duration=2.5s | playback_ends_at=1642248629.289
Client Implementation Examples
JavaScript/Web Client
class ChatClient {
    constructor(clientId, serverUrl = 'ws://localhost:9000') {
        this.clientId = clientId;
        this.serverUrl = `${serverUrl}?clientId=${clientId}`;
        this.ws = null;
        this.messageHandlers = new Map();
    }
    connect() {
        this.ws = new WebSocket(this.serverUrl);
        
        this.ws.onopen = () => {
            console.log('Connected to chat server');
        };
        this.ws.onmessage = (event) => {
            const message = JSON.parse(event.data);
            this.handleMessage(message);
        };
        this.ws.onclose = (event) => {
            console.log('Disconnected from chat server');
        };
    }
    sendMessage(text) {
        if (this.ws && this.ws.readyState === WebSocket.OPEN) {
            const message = {
                type: 'USER_INPUT',
                payload: {
                    text: text,
                    client_id: this.clientId
                }
            };
            this.ws.send(JSON.stringify(message));
        }
    }
    handleMessage(message) {
        switch (message.type) {
            case 'AI_RESPONSE':
                console.log('AI:', message.payload.text);
                break;
            case 'ERROR':
                console.error('Error:', message.payload.message);
                break;
            default:
                console.log('Unknown message type:', message.type);
        }
    }
}
// Usage
const client = new ChatClient('user123');
client.connect();
client.sendMessage('Hello, how are you?');
Python Client
import asyncio
import websockets
import json
class ChatClient:
    def __init__(self, client_id, server_url="ws://localhost:9000"):
        self.client_id = client_id
        self.server_url = f"{server_url}?clientId={client_id}"
        self.websocket = None
    async def connect(self):
        self.websocket = await websockets.connect(self.server_url)
        print(f"Connected to chat server as {self.client_id}")
    async def send_message(self, text):
        if self.websocket:
            message = {
                "type": "USER_INPUT",
                "payload": {
                    "text": text,
                    "client_id": self.client_id
                }
            }
            await self.websocket.send(json.dumps(message))
    async def listen(self):
        async for message in self.websocket:
            data = json.loads(message)
            await self.handle_message(data)
    async def handle_message(self, message):
        if message["type"] == "AI_RESPONSE":
            print(f"AI: {message['payload']['text']}")
        elif message["type"] == "ERROR":
            print(f"Error: {message['payload']['message']}")
    async def run(self):
        await self.connect()
        await self.listen()
# Usage
async def main():
    client = ChatClient("user123")
    await client.run()
asyncio.run(main())
Performance Considerations
Scalability
- Session Management: Sessions stored in memory (consider Redis for production)
- Concurrent Connections: Limited by system resources and WebSocket library
- Model Loading: ONNX model loaded once per server instance
Optimization
- Connection Pooling: aiohttp sessions reused for API calls
- Async Processing: All I/O operations are asynchronous
- Memory Management: Sessions remain in memory after disconnection to support reconnection, so periodic manual cleanup is needed
Monitoring
- Performance Logging: Duration tracking for all operations
- Error Tracking: Comprehensive error logging with context
- Session Metrics: Active session count and client activity
Deployment
Prerequisites
pip install -r requirements.txt
Running the Server
python main_uninterruptable2.py
Production Considerations
- Environment Variables: Set all required environment variables
- SSL/TLS: Use WSS for secure WebSocket connections
- Load Balancing: Consider multiple server instances
- Session Storage: Use Redis or database for session persistence
- Monitoring: Implement health checks and metrics collection
- Logging: Configure log rotation and external logging service
Docker Deployment
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 9000
CMD ["python", "main_uninterruptable2.py"]
Troubleshooting
Common Issues
- Connection Refused: Check if server is running on correct port
- Missing clientId: Ensure clientId parameter is provided in URL
- API Errors: Verify FastGPT API credentials and network connectivity
- Model Loading: Check if ONNX model files are accessible
Debug Mode
Enable debug logging by modifying the logging level in the code or environment variables.
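If the logger is built on Python's standard logging module (an assumption), a LOG_LEVEL environment variable could be honoured like this; both the variable name and the call are illustrative:

import logging
import os

# Hypothetical LOG_LEVEL switch; the server's custom logger may expose its own setting
logging.basicConfig(level=os.getenv("LOG_LEVEL", "INFO").upper())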
Health Check
The server doesn't provide a built-in health check endpoint. Consider implementing one for production monitoring.
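Because aiohttp is already used for the FastGPT calls, a lightweight health endpoint could be served next to the WebSocket listener, for example (a sketch, not part of the current codebase; the port and route are arbitrary):

from aiohttp import web

async def start_health_server(port: int = 8080) -> None:
    """Expose GET /health on a separate port for load balancers and monitors."""
    async def health(_request: web.Request) -> web.Response:
        return web.json_response({"status": "ok"})

    app = web.Application()
    app.router.add_get("/health", health)
    runner = web.AppRunner(app)
    await runner.setup()
    await web.TCPSite(runner, "0.0.0.0", port).start()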
API Reference
WebSocket Events
| Event | Fires on | Description | 
|---|---|---|
| open | Client | Connection established | 
| message | Both | Message exchange | 
| close | Client | Connection closed | 
| error | Client | Connection error | 
Message Types Summary
| Type | Direction | Description | 
|---|---|---|
| USER_INPUT | Client → Server | Send user message | 
| AI_RESPONSE | Server → Client | Receive AI response | 
| ERROR | Server → Client | Error notification | 
License
This WebSocket server is part of the turn detection request project. Please refer to the main project license for usage terms.
