
WebSocket Chat Server Documentation

Overview

The WebSocket Chat Server is an intelligent conversation system that provides real-time AI chat capabilities with advanced turn detection, session management, and client-aware interactions. The server supports multiple concurrent clients with individual session tracking and automatic turn completion detection.

Features

  • 🔄 Real-time WebSocket communication
  • 🧠 AI-powered responses via FastGPT API
  • 🎯 Intelligent turn detection using ONNX models
  • 📱 Multi-client support with session isolation
  • ⏱️ Automatic timeout handling with buffering
  • 🔗 Session persistence across reconnections
  • 🎨 Professional logging with client tracking
  • 🌐 Welcome message system for new sessions

Server Configuration

Environment Variables

Create a .env file in the project root:

# Turn Detection Settings
MAX_INCOMPLETE_SENTENCES=3
MAX_RESPONSE_TIMEOUT=5

# FastGPT API Configuration
CHAT_MODEL_API_URL=http://101.89.151.141:3000/
CHAT_MODEL_API_KEY=your_fastgpt_api_key_here
CHAT_MODEL_APP_ID=your_fastgpt_app_id_here

Default Values

| Variable | Default | Description |
|---|---|---|
| MAX_INCOMPLETE_SENTENCES | 3 | Maximum buffered sentences before forcing completion |
| MAX_RESPONSE_TIMEOUT | 5 | Seconds of silence before processing buffered input |
| CHAT_MODEL_API_URL | None | FastGPT API endpoint URL |
| CHAT_MODEL_API_KEY | None | FastGPT API authentication key |
| CHAT_MODEL_APP_ID | None | FastGPT application ID |
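
As a reference, here is a minimal sketch of how these variables might be read at startup, using the defaults from the table above (the server's actual loader may differ):

```python
import os

# Sketch: read turn-detection settings with the documented defaults.
# The CHAT_MODEL_* values have no defaults and must be set in .env.
MAX_INCOMPLETE_SENTENCES = int(os.getenv("MAX_INCOMPLETE_SENTENCES", "3"))
MAX_RESPONSE_TIMEOUT = float(os.getenv("MAX_RESPONSE_TIMEOUT", "5"))
CHAT_MODEL_API_URL = os.getenv("CHAT_MODEL_API_URL")  # None if unset
```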

WebSocket Connection

Connection URL Format

ws://localhost:9000?clientId=YOUR_CLIENT_ID

Connection Parameters

| Parameter | Required | Description |
|---|---|---|
| clientId | Yes | Unique identifier for the client session |
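
On the server side, the clientId can be recovered from the request path's query string. A small illustrative helper (the function name is hypothetical, not part of the server's API):

```python
from urllib.parse import urlparse, parse_qs

def extract_client_id(path):
    """Pull clientId out of a WebSocket request path; None if absent."""
    query = parse_qs(urlparse(path).query)
    values = query.get("clientId")
    return values[0] if values else None
```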

Connection Example

const ws = new WebSocket('ws://localhost:9000?clientId=user123');

Message Protocol

Message Format

All messages use JSON format with the following structure:

{
  "type": "MESSAGE_TYPE",
  "payload": {
    // Message-specific data
  }
}
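
A sketch of building and validating this envelope in Python (helper names are illustrative; the server's internal handlers may differ):

```python
import json

def make_message(msg_type, payload):
    """Serialize a protocol message into the envelope shown above."""
    return json.dumps({"type": msg_type, "payload": payload})

def parse_message(raw):
    """Parse an incoming envelope and check for the two required keys."""
    data = json.loads(raw)
    if "type" not in data or "payload" not in data:
        raise ValueError("malformed message: missing 'type' or 'payload'")
    return data["type"], data["payload"]
```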

Client to Server Messages

USER_INPUT

Sends user text input to the server.

{
  "type": "USER_INPUT",
  "payload": {
    "text": "Hello, how are you?",
    "client_id": "user123"  // Optional, will use URL clientId if not provided
  }
}

Fields:

  • text (string, required): The user's input text
  • client_id (string, optional): Client identifier (overrides URL parameter)

Server to Client Messages

AI_RESPONSE

AI-generated response to user input.

{
  "type": "AI_RESPONSE",
  "payload": {
    "text": "Hello! I'm doing well, thank you for asking. How can I help you today?",
    "client_id": "user123",
    "estimated_tts_duration": 3.2
  }
}

Fields:

  • text (string): AI response content
  • client_id (string): Client identifier
  • estimated_tts_duration (float): Estimated text-to-speech duration in seconds
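
The documentation does not state how estimated_tts_duration is computed; one plausible heuristic (not necessarily the server's formula) is a word count divided by an assumed speaking rate:

```python
def estimate_tts_duration(text, words_per_second=2.5):
    """Rough TTS length estimate in seconds; the rate is an assumption."""
    return round(len(text.split()) / words_per_second, 1)
```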

ERROR

Error notification from server.

{
  "type": "ERROR",
  "payload": {
    "message": "Error description",
    "client_id": "user123"
  }
}

Fields:

  • message (string): Error description
  • client_id (string): Client identifier

Session Management

Session Lifecycle

  1. Connection: Client connects with unique clientId
  2. Session Creation: New session created or existing session reused
  3. Welcome Message: New sessions receive welcome message automatically
  4. Interaction: Real-time message exchange
  5. Disconnection: Session data preserved for reconnection
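
Steps 1–2 of the lifecycle can be sketched as a get-or-create lookup keyed by clientId (field names follow the SessionData structure below; the store itself is illustrative):

```python
# In-memory session store keyed by clientId.
sessions = {}

def get_or_create_session(client_id):
    """Return (session, is_new); a reused session keeps its history."""
    if client_id in sessions:
        return sessions[client_id], False
    session = {
        "client_id": client_id,
        "incomplete_sentences": [],   # buffered user input
        "conversation_history": [],   # full chat history
    }
    sessions[client_id] = session
    return session, True
```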

Session Data Structure

class SessionData:
    client_id: str                    # Unique client identifier
    incomplete_sentences: List[str]   # Buffered user input
    conversation_history: List[ChatMessage]  # Full conversation history
    last_input_time: float           # Timestamp of last user input
    timeout_task: Optional[Task]     # Current timeout task
    ai_response_playback_ends_at: Optional[float]  # AI response end time

Session Persistence

  • Sessions persist across WebSocket disconnections
  • Reconnection with same clientId resumes existing session
  • Conversation history maintained throughout session lifetime
  • Timeout tasks properly managed during reconnections

Turn Detection System

How It Works

The server uses an ONNX-based turn detection model to determine when a user's utterance is complete:

  1. Input Buffering: User input buffered during AI response playback
  2. Turn Analysis: Model analyzes current + buffered input for completion
  3. Decision Making: Determines if utterance is complete or needs more input
  4. Timeout Handling: Processes buffered input after silence period
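
The completion decision in steps 2–3 reduces to a threshold check plus the forced-completion cap. A minimal sketch, assuming the values from the parameter table and configuration:

```python
TURN_THRESHOLD = 0.0009          # probability threshold from the parameter table
MAX_INCOMPLETE_SENTENCES = 3     # forced-completion cap from the config

def is_turn_complete(completion_prob, buffered_count):
    """Complete when the model is confident enough, or the buffer cap forces it."""
    if buffered_count >= MAX_INCOMPLETE_SENTENCES:
        return True   # force completion regardless of the model's score
    return completion_prob >= TURN_THRESHOLD
```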

Turn Detection Parameters

| Parameter | Value | Description |
|---|---|---|
| Model | livekit/turn-detector | Pre-trained turn detection model |
| Threshold | 0.0009 | Probability threshold for completion |
| Max History | 6 turns | Maximum conversation history for analysis |
| Max Tokens | 128 | Maximum tokens for model input |

Turn Detection Flow

User Input → Buffer Check → Turn Detection → Complete? → Process/Continue

  1. AI currently speaking? If yes, add the input to the buffer and schedule a timeout.
  2. Run the turn detection model on the current input plus any buffered sentences.
  3. Probability ≥ threshold: the utterance is complete, send the combined input to the AI.
  4. Probability < threshold: keep buffering and schedule a silence timeout.

Timeout and Buffering System

Buffering During AI Response

When the AI is generating or "speaking" a response:

  • User input is buffered in incomplete_sentences
  • New timeout task scheduled for each buffered input
  • Timeout waits for AI playback to complete before processing

Silence Timeout

After AI response completes:

  • Server waits MAX_RESPONSE_TIMEOUT seconds (default: 5s)
  • If no new input received, processes buffered input
  • Forces completion if MAX_INCOMPLETE_SENTENCES reached

Timeout Configuration

# Wait for any remaining AI playback to finish
if session.ai_response_playback_ends_at is not None:
    remaining_playtime = session.ai_response_playback_ends_at - current_time
    if remaining_playtime > 0:
        await asyncio.sleep(remaining_playtime)

# Then wait for user silence
await asyncio.sleep(MAX_RESPONSE_TIMEOUT)  # 5 seconds by default

Error Handling

Connection Errors

  • Missing clientId: Connection rejected with code 1008
  • Invalid JSON: Error message sent to client
  • Unknown message type: Error message sent to client
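
The missing-clientId rejection can be expressed as a small validation step returning the RFC 6455 close code 1008 (policy violation). This helper is illustrative, not the server's actual API:

```python
POLICY_VIOLATION = 1008  # RFC 6455 close code used for a missing clientId

def validate_connection(client_id):
    """Return (accept, close_code): reject with 1008 when clientId is absent."""
    if not client_id:
        return False, POLICY_VIOLATION
    return True, None
```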

API Errors

  • FastGPT API failures: Error logged, user message reverted
  • Network errors: Comprehensive error logging with context
  • Model errors: Graceful degradation with error reporting

Timeout Errors

  • Task cancellation: Normal during new input arrival
  • Exception handling: Errors logged, timeout task cleared
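
Since cancellation is the normal path when newer input arrives, a timeout task typically swallows CancelledError and always clears its own reference. A hypothetical sketch of that pattern (the session dict and field name are illustrative):

```python
import asyncio

async def silence_timeout(session, delay):
    """Timeout task sketch: cancellation is expected, not an error."""
    try:
        await asyncio.sleep(delay)
        # ...process buffered input here...
    except asyncio.CancelledError:
        pass  # superseded by newer input; nothing to do
    finally:
        session["timeout_task"] = None  # always clear the stale task reference

async def demo():
    session = {"timeout_task": None}
    task = asyncio.create_task(silence_timeout(session, 10))
    session["timeout_task"] = task
    await asyncio.sleep(0)      # let the task start waiting
    task.cancel()               # new input arrived: cancel the pending timeout
    await asyncio.gather(task, return_exceptions=True)
    return session
```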

Logging System

Log Levels

The server uses a comprehensive logging system with colored output:

  • INFO (green): General information
  • 🐛 DEBUG (cyan): Detailed debugging information
  • ⚠️ WARNING (yellow): Warning messages
  • ERROR (red): Error messages
  • ⏱️ TIMEOUT (blue): Timeout-related events
  • 💬 USER_INPUT (purple): User input processing
  • 🤖 AI_RESPONSE (blue): AI response generation
  • 🔗 SESSION (bold): Session management events

Log Format

2024-01-15 14:30:25.123 [LEVEL] 🎯 (client_id): Message | key=value | key2=value2

Example Logs

2024-01-15 14:30:25.123 [SESSION] 🔗 (user123): NEW SESSION: Creating session | total_sessions_before=0
2024-01-15 14:30:25.456 [USER_INPUT] 💬 (user123): AI speaking. Buffering: 'Hello' | current_buffer_size=1
2024-01-15 14:30:26.789 [AI_RESPONSE] 🤖 (user123): Response sent: 'Hello! How can I help you?' | tts_duration=2.5s | playback_ends_at=1642248629.289

Client Implementation Examples

JavaScript/Web Client

class ChatClient {
    constructor(clientId, serverUrl = 'ws://localhost:9000') {
        this.clientId = clientId;
        this.serverUrl = `${serverUrl}?clientId=${clientId}`;
        this.ws = null;
        this.messageHandlers = new Map();
    }

    connect() {
        this.ws = new WebSocket(this.serverUrl);
        
        this.ws.onopen = () => {
            console.log('Connected to chat server');
        };

        this.ws.onmessage = (event) => {
            const message = JSON.parse(event.data);
            this.handleMessage(message);
        };

        this.ws.onclose = (event) => {
            console.log('Disconnected from chat server');
        };
    }

    sendMessage(text) {
        if (this.ws && this.ws.readyState === WebSocket.OPEN) {
            const message = {
                type: 'USER_INPUT',
                payload: {
                    text: text,
                    client_id: this.clientId
                }
            };
            this.ws.send(JSON.stringify(message));
        }
    }

    handleMessage(message) {
        switch (message.type) {
            case 'AI_RESPONSE':
                console.log('AI:', message.payload.text);
                break;
            case 'ERROR':
                console.error('Error:', message.payload.message);
                break;
            default:
                console.log('Unknown message type:', message.type);
        }
    }
}

// Usage
const client = new ChatClient('user123');
client.connect();
client.sendMessage('Hello, how are you?');

Python Client

import asyncio
import websockets
import json

class ChatClient:
    def __init__(self, client_id, server_url="ws://localhost:9000"):
        self.client_id = client_id
        self.server_url = f"{server_url}?clientId={client_id}"
        self.websocket = None

    async def connect(self):
        self.websocket = await websockets.connect(self.server_url)
        print(f"Connected to chat server as {self.client_id}")

    async def send_message(self, text):
        if self.websocket:
            message = {
                "type": "USER_INPUT",
                "payload": {
                    "text": text,
                    "client_id": self.client_id
                }
            }
            await self.websocket.send(json.dumps(message))

    async def listen(self):
        async for message in self.websocket:
            data = json.loads(message)
            await self.handle_message(data)

    async def handle_message(self, message):
        if message["type"] == "AI_RESPONSE":
            print(f"AI: {message['payload']['text']}")
        elif message["type"] == "ERROR":
            print(f"Error: {message['payload']['message']}")

    async def run(self):
        await self.connect()
        await self.listen()

# Usage
async def main():
    client = ChatClient("user123")
    await client.run()

asyncio.run(main())

Performance Considerations

Scalability

  • Session Management: Sessions stored in memory (consider Redis for production)
  • Concurrent Connections: Limited by system resources and WebSocket library
  • Model Loading: ONNX model loaded once per server instance

Optimization

  • Connection Pooling: aiohttp sessions reused for API calls
  • Async Processing: All I/O operations are asynchronous
  • Memory Management: Sessions persist after disconnection (see Session Persistence), so stale sessions require manual or periodic cleanup
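
Because sessions are not dropped on disconnect, a periodic sweep of idle sessions is one way to bound memory. A hypothetical cleanup sketch (field names follow the SessionData structure; the idle cap is an assumption):

```python
import time

def prune_idle_sessions(sessions, max_idle_seconds=3600.0):
    """Drop sessions whose last input is older than the idle cap; return count."""
    now = time.time()
    stale = [cid for cid, s in sessions.items()
             if now - s.get("last_input_time", now) > max_idle_seconds]
    for cid in stale:
        del sessions[cid]
    return len(stale)
```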

Monitoring

  • Performance Logging: Duration tracking for all operations
  • Error Tracking: Comprehensive error logging with context
  • Session Metrics: Active session count and client activity

Deployment

Prerequisites

pip install -r requirements.txt

Running the Server

python main_uninterruptable2.py

Production Considerations

  1. Environment Variables: Set all required environment variables
  2. SSL/TLS: Use WSS for secure WebSocket connections
  3. Load Balancing: Consider multiple server instances
  4. Session Storage: Use Redis or database for session persistence
  5. Monitoring: Implement health checks and metrics collection
  6. Logging: Configure log rotation and external logging service

Docker Deployment

FROM python:3.9-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
EXPOSE 9000

CMD ["python", "main_uninterruptable2.py"]

Troubleshooting

Common Issues

  1. Connection Refused: Check if server is running on correct port
  2. Missing clientId: Ensure clientId parameter is provided in URL
  3. API Errors: Verify FastGPT API credentials and network connectivity
  4. Model Loading: Check if ONNX model files are accessible

Debug Mode

Enable debug logging by modifying the logging level in the code or environment variables.

Health Check

The server doesn't provide a built-in health check endpoint. Consider implementing one for production monitoring.
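
If you add one, a health response usually only needs a status flag plus a couple of cheap metrics. A minimal sketch of the payload (endpoint path and field names are illustrative; serve it from a small HTTP handler alongside the WebSocket server):

```python
import json
import time

START_TIME = time.time()  # recorded at server startup

def health_payload(active_sessions):
    """Build a minimal health-check body with uptime and session count."""
    return json.dumps({
        "status": "ok",
        "uptime_seconds": round(time.time() - START_TIME, 1),
        "active_sessions": active_sessions,
    })
```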

API Reference

WebSocket Events

| Event | Direction | Description |
|---|---|---|
| open | Client | Connection established |
| message | Bidirectional | Message exchange |
| close | Client | Connection closed |
| error | Client | Connection error |

Message Types Summary

| Type | Direction | Description |
|---|---|---|
| USER_INPUT | Client → Server | Send user message |
| AI_RESPONSE | Server → Client | Receive AI response |
| ERROR | Server → Client | Error notification |

License

This WebSocket server is part of the turn detection request project. Please refer to the main project license for usage terms.
