	WebSocket Chat Server Documentation
Overview
The WebSocket Chat Server is an intelligent conversation system that provides real-time AI chat capabilities with advanced turn detection, session management, and client-aware interactions. The server supports multiple concurrent clients with individual session tracking and automatic turn completion detection.
Features
- 🔄 Real-time WebSocket communication
- 🧠 AI-powered responses via FastGPT API
- 🎯 Intelligent turn detection using ONNX models
- 📱 Multi-client support with session isolation
- ⏱️ Automatic timeout handling with buffering
- 🔗 Session persistence across reconnections
- 🎨 Professional logging with client tracking
- 🌐 Welcome message system for new sessions
Server Configuration
Environment Variables
Create a .env file in the project root:
# Turn Detection Settings
MAX_INCOMPLETE_SENTENCES=3
MAX_RESPONSE_TIMEOUT=5
# FastGPT API Configuration
CHAT_MODEL_API_URL=http://101.89.151.141:3000/
CHAT_MODEL_API_KEY=your_fastgpt_api_key_here
CHAT_MODEL_APP_ID=your_fastgpt_app_id_here
Default Values
| Variable | Default | Description | 
|---|---|---|
| MAX_INCOMPLETE_SENTENCES | 3 | Maximum buffered sentences before forcing completion | 
| MAX_RESPONSE_TIMEOUT | 5 | Seconds of silence before processing buffered input | 
| CHAT_MODEL_API_URL | None | FastGPT API endpoint URL | 
| CHAT_MODEL_API_KEY | None | FastGPT API authentication key | 
| CHAT_MODEL_APP_ID | None | FastGPT application ID | 
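As an illustration of how these variables might be read at startup, here is a minimal sketch assuming python-dotenv loads the .env file (the server's actual configuration code may differ):

import os
from dotenv import load_dotenv  # assumption: python-dotenv is installed

load_dotenv()  # reads .env from the project root

MAX_INCOMPLETE_SENTENCES = int(os.getenv("MAX_INCOMPLETE_SENTENCES", "3"))
MAX_RESPONSE_TIMEOUT = float(os.getenv("MAX_RESPONSE_TIMEOUT", "5"))

# No defaults for the FastGPT settings; fail fast if they are missing
CHAT_MODEL_API_URL = os.getenv("CHAT_MODEL_API_URL")
CHAT_MODEL_API_KEY = os.getenv("CHAT_MODEL_API_KEY")
CHAT_MODEL_APP_ID = os.getenv("CHAT_MODEL_APP_ID")
if not all((CHAT_MODEL_API_URL, CHAT_MODEL_API_KEY, CHAT_MODEL_APP_ID)):
    raise RuntimeError("CHAT_MODEL_API_URL, CHAT_MODEL_API_KEY and CHAT_MODEL_APP_ID must be set")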
WebSocket Connection
Connection URL Format
ws://localhost:9000?clientId=YOUR_CLIENT_ID
Connection Parameters
| Parameter | Required | Description | 
|---|---|---|
| clientId | Yes | Unique identifier for the client session | 
Connection Example
const ws = new WebSocket('ws://localhost:9000?clientId=user123');
Message Protocol
Message Format
All messages use JSON format with the following structure:
{
  "type": "MESSAGE_TYPE",
  "payload": {
    // Message-specific data
  }
}
Client to Server Messages
USER_INPUT
Sends user text input to the server.
{
  "type": "USER_INPUT",
  "payload": {
    "text": "Hello, how are you?",
    "client_id": "user123"  // Optional, will use URL clientId if not provided
  }
}
Fields:
- text (string, required): The user's input text
- client_id (string, optional): Client identifier (overrides URL parameter)
Server to Client Messages
AI_RESPONSE
AI-generated response to user input.
{
  "type": "AI_RESPONSE",
  "payload": {
    "text": "Hello! I'm doing well, thank you for asking. How can I help you today?",
    "client_id": "user123",
    "estimated_tts_duration": 3.2
  }
}
Fields:
- text (string): AI response content
- client_id (string): Client identifier
- estimated_tts_duration (float): Estimated text-to-speech duration in seconds
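The documentation does not specify how estimated_tts_duration is derived; the speaking-rate heuristic below is purely an illustration (the words_per_second value and the helper itself are assumptions, not the server's actual formula):

def estimate_tts_duration(text: str, words_per_second: float = 2.5) -> float:
    """Rough estimate of speech length in seconds; illustrative only."""
    word_count = len(text.split())
    return round(word_count / max(words_per_second, 0.1), 1)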
ERROR
Error notification from server.
{
  "type": "ERROR",
  "payload": {
    "message": "Error description",
    "client_id": "user123"
  }
}
Fields:
- message (string): Error description
- client_id (string): Client identifier
Session Management
Session Lifecycle
1. Connection: Client connects with a unique clientId
2. Session Creation: A new session is created or an existing session is reused
3. Welcome Message: New sessions automatically receive a welcome message
4. Interaction: Real-time message exchange
5. Disconnection: Session data is preserved for reconnection
Session Data Structure
class SessionData:
    client_id: str                    # Unique client identifier
    incomplete_sentences: List[str]   # Buffered user input
    conversation_history: List[ChatMessage]  # Full conversation history
    last_input_time: float           # Timestamp of last user input
    timeout_task: Optional[Task]     # Current timeout task
    ai_response_playback_ends_at: Optional[float]  # AI response end time
Session Persistence
- Sessions persist across WebSocket disconnections
- Reconnection with the same clientId resumes the existing session (see the sketch below)
- Conversation history maintained throughout session lifetime
- Timeout tasks properly managed during reconnections
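A minimal sketch of the get-or-create behaviour described above, reusing the SessionData structure from the previous subsection (the in-memory sessions dict and the helper name are illustrative):

import time
from typing import Dict

sessions: Dict[str, SessionData] = {}  # in-memory store keyed by client_id

def get_or_create_session(client_id: str) -> SessionData:
    """Reuse an existing session on reconnect, otherwise create a fresh one."""
    session = sessions.get(client_id)
    if session is None:
        # Assumes SessionData behaves like a dataclass with these fields
        session = SessionData(
            client_id=client_id,
            incomplete_sentences=[],
            conversation_history=[],
            last_input_time=time.time(),
            timeout_task=None,
            ai_response_playback_ends_at=None,
        )
        sessions[client_id] = session  # a new session also triggers the welcome message
    return session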
Turn Detection System
How It Works
The server uses an ONNX-based turn detection model to determine when a user's utterance is complete:
1. Input Buffering: User input is buffered during AI response playback
2. Turn Analysis: Model analyzes current + buffered input for completion
3. Decision Making: Determines if the utterance is complete or needs more input
4. Timeout Handling: Processes buffered input after a silence period
Turn Detection Parameters
| Parameter | Value | Description | 
|---|---|---|
| Model | livekit/turn-detector | Pre-trained turn detection model | 
| Threshold | 0.0009 | Probability threshold for completion | 
| Max History | 6 turns | Maximum conversation history for analysis | 
| Max Tokens | 128 | Maximum tokens for model input | 
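A sketch of how these parameters might be combined into a completion check; predict_end_of_turn stands in for the ONNX inference call, and its exact interface is an assumption rather than the real livekit/turn-detector API:

EOU_THRESHOLD = 0.0009   # probability threshold from the table above
MAX_HISTORY_TURNS = 6    # most recent turns passed to the model

def is_turn_complete(history, candidate_text, predict_end_of_turn) -> bool:
    """Return True when the model judges the buffered utterance to be complete."""
    context = history[-MAX_HISTORY_TURNS:] + [{"role": "user", "content": candidate_text}]
    probability = predict_end_of_turn(context)  # assumed wrapper around the ONNX session
    return probability >= EOU_THRESHOLD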
Turn Detection Flow
User Input → Buffer Check → Turn Detection → Complete? → Process/Continue
     ↓              ↓              ↓            ↓           ↓
  AI Speaking?   Add to Buffer   Model Predict   Yes      Send to AI
     ↓              ↓              ↓            ↓           ↓
     No         Schedule Timeout   < Threshold   No      Schedule Timeout
Timeout and Buffering System
Buffering During AI Response
When the AI is generating or "speaking" a response:
- User input is buffered in incomplete_sentences
- New timeout task scheduled for each buffered input
- Timeout waits for AI playback to complete before processing
Silence Timeout
After AI response completes:
- Server waits MAX_RESPONSE_TIMEOUT seconds (default: 5s)
- If no new input is received, processes buffered input
- Forces completion if MAX_INCOMPLETE_SENTENCES is reached
Timeout Configuration
# Wait for AI playback to finish
remaining_playtime = session.ai_response_playback_ends_at - current_time
await asyncio.sleep(remaining_playtime)
# Wait for user silence
await asyncio.sleep(MAX_RESPONSE_TIMEOUT)  # 5 seconds default
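Putting the two waits above into a complete timeout task might look roughly like this; the handler and callback names are illustrative, and cancellation is the normal path when new input arrives before the timer fires:

import asyncio
import time

async def timeout_handler(session, process_buffered_input):
    """Process buffered sentences once AI playback has ended and the user has gone quiet."""
    try:
        if session.ai_response_playback_ends_at:
            await asyncio.sleep(max(0.0, session.ai_response_playback_ends_at - time.time()))
        await asyncio.sleep(MAX_RESPONSE_TIMEOUT)
        if session.incomplete_sentences:
            await process_buffered_input(session)
        session.timeout_task = None
    except asyncio.CancelledError:
        pass  # expected: newer input re-scheduled the timeout

# Scheduling: cancel any previous timer before starting a new one
# if session.timeout_task:
#     session.timeout_task.cancel()
# session.timeout_task = asyncio.create_task(timeout_handler(session, process_buffered_input))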
Error Handling
Connection Errors
- Missing clientId: Connection rejected with close code 1008 (see the sketch after this list)
- Invalid JSON: Error message sent to client
- Unknown message type: Error message sent to client
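Extracting clientId from the connection URL and rejecting the handshake when it is absent could be done along these lines (a sketch; the actual handler in main_uninterruptable2.py may differ):

from urllib.parse import parse_qs, urlparse

def extract_client_id(path: str):
    """Pull clientId out of a request path such as '/?clientId=user123'."""
    values = parse_qs(urlparse(path).query).get("clientId")
    return values[0] if values else None

# Inside the connection handler:
# client_id = extract_client_id(path)
# if not client_id:
#     await websocket.close(code=1008, reason="clientId query parameter is required")
#     return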
API Errors
- FastGPT API failures: Error logged, user message reverted
- Network errors: Comprehensive error logging with context
- Model errors: Graceful degradation with error reporting
Timeout Errors
- Task cancellation: Normal during new input arrival
- Exception handling: Errors logged, timeout task cleared
Logging System
Log Levels
The server uses a comprehensive logging system with colored output:
- ℹ️ INFO (green): General information
- 🐛 DEBUG (cyan): Detailed debugging information
- ⚠️ WARNING (yellow): Warning messages
- ❌ ERROR (red): Error messages
- ⏱️ TIMEOUT (blue): Timeout-related events
- 💬 USER_INPUT (purple): User input processing
- 🤖 AI_RESPONSE (blue): AI response generation
- 🔗 SESSION (bold): Session management events
Log Format
2024-01-15 14:30:25.123 [LEVEL] 🎯 (client_id): Message | key=value | key2=value2
Example Logs
2024-01-15 14:30:25.123 [SESSION] 🔗 (user123): NEW SESSION: Creating session | total_sessions_before=0
2024-01-15 14:30:25.456 [USER_INPUT] 💬 (user123): AI speaking. Buffering: 'Hello' | current_buffer_size=1
2024-01-15 14:30:26.789 [AI_RESPONSE] 🤖 (user123): Response sent: 'Hello! How can I help you?' | tts_duration=2.5s | playback_ends_at=1642248629.289
Client Implementation Examples
JavaScript/Web Client
class ChatClient {
    constructor(clientId, serverUrl = 'ws://localhost:9000') {
        this.clientId = clientId;
        this.serverUrl = `${serverUrl}?clientId=${clientId}`;
        this.ws = null;
        this.messageHandlers = new Map();
    }
    connect() {
        this.ws = new WebSocket(this.serverUrl);
        
        this.ws.onopen = () => {
            console.log('Connected to chat server');
        };
        this.ws.onmessage = (event) => {
            const message = JSON.parse(event.data);
            this.handleMessage(message);
        };
        this.ws.onclose = (event) => {
            console.log('Disconnected from chat server');
        };
    }
    sendMessage(text) {
        if (this.ws && this.ws.readyState === WebSocket.OPEN) {
            const message = {
                type: 'USER_INPUT',
                payload: {
                    text: text,
                    client_id: this.clientId
                }
            };
            this.ws.send(JSON.stringify(message));
        }
    }
    handleMessage(message) {
        switch (message.type) {
            case 'AI_RESPONSE':
                console.log('AI:', message.payload.text);
                break;
            case 'ERROR':
                console.error('Error:', message.payload.message);
                break;
            default:
                console.log('Unknown message type:', message.type);
        }
    }
}
// Usage
const client = new ChatClient('user123');
client.connect();
client.sendMessage('Hello, how are you?');
Python Client
import asyncio
import websockets
import json
class ChatClient:
    def __init__(self, client_id, server_url="ws://localhost:9000"):
        self.client_id = client_id
        self.server_url = f"{server_url}?clientId={client_id}"
        self.websocket = None
    async def connect(self):
        self.websocket = await websockets.connect(self.server_url)
        print(f"Connected to chat server as {self.client_id}")
    async def send_message(self, text):
        if self.websocket:
            message = {
                "type": "USER_INPUT",
                "payload": {
                    "text": text,
                    "client_id": self.client_id
                }
            }
            await self.websocket.send(json.dumps(message))
    async def listen(self):
        async for message in self.websocket:
            data = json.loads(message)
            await self.handle_message(data)
    async def handle_message(self, message):
        if message["type"] == "AI_RESPONSE":
            print(f"AI: {message['payload']['text']}")
        elif message["type"] == "ERROR":
            print(f"Error: {message['payload']['message']}")
    async def run(self):
        await self.connect()
        await self.listen()
# Usage
async def main():
    client = ChatClient("user123")
    await client.run()
asyncio.run(main())
Performance Considerations
Scalability
- Session Management: Sessions stored in memory (consider Redis for production)
- Concurrent Connections: Limited by system resources and WebSocket library
- Model Loading: ONNX model loaded once per server instance
Optimization
- Connection Pooling: aiohttp sessions reused for API calls
- Async Processing: All I/O operations are asynchronous
- Memory Management: Sessions remain in memory after disconnection to support reconnection, so periodic manual cleanup is needed
Monitoring
- Performance Logging: Duration tracking for all operations
- Error Tracking: Comprehensive error logging with context
- Session Metrics: Active session count and client activity
Deployment
Prerequisites
pip install -r requirements.txt
Running the Server
python main_uninterruptable2.py
Production Considerations
- Environment Variables: Set all required environment variables
- SSL/TLS: Use WSS for secure WebSocket connections
- Load Balancing: Consider multiple server instances
- Session Storage: Use Redis or database for session persistence
- Monitoring: Implement health checks and metrics collection
- Logging: Configure log rotation and external logging service
Docker Deployment
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 9000
CMD ["python", "main_uninterruptable2.py"]
Troubleshooting
Common Issues
- Connection Refused: Check if server is running on correct port
- Missing clientId: Ensure clientId parameter is provided in URL
- API Errors: Verify FastGPT API credentials and network connectivity
- Model Loading: Check if ONNX model files are accessible
Debug Mode
Enable debug logging by modifying the logging level in the code or environment variables.
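If the logger is built on Python's standard logging module (an assumption), a LOG_LEVEL environment variable could be honoured like this; both the variable name and the call are illustrative:

import logging
import os

# Hypothetical LOG_LEVEL switch; the server's custom logger may expose its own setting
logging.basicConfig(level=os.getenv("LOG_LEVEL", "INFO").upper())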
Health Check
The server doesn't provide a built-in health check endpoint. Consider implementing one for production monitoring.
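Because aiohttp is already used for the FastGPT calls, a lightweight health endpoint could be served next to the WebSocket listener, for example (a sketch, not part of the current codebase; the port and route are arbitrary):

from aiohttp import web

async def start_health_server(port: int = 8080) -> None:
    """Expose GET /health on a separate port for load balancers and monitors."""
    async def health(_request: web.Request) -> web.Response:
        return web.json_response({"status": "ok"})

    app = web.Application()
    app.router.add_get("/health", health)
    runner = web.AppRunner(app)
    await runner.setup()
    await web.TCPSite(runner, "0.0.0.0", port).start()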
API Reference
WebSocket Events
| Event | Fires on | Description | 
|---|---|---|
| open | Client | Connection established | 
| message | Both | Message exchange | 
| close | Client | Connection closed | 
| error | Client | Connection error | 
Message Types Summary
| Type | Direction | Description | 
|---|---|---|
| USER_INPUT | Client → Server | Send user message | 
| AI_RESPONSE | Server → Client | Receive AI response | 
| ERROR | Server → Client | Error notification | 
License
This WebSocket server is part of the turn detection request project. Please refer to the main project license for usage terms.
