Connect / Disconnect buttons for any ws:// or wss:// URL.
Microphone selector + mic on/off toggle. Available input devices are listed with enumerateDevices, and getUserMedia is requested with echoCancellation, noiseSuppression, and autoGainControl so the browser performs acoustic echo cancellation (AEC) against the bot's voice.
Text composer — type a message and press Enter to send an input.text event (Shift+Enter for newline). Sending interrupts any in-flight bot audio so the next reply is heard cleanly.
Chat history built from three sources: input.transcript.final for what you say, streamed response.text.delta / response.text.final for the assistant (deltas arrive ahead of the synthesized audio), and a local entry for text you type (the engine doesn't echo text input back as a transcript).
WebSocket log panel for connection state and compact send/receive events. Audio chunks are summarized so the UI does not flood.
Gapless TTS playback by scheduling each response.audio.delta chunk back-to-back on the AudioContext.
Live VU meter + mic and bot activity indicators.
Clear button to reset history.
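
The chat-history rule in the list above can be sketched as a small reducer. This is an illustration, not the code in app.js: only the event names come from this README, while the payload fields (`type`, `text`) and the helpers `applyEvent` and `addLocalText` are assumptions made for the example.

```javascript
// Hypothetical reducer: folds engine events into a chat history array.
// Assumes each event looks like { type: "...", text: "..." }; the real
// payload shape may differ.
function applyEvent(history, event) {
  const last = history[history.length - 1];
  const open = last && last.role === "assistant" && !last.done;
  switch (event.type) {
    case "input.transcript.final":
      // Spoken user turn, as transcribed by the engine.
      return [...history, { role: "user", text: event.text, done: true }];
    case "response.text.delta":
      // Append to the open assistant entry, or start a new one.
      if (open) {
        return [...history.slice(0, -1), { ...last, text: last.text + event.text }];
      }
      return [...history, { role: "assistant", text: event.text, done: false }];
    case "response.text.final":
      // Close the open entry; assumes the final event may carry the full text.
      if (open) {
        return [...history.slice(0, -1), { ...last, text: event.text ?? last.text, done: true }];
      }
      return [...history, { role: "assistant", text: event.text, done: true }];
    default:
      // Audio deltas and other events do not touch the chat history.
      return history;
  }
}

// Typed text is added locally because the engine does not echo it back.
function addLocalText(history, text) {
  return [...history, { role: "user", text, done: true }];
}
```

Returning a new array on each change keeps rendering simple: the UI can redraw whenever the reference changes and skip events that return the same array.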

No build step, no dependencies — just three files plus an AudioWorklet.

Layout

```
examples/webpage/
├── index.html
├── styles.css
├── app.js
└── pcm-recorder.worklet.js
```

Run

Start the engine (default port 8001):

```
cd AI-VideoAssistant-engine-v5-pipecat-minimal
source .venv/bin/activate
export OPENAI_API_KEY=...
uvicorn engine.main:app --host 127.0.0.1 --port 8001
```

In another terminal, serve the page from an origin that's on the engine's CORS allow-list (see config.json). The default config allows http://localhost:8080:
```
cd AI-VideoAssistant-engine-v5-pipecat-minimal/examples/webpage
python -m http.server 8080
```
Open http://localhost:8080 in Chrome, Edge, or Safari.
- Click Connect (uses ws://127.0.0.1:8001/ws-product by default).
- Pick a microphone if needed, click Enable mic, and start speaking. The browser will prompt for microphone access on first use. Device names may appear only after permission is granted.
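
The microphone step can be sketched roughly as follows. The constraint names are the standard getUserMedia ones mentioned earlier; the helpers `buildMicConstraints` and `openMic`, and the `channelCount: 1` hint, are assumptions for illustration and may not match app.js.

```javascript
// Builds getUserMedia constraints for a chosen input device.
// deviceId comes from enumerateDevices(); omit it to use the default mic.
function buildMicConstraints(deviceId) {
  const audio = {
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true,
    channelCount: 1, // assumption: request mono, since the engine input is mono
  };
  if (deviceId) audio.deviceId = { exact: deviceId };
  return { audio, video: false };
}

// Browser-only: needs a secure context, and triggers the permission prompt.
async function openMic(deviceId) {
  const stream = await navigator.mediaDevices.getUserMedia(buildMicConstraints(deviceId));
  // Device labels are typically empty until permission has been granted,
  // so the list is refreshed after the stream opens.
  const devices = await navigator.mediaDevices.enumerateDevices();
  const inputs = devices.filter((d) => d.kind === "audioinput");
  return { stream, inputs };
}
```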

The browser's mic API requires a secure context. http://localhost qualifies; if you serve from another host, use HTTPS and a wss:// URL.
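
As a small illustration of the scheme pairing, the helper below derives a WebSocket URL from an HTTP(S) URL. `toWebSocketUrl` is a hypothetical helper, not part of app.js, and it assumes the engine is reachable on the same host and path under the matching secure scheme.

```javascript
// Maps http: -> ws: and https: -> wss:, keeping host, port, path, and query.
function toWebSocketUrl(httpUrl) {
  const url = new URL(httpUrl);
  let scheme;
  if (url.protocol === "https:") scheme = "wss:";
  else if (url.protocol === "http:") scheme = "ws:";
  else throw new Error(`Unsupported protocol: ${url.protocol}`);
  return `${scheme}//${url.host}${url.pathname}${url.search}`;
}
```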

Audio details

Input: mono Float32 from getUserMedia is resampled in the AudioWorklet to PCM16 mono @ 16 kHz, framed into 20 ms chunks (320 samples, 640 bytes), and sent as binary WebSocket messages (the server accepts either binary or the JSON+base64 form).
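
A minimal sketch of the input path, under stated assumptions: the resampler uses plain linear interpolation (the actual worklet may filter differently), and the helper names `resampleLinear`, `floatToPcm16`, and `createFramer` are made up for this example.

```javascript
const TARGET_RATE = 16000;
const FRAME_SAMPLES = (TARGET_RATE * 20) / 1000; // 20 ms = 320 samples = 640 bytes

// Linear-interpolation resampler (assumption: stand-in for the worklet's own).
function resampleLinear(input, fromRate, toRate = TARGET_RATE) {
  if (fromRate === toRate) return Float32Array.from(input);
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}

// Clamps to [-1, 1] and scales to signed 16-bit integers.
function floatToPcm16(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}

// Buffers PCM16 samples and calls onFrame once per complete 20 ms frame.
function createFramer(onFrame) {
  let pending = new Int16Array(0);
  return function push(pcm) {
    const merged = new Int16Array(pending.length + pcm.length);
    merged.set(pending);
    merged.set(pcm, pending.length);
    let offset = 0;
    while (merged.length - offset >= FRAME_SAMPLES) {
      onFrame(merged.slice(offset, offset + FRAME_SAMPLES));
      offset += FRAME_SAMPLES;
    }
    pending = merged.slice(offset); // keep the remainder for the next call
  };
}
```

In the page, each emitted frame's `buffer` could then be passed to the WebSocket's `send` as a binary message; this relies on the platform being little-endian, which holds for mainstream browsers.
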
Output: each response.audio.delta carries base64-encoded PCM16 @ 16 kHz; chunks are decoded and scheduled back-to-back through Web Audio. The browser handles resampling to the device rate.
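
The output path might look roughly like the sketch below. It assumes the PCM16 payload is little-endian and that the base64 string has already been pulled out of the event; `decodePcm16Base64` and `createScheduler` are illustrative names, not functions from app.js.

```javascript
const OUTPUT_RATE = 16000;

// Decodes base64 PCM16 (assumed little-endian) into Float32 samples in [-1, 1).
function decodePcm16Base64(b64) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  const view = new DataView(bytes.buffer);
  const out = new Float32Array(bytes.length >> 1);
  for (let i = 0; i < out.length; i++) {
    out[i] = view.getInt16(i * 2, true) / 0x8000;
  }
  return out;
}

// Schedules chunks back-to-back: each one starts where the previous one ends,
// or immediately if playback has fallen behind the context clock.
function createScheduler(ctx) {
  let nextStart = 0;
  return function play(samples) {
    // A 16 kHz buffer lets the browser resample to the device rate.
    const buffer = ctx.createBuffer(1, samples.length, OUTPUT_RATE);
    buffer.copyToChannel(samples, 0);
    const source = ctx.createBufferSource();
    source.buffer = buffer;
    source.connect(ctx.destination);
    const startAt = Math.max(nextStart, ctx.currentTime);
    source.start(startAt);
    nextStart = startAt + buffer.duration;
    return startAt;
  };
}
```

Tracking `nextStart` instead of starting each chunk at `currentTime` is what avoids gaps and overlaps when chunks arrive faster or slower than they play.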

Notes

Use headphones if you still hear echo despite browser AEC; the bot's voice leaking back into the open mic is the most common cause of feedback loops.
The engine's session has an inactivity timeout (session.inactivity_timeout_sec in config.json). If the bot stops responding after a long silence, the session has probably timed out; click Disconnect, then Connect to start a new one.