# Webpage Example — Realtime Voice Chat

A self-contained browser client for the engine's product websocket
(`/ws-product`, protocol `va.ws.v1`).

## Features

- **Connect / Disconnect** to any `ws://` or `wss://` URL.
- **Microphone selector + mic on/off toggle** — available input devices
  are listed with `enumerateDevices`, and `getUserMedia` is called with
  `echoCancellation`, `noiseSuppression`, and `autoGainControl` enabled so
  the browser applies acoustic echo cancellation (AEC) to the bot's voice.
- **Text composer** — type a message and press <kbd>Enter</kbd> to send
  an `input.text` event (Shift+Enter for newline). Sending interrupts
  any in-flight bot audio so the next reply is heard cleanly.
- **Chat history** built from three sources: `input.transcript.final`
  (your speech), streamed `response.text.delta` / `response.text.final`
  (the assistant; deltas arrive ahead of the synthesized audio), and a
  local entry for text you type, since the engine doesn't echo text
  input back as a transcript.
- **WebSocket log** panel for connection state and compact send/receive
  events. Audio chunks are summarized so they do not flood the log.
- **Gapless TTS playback** by scheduling each `response.audio.delta`
  chunk back-to-back on the AudioContext.
- **Live VU meter** + mic and bot activity indicators.
- **Clear** button to reset history.
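
The gapless playback item in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `app.js`: `decodePcm16Base64` and `GaplessPlayer` are hypothetical names, and little-endian sample order is assumed. Only standard Web Audio calls are used (`createBuffer`, `createBufferSource`, `start`, `stop`).

```javascript
// Decode a base64 string of PCM16 samples into Float32 samples in [-1, 1).
// Assumption: samples are little-endian.
function decodePcm16Base64(b64) {
  const bin = atob(b64);
  const bytes = new Uint8Array(bin.length);
  for (let i = 0; i < bin.length; i++) bytes[i] = bin.charCodeAt(i);
  const view = new DataView(bytes.buffer);
  const out = new Float32Array(bytes.length >> 1);
  for (let i = 0; i < out.length; i++) {
    out[i] = view.getInt16(i * 2, true) / 0x8000;
  }
  return out;
}

// Hypothetical helper: schedules each chunk to start exactly where the
// previous one ends, so there is no gap or overlap between chunks.
class GaplessPlayer {
  constructor(ctx) {
    this.ctx = ctx;          // an AudioContext
    this.nextStart = 0;      // context time at which the next chunk starts
    this.sources = new Set();
  }

  enqueue(samples, sampleRate = 16000) {
    const buf = this.ctx.createBuffer(1, samples.length, sampleRate);
    buf.copyToChannel(samples, 0);
    const src = this.ctx.createBufferSource();
    src.buffer = buf;
    src.connect(this.ctx.destination);
    // If playback has fallen behind (or this is the first chunk), start now.
    const startAt = Math.max(this.ctx.currentTime, this.nextStart);
    src.start(startAt);
    this.nextStart = startAt + buf.duration;
    this.sources.add(src);
    src.onended = () => this.sources.delete(src);
    return startAt;
  }

  // Drop everything that is playing or queued, e.g. on interruption.
  stopAll() {
    for (const src of this.sources) src.stop();
    this.sources.clear();
    this.nextStart = 0;
  }
}
```

In a browser this would be driven by something like `player.enqueue(decodePcm16Base64(b64))` for each `response.audio.delta`, where `b64` stands for the chunk's base64 payload, with `stopAll()` called when a new text message interrupts playback.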

No build step, no dependencies — just three files plus an AudioWorklet.

## Layout

```text
examples/webpage/
├── index.html
├── styles.css
├── app.js
└── pcm-recorder.worklet.js
```

## Run

1. Start the engine (default port `8000`):

   ```bash
   cd AI-VideoAssistant-engine-v5-pipecat-minimal
   source .venv/bin/activate
   export OPENAI_API_KEY=...
   uvicorn engine.main:app --host 127.0.0.1 --port 8000
   ```

2. Open the demo page served by the same process:

   ```text
   http://127.0.0.1:8000/demo/
   ```

   The default websocket URL is derived from the page host
   (`ws://127.0.0.1:8000/ws-product`). Click **Connect**, pick a
   microphone if needed, click **Enable mic**, and start speaking.

   The mount path and whether the page is served at all are controlled
   in `config.json`:

   ```json
   "server": {
     "serve_webpage": true,
     "webpage_mount": "/demo"
   }
   ```

   Set `"serve_webpage": false` in production if you serve the UI elsewhere.
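
As a rough illustration of how a default websocket URL can be derived from the page host, here is a minimal sketch. `defaultWsUrl` is a hypothetical helper name and not necessarily how `app.js` does it; the `wss:` branch reflects the general browser rule that HTTPS pages cannot open plain `ws://` connections.

```javascript
// Hypothetical helper: build the websocket URL from a Location-like object.
function defaultWsUrl(loc, path = "/ws-product") {
  // Pages served over HTTPS must use wss:// (mixed content is blocked).
  const scheme = loc.protocol === "https:" ? "wss:" : "ws:";
  // `host` includes the port when one is present, e.g. "127.0.0.1:8000".
  return `${scheme}//${loc.host}${path}`;
}
```

In the browser this would be called as `defaultWsUrl(window.location)`.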

### Standalone static server (optional)

You can also serve the files from another port for UI-only iteration
(add that origin to `server.cors_origins` in `config.json` if needed):

```bash
cd AI-VideoAssistant-engine-v5-pipecat-minimal/examples/webpage
python -m http.server 8080
```

Then open <http://localhost:8080> and point the URL field at
`ws://127.0.0.1:8000/ws-product`.

> The browser's mic API requires a secure context. `http://localhost`
> qualifies; if you serve from another host, use HTTPS and a `wss://`
> URL.
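
The secure-context requirement and the microphone constraints from the feature list can be checked and assembled along these lines. This is a sketch with hypothetical helper names (`micUnavailableReason`, `buildMicConstraints`, `openMic`); `isSecureContext`, `navigator.mediaDevices.getUserMedia`, and the constraint names are standard browser APIs.

```javascript
// Returns a short reason string when the mic cannot be used, else null.
function micUnavailableReason(env) {
  if (!env.isSecureContext) return "insecure-context";
  if (!env.navigator?.mediaDevices?.getUserMedia) return "no-media-devices";
  return null;
}

// Build getUserMedia constraints; deviceId is optional (default device if omitted).
function buildMicConstraints(deviceId) {
  return {
    audio: {
      ...(deviceId ? { deviceId: { exact: deviceId } } : {}),
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: true,
    },
  };
}

// Resolves to a MediaStream, or rejects with the reason from the check above.
async function openMic(deviceId, env = globalThis) {
  const reason = micUnavailableReason(env);
  if (reason) throw new Error(reason);
  return env.navigator.mediaDevices.getUserMedia(buildMicConstraints(deviceId));
}
```

Passing `env` explicitly keeps the check easy to exercise outside a browser; in the page itself the default `globalThis` is the `window` object.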

## Audio details

- Input: mono Float32 from `getUserMedia` is resampled in the
  AudioWorklet to PCM16 mono @ 16 kHz, framed into 20 ms chunks
  (320 samples, 640 bytes), and sent as **binary** websocket messages
  (the server accepts either binary or the JSON+base64 form).
- Output: each `response.audio.delta` carries base64-encoded PCM16 @
  16 kHz; chunks are decoded and scheduled back-to-back through Web
  Audio. The browser handles resampling to the device rate.

## Notes

- Use headphones if you still hear echo despite browser AEC; the bot's
  voice leaking back into the open mic is the most common cause of
  feedback loops.
- The engine's session has an inactivity timeout
  (`session.inactivity_timeout_sec` in `config.json`). If the bot stops
  responding after a long silence, the session has probably timed out;
  reconnect.