Storytelling Chatbot
This example shows how to build a voice-driven interactive storytelling experience. It periodically prompts the user for input, in the style of a 'choose your own adventure' story.
We use Gemini 2.0 for creating the story and image prompts, and we add visual elements to the story by generating images using Google's Imagen.
It uses the following AI services:
Deepgram - Speech-to-Text
Transcribes inbound participant voice media to text.
Google Gemini 2.0 - LLM
Our creative writer LLM. You can see the context used to prompt it here
ElevenLabs - Text-to-Speech
Converts and streams the LLM response from text to audio
Google Imagen - Image Generation
Adds pictures to our story. Prompting is quite key for style consistency, so we task the LLM to turn each story page into a short image prompt.
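As a rough illustration of the image-prompt step described above, the sketch below builds an instruction that could be sent to the LLM for each story page. The function name `build_image_prompt_request`, the `STYLE_SUFFIX` constant, and the wording of the instruction are all assumptions made for illustration; they are not the context this demo actually uses.

```python
# Hypothetical helper: the names and the instruction wording are illustrative,
# not taken from the demo's source.
STYLE_SUFFIX = "children's storybook illustration, soft watercolor, warm colors"


def build_image_prompt_request(story_page: str, max_words: int = 30) -> str:
    """Return an instruction asking the LLM to condense a story page into an image prompt."""
    page = " ".join(story_page.split())  # collapse newlines and repeated spaces
    return (
        f"Summarize the following story page as an image prompt of at most "
        f"{max_words} words. Describe only what is visible in the scene.\n"
        f"Story page: {page}\n"
        f"Always end the prompt with this style: {STYLE_SUFFIX}"
    )


if __name__ == "__main__":
    print(build_image_prompt_request("The fox crept\ninto the moonlit garden."))
```

Appending one fixed style description to every request is one simple way to keep the generated images visually consistent from page to page.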
Setup
Client
- Navigate to the client directory:
  cd client
- Install dependencies:
  npm install
- Build the client:
  npm run build
Server
- Navigate to the server directory:
  cd ../server
- Set up your virtual environment and install requirements:
  python3 -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt
- Create environment file and set variables:
  mv env.example .env
  You'll need API keys for:
  - DAILY_API_KEY
  - ELEVENLABS_API_KEY
  - ELEVENLABS_VOICE_ID
  - GOOGLE_API_KEY
- (Optional) Deployment:
  When deploying to production, to ensure only this app can spawn new bot processes, set your ENV to production.
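Before starting the server, it can help to confirm that the variables listed above are actually set. The following is a minimal sketch, not part of the demo: the helper name `missing_env_vars` is hypothetical, and it only inspects the process environment, so it assumes the values from .env have already been loaded (for example by exporting them in your shell).

```python
import os

# Variable names come from the list above; the helper itself is hypothetical.
REQUIRED_VARS = (
    "DAILY_API_KEY",
    "ELEVENLABS_API_KEY",
    "ELEVENLABS_VOICE_ID",
    "GOOGLE_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```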
Run it locally
- Navigate back to the demo's root directory:
  cd ..
- Run the application:
  python server/bot_runner.py --host localhost
  You can run with a custom domain or port using:
  python server/bot_runner.py --host somehost --p someport
➡️ Open the host URL in your browser: http://localhost:7860
Improvements to make
- Wait for track_started event to avoid rushed intro
- Show 5 minute timer on the UI
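The first improvement could be approached by gating the intro on a signal that is set when the track-started event arrives. The sketch below shows only that waiting pattern with plain `asyncio`; the function name `play_intro_when_ready`, the timeout fallback, and the simulated event are assumptions, and wiring the signal to the real transport event is left out.

```python
import asyncio


async def play_intro_when_ready(track_started: asyncio.Event, timeout: float = 10.0) -> str:
    """Wait for the track-started signal, then begin the intro; fall back after a timeout."""
    try:
        await asyncio.wait_for(track_started.wait(), timeout=timeout)
        return "intro started after track_started"
    except asyncio.TimeoutError:
        # Assumed fallback: start anyway so the session never hangs silently.
        return "intro started after timeout"


async def main() -> None:
    track_started = asyncio.Event()
    # Simulate the event arriving shortly after the participant joins.
    asyncio.get_running_loop().call_later(0.05, track_started.set)
    print(await play_intro_when_ready(track_started))


if __name__ == "__main__":
    asyncio.run(main())
```

In a real integration, the handler for the track-started event would call `track_started.set()` in place of the simulated timer.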