Pipecat Foundational Examples

This directory contains examples showing how to build voice and multimodal agents with Pipecat. Each example demonstrates specific features, progressing from basic to advanced concepts.

Learning Paths

Depending on what you're trying to build, these learning paths will guide you through relevant examples:

  • New to Pipecat: Start with examples 01, 02, 07
  • Building conversational bots: 07, 10, 38
  • Common add-on capabilities: 17, 24, 28, 34
  • Adding visual capabilities: 03, 12, 26
  • Advanced agent capabilities: 14, 20, 37

Quick Start

  1. Set up a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Create a .env file with your API keys.

  4. Run any example:

    python 01-say-one-thing.py
    
  5. Open the web interface at http://localhost:7860 and click "Connect"
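Step 3 does not spell out the file format. As a rough sketch, a .env file is a list of KEY=VALUE lines, and the snippet below shows one way such a file could be read into a dictionary using only the standard library. The `parse_env` helper and the `SOME_SERVICE_API_KEY` name are hypothetical. `DAILY_API_KEY` is the only key named in this README, and the examples themselves may load the file differently.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


# Hypothetical .env contents; replace with the keys your example needs.
sample = """
# API keys
DAILY_API_KEY=your-daily-key
SOME_SERVICE_API_KEY="your-other-key"
"""

print(parse_env(sample))
```

Which keys are required depends on the services each example uses, so check the example's source before filling in the file.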

Running examples with other transports

Most examples can also run with other transports, such as Twilio or Daily.

Daily

Create a Daily account at https://dashboard.daily.co/u/signup. Once signed up, either create a room from the dashboard and set the DAILY_SAMPLE_ROOM_URL and DAILY_API_KEY environment variables, or let the example create a room for you (this still requires DAILY_API_KEY). Then start any example with -t daily:

python 07-interruptible.py -t daily
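Before starting an example with the Daily transport, it can help to confirm that the environment variables described above are set. The following is a minimal sketch of such a check. The `missing_daily_vars` helper and its `use_existing_room` flag are hypothetical and not part of Pipecat; only the two variable names come from this README.

```python
import os


def missing_daily_vars(env, use_existing_room=True):
    """Return the names of the Daily variables that are unset or empty."""
    # DAILY_API_KEY is needed in both setups described above.
    required = ["DAILY_API_KEY"]
    # The room URL is only needed when you created the room yourself.
    if use_existing_room:
        required.append("DAILY_SAMPLE_ROOM_URL")
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_daily_vars(os.environ)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Daily environment variables are set.")
```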

Twilio

You can also run an example through a Twilio phone number. You will need to set up a few things:

  1. Install and run ngrok:

    ngrok http 7860

  2. Configure your Twilio phone number. One way is to set up a TwiML app and set its request URL to the ngrok URL from step 1. Then set your phone number to use the new TwiML app.

Then run the example, passing the ngrok host name without the protocol (no https:// prefix):

python 07-interruptible.py -t twilio -x NGROK_HOST_NAME
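Since ngrok reports a full URL while the command above expects only the host name, a small helper can strip the protocol for you. This is an illustrative sketch: `ngrok_host_name` is a hypothetical function, and `example.ngrok.app` is a placeholder rather than a real tunnel address.

```python
from urllib.parse import urlparse


def ngrok_host_name(url: str) -> str:
    """Return only the host part of an ngrok URL, without the protocol."""
    # urlparse only fills in netloc when the input contains "//",
    # so add it when a bare host name is passed in.
    if "://" not in url:
        url = "//" + url
    return urlparse(url).netloc


# Placeholder URL; substitute the forwarding URL that ngrok prints.
print(ngrok_host_name("https://example.ngrok.app/"))  # example.ngrok.app
```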

Examples by Feature

Basics

Conversational AI

Common Utilities

Advanced LLM Features

Media Handling

Vision & Multimodal

Voice & Language

Integration Examples

Performance & Optimization

Utilities

Advanced Usage

Customizing Network Settings

python <example-name> --host 0.0.0.0 --port 8080
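To illustrate what these flags control, here is a hypothetical sketch of how a script could accept them with `argparse`. It is not the examples' actual argument handling, which this README does not show. The defaults of localhost and 7860 are assumptions based on the Quick Start URL.

```python
import argparse


def parse_network_args(argv):
    """Parse hypothetical --host and --port options."""
    parser = argparse.ArgumentParser(description="Hypothetical network options")
    # Defaults assumed from the Quick Start URL http://localhost:7860.
    parser.add_argument("--host", default="localhost", help="address to bind to")
    parser.add_argument("--port", type=int, default=7860, help="port to listen on")
    return parser.parse_args(argv)


args = parse_network_args(["--host", "0.0.0.0", "--port", "8080"])
print(f"Serving on http://{args.host}:{args.port}")
```

Binding to 0.0.0.0 listens on all network interfaces, which makes the server reachable from other machines on the network, so use it only where that is intended.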

Troubleshooting

  • No audio/video: Check browser permissions for microphone and camera
  • Connection errors: Verify API keys in .env file
  • Missing dependencies: Run pip install -r requirements.txt
  • Port conflicts: Use --port to change the port
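To confirm a port conflict before picking a new port, you can try binding to the port yourself. The sketch below is one simple way to do that with the standard library. `port_is_free` is a hypothetical helper, and the check is only a snapshot, since another process could take the port right afterwards.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, most likely because the port is already in use.
            return False
    return True


# 7860 is the default port mentioned in the Quick Start.
print("Port 7860 free:", port_is_free(7860))
```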

For more examples, visit our GitHub repository.