Files

Paul Kompfner f3a4b416df Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.

Removing `VisionImageRawFrame` lets us simplify LLM services' logic, getting us closer to the idealized architecture where all they care about is handling context frames.

This change is in service of getting us closer to ready to deprecate usage of `OpenAILLMContext` and subclasses in favor of the universal `LLMContext`, at least for the traditional text-to-text LLMs.

Why remove `VisionImageRawFrame` rather than deprecate? It's "internal"—only created by `VisionImageFrameAggregator`—and never intended to be used directly by users (it would be difficult to use directly anyway).

Move the logic that was once in `VisionImageFrameAggregator` directly into the examples. Reasoning:
- If `UserImageRequester` is defined in the examples, it makes sense for `UserImageProcessor` to be too, as it’s the flip side of the same coin, so to speak
- The logic is now pretty trivial
- This kind of one-shot, history-less image-describing pipeline shouldn't be common at all; it's ok for it to live in examples rather than as a dedicated class
- In the short term, this enables us to create `LLMContext`s for services that support it and `OpenAILLMContext`s for services that don't yet (AWS)

This commit also adds missing translation from OpenAI-format image context messages to AWS format. Note that this isn't a wasted effort in the face of the upcoming migration to universal `LLMContext`—this work will be reused as it has to be implemented there too.
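As a rough illustration of the translation mentioned above, the sketch below converts an OpenAI-format message containing a base64 image into a content-block shape like the one used by the Amazon Bedrock Converse API. It is not the code from this commit: the function names `openai_part_to_aws` and `openai_message_to_aws` are hypothetical, only base64 data URLs are handled, and the target shape is an assumption about what "AWS format" refers to here.

```python
import base64
import re

# Matches data URLs such as "data:image/jpeg;base64,<payload>".
_DATA_URL = re.compile(
    r"^data:image/(?P<fmt>[A-Za-z0-9.+-]+);base64,(?P<data>.+)$", re.DOTALL
)


def openai_part_to_aws(part: dict) -> dict:
    """Convert one OpenAI-format content part to a Converse-style block (hypothetical helper)."""
    if part["type"] == "text":
        return {"text": part["text"]}
    if part["type"] == "image_url":
        match = _DATA_URL.match(part["image_url"]["url"])
        if match is None:
            raise ValueError("only base64 data URLs are handled in this sketch")
        return {
            "image": {
                "format": match.group("fmt"),
                # The target format carries raw bytes rather than a base64 string.
                "source": {"bytes": base64.b64decode(match.group("data"))},
            }
        }
    raise ValueError(f"unsupported content part type: {part['type']}")


def openai_message_to_aws(message: dict) -> dict:
    """Convert a whole OpenAI-format message, accepting string or list content."""
    content = message["content"]
    if isinstance(content, str):
        content = [{"type": "text", "text": content}]
    return {
        "role": message["role"],
        "content": [openai_part_to_aws(part) for part in content],
    }
```

A message with one text part and one `image_url` part comes out as a `content` list holding a `{"text": ...}` block followed by an `{"image": ...}` block.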

2025-09-08 17:00:08 -04:00

assets

examples(foundational): support multiple transports

2025-05-27 17:42:52 -07:00

01-say-one-thing-piper.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

01-say-one-thing-rime.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

01-say-one-thing.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

01a-local-audio.py

examples: use new services packages

2025-03-30 16:21:00 -07:00

01b-livekit-audio.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

01c-fastpitch.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

02-llm-say-one-thing.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

03-still-frame.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

03a-local-still-frame.py

examples: update camera_* with video_*

2025-04-24 17:14:18 -07:00

03b-still-frame-imagen.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

04-transports-small-webrtc.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

04a-transports-daily.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

04b-transports-livekit.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

05-sync-speech-and-image.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

05a-local-sync-speech-and-image.py

Progress on updating foundational examples to avoid using the newly-deprecated LLMMessagesFrame.

2025-08-07 14:43:37 -04:00

06-listen-and-respond.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

06a-image-sync.py

fix: Specify frame direction in 06a push_frame

2025-09-03 15:07:05 -04:00

07-interruptible-cartesia-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07-interruptible.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07a-interruptible-speechmatics-vad.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07a-interruptible-speechmatics.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07aa-interruptible-soniox.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07ab-interruptible-inworld-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07ac-interruptible-asyncai-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07ac-interruptible-asyncai.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07ad-interruptible-aicoustics.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07b-interruptible-langchain.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07c-interruptible-deepgram-vad.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07c-interruptible-deepgram.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07d-interruptible-elevenlabs-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07d-interruptible-elevenlabs.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07e-interruptible-playht-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07e-interruptible-playht.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07f-interruptible-azure.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07g-interruptible-openai.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07h-interruptible-openpipe.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07i-interruptible-xtts.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07j-interruptible-gladia.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07k-interruptible-lmnt.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07l-interruptible-groq.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07m-interruptible-aws.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07n-interruptible-gemini.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07n-interruptible-google.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07o-interruptible-assemblyai.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07p-interruptible-krisp.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07q-interruptible-rime-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07q-interruptible-rime.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07r-interruptible-riva-nim.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07s-interruptible-google-audio-in.py

example 07s: minor typo updates

2025-09-03 12:11:07 -05:00

07t-interruptible-fish.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07u-interruptible-ultravox.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07v-interruptible-neuphonic-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07v-interruptible-neuphonic.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07w-interruptible-fal.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07x-interruptible-local.py

Add LLMRunFrame to trigger an LLM response, replacing context_aggregator.user().get_context_frame()

2025-08-28 09:53:33 -04:00

07y-interruptible-minimax.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07z-interruptible-sarvam-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

07z-interruptible-sarvam.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

08-bots-arguing.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

09-mirror.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

09a-local-mirror.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

10-wake-phrase.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

11-sound-effects.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

12-describe-video.py

Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.

2025-09-08 17:00:08 -04:00

12a-describe-video-gemini-flash.py

Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.

2025-09-08 17:00:08 -04:00

12b-describe-video-gpt-4o.py

Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.

2025-09-08 17:00:08 -04:00

12c-describe-video-anthropic.py

Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.

2025-09-08 17:00:08 -04:00

12d-describe-video-aws.py

Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.

2025-09-08 17:00:08 -04:00

13-whisper-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13a-whisper-local.py

examples: remove vad_enabled=True

2025-04-24 17:14:18 -07:00

13b-deepgram-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13c-gladia-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13c-gladia-translation.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13d-assemblyai-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13e-whisper-mlx.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13f-cartesia-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13g-sambanova-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13h-speechmatics-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13i-soniox-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

13j-azure-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14-function-calling.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14a-function-calling-anthropic.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14b-function-calling-anthropic-video.py

Rename Anthropic enable_prompt_caching_beta parameter to just enable_prompt_caching

2025-09-04 13:03:06 -04:00

14c-function-calling-together.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14d-function-calling-video.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14e-function-calling-google.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14f-function-calling-groq.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14g-function-calling-grok.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14h-function-calling-azure.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14i-function-calling-fireworks.py

Merge pull request #2565 from pipecat-ai/aleix/reorganize-transports

2025-09-03 08:52:49 -07:00

14j-function-calling-nim.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14k-function-calling-cerebras.py

Add 14k (CerebrasLLMService) to release evals

2025-09-03 17:11:38 -04:00

14l-function-calling-deepseek.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14m-function-calling-openrouter.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14n-function-calling-perplexity.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14o-function-calling-gemini-openai-format.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14p-function-calling-gemini-vertex-ai.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14q-function-calling-qwen.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14r-function-calling-aws.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14s-function-calling-sambanova.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14t-function-calling-direct.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14u-function-calling-ollama.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14v-function-calling-openai.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14w-function-calling-mistral.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14x-function-calling-universal-context.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14y-function-calling-google-universal-context.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

14z-function-calling-anthropic-universal-context.py

Rename Anthropic enable_prompt_caching_beta parameter to just enable_prompt_caching

2025-09-04 13:03:06 -04:00

15-switch-voices.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

15a-switch-languages.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

16-gpu-container-local-bot.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

17-detect-user-idle.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

18-gstreamer-filesrc.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

18a-gstreamer-videotestsrc.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

19-openai-realtime-beta.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

19-openai-realtime.py

Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)

2025-09-07 09:09:57 -04:00

19a-azure-realtime-beta.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

19a-azure-realtime.py

Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)

2025-09-07 09:09:57 -04:00

19b-openai-realtime-beta-text.py

Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)

2025-09-07 09:09:57 -04:00

19b-openai-realtime-text.py

Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)

2025-09-07 09:09:57 -04:00

20a-persistent-context-openai.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

20b-persistent-context-openai-realtime-beta.py

Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)

2025-09-07 09:09:57 -04:00

20b-persistent-context-openai-realtime.py

Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)

2025-09-07 09:09:57 -04:00

20c-persistent-context-anthropic.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

20d-persistent-context-gemini.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

20e-persistent-context-aws-nova-sonic.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

21-tavus-transport.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

21a-tavus-video-service.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

22-natural-conversation.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

22b-natural-conversation-proposal.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

22c-natural-conversation-mixed-llms.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

22d-natural-conversation-gemini-audio.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

23-bot-background-sound.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

24-stt-mute-filter.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

25-google-audio-in.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

26-gemini-multimodal-live.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

26a-gemini-multimodal-live-transcription.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

26b-gemini-multimodal-live-function-calling.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

26c-gemini-multimodal-live-video.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

26d-gemini-multimodal-live-text.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

26e-gemini-multimodal-google-search.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

26f-gemini-multimodal-live-files-api.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

26g-gemini-multimodal-live-groundingMetadata.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

27-simli-layer.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

28-transcription-processor.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

29-turn-tracking-observer.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

30-observer.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

31-heartbeats.py

Updating foundation examples to use SmallWebRTCTransport and pipecat-ai-small-webrtc-prebuilt (#1534)

2025-04-11 19:44:16 -04:00

32-gemini-grounding-metadata.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

33-gemini-rag.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

34-audio-recording.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

35-pattern-pair-voice-switching.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

36-user-email-gathering.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

37-mem0.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

38-smart-turn-fal.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

38a-smart-turn-local-coreml.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

38b-smart-turn-local.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

39-mcp-stdio.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

39a-mcp-run-sse.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

39b-multiple-mcp.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

39c-mcp-run-http.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

40-aws-nova-sonic.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

41a-text-only-webrtc.py

Add LLMRunFrame to trigger an LLM response, replacing context_aggregator.user().get_context_frame()

2025-08-28 09:53:33 -04:00

41b-text-and-audio-webrtc.py

Add LLMRunFrame to trigger an LLM response, replacing context_aggregator.user().get_context_frame()

2025-08-28 09:53:33 -04:00

42-interruption-config.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

43a-heygen-video-service.py

Improving HeyGen example video quality.

2025-09-05 11:30:01 -03:00

44-voicemail-detection.py

transports: reorganize module

2025-09-02 17:31:39 -07:00

README.md

Improve the foundational example README

2025-08-14 11:29:04 -04:00

README.md

Pipecat Foundational Examples

This directory contains examples showing how to build voice and multimodal agents with Pipecat. Each example demonstrates specific features, progressing from basic to advanced concepts.

Setup

Follow the README steps to get your local environment configured.

Run from root directory: Make sure you are running the steps from the root directory.

Using local audio?: The LocalAudioTransport requires a system dependency for portaudio. Install the dependency to use the transport.
Copy the env.example file and add API keys for services you plan to use:
```
cp env.example .env
# Edit .env with your API keys
```
Navigate to the examples directory if you aren't already there:
```
cd examples/foundational
```
Run any example:
```
uv run python 01-say-one-thing.py
```
Open the web interface at http://localhost:7860/client/ and click "Connect"
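
If an example fails at startup, a missing API key is a likely cause. The sketch below is one way to check a `.env` file before running anything. It uses only the standard library, the helper names `read_env_file` and `missing_keys` are hypothetical, and it handles only simple `KEY=VALUE` lines, which is an assumption about how your `.env` is written.

```python
from pathlib import Path


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]
```

For example, `missing_keys(read_env_file(".env"), ["DAILY_API_KEY"])` returns an empty list when the Daily key is set, and `["DAILY_API_KEY"]` otherwise.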

Running examples with other transports

Most examples support running with other transports, like Twilio or Daily.

Daily

You need to create a Daily account at https://dashboard.daily.co/u/signup. Once signed up, you can create your own room from the dashboard and set the environment variables DAILY_SAMPLE_ROOM_URL and DAILY_API_KEY. Alternatively, you can let the example create a room for you (this still requires the DAILY_API_KEY environment variable). Then, start any example with -t daily:

uv run 07-interruptible.py -t daily

Twilio

It is also possible to run the example through a Twilio phone number. You will need to set up a few things:

1. Install and run ngrok.

ngrok http 7860

2. Configure your Twilio phone number. One way is to set up a TwiML app and set the request URL to the ngrok URL from step (1). Then, set your phone number to use the new TwiML app.

Then, run the example with:

uv run 07-interruptible.py -t twilio -x NGROK_HOST_NAME

Examples by Feature

Basics

01-say-one-thing.py: Most basic bot that says one phrase and exits (Transport, TTS, Event handlers)
02-llm-say-one-thing.py: Bot generates a response with an LLM (LLM initialization)
03-still-frame.py: Displays a static image (Video transport, Image service)
04-transports-small-webrtc.py, 04a-transports-daily.py, 04b-transports-livekit.py: Different transport options (WebRTC, Daily, LiveKit)

Conversational AI

07-interruptible.py: Basic voice assistant bot (STT, TTS, LLM, Interruptible speech)
10-wake-phrase.py: Bot activated by wake phrase (WakeCheckFilter)
22-natural-conversation.py: Smart turn detection (Multiple LLMs, Turn management)
38-smart-turn-fal.py: ML-based turn detection (Fal service, Local models)

Common Utilities

17-detect-user-idle.py: Handle inactive users (UserIdleProcessor)
24-stt-mute-filter.py: Selectively mute user input (STTMuteFilter)
28-transcription-processor.py: Record conversation text (TranscriptProcessor)
30-observer.py: Access frame data (Custom observers)
31-heartbeats.py: Detect idle pipelines (Pipeline monitoring)
34-audio-recording.py: Record conversation audio (Composite and track-level recording)

Advanced LLM Features

14-function-calling.py: Bot with tool usage (Function schemas, Tool registration)
20a-persistent-context-openai.py: Persistent conversation context (Memory management)
32-gemini-grounding-metadata.py: Web search capabilities (Google search integration)
33-gemini-rag.py: Retrieval-augmented generation (Data sources, Grounding)
37-mem0.py: Long-term agent memory (Mem0 service integration)

Media Handling

05-sync-speech-and-image.py: Synchronized narration with images (Custom processors, SyncParallelPipeline)
06a-image-sync.py: Dynamic image updates while speaking (Synchronized A/V pipelines)
09-mirror.py: Mirror user's audio and video (Custom frame processors)
11-sound-effects.py: Add sounds when bot speaks (Sound playback, Event synchronization)
23-bot-background-sound.py: Play background audio (SoundfileMixer)

Vision & Multimodal

12a-describe-video-gemini-flash.py: Bot describes user's video (Video input, Multimodal LLMs)
26c-gemini-multimodal-live-video.py: Gemini with video input (Streaming video, Function calls)

Voice & Language

13-whisper-transcription.py: Speech transcription demo (STT providers, Real-time transcription)
15-switch-voices.py: Dynamic voice/language changing (ParallelPipelines, FunctionFilters)
25-google-audio-in.py: Gemini for speech recognition (Alternative transcription)
35-pattern-pair-voice-switching.py: Dynamic TTS voice switching (XML parsing, PatternPairAggregator)
36-user-email-gathering.py: Spelling mode for TTS (Confirmation patterns, XML tags)

Integration Examples

18-gstreamer-filesrc.py: GStreamer video streaming (Video processing)
19-openai-realtime-beta.py: OpenAI Speech-to-Speech (Direct S2S, Function calls)
21-tavus-transport.py: Tavus digital twin (Avatar integration)
27-simli-layer.py: Simli avatar integration (Video synchronization)

Performance & Optimization

16-gpu-container-local-bot.py: GPU-accelerated local bot (Performance measurement)

Advanced Usage

Customizing Network Settings

uv run python <example-name> --host 0.0.0.0 --port 8080
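
To make the command-line options used in this README easier to reason about, here is a minimal `argparse` sketch that accepts the same flags (`--host`, `--port`, `-t`, `-x`). It is not the runner the examples actually use. The `dest` names, the `localhost` default host, and the `webrtc` default transport name are assumptions made for illustration; the default port follows the http://localhost:7860 URL mentioned above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in this README (illustrative only)."""
    parser = argparse.ArgumentParser(description="Run a foundational example (sketch)")
    parser.add_argument("--host", default="localhost", help="address to bind to")
    parser.add_argument("--port", type=int, default=7860, help="port to listen on")
    # "webrtc" as the default transport name is an assumption for this sketch.
    parser.add_argument(
        "-t",
        dest="transport",
        default="webrtc",
        choices=["webrtc", "daily", "twilio"],
        help="transport to run the example with",
    )
    # Public host name for telephony callbacks, for example the ngrok host.
    parser.add_argument("-x", dest="proxy_host", default=None, help="public host name")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Parsing `--host 0.0.0.0 --port 8080` with this parser gives `host="0.0.0.0"` and `port=8080` and leaves the transport at its default.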

Troubleshooting

No audio/video: Check browser permissions for microphone and camera
Connection errors: Verify API keys in .env file
Port conflicts: Use --port to change the port
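
For port conflicts, a quick standard-library check can tell you whether the default port is already taken before you pick a value for --port. This is a sketch: `port_is_free` and `first_free_port` are hypothetical helper names, and the check only covers TCP on the given host at the moment it runs.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def first_free_port(start: int = 7860, attempts: int = 20) -> int:
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")


if __name__ == "__main__":
    print(first_free_port())
```

Pass the printed value to --port when the default port is busy.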

For more examples, visit the `pipecat-examples` repository.