Compare commits
1 Commits
main
...
vp-fix/mcp
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
a4b66aedc1 |
@@ -1 +0,0 @@
|
||||
../../.claude/skills/changelog
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/cleanup
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/code-review
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/docstring
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/pr-description
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/pr-submit
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/update-docs
|
||||
@@ -1,8 +1,3 @@
|
||||
---
|
||||
name: cleanup
|
||||
description: Review, refactor, document, and validate code changes in the current branch
|
||||
---
|
||||
|
||||
# Code Cleanup Skill
|
||||
|
||||
The **Code Cleanup Skill** reviews, refactors, and documents code changes in your current branch, ensuring alignment with **Pipecat's architecture, coding standards, and example patterns**.
|
||||
|
||||
@@ -1,91 +0,0 @@
|
||||
---
|
||||
name: squash-commits
|
||||
description: Reorganize messy branch commits into a small set of logical, meaningful commits without changing any content. Drops merge-from-main commits. Safe: creates a backup branch first.
|
||||
---
|
||||
|
||||
Reorganize the commits on the current branch into a small number of logical commits. Do NOT change any file content — only the commit structure changes.
|
||||
|
||||
## Instructions
|
||||
|
||||
### 1. Safety check
|
||||
|
||||
```bash
|
||||
git status --short
|
||||
```
|
||||
|
||||
If there are uncommitted changes, stop and tell the user to commit or stash them first.
|
||||
|
||||
### 2. Inspect the branch
|
||||
|
||||
```bash
|
||||
git log main..HEAD --oneline
|
||||
git diff main..HEAD --name-only
|
||||
```
|
||||
|
||||
List every file changed vs `main` and every commit on the branch (excluding merge commits from main).
|
||||
|
||||
### 3. Create a backup branch
|
||||
|
||||
```bash
|
||||
git branch backup/<current-branch-name>
|
||||
```
|
||||
|
||||
Tell the user the backup exists so they can recover if needed.
|
||||
|
||||
### 4. Soft-reset to main and unstage everything
|
||||
|
||||
```bash
|
||||
git reset --soft main
|
||||
git restore --staged .
|
||||
```
|
||||
|
||||
All branch changes are now in the working tree, unstaged. No content has changed.
|
||||
|
||||
### 5. Plan the logical groups
|
||||
|
||||
Read the changed files and the original commit messages to understand what the work covers. Group related files into logical commits. Typical groups:
|
||||
|
||||
- Core feature or fix (new source files + modified core files)
|
||||
- Secondary features or fixes (each as its own commit if distinct)
|
||||
- Refactoring or renames
|
||||
- Tests
|
||||
- Changelogs / docs
|
||||
|
||||
Use the changelog files (if any) as a strong hint — each changelog entry often maps to one commit.
|
||||
|
||||
Present the proposed grouping to the user and ask for confirmation before committing.
|
||||
|
||||
### 6. Commit in logical groups
|
||||
|
||||
For each group, stage only the relevant files and commit with a clear message following the project's conventions:
|
||||
|
||||
```bash
|
||||
git add <file1> <file2> ...
|
||||
git commit -m "..."
|
||||
```
|
||||
|
||||
Use conventional commit prefixes if the project uses them (`feat:`, `fix:`, `refactor:`, `test:`, `chore:`).
|
||||
|
||||
### 7. Verify
|
||||
|
||||
```bash
|
||||
git log main..HEAD --oneline
|
||||
git diff main..HEAD --name-only
|
||||
git status --short
|
||||
```
|
||||
|
||||
Confirm:
|
||||
- Commit count is small and each message is meaningful
|
||||
- The set of changed files vs `main` is identical to before
|
||||
- Working tree is clean
|
||||
|
||||
### 8. Remind about force-push
|
||||
|
||||
The branch history has been rewritten. Tell the user they will need to `git push --force-with-lease` when they are ready to update the remote. Do NOT push automatically.
|
||||
|
||||
## Rules
|
||||
|
||||
- Never change file contents. If you find yourself editing a file, stop.
|
||||
- Never skip the backup branch step.
|
||||
- Never force-push without explicit user instruction.
|
||||
- If any step fails or the result looks wrong, tell the user and suggest restoring from the backup: `git reset --hard backup/<branch-name>`.
|
||||
30
.dockerignore
Normal file
30
.dockerignore
Normal file
@@ -0,0 +1,30 @@
|
||||
# flyctl launch added from .gitignore
|
||||
**/.vscode
|
||||
**/env
|
||||
**/__pycache__
|
||||
**/*~
|
||||
**/venv
|
||||
#*#
|
||||
|
||||
# Distribution / packaging
|
||||
**/.Python
|
||||
**/build
|
||||
**/develop-eggs
|
||||
**/dist
|
||||
**/downloads
|
||||
**/eggs
|
||||
**/.eggs
|
||||
**/lib
|
||||
**/lib64
|
||||
**/parts
|
||||
**/sdist
|
||||
**/var
|
||||
**/wheels
|
||||
**/share/python-wheels
|
||||
**/*.egg-info
|
||||
**/.installed.cfg
|
||||
**/*.egg
|
||||
**/MANIFEST
|
||||
**/.DS_Store
|
||||
**/.env
|
||||
fly.toml
|
||||
1
.github/workflows/coverage.yaml
vendored
1
.github/workflows/coverage.yaml
vendored
@@ -42,7 +42,6 @@ jobs:
|
||||
--extra langchain \
|
||||
--extra livekit \
|
||||
--extra piper \
|
||||
--extra runner \
|
||||
--extra sagemaker \
|
||||
--extra tracing \
|
||||
--extra websocket
|
||||
|
||||
8
.github/workflows/format.yaml
vendored
8
.github/workflows/format.yaml
vendored
@@ -32,9 +32,7 @@ jobs:
|
||||
run: uv python install 3.12
|
||||
|
||||
- name: Install development dependencies
|
||||
# `--all-extras` (matching the dev setup in README.md) so pyright can
|
||||
# resolve types from various optional dependencies.
|
||||
run: uv sync --group dev --all-extras --no-extra gstreamer --no-extra local
|
||||
run: uv sync --group dev
|
||||
|
||||
- name: Ruff formatter
|
||||
id: ruff-format
|
||||
@@ -43,7 +41,3 @@ jobs:
|
||||
- name: Ruff linter (all rules)
|
||||
id: ruff-check
|
||||
run: uv run ruff check
|
||||
|
||||
- name: Type check (pyright)
|
||||
id: pyright
|
||||
run: uv run pyright
|
||||
|
||||
1
.github/workflows/tests.yaml
vendored
1
.github/workflows/tests.yaml
vendored
@@ -46,7 +46,6 @@ jobs:
|
||||
--extra langchain \
|
||||
--extra livekit \
|
||||
--extra piper \
|
||||
--extra runner \
|
||||
--extra sagemaker \
|
||||
--extra tracing \
|
||||
--extra websocket
|
||||
|
||||
1
.github/workflows/update-docs.yml
vendored
1
.github/workflows/update-docs.yml
vendored
@@ -114,7 +114,6 @@ jobs:
|
||||
GH_TOKEN=$DOCS_SYNC_TOKEN gh pr create \
|
||||
--repo pipecat-ai/docs \
|
||||
--label auto-docs \
|
||||
--label pipecat \
|
||||
--title "docs: update for pipecat PR #${{ steps.pr.outputs.number }}" \
|
||||
--body "$(cat <<'BODY'
|
||||
Automated documentation update for [pipecat PR #${{ steps.pr.outputs.number }}](https://github.com/pipecat-ai/pipecat/pull/${{ steps.pr.outputs.number }}).
|
||||
|
||||
@@ -11,7 +11,7 @@ build:
|
||||
jobs:
|
||||
post_install:
|
||||
- pip install uv
|
||||
- UV_PROJECT_ENVIRONMENT=$READTHEDOCS_VIRTUALENV_PATH uv sync --group docs --all-extras --no-extra gstreamer --no-extra local_smart_turn --no-extra moondream --no-extra mlx-whisper
|
||||
- UV_PROJECT_ENVIRONMENT=$READTHEDOCS_VIRTUALENV_PATH uv sync --group docs --all-extras --no-extra gstreamer --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
|
||||
|
||||
sphinx:
|
||||
configuration: docs/api/conf.py
|
||||
|
||||
174
AGENTS.md
174
AGENTS.md
@@ -1,174 +0,0 @@
|
||||
# AGENTS.md
|
||||
|
||||
This file provides guidance to AI coding agents when working with code in this repository.
|
||||
|
||||
## Project Overview
|
||||
|
||||
Pipecat is an open-source Python framework for building real-time voice and multimodal conversational AI agents. It orchestrates audio/video, AI services, transports, and conversation pipelines using a frame-based architecture.
|
||||
|
||||
## Common Commands
|
||||
|
||||
```bash
|
||||
# Setup development environment
|
||||
uv sync --group dev --all-extras --no-extra gstreamer --no-extra local
|
||||
|
||||
# Install pre-commit hooks
|
||||
uv run pre-commit install
|
||||
|
||||
# Run all tests
|
||||
uv run pytest
|
||||
|
||||
# Run a single test file
|
||||
uv run pytest tests/test_name.py
|
||||
|
||||
# Run a specific test
|
||||
uv run pytest tests/test_name.py::test_function_name
|
||||
|
||||
# Preview changelog
|
||||
uv run towncrier build --draft --version Unreleased
|
||||
|
||||
# Lint and format check
|
||||
uv run ruff check
|
||||
uv run ruff format --check
|
||||
|
||||
# Update dependencies (after editing pyproject.toml)
|
||||
uv lock && uv sync
|
||||
```
|
||||
|
||||
## Architecture
|
||||
|
||||
### Frame-Based Pipeline Processing
|
||||
|
||||
All data flows as **Frame** objects through a pipeline of **FrameProcessors**:
|
||||
|
||||
```
|
||||
[Processor1] → [Processor2] → ... → [ProcessorN]
|
||||
```
|
||||
|
||||
**Key components:**
|
||||
|
||||
- **Frames** (`src/pipecat/frames/frames.py`): Data units (audio, text, video) and control signals. Flow DOWNSTREAM (input→output) or UPSTREAM (acknowledgments/errors).
|
||||
|
||||
- **FrameProcessor** (`src/pipecat/processors/frame_processor.py`): Base processing unit. Each processor receives frames, processes them, and pushes results downstream.
|
||||
|
||||
- **Pipeline** (`src/pipecat/pipeline/pipeline.py`): Chains processors together.
|
||||
|
||||
- **ParallelPipeline** (`src/pipecat/pipeline/parallel_pipeline.py`): Runs multiple pipelines in parallel.
|
||||
|
||||
- **Transports** (`src/pipecat/transports/`): Transports are frame processors used for external I/O layer (Daily WebRTC, LiveKit WebRTC, WebSocket, Local). Abstract interface via `BaseTransport`, `BaseInputTransport` and `BaseOutputTransport`.
|
||||
|
||||
- **Pipeline Task (`src/pipecat/pipeline/task.py`)**: Runs and manages a pipeline. Pipeline tasks send the first frame, `StartFrame`, to the pipeline in order for processors to know they can start processing and pushing frames. Pipeline tasks internally create a pipeline with two additional processors, a source processor before the user-defined pipeline and a sink processor at the end. Those are used for multiple things: error handling, pipeline task level events, heartbeat monitoring, etc.
|
||||
|
||||
- **Pipeline Runner (`src/pipecat/pipeline/runner.py`)**: High-level entry point for executing pipeline tasks. Handles signal management (SIGINT/SIGTERM) for graceful shutdown and optional garbage collection. Run a single pipeline task with `await runner.run(task)` or multiple concurrently with `await asyncio.gather(runner.run(task1), runner.run(task2))`.
|
||||
|
||||
- **Services** (`src/pipecat/services/`): 60+ AI provider integrations (STT, TTS, LLM, etc.). Extend base classes: `AIService`, `LLMService`, `STTService`, `TTSService`, `VisionService`.
|
||||
|
||||
- **Serializers** (`src/pipecat/serializers/`): Convert frames to/from wire formats for WebSocket transports. `FrameSerializer` base class defines `serialize()` and `deserialize()`. Telephony serializers (Twilio, Plivo, Vonage, Telnyx, Exotel, Genesys) handle provider-specific protocols and audio encoding (e.g., μ-law).
|
||||
|
||||
- **RTVI** (`src/pipecat/processors/frameworks/rtvi.py`): Real-Time Voice Interface protocol bridging clients and the pipeline. `RTVIProcessor` handles incoming client messages (text input, audio, function call results). `RTVIObserver` converts pipeline frames to outgoing messages: user/bot speaking events, transcriptions, LLM/TTS lifecycle, function calls, metrics, and audio levels.
|
||||
|
||||
- **Observers** (`src/pipecat/observers/`): Monitor frame flow without modifying the pipeline. Passed to `PipelineTask` via the `observers` parameter. Implement `on_process_frame()` and `on_push_frame()` callbacks.
|
||||
|
||||
### Important Patterns
|
||||
|
||||
- **Context Aggregation**: `LLMContext` accumulates messages for LLM calls; `UserResponse` aggregates user input
|
||||
|
||||
- **Turn Management**: Turn management is done through `LLMUserAggregator` and
|
||||
`LLMAssistantAggregator`, created with `LLMContextAggregatorPair`
|
||||
|
||||
- **User turn strategies**: Detection of when the user starts and stops speaking is done via user turn start/stop strategies. They push `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` respectively.
|
||||
|
||||
- **Interruptions**: Interruptions are usually triggered by a user turn start strategy (e.g. `VADUserTurnStartStrategy`) but they can be triggered by other processors as well, in which case the user turn start strategies don't need to. An `InterruptionFrame` carries an optional `asyncio.Event` that is set when the frame reaches the pipeline sink. If a processor stops an `InterruptionFrame` from propagating downstream (i.e., doesn't push it), it **must** call `frame.complete()` to avoid stalling `push_interruption_task_frame_and_wait()` callers.
|
||||
|
||||
- **Uninterruptible Frames**: These are frames that will not be removed from internal queues even if there's an interruption. For example, `EndFrame` and `StopFrame`.
|
||||
|
||||
- **Events**: Most classes in Pipecat have `BaseObject` as the very base class. `BaseObject` has support for events. Events can run in the background in an async task (default) or synchronously (`sync=True`) if we want immediate action. Synchronous event handlers need to execute fast.
|
||||
|
||||
- **Async Task Management**: Always use `self.create_task(coroutine, name)` instead of raw `asyncio.create_task()`. The `TaskManager` automatically tracks tasks and cleans them up on processor shutdown. Use `await self.cancel_task(task, timeout)` for cancellation.
|
||||
|
||||
- **Error Handling**: Use `await self.push_error(msg, exception, fatal)` to push errors upstream. Services should use `fatal=False` (the default) so application code can handle errors and take action (e.g. switch to another service).
|
||||
|
||||
### Key Directories
|
||||
|
||||
| Directory | Purpose |
|
||||
| -------------------------- | -------------------------------------------------- |
|
||||
| `src/pipecat/frames/` | Frame definitions (100+ types) |
|
||||
| `src/pipecat/processors/` | FrameProcessor base + aggregators, filters, audio |
|
||||
| `src/pipecat/pipeline/` | Pipeline orchestration |
|
||||
| `src/pipecat/services/` | AI service integrations (60+ providers) |
|
||||
| `src/pipecat/transports/` | Transport layer (Daily, LiveKit, WebSocket, Local) |
|
||||
| `src/pipecat/serializers/` | Frame serialization for WebSocket protocols |
|
||||
| `src/pipecat/observers/` | Pipeline observers for monitoring frame flow |
|
||||
| `src/pipecat/audio/` | VAD, filters, mixers, turn detection, DTMF |
|
||||
| `src/pipecat/turns/` | User turn management |
|
||||
|
||||
## Code Style
|
||||
|
||||
- **Docstrings**: Google-style. Classes describe purpose; `__init__` has `Args:` section; dataclasses use `Parameters:` section.
|
||||
- **Deprecations**: Use the `.. deprecated:: <version>` Sphinx directive in docstrings (never inline tags like `[DEPRECATED]`), and pair it with a runtime `warnings.warn(..., DeprecationWarning)` at the call site. See `CONTRIBUTING.md` for full conventions.
|
||||
- **Linting**: Ruff (line length 100). Pre-commit hooks enforce formatting.
|
||||
- **Type hints**: Required for complex async code.
|
||||
- **Dataclass vs Pydantic**: Use `@dataclass` for frames and internal pipeline data (high-frequency, no validation needed). Use Pydantic `BaseModel` for configuration, parameters, metrics, and external API data (benefits from validation and serialization). Specifically:
|
||||
- `@dataclass`: Frame types, context aggregator pairs, internal data containers
|
||||
- `BaseModel`: Service `InputParams`, transport/VAD/turn params, metrics data, API request/response models, serializer params
|
||||
|
||||
### Docstring Example
|
||||
|
||||
```python
|
||||
class MyService(LLMService):
|
||||
"""Description of what the service does.
|
||||
|
||||
More detailed description.
|
||||
|
||||
Event handlers available:
|
||||
|
||||
- on_connected: Called when we are connected
|
||||
|
||||
Example::
|
||||
|
||||
@service.event_handler("on_connected")
|
||||
async def on_connected(service, frame):
|
||||
...
|
||||
"""
|
||||
|
||||
def __init__(self, param1: str, **kwargs):
|
||||
"""Initialize the service.
|
||||
|
||||
Args:
|
||||
param1: Description of param1.
|
||||
**kwargs: Additional arguments passed to parent.
|
||||
"""
|
||||
super().__init__(**kwargs)
|
||||
|
||||
|
||||
# Pydantic params class with a deprecated field
|
||||
class MyParams(BaseModel):
|
||||
"""Configuration parameters for MyService.
|
||||
|
||||
Parameters:
|
||||
new_setting: Replacement for ``old_setting``.
|
||||
old_setting: Legacy setting, no longer used.
|
||||
|
||||
.. deprecated:: 1.2.0
|
||||
Use ``new_setting`` instead. Will be removed in 2.0.0.
|
||||
"""
|
||||
|
||||
new_setting: str = "default"
|
||||
old_setting: str | None = None
|
||||
```
|
||||
|
||||
## Service Implementation
|
||||
|
||||
When adding a new service:
|
||||
|
||||
1. Extend the appropriate base class (`STTService`, `TTSService`, `LLMService`, etc.)
|
||||
2. Implement required abstract methods
|
||||
3. Handle necessary frames
|
||||
4. By default, all frames should be pushed in the direction they came
|
||||
5. Push `ErrorFrame` on failures
|
||||
6. Add metrics tracking via `MetricsData` if relevant
|
||||
7. Follow the pattern of existing services in `src/pipecat/services/`
|
||||
|
||||
## Testing
|
||||
|
||||
Test utilities live in `src/pipecat/tests/utils.py`. Use `run_test()` to send frames through a pipeline and assert expected output frames in each direction. Use `SleepFrame(sleep=N)` to add delays between frames.
|
||||
1525
CHANGELOG.md
1525
CHANGELOG.md
File diff suppressed because it is too large
Load Diff
62
CHANGELOG.md.template
Normal file
62
CHANGELOG.md.template
Normal file
@@ -0,0 +1,62 @@
|
||||
# Changelog
|
||||
|
||||
All notable changes to the **<project name>** SDK will be documented in this file.
|
||||
|
||||
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
||||
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
||||
|
||||
Please make sure to add your changes to the appropriate categories:
|
||||
|
||||
## [Unreleased]
|
||||
|
||||
### Added
|
||||
|
||||
<!-- for new functionality -->
|
||||
|
||||
- n/a
|
||||
|
||||
### Changed
|
||||
|
||||
<!-- for changed functionality -->
|
||||
|
||||
- n/a
|
||||
|
||||
### Deprecated
|
||||
|
||||
<!-- for soon-to-be removed functionality -->
|
||||
|
||||
- n/a
|
||||
|
||||
### Removed
|
||||
|
||||
<!-- for removed functionality -->
|
||||
|
||||
- n/a
|
||||
|
||||
### Fixed
|
||||
|
||||
<!-- for fixed bugs -->
|
||||
|
||||
- n/a
|
||||
|
||||
### Performance
|
||||
|
||||
<!-- for performance-relevant changes -->
|
||||
|
||||
- n/a
|
||||
|
||||
### Security
|
||||
|
||||
<!-- for security-relevant changes -->
|
||||
|
||||
- n/a
|
||||
|
||||
### Other
|
||||
|
||||
<!-- for everything else -->
|
||||
|
||||
- n/a
|
||||
|
||||
## [0.1.0] - YYYY-MM-DD
|
||||
|
||||
Initial release.
|
||||
158
CLAUDE.md
158
CLAUDE.md
@@ -1 +1,157 @@
|
||||
@AGENTS.md
|
||||
# CLAUDE.md
|
||||
|
||||
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
||||
|
||||
## Project Overview
|
||||
|
||||
Pipecat is an open-source Python framework for building real-time voice and multimodal conversational AI agents. It orchestrates audio/video, AI services, transports, and conversation pipelines using a frame-based architecture.
|
||||
|
||||
## Common Commands
|
||||
|
||||
```bash
|
||||
# Setup development environment
|
||||
uv sync --group dev --all-extras --no-extra gstreamer
|
||||
|
||||
# Install pre-commit hooks
|
||||
uv run pre-commit install
|
||||
|
||||
# Run all tests
|
||||
uv run pytest
|
||||
|
||||
# Run a single test file
|
||||
uv run pytest tests/test_name.py
|
||||
|
||||
# Run a specific test
|
||||
uv run pytest tests/test_name.py::test_function_name
|
||||
|
||||
# Preview changelog
|
||||
uv run towncrier build --draft --version Unreleased
|
||||
|
||||
# Lint and format check
|
||||
uv run ruff check
|
||||
uv run ruff format --check
|
||||
|
||||
# Update dependencies (after editing pyproject.toml)
|
||||
uv lock && uv sync
|
||||
```
|
||||
|
||||
## Architecture
|
||||
|
||||
### Frame-Based Pipeline Processing
|
||||
|
||||
All data flows as **Frame** objects through a pipeline of **FrameProcessors**:
|
||||
|
||||
```
|
||||
[Processor1] → [Processor2] → ... → [ProcessorN]
|
||||
```
|
||||
|
||||
**Key components:**
|
||||
|
||||
- **Frames** (`src/pipecat/frames/frames.py`): Data units (audio, text, video) and control signals. Flow DOWNSTREAM (input→output) or UPSTREAM (acknowledgments/errors).
|
||||
|
||||
- **FrameProcessor** (`src/pipecat/processors/frame_processor.py`): Base processing unit. Each processor receives frames, processes them, and pushes results downstream.
|
||||
|
||||
- **Pipeline** (`src/pipecat/pipeline/pipeline.py`): Chains processors together.
|
||||
|
||||
- **ParallelPipeline** (`src/pipecat/pipeline/parallel_pipeline.py`): Runs multiple pipelines in parallel.
|
||||
|
||||
- **Transports** (`src/pipecat/transports/`): Transports are frame processors used for external I/O layer (Daily WebRTC, LiveKit WebRTC, WebSocket, Local). Abstract interface via `BaseTransport`, `BaseInputTransport` and `BaseOutputTransport`.
|
||||
|
||||
- **Pipeline Task (`src/pipecat/pipeline/task.py`)**: Runs and manages a pipeline. Pipeline tasks send the first frame, `StartFrame`, to the pipeline in order for processors to know they can start processing and pushing frames. Pipeline tasks internally create a pipeline with two additional processors, a source processor before the user-defined pipeline and a sink processor at the end. Those are used for multiple things: error handling, pipeline task level events, heartbeat monitoring, etc.
|
||||
|
||||
- **Pipeline Runner (`src/pipecat/pipeline/runner.py`)**: High-level entry point for executing pipeline tasks. Handles signal management (SIGINT/SIGTERM) for graceful shutdown and optional garbage collection. Run a single pipeline task with `await runner.run(task)` or multiple concurrently with `await asyncio.gather(runner.run(task1), runner.run(task2))`.
|
||||
|
||||
- **Services** (`src/pipecat/services/`): 60+ AI provider integrations (STT, TTS, LLM, etc.). Extend base classes: `AIService`, `LLMService`, `STTService`, `TTSService`, `VisionService`.
|
||||
|
||||
- **Serializers** (`src/pipecat/serializers/`): Convert frames to/from wire formats for WebSocket transports. `FrameSerializer` base class defines `serialize()` and `deserialize()`. Telephony serializers (Twilio, Plivo, Vonage, Telnyx, Exotel, Genesys) handle provider-specific protocols and audio encoding (e.g., μ-law).
|
||||
|
||||
- **RTVI** (`src/pipecat/processors/frameworks/rtvi.py`): Real-Time Voice Interface protocol bridging clients and the pipeline. `RTVIProcessor` handles incoming client messages (text input, audio, function call results). `RTVIObserver` converts pipeline frames to outgoing messages: user/bot speaking events, transcriptions, LLM/TTS lifecycle, function calls, metrics, and audio levels.
|
||||
|
||||
- **Observers** (`src/pipecat/observers/`): Monitor frame flow without modifying the pipeline. Passed to `PipelineTask` via the `observers` parameter. Implement `on_process_frame()` and `on_push_frame()` callbacks.
|
||||
|
||||
### Important Patterns
|
||||
|
||||
- **Context Aggregation**: `LLMContext` accumulates messages for LLM calls; `UserResponse` aggregates user input
|
||||
|
||||
- **Turn Management**: Turn management is done through `LLMUserAggregator` and
|
||||
`LLMAssistantAggregator`, created with `LLMContextAggregatorPair`
|
||||
|
||||
- **User turn strategies**: Detection of when the user starts and stops speaking is done via user turn start/stop strategies. They push `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` respectively.
|
||||
|
||||
- **Interruptions**: Interruptions are usually triggered by a user turn start strategy (e.g. `VADUserTurnStartStrategy`) but they can be triggered by other processors as well, in which case the user turn start strategies don't need to. An `InterruptionFrame` carries an optional `asyncio.Event` that is set when the frame reaches the pipeline sink. If a processor stops an `InterruptionFrame` from propagating downstream (i.e., doesn't push it), it **must** call `frame.complete()` to avoid stalling `push_interruption_task_frame_and_wait()` callers.
|
||||
|
||||
- **Uninterruptible Frames**: These are frames that will not be removed from internal queues even if there's an interruption. For example, `EndFrame` and `StopFrame`.
|
||||
|
||||
- **Events**: Most classes in Pipecat have `BaseObject` as the very base class. `BaseObject` has support for events. Events can run in the background in an async task (default) or synchronously (`sync=True`) if we want immediate action. Synchronous event handlers need to execute fast.
|
||||
|
||||
- **Async Task Management**: Always use `self.create_task(coroutine, name)` instead of raw `asyncio.create_task()`. The `TaskManager` automatically tracks tasks and cleans them up on processor shutdown. Use `await self.cancel_task(task, timeout)` for cancellation.
|
||||
|
||||
- **Error Handling**: Use `await self.push_error(msg, exception, fatal)` to push errors upstream. Services should use `fatal=False` (the default) so application code can handle errors and take action (e.g. switch to another service).
|
||||
|
||||
### Key Directories
|
||||
|
||||
| Directory | Purpose |
|
||||
| -------------------------- | -------------------------------------------------- |
|
||||
| `src/pipecat/frames/` | Frame definitions (100+ types) |
|
||||
| `src/pipecat/processors/` | FrameProcessor base + aggregators, filters, audio |
|
||||
| `src/pipecat/pipeline/` | Pipeline orchestration |
|
||||
| `src/pipecat/services/` | AI service integrations (60+ providers) |
|
||||
| `src/pipecat/transports/` | Transport layer (Daily, LiveKit, WebSocket, Local) |
|
||||
| `src/pipecat/serializers/` | Frame serialization for WebSocket protocols |
|
||||
| `src/pipecat/observers/` | Pipeline observers for monitoring frame flow |
|
||||
| `src/pipecat/audio/` | VAD, filters, mixers, turn detection, DTMF |
|
||||
| `src/pipecat/turns/` | User turn management |
|
||||
|
||||
## Code Style
|
||||
|
||||
- **Docstrings**: Google-style. Classes describe purpose; `__init__` has `Args:` section; dataclasses use `Parameters:` section.
|
||||
- **Linting**: Ruff (line length 100). Pre-commit hooks enforce formatting.
|
||||
- **Type hints**: Required for complex async code.
|
||||
- **Dataclass vs Pydantic**: Use `@dataclass` for frames and internal pipeline data (high-frequency, no validation needed). Use Pydantic `BaseModel` for configuration, parameters, metrics, and external API data (benefits from validation and serialization). Specifically:
|
||||
- `@dataclass`: Frame types, context aggregator pairs, internal data containers
|
||||
- `BaseModel`: Service `InputParams`, transport/VAD/turn params, metrics data, API request/response models, serializer params
|
||||
|
||||
### Docstring Example
|
||||
|
||||
```python
|
||||
class MyService(LLMService):
|
||||
"""Description of what the service does.
|
||||
|
||||
More detailed description.
|
||||
|
||||
Event handlers available:
|
||||
|
||||
- on_connected: Called when we are connected
|
||||
|
||||
Example::
|
||||
|
||||
@service.event_handler("on_connected")
|
||||
async def on_connected(service, frame):
|
||||
...
|
||||
"""
|
||||
|
||||
def __init__(self, param1: str, **kwargs):
|
||||
"""Initialize the service.
|
||||
|
||||
Args:
|
||||
param1: Description of param1.
|
||||
**kwargs: Additional arguments passed to parent.
|
||||
"""
|
||||
super().__init__(**kwargs)
|
||||
```
|
||||
|
||||
## Service Implementation
|
||||
|
||||
When adding a new service:
|
||||
|
||||
1. Extend the appropriate base class (`STTService`, `TTSService`, `LLMService`, etc.)
|
||||
2. Implement required abstract methods
|
||||
3. Handle necessary frames
|
||||
4. By default, all frames should be pushed in the direction they came
|
||||
5. Push `ErrorFrame` on failures
|
||||
6. Add metrics tracking via `MetricsData` if relevant
|
||||
7. Follow the pattern of existing services in `src/pipecat/services/`
|
||||
|
||||
## Testing
|
||||
|
||||
Test utilities live in `src/pipecat/tests/utils.py`. Use `run_test()` to send frames through a pipeline and assert expected output frames in each direction. Use `SleepFrame(sleep=N)` to add delays between frames.
|
||||
|
||||
38
README.md
38
README.md
@@ -28,10 +28,6 @@
|
||||
|
||||
## 🌐 Pipecat Ecosystem
|
||||
|
||||
### 🧩 Multi-agent systems
|
||||
|
||||
Need multiple AI agents working together? [Pipecat Subagents](https://github.com/pipecat-ai/pipecat-subagents) lets you build distributed multi-agent systems where each agent runs its own pipeline and communicates through a shared message bus. Hand off conversations between specialists, dispatch background tasks, and scale agents across processes or machines.
|
||||
|
||||
### 📱 Client SDKs
|
||||
|
||||
Building client applications? You can connect to Pipecat from any platform using our official SDKs:
|
||||
@@ -71,7 +67,7 @@ and install any of the available plugins.
|
||||
|
||||
### 🧩 Community Integrations
|
||||
|
||||
Build and share your own Pipecat service integrations! Browse existing [community integrations](https://docs.pipecat.ai/api-reference/server/services/community-integrations) or check out our [guide](COMMUNITY_INTEGRATIONS.md) to create your own.
|
||||
Build and share your own Pipecat service integrations! Browse existing [community integrations](https://docs.pipecat.ai/server/services/community-integrations) or check out our [guide](COMMUNITY_INTEGRATIONS.md) to create your own.
|
||||
|
||||
### 📺️ Pipecat TV Channel
|
||||
|
||||
@@ -83,28 +79,28 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
|
||||
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/simple-chatbot/image.png" width="400" /></a>
|
||||
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/storytelling-chatbot/image.png" width="400" /></a>
|
||||
<br/>
|
||||
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/daily-multi-translation"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/daily-multi-translation/image.png" width="400" /></a>
|
||||
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/translation-chatbot/image.png" width="400" /></a>
|
||||
<a href="https://github.com/pipecat-ai/pipecat/blob/main/examples/vision/vision-moondream.py"><img src="https://github.com/pipecat-ai/pipecat/blob/main/examples/assets/moondream.png" width="400" /></a>
|
||||
</p>
|
||||
|
||||
## 🧩 Available services
|
||||
|
||||
| Category | Services |
|
||||
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/api-reference/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/api-reference/server/services/stt/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/api-reference/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/api-reference/server/services/stt/gladia), [Google](https://docs.pipecat.ai/api-reference/server/services/stt/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/groq), [Mistral](https://docs.pipecat.ai/api-reference/server/services/stt/mistral), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/stt/nvidia), [OpenAI (Whisper)](https://docs.pipecat.ai/api-reference/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/api-reference/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/api-reference/server/services/stt/whisper), [xAI](https://docs.pipecat.ai/api-reference/server/services/stt/xai) |
|
||||
| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/api-reference/server/services/llm/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/api-reference/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/api-reference/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/api-reference/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/api-reference/server/services/llm/grok), [Groq](https://docs.pipecat.ai/api-reference/server/services/llm/groq), [Inception](https://docs.pipecat.ai/api-reference/server/services/llm/inception), [Mistral](https://docs.pipecat.ai/api-reference/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/api-reference/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/api-reference/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/api-reference/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/api-reference/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/llm/openai), [OpenAI Responses](https://docs.pipecat.ai/api-reference/server/services/llm/openai-responses), [OpenRouter](https://docs.pipecat.ai/api-reference/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/api-reference/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/api-reference/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/api-reference/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/api-reference/server/services/llm/together) |
|
||||
| Text-to-Speech | [Async](https://docs.pipecat.ai/api-reference/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/api-reference/server/services/tts/aws), [Azure](https://docs.pipecat.ai/api-reference/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/api-reference/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/api-reference/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/api-reference/server/services/tts/fish), [Google](https://docs.pipecat.ai/api-reference/server/services/tts/google), [Gradium](https://docs.pipecat.ai/api-reference/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/api-reference/server/services/tts/groq), [Hume](https://docs.pipecat.ai/api-reference/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/api-reference/server/services/tts/inworld), [Kokoro](https://docs.pipecat.ai/api-reference/server/services/tts/kokoro), [LMNT](https://docs.pipecat.ai/api-reference/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/api-reference/server/services/tts/minimax), [Mistral](https://docs.pipecat.ai/api-reference/server/services/tts/mistral), [Neuphonic](https://docs.pipecat.ai/api-reference/server/services/tts/neuphonic), [NVIDIA](https://docs.pipecat.ai/api-reference/server/services/tts/nvidia), [OpenAI](https://docs.pipecat.ai/api-reference/server/services/tts/openai), [Piper](https://docs.pipecat.ai/api-reference/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/api-reference/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/api-reference/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/api-reference/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/api-reference/server/services/tts/smallest), [Soniox](https://docs.pipecat.ai/api-reference/server/services/tts/soniox), [Speechmatics](https://docs.pipecat.ai/api-reference/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/api-reference/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/api-reference/server/services/tts/xtts) |
|
||||
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/api-reference/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/api-reference/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/api-reference/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/api-reference/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/api-reference/server/services/s2s/ultravox), |
|
||||
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/api-reference/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/api-reference/server/services/transport/small-webrtc), [Vonage (WebRTC)](https://docs.pipecat.ai/api-reference/server/services/transport/vonage), [WebSocket Server](https://docs.pipecat.ai/api-reference/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/api-reference/server/services/transport/whatsapp), Local |
|
||||
| Serializers | [Exotel](https://docs.pipecat.ai/api-reference/server/services/serializers/exotel), [Genesys](https://docs.pipecat.ai/api-reference/server/services/serializers/genesys), [Plivo](https://docs.pipecat.ai/api-reference/server/services/serializers/plivo), [Twilio](https://docs.pipecat.ai/api-reference/server/services/serializers/twilio), [Telnyx](https://docs.pipecat.ai/api-reference/server/services/serializers/telnyx), [Vonage](https://docs.pipecat.ai/api-reference/server/services/serializers/vonage) |
|
||||
| Video | [HeyGen](https://docs.pipecat.ai/api-reference/server/services/video/heygen), [LemonSlice](https://docs.pipecat.ai/api-reference/server/services/transport/lemonslice), [Tavus](https://docs.pipecat.ai/api-reference/server/services/video/tavus), [Simli](https://docs.pipecat.ai/api-reference/server/services/video/simli) |
|
||||
| Memory | [mem0](https://docs.pipecat.ai/api-reference/server/services/memory/mem0) |
|
||||
| Vision & Image | [fal](https://docs.pipecat.ai/api-reference/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/api-reference/server/services/image-generation/google-imagen), [Moondream](https://docs.pipecat.ai/api-reference/server/services/vision/moondream) |
|
||||
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/api-reference/server/utilities/audio/silero-vad-analyzer), [Krisp Viva](https://docs.pipecat.ai/guides/features/krisp-viva), [Koala](https://docs.pipecat.ai/api-reference/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/api-reference/server/utilities/audio/aic-filter), [RNNoise](https://docs.pipecat.ai/api-reference/server/utilities/audio/rnnoise-filter) |
|
||||
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/api-reference/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/api-reference/server/services/analytics/sentry) |
|
||||
| Community | [Browse community integrations →](https://docs.pipecat.ai/api-reference/server/services/community-integrations) |
|
||||
| Category | Services |
|
||||
| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Gradium](https://docs.pipecat.ai/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
|
||||
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
|
||||
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [Kokoro](https://docs.pipecat.ai/server/services/tts/kokoro), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/server/services/tts/smallest), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
|
||||
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/server/services/s2s/ultravox), |
|
||||
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/server/services/transport/whatsapp), Local |
|
||||
| Serializers | [Exotel](https://docs.pipecat.ai/server/services/serializers/exotel), [Genesys](https://docs.pipecat.ai/server/services/serializers/genesys), [Plivo](https://docs.pipecat.ai/server/services/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/services/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/services/serializers/telnyx), [Vonage](https://docs.pipecat.ai/server/services/serializers/vonage) |
|
||||
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [LemonSlice](https://docs.pipecat.ai/server/services/transport/lemonslice), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
|
||||
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
|
||||
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/google-imagen), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
|
||||
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp Viva](https://docs.pipecat.ai/guides/features/krisp-viva), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter), [RNNoise](https://docs.pipecat.ai/server/utilities/audio/rnnoise-filter) |
|
||||
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
|
||||
| Community | [Browse community integrations →](https://docs.pipecat.ai/server/services/community-integrations) |
|
||||
|
||||
📚 [View full services documentation →](https://docs.pipecat.ai/api-reference/server/services/supported-services)
|
||||
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
|
||||
|
||||
## ⚡ Getting started
|
||||
|
||||
|
||||
1
changelog/3984.changed.md
Normal file
1
changelog/3984.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Updated `onnxruntime` from 1.23.2 to 1.24.3, adding support for Python 3.14.
|
||||
1
changelog/4034.changed.md
Normal file
1
changelog/4034.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- MCPClient now requires async with MCPClient(...) as mcp: or explicit start()/close() calls to manage the connection lifecycle.
|
||||
1
changelog/4034.fixed.md
Normal file
1
changelog/4034.fixed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Fixed MCPClient opening a new connection for every tool call instead of reusing the session.
|
||||
@@ -1 +0,0 @@
|
||||
- Added `VonageVideoConnectorTransport`, a new transport integration for real-time Vonage WebRTC sessions using the Vonage Video Connector library.
|
||||
1
changelog/4141.added.md
Normal file
1
changelog/4141.added.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Added WebSocket-based `OpenAIResponsesLLMService` as the new default for the OpenAI Responses API. It maintains a persistent connection to `wss://api.openai.com/v1/responses` and automatically uses `previous_response_id` to send only incremental context, falling back to full context on reconnection or cache miss. The previous HTTP-based implementation is now available as `OpenAIResponsesHttpLLMService`.
|
||||
1
changelog/4191.removed.md
Normal file
1
changelog/4191.removed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `OpenPipeLLMService` and the `openpipe` extra. OpenPipe was acquired by CoreWeave and the package is no longer maintained. If you were using `openpipe` as an LLM provider, switch to the underlying provider directly (e.g. `openai`). The OpenPipe interface can still be used with `OpenAILLMService` by specifying a `base_url`.
|
||||
1
changelog/4192.changed.md
Normal file
1
changelog/4192.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Updated `langchain` extra to require langchain 1.x (from 0.3.x), langchain-community 0.4.x (from 0.3.x), and langchain-openai 1.x (from 0.3.x). If you pin these packages in your project, update your pins accordingly.
|
||||
1
changelog/4202.fixed.md
Normal file
1
changelog/4202.fixed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Fixed `InworldHttpTTSService` streaming responses crashing with `UnicodeDecodeError` when multi-byte UTF-8 characters were split across chunk boundaries. This caused TTS audio to cut off mid-sentence intermittently.
|
||||
1
changelog/4203.fixed.md
Normal file
1
changelog/4203.fixed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Fixed a crash (`JSONDecodeError`) when a user interruption occurs while the LLM is streaming function call arguments. Previously, the incomplete JSON arguments were passed directly to `json.loads()`, causing an unhandled exception. Affected services: OpenAI, Google (OpenAI-compatible), and SambaNova.
|
||||
1
changelog/4204.removed.10.md
Normal file
1
changelog/4204.removed.10.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `observers` field from `PipelineParams`. Pass observers directly to `PipelineTask` constructor instead.
|
||||
1
changelog/4204.removed.11.md
Normal file
1
changelog/4204.removed.11.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `on_pipeline_ended`, `on_pipeline_cancelled`, and `on_pipeline_stopped` events from `PipelineTask`. Use `on_pipeline_finished` instead.
|
||||
1
changelog/4204.removed.12.md
Normal file
1
changelog/4204.removed.12.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `AudioBufferProcessor.user_continuous_stream` parameter. Use `user_audio_passthrough` instead.
|
||||
1
changelog/4204.removed.13.md
Normal file
1
changelog/4204.removed.13.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `camera_in_enabled`, `camera_in_is_live`, `camera_in_width`, `camera_in_height`, `camera_out_enabled`, `camera_out_is_live`, `camera_out_width`, `camera_out_height`, and `camera_out_color` transport params. Use the `video_in_*` and `video_out_*` equivalents instead.
|
||||
1
changelog/4204.removed.14.md
Normal file
1
changelog/4204.removed.14.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `RTVIObserver.errors_enabled` parameter.
|
||||
1
changelog/4204.removed.15.md
Normal file
1
changelog/4204.removed.15.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `vad_enabled` and `vad_audio_passthrough` transport params.
|
||||
1
changelog/4204.removed.16.md
Normal file
1
changelog/4204.removed.16.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `TTSService.say()`. Push a `TTSSpeakFrame` into the pipeline instead.
|
||||
1
changelog/4204.removed.17.md
Normal file
1
changelog/4204.removed.17.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `DailyRunner.configure_with_args()`. Use `PipelineRunner` with `RunnerArguments` instead.
|
||||
1
changelog/4204.removed.18.md
Normal file
1
changelog/4204.removed.18.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated RTVI models, frames, and processor methods including `RTVIConfig`, `RTVIServiceConfig`, `RTVIServiceOptionConfig`, various `RTVI*Data` models, `RTVIActionFrame`, and `RTVIProcessor.handle_function_call`/`handle_function_call_start`. Use the updated RTVI processor API instead.
|
||||
1
changelog/4204.removed.19.md
Normal file
1
changelog/4204.removed.19.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `FrameProcessor.wait_for_task()`. Use `create_task()` and manage tasks with the built-in `TaskManager` instead.
|
||||
1
changelog/4204.removed.2.md
Normal file
1
changelog/4204.removed.2.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `KrispFilter`. The `krisp` extra has been removed from `pyproject.toml`.
|
||||
1
changelog/4204.removed.20.md
Normal file
1
changelog/4204.removed.20.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `LLMService.request_image_frame()`. Push a `UserImageRequestFrame` instead.
|
||||
1
changelog/4204.removed.3.md
Normal file
1
changelog/4204.removed.3.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `create_default_resampler()` from `pipecat.audio.utils`.
|
||||
1
changelog/4204.removed.4.md
Normal file
1
changelog/4204.removed.4.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `FalSmartTurnAnalyzer` and `LocalSmartTurnAnalyzer`.
|
||||
1
changelog/4204.removed.5.md
Normal file
1
changelog/4204.removed.5.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated transport frames: `TransportMessageFrame`, `TransportMessageUrgentFrame`, `InputTransportMessageUrgentFrame`, `DailyTransportMessageFrame`, and `DailyTransportMessageUrgentFrame`. Use `OutputTransportMessageFrame`, `OutputTransportMessageUrgentFrame`, `InputTransportMessageFrame`, `DailyOutputTransportMessageFrame`, and `DailyOutputTransportMessageUrgentFrame` instead.
|
||||
1
changelog/4204.removed.6.md
Normal file
1
changelog/4204.removed.6.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `KeypadEntryFrame` alias.
|
||||
1
changelog/4204.removed.7.md
Normal file
1
changelog/4204.removed.7.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated interruption frames: `StartInterruptionFrame` and `BotInterruptionFrame`. Use `InterruptionFrame` and `InterruptionTaskFrame` instead.
|
||||
1
changelog/4204.removed.8.md
Normal file
1
changelog/4204.removed.8.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `LLMService.start_callback` parameter. Register an `on_llm_response_start` event handler instead.
|
||||
1
changelog/4204.removed.9.md
Normal file
1
changelog/4204.removed.9.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed single-argument function call support from `LLMService`. Functions must use named parameters instead of a single `arguments` parameter.
|
||||
1
changelog/4204.removed.md
Normal file
1
changelog/4204.removed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `NoisereduceFilter`. Use system-level noise reduction or a service-based alternative instead.
|
||||
1
changelog/4208.removed.10.md
Normal file
1
changelog/4208.removed.10.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.riva` package. Use `pipecat.services.nvidia.stt` and `pipecat.services.nvidia.tts` instead (`RivaSTTService` → `NvidiaSTTService`, `RivaTTSService` → `NvidiaTTSService`).
|
||||
1
changelog/4208.removed.11.md
Normal file
1
changelog/4208.removed.11.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.nim` package. Use `pipecat.services.nvidia.llm` instead (`NimLLMService` → `NvidiaLLMService`).
|
||||
1
changelog/4208.removed.2.md
Normal file
1
changelog/4208.removed.2.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.gemini_multimodal_live` package. Use `pipecat.services.google.gemini_live` instead. Note that class names no longer include "Multimodal" (e.g. `GeminiMultimodalLiveLLMService` → `GeminiLiveLLMService`).
|
||||
1
changelog/4208.removed.3.md
Normal file
1
changelog/4208.removed.3.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.aws_nova_sonic` package. Use `pipecat.services.aws.nova_sonic` instead.
|
||||
1
changelog/4208.removed.4.md
Normal file
1
changelog/4208.removed.4.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.openai_realtime` package. Use `pipecat.services.openai.realtime` instead.
|
||||
1
changelog/4208.removed.5.md
Normal file
1
changelog/4208.removed.5.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `OpenAIRealtimeBetaLLMService` and `AzureRealtimeBetaLLMService`. Use `OpenAIRealtimeLLMService` and `AzureRealtimeLLMService` from `pipecat.services.openai.realtime` and `pipecat.services.azure.realtime` instead.
|
||||
1
changelog/4208.removed.6.md
Normal file
1
changelog/4208.removed.6.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.deepgram.stt_sagemaker` and `pipecat.services.deepgram.tts_sagemaker` modules. Use `pipecat.services.deepgram.sagemaker.stt` and `pipecat.services.deepgram.sagemaker.tts` instead.
|
||||
1
changelog/4208.removed.7.md
Normal file
1
changelog/4208.removed.7.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `GoogleLLMOpenAIBetaService` from `pipecat.services.google.openai`. Use `GoogleLLMService` from `pipecat.services.google.llm` instead.
|
||||
1
changelog/4208.removed.8.md
Normal file
1
changelog/4208.removed.8.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.google.llm_vertex` module. Use `pipecat.services.google.vertex.llm` instead.
|
||||
1
changelog/4208.removed.9.md
Normal file
1
changelog/4208.removed.9.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.google.gemini_live.llm_vertex` module. Use `pipecat.services.google.gemini_live.vertex.llm` instead.
|
||||
1
changelog/4208.removed.md
Normal file
1
changelog/4208.removed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.services.ai_services` module. Import from `pipecat.services.ai_service`, `pipecat.services.llm_service`, `pipecat.services.stt_service`, `pipecat.services.tts_service`, etc. instead.
|
||||
1
changelog/4209.changed.md
Normal file
1
changelog/4209.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Changed `GrokLLMService` default model from `grok-3-beta` to `grok-3`, now that the model is generally available.
|
||||
1
changelog/4213.changed.md
Normal file
1
changelog/4213.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- `GoogleImageGenService` now defaults to `imagen-4.0-generate-001` (previously `imagen-3.0-generate-002`).
|
||||
1
changelog/4215.changed.md
Normal file
1
changelog/4215.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ `BaseOpenAILLMService.get_chat_completions()` now accepts an `LLMContext` instead of `OpenAILLMInvocationParams`. If you override this method, update your signature accordingly.
|
||||
22
changelog/4215.removed.2.md
Normal file
22
changelog/4215.removed.2.md
Normal file
@@ -0,0 +1,22 @@
|
||||
- ⚠️ Removed deprecated service-specific context and aggregator machinery, which was superseded by the universal `LLMContext` system.
|
||||
|
||||
Service-specific classes removed: `AnthropicLLMContext`, `AnthropicContextAggregatorPair`, `AWSBedrockLLMContext`, `AWSBedrockContextAggregatorPair`, `OpenAIContextAggregatorPair`, and their user/assistant aggregators. Also removed `create_context_aggregator()` from `LLMService`, `OpenAILLMService`, `AnthropicLLMService`, and `AWSBedrockLLMService`.
|
||||
|
||||
Base aggregator classes removed (from `pipecat.processors.aggregators.llm_response`): `BaseLLMResponseAggregator`, `LLMContextResponseAggregator`, `LLMUserContextAggregator`, `LLMAssistantContextAggregator`, `LLMUserResponseAggregator`, `LLMAssistantResponseAggregator`.
|
||||
|
||||
From the developer's point of view, migrating will usually be a matter of going from this:
|
||||
|
||||
```python
|
||||
context = OpenAILLMContext(messages, tools)
|
||||
context_aggregator = llm.create_context_aggregator(context)
|
||||
```
|
||||
|
||||
To this:
|
||||
|
||||
```python
|
||||
from pipecat.processors.aggregators.llm_context import LLMContext
|
||||
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
|
||||
|
||||
context = LLMContext(messages, tools)
|
||||
context_aggregator = LLMContextAggregatorPair(context)
|
||||
```
|
||||
1
changelog/4215.removed.3.md
Normal file
1
changelog/4215.removed.3.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated frame types `LLMMessagesFrame` and `OpenAILLMContextAssistantTimestampFrame` from `pipecat.frames.frames`. Instead of `LLMMessagesFrame`, use `LLMContextFrame` with the new messages, or `LLMMessagesUpdateFrame` with `run_llm=True`.
|
||||
1
changelog/4215.removed.4.md
Normal file
1
changelog/4215.removed.4.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `GatedOpenAILLMContextAggregator` (from `pipecat.processors.aggregators.gated_open_ai_llm_context`). Use `GatedLLMContextAggregator` (from `pipecat.processors.aggregators.gated_llm_context`) instead.
|
||||
1
changelog/4215.removed.5.md
Normal file
1
changelog/4215.removed.5.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `VisionImageFrameAggregator` (from `pipecat.processors.aggregators.vision_image_frame`). Vision/image handling is now built into `LLMContext` (from `pipecat.processors.aggregators.llm_context`). See the `12*` examples for the recommended replacement pattern.
|
||||
1
changelog/4215.removed.6.md
Normal file
1
changelog/4215.removed.6.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated compatibility modules: `pipecat.services.openai_realtime_beta` (use `pipecat.services.openai.realtime`), `pipecat.services.openai_realtime.context`, `pipecat.services.openai_realtime.frames`, `pipecat.services.openai.realtime.context`, `pipecat.services.openai.realtime.frames`, `pipecat.services.gemini_multimodal_live` (use `pipecat.services.google.gemini_live`), `pipecat.services.aws_nova_sonic.context` (use `pipecat.services.aws.nova_sonic`), `pipecat.services.google.openai` and `pipecat.services.google.llm_openai` (use `pipecat.services.google.llm`).
|
||||
18
changelog/4215.removed.md
Normal file
18
changelog/4215.removed.md
Normal file
@@ -0,0 +1,18 @@
|
||||
- ⚠️ Removed `OpenAILLMContext`, `OpenAILLMContextFrame`, and `OpenAILLMContext.from_messages()`. Use `LLMContext` (from `pipecat.processors.aggregators.llm_context`) and `LLMContextFrame` (from `pipecat.frames.frames`) instead. All services now exclusively use the universal `LLMContext`.
|
||||
|
||||
From the developer's point of view, migrating will usually be a matter of going from this:
|
||||
|
||||
```python
|
||||
context = OpenAILLMContext(messages, tools)
|
||||
context_aggregator = llm.create_context_aggregator(context)
|
||||
```
|
||||
|
||||
To this:
|
||||
|
||||
```python
|
||||
from pipecat.processors.aggregators.llm_context import LLMContext
|
||||
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
|
||||
|
||||
context = LLMContext(messages, tools)
|
||||
context_aggregator = LLMContextAggregatorPair(context)
|
||||
```
|
||||
1
changelog/4219.added.md
Normal file
1
changelog/4219.added.md
Normal file
@@ -0,0 +1 @@
|
||||
- Added `enable_prompt_caching` setting to `AWSBedrockLLMService` for Bedrock ConverseStream prompt caching.
|
||||
1
changelog/4220.fixed.md
Normal file
1
changelog/4220.fixed.md
Normal file
@@ -0,0 +1 @@
|
||||
- Fixed `CartesiaTTSService` failing with "Context has closed" errors when switching voice, model, or language via `TTSUpdateSettingsFrame`. The service now automatically flushes the current audio context and opens a fresh one when these settings change.
|
||||
13
changelog/4220.removed.md
Normal file
13
changelog/4220.removed.md
Normal file
@@ -0,0 +1,13 @@
|
||||
- ⚠️ Removed deprecated service parameters and shims that have been replaced by the `settings=Service.Settings(...)` pattern or direct `__init__` parameters:
|
||||
- `PollyTTSService` alias (use `AWSTTSService`)
|
||||
- `TTSService`: `text_aggregator`, `text_filter` init params
|
||||
- `AWSNovaSonicLLMService`: `send_transcription_frames` init param
|
||||
- `DeepgramSTTService`: `url` init param (use `base_url`)
|
||||
- `FishAudioTTSService`: `model` init param (use `reference_id` or `settings`)
|
||||
- `GladiaSTTService`: `language` and `confidence` from `GladiaInputParams`, `InputParams` class alias
|
||||
- `GeminiTTSService`: `api_key` init param
|
||||
- `GeminiLiveLLMService`: `base_url` init param (use `http_options`)
|
||||
- `GoogleVertexLLMService`: `InputParams` class with `location`/`project_id` fields (use direct init params); `project_id` is now required, `location` defaults to `"us-east4"`
|
||||
- `MiniMaxHttpTTSService`: `english_normalization` from `InputParams` (use `text_normalization`)
|
||||
- `SimliVideoService`: `simli_config` init param (use `api_key`/`face_id`), `use_turn_server` init param; `api_key` and `face_id` are now required
|
||||
- `AnthropicLLMService`: `enable_prompt_caching_beta` from `InputParams` (use `enable_prompt_caching`)
|
||||
1
changelog/4224.changed.md
Normal file
1
changelog/4224.changed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ `LLMService.function_call_timeout_secs` now defaults to `None` instead of `10.0`. Deferred function calls will run indefinitely unless a timeout is explicitly set at the service level or per-call. If you relied on the previous 10-second default, pass `function_call_timeout_secs=10.0` explicitly.
|
||||
1
changelog/4225.removed.2.md
Normal file
1
changelog/4225.removed.2.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.sync` package. Use `pipecat.utils.sync` instead.
|
||||
1
changelog/4225.removed.md
Normal file
1
changelog/4225.removed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.transports.services` and `pipecat.transports.network` module aliases. Update imports to use `pipecat.transports.daily.transport`, `pipecat.transports.livekit.transport`, `pipecat.transports.websocket.*`, `pipecat.transports.webrtc.*`, and `pipecat.transports.daily.utils` respectively.
|
||||
1
changelog/4228.removed.10.md
Normal file
1
changelog/4228.removed.10.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `add_pattern_pair` method from `PatternPairAggregator`. Use `add_pattern` instead.
|
||||
1
changelog/4228.removed.2.md
Normal file
1
changelog/4228.removed.2.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `interruption_strategies` parameter from `PipelineParams`, `StartFrame`, and `FrameProcessor`. Use `LLMUserAggregator`'s `user_turn_strategies` parameter instead.
|
||||
1
changelog/4228.removed.3.md
Normal file
1
changelog/4228.removed.3.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `EmulateUserStartedSpeakingFrame` and `EmulateUserStoppedSpeakingFrame` frames, and the `emulated` field from `UserStartedSpeakingFrame` / `UserStoppedSpeakingFrame`.
|
||||
1
changelog/4228.removed.4.md
Normal file
1
changelog/4228.removed.4.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.audio.interruptions` module (`BaseInterruptionStrategy`, `MinWordsInterruptionStrategy`). Use `pipecat.turns.user_start.MinWordsUserTurnStartStrategy` with `LLMUserAggregator`'s `user_turn_strategies` parameter instead.
|
||||
1
changelog/4228.removed.5.md
Normal file
1
changelog/4228.removed.5.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.processors.transcript_processor` module (`TranscriptProcessor`, `TranscriptProcessorConfig`). Use pipeline observers instead.
|
||||
1
changelog/4228.removed.6.md
Normal file
1
changelog/4228.removed.6.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `TranscriptionMessage`, `ThoughtTranscriptionMessage`, and `TranscriptionUpdateFrame` from `pipecat.frames.frames`.
|
||||
1
changelog/4228.removed.7.md
Normal file
1
changelog/4228.removed.7.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `STTMuteFilter`, `STTMuteConfig`, and `STTMuteStrategy` from `pipecat.processors.filters.stt_mute_filter`. Use `pipecat.turns.user_mute` strategies with `LLMUserAggregator`'s `user_mute_strategies` parameter instead.
|
||||
1
changelog/4228.removed.8.md
Normal file
1
changelog/4228.removed.8.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `UserResponseAggregator` class from `pipecat.processors.aggregators.user_response`. Use `LLMUserAggregator` instead.
|
||||
1
changelog/4228.removed.9.md
Normal file
1
changelog/4228.removed.9.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `pipecat.utils.tracing.class_decorators` module. Use `pipecat.utils.tracing.service_decorators` instead.
|
||||
1
changelog/4228.removed.md
Normal file
1
changelog/4228.removed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed deprecated `allow_interruptions` parameter from `PipelineParams`, `StartFrame`, and `FrameProcessor`. Interruptions are now always allowed by default. Use `LLMUserAggregator`'s `user_turn_strategies` / `user_mute_strategies` parameters to control interruption behavior.
|
||||
1
changelog/4229.removed.2.md
Normal file
1
changelog/4229.removed.2.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `ExternalUserTurnStrategies` and the automatic fallback to it in `LLMUserAggregator` when a `SpeechControlParamsFrame` was received from the transport.
|
||||
1
changelog/4229.removed.md
Normal file
1
changelog/4229.removed.md
Normal file
@@ -0,0 +1 @@
|
||||
- ⚠️ Removed `vad_analyzer` and `turn_analyzer` parameters from `TransportParams` and all transport input classes, along with all deprecated VAD/turn analysis logic in `BaseInputTransport`. VAD and turn detection are now handled entirely by `LLMUserAggregator`.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed Azure TTS last word being missed by observers and RTVI UI. The completion signal was racing with word timestamp processing, causing the final word's `TTSTextFrame` to arrive after `TTSStoppedFrame`. Completion is now routed through the word boundary queue to ensure all words are processed before signaling stream end.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed `BaseOutputTransport` reordering frames that share the same presentation timestamp. Frames with equal PTS values are now emitted in insertion order, preventing subtle audio/text sequencing bugs when multiple frames arrive at the same time.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed Cartesia word timestamps leaking SSML tag text (e.g. `<spell>`, `<emotion>`, `<break>`) into word entries. Tags are now stripped before processing, so word-to-text attribution remains accurate when SSML markup is present in the TTS input.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed `TTSTextFrame` entries losing their original text structure when word timestamps are enabled. Each `TTSTextFrame` now carries a `raw_text` field containing the corresponding span of the original LLM-produced text (including pattern delimiters such as `<card>4111 1111 1111 1111</card>`), so the assistant context receives properly-tagged content rather than the cleaned words returned by the TTS provider. Also handles words that straddle two sentence boundaries by splitting them and attributing each part to its correct source frame.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed skipped TTS frames (e.g. code blocks filtered via `skip_aggregator_types`) being emitted to the assistant context immediately instead of waiting for preceding spoken frames to finish. They now hold their position in the frame sequence and are flushed only after all earlier spoken sentences are complete, keeping context ordering correct.
|
||||
@@ -1 +0,0 @@
|
||||
- Added `InceptionLLMService` for Inception's Mercury 2 diffusion reasoning model, with support for `reasoning_effort` and `realtime` settings.
|
||||
@@ -1 +0,0 @@
|
||||
- Added `GET /status` endpoint to the development runner that reports which transports the running instance accepts (all by default, or the single transport passed via `-t`).
|
||||
@@ -1 +0,0 @@
|
||||
- Added plain WebSocket transport support to the development runner. Bots can now accept connections from non-telephony WebSocket clients (e.g., browser apps using protobuf framing) via the `/ws-client` endpoint alongside other transports.
|
||||
@@ -1 +0,0 @@
|
||||
- ⚠️ The development runner now supports all transports (WebRTC, Daily, telephony, plain WebSocket) simultaneously from a single server. The `/start` endpoint accepts a `"transport"` field to select the transport per-request; omitting `-t` at startup enables all transports instead of defaulting to WebRTC. The Daily browser-redirect route moved from `GET /` to `GET /daily`.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed `ElevenLabsSTTService` crashing when `language` was passed as `None`. When `language` is not set, the service now lets ElevenLabs auto-detect the audio language.
|
||||
@@ -1 +0,0 @@
|
||||
- Fixed websocket STT connection setup failures so services clear stale websocket state and emit non-fatal error frames, allowing `ServiceSwitcher` failover to keep agents running.
|
||||
@@ -1 +0,0 @@
|
||||
- Added `max_endpoint_delay_ms` to `SonioxSTTService.Settings`, controlling the maximum delay (500-3000 ms) before endpoint detection finalizes a turn.
|
||||
@@ -1 +0,0 @@
|
||||
- `SonioxSTTService` now applies settings updates (e.g. via `STTUpdateSettingsFrame`) using a graceful reconnect instead of a hard disconnect/reconnect, preserving the service's reconnect retry behavior.
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user