Compare commits

..

1 Commits

Author SHA1 Message Date
Mark Backman
6531858970 Revert LLMRunFrame for quickstart 2025-09-01 20:48:47 -04:00
305 changed files with 15802 additions and 22876 deletions

285
AGENTS.md
View File

@@ -1,285 +0,0 @@
# AGENTS.md
## Project Overview
Pipecat is an open-source Python framework for building real-time voice and multimodal conversational AI agents. The codebase is organized around a pipeline architecture where data flows through connected services (STT → LLM → TTS).
## Development Environment Setup
### Prerequisites
- **Minimum Python Version:** 3.10
- **Recommended Python Version:** 3.12
- **Package Manager:** uv (recommended) or pip
### Setup Commands
```bash
# Clone the repository
git clone https://github.com/pipecat-ai/pipecat.git
cd pipecat
# Install dependencies with uv (recommended)
uv sync --group dev --all-extras \
--no-extra gstreamer \
--no-extra krisp \
--no-extra local \
--no-extra ultravox
# Or with pip
pip install -e ".[dev]"
# Install pre-commit hooks
uv run pre-commit install
# Set up environment variables
cp env.example .env
```
## Build and Test Commands
### Running Tests
```bash
# Run all tests
uv run pytest
# Run specific test file
uv run pytest tests/test_name.py
# Run tests with coverage
uv run pytest --cov=pipecat --cov-report=html
```
### Code Quality
```bash
# Format code (required before commits)
uv run ruff format
# Lint code
uv run ruff check
# Type checking
uv run mypy src/pipecat
# Run pre-commit checks manually
uv run pre-commit run --all-files
```
### Documentation
```bash
# Build API documentation
cd docs/api
./build-docs.sh
# Build docs manually
sphinx-build -b html . _build/html -W --keep-going
```
## Code Style Guidelines
### Python Standards
- **Formatting:** Strict PEP 8 via Ruff
- **Docstrings:** Google-style format
- **Type Hints:** Required for all public APIs
- **Import Organization:** Automated via Ruff
### Docstring Conventions
- **Classes:** Describe purpose + `__init__` with complete `Args:` section
- **Dataclasses:** Use `Parameters:` section, no `__init__` docstring
- **Methods:** Include `Args:` and `Returns:` sections
- **Properties:** Must have `Returns:` section
- **Examples:** Use `Examples:` section with `::` syntax
### File Organization
```
src/pipecat/ # Main package
├── processors/ # Frame processors
├── services/ # AI service integrations
├── transports/ # Communication layers
├── frames/ # Data frame definitions
└── pipeline/ # Pipeline orchestration
examples/foundational/ # Step-by-step tutorials
tests/ # Test suite
```
## Testing Instructions
### Test Structure
- **Unit Tests:** Test individual components in isolation
- **Integration Tests:** Test service interactions
- **Example Tests:** Validate foundational examples work
### Adding Tests
```bash
# Test naming convention
test_<component>_<functionality>.py
# Run specific test pattern
uv run pytest -k "test_pipeline"
# Run with debugging
uv run pytest -s -vv tests/test_name.py::test_function
```
### Pre-commit Requirements
All commits must pass:
- Ruff formatting
- Ruff linting
- Type checking
- Basic test suite
## Dependency Management
### Using uv (Recommended)
```bash
# Add runtime dependency
uv add package-name
# Add optional dependency
uv add --optional service package-name
# Add development dependency
uv add --group dev package-name
# Update lockfile
uv lock
# Sync dependencies
uv sync
```
### Important Notes
- **Always commit both `pyproject.toml` and `uv.lock` together**
- **Never manually edit `uv.lock`** - it's auto-generated
- **Use extras for optional service dependencies** (e.g., `[openai]`, `[cartesia]`)
## Project Structure Guidelines
### Service Integration
When adding new AI services:
1. Create service class in `src/pipecat/services/<provider>/`
2. Follow existing patterns (e.g., STTService, LLMService)
3. Add to appropriate extras in `pyproject.toml`
4. Include tests in `tests/`
5. Add documentation examples
### Frame Processing
For custom processors:
1. Inherit from `FrameProcessor`
2. Implement `process_frame()` method. ALWAYS explicitly call `await super().process_frame(frame, direction)` at the top of this method.
3. Handle frame direction (FrameDirection.UPSTREAM/DOWNSTREAM)
4. Add proper type hints and docstrings
### Transport Implementation
For new transport layers:
1. Inherit from `BaseTransport`
2. Implement required abstract methods
3. Handle connection lifecycle
4. Support both input and output streams
## Security Considerations
### API Keys
- **Never commit API keys** to the repository
- **Use environment variables** for all secrets
- **Reference `env.example`** for required variables
- **Use `.env` files** for local development
### Input Validation
- **Validate all external inputs** (audio, text, API responses)
- **Sanitize user data** before processing
- **Handle rate limiting** for external services
- **Implement proper timeout handling**
## Performance Guidelines
### Memory Management
- **Clean up resources** in transport disconnection handlers
- **Use async context managers** for service connections
- **Implement proper frame lifecycle** management
### Latency Optimization
- **Choose appropriate STT services** for latency requirements
- **Use streaming TTS** when possible
- **Implement connection pooling** for HTTP services
- **Consider WebRTC** for real-time applications
## Common Patterns
### Error Handling
```python
@transport.event_handler("on_error")
async def on_error(transport, error):
logger.error(f"Transport error: {error}")
# Shutdown the pipeline
await task.queue_frame(EndFrame())
```
### Service Configuration
```python
# Use environment variables for configuration
service = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY", ""),
model="gpt-4o",
params={"temperature": 0.7}
)
```
### Pipeline Assembly
```python
pipeline = Pipeline([
transport.input(),
stt_service,
context_aggregator.user(),
llm_service,
tts_service,
transport.output(),
context_aggregator.assistant(),
])
```
## Commit and PR Guidelines
### Commit Message Format
```
<type>(<scope>): <description>
[optional body]
[optional footer]
```
Types: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore`
### PR Requirements
- **All tests must pass**
- **Code must be properly formatted** (Ruff)
- **Include appropriate tests** for new functionality
- **Update documentation** if needed
- **Reference related issues** in description
### Review Process
1. Automated checks must pass
2. Manual code review by maintainers
3. Documentation review for user-facing changes
4. Integration testing for service additions
## Troubleshooting
### Common Issues
- **Import errors:** Run `uv sync` to ensure dependencies are installed
- **Test failures:** Check environment variables in `.env`
- **Format errors:** Run `uv run ruff format` before committing
- **Type errors:** Ensure all public methods have type hints
### Development Tips
- **Use foundational examples** as starting points for testing
- **Check existing services** for integration patterns
- **Run tests frequently** during development
- **Use IDE integration** for Ruff formatting
### Getting Help
- **Documentation:** [docs.pipecat.ai](https://docs.pipecat.ai)
- **Issues:** [GitHub Issues](https://github.com/pipecat-ai/pipecat/issues)

View File

@@ -7,203 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Added
- Added `on_pipeline_finished` event to `PipelineTask`. This event will get
fired when the pipeline is done running. This can be the result of a
`StopFrame`, `CancelFrame` or `EndFrame`.
```python
@task.event_handler("on_pipeline_finished")
async def on_pipeline_finished(task: PipelineTask, frame: Frame):
...
```
### Deprecated
- `PipelineTask` events `on_pipeline_stopped`, `on_pipeline_ended` and
`on_pipeline_cancelled` are now deprecated. Use `on_pipeline_finished`
instead.
### Fixed
- Fixed an issue in `AudioBufferProcessor` where a recording is not created
when a bot speaks and user input is blocked.
- Fixed a `FastAPIWebsocketTransport` and `SmallWebRTCTransport` issue where
`on_client_disconnected` would be triggered when the bot ends the
conversation. That is, `on_client_disconnected` should only be triggered when
the remote client actually disconnects.
- Fixed an issue in `HeyGenVideoService` where the `BotStartedSpeakingFrame`
was blocked from moving through the Pipeline.
## [0.0.85] - 2025-09-12
### Added
- `AzureSTTService` now pushes interim transcriptions.
- Added `voice_cloning_key` to `GoogleTTSService` to support custom cloned
voices.
- Added `speaking_rate` to `GoogleTTSService.InputParams` to control the
speaking rate.
- Added a `speed` arg to `OpenAITTSService` to control the speed of the voice
response.
- Added `FrameProcessor.push_interruption_task_frame_and_wait()`. Use this
method to programatically interrupt the bot from any part of the
pipeline. This guarantees that all the processors in the pipeline are
interrupted in order (from upstream to downstream). Internally, this works by
first pushing an `InterruptionTaskFrame` upstream until it reaches the
pipeline task. The pipeline task then generates an `InterruptionFrame`, which
flows downstream through all processors. Once the `InterruptionFrame` has
reaches the processor waiting for the interruption, the function returns and
execution continues after the call. Think of it as sending an upstream request
for interruption and waiting until the acknowledgment flows back downstream.
- Added new base `TaskFrame` (which is a system frame). This is the base class
for all task frames (`EndTaskFrame`, `CancelTaskFrame`, etc.) that are meant
to be pushed upstream to reach the pipeline task.
- Expanded support for universal `LLMContext` to the AWS Bedrock LLM service.
Using the universal `LLMContext` and associated `LLMContextAggregatorPair` is
a pre-requisite for using `LLMSwitcher` to switch between LLMs at runtime.
- Added new fields to the development runner's `parse_telephony_websocket`
method in support of providing dynamic data to a bot.
- Twilio: Added a new `body` parameter, which parses the websocket message
for `customParameters`. Provide data via the `Parameter` nouns in your
TwiML to use this feature.
- Telnyx & Exotel: Both providers make the `to` and `from` phone numbers
available in the websocket messages. You can now access these numbers as
`call_data["to"]` and `call_data["from"]`.
Note: Each telephony provider offers different features. Refer to the
corresponding example in `pipecat-examples` to see how to pass custom data
to your bot.
- Added `body` to the `WebsocketRunnerArguments` as an optional parameter.
Custom `body` information can be passed from the server into the bot file via
the `bot()` method using this new parameter.
- Added video streaming support to `LiveKitTransport`.
- Added `OpenAIRealtimeLLMService` and `AzureRealtimeLLMService` which provide
access to OpenAI Realtime.
### Changed
- `pipeline.tests.utils.run_test()` now allows passing `PipelineParams` instead
of individual parameters.
### Removed
- Remove `VisionImageRawFrame` in favor of context frames (`LLMContextFrame` or
`OpenAILLMContextFrame`).
### Deprecated
- `BotInterruptionFrame` is now deprecated, use `InterruptionTaskFrame` instead.
- `StartInterruptionFrame` is now deprected, use `InterruptionFrame` instead.
- Deprecate `VisionImageFrameAggregator` because `VisionImageRawFrame` has been
removed. See the `12*` examples for the new recommended replacement pattern.
- `NoisereduceFilter` is now deprecated and will be removed in a future
version. Use other audio filters like `KrispFilter` or `AICFilter`.
- Deprecated `OpenAIRealtimeBetaLLMService` and `AzureRealtimeBetaLLMService`.
Use `OpenAIRealtimeLLMService` and `AzureRealtimeLLMService`, respectively.
Each service will be removed in an upcoming version, 1.0.0.
### Fixed
- Fixed a `BaseOutputTransport` issue that caused incorrect detection of when
the bot stopped talking while using an audio mixer.
- Fixed a `LiveKitTransport` issue where RTVI messages were not properly
encoded.
- Add additional fixups to Mistral context messages to ensure they meet
Mistral-specific requirements, avoiding Mistral "invalid request" errors.
- Fixed `DailyTransport` transcription handling to gracefully handle missing
`rawResponse` field in transcription messages, preventing KeyError crashes.
## [0.0.84] - 2025-09-05
### Added
- Add the ability to send DTMF to `LiveKitTransport`.
- Expanded support for universal `LLMContext` to the Anthropic LLM service.
Using the universal `LLMContext` and associated `LLMContextAggregatorPair` is
a pre-requisite for using `LLMSwitcher` to switch between LLMs at runtime.
### Changed
- Updated `daily-python` to 0.19.9.
- Restored `DailyTransport`'s native DTMF support using Daily's `send_dtmf()`
method instead of generated audio tones.
### Fixed
- Fixed a `AWSBedrockLLMService` crash caused by an extra `await`.
- Fixed a `OpenAIImageGenService` issue where it was not creating
`URLImageRawFrame` correctly.
## [0.0.83] - 2025-09-03
### Added
- Added multilingual support for AsyncAI in `AsyncAITTSService` and `AsyncAIHttpTTSService`.
- New `languages`: `es`, `fr`, `de`, `it`.
- Added new frames `InputTransportMessageUrgentFrame` and
`DailyInputTransportMessageUrgentFrame` for transport messages received from
external sources.
- Added `UserSpeakingFrame`. This will be sent upstream and downstream while VAD
detects the user is speaking.
- Expanded support for universal `LLMContext` to more LLM services. Using the
universal `LLMContext` and associated `LLMContextAggregatorPair` is a
pre-requisite for using `LLMSwitcher` to switch between LLMs at runtime.
Here are the newly-supported services:
- Azure
- Cerebras
- Deepseek
- Fireworks AI
- Google Vertex AI
- Grok
- Groq
- Mistral
- NVIDIA NIM
- Ollama
- OpenPipe
- OpenRouter
- Perplexity
- Qwen
- SambaNova
- Together.ai
- Added support for WhatsApp User-initiated Calls.
- Added new audio filter `AICFilter`, speech enhancement for improving VAD/STT
performance, no ONNX dependency.
See https://ai-coustics.com/sdk/
- Added a timeout around cancel input tasks to prevent indefinite hangs when
cancellation is swallowed by third-party code.
## Added
- Added `pipecat.extensions.ivr` for automated IVR system navigation with
configurable goals and conversation handling. Supports DTMF input, verbal
@@ -238,34 +42,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
sending it through the transport. This makes sending DTMF generic across all
output transports.
- Added new config parameters to `GladiaSTTService`.
- Added new config parameters to `GladiaSTTService`.
- PreProcessingConfig > `audio_enhancer` to enhance audio quality.
- CustomVocabularyItem > `pronunciations` and `language` to specify special
pronunciations and in which language it will be pronounced.
- CustomVocabularyItem > `pronunciations` and `language` to specify special pronunciations and in which language it will be pronounced.
### Changed
- `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` are also pushed
upstream.
- `ParallelPipeline` now waits for `CancelFrame` to finish in all branches
before pushing it downstream.
- Added `sip_codecs` to the `DailyRoomSipParams`.
- Updated the `configure()` function in `pipecat.runner.daily` to include new
args to create SIP-enabled rooms. Additionally, added new args to control the
room and token expiration durations.
## Changed
- `pipecat.frames.frames.KeypadEntry` is deprecated and has been moved to
`pipecat.audio.dtmf.types.KeypadEntry`.
- Updated `RimeTTSService`'s flush_audio message to conform with Rime's official
API.
- Updated the default model for `CerebrasLLMService` to GPT-OSS-120B.
### Removed
## Removed
- Remove `StopInterruptionFrame`. This was a legacy frame that was not being
used really anywhere and it didn't provide any useful meaning. It was only
@@ -277,41 +63,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Remove deprecated `DailyTransport.send_dtmf()`.
### Deprecated
- Transports have been re-organized.
```
pipecat.transports.network.small_webrtc -> pipecat.transports.smallwebrtc.transport
pipecat.transports.network.webrtc_connection -> pipecat.transports.smallwebrtc.connection
pipecat.transports.network.websocket_client -> pipecat.transports.websocket.client
pipecat.transports.network.websocket_server -> pipecat.transports.websocket.server
pipecat.transports.network.fastapi_websocket -> pipecat.transports.websocket.fastapi
pipecat.transports.services.daily -> pipecat.transports.daily.transport
pipecat.transports.services.helpers.daily_rest -> pipecat.transports.daily.utils
pipecat.transports.services.livekit -> pipecat.transports.livekit.transport
pipecat.transports.services.tavus -> pipecat.transports.tavus.transport
```
## Deprecated
- `pipecat.frames.frames.KeypadEntry` is deprecated use
`pipecat.audio.dtmf.types.KeypadEntry` instead.
### Fixed
- Fixed an issue where messages received from the transport were always being resent.
- Fixed `SmallWebRTCTransport` to not use `mid` to decide if the transceiver should
be `sendrecv` or not.
- Fixed an issue where Deepgram swallowed `asyncio.CancelledError` during
disconnect, preventing tasks from being cancelled.
## Fixed
- Fixed an issue where `PipelineTask` was not cleaning up the observers.
### Performance
- Reduced latency and improved memory performance in `Mem0MemoryService`.
## [0.0.82] - 2025-08-28
### Added

View File

@@ -28,41 +28,6 @@
- **Composable Pipelines**: Build complex behavior from modular components
- **Real-Time**: Ultra-low latency interaction with different transports (e.g. WebSockets or WebRTC)
## 📱 Client SDKs
You can connect to Pipecat from any platform using our official SDKs:
<table>
<tr>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/javascript/javascript-original.svg" width="40" height="40" alt="JavaScript"/>
<a href="https://docs.pipecat.ai/client/js/introduction">JavaScript</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React"/>
<a href="https://docs.pipecat.ai/client/react/introduction">React</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/react/react-original.svg" width="40" height="40" alt="React Native"/>
<a href="https://docs.pipecat.ai/client/react-native/introduction">React Native</a>
</td>
</tr>
<tr>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/swift/swift-original.svg" width="40" height="40" alt="Swift"/>
<a href="https://docs.pipecat.ai/client/ios/introduction">Swift</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/kotlin/kotlin-original.svg" width="40" height="40" alt="Kotlin"/>
<a href="https://docs.pipecat.ai/client/android/introduction">Kotlin</a>
</td>
<td>
<img src="https://cdn.jsdelivr.net/gh/devicons/devicon/icons/cplusplus/cplusplus-original.svg" width="40" height="40" alt="JavaScript"/>
<a href="https://docs.pipecat.ai/client/c++/introduction">C++</a>
</td>
</tr>
</table>
## 🎬 See it in action
<p float="left">
@@ -73,6 +38,17 @@ You can connect to Pipecat from any platform using our official SDKs:
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/moondream-chatbot/image.png" width="400" /></a>
</p>
## 📱 Client SDKs
You can connect to Pipecat from any platform using our official SDKs:
| Platform | SDK Repo | Description |
| -------- | ------------------------------------------------------------------------------ | -------------------------------- |
| Web | [pipecat-client-web](https://github.com/pipecat-ai/pipecat-client-web) | JavaScript and React client SDKs |
| iOS | [pipecat-client-ios](https://github.com/pipecat-ai/pipecat-client-ios) | Swift SDK for iOS |
| Android | [pipecat-client-android](https://github.com/pipecat-ai/pipecat-client-android) | Kotlin SDK for Android |
| C++ | [pipecat-client-cxx](https://github.com/pipecat-ai/pipecat-client-cxx) | C++ client SDK |
## 🧩 Available services
| Category | Services |
@@ -86,7 +62,7 @@ You can connect to Pipecat from any platform using our official SDKs:
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
@@ -153,11 +129,7 @@ You can get started with Pipecat running on your local machine, then move your a
2. Install development and testing dependencies:
```bash
uv sync --group dev --all-extras \
--no-extra gstreamer \
--no-extra krisp \
--no-extra local \
--no-extra ultravox # (ultravox not fully supported on macOS)
uv sync --group dev --all-extras --no-extra gstreamer --no-extra krisp --no-extra local
```
3. Install the git pre-commit hooks:
@@ -166,6 +138,23 @@ You can get started with Pipecat running on your local machine, then move your a
uv run pre-commit install
```
### Python 3.13+ Compatibility
Some features require PyTorch, which doesn't yet support Python 3.13+. Install using:
```bash
uv sync --group dev --all-extras \
--no-extra gstreamer \
--no-extra krisp \
--no-extra local \
--no-extra local-smart-turn \
--no-extra mlx-whisper \
--no-extra moondream \
--no-extra ultravox
```
> **Tip:** For full compatibility, use Python 3.12: `uv python pin 3.12`
> **Note**: Some extras (local, gstreamer) require system dependencies. See documentation if you encounter build errors.
### Running tests

View File

@@ -1,6 +1,3 @@
# AI-COUSTICS
AICOUSTICS_LICENSE_KEY=...
# Anthropic
ANTHROPIC_API_KEY=...
@@ -146,12 +143,3 @@ SENTRY_DSN=...
# Heygen
HEYGEN_API_KEY=...
# Mistral
MISTRAL_API_KEY=...
# NVIDIA
NVIDIA_API_KEY=...
# Qwen
QWEN_API_KEY=...

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.piper.tts import PiperTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.rime.tts import RimeHttpTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -17,8 +17,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -17,7 +17,7 @@ from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.runner.livekit import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.livekit.transport import LiveKitParams, LiveKitTransport
from pipecat.transports.services.livekit import LiveKitParams, LiveKitTransport
load_dotenv(override=True)

View File

@@ -17,8 +17,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.riva.tts import FastPitchTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -18,7 +18,7 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.fal.image import FalImageGenService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -17,7 +17,7 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.google.image import GoogleImageGenService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -27,8 +27,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.smallwebrtc.connection import IceServer, SmallWebRTCConnection
from pipecat.transports.smallwebrtc.transport import SmallWebRTCTransport
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
from pipecat.transports.network.webrtc_connection import IceServer, SmallWebRTCConnection
load_dotenv(override=True)

View File

@@ -21,7 +21,7 @@ from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.runner.daily import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.daily.transport import DailyLogLevel, DailyParams, DailyTransport
from pipecat.transports.services.daily import DailyLogLevel, DailyParams, DailyTransport
load_dotenv(override=True)

View File

@@ -14,7 +14,7 @@ from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import (
InterruptionFrame,
BotInterruptionFrame,
TextFrame,
TranscriptionFrame,
UserStartedSpeakingFrame,
@@ -28,7 +28,7 @@ from pipecat.runner.livekit import configure
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.livekit.transport import LiveKitParams, LiveKitTransport
from pipecat.transports.services.livekit import LiveKitParams, LiveKitTransport
load_dotenv(override=True)
@@ -115,7 +115,7 @@ async def main():
await task.queue_frames(
[
InterruptionFrame(),
BotInterruptionFrame(),
UserStartedSpeakingFrame(),
TranscriptionFrame(
user_id=participant_id,

View File

@@ -33,7 +33,7 @@ from pipecat.services.cartesia.tts import CartesiaHttpTTSService
from pipecat.services.fal.image import FalImageGenService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -28,8 +28,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -29,7 +29,7 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -66,7 +66,7 @@ class ImageSyncAggregator(FrameProcessor):
)
)
await self.push_frame(frame, direction)
await self.push_frame(frame)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get

View File

@@ -21,8 +21,8 @@ from pipecat.services.cartesia.tts import CartesiaHttpTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -21,8 +21,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -26,8 +26,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.soniox.stt import SonioxSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.inworld.tts import InworldTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.asyncai.tts import AsyncAIHttpTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.asyncai.tts import AsyncAITTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -1,162 +0,0 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import datetime
import os
import wave
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.filters.aic_filter import AICFilter
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# Create audio buffer processor so we can hear the audio fitler results.
audiobuffer = AudioBufferProcessor(
num_channels=2, # 1 for mono, 2 for stereo (user left, bot right)
enable_turn_audio=False, # Enable per-turn audio recording
)
def _create_aic_filter() -> AICFilter:
license_key = os.getenv("AICOUSTICS_LICENSE_KEY", "")
return AICFilter(
license_key=license_key,
enhancement_level=1.0,
)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
audio_in_filter=_create_aic_filter(),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
audio_in_filter=_create_aic_filter(),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
audio_in_filter=_create_aic_filter(),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
audiobuffer, # write audio data to a file
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
await audiobuffer.start_recording()
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio, sample_rate, num_channels):
# Save or process the composite audio
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"./conversation_{timestamp}.wav"
# Create the WAV file
with wave.open(filename, "wb") as wf:
wf.setnchannels(num_channels)
wf.setsampwidth(2) # 16-bit audio
wf.setframerate(sample_rate)
wf.writeframes(audio)
logger.info(f"Saved recording to {filename}")
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -33,8 +33,8 @@ from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -12,8 +12,8 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.frames.frames import (
InterruptionFrame,
LLMRunFrame,
StartInterruptionFrame,
UserStartedSpeakingFrame,
UserStoppedSpeakingFrame,
)
@@ -27,8 +27,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -97,7 +97,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
@stt.event_handler("on_speech_started")
async def on_speech_started(stt, *args, **kwargs):
await task.queue_frames([InterruptionFrame(), UserStartedSpeakingFrame()])
await task.queue_frames([StartInterruptionFrame(), UserStartedSpeakingFrame()])
@stt.event_handler("on_utterance_end")
async def on_utterance_end(stt, *args, **kwargs):

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.elevenlabs.tts import ElevenLabsHttpTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.playht.tts import PlayHTHttpTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.playht.tts import PlayHTTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.azure.llm import AzureLLMService
from pipecat.services.azure.stt import AzureSTTService
from pipecat.services.azure.tts import AzureTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.openai.stt import OpenAISTTService
from pipecat.services.openai.tts import OpenAITTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openpipe.llm import OpenPipeLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.xtts.tts import XTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.services.gladia.stt import GladiaSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.lmnt.tts import LmntTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.groq.llm import GroqLLMService
from pipecat.services.groq.stt import GroqSTTService
from pipecat.services.groq.tts import GroqTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -20,8 +20,8 @@ from pipecat.services.aws.llm import AWSBedrockLLMService
from pipecat.services.aws.stt import AWSTranscribeSTTService
from pipecat.services.aws.tts import AWSPollyTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -41,8 +41,8 @@ from pipecat.services.google.stt import GoogleSTTService
from pipecat.services.google.tts import GeminiTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.google.stt import GoogleSTTService
from pipecat.services.google.tts import GoogleTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.assemblyai.stt import AssemblyAISTTService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.rime.tts import RimeHttpTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.rime.tts import RimeTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.nim.llm import NimLLMService
from pipecat.services.riva.stt import RivaSTTService
from pipecat.services.riva.tts import RivaTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -16,10 +16,10 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import (
Frame,
InputAudioRawFrame,
InterruptionFrame,
LLMFullResponseEndFrame,
LLMFullResponseStartFrame,
LLMRunFrame,
StartInterruptionFrame,
TextFrame,
TranscriptionFrame,
UserStartedSpeakingFrame,
@@ -36,8 +36,8 @@ from pipecat.services.google.llm import GoogleLLMService
from pipecat.services.google.tts import GoogleTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -93,8 +93,9 @@ class UserAudioCollector(FrameProcessor):
elif isinstance(frame, UserStoppedSpeakingFrame):
self._user_speaking = False
self._context.add_audio_frames_message(audio_frames=self._audio_frames)
await self._user_context_aggregator.push_frame(LLMRunFrame())
await self._user_context_aggregator.push_frame(
self._user_context_aggregator.get_context_frame()
)
elif isinstance(frame, InputAudioRawFrame):
if self._user_speaking:
self._audio_frames.append(frame)
@@ -150,7 +151,7 @@ class TranscriptExtractor(FrameProcessor):
await self.push_frame(frame, direction)
class TranscriptionContextFixup(FrameProcessor):
class TanscriptionContextFixup(FrameProcessor):
def __init__(self, context):
super().__init__()
self._context = context
@@ -181,7 +182,9 @@ class TranscriptionContextFixup(FrameProcessor):
if isinstance(frame, MagicDemoTranscriptionFrame):
self._transcript = frame.text
elif isinstance(frame, LLMFullResponseEndFrame) or isinstance(frame, InterruptionFrame):
elif isinstance(frame, LLMFullResponseEndFrame) or isinstance(
frame, StartInterruptionFrame
):
self.swap_user_audio()
self.add_transcript_back_to_inference_output()
self._transcript = ""
@@ -242,7 +245,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
context_aggregator = llm.create_context_aggregator(context)
audio_collector = UserAudioCollector(context, context_aggregator.user())
pull_transcript_out_of_llm_output = TranscriptExtractor(context)
fixup_context_messages = TranscriptionContextFixup(context)
fixup_context_messages = TanscriptionContextFixup(context)
pipeline = Pipeline(
[

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.fish.tts import FishAudioTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -19,8 +19,8 @@ from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.ultravox.stt import UltravoxSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.neuphonic.tts import NeuphonicHttpTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.neuphonic.tts import NeuphonicTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.fal.stt import FalSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.services.minimax.tts import MiniMaxHttpTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.sarvam.tts import SarvamHttpTTSService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.sarvam.tts import SarvamTTSService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -17,7 +17,7 @@ from pipecat.runner.daily import configure
from pipecat.services.azure import AzureLLMService, AzureTTSService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.fal import FalImageGenService
from pipecat.transports.daily.transport import DailyTransport
from pipecat.transports.services.daily import DailyTransport
load_dotenv(override=True)

View File

@@ -22,7 +22,7 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -24,8 +24,8 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport, maybe_capture_participant_camera
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.local.tk import TkLocalTransport, TkTransportParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -22,8 +22,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -32,8 +32,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -11,19 +11,12 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
@@ -35,14 +28,12 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.moondream.vision import MoondreamService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
@@ -55,32 +46,9 @@ class UserImageRequester(FrameProcessor):
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -110,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
image_processor = UserImageProcessor()
vision_aggregator = VisionImageFrameAggregator()
# If you run into weird description, try with use_cpu=True
moondream = MoondreamService()
@@ -128,7 +96,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt,
user_response,
image_requester,
image_processor,
vision_aggregator,
moondream,
tts,
transport.output(),
@@ -151,7 +119,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):

View File

@@ -11,19 +11,12 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
@@ -35,14 +28,12 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.google.llm import GoogleLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
@@ -55,32 +46,9 @@ class UserImageRequester(FrameProcessor):
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -110,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
image_processor = UserImageProcessor()
vision_aggregator = VisionImageFrameAggregator()
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
@@ -128,7 +96,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt,
user_response,
image_requester,
image_processor,
vision_aggregator,
google,
tts,
transport.output(),
@@ -155,7 +123,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):

View File

@@ -11,19 +11,12 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
@@ -35,14 +28,12 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
@@ -55,32 +46,9 @@ class UserImageRequester(FrameProcessor):
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -110,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
image_processor = UserImageProcessor()
vision_aggregator = VisionImageFrameAggregator()
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
@@ -128,7 +96,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt,
user_response,
image_requester,
image_processor,
vision_aggregator,
openai,
tts,
transport.output(),
@@ -155,7 +123,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):

View File

@@ -11,19 +11,12 @@ from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.frames.frames import Frame, TextFrame, TTSSpeakFrame, UserImageRequestFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.aggregators.vision_image_frame import VisionImageFrameAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
@@ -35,14 +28,12 @@ from pipecat.services.anthropic.llm import AnthropicLLMService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
@@ -55,32 +46,9 @@ class UserImageRequester(FrameProcessor):
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
UserImageRequestFrame(self._participant_id), FrameDirection.UPSTREAM
)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -110,7 +78,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
image_processor = UserImageProcessor()
vision_aggregator = VisionImageFrameAggregator()
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
@@ -128,7 +96,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
stt,
user_response,
image_requester,
image_processor,
vision_aggregator,
anthropic,
tts,
transport.output(),
@@ -155,7 +123,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):

View File

@@ -1,187 +0,0 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
from typing import Optional
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import (
Frame,
LLMContextFrame,
TextFrame,
TTSSpeakFrame,
UserImageRawFrame,
UserImageRequestFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.user_response import UserResponseAggregator
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
create_transport,
get_transport_client_id,
maybe_capture_participant_camera,
)
from pipecat.services.aws.llm import AWSBedrockLLMService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
class UserImageRequester(FrameProcessor):
"""Converts incoming text into requests for user images."""
def __init__(self, participant_id: Optional[str] = None):
super().__init__()
self._participant_id = participant_id
def set_participant_id(self, participant_id: str):
self._participant_id = participant_id
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if self._participant_id and isinstance(frame, TextFrame):
await self.push_frame(
UserImageRequestFrame(self._participant_id, context=frame.text),
FrameDirection.UPSTREAM,
)
else:
await self.push_frame(frame, direction)
class UserImageProcessor(FrameProcessor):
"""Converts incoming user images into context frames."""
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, UserImageRawFrame):
if frame.request and frame.request.context:
# Note: AWS Bedrock does not yet support the universal LLMContext
context = LLMContext()
context.add_image_frame_message(
image=frame.image,
text=frame.request.context,
size=frame.size,
format=frame.format,
)
frame = LLMContextFrame(context)
await self.push_frame(frame)
else:
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_in_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_in_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
user_response = UserResponseAggregator()
# Initialize the image requester without setting the participant ID yet
image_requester = UserImageRequester()
image_processor = UserImageProcessor()
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
# AWS for vision analysis
aws = AWSBedrockLLMService(
aws_region="us-west-2",
model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
# Note: usually, prefer providing latency="optimized" param.
# Here we can't because AWS Bedrock doesn't support it for Claude 3.7,
# which we need for image input.
params=AWSBedrockLLMService.InputParams(temperature=0.8),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
pipeline = Pipeline(
[
transport.input(),
stt,
user_response,
image_requester,
image_processor,
aws,
tts,
transport.output(),
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected: {client}")
await maybe_capture_participant_camera(transport, client)
# Set the participant ID in the image requester
client_id = get_transport_client_id(transport, client)
image_requester.set_participant_id(client_id)
# Welcome message
await task.queue_frame(TTSSpeakFrame("Hi there! Feel free to ask me about what I see."))
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.whisper.stt import WhisperSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -31,9 +31,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -32,9 +32,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
async def main():
transport = LocalAudioTransport(

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService, Language, LiveOptions
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -31,9 +31,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.gladia import GladiaSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -31,9 +31,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -25,8 +25,8 @@ from pipecat.services.gladia.config import (
from pipecat.services.gladia.stt import GladiaSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -40,9 +40,6 @@ class TranscriptionLogger(FrameProcessor):
elif isinstance(frame, TranslationFrame):
print(f"Translation ({frame.language}): {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.assemblyai.stt import AssemblyAISTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -31,9 +31,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -20,8 +20,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.whisper.stt import MLXModel, WhisperSTTServiceMLX
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -52,9 +52,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
self._last_transcription_time = time.time()
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -18,8 +18,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.stt import CartesiaSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -31,9 +31,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -21,8 +21,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.sambanova.stt import SambaNovaSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -53,9 +53,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
self._last_transcription_time = time.time()
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -19,8 +19,8 @@ from pipecat.runner.utils import create_transport
from pipecat.services.speechmatics.stt import SpeechmaticsSTTService
from pipecat.transcriptions.language import Language
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -32,9 +32,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets

View File

@@ -19,8 +19,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.soniox.stt import SonioxSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -32,9 +32,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
transport_params = {
"daily": lambda: DailyParams(

View File

@@ -19,8 +19,8 @@ from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.azure.stt import AzureSTTService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -32,9 +32,6 @@ class TranscriptionLogger(FrameProcessor):
if isinstance(frame, TranscriptionFrame):
print(f"Transcription: {frame.text}")
# Push all frames through
await self.push_frame(frame, direction)
transport_params = {
"daily": lambda: DailyParams(

View File

@@ -24,8 +24,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -1,214 +0,0 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
create_transport,
get_transport_client_id,
maybe_capture_participant_camera,
)
from pipecat.services.aws.llm import AWSBedrockLLMService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
# Global variable to store the client ID
client_id = ""
async def get_weather(params: FunctionCallParams):
location = params.arguments["location"]
await params.result_callback(f"The weather in {location} is currently 72 degrees and sunny.")
async def get_image(params: FunctionCallParams):
question = params.arguments["question"]
logger.debug(f"Requesting image with user_id={client_id}, question={question}")
# Request the image frame
await params.llm.request_image_frame(
user_id=client_id,
function_name=params.function_name,
tool_call_id=params.tool_call_id,
text_content=question,
)
# Wait a short time for the frame to be processed
await asyncio.sleep(0.5)
# Return a result to complete the function call
await params.result_callback(
f"I've captured an image from your camera and I'm analyzing what you asked about: {question}"
)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_in_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_in_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = AWSBedrockLLMService(
aws_region="us-west-2",
model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
# Note: usually, prefer providing latency="optimized" param.
# Here we can't because AWS Bedrock doesn't support it for Claude 3.7,
# which we need for image input.
params=AWSBedrockLLMService.InputParams(temperature=0.8),
)
llm.register_function("get_weather", get_weather)
llm.register_function("get_image", get_image)
weather_function = FunctionSchema(
name="get_weather",
description="Get the current weather",
properties={
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
},
required=["location"],
)
get_image_function = FunctionSchema(
name="get_image",
description="Get an image from the video stream.",
properties={
"question": {
"type": "string",
"description": "The question that the user is asking about the image.",
}
},
required=["question"],
)
tools = ToolsSchema(standard_tools=[weather_function, get_image_function])
system_prompt = """\
You are a helpful assistant who converses with a user and answers questions. Respond concisely to general questions.
Your response will be turned into speech so use only simple words and punctuation.
You have access to two tools: get_weather and get_image.
You can respond to questions about the weather using the get_weather tool.
You can answer questions about the user's video stream using the get_image tool. Some examples of phrases that \
indicate you should use the get_image tool are:
- What do you see?
- What's in the video?
- Can you describe the video?
- Tell me about what you see.
- Tell me something interesting about what you see.
- What's happening in the video?
If you need to use a tool, simply use the tool. Do not tell the user the tool you are using. Be brief and concise.
"""
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": "Start the conversation by introducing yourself."},
]
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt, # STT
context_aggregator.user(), # User speech to text
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses and tool context
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected: {client}")
await maybe_capture_participant_camera(transport, client)
global client_id
client_id = get_transport_client_id(transport, client)
# Kick off the conversation.
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -30,7 +30,7 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -97,7 +97,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm = AnthropicLLMService(
api_key=os.getenv("ANTHROPIC_API_KEY"),
model="claude-3-7-sonnet-latest",
params=AnthropicLLMService.InputParams(enable_prompt_caching=True),
enable_prompt_caching_beta=True,
)
llm.register_function("get_weather", get_weather)
llm.register_function("get_image", get_image)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.together.llm import TogetherLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -30,7 +30,7 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -30,7 +30,7 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.google.llm import GoogleLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -26,8 +26,8 @@ from pipecat.services.groq.llm import GroqLLMService
from pipecat.services.groq.stt import GroqSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.grok.llm import GrokLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.fireworks.llm import FireworksLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -75,13 +75,9 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# sent to the same callback with an additional function_name parameter.
llm.register_function("get_current_weather", fetch_weather_from_api)
# Disabling for now, as it ends up tripping up the model in this example
# ("let me check on that" ends up at the end of the context, which the
# model erroneously treats as a nudge to call the tool again; the
# ensuing inference then yields wonky results).
# @llm.event_handler("on_function_calls_started")
# async def on_function_calls_started(service, function_calls):
# await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
weather_function = FunctionSchema(
name="get_current_weather",
@@ -103,7 +99,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way. Start by saying hello.",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.nim.llm import NimLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.cerebras.llm import CerebrasLLMService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)
@@ -67,7 +67,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = CerebrasLLMService(api_key=os.getenv("CEREBRAS_API_KEY"))
llm = CerebrasLLMService(api_key=os.getenv("CEREBRAS_API_KEY"), model="llama-3.3-70b")
# You can also register a function_name of None to get all functions
# sent to the same callback with an additional function_name parameter.
llm.register_function("get_current_weather", fetch_weather_from_api)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepseek.llm import DeepSeekLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.openrouter.llm import OpenRouterLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -28,8 +28,8 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.perplexity.llm import PerplexityLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.google.llm_openai import GoogleLLMOpenAIBetaService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.google.llm_vertex import GoogleVertexLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -25,8 +25,8 @@ from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.qwen.llm import QwenLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

View File

@@ -23,8 +23,8 @@ from pipecat.services.aws.stt import AWSTranscribeSTTService
from pipecat.services.aws.tts import AWSPollyTTSService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
from pipecat.transports.services.daily import DailyParams
load_dotenv(override=True)

Some files were not shown because too many files have changed in this diff Show More