Compare commits
203 Commits
hush/rtviS
...
aleix/queu
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
41695806e8 | ||
|
|
7280e390d9 | ||
|
|
4efc3f0a39 | ||
|
|
cb7e7a8aa3 | ||
|
|
9136402846 | ||
|
|
260fc76137 | ||
|
|
7cfb9a4d15 | ||
|
|
2089e0c974 | ||
|
|
9e0b4fe5d1 | ||
|
|
75ce632f84 | ||
|
|
efeb96c4e8 | ||
|
|
fb5438e9c2 | ||
|
|
7da9f66e1c | ||
|
|
9e16e3d614 | ||
|
|
84d040c6d0 | ||
|
|
f3e0beb8f1 | ||
|
|
e00a1196ef | ||
|
|
3867c0f8e7 | ||
|
|
cdf0953722 | ||
|
|
ed00f7d071 | ||
|
|
a3038afa02 | ||
|
|
f9ca0b8cc6 | ||
|
|
2920aa5af4 | ||
|
|
93c9cc4a0e | ||
|
|
b53f9235e4 | ||
|
|
1491462d15 | ||
|
|
c78f779800 | ||
|
|
b013e375fb | ||
|
|
52036138c1 | ||
|
|
4ba9a42861 | ||
|
|
27bff7a759 | ||
|
|
896f8d85f7 | ||
|
|
ed06cdd2c7 | ||
|
|
8473647269 | ||
|
|
5579145a06 | ||
|
|
35848d10b3 | ||
|
|
c7e223e85a | ||
|
|
885b2d1d2f | ||
|
|
73020be511 | ||
|
|
d388c057c0 | ||
|
|
c4d0f91a7f | ||
|
|
467233be04 | ||
|
|
2b02d08f4c | ||
|
|
9fe265ea64 | ||
|
|
cc1f4ba81c | ||
|
|
3784bdbd27 | ||
|
|
4ffdc3b77c | ||
|
|
38c9fa681a | ||
|
|
c477039954 | ||
|
|
d6ef3d64ac | ||
|
|
6938152db6 | ||
|
|
2154db07f0 | ||
|
|
5e0803479e | ||
|
|
3960c604a4 | ||
|
|
394648f1c9 | ||
|
|
da5c4953d5 | ||
|
|
2b7e1cb5b1 | ||
|
|
f182eafb40 | ||
|
|
9f7f42e885 | ||
|
|
9b8bce1914 | ||
|
|
96d05e12fc | ||
|
|
68c1069548 | ||
|
|
5b64613f65 | ||
|
|
1f9baefba8 | ||
|
|
0c255d2618 | ||
|
|
a38206de9c | ||
|
|
260f7c9b85 | ||
|
|
de294caed9 | ||
|
|
e40aa4f99a | ||
|
|
b1d413b9be | ||
|
|
8cbad070ad | ||
|
|
13569a5a5a | ||
|
|
d789334a60 | ||
|
|
7668b27fc0 | ||
|
|
6d30f441e8 | ||
|
|
a9e395b366 | ||
|
|
5e5626f04f | ||
|
|
d80aa5b44e | ||
|
|
80ef6dc4de | ||
|
|
458549f7df | ||
|
|
a8405649d0 | ||
|
|
ce1a72850b | ||
|
|
58de381746 | ||
|
|
bed2e894a2 | ||
|
|
b4de98cfb7 | ||
|
|
a4b9db9e07 | ||
|
|
664111a3c9 | ||
|
|
aa964847f3 | ||
|
|
fa5cac7e0a | ||
|
|
b2b01861b2 | ||
|
|
f014f718eb | ||
|
|
05ae8d3ffa | ||
|
|
88c9e08bd8 | ||
|
|
844f61dfea | ||
|
|
acb7d597cb | ||
|
|
2b18f60261 | ||
|
|
5b66133a6c | ||
|
|
0c5bc6a57a | ||
|
|
7981e00955 | ||
|
|
5e39c0cfeb | ||
|
|
a444701929 | ||
|
|
f6c1eb5d9d | ||
|
|
a1d46cb26b | ||
|
|
99ab148d88 | ||
|
|
d69fa5dba5 | ||
|
|
0d30b000af | ||
|
|
e7c0e742d2 | ||
|
|
2aff2dcca3 | ||
|
|
288f8865c8 | ||
|
|
8691870bcb | ||
|
|
e06146c237 | ||
|
|
c68e990cda | ||
|
|
4583905313 | ||
|
|
9cc498b1fa | ||
|
|
b3c5dc4045 | ||
|
|
3824da7261 | ||
|
|
855d567b1e | ||
|
|
b323a7bd88 | ||
|
|
fa011d0018 | ||
|
|
e15fa8777a | ||
|
|
2143a6d927 | ||
|
|
044e2d3e73 | ||
|
|
be112ec63f | ||
|
|
d2f56c4e8f | ||
|
|
ddc6a9c695 | ||
|
|
2bebdbc371 | ||
|
|
8b9f1f0608 | ||
|
|
b25f3b2ed2 | ||
|
|
a995cf81b6 | ||
|
|
75d261639f | ||
|
|
f720d795d0 | ||
|
|
f6fe83e358 | ||
|
|
0513d0b6a8 | ||
|
|
0679bb217d | ||
|
|
38bd55e518 | ||
|
|
65c7423280 | ||
|
|
f24a85cc94 | ||
|
|
53887b7c98 | ||
|
|
523c012c38 | ||
|
|
97c28989c1 | ||
|
|
c19be6ebb2 | ||
|
|
54971a0735 | ||
|
|
4513e81e13 | ||
|
|
872204b795 | ||
|
|
a94cbfe6f5 | ||
|
|
7152faafb2 | ||
|
|
e6aadaccd8 | ||
|
|
3a73aa71b8 | ||
|
|
814e7509e1 | ||
|
|
e0cf5ec016 | ||
|
|
667bd32e6a | ||
|
|
b2ecd83706 | ||
|
|
b2754117c8 | ||
|
|
6c428c303b | ||
|
|
e7d889a143 | ||
|
|
da60e7069b | ||
|
|
c14406a3b9 | ||
|
|
725ab5ec21 | ||
|
|
daf9d47e58 | ||
|
|
63a65627a2 | ||
|
|
02c07755b0 | ||
|
|
15cbd18acc | ||
|
|
93c40b87dc | ||
|
|
eeaa9f67a1 | ||
|
|
b60691c7b2 | ||
|
|
2bb1b0b343 | ||
|
|
047ef9f86c | ||
|
|
9a2c603c91 | ||
|
|
94c4169407 | ||
|
|
cb8a551db8 | ||
|
|
779f09af70 | ||
|
|
19dc0f2bfb | ||
|
|
f0709e22ba | ||
|
|
8250736f5e | ||
|
|
83348a9f93 | ||
|
|
96d40903a9 | ||
|
|
2560811805 | ||
|
|
2b8c44c008 | ||
|
|
38e2d37674 | ||
|
|
6278561f88 | ||
|
|
750e79c1ce | ||
|
|
71eb2963c5 | ||
|
|
f44e2c86ea | ||
|
|
afe1f0df8c | ||
|
|
458fddfb48 | ||
|
|
8d915c5ccb | ||
|
|
304153dd03 | ||
|
|
a6781b7352 | ||
|
|
5ad0058303 | ||
|
|
75c039de33 | ||
|
|
74e3c3677e | ||
|
|
dc20327f10 | ||
|
|
e738affd29 | ||
|
|
ef3d732607 | ||
|
|
6d63cff1bf | ||
|
|
12f42605a1 | ||
|
|
fac3337927 | ||
|
|
76d198151c | ||
|
|
6a907058de | ||
|
|
6e1f531f64 | ||
|
|
4232cca5b6 | ||
|
|
c510870736 | ||
|
|
e8783f6a33 |
137
CHANGELOG.md
@@ -5,10 +5,113 @@ All notable changes to **Pipecat** will be documented in this file.
|
|||||||
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
||||||
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
||||||
|
|
||||||
## [Unreleased]
|
## [0.0.67] - 2025-05-07
|
||||||
|
|
||||||
### Added
|
### Added
|
||||||
|
|
||||||
|
- Added `DebugLogObserver` for detailed frame logging with configurable
|
||||||
|
filtering by frame type and endpoint. This observer automatically extracts
|
||||||
|
and formats all frame data fields for debug logging.
|
||||||
|
|
||||||
|
- `UserImageRequestFrame.video_source` field has been added to request an image
|
||||||
|
from the desired video source.
|
||||||
|
|
||||||
|
- Added support for the AWS Nova Sonic speech-to-speech model with the new
|
||||||
|
`AWSNovaSonicLLMService`.
|
||||||
|
See https://docs.aws.amazon.com/nova/latest/userguide/speech.html.
|
||||||
|
Note that it requires Python >= 3.12 and `pip install pipecat-ai[aws-nova-sonic]`.
|
||||||
|
|
||||||
|
- Added new AWS services `AWSBedrockLLMService` and `AWSTranscribeSTTService`.
|
||||||
|
|
||||||
|
- Added `on_active_speaker_changed` event handler to the `DailyTransport` class.
|
||||||
|
|
||||||
|
- Added `enable_ssml_parsing` and `enable_logging` to `InputParams` in
|
||||||
|
`ElevenLabsTTSService`.
|
||||||
|
|
||||||
|
- Added support to `RimeHttpTTSService` for the `arcana` model.
|
||||||
|
|
||||||
|
### Changed
|
||||||
|
|
||||||
|
- Updated `ElevenLabsTTSService` to use the beta websocket API
|
||||||
|
(multi-stream-input). This new API supports context_ids and cancelling those
|
||||||
|
contexts, which greatly improves interruption handling.
|
||||||
|
|
||||||
|
- The observers' `on_push_frame()` method now takes a single argument `FramePushed` instead
|
||||||
|
of multiple arguments.
|
||||||
|
|
||||||
|
- Updated the default voice for `DeepgramTTSService` to `aura-2-helena-en`.
|
||||||
|
|
||||||
|
### Deprecated
|
||||||
|
|
||||||
|
- `PollyTTSService` is now deprecated, use `AWSPollyTTSService` instead.
|
||||||
|
|
||||||
|
- Observer `on_push_frame(src, dst, frame, direction, timestamp)` is now
|
||||||
|
deprecated, use `on_push_frame(data: FramePushed)` instead.
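The signature change described in the entries above can be sketched as follows. This is a minimal, self-contained illustration, not Pipecat's own code: the `FramePushed` dataclass here is a local stand-in, its field names (`source`, `destination`, `frame`, `direction`, `timestamp`) are assumed to mirror the five deprecated positional arguments, and `LoggingObserver` is a hypothetical observer class.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class FramePushed:
    """Local stand-in for the real FramePushed type; field names are assumptions."""

    source: Any
    destination: Any
    frame: Any
    direction: Any
    timestamp: int


class LoggingObserver:
    """Hypothetical observer written against the single-argument signature."""

    def __init__(self) -> None:
        self.seen = []

    # Old style (deprecated): on_push_frame(self, src, dst, frame, direction, timestamp)
    async def on_push_frame(self, data: FramePushed) -> None:
        # Everything that used to arrive as separate arguments is read from `data`.
        self.seen.append((data.source, data.destination, data.frame))


async def demo() -> LoggingObserver:
    observer = LoggingObserver()
    await observer.on_push_frame(FramePushed("stt", "llm", "hello", "downstream", 1))
    return observer
```

Bundling the arguments into one object means further fields can be added later without breaking every observer's method signature.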
|
||||||
|
|
||||||
|
### Fixed
|
||||||
|
|
||||||
|
- Fixed a `DailyTransport` issue that was causing problems when multiple audio or
|
||||||
|
video sources were being captured.
|
||||||
|
|
||||||
|
- Fixed an `UltravoxSTTService` issue that would cause the service to generate
|
||||||
|
all tokens as one word.
|
||||||
|
|
||||||
|
- Fixed a `PipelineTask` issue that would cause tasks to not be cancelled if
|
||||||
|
the task was cancelled from outside of Pipecat.
|
||||||
|
|
||||||
|
- Fixed a `TaskManager` issue that was causing dangling tasks to be reported.
|
||||||
|
|
||||||
|
- Fixed an issue that could cause data to be sent to the transports when they
|
||||||
|
were still not ready.
|
||||||
|
|
||||||
|
- Remove custom audio tracks from `DailyTransport` before leaving.
|
||||||
|
|
||||||
|
### Removed
|
||||||
|
|
||||||
|
- Removed `CanonicalMetricsService` as it's no longer maintained.
|
||||||
|
|
||||||
|
## [0.0.66] - 2025-05-02
|
||||||
|
|
||||||
|
### Added
|
||||||
|
|
||||||
|
- Added two new input parameters to `RimeTTSService`: `pause_between_brackets`
|
||||||
|
and `phonemize_between_brackets`.
|
||||||
|
|
||||||
|
- Added support for cross-platform local smart turn detection. You can use
|
||||||
|
`LocalSmartTurnAnalyzer` for on-device inference using Torch.
|
||||||
|
|
||||||
|
- `BaseOutputTransport` now allows multiple destinations if the transport
|
||||||
|
implementation supports it (e.g. Daily's custom tracks). With multiple
|
||||||
|
destinations it is possible to send different audio or video tracks with a
|
||||||
|
single transport simultaneously. To do that, you need to set the new
|
||||||
|
`Frame.transport_destination` field with your desired transport destination
|
||||||
|
(e.g. custom track name), tell the transport you want a new destination with
|
||||||
|
`TransportParams.audio_out_destinations` or
|
||||||
|
`TransportParams.video_out_destinations` and the transport should take care of
|
||||||
|
the rest.
|
||||||
|
|
||||||
|
- Similar to the new `Frame.transport_destination`, there's a new
|
||||||
|
`Frame.transport_source` field which is set by the `BaseInputTransport` if the
|
||||||
|
incoming data comes from a non-default source (e.g. custom tracks).
|
||||||
|
|
||||||
|
- `TTSService` has a new `transport_destination` constructor parameter. This
|
||||||
|
parameter will be used to update the `Frame.transport_destination` field for
|
||||||
|
each generated `TTSAudioRawFrame`. This allows sending multiple bots' audio to
|
||||||
|
multiple destinations in the same pipeline.
|
||||||
|
|
||||||
|
- Added `DailyTransportParams.camera_out_enabled` and
|
||||||
|
`DailyTransportParams.microphone_out_enabled` which allows you to
|
||||||
|
enable/disable the main output camera or microphone tracks. This is useful if
|
||||||
|
you only want to use custom tracks and not send the main tracks. Note that you
|
||||||
|
still need `audio_out_enabled=True` or `video_out_enabled`.
|
||||||
|
|
||||||
|
- Added `DailyTransport.capture_participant_audio()` which allows you to capture
|
||||||
|
an audio source (e.g. "microphone", "screenAudio" or a custom track name) from
|
||||||
|
a remote participant.
|
||||||
|
|
||||||
|
- Added `DailyTransport.update_publishing()` which allows you to update the call
|
||||||
|
video and audio publishing settings (e.g. audio and video quality).
|
||||||
|
|
||||||
- Added `RTVIObserverParams` which allows you to configure what RTVI messages
|
- Added `RTVIObserverParams` which allows you to configure what RTVI messages
|
||||||
are sent to the clients.
|
are sent to the clients.
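The multiple-destination entries above (`Frame.transport_destination`, `TransportParams.audio_out_destinations`) can be pictured with the sketch below. It is only an illustration of the routing idea under stated assumptions, not how `BaseOutputTransport` is implemented: `AudioFrame` and `SketchOutputTransport` are hypothetical names, and dropping frames addressed to an unregistered destination is an assumed policy.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class AudioFrame:
    """Hypothetical frame; only `transport_destination` mirrors the changelog."""

    audio: bytes
    transport_destination: Optional[str] = None


class SketchOutputTransport:
    """Illustrative router keyed by destination name (e.g. a custom track)."""

    def __init__(self, audio_out_destinations: Iterable[str]) -> None:
        # Destinations the caller asked the transport to create up front.
        self._destinations = set(audio_out_destinations)
        self.sent: Dict[Optional[str], List[bytes]] = defaultdict(list)

    def send(self, frame: AudioFrame) -> bool:
        dest = frame.transport_destination
        # None stands for the default (main) output track.
        if dest is not None and dest not in self._destinations:
            return False  # assumed policy: ignore unregistered destinations
        self.sent[dest].append(frame.audio)
        return True
```

With this shape, a single transport instance keeps one outgoing queue per destination, which is what lets several audio tracks be sent at the same time.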
|
||||||
|
|
||||||
@@ -37,6 +140,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||||||
|
|
||||||
### Changed
|
### Changed
|
||||||
|
|
||||||
|
- `TransportParams.audio_mixer` now supports a string and also a dictionary to
|
||||||
|
provide a mixer per destination. For example:
|
||||||
|
|
||||||
|
```python
|
||||||
|
audio_out_mixer={
|
||||||
|
"track-1": SoundfileMixer(...),
|
||||||
|
"track-2": SoundfileMixer(...),
|
||||||
|
"track-N": SoundfileMixer(...),
|
||||||
|
},
|
||||||
|
```
|
||||||
|
|
||||||
- The `STTMuteFilter` now mutes `InterimTranscriptionFrame` and
|
- The `STTMuteFilter` now mutes `InterimTranscriptionFrame` and
|
||||||
`TranscriptionFrame` which allows the `STTMuteFilter` to be used in
|
`TranscriptionFrame` which allows the `STTMuteFilter` to be used in
|
||||||
conjunction with transports that generate transcripts, e.g. `DailyTransport`.
|
conjunction with transports that generate transcripts, e.g. `DailyTransport`.
|
||||||
@@ -70,6 +184,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||||||
case there's no need to push audio to the rest of the pipeline, but this is
|
case there's no need to push audio to the rest of the pipeline, but this is
|
||||||
not a very common case.
|
not a very common case.
|
||||||
|
|
||||||
|
- Added `RivaSegmentedSTTService`, which allows Riva offline/batch models, such
|
||||||
|
as "canary-1b-asr", to be used in Pipecat.
|
||||||
|
|
||||||
### Deprecated
|
### Deprecated
|
||||||
|
|
||||||
- Function calls with parameters
|
- Function calls with parameters
|
||||||
@@ -85,8 +202,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||||||
- `TransportParams.vad_audio_passthrough` parameter is now deprecated, use
|
- `TransportParams.vad_audio_passthrough` parameter is now deprecated, use
|
||||||
`TransportParams.audio_in_passthrough` instead.
|
`TransportParams.audio_in_passthrough` instead.
|
||||||
|
|
||||||
|
- `ParakeetSTTService` is now deprecated, use `RivaSTTService` instead, which uses
|
||||||
|
the model "parakeet-ctc-1.1b-asr" by default.
|
||||||
|
|
||||||
|
- `FastPitchTTSService` is now deprecated, use `RivaTTSService` instead, which uses
|
||||||
|
the model "magpie-tts-multilingual" by default.
|
||||||
|
|
||||||
### Fixed
|
### Fixed
|
||||||
|
|
||||||
|
- Fixed an issue with `SimliVideoService` where the bot was continuously outputting
|
||||||
|
audio, which prevents the `BotStoppedSpeakingFrame` from being emitted.
|
||||||
|
|
||||||
|
- Fixed an issue where `OpenAIRealtimeBetaLLMService` would add two assistant
|
||||||
|
messages to the context.
|
||||||
|
|
||||||
- Fixed an issue with `GeminiMultimodalLiveLLMService` where the context
|
- Fixed an issue with `GeminiMultimodalLiveLLMService` where the context
|
||||||
contained tokens instead of words.
|
contained tokens instead of words.
|
||||||
|
|
||||||
@@ -102,6 +231,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||||||
|
|
||||||
### Other
|
### Other
|
||||||
|
|
||||||
|
- Added `examples/daily-custom-tracks` to show how to send and receive Daily
|
||||||
|
custom tracks.
|
||||||
|
|
||||||
|
- Added `examples/daily-multi-translation` to showcase how to send multiple
|
||||||
|
simultaneous translations with the same transport.
|
||||||
|
|
||||||
- Added 04 foundational examples for client/server transports. Also, renamed
|
- Added 04 foundational examples for client/server transports. Also, renamed
|
||||||
`29-livekit-audio-chat.py` to `04b-transports-livekit.py`.
|
`29-livekit-audio-chat.py` to `04b-transports-livekit.py`.
|
||||||
|
|
||||||
|
|||||||
24
README.md
@@ -49,18 +49,18 @@ You can connect to Pipecat from any platform using our official SDKs:
|
|||||||
|
|
||||||
## 🧩 Available services
|
## 🧩 Available services
|
||||||
|
|
||||||
| Category | Services |
|
| Category | Services |
|
||||||
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||||
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Parakeet (NVIDIA)](https://docs.pipecat.ai/server/services/stt/parakeet), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
|
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Parakeet (NVIDIA)](https://docs.pipecat.ai/server/services/stt/parakeet), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
|
||||||
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
|
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
|
||||||
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [FastPitch (NVIDIA)](https://docs.pipecat.ai/server/services/tts/fastpitch), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
|
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [FastPitch (NVIDIA)](https://docs.pipecat.ai/server/services/tts/fastpitch), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
|
||||||
| Speech-to-Speech | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
|
| Speech-to-Speech | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
|
||||||
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
|
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
|
||||||
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
|
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
|
||||||
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
|
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
|
||||||
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
|
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
|
||||||
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) |
|
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) |
|
||||||
| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/server/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
|
| Analytics & Metrics | [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
|
||||||
|
|
||||||
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
|
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
|
||||||
|
|
||||||
|
|||||||
@@ -10,7 +10,6 @@ pipecat-ai[anthropic]
|
|||||||
pipecat-ai[assemblyai]
|
pipecat-ai[assemblyai]
|
||||||
pipecat-ai[aws]
|
pipecat-ai[aws]
|
||||||
pipecat-ai[azure]
|
pipecat-ai[azure]
|
||||||
pipecat-ai[canonical]
|
|
||||||
pipecat-ai[cartesia]
|
pipecat-ai[cartesia]
|
||||||
pipecat-ai[cerebras]
|
pipecat-ai[cerebras]
|
||||||
pipecat-ai[deepseek]
|
pipecat-ai[deepseek]
|
||||||
|
|||||||
161
examples/canonical-metrics/.gitignore
vendored
@@ -1,161 +0,0 @@
|
|||||||
# Byte-compiled / optimized / DLL files
|
|
||||||
__pycache__/
|
|
||||||
*.py[cod]
|
|
||||||
*$py.class
|
|
||||||
recordings/
|
|
||||||
# C extensions
|
|
||||||
*.so
|
|
||||||
|
|
||||||
# Distribution / packaging
|
|
||||||
.Python
|
|
||||||
build/
|
|
||||||
develop-eggs/
|
|
||||||
dist/
|
|
||||||
downloads/
|
|
||||||
eggs/
|
|
||||||
.eggs/
|
|
||||||
lib/
|
|
||||||
lib64/
|
|
||||||
parts/
|
|
||||||
sdist/
|
|
||||||
var/
|
|
||||||
wheels/
|
|
||||||
share/python-wheels/
|
|
||||||
*.egg-info/
|
|
||||||
.installed.cfg
|
|
||||||
*.egg
|
|
||||||
MANIFEST
|
|
||||||
|
|
||||||
# PyInstaller
|
|
||||||
# Usually these files are written by a python script from a template
|
|
||||||
# before PyInstaller builds the exe, so as to inject date/other infos into it.
|
|
||||||
*.manifest
|
|
||||||
*.spec
|
|
||||||
|
|
||||||
# Installer logs
|
|
||||||
pip-log.txt
|
|
||||||
pip-delete-this-directory.txt
|
|
||||||
|
|
||||||
# Unit test / coverage reports
|
|
||||||
htmlcov/
|
|
||||||
.tox/
|
|
||||||
.nox/
|
|
||||||
.coverage
|
|
||||||
.coverage.*
|
|
||||||
.cache
|
|
||||||
nosetests.xml
|
|
||||||
coverage.xml
|
|
||||||
*.cover
|
|
||||||
*.py,cover
|
|
||||||
.hypothesis/
|
|
||||||
.pytest_cache/
|
|
||||||
cover/
|
|
||||||
|
|
||||||
# Translations
|
|
||||||
*.mo
|
|
||||||
*.pot
|
|
||||||
|
|
||||||
# Django stuff:
|
|
||||||
*.log
|
|
||||||
local_settings.py
|
|
||||||
db.sqlite3
|
|
||||||
db.sqlite3-journal
|
|
||||||
|
|
||||||
# Flask stuff:
|
|
||||||
instance/
|
|
||||||
.webassets-cache
|
|
||||||
|
|
||||||
# Scrapy stuff:
|
|
||||||
.scrapy
|
|
||||||
|
|
||||||
# Sphinx documentation
|
|
||||||
docs/_build/
|
|
||||||
|
|
||||||
# PyBuilder
|
|
||||||
.pybuilder/
|
|
||||||
target/
|
|
||||||
|
|
||||||
# Jupyter Notebook
|
|
||||||
.ipynb_checkpoints
|
|
||||||
|
|
||||||
# IPython
|
|
||||||
profile_default/
|
|
||||||
ipython_config.py
|
|
||||||
|
|
||||||
# pyenv
|
|
||||||
# For a library or package, you might want to ignore these files since the code is
|
|
||||||
# intended to run in multiple environments; otherwise, check them in:
|
|
||||||
# .python-version
|
|
||||||
|
|
||||||
# pipenv
|
|
||||||
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
|
|
||||||
# However, in case of collaboration, if having platform-specific dependencies or dependencies
|
|
||||||
# having no cross-platform support, pipenv may install dependencies that don't work, or not
|
|
||||||
# install all needed dependencies.
|
|
||||||
#Pipfile.lock
|
|
||||||
|
|
||||||
# poetry
|
|
||||||
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
|
|
||||||
# This is especially recommended for binary packages to ensure reproducibility, and is more
|
|
||||||
# commonly ignored for libraries.
|
|
||||||
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
|
|
||||||
#poetry.lock
|
|
||||||
|
|
||||||
# pdm
|
|
||||||
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
|
|
||||||
#pdm.lock
|
|
||||||
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
|
|
||||||
# in version control.
|
|
||||||
# https://pdm.fming.dev/#use-with-ide
|
|
||||||
.pdm.toml
|
|
||||||
|
|
||||||
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
|
|
||||||
__pypackages__/
|
|
||||||
|
|
||||||
# Celery stuff
|
|
||||||
celerybeat-schedule
|
|
||||||
celerybeat.pid
|
|
||||||
|
|
||||||
# SageMath parsed files
|
|
||||||
*.sage.py
|
|
||||||
|
|
||||||
# Environments
|
|
||||||
.env
|
|
||||||
.venv
|
|
||||||
env/
|
|
||||||
venv/
|
|
||||||
ENV/
|
|
||||||
env.bak/
|
|
||||||
venv.bak/
|
|
||||||
|
|
||||||
# Spyder project settings
|
|
||||||
.spyderproject
|
|
||||||
.spyproject
|
|
||||||
|
|
||||||
# Rope project settings
|
|
||||||
.ropeproject
|
|
||||||
|
|
||||||
# mkdocs documentation
|
|
||||||
/site
|
|
||||||
|
|
||||||
# mypy
|
|
||||||
.mypy_cache/
|
|
||||||
.dmypy.json
|
|
||||||
dmypy.json
|
|
||||||
|
|
||||||
# Pyre type checker
|
|
||||||
.pyre/
|
|
||||||
|
|
||||||
# pytype static type analyzer
|
|
||||||
.pytype/
|
|
||||||
|
|
||||||
# Cython debug symbols
|
|
||||||
cython_debug/
|
|
||||||
|
|
||||||
# PyCharm
|
|
||||||
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
|
|
||||||
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
|
|
||||||
# and can be added to the global gitignore or merged into this file. For a more nuclear
|
|
||||||
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
|
|
||||||
#.idea/
|
|
||||||
runpod.toml
|
|
||||||
@@ -1,66 +0,0 @@
|
|||||||
# Chatbot with canonical-metrics
|
|
||||||
|
|
||||||
This project implements a chatbot using a pipeline architecture that integrates audio processing, transcription, and a language model for conversational interactions. The chatbot operates within a daily communication environment, utilizing various services for text-to-speech and language model responses.
|
|
||||||
|
|
||||||
## Features
|
|
||||||
|
|
||||||
- **Audio Input and Output**: Captures microphone input and plays back audio responses.
|
|
||||||
- **Voice Activity Detection**: Utilizes Silero VAD to manage audio input intelligently.
|
|
||||||
- **Text-to-Speech**: Integrates ElevenLabs TTS service to convert text responses into audio.
|
|
||||||
- **Language Model Interaction**: Uses OpenAI's GPT-4 model to generate responses based on user input.
|
|
||||||
- **Transcription Services**: Captures and transcribes participant speech for analytics.
|
|
||||||
- **Metrics Collection**: Sends audio data for analysis via Canonical Metrics Service.
|
|
||||||
|
|
||||||
## Requirements
|
|
||||||
|
|
||||||
- Python 3.10+
|
|
||||||
- `python-dotenv`
|
|
||||||
- Additional libraries from the `pipecat` package.
|
|
||||||
|
|
||||||
## Setup
|
|
||||||
|
|
||||||
1. Clone the repository.
|
|
||||||
2. Install the required packages.
|
|
||||||
3. Set up environment variables for API keys:
|
|
||||||
- `OPENAI_API_KEY`
|
|
||||||
- `ELEVENLABS_API_KEY`
|
|
||||||
- `CANONICAL_API_KEY`
|
|
||||||
- `CANONICAL_API_URL`
|
|
||||||
4. Run the script.
|
|
||||||
|
|
||||||
## Usage
|
|
||||||
|
|
||||||
The chatbot introduces itself and engages in conversations, providing brief and creative responses. Designed for flexibility, it can support multiple languages with appropriate configuration.
|
|
||||||
|
|
||||||
## Events
|
|
||||||
|
|
||||||
- Participants joining or leaving the call are handled dynamically, adjusting the chatbot's behavior accordingly.
|
|
||||||
|
|
||||||
|
|
||||||
ℹ️ The first run might take extra time to get started, since the VAD (Voice Activity Detection) model needs to be downloaded.
|
|
||||||
|
|
||||||
## Get started
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python3 -m venv venv
|
|
||||||
source venv/bin/activate
|
|
||||||
pip install -r requirements.txt
|
|
||||||
|
|
||||||
cp env.example .env # and add your credentials
|
|
||||||
|
|
||||||
```
|
|
||||||
|
|
||||||
## Run the server
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python server.py
|
|
||||||
```
|
|
||||||
|
|
||||||
Then, visit `http://localhost:7860/` in your browser to start a chatbot session.
|
|
||||||
|
|
||||||
## Build and test the Docker image
|
|
||||||
|
|
||||||
```
|
|
||||||
docker build -t chatbot .
|
|
||||||
docker run --env-file .env -p 7860:7860 chatbot
|
|
||||||
```
|
|
||||||
@@ -1,146 +0,0 @@
|
|||||||
#
|
|
||||||
# Copyright (c) 2024–2025, Daily
|
|
||||||
#
|
|
||||||
# SPDX-License-Identifier: BSD 2-Clause License
|
|
||||||
#
|
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import os
|
|
||||||
import sys
|
|
||||||
import uuid
|
|
||||||
|
|
||||||
import aiohttp
|
|
||||||
from dotenv import load_dotenv
|
|
||||||
from loguru import logger
|
|
||||||
from runner import configure
|
|
||||||
|
|
||||||
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
|
||||||
from pipecat.frames.frames import EndFrame
|
|
||||||
from pipecat.pipeline.pipeline import Pipeline
|
|
||||||
from pipecat.pipeline.runner import PipelineRunner
|
|
||||||
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
|
||||||
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
|
||||||
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
|
|
||||||
from pipecat.services.canonical.metrics import CanonicalMetricsService
|
|
||||||
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
|
|
||||||
from pipecat.services.openai.llm import OpenAILLMService
|
|
||||||
from pipecat.transports.services.daily import DailyParams, DailyTransport
|
|
||||||
|
|
||||||
load_dotenv(override=True)
|
|
||||||
|
|
||||||
logger.remove(0)
|
|
||||||
logger.add(sys.stderr, level="DEBUG")
|
|
||||||
|
|
||||||
|
|
||||||
async def main():
|
|
||||||
async with aiohttp.ClientSession() as session:
|
|
||||||
(room_url, token) = await configure(session)
|
|
||||||
|
|
||||||
transport = DailyTransport(
|
|
||||||
room_url,
|
|
||||||
token,
|
|
||||||
"Chatbot",
|
|
||||||
DailyParams(
|
|
||||||
audio_out_enabled=True,
|
|
||||||
audio_in_enabled=True,
|
|
||||||
video_out_enabled=False,
|
|
||||||
vad_analyzer=SileroVADAnalyzer(),
|
|
||||||
transcription_enabled=True,
|
|
||||||
#
|
|
||||||
# Spanish
|
|
||||||
#
|
|
||||||
# transcription_settings=DailyTranscriptionSettings(
|
|
||||||
# language="es",
|
|
||||||
# tier="nova",
|
|
||||||
# model="2-general"
|
|
||||||
# )
|
|
||||||
),
|
|
||||||
)
|
|
||||||
|
|
||||||
tts = ElevenLabsTTSService(
|
|
||||||
api_key=os.getenv("ELEVENLABS_API_KEY"),
|
|
||||||
#
|
|
||||||
# English
|
|
||||||
#
|
|
||||||
voice_id="cgSgspJ2msm6clMCkdW9",
|
|
||||||
#
|
|
||||||
# Spanish
|
|
||||||
#
|
|
||||||
# model="eleven_multilingual_v2",
|
|
||||||
# voice_id="gD1IexrzCvsXPHUuT0s3",
|
|
||||||
)
|
|
||||||
|
|
||||||
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
|
|
||||||
|
|
||||||
messages = [
|
|
||||||
{
|
|
||||||
"role": "system",
|
|
||||||
#
|
|
||||||
# English
|
|
||||||
#
|
|
||||||
"content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your responses to 12 words or fewer.",
|
|
||||||
#
|
|
||||||
# Spanish
|
|
||||||
#
|
|
||||||
# "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
|
|
||||||
},
|
|
||||||
]
|
|
||||||
|
|
||||||
context = OpenAILLMContext(messages)
|
|
||||||
context_aggregator = llm.create_context_aggregator(context)
|
|
||||||
|
|
||||||
"""
|
|
||||||
CanonicalMetrics uses AudioBufferProcessor under the hood to buffer the audio. On
|
|
||||||
call completion, CanonicalMetrics will send the audio buffer to Canonical for
|
|
||||||
analysis. Visit https://voice.canonical.chat to learn more.
|
|
||||||
"""
|
|
||||||
audio_buffer_processor = AudioBufferProcessor(num_channels=2)
|
|
||||||
canonical = CanonicalMetricsService(
|
|
||||||
audio_buffer_processor=audio_buffer_processor,
|
|
||||||
aiohttp_session=session,
|
|
||||||
api_key=os.getenv("CANONICAL_API_KEY"),
|
|
||||||
call_id=str(uuid.uuid4()),
|
|
||||||
assistant="pipecat-chatbot",
|
|
||||||
assistant_speaks_first=True,
|
|
||||||
context=context,
|
|
||||||
)
|
|
||||||
pipeline = Pipeline(
|
|
||||||
[
|
|
||||||
transport.input(), # microphone
|
|
||||||
context_aggregator.user(),
|
|
||||||
llm,
|
|
||||||
tts,
|
|
||||||
transport.output(),
|
|
||||||
canonical, # uploads audio buffer to Canonical AI for metrics
|
|
||||||
audio_buffer_processor, # captures audio into a buffer
|
|
||||||
context_aggregator.assistant(),
|
|
||||||
]
|
|
||||||
)
|
|
||||||
|
|
||||||
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
|
|
||||||
|
|
||||||
@transport.event_handler("on_first_participant_joined")
|
|
||||||
async def on_first_participant_joined(transport, participant):
|
|
||||||
await audio_buffer_processor.start_recording()
|
|
||||||
await transport.capture_participant_transcription(participant["id"])
|
|
||||||
await task.queue_frames([context_aggregator.user().get_context_frame()])
|
|
||||||
|
|
||||||
@transport.event_handler("on_participant_left")
|
|
||||||
async def on_participant_left(transport, participant, reason):
|
|
||||||
print(f"Participant left: {participant}")
|
|
||||||
await task.cancel()
|
|
||||||
|
|
||||||
@transport.event_handler("on_call_state_updated")
|
|
||||||
async def on_call_state_updated(transport, state):
|
|
||||||
if state == "left":
|
|
||||||
# Here we don't want to cancel, we just want to finish sending
|
|
||||||
# whatever is queued, so we use an EndFrame().
|
|
||||||
await task.queue_frame(EndFrame())
|
|
||||||
|
|
||||||
runner = PipelineRunner()
|
|
||||||
|
|
||||||
await runner.run(task)
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
asyncio.run(main())
|
|
||||||
@@ -1,5 +0,0 @@
|
|||||||
python-dotenv
|
|
||||||
fastapi[all]
|
|
||||||
uvicorn
|
|
||||||
pipecat-ai[daily,openai,silero,elevenlabs,canonical]
|
|
||||||
|
|
||||||
@@ -53,4 +53,3 @@ async def configure(aiohttp_session: aiohttp.ClientSession):
|
|||||||
token = await daily_rest_helper.get_token(url, expiry_time)
|
token = await daily_rest_helper.get_token(url, expiry_time)
|
||||||
|
|
||||||
return (url, token)
|
return (url, token)
|
||||||
return (url, token)
|
|
||||||
|
|||||||
39
examples/daily-custom-tracks/README.md
Normal file
@@ -0,0 +1,39 @@
# Daily Custom Tracks

This example shows how to send and receive Daily custom tracks. We will run a simple `daily-python` application to send an audio file as a custom track (named "pipecat") to a room. Then, the Pipecat bot will mirror that custom track into another custom track (named "pipecat-mirror") in the same room.

## Get started

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

## Run the bot

Start the bot by giving it a Daily room URL.

```bash
python bot.py -u ROOM_URL
```

The bot will wait for the first participant to join. Then, it will mirror a custom track named "pipecat" into a new custom track named "pipecat-mirror".
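
The mirroring step can be pictured as re-addressing each audio frame that arrives from a named source track to a new destination track. The following is a minimal sketch of that idea only: `AudioFrame` and `mirror` are hypothetical stand-ins written for illustration and are not Pipecat classes (the actual bot does this inside a `FrameProcessor` in `bot.py`).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AudioFrame:
    """Hypothetical stand-in for an audio frame; not a Pipecat class."""

    audio: bytes
    sample_rate: int
    num_channels: int
    source: Optional[str] = None  # custom track the audio came from
    destination: Optional[str] = None  # custom track the audio should go to


def mirror(frame: AudioFrame, destination: str) -> AudioFrame:
    """Return a copy of a sourced frame addressed to `destination`.

    Frames without a source are passed through unchanged.
    """
    if frame.source is None:
        return frame
    return replace(frame, source=None, destination=destination)


incoming = AudioFrame(audio=b"\x00\x01" * 4, sample_rate=16000, num_channels=1, source="pipecat")
outgoing = mirror(incoming, "pipecat-mirror")
print(outgoing.destination)
```

The audio payload, sample rate, and channel count are copied as-is; only the addressing changes.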

## Run the sender

Now, run the custom track sender. This is a simple `daily-python` application that opens an audio file and sends it as a custom track to the same Daily room.

```bash
python custom_track_sender.py -u ROOM_URL -i office-ambience-mono-16000.mp3
```
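
The sender writes the decoded audio to the custom track about one second at a time. Below is a minimal sketch of that chunking, assuming 16-bit (2-byte) PCM samples; `one_second_chunks` is a hypothetical helper name used here only for illustration and does not appear in `custom_track_sender.py`.

```python
from typing import Iterator


def one_second_chunks(raw: bytes, sample_rate: int, channels: int) -> Iterator[bytes]:
    """Yield successive one-second slices of 16-bit PCM audio."""
    chunk_size = sample_rate * channels * 2  # bytes per second for 16-bit samples
    for start in range(0, len(raw), chunk_size):
        yield raw[start : start + chunk_size]


# 2.5 seconds of silent mono audio at 16 kHz.
pcm = bytes(int(16000 * 2 * 2.5))
chunks = list(one_second_chunks(pcm, sample_rate=16000, channels=1))
print([len(c) for c in chunks])
```

The last chunk is shorter whenever the audio length is not a whole number of seconds.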

## Open client

Finally, open the client so you can hear both custom tracks.

```bash
open index.html
```

Once the client is open, paste the URL of the Daily room into the Meeting URL field and click Join. You should be able to select which custom track you want to hear.
87
examples/daily-custom-tracks/bot.py
Normal file
@@ -0,0 +1,87 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#

import asyncio
import sys

import aiohttp
from loguru import logger
from runner import configure

from pipecat.frames.frames import Frame, InputAudioRawFrame, OutputAudioRawFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.transports.services.daily import DailyParams, DailyTransport

logger.remove(0)
logger.add(sys.stderr, level="DEBUG")


class CustomTrackMirrorProcessor(FrameProcessor):
    def __init__(self, transport_destination: str, **kwargs):
        super().__init__(**kwargs)
        self._transport_destination = transport_destination

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

        if isinstance(frame, InputAudioRawFrame) and frame.transport_source:
            output_frame = OutputAudioRawFrame(
                audio=frame.audio,
                sample_rate=frame.sample_rate,
                num_channels=frame.num_channels,
            )
            output_frame.transport_destination = self._transport_destination
            await self.push_frame(output_frame)
        else:
            await self.push_frame(frame, direction)


async def main():
    async with aiohttp.ClientSession() as session:
        (room_url, _) = await configure(session)

        transport = DailyTransport(
            room_url,
            None,
            "Custom tracks mirror",
            DailyParams(
                audio_in_enabled=True,
                audio_out_enabled=True,
                microphone_out_enabled=False,  # Disable since we just use custom tracks
                audio_out_destinations=["pipecat-mirror"],
            ),
        )

        pipeline = Pipeline(
            [
                transport.input(),  # Transport user input
                CustomTrackMirrorProcessor("pipecat-mirror"),
                transport.output(),  # Transport bot output
            ]
        )

        task = PipelineTask(
            pipeline,
            params=PipelineParams(
                audio_in_sample_rate=16000,
                audio_out_sample_rate=16000,
            ),
        )

        @transport.event_handler("on_first_participant_joined")
        async def on_first_participant_joined(transport, participant):
            await transport.capture_participant_audio(participant["id"], audio_source="pipecat")

        runner = PipelineRunner()

        await runner.run(task)


if __name__ == "__main__":
    asyncio.run(main())
74
examples/daily-custom-tracks/custom_track_sender.py
Normal file
@@ -0,0 +1,74 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#

import argparse
import time

from daily import CallClient, CustomAudioSource, Daily
from pydub import AudioSegment

parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument("-u", "--url", type=str, required=True, help="URL of the Daily room to join")
parser.add_argument(
    "-i", "--input", type=str, required=True, help="Input audio file (needs 16000 sample rate)"
)

args, _ = parser.parse_known_args()

audio = AudioSegment.from_mp3(args.input)

raw_bytes = audio.raw_data
sample_rate = audio.frame_rate
channels = audio.channels

print(f"Length: {len(raw_bytes)} bytes")
print(f"Sample rate: {sample_rate}, Channels: {channels}")

# Initialize the Daily context & create call client
Daily.init()

client = CallClient()

# Join the room and indicate we have a custom track named "pipecat".
client.join(
    args.url,
    client_settings={
        "publishing": {
            "camera": False,
            "microphone": False,
            "customAudio": {"pipecat": True},
        },
    },
)

# Just sleep for a couple of seconds. To do this well we should really use
# completions.
time.sleep(2)

# Create the custom audio source. This is where we will write our audio.
audio_source = CustomAudioSource(sample_rate, channels)

# Create an audio track and assign it our audio source.
client.add_custom_audio_track("pipecat", audio_source)

# Just sleep for a second. To do this well we should really use completions.
time.sleep(1)

try:
    # Just write one second of audio until we have read all the file.
    chunk_size = sample_rate * channels * 2
    while len(raw_bytes) > 0:
        chunk = raw_bytes[:chunk_size]
        raw_bytes = raw_bytes[chunk_size:]
        audio_source.write_frames(chunk)

except KeyboardInterrupt:
    client.leave()

# Just sleep for a second. To do this well we should really use completions.
time.sleep(1)

client.release()
173
examples/daily-custom-tracks/index.html
Normal file
@@ -0,0 +1,173 @@
|
|||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>daily custom tracks</title>
|
||||||
|
</head>
|
||||||
|
<script crossorigin src="https://unpkg.com/@daily-co/daily-js"></script>
|
||||||
|
<script src="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.js"></script>
|
||||||
|
<link
|
||||||
|
rel="stylesheet"
|
||||||
|
type="text/css"
|
||||||
|
href="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.css"
|
||||||
|
/>
|
||||||
|
<script>
|
||||||
|
function enableButton(buttonId, enable) {
|
||||||
|
const button = document.getElementById(buttonId);
|
||||||
|
button.disabled = !enable;
|
||||||
|
}
|
||||||
|
|
||||||
|
function enableJoinButton(enable) {
|
||||||
|
enableButton("join-button", enable);
|
||||||
|
}
|
||||||
|
|
||||||
|
function enableLeaveButton(enable) {
|
||||||
|
enableButton("leave-button", enable);
|
||||||
|
}
|
||||||
|
|
||||||
|
function destroyPlayers(query) {
|
||||||
|
const items = document.querySelectorAll(query);
|
||||||
|
if (items) {
|
||||||
|
for (const item of items) {
|
||||||
|
item.remove();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function destroyParticipantPlayers(participantId) {
|
||||||
|
destroyPlayers(`audio[data-participant-id="${participantId}"]`);
|
||||||
|
destroyPlayers(`button[data-participant-id="${participantId}"]`);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function startPlayer(player, track) {
|
||||||
|
player.muted = false;
|
||||||
|
player.autoplay = true;
|
||||||
|
if (track != null) {
|
||||||
|
player.srcObject = new MediaStream([track]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async function buildAudioPlayer(track, participantId) {
|
||||||
|
const audioContainer = document.getElementById("audio-container");
|
||||||
|
const player = document.createElement("audio");
|
||||||
|
player.dataset.participantId = participantId;
|
||||||
|
|
||||||
|
// Create a new button for controlling audio
|
||||||
|
const audioControlButton = document.createElement("button");
|
||||||
|
audioControlButton.className = "ui primary green button"
|
||||||
|
audioControlButton.innerText = track._mediaTag == "cam-audio" ? "english" : track._mediaTag;
|
||||||
|
audioControlButton.dataset.participantId = participantId;
|
||||||
|
audioControlButton.onclick = () => {
|
||||||
|
if (player.paused) {
|
||||||
|
|
||||||
|
player.play();
|
||||||
|
audioControlButton.className = "ui primary red button"
|
||||||
|
} else {
|
||||||
|
player.pause();
|
||||||
|
audioControlButton.className = "ui primary green button"
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
audioContainer.appendChild(player);
|
||||||
|
audioContainer.appendChild(audioControlButton);
|
||||||
|
|
||||||
|
await startPlayer(player, track);
|
||||||
|
player.pause()
|
||||||
|
|
||||||
|
return player;
|
||||||
|
}
|
||||||
|
|
||||||
|
function subscribeToTracks(participantId) {
|
||||||
|
console.log(`subscribing to track`);
|
||||||
|
|
||||||
|
if (participantId === "local") {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
callObject.updateParticipant(participantId, {
|
||||||
|
setSubscribedTracks: {
|
||||||
|
audio: true,
|
||||||
|
video: false,
|
||||||
|
custom: true,
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
function startDaily() {
|
||||||
|
enableJoinButton(true);
|
||||||
|
enableLeaveButton(false);
|
||||||
|
|
||||||
|
window.callObject = window.DailyIframe.createCallObject({});
|
||||||
|
|
||||||
|
callObject.on("participant-joined", (e) => {
|
||||||
|
if (!e.participant.local) {
|
||||||
|
console.log("participant-joined", e.participant);
|
||||||
|
subscribeToTracks(e.participant.session_id);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|
||||||
|
callObject.on("participant-left", (e) => {
|
||||||
|
console.log("participant-left", e.participant.session_id);
|
||||||
|
destroyParticipantPlayers(e.participant.session_id);
|
||||||
|
});
|
||||||
|
|
||||||
|
callObject.on("track-started", async (e) => {
|
||||||
|
console.log("track-started", e.track);
|
||||||
|
if (e.track.kind === "audio") {
|
||||||
|
await buildAudioPlayer(e.track, e.participant.session_id);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
async function joinRoom() {
|
||||||
|
enableJoinButton(false);
|
||||||
|
enableLeaveButton(true);
|
||||||
|
|
||||||
|
const meetingUrl = document.getElementById("meeting-url").value;
|
||||||
|
|
||||||
|
callObject.join({
|
||||||
|
url: meetingUrl,
|
||||||
|
startVideoOff: true,
|
||||||
|
startAudioOff: true,
|
||||||
|
subscribeToTracksAutomatically: false,
|
||||||
|
receiveSettings: {
|
||||||
|
base: { video: { layer: 0 } },
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
async function leaveRoom() {
|
||||||
|
enableJoinButton(true);
|
||||||
|
enableLeaveButton(false);
|
||||||
|
|
||||||
|
callObject.leave();
|
||||||
|
|
||||||
|
const audioContainer = document.getElementById("audio-container");
|
||||||
|
audioContainer.replaceChildren();
|
||||||
|
}
|
||||||
|
</script>
|
||||||
|
|
||||||
|
<body onload="startDaily()">
|
||||||
|
<div class="ui centered page grid" style="margin-top: 30px">
|
||||||
|
<div class="ten wide column">
|
||||||
|
<div class="ui form" style="margin-top: 30px">
|
||||||
|
<div class="field">
|
||||||
|
<label>Meeting URL</label>
|
||||||
|
<input id="meeting-url" value="" />
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="ui centered aligned header" style="margin-top: 30px">
|
||||||
|
<button id="join-button" class="ui primary button" onclick="joinRoom()">
|
||||||
|
Join
|
||||||
|
</button>
|
||||||
|
<button id="leave-button" class="ui button" onclick="leaveRoom()">
|
||||||
|
Leave
|
||||||
|
</button>
|
||||||
|
</div>
|
||||||
|
<div id="tile" class="ui container" style="margin-top: 30px">
|
||||||
|
<div id="tile" class="ui center aligned grid">
|
||||||
|
<div id="audio-container"></div><br/>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
BIN
examples/daily-custom-tracks/office-ambience-mono-16000.mp3
Normal file
2
examples/daily-custom-tracks/requirements.txt
Normal file
@@ -0,0 +1,2 @@
|
|||||||
|
pydub
|
||||||
|
pipecat-ai[daily]
|
||||||
@@ -1,7 +1,12 @@
|
|||||||
FROM python:3.10-bullseye
|
FROM python:3.10-bullseye
|
||||||
|
|
||||||
RUN mkdir /app
|
RUN mkdir /app
|
||||||
|
RUN mkdir /app/assets
|
||||||
|
RUN mkdir /app/utils
|
||||||
COPY *.py /app/
|
COPY *.py /app/
|
||||||
COPY requirements.txt /app/
|
COPY requirements.txt /app/
|
||||||
|
|
||||||
|
|
||||||
WORKDIR /app
|
WORKDIR /app
|
||||||
RUN pip3 install -r requirements.txt
|
RUN pip3 install -r requirements.txt
|
||||||
|
|
||||||
39
examples/daily-multi-translation/README.md
Normal file
@@ -0,0 +1,39 @@
# Daily Multi Translation

This example shows how to use Daily to stream multiple simultaneous translations using a single transport. Daily provides custom tracks, and in this example we simultaneously translate incoming English audio into Spanish, French and German, sending each translation to its own custom track.
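
The fan-out described above can be sketched as one input sentence handed to several translation branches that run concurrently, each result keyed by the custom track it is meant for. This is an illustration only: `fake_translate` and `translate_to_tracks` are hypothetical names, and the placeholder just tags the text instead of calling an LLM or a TTS service.

```python
import asyncio
from typing import Dict, List


async def fake_translate(text: str, language: str) -> str:
    """Placeholder for an LLM translation call; it only tags the text."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{language}] {text}"


async def translate_to_tracks(text: str, languages: List[str]) -> Dict[str, str]:
    """Translate `text` into every language concurrently, keyed by track name."""
    results = await asyncio.gather(*(fake_translate(text, lang) for lang in languages))
    return dict(zip(languages, results))


tracks = asyncio.run(translate_to_tracks("Good morning", ["spanish", "french", "german"]))
print(tracks["spanish"])
```

In the bot itself, the same shape is expressed with a `ParallelPipeline` whose branches each end in a TTS service bound to one destination track.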

## Get started

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

cp env.example .env # and add your credentials
```
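
Before starting the server, it can help to confirm that the credentials are actually set. The sketch below is a hypothetical helper, not part of this example: the key names come from this example's `env.example` and `bot.py`, but treating all four as required is an assumption. It inspects the process environment, so values from `.env` must be loaded first (the bot does this with `load_dotenv`).

```python
import os
from typing import List, Mapping

# Key names taken from this example's env.example and bot.py.
# Which of them are strictly required is an assumption.
REQUIRED_KEYS = ["DAILY_API_KEY", "OPENAI_API_KEY", "DEEPGRAM_API_KEY", "CARTESIA_API_KEY"]


def missing_keys(env: Mapping[str, str], required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are unset or empty in `env`."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All credentials are set.")
```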

## Run the server

```bash
python server.py
```

Then, visit `http://localhost:7860/` in your browser. This will open a Daily Prebuilt room where you will speak in English (make sure you are not muted).

## Open client

Next, open the client that will listen to the translations.

```bash
open index.html
```

Once the client is open, paste the URL of the Daily room created above and join it. You should be able to select which translation you want to hear.

## Build and test the Docker image

```
docker build -t daily-multi-translation .
docker run --env-file .env -p 7860:7860 daily-multi-translation
```
165
examples/daily-multi-translation/bot.py
Normal file
@@ -0,0 +1,165 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#

import asyncio
import os
import sys

import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure

from pipecat.audio.mixers.soundfile_mixer import SoundfileMixer
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.observers.loggers.transcription_log_observer import TranscriptionLogObserver
from pipecat.pipeline.parallel_pipeline import ParallelPipeline
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport

load_dotenv(override=True)

logger.remove(0)
logger.add(sys.stderr, level="DEBUG")

BACKGROUND_SOUND_FILE = "office-ambience-mono-16000.mp3"


async def main():
    async with aiohttp.ClientSession() as session:
        (room_url, token) = await configure(session)

        transport = DailyTransport(
            room_url,
            token,
            "Multi translation bot",
            DailyParams(
                audio_in_enabled=True,
                audio_out_enabled=True,
                audio_out_mixer={
                    "spanish": SoundfileMixer(
                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
                    ),
                    "french": SoundfileMixer(
                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
                    ),
                    "german": SoundfileMixer(
                        sound_files={"office": BACKGROUND_SOUND_FILE}, default_sound="office"
                    ),
                },
                audio_out_destinations=["spanish", "french", "german"],
                microphone_out_enabled=False,  # Disable since we just use custom tracks
                vad_analyzer=SileroVADAnalyzer(),
            ),
        )

        stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

        tts_spanish = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
            voice_id="cefcb124-080b-4655-b31f-932f3ee743de",
            transport_destination="spanish",
        )
        tts_french = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
            voice_id="8832a0b5-47b2-4751-bb22-6a8e2149303d",
            transport_destination="french",
        )
        tts_german = CartesiaTTSService(
            api_key=os.getenv("CARTESIA_API_KEY"),
            voice_id="38aabb6a-f52b-4fb0-a3d1-988518f4dc06",
            transport_destination="german",
        )

        messages_spanish = [
            {
                "role": "system",
                "content": "You will be provided with a sentence in English, and your task is to only translate it into Spanish.",
            },
        ]
        messages_french = [
            {
                "role": "system",
                "content": "You will be provided with a sentence in English, and your task is to only translate it into French.",
            },
        ]
        messages_german = [
            {
                "role": "system",
                "content": "You will be provided with a sentence in English, and your task is to only translate it into German.",
            },
        ]

        llm_spanish = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
        llm_french = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
        llm_german = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

        context_spanish = OpenAILLMContext(messages_spanish)
        context_aggregator_spanish = llm_spanish.create_context_aggregator(context_spanish)

        context_french = OpenAILLMContext(messages_french)
        context_aggregator_french = llm_french.create_context_aggregator(context_french)

        context_german = OpenAILLMContext(messages_german)
        context_aggregator_german = llm_german.create_context_aggregator(context_german)

        pipeline = Pipeline(
            [
                transport.input(),  # Transport user input
                stt,
                ParallelPipeline(
                    # Spanish pipeline.
                    [
                        context_aggregator_spanish.user(),
                        llm_spanish,
                        tts_spanish,
                        context_aggregator_spanish.assistant(),
                    ],
                    # French pipeline.
                    [
                        context_aggregator_french.user(),
                        llm_french,
                        tts_french,
                        context_aggregator_french.assistant(),
                    ],
                    # German pipeline.
                    [
                        context_aggregator_german.user(),
                        llm_german,
                        tts_german,
                        context_aggregator_german.assistant(),
                    ],
                ),
                transport.output(),  # Transport bot output
            ]
        )

        task = PipelineTask(
            pipeline,
            params=PipelineParams(
                audio_in_sample_rate=16000,
                audio_out_sample_rate=16000,
                allow_interruptions=True,
                enable_metrics=True,
                enable_usage_metrics=True,
                report_only_initial_ttfb=True,
            ),
            observers=[TranscriptionLogObserver()],
        )

        runner = PipelineRunner()

        await runner.run(task)


if __name__ == "__main__":
    asyncio.run(main())
@@ -1,6 +1,5 @@
|
|||||||
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
|
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
|
||||||
DAILY_API_KEY=7df...
|
DAILY_API_KEY=7df...
|
||||||
OPENAI_API_KEY=sk-PL...
|
OPENAI_API_KEY=sk-PL...
|
||||||
ELEVENLABS_API_KEY=aeb...
|
DEEPGRAM_API_KEY=efb...
|
||||||
CANONICAL_API_KEY=can...
|
CARTESIA_API_KEY=aeb...
|
||||||
CANONICAL_API_URL=
|
|
||||||
202
examples/daily-multi-translation/index.html
Normal file
@@ -0,0 +1,202 @@
+<html>
+  <head>
+    <title>daily multi translation</title>
+  </head>
+  <script crossorigin src="https://unpkg.com/@daily-co/daily-js"></script>
+  <script
+    src="https://code.jquery.com/jquery-3.1.1.min.js"
+    integrity="sha256-hVVnYaiADRTO2PzUGmuLJr8BLUSjGIZsDYGmIJLv2b8="
+    crossorigin="anonymous"
+  ></script>
+  <script src="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.js"></script>
+  <link
+    rel="stylesheet"
+    type="text/css"
+    href="https://cdnjs.cloudflare.com/ajax/libs/fomantic-ui/2.8.6/semantic.min.css"
+  />
+  <script>
+    function enableButton(buttonId, enable) {
+      const button = document.getElementById(buttonId);
+      button.disabled = !enable;
+    }
+
+    function enableJoinButton(enable) {
+      enableButton("join-button", enable);
+    }
+
+    function enableLeaveButton(enable) {
+      enableButton("leave-button", enable);
+    }
+
+    function destroyPlayers(query) {
+      const items = document.querySelectorAll(query);
+      if (items) {
+        for (const item of items) {
+          item.remove();
+        }
+      }
+    }
+
+    function destroyParticipantPlayers(participantId) {
+      destroyPlayers(`video[data-participant-id="${participantId}"]`);
+      destroyPlayers(`audio[data-participant-id="${participantId}"]`);
+      destroyPlayers(`button[data-participant-id="${participantId}"]`);
+    }
+
+    async function startPlayer(player, track) {
+      player.muted = false;
+      player.autoplay = true;
+      if (track != null) {
+        player.srcObject = new MediaStream([track]);
+      }
+    }
+
+    async function buildVideoPlayer(track, participantId) {
+      const videoContainer = document.getElementById("video-container");
+      const player = document.createElement("video");
+      player.dataset.participantId = participantId;
+
+      videoContainer.appendChild(player);
+
+      await startPlayer(player, track);
+      await player.play();
+
+      return player;
+    }
+
+    async function buildAudioPlayer(track, participantId) {
+      const audioContainer = document.getElementById("audio-container");
+      const player = document.createElement("audio");
+      player.dataset.participantId = participantId;
+
+      // Create a new button for controlling audio
+      const audioControlButton = document.createElement("button");
+      audioControlButton.className = "ui primary green button"
+      audioControlButton.innerText = track._mediaTag == "cam-audio" ? "english" : track._mediaTag;
+      audioControlButton.dataset.participantId = participantId;
+      audioControlButton.onclick = () => {
+        if (player.paused) {
+
+          player.play();
+          audioControlButton.className = "ui primary red button"
+        } else {
+          player.pause();
+          audioControlButton.className = "ui primary green button"
+        }
+      };
+
+      audioContainer.appendChild(player);
+      audioContainer.appendChild(audioControlButton);
+
+      await startPlayer(player, track);
+      player.pause()
+
+      return player;
+    }
+
+    function subscribeToTracks(participantId) {
+      console.log(`subscribing to track`);
+
+      if (participantId === "local") {
+        return;
+      }
+
+      callObject.updateParticipant(participantId, {
+        setSubscribedTracks: {
+          audio: true,
+          video: true,
+          custom: true,
+        },
+      });
+    }
+
+    function startDaily() {
+      enableJoinButton(true);
+      enableLeaveButton(false);
+
+      window.callObject = window.DailyIframe.createCallObject({});
+
+      callObject.on("participant-joined", (e) => {
+        if (!e.participant.local) {
+          console.log("participant-joined", e.participant);
+          subscribeToTracks(e.participant.session_id);
+        }
+      });
+
+      callObject.on("participant-left", (e) => {
+        console.log("participant-left", e.participant.session_id);
+        destroyParticipantPlayers(e.participant.session_id);
+      });
+
+      callObject.on("track-started", async (e) => {
+        console.log("track-started", e.track);
+        if (e.track.kind === "video") {
+          await buildVideoPlayer(e.track, e.participant.session_id);
+        } else if (e.track.kind === "audio") {
+          await buildAudioPlayer(e.track, e.participant.session_id);
+        }
+      });
+    }
+
+    async function joinRoom() {
+      enableJoinButton(false);
+      enableLeaveButton(true);
+
+      const meetingUrl = document.getElementById("meeting-url").value;
+
+      callObject.join({
+        url: meetingUrl,
+        startVideoOff: true,
+        startAudioOff: true,
+        subscribeToTracksAutomatically: false,
+        receiveSettings: {
+          base: { video: { layer: 0 } },
+        },
+      });
+    }
+
+    async function leaveRoom() {
+      enableJoinButton(true);
+      enableLeaveButton(false);
+
+      callObject.leave();
+
+      const videoContainer = document.getElementById("video-container");
+      videoContainer.replaceChildren();
+
+      const audioContainer = document.getElementById("audio-container");
+      audioContainer.replaceChildren();
+    }
+  </script>
+
+  <body onload="startDaily()">
+    <div class="ui centered page grid" style="margin-top: 30px">
+      <div class="ten wide column">
+        <div class="ui form" style="margin-top: 30px">
+          <div class="field">
+            <label>Meeting URL</label>
+            <input id="meeting-url" value="" />
+          </div>
+        </div>
+      </div>
+    </div>
+    <div class="ui centered aligned header" style="margin-top: 30px">
+      <button id="join-button" class="ui primary button" onclick="joinRoom()">
+        Join
+      </button>
+      <button id="leave-button" class="ui button" onclick="leaveRoom()">
+        Leave
+      </button>
+    </div>
+    <div id="tile" class="ui container" style="margin-top: 30px">
+      <div id="tile" class="ui center aligned grid">
+        <div id="audio-container"></div><br/>
+      </div>
+    </div>
+    <div id="tile" class="ui container" style="margin-top: 30px">
+      <div id="tile" class="ui center aligned grid">
+        <div id="video-container" class="ui segment"></div>
+      </div>
+    </div>
+  </body>
+</html>
BIN
examples/daily-multi-translation/office-ambience-mono-16000.mp3
Normal file
5
examples/daily-multi-translation/requirements.txt
Normal file
@@ -0,0 +1,5 @@
+aiofiles
+python-dotenv
+fastapi[all]
+uvicorn
+pipecat-ai[daily,deepgram,openai,silero,cartesia]
55
examples/daily-multi-translation/runner.py
Normal file
@@ -0,0 +1,55 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import os
+
+import aiohttp
+
+from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
+
+
+async def configure(aiohttp_session: aiohttp.ClientSession):
+    parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
+    parser.add_argument(
+        "-u", "--url", type=str, required=False, help="URL of the Daily room to join"
+    )
+    parser.add_argument(
+        "-k",
+        "--apikey",
+        type=str,
+        required=False,
+        help="Daily API Key (needed to create an owner token for the room)",
+    )
+
+    args, unknown = parser.parse_known_args()
+
+    url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
+    key = args.apikey or os.getenv("DAILY_API_KEY")
+
+    if not url:
+        raise Exception(
+            "No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
+        )
+
+    if not key:
+        raise Exception(
+            "No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
+        )
+
+    daily_rest_helper = DailyRESTHelper(
+        daily_api_key=key,
+        daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
+        aiohttp_session=aiohttp_session,
+    )
+
+    # Create a meeting token for the given room with an expiration 1 hour in
+    # the future.
+    expiry_time: float = 60 * 60
+
+    token = await daily_rest_helper.get_token(url, expiry_time)
+
+    return (url, token)
@@ -4,6 +4,7 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+import asyncio
 import os

 import aiohttp
@@ -21,44 +22,23 @@ from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.services.daily import DailyParams, DailyTransport

-# Check if we're in local development mode
-LOCAL_RUN = os.getenv("LOCAL_RUN")
-if LOCAL_RUN:
-    import asyncio
-    import webbrowser
-
-    try:
-        from local_runner import configure
-    except ImportError:
-        logger.error("Could not import local_runner module. Local development mode may not work.")

 # Load environment variables
 load_dotenv(override=True)

+# Check if we're in local development mode
+LOCAL_RUN = os.getenv("LOCAL_RUN")

-async def main(room_url: str, token: str):
+async def main(transport: DailyTransport):
     """Main pipeline setup and execution function.

     Args:
-        room_url: The Daily room URL
-        token: The Daily room token
+        transport: The DailyTransport object for the bot
     """
-    logger.debug("Starting bot in room: {}", room_url)
+    logger.debug("Starting bot")

-    transport = DailyTransport(
-        room_url,
-        token,
-        "bot",
-        DailyParams(
-            audio_in_enabled=True,
-            audio_out_enabled=True,
-            transcription_enabled=True,
-            vad_analyzer=SileroVADAnalyzer(),
-        ),
-    )
-
     tts = CartesiaTTSService(
-        api_key=os.getenv("CARTESIA_API_KEY"), voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22"
+        api_key=os.getenv("CARTESIA_API_KEY"), voice_id="71a7ad14-091c-4e8e-a314-022ece01c121"
     )

     llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
@@ -126,10 +106,25 @@ async def bot(args: DailySessionArguments):
         body: The configuration object from the request body
         session_id: The session ID for logging
     """
+    from pipecat.audio.filters.krisp_filter import KrispFilter
+
     logger.info(f"Bot process initialized {args.room_url} {args.token}")

+    transport = DailyTransport(
+        args.room_url,
+        args.token,
+        "Pipecat Bot",
+        DailyParams(
+            audio_in_enabled=True,
+            audio_in_filter=None if LOCAL_RUN else KrispFilter(),
+            audio_out_enabled=True,
+            transcription_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(),
+        ),
+    )
+
     try:
-        await main(args.room_url, args.token)
+        await main(transport)
         logger.info("Bot process completed")
     except Exception as e:
         logger.exception(f"Error in bot process: {str(e)}")
@@ -137,18 +132,27 @@ async def bot(args: DailySessionArguments):


 # Local development functions
-async def local_main():
+async def local_daily():
     """Function for local development testing."""
+    from local_runner import configure
+
     try:
         async with aiohttp.ClientSession() as session:
             (room_url, token) = await configure(session)
-            logger.warning("_")
-            logger.warning("_")
-            logger.warning(f"Talk to your voice agent here: {room_url}")
-            logger.warning("_")
-            logger.warning("_")
-            webbrowser.open(room_url)
-            await main(room_url, token)
+            transport = DailyTransport(
+                room_url,
+                token,
+                "Pipecat Bot",
+                DailyParams(
+                    audio_in_enabled=True,
+                    audio_out_enabled=True,
+                    transcription_enabled=True,
+                    vad_analyzer=SileroVADAnalyzer(),
+                ),
+            )
+
+            await main(transport)
+
     except Exception as e:
         logger.exception(f"Error in local development mode: {e}")

@@ -156,6 +160,6 @@ async def local_main():
 # Local development entry point
 if LOCAL_RUN and __name__ == "__main__":
     try:
-        asyncio.run(local_main())
+        asyncio.run(local_daily())
     except Exception as e:
         logger.exception(f"Failed to run in local mode: {e}")
|
|||||||
@@ -1,2 +1,4 @@
 CARTESIA_API_KEY=
 OPENAI_API_KEY=
+# Local dev only
+DAILY_API_KEY=
@@ -7,6 +7,7 @@
 import os

 import aiohttp
+from fastapi import HTTPException

 from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams

|
|||||||
@@ -1,6 +1,8 @@
 agent_name = "my-first-agent"
 image = "your-username/my-first-agent:0.1"
+image_credentials = "your-dockerhub-creds"
 secret_set = "my-first-agent-secrets"
+enable_krisp = true

 [scaling]
 min_instances = 0
|
|||||||
@@ -47,7 +47,7 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
         live_options=LiveOptions(vad_events=True, utterance_end_ms="1000"),
     )

-    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-helios-en")
+    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")

     llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

|
|||||||
@@ -39,7 +39,7 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac

     stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))

-    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-helios-en")
+    tts = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"), voice="aura-2-andromeda-en")

     llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))

|
|||||||
@@ -14,6 +14,7 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import LLMUserAggregatorParams
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.services.groq.llm import GroqLLMService
 from pipecat.services.groq.stt import GroqSTTService
@@ -39,7 +40,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac

     stt = GroqSTTService(api_key=os.getenv("GROQ_API_KEY"))

-    llm = GroqLLMService(api_key=os.getenv("GROQ_API_KEY"), model="llama-3.3-70b-versatile")
+    llm = GroqLLMService(
+        api_key=os.getenv("GROQ_API_KEY"), model="meta-llama/llama-4-maverick-17b-128e-instruct"
+    )

     tts = GroqTTSService(api_key=os.getenv("GROQ_API_KEY"))

@@ -51,7 +54,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
     ]

     context = OpenAILLMContext(messages)
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = llm.create_context_aggregator(
+        context, user_params=LLMUserAggregatorParams(aggregation_timeout=0.05)
+    )

     pipeline = Pipeline(
         [
|
|||||||
@@ -5,7 +5,6 @@
 #

 import argparse
-import os

 from dotenv import load_dotenv
 from loguru import logger
@@ -15,9 +14,9 @@ from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
-from pipecat.services.aws.tts import PollyTTSService
-from pipecat.services.deepgram.stt import DeepgramSTTService
-from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.services.aws.llm import AWSBedrockLLMService
+from pipecat.services.aws.stt import AWSTranscribeSTTService
+from pipecat.services.aws.tts import AWSPollyTTSService
 from pipecat.transports.base_transport import TransportParams
 from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
 from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
@@ -37,17 +36,19 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
         ),
     )

-    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
+    stt = AWSTranscribeSTTService()

-    tts = PollyTTSService(
-        api_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
-        aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
-        region=os.getenv("AWS_REGION"),
-        voice_id="Amy",
-        params=PollyTTSService.InputParams(engine="neural", language="en-GB", rate="1.05"),
+    tts = AWSPollyTTSService(
+        region="us-west-2",  # only specific regions support generative TTS
+        voice_id="Joanna",
+        params=AWSPollyTTSService.InputParams(engine="generative", rate="1.1"),
     )

-    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
+    llm = AWSBedrockLLMService(
+        aws_region="us-west-2",
+        model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
+        params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
+    )

     messages = [
         {
@@ -85,7 +86,7 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
     async def on_client_connected(transport, client):
         logger.info(f"Client connected")
         # Kick off the conversation.
-        messages.append({"role": "system", "content": "Please introduce yourself to the user."})
+        messages.append({"role": "user", "content": "Please introduce yourself to the user."})
         await task.queue_frames([context_aggregator.user().get_context_frame()])

     @transport.event_handler("on_client_disconnected")
@@ -44,7 +44,8 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac

         tts = RimeHttpTTSService(
             api_key=os.getenv("RIME_API_KEY", ""),
-            voice_id="rex",
+            voice_id="luna",
+            model="arcana",
             aiohttp_session=session,
         )

|
|||||||
@@ -16,8 +16,12 @@ from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.services.nim.llm import NimLLMService
-from pipecat.services.riva.stt import ParakeetSTTService
-from pipecat.services.riva.tts import FastPitchTTSService
+from pipecat.services.riva.stt import (
+    ParakeetSTTService,
+    RivaSegmentedSTTService,
+    RivaSTTService,
+)
+from pipecat.services.riva.tts import FastPitchTTSService, RivaTTSService
 from pipecat.transports.base_transport import TransportParams
 from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
 from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
@@ -37,11 +41,11 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
         ),
     )

-    stt = ParakeetSTTService(api_key=os.getenv("NVIDIA_API_KEY"))
+    stt = RivaSTTService(api_key=os.getenv("NVIDIA_API_KEY"))

     llm = NimLLMService(api_key=os.getenv("NVIDIA_API_KEY"), model="meta/llama-3.1-405b-instruct")

-    tts = FastPitchTTSService(api_key=os.getenv("NVIDIA_API_KEY"))
+    tts = RivaTTSService(api_key=os.getenv("NVIDIA_API_KEY"))

     messages = [
         {
|
|||||||
@@ -4,6 +4,7 @@
 # SPDX-License-Identifier: BSD 2-Clause License
 #

+import argparse
 import os

 from dotenv import load_dotenv
@@ -39,7 +40,7 @@ class TranscriptionLogger(FrameProcessor):
             print(f"Translation ({frame.language}): {frame.text}")


-async def run_bot(webrtc_connection: SmallWebRTCConnection):
+async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
     logger.info(f"Starting bot")

     transport = SmallWebRTCTransport(
|
|||||||
@@ -17,6 +17,7 @@ from pipecat.frames.frames import TTSSpeakFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.runner import PipelineRunner
 from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.llm_response import LLMUserAggregatorParams
 from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
 from pipecat.services.cartesia.tts import CartesiaTTSService
 from pipecat.services.groq.llm import GroqLLMService
@@ -53,7 +54,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
         voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
     )

-    llm = GroqLLMService(api_key=os.getenv("GROQ_API_KEY"), model="llama-3.3-70b-versatile")
+    llm = GroqLLMService(
+        api_key=os.getenv("GROQ_API_KEY"), model="meta-llama/llama-4-maverick-17b-128e-instruct"
+    )
     # You can also register a function_name of None to get all functions
     # sent to the same callback with an additional function_name parameter.
     llm.register_function("get_current_weather", fetch_weather_from_api)
@@ -83,7 +86,9 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
     ]

     context = OpenAILLMContext(messages, tools)
-    context_aggregator = llm.create_context_aggregator(context)
+    context_aggregator = llm.create_context_aggregator(
+        context, user_params=LLMUserAggregatorParams(aggregation_timeout=0.05)
+    )

     pipeline = Pipeline(
         [
|
|||||||
139
examples/foundational/14r-function-calling-aws.py
Normal file
@@ -0,0 +1,139 @@
+#
+# Copyright (c) 2024–2025, Daily
+#
+# SPDX-License-Identifier: BSD 2-Clause License
+#
+
+import argparse
+import os
+
+from dotenv import load_dotenv
+from loguru import logger
+
+from pipecat.adapters.schemas.function_schema import FunctionSchema
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
+from pipecat.services.aws.llm import AWSBedrockLLMService
+from pipecat.services.aws.stt import AWSTranscribeSTTService
+from pipecat.services.aws.tts import AWSPollyTTSService
+from pipecat.services.llm_service import FunctionCallParams
+from pipecat.transports.base_transport import TransportParams
+from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
+from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
+
+load_dotenv(override=True)
+
+
+async def fetch_weather_from_api(params: FunctionCallParams):
+    await params.result_callback({"conditions": "nice", "temperature": "75"})
+
+
+async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
+    logger.info(f"Starting bot")
+
+    transport = SmallWebRTCTransport(
+        webrtc_connection=webrtc_connection,
+        params=TransportParams(
+            audio_in_enabled=True,
+            audio_out_enabled=True,
+            vad_analyzer=SileroVADAnalyzer(),
+        ),
+    )
+
+    stt = AWSTranscribeSTTService()
+
+    tts = AWSPollyTTSService(
+        region="us-west-2",  # only specific regions support generative TTS
+        voice_id="Joanna",
+        params=AWSPollyTTSService.InputParams(engine="generative", rate="1.1"),
+    )
+
+    llm = AWSBedrockLLMService(
+        aws_region="us-west-2",
+        model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
+        params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
+    )
+
+    # You can also register a function_name of None to get all functions
+    # sent to the same callback with an additional function_name parameter.
+    llm.register_function("get_current_weather", fetch_weather_from_api)
+
+    weather_function = FunctionSchema(
+        name="get_current_weather",
+        description="Get the current weather",
+        properties={
+            "location": {
+                "type": "string",
+                "description": "The city and state, e.g. San Francisco, CA",
+            },
+            "format": {
+                "type": "string",
+                "enum": ["celsius", "fahrenheit"],
+                "description": "The temperature unit to use. Infer this from the user's location.",
+            },
+        },
+        required=["location", "format"],
+    )
+    tools = ToolsSchema(standard_tools=[weather_function])
+
+    messages = [
+        {
+            "role": "system",
+            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
+        },
+    ]
+
+    context = OpenAILLMContext(messages, tools)
+    context_aggregator = llm.create_context_aggregator(context)
+
+    pipeline = Pipeline(
+        [
+            transport.input(),
+            stt,
+            context_aggregator.user(),
+            llm,
+            tts,
+            transport.output(),
+            context_aggregator.assistant(),
+        ]
+    )
+
+    task = PipelineTask(
+        pipeline,
+        params=PipelineParams(
+            allow_interruptions=True,
+            enable_metrics=True,
+            enable_usage_metrics=True,
+            report_only_initial_ttfb=True,
+        ),
+    )
+
+    @transport.event_handler("on_client_connected")
+    async def on_client_connected(transport, client):
+        logger.info(f"Client connected")
+        # Kick off the conversation.
+        messages.append({"role": "user", "content": "Please introduce yourself to the user."})
+        await task.queue_frames([context_aggregator.user().get_context_frame()])
+
+    @transport.event_handler("on_client_disconnected")
+    async def on_client_disconnected(transport, client):
+        logger.info(f"Client disconnected")
+
+    @transport.event_handler("on_client_closed")
+    async def on_client_closed(transport, client):
+        logger.info(f"Client closed connection")
+        await task.cancel()
+
+    runner = PipelineRunner(handle_sigint=False)
+
+    await runner.run(task)
+
+
+if __name__ == "__main__":
+    from run import main
+
+    main()
267
examples/foundational/20e-persistent-context-aws-nova-sonic.py
Normal file
@@ -0,0 +1,267 @@
|
|||||||
|
#
|
||||||
|
# Copyright (c) 2025, Daily
|
||||||
|
#
|
||||||
|
# SPDX-License-Identifier: BSD 2-Clause License
|
||||||
|
#
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import asyncio
|
||||||
|
import glob
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
from datetime import datetime
|
||||||
|
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
from loguru import logger
|
||||||
|
|
||||||
|
from pipecat.adapters.schemas.function_schema import FunctionSchema
|
||||||
|
from pipecat.adapters.schemas.tools_schema import ToolsSchema
|
||||||
|
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
||||||
|
from pipecat.audio.vad.vad_analyzer import VADParams
|
||||||
|
from pipecat.pipeline.pipeline import Pipeline
|
||||||
|
from pipecat.pipeline.runner import PipelineRunner
|
||||||
|
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
||||||
|
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
||||||
|
from pipecat.services.aws_nova_sonic.aws import AWSNovaSonicLLMService
|
||||||
|
from pipecat.services.llm_service import FunctionCallParams
|
||||||
|
from pipecat.transports.base_transport import TransportParams
|
||||||
|
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
|
||||||
|
from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
|
||||||
|
|
||||||
|
load_dotenv(override=True)
|
||||||
|
|
||||||
|
BASE_FILENAME = "/tmp/pipecat_conversation_"
|
||||||
|
|
||||||
|
|
||||||
|
async def fetch_weather_from_api(params: FunctionCallParams):
|
||||||
|
temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
|
||||||
|
await params.result_callback(
|
||||||
|
{
|
||||||
|
"conditions": "nice",
|
||||||
|
"temperature": temperature,
|
||||||
|
"format": params.arguments["format"],
|
||||||
|
"timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def get_saved_conversation_filenames(params: FunctionCallParams):
|
||||||
|
# Construct the full pattern including the BASE_FILENAME
|
||||||
|
full_pattern = f"{BASE_FILENAME}*.json"
|
||||||
|
|
||||||
|
# Use glob to find all matching files
|
||||||
|
matching_files = glob.glob(full_pattern)
|
||||||
|
logger.debug(f"matching files: {matching_files}")
|
||||||
|
|
||||||
|
await params.result_callback({"filenames": matching_files})
|
||||||
|
|
||||||
|
|
||||||
|
# async def get_saved_conversation_filenames(
|
||||||
|
# function_name, tool_call_id, args, llm, context, result_callback
|
||||||
|
# ):
|
||||||
|
# pattern = re.compile(re.escape(BASE_FILENAME) + "\\d{8}_\\d{6}\\.json$")
|
||||||
|
# matching_files = []
|
||||||
|
|
||||||
|
# for filename in os.listdir("."):
|
||||||
|
# if pattern.match(filename):
|
||||||
|
# matching_files.append(filename)
|
||||||
|
|
||||||
|
# await result_callback({"filenames": matching_files})
|
||||||
|
|
||||||
|
|
||||||
|
async def save_conversation(params: FunctionCallParams):
|
||||||
|
timestamp = datetime.now().strftime("%Y-%m-%d_%H:%M:%S")
|
||||||
|
filename = f"{BASE_FILENAME}{timestamp}.json"
|
||||||
|
try:
|
||||||
|
with open(filename, "w") as file:
|
||||||
|
messages = params.context.get_messages_for_persistent_storage()
|
||||||
|
# Remove the last few messages. In reverse order, they are:
|
||||||
|
# - the in progress save tool call
|
||||||
|
# - the invocation of the save tool call
|
||||||
|
# - the user ask to save (which may encompass one or more messages)
|
||||||
|
# the simplest thing to do is to pop messages until the last one is an assistant
|
||||||
|
# response
|
||||||
|
while messages and not (
|
||||||
|
messages[-1].get("role") == "assistant" and "content" in messages[-1]
|
||||||
|
):
|
||||||
|
messages.pop()
|
||||||
|
if messages: # we never expect this to be empty
|
||||||
|
logger.debug(
|
||||||
|
f"writing conversation to {filename}\n{json.dumps(messages, indent=4)}"
|
||||||
|
)
|
||||||
|
json.dump(messages, file, indent=2)
|
||||||
|
await params.result_callback({"success": True})
|
||||||
|
except Exception as e:
|
||||||
|
await params.result_callback({"success": False, "error": str(e)})
|
||||||
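Review note: the trimming loop in `save_conversation` can be exercised on its own. The following is a minimal sketch, assuming the context messages are plain OpenAI-style dicts; `trim_to_last_assistant_message` is a hypothetical helper name introduced here for illustration and is not part of pipecat.

```python
from typing import Any, Dict, List


def trim_to_last_assistant_message(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop trailing messages until the last one is an assistant message with content.

    Hypothetical helper (not part of pipecat). It mirrors the while-loop in
    save_conversation, but works on a copy so the caller's list is untouched.
    """
    trimmed = list(messages)
    while trimmed and not (
        trimmed[-1].get("role") == "assistant" and "content" in trimmed[-1]
    ):
        trimmed.pop()
    return trimmed


# Illustrative history: a normal exchange followed by the "save" request,
# the tool invocation, and the in-progress tool result.
history = [
    {"role": "system", "content": "You are a friendly assistant."},
    {"role": "user", "content": "What's the weather in Boston?"},
    {"role": "assistant", "content": "It's nice and 75 degrees."},
    {"role": "user", "content": "Please save this conversation."},
    {"role": "assistant", "tool_calls": [{"id": "call_1", "type": "function"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "IN_PROGRESS"},
]

saved = trim_to_last_assistant_message(history)
```

Here `saved` keeps only the first three messages, so a later `load_conversation` would resume from the last spoken assistant reply rather than from a half-finished save request.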
|
|
||||||
|
|
||||||
|
async def load_conversation(params: FunctionCallParams):
|
||||||
|
async def _reset():
|
||||||
|
filename = params.arguments["filename"]
|
||||||
|
logger.debug(f"loading conversation from {filename}")
|
||||||
|
try:
|
||||||
|
with open(filename, "r") as file:
|
||||||
|
messages = json.load(file)
|
||||||
|
messages.append(
|
||||||
|
{
|
||||||
|
"role": "user",
|
||||||
|
"content": f"{AWSNovaSonicLLMService.AWAIT_TRIGGER_ASSISTANT_RESPONSE_INSTRUCTION}",
|
||||||
|
}
|
||||||
|
)
|
||||||
|
params.context.set_messages(messages)
|
||||||
|
await params.llm.reset_conversation()
|
||||||
|
await params.llm.trigger_assistant_response()
|
||||||
|
except Exception as e:
|
||||||
|
await params.result_callback({"success": False, "error": str(e)})
|
||||||
|
|
||||||
|
asyncio.create_task(_reset())
|
||||||
|
|
||||||
|
|
||||||
|
get_current_weather_tool = FunctionSchema(
|
||||||
|
name="get_current_weather",
|
||||||
|
description="Get the current weather",
|
||||||
|
properties={
|
||||||
|
"location": {
|
||||||
|
"type": "string",
|
||||||
|
"description": "The city and state, e.g. San Francisco, CA",
|
||||||
|
},
|
||||||
|
"format": {
|
||||||
|
"type": "string",
|
||||||
|
"enum": ["celsius", "fahrenheit"],
|
||||||
|
"description": "The temperature unit to use. Infer this from the user's location.",
|
||||||
|
},
|
||||||
|
},
|
||||||
|
required=["location", "format"],
|
||||||
|
)
|
||||||
|
|
||||||
|
save_conversation_tool = FunctionSchema(
|
||||||
|
name="save_conversation",
|
||||||
|
description="Save the current conversation. Use this function to persist the current conversation to external storage.",
|
||||||
|
properties={},
|
||||||
|
required=[],
|
||||||
|
)
|
||||||
|
|
||||||
|
get_saved_conversation_filenames_tool = FunctionSchema(
|
||||||
|
name="get_saved_conversation_filenames",
|
||||||
|
description="Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
|
||||||
|
properties={},
|
||||||
|
required=[],
|
||||||
|
)
|
||||||
|
|
||||||
|
load_conversation_tool = FunctionSchema(
|
||||||
|
name="load_conversation",
|
||||||
|
description="Load a conversation history. Use this function to load a conversation history into the current session.",
|
||||||
|
properties={
|
||||||
|
"filename": {
|
||||||
|
"type": "string",
|
||||||
|
"description": "The filename of the conversation history to load.",
|
||||||
|
}
|
||||||
|
},
|
||||||
|
required=["filename"],
|
||||||
|
)
|
||||||
|
|
||||||
|
tools = ToolsSchema(
|
||||||
|
standard_tools=[
|
||||||
|
get_current_weather_tool,
|
||||||
|
save_conversation_tool,
|
||||||
|
get_saved_conversation_filenames_tool,
|
||||||
|
load_conversation_tool,
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
|
||||||
|
logger.info(f"Starting bot")
|
||||||
|
|
||||||
|
transport = SmallWebRTCTransport(
|
||||||
|
webrtc_connection=webrtc_connection,
|
||||||
|
params=TransportParams(
|
||||||
|
audio_in_enabled=True,
|
||||||
|
audio_out_enabled=True,
|
||||||
|
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.8)),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
# Specify initial system instruction.
|
||||||
|
# HACK: note that, for now, we need to inject a special bit of text into this instruction to
|
||||||
|
# allow the first assistant response to be programmatically triggered (which happens in the
|
||||||
|
# on_client_connected handler, below)
|
||||||
|
system_instruction = (
|
||||||
|
"You are a friendly assistant. The user and you will engage in a spoken dialog exchanging "
|
||||||
|
"the transcripts of a natural real-time conversation. Keep your responses short, generally "
|
||||||
|
"two or three sentences for chatty scenarios. "
|
||||||
|
f"{AWSNovaSonicLLMService.AWAIT_TRIGGER_ASSISTANT_RESPONSE_INSTRUCTION}"
|
||||||
|
)
|
||||||
|
|
||||||
|
llm = AWSNovaSonicLLMService(
|
||||||
|
secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
|
||||||
|
access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
|
||||||
|
region=os.getenv("AWS_REGION"), # as of 2025-05-06, us-east-1 is the only supported region
|
||||||
|
voice_id="tiffany", # matthew, tiffany, amy
|
||||||
|
# you could choose to pass instruction here rather than via context
|
||||||
|
# system_instruction=system_instruction,
|
||||||
|
# you could choose to pass tools here rather than via context
|
||||||
|
# tools=tools
|
||||||
|
)
|
||||||
|
|
||||||
|
llm.register_function("get_current_weather", fetch_weather_from_api)
|
||||||
|
llm.register_function("save_conversation", save_conversation)
|
||||||
|
llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
|
||||||
|
llm.register_function("load_conversation", load_conversation)
|
||||||
|
|
||||||
|
context = OpenAILLMContext(
|
||||||
|
messages=[
|
||||||
|
{"role": "system", "content": f"{system_instruction}"},
|
||||||
|
],
|
||||||
|
tools=tools,
|
||||||
|
)
|
||||||
|
context_aggregator = llm.create_context_aggregator(context)
|
||||||
|
|
||||||
|
pipeline = Pipeline(
|
||||||
|
[
|
||||||
|
transport.input(), # Transport user input
|
||||||
|
context_aggregator.user(),
|
||||||
|
llm, # LLM
|
||||||
|
transport.output(), # Transport bot output
|
||||||
|
context_aggregator.assistant(),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
task = PipelineTask(
|
||||||
|
pipeline,
|
||||||
|
params=PipelineParams(
|
||||||
|
allow_interruptions=True,
|
||||||
|
enable_metrics=True,
|
||||||
|
enable_usage_metrics=True,
|
||||||
|
report_only_initial_ttfb=True,
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
@transport.event_handler("on_client_connected")
|
||||||
|
async def on_client_connected(transport, client):
|
||||||
|
logger.info(f"Client connected")
|
||||||
|
# Kick off the conversation.
|
||||||
|
await task.queue_frames([context_aggregator.user().get_context_frame()])
|
||||||
|
# HACK: for now, we need this special way of triggering the first assistant response in AWS
|
||||||
|
# Nova Sonic. Note that this trigger requires a special corresponding bit of text in the
|
||||||
|
# system instruction. In the future, simply queueing the context frame should be sufficient.
|
||||||
|
await llm.trigger_assistant_response()
|
||||||
|
|
||||||
|
@transport.event_handler("on_client_disconnected")
|
||||||
|
async def on_client_disconnected(transport, client):
|
||||||
|
logger.info(f"Client disconnected")
|
||||||
|
|
||||||
|
@transport.event_handler("on_client_closed")
|
||||||
|
async def on_client_closed(transport, client):
|
||||||
|
logger.info(f"Client closed connection")
|
||||||
|
await task.cancel()
|
||||||
|
|
||||||
|
runner = PipelineRunner(handle_sigint=False)
|
||||||
|
|
||||||
|
await runner.run(task)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
from run import main
|
||||||
|
|
||||||
|
main()
|
||||||
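Review note: the save and list steps in the file above rely on a shared filename prefix plus a timestamp. The sketch below shows that round trip in isolation, under stated assumptions: `conversation_filename` and `list_saved_conversations` are hypothetical helper names, the files are written to a temporary directory rather than `/tmp`, and the timestamp uses hyphens instead of the colons in the example's `%H:%M:%S` so the names are also valid on filesystems that reject colons.

```python
import glob
import json
import os
import tempfile
from datetime import datetime
from typing import List


def conversation_filename(base: str, now: datetime) -> str:
    """Build a timestamped filename from a shared prefix (hypothetical helper)."""
    # Assumption: hyphens instead of colons in the time part, for portability.
    return f"{base}{now.strftime('%Y-%m-%d_%H-%M-%S')}.json"


def list_saved_conversations(base: str) -> List[str]:
    """Return saved conversation files for the prefix, oldest first."""
    # glob.escape protects against glob metacharacters in the prefix itself.
    return sorted(glob.glob(f"{glob.escape(base)}*.json"))


with tempfile.TemporaryDirectory() as tmp_dir:
    base = os.path.join(tmp_dir, "pipecat_conversation_")
    first = conversation_filename(base, datetime(2025, 5, 6, 9, 30, 0))
    second = conversation_filename(base, datetime(2025, 5, 7, 18, 45, 12))

    # Write the newer file first to show that listing order comes from the name.
    for path in (second, first):
        with open(path, "w") as file:
            json.dump([{"role": "assistant", "content": "Hello!"}], file, indent=2)

    # An unrelated file in the same directory must not be listed.
    with open(os.path.join(tmp_dir, "notes.txt"), "w") as file:
        file.write("not a conversation")

    found_names = [os.path.basename(p) for p in list_saved_conversations(base)]
```

Because the timestamp is zero-padded and ordered from year down to second, a plain lexicographic sort returns the files in chronological order.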
@@ -12,10 +12,12 @@ from loguru import logger
|
|||||||
|
|
||||||
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
||||||
from pipecat.audio.vad.vad_analyzer import VADParams
|
from pipecat.audio.vad.vad_analyzer import VADParams
|
||||||
|
from pipecat.frames.frames import TranscriptionMessage
|
||||||
from pipecat.pipeline.pipeline import Pipeline
|
from pipecat.pipeline.pipeline import Pipeline
|
||||||
from pipecat.pipeline.runner import PipelineRunner
|
from pipecat.pipeline.runner import PipelineRunner
|
||||||
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
||||||
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
||||||
|
from pipecat.processors.transcript_processor import TranscriptProcessor
|
||||||
from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
|
from pipecat.services.gemini_multimodal_live.gemini import GeminiMultimodalLiveLLMService
|
||||||
from pipecat.transports.base_transport import TransportParams
|
from pipecat.transports.base_transport import TransportParams
|
||||||
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
|
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
|
||||||
@@ -69,12 +71,16 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
|
|||||||
)
|
)
|
||||||
context_aggregator = llm.create_context_aggregator(context)
|
context_aggregator = llm.create_context_aggregator(context)
|
||||||
|
|
||||||
|
transcript = TranscriptProcessor()
|
||||||
|
|
||||||
pipeline = Pipeline(
|
pipeline = Pipeline(
|
||||||
[
|
[
|
||||||
transport.input(),
|
transport.input(),
|
||||||
context_aggregator.user(),
|
context_aggregator.user(),
|
||||||
|
transcript.user(),
|
||||||
llm,
|
llm,
|
||||||
transport.output(),
|
transport.output(),
|
||||||
|
transcript.assistant(),
|
||||||
context_aggregator.assistant(),
|
context_aggregator.assistant(),
|
||||||
]
|
]
|
||||||
)
|
)
|
||||||
@@ -103,6 +109,15 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
|
|||||||
logger.info(f"Client closed connection")
|
logger.info(f"Client closed connection")
|
||||||
await task.cancel()
|
await task.cancel()
|
||||||
|
|
||||||
|
# Register event handler for transcript updates
|
||||||
|
@transcript.event_handler("on_transcript_update")
|
||||||
|
async def on_transcript_update(processor, frame):
|
||||||
|
for msg in frame.messages:
|
||||||
|
if isinstance(msg, TranscriptionMessage):
|
||||||
|
timestamp = f"[{msg.timestamp}] " if msg.timestamp else ""
|
||||||
|
line = f"{timestamp}{msg.role}: {msg.content}"
|
||||||
|
logger.info(f"Transcript: {line}")
|
||||||
|
|
||||||
runner = PipelineRunner(handle_sigint=False)
|
runner = PipelineRunner(handle_sigint=False)
|
||||||
|
|
||||||
await runner.run(task)
|
await runner.run(task)
|
||||||
|
|||||||
@@ -36,6 +36,7 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
|
|||||||
audio_in_enabled=True,
|
audio_in_enabled=True,
|
||||||
audio_out_enabled=True,
|
audio_out_enabled=True,
|
||||||
video_out_enabled=True,
|
video_out_enabled=True,
|
||||||
|
video_out_is_live=True,
|
||||||
video_out_width=512,
|
video_out_width=512,
|
||||||
video_out_height=512,
|
video_out_height=512,
|
||||||
vad_analyzer=SileroVADAnalyzer(),
|
vad_analyzer=SileroVADAnalyzer(),
|
||||||
|
|||||||
@@ -14,19 +14,26 @@ from pipecat.audio.vad.silero import SileroVADAnalyzer
|
|||||||
from pipecat.frames.frames import (
|
from pipecat.frames.frames import (
|
||||||
BotStartedSpeakingFrame,
|
BotStartedSpeakingFrame,
|
||||||
BotStoppedSpeakingFrame,
|
BotStoppedSpeakingFrame,
|
||||||
Frame,
|
EndFrame,
|
||||||
StartInterruptionFrame,
|
StartInterruptionFrame,
|
||||||
|
TTSTextFrame,
|
||||||
|
UserStartedSpeakingFrame,
|
||||||
)
|
)
|
||||||
from pipecat.observers.base_observer import BaseObserver
|
from pipecat.observers.base_observer import BaseObserver, FramePushed
|
||||||
|
from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
|
||||||
from pipecat.observers.loggers.llm_log_observer import LLMLogObserver
|
from pipecat.observers.loggers.llm_log_observer import LLMLogObserver
|
||||||
from pipecat.pipeline.pipeline import Pipeline
|
from pipecat.pipeline.pipeline import Pipeline
|
||||||
from pipecat.pipeline.runner import PipelineRunner
|
from pipecat.pipeline.runner import PipelineRunner
|
||||||
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
||||||
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
from pipecat.processors.aggregators.openai_llm_context import (
|
||||||
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
|
OpenAILLMContext,
|
||||||
|
)
|
||||||
|
from pipecat.processors.frame_processor import FrameDirection
|
||||||
from pipecat.services.cartesia.tts import CartesiaTTSService
|
from pipecat.services.cartesia.tts import CartesiaTTSService
|
||||||
from pipecat.services.deepgram.stt import DeepgramSTTService
|
from pipecat.services.deepgram.stt import DeepgramSTTService
|
||||||
from pipecat.services.openai.llm import OpenAILLMService
|
from pipecat.services.openai.llm import OpenAILLMService
|
||||||
|
from pipecat.transports.base_input import BaseInputTransport
|
||||||
|
from pipecat.transports.base_output import BaseOutputTransport
|
||||||
from pipecat.transports.base_transport import TransportParams
|
from pipecat.transports.base_transport import TransportParams
|
||||||
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
|
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
|
||||||
from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
|
from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
|
||||||
@@ -34,7 +41,7 @@ from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
|
|||||||
load_dotenv(override=True)
|
load_dotenv(override=True)
|
||||||
|
|
||||||
|
|
||||||
class DebugObserver(BaseObserver):
|
class CustomObserver(BaseObserver):
|
||||||
"""Observer to log interruptions and bot speaking events to the console.
|
"""Observer to log interruptions and bot speaking events to the console.
|
||||||
|
|
||||||
Logs all frame instances of:
|
Logs all frame instances of:
|
||||||
@@ -46,21 +53,20 @@ class DebugObserver(BaseObserver):
|
|||||||
Log format: [EVENT TYPE]: [source processor] → [destination processor] at [timestamp]s
|
Log format: [EVENT TYPE]: [source processor] → [destination processor] at [timestamp]s
|
||||||
"""
|
"""
|
||||||
|
|
||||||
async def on_push_frame(
|
async def on_push_frame(self, data: FramePushed):
|
||||||
self,
|
src = data.source
|
||||||
src: FrameProcessor,
|
dst = data.destination
|
||||||
dst: FrameProcessor,
|
frame = data.frame
|
||||||
frame: Frame,
|
direction = data.direction
|
||||||
direction: FrameDirection,
|
timestamp = data.timestamp
|
||||||
timestamp: int,
|
|
||||||
):
|
|
||||||
# Convert timestamp to seconds for readability
|
# Convert timestamp to seconds for readability
|
||||||
time_sec = timestamp / 1_000_000_000
|
time_sec = timestamp / 1_000_000_000
|
||||||
|
|
||||||
# Create direction arrow
|
# Create direction arrow
|
||||||
arrow = "→" if direction == FrameDirection.DOWNSTREAM else "←"
|
arrow = "→" if direction == FrameDirection.DOWNSTREAM else "←"
|
||||||
|
|
||||||
if isinstance(frame, StartInterruptionFrame):
|
if isinstance(frame, StartInterruptionFrame) and isinstance(src, BaseOutputTransport):
|
||||||
logger.info(f"⚡ INTERRUPTION START: {src} {arrow} {dst} at {time_sec:.2f}s")
|
logger.info(f"⚡ INTERRUPTION START: {src} {arrow} {dst} at {time_sec:.2f}s")
|
||||||
elif isinstance(frame, BotStartedSpeakingFrame):
|
elif isinstance(frame, BotStartedSpeakingFrame):
|
||||||
logger.info(f"🤖 BOT START SPEAKING: {src} {arrow} {dst} at {time_sec:.2f}s")
|
logger.info(f"🤖 BOT START SPEAKING: {src} {arrow} {dst} at {time_sec:.2f}s")
|
||||||
@@ -119,7 +125,17 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
|
|||||||
enable_usage_metrics=True,
|
enable_usage_metrics=True,
|
||||||
report_only_initial_ttfb=True,
|
report_only_initial_ttfb=True,
|
||||||
),
|
),
|
||||||
observers=[DebugObserver(), LLMLogObserver()],
|
observers=[
|
||||||
|
CustomObserver(),
|
||||||
|
LLMLogObserver(),
|
||||||
|
DebugLogObserver(
|
||||||
|
frame_types={
|
||||||
|
TTSTextFrame: (BaseOutputTransport, FrameEndpoint.DESTINATION),
|
||||||
|
UserStartedSpeakingFrame: (BaseInputTransport, FrameEndpoint.SOURCE),
|
||||||
|
EndFrame: None,
|
||||||
|
}
|
||||||
|
),
|
||||||
|
],
|
||||||
)
|
)
|
||||||
|
|
||||||
@transport.event_handler("on_client_connected")
|
@transport.event_handler("on_client_connected")
|
||||||
|
|||||||
@@ -45,6 +45,7 @@ Note:
|
|||||||
"""
|
"""
|
||||||
|
|
||||||
import argparse
|
import argparse
|
||||||
|
import asyncio
|
||||||
import os
|
import os
|
||||||
|
|
||||||
from dotenv import load_dotenv
|
from dotenv import load_dotenv
|
||||||
@@ -102,8 +103,17 @@ async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespac
|
|||||||
voice_name = match.content.strip().lower()
|
voice_name = match.content.strip().lower()
|
||||||
if voice_name in VOICE_IDS:
|
if voice_name in VOICE_IDS:
|
||||||
voice_id = VOICE_IDS[voice_name]
|
voice_id = VOICE_IDS[voice_name]
|
||||||
tts.set_voice(voice_id)
|
|
||||||
logger.info(f"Switched to {voice_name} voice")
|
# Create task to reset the TTS context after voice change
|
||||||
|
async def change_voice():
|
||||||
|
# First flush any existing audio to finish the current context
|
||||||
|
await tts.flush_audio()
|
||||||
|
# Then set the new voice
|
||||||
|
tts.set_voice(voice_id)
|
||||||
|
logger.info(f"Switched to {voice_name} voice")
|
||||||
|
|
||||||
|
# Schedule the voice change task
|
||||||
|
asyncio.create_task(change_voice())
|
||||||
else:
|
else:
|
||||||
logger.warning(f"Unknown voice: {voice_name}")
|
logger.warning(f"Unknown voice: {voice_name}")
|
||||||
|
|
||||||
|
|||||||
128
examples/foundational/38b-smart-turn-local.py
Normal file
@@ -0,0 +1,128 @@
|
|||||||
|
#
|
||||||
|
# Copyright (c) 2024–2025, Daily
|
||||||
|
#
|
||||||
|
# SPDX-License-Identifier: BSD 2-Clause License
|
||||||
|
#
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
from loguru import logger
|
||||||
|
|
||||||
|
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
|
||||||
|
from pipecat.audio.turn.smart_turn.local_smart_turn import LocalSmartTurnAnalyzer
|
||||||
|
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
||||||
|
from pipecat.audio.vad.vad_analyzer import VADParams
|
||||||
|
from pipecat.pipeline.pipeline import Pipeline
|
||||||
|
from pipecat.pipeline.runner import PipelineRunner
|
||||||
|
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
||||||
|
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
||||||
|
from pipecat.services.cartesia.tts import CartesiaTTSService
|
||||||
|
from pipecat.services.deepgram.stt import DeepgramSTTService
|
||||||
|
from pipecat.services.openai.llm import OpenAILLMService
|
||||||
|
from pipecat.transports.base_transport import TransportParams
|
||||||
|
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
|
||||||
|
from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
|
||||||
|
|
||||||
|
load_dotenv(override=True)
|
||||||
|
|
||||||
|
|
||||||
|
async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
|
||||||
|
logger.info(f"Starting bot")
|
||||||
|
|
||||||
|
# To use this locally, set the environment variable LOCAL_SMART_TURN_MODEL_PATH
|
||||||
|
# to the path where the smart-turn repo is cloned.
|
||||||
|
#
|
||||||
|
# Example setup:
|
||||||
|
#
|
||||||
|
# # Git LFS (Large File Storage)
|
||||||
|
# brew install git-lfs
|
||||||
|
# # Hugging Face uses LFS to store large model files, including .mlpackage
|
||||||
|
# git lfs install
|
||||||
|
# # Clone the repo with the smart_turn_classifier.mlpackage
|
||||||
|
# git clone https://huggingface.co/pipecat-ai/smart-turn
|
||||||
|
#
|
||||||
|
# Then set the env variable:
|
||||||
|
# export LOCAL_SMART_TURN_MODEL_PATH=./smart-turn
|
||||||
|
# or add it to your .env file
|
||||||
|
smart_turn_model_path = os.getenv("LOCAL_SMART_TURN_MODEL_PATH")
|
||||||
|
|
||||||
|
transport = SmallWebRTCTransport(
|
||||||
|
webrtc_connection=webrtc_connection,
|
||||||
|
params=TransportParams(
|
||||||
|
audio_in_enabled=True,
|
||||||
|
audio_out_enabled=True,
|
||||||
|
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
|
||||||
|
turn_analyzer=LocalSmartTurnAnalyzer(
|
||||||
|
smart_turn_model_path=smart_turn_model_path, params=SmartTurnParams()
|
||||||
|
),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
|
||||||
|
|
||||||
|
tts = CartesiaTTSService(
|
||||||
|
api_key=os.getenv("CARTESIA_API_KEY"),
|
||||||
|
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
|
||||||
|
)
|
||||||
|
|
||||||
|
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
|
||||||
|
|
||||||
|
messages = [
|
||||||
|
{
|
||||||
|
"role": "system",
|
||||||
|
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
|
||||||
|
},
|
||||||
|
]
|
||||||
|
|
||||||
|
context = OpenAILLMContext(messages)
|
||||||
|
context_aggregator = llm.create_context_aggregator(context)
|
||||||
|
|
||||||
|
pipeline = Pipeline(
|
||||||
|
[
|
||||||
|
transport.input(), # Transport user input
|
||||||
|
stt,
|
||||||
|
context_aggregator.user(), # User responses
|
||||||
|
llm, # LLM
|
||||||
|
tts, # TTS
|
||||||
|
transport.output(), # Transport bot output
|
||||||
|
context_aggregator.assistant(), # Assistant spoken responses
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
task = PipelineTask(
|
||||||
|
pipeline,
|
||||||
|
params=PipelineParams(
|
||||||
|
allow_interruptions=True,
|
||||||
|
enable_metrics=True,
|
||||||
|
enable_usage_metrics=True,
|
||||||
|
report_only_initial_ttfb=True,
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
@transport.event_handler("on_client_connected")
|
||||||
|
async def on_client_connected(transport, client):
|
||||||
|
logger.info(f"Client connected")
|
||||||
|
# Kick off the conversation.
|
||||||
|
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
|
||||||
|
await task.queue_frames([context_aggregator.user().get_context_frame()])
|
||||||
|
|
||||||
|
@transport.event_handler("on_client_disconnected")
|
||||||
|
async def on_client_disconnected(transport, client):
|
||||||
|
logger.info(f"Client disconnected")
|
||||||
|
|
||||||
|
@transport.event_handler("on_client_closed")
|
||||||
|
async def on_client_closed(transport, client):
|
||||||
|
logger.info(f"Client closed connection")
|
||||||
|
await task.cancel()
|
||||||
|
|
||||||
|
runner = PipelineRunner(handle_sigint=False)
|
||||||
|
|
||||||
|
await runner.run(task)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
from run import main
|
||||||
|
|
||||||
|
main()
|
||||||
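Review note: the example above reads `LOCAL_SMART_TURN_MODEL_PATH` with `os.getenv` and passes the result straight to `LocalSmartTurnAnalyzer`, so an unset variable only surfaces later. A possible guard is sketched below. It is an assumption about how one might validate the setting, not behavior of pipecat; `resolve_smart_turn_model_path` is a hypothetical helper, and it assumes the variable should point at the cloned repository directory, as the setup comment describes.

```python
import os
import tempfile
from typing import Mapping


def resolve_smart_turn_model_path(env: Mapping[str, str]) -> str:
    """Return the absolute model directory, or raise ValueError with a clear message.

    Hypothetical helper (not part of pipecat).
    """
    path = env.get("LOCAL_SMART_TURN_MODEL_PATH")
    if not path:
        raise ValueError(
            "LOCAL_SMART_TURN_MODEL_PATH is not set; export it or add it to your .env file"
        )
    expanded = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(expanded):
        raise ValueError(f"LOCAL_SMART_TURN_MODEL_PATH is not a directory: {expanded}")
    return expanded


# A directory that exists resolves to its absolute path.
with tempfile.TemporaryDirectory() as model_dir:
    resolved = resolve_smart_turn_model_path({"LOCAL_SMART_TURN_MODEL_PATH": model_dir})
    resolved_matches = resolved == os.path.abspath(model_dir)

# A missing variable fails early with a message that names the variable.
try:
    resolve_smart_turn_model_path({})
    missing_error = None
except ValueError as error:
    missing_error = str(error)
```

In the bot itself this would be called as `resolve_smart_turn_model_path(os.environ)` before the transport is constructed.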
173
examples/foundational/39-aws-nova-sonic.py
Normal file
@@ -0,0 +1,173 @@
|
|||||||
|
#
|
||||||
|
# Copyright (c) 2024–2025, Daily
|
||||||
|
#
|
||||||
|
# SPDX-License-Identifier: BSD 2-Clause License
|
||||||
|
#
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
from datetime import datetime
|
||||||
|
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
from loguru import logger
|
||||||
|
|
||||||
|
from pipecat.adapters.schemas.function_schema import FunctionSchema
|
||||||
|
from pipecat.adapters.schemas.tools_schema import ToolsSchema
|
||||||
|
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
||||||
|
from pipecat.audio.vad.vad_analyzer import VADParams
|
||||||
|
from pipecat.pipeline.pipeline import Pipeline
|
||||||
|
from pipecat.pipeline.runner import PipelineRunner
|
||||||
|
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
||||||
|
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
||||||
|
from pipecat.services.aws_nova_sonic import AWSNovaSonicLLMService
|
||||||
|
from pipecat.services.llm_service import FunctionCallParams
|
||||||
|
from pipecat.transports.base_transport import TransportParams
|
||||||
|
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport
|
||||||
|
from pipecat.transports.network.webrtc_connection import SmallWebRTCConnection
|
||||||
|
|
||||||
|
# Load environment variables
|
||||||
|
load_dotenv(override=True)
|
||||||
|
|
||||||
|
|
||||||
|
async def fetch_weather_from_api(params: FunctionCallParams):
|
||||||
|
temperature = 75 if params.arguments["format"] == "fahrenheit" else 24
|
||||||
|
await params.result_callback(
|
||||||
|
{
|
||||||
|
"conditions": "nice",
|
||||||
|
"temperature": temperature,
|
||||||
|
"format": params.arguments["format"],
|
||||||
|
"timestamp": datetime.now().strftime("%Y%m%d_%H%M%S"),
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
weather_function = FunctionSchema(
|
||||||
|
name="get_current_weather",
|
||||||
|
description="Get the current weather",
|
||||||
|
properties={
|
||||||
|
"location": {
|
||||||
|
"type": "string",
|
||||||
|
"description": "The city and state, e.g. San Francisco, CA",
|
||||||
|
},
|
||||||
|
"format": {
|
||||||
|
"type": "string",
|
||||||
|
"enum": ["celsius", "fahrenheit"],
|
||||||
|
"description": "The temperature unit to use. Infer this from the user's location.",
|
||||||
|
},
|
||||||
|
},
|
||||||
|
required=["location", "format"],
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create tools schema
|
||||||
|
tools = ToolsSchema(standard_tools=[weather_function])
|
||||||
|
|
||||||
|
|
||||||
|
async def run_bot(webrtc_connection: SmallWebRTCConnection, _: argparse.Namespace):
|
||||||
|
logger.info(f"Starting bot")
|
||||||
|
|
||||||
|
# Initialize the SmallWebRTCTransport with the connection
|
||||||
|
transport = SmallWebRTCTransport(
|
||||||
|
webrtc_connection=webrtc_connection,
|
||||||
|
params=TransportParams(
|
||||||
|
audio_in_enabled=True,
|
||||||
|
audio_in_sample_rate=16000,
|
||||||
|
audio_out_enabled=True,
|
||||||
|
camera_in_enabled=False,
|
||||||
|
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.8)),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
# Specify initial system instruction.
|
||||||
|
# HACK: note that, for now, we need to inject a special bit of text into this instruction to
|
||||||
|
# allow the first assistant response to be programmatically triggered (which happens in the
|
||||||
|
# on_client_connected handler, below)
|
||||||
|
system_instruction = (
|
||||||
|
"You are a friendly assistant. The user and you will engage in a spoken dialog exchanging "
|
||||||
|
"the transcripts of a natural real-time conversation. Keep your responses short, generally "
|
||||||
|
"two or three sentences for chatty scenarios. "
|
||||||
|
f"{AWSNovaSonicLLMService.AWAIT_TRIGGER_ASSISTANT_RESPONSE_INSTRUCTION}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create the AWS Nova Sonic LLM service
|
||||||
|
llm = AWSNovaSonicLLMService(
|
||||||
|
secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
|
||||||
|
access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
|
||||||
|
region=os.getenv("AWS_REGION"), # as of 2025-05-06, us-east-1 is the only supported region
|
||||||
|
voice_id="tiffany", # matthew, tiffany, amy
|
||||||
|
# you could choose to pass instruction here rather than via context
|
||||||
|
# system_instruction=system_instruction
|
||||||
|
# you could choose to pass tools here rather than via context
|
||||||
|
# tools=tools
|
||||||
|
)
|
||||||
|
|
||||||
|
# Register function for function calls
|
||||||
|
# you can either register a single function for all function calls, or specific functions
|
||||||
|
# llm.register_function(None, fetch_weather_from_api)
|
||||||
|
llm.register_function("get_current_weather", fetch_weather_from_api)
|
||||||
|
|
||||||
|
# Set up context and context management.
|
||||||
|
# AWSNovaSonicLLMService will adapt OpenAI LLM context objects with standard message format to
|
||||||
|
# what's expected by Nova Sonic.
|
||||||
|
context = OpenAILLMContext(
|
||||||
|
messages=[
|
||||||
|
{"role": "system", "content": f"{system_instruction}"},
|
||||||
|
{
|
||||||
|
"role": "user",
|
||||||
|
"content": "Tell me a fun fact!",
|
||||||
|
},
|
||||||
|
],
|
||||||
|
tools=tools,
|
||||||
|
)
|
||||||
|
context_aggregator = llm.create_context_aggregator(context)
|
||||||
|
|
||||||
|
# Build the pipeline
|
||||||
|
pipeline = Pipeline(
|
||||||
|
[
|
||||||
|
transport.input(),
|
||||||
|
context_aggregator.user(),
|
||||||
|
llm,
|
||||||
|
transport.output(),
|
||||||
|
context_aggregator.assistant(),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
# Configure the pipeline task
|
||||||
|
task = PipelineTask(
|
||||||
|
pipeline,
|
||||||
|
params=PipelineParams(
|
||||||
|
allow_interruptions=True,
|
||||||
|
enable_metrics=True,
|
||||||
|
enable_usage_metrics=True,
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
# Handle client connection event
|
||||||
|
@transport.event_handler("on_client_connected")
|
||||||
|
async def on_client_connected(transport, client):
|
||||||
|
logger.info(f"Client connected")
|
||||||
|
# Kick off the conversation.
|
||||||
|
await task.queue_frames([context_aggregator.user().get_context_frame()])
|
||||||
|
# HACK: for now, we need this special way of triggering the first assistant response in AWS
|
||||||
|
# Nova Sonic. Note that this trigger requires a special corresponding bit of text in the
|
||||||
|
# system instruction. In the future, simply queueing the context frame should be sufficient.
|
||||||
|
await llm.trigger_assistant_response()
|
||||||
|
|
||||||
|
# Handle client disconnection events
|
||||||
|
@transport.event_handler("on_client_disconnected")
|
||||||
|
async def on_client_disconnected(transport, client):
|
||||||
|
logger.info(f"Client disconnected")
|
||||||
|
|
||||||
|
@transport.event_handler("on_client_closed")
|
||||||
|
async def on_client_closed(transport, client):
|
||||||
|
logger.info(f"Client closed connection")
|
||||||
|
await task.cancel()
|
||||||
|
|
||||||
|
# Run the pipeline
|
||||||
|
runner = PipelineRunner(handle_sigint=False)
|
||||||
|
await runner.run(task)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
from run import main
|
||||||
|
|
||||||
|
main()
|
||||||
@@ -10,12 +10,16 @@ import subprocess
|
|||||||
from contextlib import asynccontextmanager
|
from contextlib import asynccontextmanager
|
||||||
|
|
||||||
import aiohttp
|
import aiohttp
|
||||||
|
from dotenv import load_dotenv
|
||||||
from fastapi import FastAPI, HTTPException, Request
|
from fastapi import FastAPI, HTTPException, Request
|
||||||
from fastapi.middleware.cors import CORSMiddleware
|
from fastapi.middleware.cors import CORSMiddleware
|
||||||
from fastapi.responses import JSONResponse, RedirectResponse
|
from fastapi.responses import JSONResponse, RedirectResponse
|
||||||
|
|
||||||
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
|
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
|
||||||
|
|
||||||
|
# Load environment variables from .env file
|
||||||
|
load_dotenv(override=True)
|
||||||
|
|
||||||
MAX_BOTS_PER_ROOM = 1
|
MAX_BOTS_PER_ROOM = 1
|
||||||
|
|
||||||
# Bot sub-process dict for status reporting and concurrency control
|
# Bot sub-process dict for status reporting and concurrency control
|
||||||
|
|||||||
@@ -10,12 +10,16 @@ import subprocess
|
|||||||
from contextlib import asynccontextmanager
|
from contextlib import asynccontextmanager
|
||||||
|
|
||||||
import aiohttp
|
import aiohttp
|
||||||
|
from dotenv import load_dotenv
|
||||||
from fastapi import FastAPI, HTTPException, Request
|
from fastapi import FastAPI, HTTPException, Request
|
||||||
from fastapi.middleware.cors import CORSMiddleware
|
from fastapi.middleware.cors import CORSMiddleware
|
||||||
from fastapi.responses import JSONResponse, RedirectResponse
|
from fastapi.responses import JSONResponse, RedirectResponse
|
||||||
|
|
||||||
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
|
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
|
||||||
|
|
||||||
|
# Load environment variables from .env file
|
||||||
|
load_dotenv(override=True)
|
||||||
|
|
||||||
MAX_BOTS_PER_ROOM = 1
|
MAX_BOTS_PER_ROOM = 1
|
||||||
|
|
||||||
# Bot sub-process dict for status reporting and concurrency control
|
# Bot sub-process dict for status reporting and concurrency control
|
||||||
|
|||||||
138
examples/phone-chatbot-daily-twilio-sip/README.md
Normal file
@@ -0,0 +1,138 @@
|
|||||||
|
# Daily + Twilio SIP Voice Bot
|
||||||
|
|
||||||
|
This project demonstrates how to create a voice bot that can receive phone calls via Twilio and use Daily's SIP capabilities to enable voice conversations.
|
||||||
|
|
||||||
|
## How It Works
|
||||||
|
|
||||||
|
1. Twilio receives an incoming call to your phone number
|
||||||
|
2. Twilio calls your webhook server (`/call` endpoint)
|
||||||
|
3. The server creates a Daily room with SIP capabilities
|
||||||
|
4. The server starts the bot process with the room details
|
||||||
|
5. The caller is put on hold with music
|
||||||
|
6. The bot joins the Daily room and signals readiness
|
||||||
|
7. Twilio forwards the call to Daily's SIP endpoint
|
||||||
|
8. The caller and bot are connected, and the bot handles the conversation
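
Steps 5 and 7 above both come down to handing Twilio a small TwiML document. The following is a minimal sketch of those two documents using only the Python standard library. The helper names `hold_twiml` and `forward_twiml` are hypothetical and not part of this example's code, which uses Twilio's `VoiceResponse` helper in `server.py` and an inline string in `bot.py`. The URLs in the usage section are placeholders.

```python
import xml.etree.ElementTree as ET


def hold_twiml(audio_url: str, loop: int = 10) -> str:
    """Step 5: TwiML that plays hold audio while the bot starts up."""
    response = ET.Element("Response")
    play = ET.SubElement(response, "Play", {"loop": str(loop)})
    play.text = audio_url
    return ET.tostring(response, encoding="unicode")


def forward_twiml(sip_uri: str) -> str:
    """Step 7: TwiML that dials the Daily SIP endpoint for the room."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    sip = ET.SubElement(dial, "Sip")
    # ElementTree escapes characters such as "&" in the URI for us.
    sip.text = sip_uri
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(hold_twiml("https://example.com/hold.mp3"))
    print(forward_twiml("sip:room@example.invalid?x=1&y=2"))
```

Building the XML with an XML library rather than string formatting is a design choice of this sketch: it keeps the document well-formed even if a SIP URI contains characters that need escaping.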
|
||||||
|
|
||||||
|
## Prerequisites
|
||||||
|
|
||||||
|
- A Daily account with an API key
|
||||||
|
- A Twilio account with a phone number that supports voice
|
||||||
|
- An OpenAI API key for the bot's LLM
|
||||||
|
- A Cartesia API key for text-to-speech
|
||||||
|
|
||||||
|
## Setup
|
||||||
|
|
||||||
|
1. Create a virtual environment and install dependencies
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python -m venv venv
|
||||||
|
source venv/bin/activate # On Windows: venv\Scripts\activate
|
||||||
|
pip install -r requirements.txt
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Set up environment variables
|
||||||
|
|
||||||
|
Copy the example file and fill in your API keys:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cp env.example .env
|
||||||
|
# Edit .env with your API keys
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Configure your Twilio webhook
|
||||||
|
|
||||||
|
In the Twilio console:
|
||||||
|
|
||||||
|
- Go to your phone number's configuration
|
||||||
|
- Set the webhook for "A Call Comes In" to your server's URL + "/call"
|
||||||
|
- For local testing, you can use ngrok to expose your local server
|
||||||
|
|
||||||
|
```bash
|
||||||
|
ngrok http 8000
|
||||||
|
# Then use the provided URL (e.g., https://abc123.ngrok.io/call) in Twilio
|
||||||
|
```
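
Before pointing Twilio at the server, it can help to simulate the webhook locally. Twilio delivers call details as a form-encoded POST, and `server.py` reads the `CallSid` and `From` fields from it. The sketch below builds such a request with the standard library. The function name, the SID, and the phone number are made-up placeholders. A fake SID only exercises the request path, since Twilio will not recognise it when the bot later tries to forward the call.

```python
import urllib.parse
import urllib.request


def build_call_webhook_request(
    base_url: str, call_sid: str, caller: str
) -> urllib.request.Request:
    """Build a form-encoded POST to /call with the fields server.py reads."""
    body = urllib.parse.urlencode({"CallSid": call_sid, "From": caller}).encode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/call",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_call_webhook_request("http://localhost:8000", "CA" + "0" * 32, "+15555550100")
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
    # To actually send it (this creates a Daily room and starts a bot process):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status, resp.read().decode())
```

The send step is left commented out because a real request has side effects: the server creates a Daily room and spawns a bot process for every call it receives.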
|
||||||
|
|
||||||
|
## Running the Server
|
||||||
|
|
||||||
|
Start the webhook server:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python server.py
|
||||||
|
```
|
||||||
|
|
||||||
|
## Testing
|
||||||
|
|
||||||
|
Call your Twilio phone number. The system should answer the call, put you on hold briefly, then connect you with the bot.
|
||||||
|
|
||||||
|
## Customizing the Bot
|
||||||
|
|
||||||
|
You can customize the bot's behavior by modifying the system prompt in `bot.py`.
|
||||||
|
|
||||||
|
### Changing the Hold Music
|
||||||
|
|
||||||
|
To change the ringing sound or hold music that callers hear while waiting to be connected to the bot, update the URL in `server.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
resp = VoiceResponse()
|
||||||
|
resp.play(
|
||||||
|
url="https://your-custom-audio-file-url.mp3",
|
||||||
|
loop=10,
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
> Read [Twilio's guide](https://www.twilio.com/en-us/blog/adding-mp3-to-voice-call-using-twilio) on how to set up an mp3 in a voice call.
|
||||||
|
|
||||||
|
## Handling Multiple SIP Endpoints
|
||||||
|
|
||||||
|
The bot is configured to handle multiple `on_dialin_ready` events that might occur with multiple SIP endpoints. It ensures that each call is only forwarded once using a simple flag:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Flag to track if call has been forwarded
|
||||||
|
call_already_forwarded = False
|
||||||
|
|
||||||
|
@transport.event_handler("on_dialin_ready")
|
||||||
|
async def on_dialin_ready(transport, cdata):
|
||||||
|
nonlocal call_already_forwarded
|
||||||
|
|
||||||
|
# Skip if already forwarded
|
||||||
|
if call_already_forwarded:
|
||||||
|
logger.info("Call already forwarded, ignoring this event.")
|
||||||
|
return
|
||||||
|
|
||||||
|
# ... forwarding code ...
|
||||||
|
call_already_forwarded = True
|
||||||
|
```
|
||||||
|
|
||||||
|
Note that a call normally requires only a single SIP endpoint. If you plan to forward the call to a different number, you will need to set up two SIP endpoints: one for the initial call and one for the forwarded call. IMPORTANT: ensure that your `on_dialin_ready` handler only handles the first call.
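
The forward-once guard can also be tried in isolation. The following is a minimal, self-contained sketch of the same idea with no Pipecat or Twilio dependency. `make_forward_once` and `fake_forward` are hypothetical names used only for illustration. As in `bot.py`, the flag is set only after forwarding succeeds, so a failed attempt can be retried by a later event.

```python
import asyncio
from typing import Awaitable, Callable


def make_forward_once(
    forward: Callable[[], Awaitable[None]],
) -> Callable[[], Awaitable[bool]]:
    """Wrap `forward` so that only the first successful call goes through."""
    already_forwarded = False

    async def handler() -> bool:
        nonlocal already_forwarded
        if already_forwarded:
            return False  # a later on_dialin_ready event; nothing to do
        await forward()
        # Mark as forwarded only after forward() returned without raising.
        already_forwarded = True
        return True

    return handler


async def demo() -> list:
    calls = []

    async def fake_forward() -> None:
        # Stand-in for the Twilio call update.
        calls.append("forwarded")

    handler = make_forward_once(fake_forward)
    # Simulate two on_dialin_ready events arriving one after the other.
    results = [await handler(), await handler()]
    return [results, calls]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

This sketch assumes the events are handled one at a time. If two events could be processed concurrently while `forward()` is still awaiting, the check and the flag update would need to be protected, for example with an `asyncio.Lock`.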
|
||||||
|
|
||||||
|
## Daily SIP Configuration
|
||||||
|
|
||||||
|
The bot configures Daily rooms with SIP capabilities using these settings:
|
||||||
|
|
||||||
|
```python
|
||||||
|
sip_params = DailyRoomSipParams(
|
||||||
|
display_name="phone-user", # This will show up in the Daily UI; optionally display the dialer's number
|
||||||
|
video=False, # Audio-only call
|
||||||
|
sip_mode="dial-in", # For receiving calls (vs. dial-out)
|
||||||
|
num_endpoints=1, # Number of SIP endpoints to create
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
### Call is not being answered
|
||||||
|
|
||||||
|
- Check that your Twilio webhook is correctly configured
|
||||||
|
- Verify your Twilio account has sufficient funds
|
||||||
|
- Check the logs of both the server and bot processes
|
||||||
|
|
||||||
|
### Call connects but no bot is heard
|
||||||
|
|
||||||
|
- Ensure your Daily API key is correct and has SIP capabilities
|
||||||
|
- Check that the SIP endpoint is being correctly passed to the bot
|
||||||
|
- Verify that the Cartesia API key and voice ID are correct
|
||||||
|
|
||||||
|
### Bot starts but disconnects immediately
|
||||||
|
|
||||||
|
- Check the Daily and Twilio logs for any error messages
|
||||||
|
- Ensure your server has stable internet connectivity
|
||||||
183
examples/phone-chatbot-daily-twilio-sip/bot.py
Normal file
@@ -0,0 +1,183 @@
|
|||||||
|
"""Twilio + Daily voice bot implementation."""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import asyncio
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
from loguru import logger
|
||||||
|
from twilio.rest import Client
|
||||||
|
|
||||||
|
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
||||||
|
from pipecat.pipeline.pipeline import Pipeline
|
||||||
|
from pipecat.pipeline.runner import PipelineRunner
|
||||||
|
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
||||||
|
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
||||||
|
from pipecat.services.cartesia.tts import CartesiaTTSService
|
||||||
|
from pipecat.services.openai.llm import OpenAILLMService
|
||||||
|
from pipecat.transports.services.daily import DailyParams, DailyTransport
|
||||||
|
|
||||||
|
# Setup logging
|
||||||
|
load_dotenv()
|
||||||
|
logger.remove(0)
|
||||||
|
logger.add(sys.stderr, level="DEBUG")
|
||||||
|
|
||||||
|
# Initialize Twilio client
|
||||||
|
twilio_client = Client(os.getenv("TWILIO_ACCOUNT_SID"), os.getenv("TWILIO_AUTH_TOKEN"))
|
||||||
|
|
||||||
|
|
||||||
|
async def run_bot(room_url: str, token: str, call_id: str, sip_uri: str) -> None:
|
||||||
|
"""Run the voice bot with the given parameters.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
room_url: The Daily room URL
|
||||||
|
token: The Daily room token
|
||||||
|
call_id: The Twilio call ID
|
||||||
|
sip_uri: The Daily SIP URI for forwarding the call
|
||||||
|
"""
|
||||||
|
logger.info(f"Starting bot with room: {room_url}")
|
||||||
|
logger.info(f"SIP endpoint: {sip_uri}")
|
||||||
|
|
||||||
|
call_already_forwarded = False
|
||||||
|
|
||||||
|
# Setup the Daily transport
|
||||||
|
transport = DailyTransport(
|
||||||
|
room_url,
|
||||||
|
token,
|
||||||
|
"Phone Bot",
|
||||||
|
DailyParams(
|
||||||
|
audio_in_enabled=True,
|
||||||
|
audio_out_enabled=True,
|
||||||
|
transcription_enabled=True,
|
||||||
|
vad_analyzer=SileroVADAnalyzer(),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
# Setup TTS service
|
||||||
|
tts = CartesiaTTSService(
|
||||||
|
api_key=os.getenv("CARTESIA_API_KEY"),
|
||||||
|
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
|
||||||
|
)
|
||||||
|
|
||||||
|
# Setup LLM service
|
||||||
|
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
|
||||||
|
|
||||||
|
# Initialize LLM context with system prompt
|
||||||
|
messages = [
|
||||||
|
{
|
||||||
|
"role": "system",
|
||||||
|
"content": (
|
||||||
|
"You are a friendly phone assistant. Your responses will be read aloud, "
|
||||||
|
"so keep them concise and conversational. Avoid special characters or "
|
||||||
|
"formatting. Begin by greeting the caller and asking how you can help them today."
|
||||||
|
),
|
||||||
|
},
|
||||||
|
]
|
||||||
|
|
||||||
|
# Setup the conversational context
|
||||||
|
context = OpenAILLMContext(messages)
|
||||||
|
context_aggregator = llm.create_context_aggregator(context)
|
||||||
|
|
||||||
|
# Build the pipeline
|
||||||
|
pipeline = Pipeline(
|
||||||
|
[
|
||||||
|
transport.input(),
|
||||||
|
context_aggregator.user(),
|
||||||
|
llm,
|
||||||
|
tts,
|
||||||
|
transport.output(),
|
||||||
|
context_aggregator.assistant(),
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create the pipeline task
|
||||||
|
task = PipelineTask(
|
||||||
|
pipeline,
|
||||||
|
params=PipelineParams(
|
||||||
|
allow_interruptions=True # Enable barge-in so callers can interrupt the bot
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
# Handle participant joining
|
||||||
|
@transport.event_handler("on_first_participant_joined")
|
||||||
|
async def on_first_participant_joined(transport, participant):
|
||||||
|
logger.info(f"First participant joined: {participant['id']}")
|
||||||
|
await transport.capture_participant_transcription(participant["id"])
|
||||||
|
await task.queue_frames([context_aggregator.user().get_context_frame()])
|
||||||
|
|
||||||
|
# Handle participant leaving
|
||||||
|
@transport.event_handler("on_participant_left")
|
||||||
|
async def on_participant_left(transport, participant, reason):
|
||||||
|
logger.info(f"Participant left: {participant['id']}, reason: {reason}")
|
||||||
|
await task.cancel()
|
||||||
|
|
||||||
|
# Handle call ready to forward
|
||||||
|
@transport.event_handler("on_dialin_ready")
|
||||||
|
async def on_dialin_ready(transport, cdata):
|
||||||
|
nonlocal call_already_forwarded
|
||||||
|
|
||||||
|
# We only want to forward the call once
|
||||||
|
# The on_dialin_ready event will be triggered for each SIP endpoint provisioned
|
||||||
|
if call_already_forwarded:
|
||||||
|
logger.warning("Call already forwarded, ignoring this event.")
|
||||||
|
return
|
||||||
|
|
||||||
|
logger.info(f"Forwarding call {call_id} to {sip_uri}")
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Update the Twilio call with TwiML to forward to the Daily SIP endpoint
|
||||||
|
twilio_client.calls(call_id).update(
|
||||||
|
twiml=f"<Response><Dial><Sip>{sip_uri}</Sip></Dial></Response>"
|
||||||
|
)
|
||||||
|
logger.info("Call forwarded successfully")
|
||||||
|
call_already_forwarded = True
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Failed to forward call: {str(e)}")
|
||||||
|
raise
|
||||||
|
|
||||||
|
@transport.event_handler("on_dialin_connected")
|
||||||
|
async def on_dialin_connected(transport, data):
|
||||||
|
logger.debug(f"Dial-in connected: {data}")
|
||||||
|
|
||||||
|
@transport.event_handler("on_dialin_stopped")
|
||||||
|
async def on_dialin_stopped(transport, data):
|
||||||
|
logger.debug(f"Dial-in stopped: {data}")
|
||||||
|
|
||||||
|
@transport.event_handler("on_dialin_error")
|
||||||
|
async def on_dialin_error(transport, data):
|
||||||
|
logger.error(f"Dial-in error: {data}")
|
||||||
|
# If there is an error, the bot should leave the call
|
||||||
|
# This may also be handled in on_participant_left with
|
||||||
|
# await task.cancel()
|
||||||
|
|
||||||
|
@transport.event_handler("on_dialin_warning")
|
||||||
|
async def on_dialin_warning(transport, data):
|
||||||
|
logger.warning(f"Dial-in warning: {data}")
|
||||||
|
|
||||||
|
# Run the pipeline
|
||||||
|
runner = PipelineRunner()
|
||||||
|
await runner.run(task)
|
||||||
|
|
||||||
|
|
||||||
|
async def main():
|
||||||
|
"""Parse command line arguments and run the bot."""
|
||||||
|
parser = argparse.ArgumentParser(description="Daily + Twilio Voice Bot")
|
||||||
|
parser.add_argument("-u", type=str, required=True, help="Daily room URL")
|
||||||
|
parser.add_argument("-t", type=str, required=True, help="Daily room token")
|
||||||
|
parser.add_argument("-i", type=str, required=True, help="Twilio call ID")
|
||||||
|
parser.add_argument("-s", type=str, required=True, help="Daily SIP URI")
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
# Validate required arguments
|
||||||
|
if not all([args.u, args.t, args.i, args.s]):
|
||||||
|
logger.error("All arguments (-u, -t, -i, -s) are required")
|
||||||
|
parser.print_help()
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
await run_bot(args.u, args.t, args.i, args.s)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
11
examples/phone-chatbot-daily-twilio-sip/env.example
Normal file
@@ -0,0 +1,11 @@
|
|||||||
|
# Daily credentials
|
||||||
|
DAILY_API_KEY=your_daily_api_key
|
||||||
|
DAILY_API_URL=https://api.daily.co/v1
|
||||||
|
|
||||||
|
# Twilio credentials
|
||||||
|
TWILIO_ACCOUNT_SID=your_twilio_account_sid
|
||||||
|
TWILIO_AUTH_TOKEN=your_twilio_auth_token
|
||||||
|
|
||||||
|
# Service keys
|
||||||
|
OPENAI_API_KEY=your_openai_api_key
|
||||||
|
CARTESIA_API_KEY=your_cartesia_api_key
|
||||||
5
examples/phone-chatbot-daily-twilio-sip/requirements.txt
Normal file
@@ -0,0 +1,5 @@
|
|||||||
|
pipecat-ai[daily,cartesia,openai,silero]
|
||||||
|
fastapi==0.115.6
|
||||||
|
uvicorn
|
||||||
|
python-dotenv
|
||||||
|
twilio
|
||||||
116
examples/phone-chatbot-daily-twilio-sip/server.py
Normal file
@@ -0,0 +1,116 @@
|
|||||||
|
"""Webhook server to handle Twilio calls and start the voice bot."""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import shlex
|
||||||
|
import subprocess
|
||||||
|
from contextlib import asynccontextmanager
|
||||||
|
|
||||||
|
import aiohttp
|
||||||
|
import uvicorn
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
from fastapi import FastAPI, HTTPException, Request
|
||||||
|
from fastapi.responses import PlainTextResponse
|
||||||
|
from twilio.twiml.voice_response import VoiceResponse
|
||||||
|
from utils.daily_helpers import create_sip_room
|
||||||
|
|
||||||
|
# Load environment variables
|
||||||
|
load_dotenv()
|
||||||
|
|
||||||
|
|
||||||
|
# Initialize FastAPI app with aiohttp session
|
||||||
|
@asynccontextmanager
|
||||||
|
async def lifespan(app: FastAPI):
|
||||||
|
# Create aiohttp session to be used for Daily API calls
|
||||||
|
app.state.session = aiohttp.ClientSession()
|
||||||
|
yield
|
||||||
|
# Close session when shutting down
|
||||||
|
await app.state.session.close()
|
||||||
|
|
||||||
|
|
||||||
|
app = FastAPI(lifespan=lifespan)
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/call", response_class=PlainTextResponse)
|
||||||
|
async def handle_call(request: Request):
|
||||||
|
"""Handle incoming Twilio call webhook."""
|
||||||
|
print("Received call webhook from Twilio")
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Get form data from Twilio webhook
|
||||||
|
form_data = await request.form()
|
||||||
|
data = dict(form_data)
|
||||||
|
|
||||||
|
# Extract call ID (required to forward the call later)
|
||||||
|
call_sid = data.get("CallSid")
|
||||||
|
if not call_sid:
|
||||||
|
raise HTTPException(status_code=400, detail="Missing CallSid in request")
|
||||||
|
|
||||||
|
# Extract the caller's phone number
|
||||||
|
caller_phone = str(data.get("From", "unknown-caller"))
|
||||||
|
print(f"Processing call with ID: {call_sid} from {caller_phone}")
|
||||||
|
|
||||||
|
# Create a Daily room with SIP capabilities
|
||||||
|
try:
|
||||||
|
room_details = await create_sip_room(request.app.state.session, caller_phone)
|
||||||
|
except Exception as e:
|
||||||
|
print(f"Error creating Daily room: {e}")
|
||||||
|
raise HTTPException(status_code=500, detail=f"Failed to create Daily room: {str(e)}")
|
||||||
|
|
||||||
|
# Extract necessary details
|
||||||
|
room_url = room_details["room_url"]
|
||||||
|
token = room_details["token"]
|
||||||
|
sip_endpoint = room_details["sip_endpoint"]
|
||||||
|
|
||||||
|
# Make sure we have a SIP endpoint
|
||||||
|
if not sip_endpoint:
|
||||||
|
raise HTTPException(status_code=500, detail="No SIP endpoint provided by Daily")
|
||||||
|
|
||||||
|
# Start the bot process
|
||||||
|
bot_cmd = f"python bot.py -u {room_url} -t {token} -i {call_sid} -s {sip_endpoint}"
|
||||||
|
try:
|
||||||
|
# Use shlex to properly split the command for subprocess
|
||||||
|
cmd_parts = shlex.split(bot_cmd)
|
||||||
|
|
||||||
|
# Keep stdout/stderr for debugging
|
||||||
|
# Start the bot in the background; its output goes to this process's stdout/stderr
|
||||||
|
subprocess.Popen(
|
||||||
|
cmd_parts,
|
||||||
|
# Don't redirect output so we can see logs
|
||||||
|
# stdout=subprocess.DEVNULL,
|
||||||
|
# stderr=subprocess.DEVNULL
|
||||||
|
)
|
||||||
|
print(f"Started bot process with command: {bot_cmd}")
|
||||||
|
except Exception as e:
|
||||||
|
print(f"Error starting bot: {e}")
|
||||||
|
raise HTTPException(status_code=500, detail=f"Failed to start bot: {str(e)}")
|
||||||
|
|
||||||
|
# Generate TwiML response to put the caller on hold with music
|
||||||
|
# You can replace the URL with your own music file
|
||||||
|
# or use Twilio's built-in music on hold
|
||||||
|
# https://www.twilio.com/docs/voice/twiml/play#music-on-hold
|
||||||
|
resp = VoiceResponse()
|
||||||
|
resp.play(
|
||||||
|
url="https://therapeutic-crayon-2467.twil.io/assets/US_ringback_tone.mp3",
|
||||||
|
loop=10,
|
||||||
|
)
|
||||||
|
|
||||||
|
return str(resp)
|
||||||
|
|
||||||
|
except HTTPException:
|
||||||
|
raise
|
||||||
|
except Exception as e:
|
||||||
|
print(f"Unexpected error: {str(e)}")
|
||||||
|
raise HTTPException(status_code=500, detail=f"Server error: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/health")
|
||||||
|
async def health_check():
|
||||||
|
"""Simple health check endpoint."""
|
||||||
|
return {"status": "healthy"}
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
# Run the server
|
||||||
|
port = int(os.getenv("PORT", "8000"))
|
||||||
|
print(f"Starting server on port {port}")
|
||||||
|
uvicorn.run("server:app", host="0.0.0.0", port=port, reload=True)
|
||||||
@@ -0,0 +1,76 @@
|
|||||||
|
"""Helper functions for interacting with the Daily API."""
|
||||||
|
|
||||||
|
import os
|
||||||
|
from typing import Dict, Optional
|
||||||
|
|
||||||
|
import aiohttp
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
|
||||||
|
from pipecat.transports.services.helpers.daily_rest import (
|
||||||
|
DailyRESTHelper,
|
||||||
|
DailyRoomParams,
|
||||||
|
DailyRoomProperties,
|
||||||
|
DailyRoomSipParams,
|
||||||
|
)
|
||||||
|
|
||||||
|
load_dotenv()
|
||||||
|
|
||||||
|
|
||||||
|
# Initialize Daily API helper
|
||||||
|
async def get_daily_helper(session: Optional[aiohttp.ClientSession] = None) -> DailyRESTHelper:
|
||||||
|
"""Get a Daily REST helper with the configured API key."""
|
||||||
|
if session is None:
|
||||||
|
session = aiohttp.ClientSession()
|
||||||
|
|
||||||
|
return DailyRESTHelper(
|
||||||
|
daily_api_key=os.getenv("DAILY_API_KEY", ""),
|
||||||
|
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
|
||||||
|
aiohttp_session=session,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def create_sip_room(
|
||||||
|
session: Optional[aiohttp.ClientSession] = None, caller_phone: str = "unknown-caller"
|
||||||
|
) -> Dict[str, str]:
|
||||||
|
"""Create a Daily room with SIP capabilities for phone calls.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
session: Optional aiohttp session to use for API calls
|
||||||
|
caller_phone: The phone number of the caller to use in display name
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary with room URL, token, and SIP endpoint
|
||||||
|
"""
|
||||||
|
daily_helper = await get_daily_helper(session)
|
||||||
|
|
||||||
|
# Configure SIP parameters
|
||||||
|
sip_params = DailyRoomSipParams(
|
||||||
|
display_name=caller_phone,
|
||||||
|
video=False,
|
||||||
|
sip_mode="dial-in",
|
||||||
|
num_endpoints=1,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create room properties with SIP enabled
|
||||||
|
properties = DailyRoomProperties(
|
||||||
|
sip=sip_params,
|
||||||
|
enable_dialout=True, # Needed for outbound calls if you expand the bot
|
||||||
|
enable_chat=False, # No need for chat in a voice bot
|
||||||
|
start_video_off=True, # Voice only
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create room parameters
|
||||||
|
params = DailyRoomParams(properties=properties)
|
||||||
|
|
||||||
|
# Create the room
|
||||||
|
try:
|
||||||
|
room = await daily_helper.create_room(params=params)
|
||||||
|
print(f"Created room: {room.url} with SIP endpoint: {room.config.sip_endpoint}")
|
||||||
|
|
||||||
|
# Get token for the bot to join
|
||||||
|
token = await daily_helper.get_token(room.url, 24 * 60 * 60) # 24 hours validity
|
||||||
|
|
||||||
|
return {"room_url": room.url, "token": token, "sip_endpoint": room.config.sip_endpoint}
|
||||||
|
except Exception as e:
|
||||||
|
print(f"Error creating room: {e}")
|
||||||
|
raise
|
||||||
@@ -235,10 +235,10 @@ For incoming calls from customers, Daily will send a webhook to your `/start` en
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"From": "+CALLERS_PHONE",
|
"From": "+CALLERS_PHONE",
|
||||||
"To": "$PURCHASED_PHONE",
|
"To": "$PURCHASED_PHONE",
|
||||||
"callId": "callid-read-only-string",
|
"callId": "callid-read-only-string",
|
||||||
"callDomain": "callDomain-read-only-string"
|
"callDomain": "callDomain-read-only-string"
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -266,63 +266,63 @@ When making requests to the `/start` endpoint, the config object can include:
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"prompts": [
|
"prompts": [
|
||||||
{
|
{
|
||||||
"name": "call_transfer_initial_prompt",
|
"name": "call_transfer_initial_prompt",
|
||||||
"text": "Your custom prompt here"
|
"text": "Your custom prompt here"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"name": "call_transfer_prompt",
|
"name": "call_transfer_prompt",
|
||||||
"text": "Your custom prompt here"
|
"text": "Your custom prompt here"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"name": "call_transfer_finished_prompt",
|
"name": "call_transfer_finished_prompt",
|
||||||
"text": "Your custom prompt here"
|
"text": "Your custom prompt here"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"name": "voicemail_detection_prompt",
|
"name": "voicemail_detection_prompt",
|
||||||
"text": "Your custom prompt here"
|
"text": "Your custom prompt here"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"name": "voicemail_prompt",
|
"name": "voicemail_prompt",
|
||||||
"text": "Your custom prompt here"
|
"text": "Your custom prompt here"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"name": "human_conversation_prompt",
|
"name": "human_conversation_prompt",
|
||||||
"text": "Your custom prompt here"
|
"text": "Your custom prompt here"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"dialin_settings": {
|
"dialin_settings": {
|
||||||
"From": "+CALLERS_PHONE",
|
"From": "+CALLERS_PHONE",
|
||||||
"To": "$PURCHASED_PHONE",
|
"To": "$PURCHASED_PHONE",
|
||||||
"callId": "callid-read-only-string",
|
"callId": "callid-read-only-string",
|
||||||
"callDomain": "callDomain-read-only-string"
|
"callDomain": "callDomain-read-only-string"
|
||||||
},
|
},
|
||||||
"dialout_settings": [
|
"dialout_settings": [
|
||||||
{
|
{
|
||||||
"phoneNumber": "+12345678910",
|
"phoneNumber": "+12345678910",
|
||||||
"callerId": "caller-id-uuid",
|
"callerId": "caller-id-uuid",
|
||||||
"sipUri": "sip:maria@example.com"
|
"sipUri": "sip:maria@example.com"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"call_transfer": {
|
"call_transfer": {
|
||||||
"mode": "dialout",
|
"mode": "dialout",
|
||||||
"speakSummary": true,
|
"speakSummary": true,
|
||||||
"storeSummary": false,
|
"storeSummary": false,
|
||||||
"operatorNumber": "+12345678910",
|
"operatorNumber": "+12345678910",
|
||||||
"testInPrebuilt": false
|
"testInPrebuilt": false
|
||||||
},
|
},
|
||||||
"voicemail_detection": {
|
"voicemail_detection": {
|
||||||
"testInPrebuilt": true
|
"testInPrebuilt": true
|
||||||
},
|
},
|
||||||
"simple_dialin": {
|
"simple_dialin": {
|
||||||
"testInPrebuilt": true
|
"testInPrebuilt": true
|
||||||
},
|
},
|
||||||
"simple_dialout": {
|
"simple_dialout": {
|
||||||
"testInPrebuilt": true
|
"testInPrebuilt": true
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -393,19 +393,19 @@ The following table shows which feature combinations are supported when making r
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"dialin_settings": {
|
"dialin_settings": {
|
||||||
"from": "+12345678901",
|
"from": "+12345678901",
|
||||||
"to": "+19876543210",
|
"to": "+19876543210",
|
||||||
"call_id": "call-id-string",
|
"call_id": "call-id-string",
|
||||||
"call_domain": "domain-string"
|
"call_domain": "domain-string"
|
||||||
},
|
},
|
||||||
"call_transfer": {
|
"call_transfer": {
|
||||||
"mode": "dialout",
|
"mode": "dialout",
|
||||||
"speakSummary": true,
|
"speakSummary": true,
|
||||||
"operatorNumber": "+12345678910"
|
"operatorNumber": "+12345678910"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -413,14 +413,14 @@ The following table shows which feature combinations are supported when making r
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"call_transfer": {
|
"call_transfer": {
|
||||||
"mode": "dialout",
|
"mode": "dialout",
|
||||||
"speakSummary": true,
|
"speakSummary": true,
|
||||||
"operatorNumber": "+12345678910",
|
"operatorNumber": "+12345678910",
|
||||||
"testInPrebuilt": true
|
"testInPrebuilt": true
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -428,11 +428,11 @@ The following table shows which feature combinations are supported when making r
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"voicemail_detection": {
|
"voicemail_detection": {
|
||||||
"testInPrebuilt": true
|
"testInPrebuilt": true
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -440,16 +440,16 @@ The following table shows which feature combinations are supported when making r
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"dialout_settings": [
|
"dialout_settings": [
|
||||||
{
|
{
|
||||||
"phoneNumber": "+12345678910"
|
"phoneNumber": "+12345678910"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"voicemail_detection": {
|
"voicemail_detection": {
|
||||||
"testInPrebuilt": false
|
"testInPrebuilt": false
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -457,15 +457,15 @@ The following table shows which feature combinations are supported when making r
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"dialin_settings": {
|
"dialin_settings": {
|
||||||
"from": "+12345678901",
|
"from": "+12345678901",
|
||||||
"to": "+19876543210",
|
"to": "+19876543210",
|
||||||
"call_id": "call-id-string",
|
"call_id": "call-id-string",
|
||||||
"call_domain": "domain-string"
|
"call_domain": "domain-string"
|
||||||
},
|
},
|
||||||
"simple_dialin": {}
|
"simple_dialin": {}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -473,11 +473,11 @@ The following table shows which feature combinations are supported when making r
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"simple_dialin": {
|
"simple_dialin": {
|
||||||
"testInPrebuilt": true
|
"testInPrebuilt": true
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -485,14 +485,14 @@ The following table shows which feature combinations are supported when making r
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"dialout_settings": [
|
"dialout_settings": [
|
||||||
{
|
{
|
||||||
"phoneNumber": "+12345678910"
|
"phoneNumber": "+12345678910"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"simple_dialout": {}
|
"simple_dialout": {}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -500,37 +500,14 @@ The following table shows which feature combinations are supported when making r
|
|||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"config": {
|
"config": {
|
||||||
"simple_dialout": {
|
"simple_dialout": {
|
||||||
"testInPrebuilt": true
|
"testInPrebuilt": true
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
## Using Twilio (Alternative)
|
|
||||||
|
|
||||||
To use Twilio for call handling:
|
|
||||||
|
|
||||||
1. Start the bot runner:
|
|
||||||
|
|
||||||
```shell
|
|
||||||
python bot_runner.py --host localhost
|
|
||||||
```
|
|
||||||
|
|
||||||
2. Start ngrok:
|
|
||||||
|
|
||||||
```shell
|
|
||||||
ngrok http --domain yourdomain.ngrok.app 7860
|
|
||||||
```
|
|
||||||
|
|
||||||
3. In another terminal, run the Twilio bot:
|
|
||||||
```shell
|
|
||||||
python bot_twilio.py
|
|
||||||
```
|
|
||||||
|
|
||||||
Make requests to `/start_twilio_bot` for Twilio-specific functionality.
|
|
||||||
|
|
||||||
## Deployment
|
## Deployment
|
||||||
|
|
||||||
See Pipecat Cloud deployment docs for how to deploy this example: https://docs.pipecat.daily.co/agents/deploy
|
See Pipecat Cloud deployment docs for how to deploy this example: https://docs.pipecat.daily.co/agents/deploy
|
||||||
|
|||||||
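Note on the README hunks above: every example body uses the same shape, a top-level `config` object holding optional `dialin_settings`, `dialout_settings`, and per-feature blocks. The sketch below assembles such a body in Python. `build_start_body` is a hypothetical helper, not part of the example's code, and it only builds the JSON; it does not send a request.

```python
import json


def build_start_body(dialin=None, dialout_numbers=None, features=None):
    """Assemble a JSON body shaped like the README examples (hypothetical helper)."""
    config = {}
    if dialin is not None:
        # Dial-in details: "from", "to", "call_id", "call_domain".
        config["dialin_settings"] = dict(dialin)
    if dialout_numbers:
        # One entry per number to dial, keyed by "phoneNumber" as in the README.
        config["dialout_settings"] = [{"phoneNumber": n} for n in dialout_numbers]
    # Feature blocks such as "call_transfer" or "voicemail_detection" are copied as-is.
    for name, options in (features or {}).items():
        config[name] = dict(options)
    return {"config": config}


# Reproduces the dial-out voicemail detection example from the README.
body = build_start_body(
    dialout_numbers=["+12345678910"],
    features={"voicemail_detection": {"testInPrebuilt": False}},
)
print(json.dumps(body, indent=2))
```

The serialized result can then be sent as the JSON body of a request to the `/start` endpoint.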
@@ -20,8 +20,7 @@ from bot_runner_helpers import (
|
|||||||
from dotenv import load_dotenv
|
from dotenv import load_dotenv
|
||||||
from fastapi import FastAPI, HTTPException, Request
|
from fastapi import FastAPI, HTTPException, Request
|
||||||
from fastapi.middleware.cors import CORSMiddleware
|
from fastapi.middleware.cors import CORSMiddleware
|
||||||
from fastapi.responses import JSONResponse, PlainTextResponse
|
from fastapi.responses import JSONResponse
|
||||||
from twilio.twiml.voice_response import VoiceResponse
|
|
||||||
|
|
||||||
from pipecat.transports.services.helpers.daily_rest import (
|
from pipecat.transports.services.helpers.daily_rest import (
|
||||||
DailyRESTHelper,
|
DailyRESTHelper,
|
||||||
@@ -125,32 +124,6 @@ async def start_bot(room_details: Dict[str, str], body: Dict[str, Any], example:
|
|||||||
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
|
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
|
||||||
|
|
||||||
|
|
||||||
async def start_twilio_bot(room_details: Dict[str, str], call_id: str) -> bool:
|
|
||||||
"""Start a Twilio bot process with the given configuration.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
room_details: Room URL, token, and SIP endpoint
|
|
||||||
call_id: Twilio call ID (CallSid)
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Boolean indicating success
|
|
||||||
"""
|
|
||||||
room_url = room_details["room"]
|
|
||||||
token = room_details["token"]
|
|
||||||
sip_endpoint = room_details["sip_endpoint"]
|
|
||||||
|
|
||||||
# Format command for Twilio bot
|
|
||||||
bot_proc = f"python3 -m bot_twilio -u {room_url} -t {token} -i {call_id} -s {sip_endpoint}"
|
|
||||||
print(f"Starting Twilio bot. Room: {room_url}")
|
|
||||||
|
|
||||||
try:
|
|
||||||
command_parts = shlex.split(bot_proc)
|
|
||||||
subprocess.Popen(command_parts, bufsize=1, cwd=os.path.dirname(os.path.abspath(__file__)))
|
|
||||||
return True
|
|
||||||
except Exception as e:
|
|
||||||
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
|
|
||||||
|
|
||||||
|
|
||||||
# ----------------- API Setup ----------------- #
|
# ----------------- API Setup ----------------- #
|
||||||
|
|
||||||
|
|
||||||
@@ -180,47 +153,6 @@ app.add_middleware(
|
|||||||
# ----------------- API Endpoints ----------------- #
|
# ----------------- API Endpoints ----------------- #
|
||||||
|
|
||||||
|
|
||||||
@app.post("/twilio_start_bot", response_class=PlainTextResponse)
|
|
||||||
async def twilio_start_bot(request: Request):
|
|
||||||
"""Handle incoming Twilio webhook calls and start a Twilio bot.
|
|
||||||
|
|
||||||
This endpoint is called directly by Twilio as a webhook when a call is received.
|
|
||||||
It puts the call on hold with music and starts a bot that will handle the call.
|
|
||||||
"""
|
|
||||||
print("POST /twilio_start_bot")
|
|
||||||
|
|
||||||
# Get form data from Twilio webhook
|
|
||||||
try:
|
|
||||||
form_data = await request.form()
|
|
||||||
data = dict(form_data)
|
|
||||||
except Exception as e:
|
|
||||||
raise HTTPException(status_code=400, detail=f"Failed to parse Twilio form data: {str(e)}")
|
|
||||||
|
|
||||||
# Get default room URL from environment
|
|
||||||
room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", None)
|
|
||||||
|
|
||||||
# Extract call ID from Twilio data
|
|
||||||
call_id = data.get("CallSid")
|
|
||||||
if not call_id:
|
|
||||||
raise HTTPException(status_code=400, detail="Missing 'CallSid' in request")
|
|
||||||
|
|
||||||
print(f"CallId: {call_id}")
|
|
||||||
|
|
||||||
# Create Daily room for the Twilio call
|
|
||||||
room_details = await create_daily_room(room_url, None) # No special config for Twilio rooms
|
|
||||||
|
|
||||||
# Start the Twilio bot
|
|
||||||
await start_twilio_bot(room_details, call_id)
|
|
||||||
|
|
||||||
# Put the call on hold until the bot is ready to handle it
|
|
||||||
# The bot will update the call with the SIP URI when it's ready
|
|
||||||
resp = VoiceResponse()
|
|
||||||
resp.play(
|
|
||||||
url="http://com.twilio.sounds.music.s3.amazonaws.com/MARKOVICHAMP-Borghestral.mp3", loop=10
|
|
||||||
)
|
|
||||||
return str(resp)
|
|
||||||
|
|
||||||
|
|
||||||
@app.post("/start")
|
@app.post("/start")
|
||||||
async def handle_start_request(request: Request) -> JSONResponse:
|
async def handle_start_request(request: Request) -> JSONResponse:
|
||||||
"""Unified endpoint to handle bot configuration for different scenarios."""
|
"""Unified endpoint to handle bot configuration for different scenarios."""
|
||||||
@@ -228,21 +160,7 @@ async def handle_start_request(request: Request) -> JSONResponse:
|
|||||||
room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", None)
|
room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", None)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Check if this is form data (from Twilio) or JSON
|
data = await request.json()
|
||||||
content_type = request.headers.get("content-type", "").lower()
|
|
||||||
|
|
||||||
if "application/x-www-form-urlencoded" in content_type:
|
|
||||||
# Handle form data from Twilio
|
|
||||||
form_data = await request.form()
|
|
||||||
data = dict(form_data)
|
|
||||||
|
|
||||||
# Check for CallSid which indicates this is a Twilio webhook
|
|
||||||
if "CallSid" in data:
|
|
||||||
# Redirect to Twilio handler for backward compatibility
|
|
||||||
return await twilio_start_bot(request)
|
|
||||||
else:
|
|
||||||
# Parse JSON request data
|
|
||||||
data = await request.json()
|
|
||||||
|
|
||||||
# Handle webhook test
|
# Handle webhook test
|
||||||
if "test" in data:
|
if "test" in data:
|
||||||
@@ -298,14 +216,6 @@ async def handle_start_request(request: Request) -> JSONResponse:
|
|||||||
return JSONResponse(response)
|
return JSONResponse(response)
|
||||||
|
|
||||||
except json.JSONDecodeError:
|
except json.JSONDecodeError:
|
||||||
# Check if this might be form data from Twilio
|
|
||||||
try:
|
|
||||||
content_type = request.headers.get("content-type", "").lower()
|
|
||||||
if "application/x-www-form-urlencoded" in content_type:
|
|
||||||
return await twilio_start_bot(request)
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
raise HTTPException(status_code=400, detail="Invalid JSON in request body")
|
raise HTTPException(status_code=400, detail="Invalid JSON in request body")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
raise HTTPException(status_code=400, detail=f"Request processing error: {str(e)}")
|
raise HTTPException(status_code=400, detail=f"Request processing error: {str(e)}")
|
||||||
|
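Note on the hunks above: with the Twilio branches removed, `handle_start_request` parses the body as JSON only, so a form-encoded Twilio webhook now ends in the 400 "Invalid JSON in request body" path. The sketch below reproduces that behavior with the standard library alone. `parse_start_body` is a hypothetical stand-in for the handler's parsing step, and the check that the body is a JSON object is an added assumption, not something the diff shows.

```python
import json


def parse_start_body(raw: bytes) -> dict:
    """Parse a request body as JSON only (hypothetical stand-in for the handler)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # The real handler raises HTTPException(status_code=400) at this point.
        raise ValueError("Invalid JSON in request body") from exc
    if not isinstance(data, dict):
        # Assumption: the handler expects an object, since it checks `"test" in data`.
        raise ValueError("Request body must be a JSON object")
    return data


print(parse_start_body(b'{"test": true}'))

try:
    # A form-encoded webhook body is no longer accepted.
    parse_start_body(b"CallSid=CA123&From=%2B12345678901")
except ValueError as err:
    print(f"rejected: {err}")
```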
|||||||
@@ -1,122 +0,0 @@
|
|||||||
#
|
|
||||||
# Copyright (c) 2024–2025, Daily
|
|
||||||
#
|
|
||||||
# SPDX-License-Identifier: BSD 2-Clause License
|
|
||||||
#
|
|
||||||
|
|
||||||
import argparse
|
|
||||||
import asyncio
|
|
||||||
import os
|
|
||||||
import sys
|
|
||||||
|
|
||||||
from dotenv import load_dotenv
|
|
||||||
from loguru import logger
|
|
||||||
from twilio.rest import Client
|
|
||||||
|
|
||||||
from pipecat.audio.vad.silero import SileroVADAnalyzer
|
|
||||||
from pipecat.pipeline.pipeline import Pipeline
|
|
||||||
from pipecat.pipeline.runner import PipelineRunner
|
|
||||||
from pipecat.pipeline.task import PipelineParams, PipelineTask
|
|
||||||
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
|
|
||||||
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
|
|
||||||
from pipecat.services.openai.llm import OpenAILLMService
|
|
||||||
from pipecat.transports.services.daily import DailyParams, DailyTransport
|
|
||||||
|
|
||||||
load_dotenv(override=True)
|
|
||||||
|
|
||||||
logger.remove(0)
|
|
||||||
logger.add(sys.stderr, level="DEBUG")
|
|
||||||
|
|
||||||
|
|
||||||
twilio_account_sid = os.getenv("TWILIO_ACCOUNT_SID")
|
|
||||||
twilio_auth_token = os.getenv("TWILIO_AUTH_TOKEN")
|
|
||||||
twilioclient = Client(twilio_account_sid, twilio_auth_token)
|
|
||||||
|
|
||||||
daily_api_key = os.getenv("DAILY_API_KEY", "")
|
|
||||||
|
|
||||||
|
|
||||||
async def main(room_url: str, token: str, callId: str, sipUri: str):
|
|
||||||
# dialin_settings are only needed if Daily's SIP URI is used
|
|
||||||
# If you are handling this via Twilio, Telnyx, set this to None
|
|
||||||
# and handle call-forwarding when on_dialin_ready fires.
|
|
||||||
transport = DailyTransport(
|
|
||||||
room_url,
|
|
||||||
token,
|
|
||||||
"Chatbot",
|
|
||||||
DailyParams(
|
|
||||||
api_key=daily_api_key,
|
|
||||||
dialin_settings=None, # Not required for Twilio
|
|
||||||
audio_in_enabled=True,
|
|
||||||
audio_out_enabled=True,
|
|
||||||
video_out_enabled=False,
|
|
||||||
vad_analyzer=SileroVADAnalyzer(),
|
|
||||||
transcription_enabled=True,
|
|
||||||
),
|
|
||||||
)
|
|
||||||
|
|
||||||
tts = ElevenLabsTTSService(
|
|
||||||
api_key=os.getenv("ELEVENLABS_API_KEY", ""),
|
|
||||||
voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
|
|
||||||
)
|
|
||||||
|
|
||||||
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
|
|
||||||
|
|
||||||
messages = [
|
|
||||||
{
|
|
||||||
"role": "system",
|
|
||||||
"content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by saying 'Hello! Who dares dial me at this hour?!'.",
|
|
||||||
},
|
|
||||||
]
|
|
||||||
|
|
||||||
context = OpenAILLMContext(messages)
|
|
||||||
context_aggregator = llm.create_context_aggregator(context)
|
|
||||||
|
|
||||||
pipeline = Pipeline(
|
|
||||||
[
|
|
||||||
transport.input(),
|
|
||||||
context_aggregator.user(),
|
|
||||||
llm,
|
|
||||||
tts,
|
|
||||||
transport.output(),
|
|
||||||
context_aggregator.assistant(),
|
|
||||||
]
|
|
||||||
)
|
|
||||||
|
|
||||||
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
|
|
||||||
|
|
||||||
@transport.event_handler("on_first_participant_joined")
|
|
||||||
async def on_first_participant_joined(transport, participant):
|
|
||||||
await transport.capture_participant_transcription(participant["id"])
|
|
||||||
await task.queue_frames([context_aggregator.user().get_context_frame()])
|
|
||||||
|
|
||||||
@transport.event_handler("on_participant_left")
|
|
||||||
async def on_participant_left(transport, participant, reason):
|
|
||||||
await task.cancel()
|
|
||||||
|
|
||||||
@transport.event_handler("on_dialin_ready")
|
|
||||||
async def on_dialin_ready(transport, cdata):
|
|
||||||
# For Twilio, Telnyx, etc. You need to update the state of the call
|
|
||||||
# and forward it to the sip_uri..
|
|
||||||
print(f"Forwarding call: {callId} {sipUri}")
|
|
||||||
|
|
||||||
try:
|
|
||||||
# The TwiML is updated using Twilio's client library
|
|
||||||
call = twilioclient.calls(callId).update(
|
|
||||||
twiml=f"<Response><Dial><Sip>{sipUri}</Sip></Dial></Response>"
|
|
||||||
)
|
|
||||||
except Exception as e:
|
|
||||||
raise Exception(f"Failed to forward call: {str(e)}")
|
|
||||||
|
|
||||||
runner = PipelineRunner()
|
|
||||||
await runner.run(task)
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
parser = argparse.ArgumentParser(description="Pipecat Simple ChatBot")
|
|
||||||
parser.add_argument("-u", type=str, help="Room URL")
|
|
||||||
parser.add_argument("-t", type=str, help="Token")
|
|
||||||
parser.add_argument("-i", type=str, help="Call ID")
|
|
||||||
parser.add_argument("-s", type=str, help="SIP URI")
|
|
||||||
config = parser.parse_args()
|
|
||||||
|
|
||||||
asyncio.run(main(config.u, config.t, config.i, config.s))
|
|
||||||
@@ -5,8 +5,6 @@ DEEPGRAM_API_KEY=
|
|||||||
OPENAI_API_KEY=
|
OPENAI_API_KEY=
|
||||||
GOOGLE_API_KEY
|
GOOGLE_API_KEY
|
||||||
CARTESIA_API_KEY=
|
CARTESIA_API_KEY=
|
||||||
TWILIO_ACCOUNT_SID=
|
|
||||||
TWILIO_AUTH_TOKEN=
|
|
||||||
DIAL_IN_FROM_NUMBER=
|
DIAL_IN_FROM_NUMBER=
|
||||||
DIAL_OUT_TO_NUMBER=
|
DIAL_OUT_TO_NUMBER=
|
||||||
OPERATOR_NUMBER=
|
OPERATOR_NUMBER=
|
||||||
@@ -2,5 +2,4 @@ pipecat-ai[daily,cartesia,deepgram,openai,google,silero]
|
|||||||
fastapi==0.115.6
|
fastapi==0.115.6
|
||||||
uvicorn
|
uvicorn
|
||||||
python-dotenv
|
python-dotenv
|
||||||
twilio
|
|
||||||
python-multipart
|
python-multipart
|
||||||
|
|||||||
@@ -53,4 +53,3 @@ async def configure(aiohttp_session: aiohttp.ClientSession):
|
|||||||
token = await daily_rest_helper.get_token(url, expiry_time)
|
token = await daily_rest_helper.get_token(url, expiry_time)
|
||||||
|
|
||||||
return (url, token)
|
return (url, token)
|
||||||
return (url, token)
|
|
||||||
|
|||||||
@@ -1,2 +0,0 @@
|
|||||||
frontend/node_modules
|
|
||||||
frontend/out
|
|
||||||
@@ -1,4 +1,4 @@
|
|||||||
[](https://storytelling-chatbot.fly.dev)
|
[](https://gemini-storybot.vercel.app/)
|
||||||
|
|
||||||
# Storytelling Chatbot
|
# Storytelling Chatbot
|
||||||
|
|
||||||
@@ -9,7 +9,6 @@ It periodically prompts the user for input for a 'choose your own adventure' sty
|
|||||||
|
|
||||||
We use Gemini 2.0 for creating the story and image prompts, and we add visual elements to the story by generating images using Google's Imagen.
|
We use Gemini 2.0 for creating the story and image prompts, and we add visual elements to the story by generating images using Google's Imagen.
|
||||||
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### It uses the following AI services:
|
### It uses the following AI services:
|
||||||
@@ -20,7 +19,7 @@ Transcribes inbound participant voice media to text.
|
|||||||
|
|
||||||
**Google Gemini 2.0 - LLM**
|
**Google Gemini 2.0 - LLM**
|
||||||
|
|
||||||
Our creative writer LLM. You can see the context used to prompt it [here](src/prompts.py)
|
Our creative writer LLM. You can see the context used to prompt it [here](server/prompts.py)
|
||||||
|
|
||||||
**ElevenLabs - Text-to-Speech**
|
**ElevenLabs - Text-to-Speech**
|
||||||
|
|
||||||
@@ -34,47 +33,76 @@ Adds pictures to our story. Prompting is quite key for style consistency, so we
|
|||||||
|
|
||||||
## Setup
|
## Setup
|
||||||
|
|
||||||
**Install requirements**
|
### Client
|
||||||
|
|
||||||
```shell
|
1. Navigate to the client directory:
|
||||||
python3 -m venv venv
|
|
||||||
source venv/bin/activate
|
|
||||||
pip install -r requirements.txt
|
|
||||||
```
|
|
||||||
|
|
||||||
**Create environment file and set variables:**
|
```shell
|
||||||
|
cd client
|
||||||
|
```
|
||||||
|
|
||||||
```shell
|
2. Install dependencies:
|
||||||
mv env.example .env
|
|
||||||
```
|
|
||||||
|
|
||||||
When deploying to production, to ensure only this app can spawn a new bot, set your `ENV` to `production`
|
```shell
|
||||||
|
npm install
|
||||||
|
```
|
||||||
|
|
||||||
**Build the frontend:**
|
3. Build the client:
|
||||||
|
|
||||||
This project uses a custom frontend, which needs to built. Note: this is done automatically as part of the Docker deployment.
|
```shell
|
||||||
|
npm run build
|
||||||
|
```
|
||||||
|
|
||||||
```shell
|
### Server
|
||||||
cd frontend/
|
|
||||||
npm install
|
|
||||||
npm run build
|
|
||||||
```
|
|
||||||
|
|
||||||
The build UI files can be found in `frontend/out`
|
1. Navigate to the server directory
|
||||||
|
|
||||||
## Running it locally
|
```shell
|
||||||
|
cd ../server
|
||||||
|
```
|
||||||
|
|
||||||
Start the API / bot manager:
|
2. Set up your virtual environment and install requirements
|
||||||
|
|
||||||
`python src/bot_runner.py --host localhost`
|
```shell
|
||||||
|
python3 -m venv venv
|
||||||
|
source venv/bin/activate
|
||||||
|
pip install -r requirements.txt
|
||||||
|
```
|
||||||
|
|
||||||
If you'd like to run a custom domain or port:
|
3. Create environment file and set variables
|
||||||
|
|
||||||
`python src/bot_runner.py --host somehost --p someport`
|
```shell
|
||||||
|
mv env.example .env
|
||||||
|
```
|
||||||
|
|
||||||
➡️ Open the host URL in your browser `http://localhost:7860`
|
You'll need API keys for:
|
||||||
|
|
||||||
If you've run previous versions of the demo, make sure to set `ENV=dev`, and remove the `RUN_AS_VM` line from the .env file.
|
- DAILY_API_KEY
|
||||||
|
- ELEVENLABS_API_KEY
|
||||||
|
- ELEVENLABS_VOICE_ID
|
||||||
|
- GOOGLE_API_KEY
|
||||||
|
|
||||||
|
4. (Optional) Deployment:
|
||||||
|
|
||||||
|
When deploying to production, to ensure only this app can spawn new bot processes, set your `ENV` to `production`
|
||||||
|
|
||||||
|
## Run it locally
|
||||||
|
|
||||||
|
1. Navigate back to the demo's root directory:
|
||||||
|
|
||||||
|
```shell
|
||||||
|
cd ..
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Run the application:
|
||||||
|
|
||||||
|
```shell
|
||||||
|
python server/bot_runner.py --host localhost
|
||||||
|
```
|
||||||
|
|
||||||
|
You can run with a custom domain or port using: `python server/bot_runner.py --host somehost --p someport`
|
||||||
|
|
||||||
|
3. ➡️ Open the host URL in your browser: http://localhost:7860
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
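Note on the README hunk above: the new Server section lists four values that must be set in `.env`. A minimal sketch of a startup check for them follows. `missing_keys` is a hypothetical helper and is not part of the example's server code.

```python
import os

# The names listed under "You'll need API keys for:" in the updated README.
REQUIRED_KEYS = (
    "DAILY_API_KEY",
    "ELEVENLABS_API_KEY",
    "ELEVENLABS_VOICE_ID",
    "GOOGLE_API_KEY",
)


def missing_keys(env=None):
    """Return the required names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.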
@@ -1,11 +1,11 @@
|
|||||||
{
|
{
|
||||||
"name": "frontend",
|
"name": "client",
|
||||||
"version": "0.1.0",
|
"version": "0.1.0",
|
||||||
"lockfileVersion": 3,
|
"lockfileVersion": 3,
|
||||||
"requires": true,
|
"requires": true,
|
||||||
"packages": {
|
"packages": {
|
||||||
"": {
|
"": {
|
||||||
"name": "frontend",
|
"name": "client",
|
||||||
"version": "0.1.0",
|
"version": "0.1.0",
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@daily-co/daily-js": "^0.62.0",
|
"@daily-co/daily-js": "^0.62.0",
|
||||||
@@ -1,5 +1,5 @@
|
|||||||
{
|
{
|
||||||
"name": "frontend",
|
"name": "client",
|
||||||
"version": "0.1.0",
|
"version": "0.1.0",
|
||||||
"private": true,
|
"private": true,
|
||||||
"scripts": {
|
"scripts": {
|
||||||
2
examples/storytelling-chatbot/server/.dockerignore
Normal file
@@ -0,0 +1,2 @@
|
|||||||
|
client/node_modules
|
||||||
|
client/out
|
||||||
@@ -44,11 +44,11 @@ COPY ./requirements.txt requirements.txt
|
|||||||
RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
|
RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
|
||||||
|
|
||||||
# Copy everything else
|
# Copy everything else
|
||||||
COPY --chown=user ./src/ src/
|
COPY --chown=user ./server/ server/
|
||||||
|
|
||||||
# Copy frontend app and build
|
# Copy client app and build
|
||||||
COPY --chown=user ./frontend/ frontend/
|
COPY --chown=user ./client/ client/
|
||||||
RUN cd frontend && npm install && npm run build
|
RUN cd client && npm install && npm run build
|
||||||
|
|
||||||
# Start the FastAPI server
|
# Start the FastAPI server
|
||||||
CMD python3 src/bot_runner.py --port ${FAST_API_PORT}
|
CMD python3 server/bot_runner.py --port ${FAST_API_PORT}
|
||||||