Compare commits

...

34 Commits

Author SHA1 Message Date
James Hush
230d92850a example: realtime with transcripts 2025-02-26 16:29:07 +08:00
Aleix Conchillo Flaqué
96c6aeaada Merge pull request #1295 from pipecat-ai/aleix/pipelinetask-keyword-arguments
PipelineTask: force constructor keyword arguments
2025-02-25 19:00:58 -08:00
Aleix Conchillo Flaqué
6722aae598 PipelineTask: force constructor keyword arguments 2025-02-25 18:58:47 -08:00
Aleix Conchillo Flaqué
66564392a6 Merge pull request #1293 from pipecat-ai/aleix/log-pipecat-version
log pipecat version on application startup
2025-02-25 18:57:52 -08:00
Aleix Conchillo Flaqué
f258f5ab66 Merge pull request #1292 from pipecat-ai/aleix/audiocontext-terminate-nicely
AudioContextWordTTSService: wait for all requested audio
2025-02-25 18:56:41 -08:00
Aleix Conchillo Flaqué
f8f0578c3d log pipecat version on application startup 2025-02-25 18:55:45 -08:00
Aleix Conchillo Flaqué
aa60a413f3 Merge pull request #1294 from pipecat-ai/aleix/improve-test-requirements
improve test-requirements.txt
2025-02-25 18:55:18 -08:00
Aleix Conchillo Flaqué
3e66f2378d improve test-requirements.txt 2025-02-25 17:34:33 -08:00
Aleix Conchillo Flaqué
9a50f33e36 AudioContextWordTTSService: wait for all requested audio 2025-02-25 15:35:47 -08:00
Aleix Conchillo Flaqué
4bd5e9c0a7 Merge pull request #1285 from pipecat-ai/aleix/handle-stop-task-gracefully
handle stop task gracefully
2025-02-25 11:25:38 -08:00
Mark Backman
12092c8715 Merge pull request #1288 from pipecat-ai/mb/clean-up-tts-text-input
TTSService: Remove newlines before sending text to TTS service to gen…
2025-02-25 14:00:43 -05:00
Mark Backman
92cc6d39f2 TTSService: Remove newlines before sending text to TTS service to generate 2025-02-25 13:37:25 -05:00
Aleix Conchillo Flaqué
34a50033cb tk: use TkTransportParams in examples 2025-02-25 10:24:24 -08:00
Aleix Conchillo Flaqué
e60b65228b allow multiple StartFrames 2025-02-25 10:24:04 -08:00
Mark Backman
e74864335b Merge pull request #1287 from pipecat-ai/mb/30-observer-pipeline-task
Example 30: Move observers to PipelineTask
2025-02-25 12:11:23 -05:00
Mark Backman
27a088a457 Merge pull request #1286 from pipecat-ai/mb/update-grok-2
Set grok-2 as default model for GrokLLMSService
2025-02-25 12:11:09 -05:00
Mark Backman
cfe72143b8 Example 30: Move observers to PipelineTask 2025-02-25 10:54:25 -05:00
Mark Backman
36a729cbfe Set grok-2 as default model for GrokLLMSService 2025-02-25 10:00:45 -05:00
Aleix Conchillo Flaqué
d2f006682c introduce new BaseTaskManager 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
fb7fe540f5 tts: don't connect to websocket if already connected 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
1ec68bd071 make sure we don't create tasks if already created 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
4536d03e82 FrameProcessor: cancel input/push tasks on CancelFrame 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
699704732c asyncio: re-raise CancelledError in wait_for_task() 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
376d969a77 task: handle StopFrame and StopTaskFrame gracefully 2025-02-24 23:38:51 -08:00
Aleix Conchillo Flaqué
68789dfcf0 frames: add new StopFrame 2025-02-24 21:34:23 -08:00
Aleix Conchillo Flaqué
fe9fc61c4e Merge pull request #1282 from pipecat-ai/aleix/pipelinetask-observers-constructor
PipelineTask: pass observers in contructor parameter
2025-02-24 21:29:46 -08:00
Aleix Conchillo Flaqué
6028f0f23a PipelineTask: pass observers in contructor parameter 2025-02-24 21:29:17 -08:00
Aleix Conchillo Flaqué
e9a0959e28 Merge pull request #1283 from pipecat-ai/aleix/check-dangling-tasks
PipelineTask: add check_dangling_tasks parameter
2025-02-24 21:26:32 -08:00
Dominic Stewart
f66be2cfa7 Dom/gemini system prompt switching (#1260)
* Updated example to use Gemini

* Fixed typo

* Based on feedback, made the gemini file something that can be called separately

* Updated the readme

* Updated the readme

* Changed example to use gemini 2.0 flash lite

* This works

* Improvement

* I think this works

* Updated the code to use the correct prompt broken down into smaller pieces

* Added a few more things to detect in the prompt

* Fixed import ordering

* Updated prompt for non gemini bot to look for more voicemail examples, plus added logic to detect if we're doing dialin or not to avoid a non-fatal dialin related error

* moved terminate call to handlers class

* Simplified logic for dialin

* Forgot to use the same logic for the openai bot

* Starting to add logic for native audio input for flash lite

* Fixed logic

* Fixed some code based on suggestions
2025-02-24 22:29:55 -06:00
Aleix Conchillo Flaqué
f818bed58f Merge pull request #1281 from pipecat-ai/aleix/google-context-aggregator-upgrade-context
google: updgrade OpenAILLMContext to GoogleLLMContext
2025-02-24 17:37:26 -08:00
Aleix Conchillo Flaqué
07b9be5308 PipelineTask: add check_dangling_tasks parameter 2025-02-24 17:33:10 -08:00
Aleix Conchillo Flaqué
40c2452d6e google: updgrade OpenAILLMContext to GoogleLLMContext 2025-02-24 15:35:18 -08:00
Aleix Conchillo Flaqué
30cdd1b71a Merge pull request #1280 from pipecat-ai/aleix/add-completion-timeout
services(llm): add on_completion_timeout event
2025-02-24 15:07:20 -08:00
Aleix Conchillo Flaqué
2110b79507 services(llm): add on_completion_timeout event 2025-02-24 14:55:36 -08:00
141 changed files with 886 additions and 367 deletions

View File

@@ -5,10 +5,32 @@ All notable changes to **Pipecat** will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## Unreleased
## [Unreleased]
### Added
- Pipecat version will now be logged on every application startup. This will
help us identify what version we are running in case of any issues.
- Added a new `StopFrame` which can be used to stop a pipeline task while
keeping the frame processors running. The frame processors could then be used
in a different pipeline. The difference between a `StopFrame` and a
`StopTaskFrame` is that, as with `EndFrame` and `EndTaskFrame`, the
`StopFrame` is pushed from the task and the `StopTaskFrame` is pushed upstream
inside the pipeline by any processor.
- Added a new `PipelineTask` parameter `observers` that replaces the previous
`PipelineParams.observers`.
- Added a new `PipelineTask` parameter `check_dangling_tasks` to enable or
disable checking for frame processors' dangling tasks when the Pipeline
finishes running.
- Added new `on_completion_timeout` event for LLM services (all OpenAI-based
services, Anthropic and Google). Note that this event will only get triggered
if LLM timeouts are setup and if the timeout was reached. It can be useful to
retrigger another completion and see if the timeout was just a blip.
- Added new log observers `LLMLogObserver` and `TranscriptionLogObserver` that
can be useful for debugging your pipelines.
@@ -20,6 +42,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Changed
- ⚠️ `PipelineTask` now requires keyword arguments (except for the first one for
the pipeline).
- The base `TTSService` class now strips leading newlines before sending text
to the TTS provider. This change is to solve issues where some TTS providers,
like Azure, would not output text due to newlines.
- `GrokLLMSService` now uses `grok-2` as the default model.
- `AnthropicLLMService` now uses `claude-3-7-sonnet-20250219` as the default
model.
@@ -36,12 +67,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
stt = DeepgramSTTService(..., live_options=LiveOptions(model="nova-2-general"))
```
### Deprecated
- `PipelineParams.observers` is now deprecated, you the new `PipelineTask`
parameter `observers`.
### Removed
- Remove `TransportParams.audio_out_is_live` since it was not being used at all.
### Fixed
- Fixed an `AudioContextWordTTSService` issue that would cause an `EndFrame` to
disconnect from the TTS service before audio from all the contexts was
received. This affected services like Cartesia and Rime.
- Fixed an issue that was not allowing to pass an `OpenAILLMContext` to create
`GoogleLLMService`'s context aggregators.
- Fixed a `ElevenLabsTTSService`, `FishAudioTTSService`, `LMNTTTSService` and
`PlayHTTTSService` issue that was resulting in audio requested before an
interruption being played after an interruption.

View File

@@ -3,10 +3,10 @@ coverage~=7.6.12
grpcio-tools~=1.67.1
pip-tools~=7.4.1
pre-commit~=4.0.1
pyright~=1.1.393
pyright~=1.1.394
pytest~=8.3.4
pytest-asyncio~=0.25.2
ruff~=0.9.5
pytest-asyncio~=0.25.3
ruff~=0.9.7
setuptools~=70.0.0
setuptools_scm~=8.1.0
python-dotenv~=1.0.1

View File

@@ -17,7 +17,7 @@ from runner import configure
from pipecat.frames.frames import AudioRawFrame, EndFrame, OutputAudioRawFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport

View File

@@ -119,7 +119,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -124,7 +124,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio, sample_rate, num_channels):

View File

@@ -70,7 +70,7 @@ async def main(room_url: str, token: str):
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -62,7 +62,7 @@ async def main(room_url: str, token: str):
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -18,8 +18,7 @@ from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.fal import FalImageGenService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.tk import TkLocalTransport
from pipecat.transports.local.tk import TkLocalTransport, TkTransportParams
load_dotenv(override=True)
@@ -34,7 +33,9 @@ async def main():
transport = TkLocalTransport(
tk_root,
TransportParams(camera_out_enabled=True, camera_out_width=1024, camera_out_height=1024),
TkTransportParams(
camera_out_enabled=True, camera_out_width=1024, camera_out_height=1024
),
)
imagegen = FalImageGenService(

View File

@@ -44,7 +44,8 @@ async def main():
runner = PipelineRunner()
task = PipelineTask(
Pipeline([imagegen, transport.output()]), PipelineParams(enable_metrics=True)
Pipeline([imagegen, transport.output()]),
params=PipelineParams(enable_metrics=True),
)
@transport.event_handler("on_first_participant_joined")

View File

@@ -30,8 +30,7 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.cartesia import CartesiaHttpTTSService
from pipecat.services.fal import FalImageGenService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.tk import TkLocalTransport, TkOutputTransport
from pipecat.transports.local.tk import TkLocalTransport, TkTransportParams
load_dotenv(override=True)
@@ -152,7 +151,7 @@ async def main():
transport = TkLocalTransport(
tk_root,
TransportParams(
TkTransportParams(
audio_out_enabled=True,
camera_out_enabled=True,
camera_out_width=1024,

View File

@@ -105,7 +105,10 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(enable_metrics=True, enable_usage_metrics=True),
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
)
@transport.event_handler("on_first_participant_joined")

View File

@@ -127,7 +127,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -76,7 +76,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -74,7 +74,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -79,7 +79,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -103,7 +103,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -81,7 +81,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -74,7 +74,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -74,7 +74,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -75,7 +75,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -77,7 +77,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -83,7 +83,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -81,7 +81,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -81,7 +81,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -75,7 +75,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -80,7 +80,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -71,7 +71,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -88,7 +88,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -81,7 +81,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -79,7 +79,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -80,7 +80,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -76,7 +76,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -74,7 +74,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -74,7 +74,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -251,7 +251,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -74,7 +74,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -78,7 +78,11 @@ async def main():
runner = PipelineRunner()
task = PipelineTask(
pipeline, PipelineParams(audio_in_sample_rate=24000, audio_out_sample_rate=24000)
pipeline,
params=PipelineParams(
audio_in_sample_rate=24000,
audio_out_sample_rate=24000,
),
)
await runner.run(task)

View File

@@ -24,8 +24,7 @@ from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.tk import TkLocalTransport
from pipecat.transports.local.tk import TkLocalTransport, TkTransportParams
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
@@ -67,7 +66,7 @@ async def main():
tk_transport = TkLocalTransport(
tk_root,
TransportParams(
TkTransportParams(
audio_out_enabled=True,
camera_out_enabled=True,
camera_out_is_live=True,
@@ -83,7 +82,11 @@ async def main():
pipeline = Pipeline([daily_transport.input(), MirrorProcessor(), tk_transport.output()])
task = PipelineTask(
pipeline, PipelineParams(audio_in_sample_rate=24000, audio_out_sample_rate=24000)
pipeline,
params=PipelineParams(
audio_in_sample_rate=24000,
audio_out_sample_rate=24000,
),
)
async def run_tk():

View File

@@ -76,7 +76,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -112,7 +112,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -99,7 +99,13 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True, enable_metrics=True))
task = PipelineTask(
pipeline,
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
),
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -153,7 +153,13 @@ If you need to use a tool, simply use the tool. Do not tell the user the tool yo
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True, enable_metrics=True))
task = PipelineTask(
pipeline,
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
),
)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -152,7 +152,7 @@ indicate you should use the get_image tool are:
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -116,7 +116,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -113,7 +113,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -117,7 +117,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -116,7 +116,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -116,7 +116,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -123,7 +123,7 @@ Start by asking me for my location. Then, use 'get_weather_current' to give me a
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -123,7 +123,7 @@ Start by asking me for my location. Then, use 'get_weather_current' to give me a
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -117,7 +117,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -83,7 +83,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -133,7 +133,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -126,7 +126,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -85,7 +85,13 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True, enable_metrics=True))
task = PipelineTask(
pipeline,
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
),
)
# When a participant joins, start transcription for that participant so the
# bot can "hear" and respond to them.

View File

@@ -108,7 +108,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
report_only_initial_ttfb=True,

View File

@@ -16,10 +16,13 @@ from runner import configure
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import TranscriptionMessage, TranscriptionUpdateFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.transcript_processor import TranscriptProcessor
from pipecat.services.deepgram import DeepgramSTTService
from pipecat.services.openai_realtime_beta import (
InputAudioTranscription,
OpenAIRealtimeBetaLLMService,
@@ -140,21 +143,29 @@ Remember, your responses should be short. Just one or two sentences, usually."""
tools,
)
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
# Create transcript processor and handler
transcript = TranscriptProcessor()
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
transcript.user(), # User transcripts
context_aggregator.user(),
llm, # LLM
context_aggregator.assistant(),
transcript.assistant(), # Assistant transcripts
transport.output(), # Transport bot output
]
)
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
@@ -162,9 +173,16 @@ Remember, your responses should be short. Just one or two sentences, usually."""
),
)
# Register event handler for transcript updates
@transcript.event_handler("on_transcript_update")
async def on_transcript_update(processor, frame):
logger.debug(f"Received transcript update with {len(frame.messages)} new messages")
for msg in frame.messages:
logger.debug(msg)
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_transcription(participant["id"])
# await transport.capture_participant_transcription(participant["id"])
# Kick off the conversation.
await task.queue_frames([context_aggregator.user().get_context_frame()])

View File

@@ -212,7 +212,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -237,7 +237,7 @@ Remember, your responses should be short. Just one or two sentences, usually."""
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -209,7 +209,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -263,7 +263,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -87,7 +87,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
# We just use 16000 because that's what Tavus is expecting and
# we avoid resampling.
audio_in_sample_rate=16000,

View File

@@ -145,7 +145,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -138,6 +138,7 @@ class OutputGate(FrameProcessor):
self._gate_open = start_open
self._frames_buffer = []
self._notifier = notifier
self._gate_task = None
def close_gate(self):
self._gate_open = False
@@ -178,10 +179,13 @@ class OutputGate(FrameProcessor):
async def _start(self):
self._frames_buffer = []
self._gate_task = self.create_task(self._gate_task_handler())
if not self._gate_task:
self._gate_task = self.create_task(self._gate_task_handler())
async def _stop(self):
await self.cancel_task(self._gate_task)
if self._gate_task:
await self.cancel_task(self._gate_task)
self._gate_task = None
async def _gate_task_handler(self):
while True:
@@ -351,7 +355,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -342,6 +342,7 @@ class OutputGate(FrameProcessor):
self._gate_open = start_open
self._frames_buffer = []
self._notifier = notifier
self._gate_task = None
def close_gate(self):
self._gate_open = False
@@ -382,10 +383,13 @@ class OutputGate(FrameProcessor):
async def _start(self):
self._frames_buffer = []
self._gate_task = self.create_task(self._gate_task_handler())
if not self._gate_task:
self._gate_task = self.create_task(self._gate_task_handler())
async def _stop(self):
await self.cancel_task(self._gate_task)
if self._gate_task:
await self.cancel_task(self._gate_task)
self._gate_task = None
async def _gate_task_handler(self):
while True:
@@ -560,7 +564,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -25,10 +25,8 @@ from pipecat.frames.frames import (
InputAudioRawFrame,
LLMFullResponseEndFrame,
LLMFullResponseStartFrame,
LLMMessagesFrame,
StartFrame,
StartInterruptionFrame,
StopInterruptionFrame,
SystemFrame,
TextFrame,
TranscriptionFrame,
@@ -555,6 +553,7 @@ class OutputGate(FrameProcessor):
self._notifier = notifier
self._context = context
self._transcription_buffer = user_transcription_buffer
self._gate_task = None
def close_gate(self):
self._gate_open = False
@@ -602,10 +601,13 @@ class OutputGate(FrameProcessor):
async def _start(self):
self._frames_buffer = []
self._gate_task = self.create_task(self._gate_task_handler())
if not self._gate_task:
self._gate_task = self.create_task(self._gate_task_handler())
async def _stop(self):
await self.cancel_task(self._gate_task)
if self._gate_task:
await self.cancel_task(self._gate_task)
self._gate_task = None
async def _gate_task_handler(self):
while True:
@@ -740,7 +742,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -87,7 +87,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -122,7 +122,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -354,7 +354,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -63,7 +63,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -89,7 +89,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -120,7 +120,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -79,7 +79,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -106,7 +106,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -1,5 +1,5 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
@@ -34,7 +34,7 @@ search_tool = {"google_search": {}}
tools = [search_tool]
system_instruction = """
You are an expert at providing the most recent news from any place. Your responses will be converted to audio, so avoid using special characters or overly complex formatting.
You are an expert at providing the most recent news from any place. Your responses will be converted to audio, so avoid using special characters or overly complex formatting.
Always use the google search API to retrieve the latest news. You must also use it to check which day is today.
@@ -93,7 +93,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -83,7 +83,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
),

View File

@@ -150,7 +150,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -150,7 +150,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -178,7 +178,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -117,13 +117,13 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
report_only_initial_ttfb=True,
observers=[DebugObserver(), LLMLogObserver()],
),
observers=[DebugObserver(), LLMLogObserver()],
)
@transport.event_handler("on_first_participant_joined")

View File

@@ -32,7 +32,7 @@ async def main():
pipeline = Pipeline([NullProcessor()])
task = PipelineTask(pipeline, PipelineParams(enable_heartbeats=True))
task = PipelineTask(pipeline, params=PipelineParams(enable_heartbeats=True))
runner = PipelineRunner()

View File

@@ -1,5 +1,5 @@
#
# Copyright (c) 2024, Daily
# Copyright (c) 2024-2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
@@ -38,7 +38,7 @@ search_tool = {"google_search_retrieval": {}}
tools = [search_tool]
system_instruction = """
You are an expert at providing the most recent news from any place. Your responses will be converted to audio, so avoid using special characters or overly complex formatting.
You are an expert at providing the most recent news from any place. Your responses will be converted to audio, so avoid using special characters or overly complex formatting.
Always use the google search API to retrieve the latest news. You must also use it to check which day is today.
@@ -117,7 +117,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -230,7 +230,7 @@ Your response will be turned into speech so use only simple words and punctuatio
)
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -92,10 +92,8 @@ async def main():
task = PipelineTask(
pipeline,
params=PipelineParams(
allow_interruptions=True,
observers=[rtvi.observer()],
),
params=PipelineParams(allow_interruptions=True),
observers=[rtvi.observer()],
)
@rtvi.event_handler("on_client_ready")

View File

@@ -140,10 +140,8 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
allow_interruptions=True,
observers=[GoogleRTVIObserver(rtvi)],
),
params=PipelineParams(allow_interruptions=True),
observers=[GoogleRTVIObserver(rtvi)],
)
@rtvi.event_handler("on_client_ready")

View File

@@ -346,7 +346,7 @@ async def main():
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=False))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=False))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -106,12 +106,12 @@ curl -X POST "http://localhost:7860/daily_start_bot" \
-d '{"dialoutNumber": "+18057145330", "detectVoicemail": true}'
```
### New! Using Gemini with Daily
### New! Using Gemini 2.0 Flash Lite with Daily
We have introduced a new example file that uses Gemini. You can find the code within bot_daily_gemini.py.
If you want to spin up a Gemini-based bot for this demo, instead of an OpenAI-based bot, call the same properties above but on the `daily_gemini_start_bot` endpoint instead.
We have introduced support for Google's Gemini 2.0 Flash Lite model in this example. This lightweight model offers faster response times and reduced costs while maintaining good conversational capabilities.
For example:
**Quick Start**
To use the Gemini-based bot instead of OpenAI:
```shell
curl -X POST "http://localhost:7860/daily_gemini_start_bot" \ py pipecat
@@ -119,7 +119,27 @@ curl -X POST "http://localhost:7860/daily_gemini_start_bot" \
-d '{"detectVoicemail": true}'
```
Any request body properties supported by `/daily_start_bot` (such as "detectVoicemail", "dialoutnumber", etc) can also be passed to `/daily_gemini_start_bot`. The only difference is that calling the Gemini endpoint will start a Gemini bot session.
All request body parameters supported by /daily_start_bot (such as detectVoicemail, dialoutNumber, etc.) are also compatible with /daily_gemini_start_bot.
This example uses context switching to help steer the bot in the right direction. As Flash Lite is a smaller model, breaking the prompt down into smaller piece helps to improve the bot's accuracy.
For example, instead of giving one large prompt like:
```python
system_instruction="""You are a chatbot that needs to detect if you're talking to a voicemail system or human, then either leave a message or have a conversation. If it's voicemail, say "Hello, this is a message..." and hang up. If it's a human, introduce yourself and be helpful until they say goodbye."""
```
We break it into stages:
First prompt focuses only on detection: "Determine if this is voicemail or human"
After detection, we switch to a new context: either "Leave this specific voicemail message" or "Have a conversation with the human".
**Implementation Details**
The implementation is available in bot_daily_gemini.py and features:
- Staged prompting approach: Breaking down complex tasks into smaller, more focused prompts to improve the lightweight model's performance
- Dynamic context switching: The bot can change its behavior in real-time based on what it detects (voicemail vs. human caller)
- Function-based architecture: Uses function calling to trigger context switches and call termination
### More information

View File

@@ -49,7 +49,11 @@ async def main(
# If you are handling this via Twilio, Telnyx, set this to None
# and handle call-forwarding when on_dialin_ready fires.
dialin_settings = DailyDialinSettings(call_id=callId, call_domain=callDomain)
# We don't want to specify dial-in settings if we're not dialing in
dialin_settings = None
if callId and callDomain:
dialin_settings = DailyDialinSettings(call_id=callId, call_domain=callDomain)
transport = DailyTransport(
room_url,
token,
@@ -96,6 +100,13 @@ async def main(
- **"Please leave a message after the beep."**
- **"No one is available to take your call."**
- **"Record your message after the tone."**
- **"Please leave a message after the beep"**
- **"You have reached voicemail for..."**
- **"You have reached [phone number]"**
- **"[phone number] is unavailable"**
- **"The person you are trying to reach..."**
- **"The number you have dialed..."**
- **"Your call has been forwarded to an automated voice messaging system"**
- **Any phrase that suggests an answering machine or voicemail.**
- **ASSUME IT IS A VOICEMAIL. DO NOT WAIT FOR MORE CONFIRMATION.**
- **IF THE CALL SAYS "PLEASE LEAVE A MESSAGE AFTER THE BEEP", WAIT FOR THE BEEP BEFORE LEAVING A MESSAGE.**
@@ -139,7 +150,7 @@ async def main(
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
if dialout_number:
logger.debug("dialout number detected; doing dialout")

View File

@@ -7,17 +7,29 @@ import argparse
import asyncio
import os
import sys
from dataclasses import dataclass
from typing import Optional
import google.ai.generativelanguage as glm
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndTaskFrame
from pipecat.frames.frames import (
BotStoppedSpeakingFrame,
EndTaskFrame,
Frame,
InputAudioRawFrame,
SystemFrame,
TranscriptionFrame,
UserStartedSpeakingFrame,
UserStoppedSpeakingFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.frame_processor import FrameDirection
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContextFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.services.ai_services import LLMService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.google import GoogleLLMContext, GoogleLLMService
@@ -33,10 +45,119 @@ daily_api_key = os.getenv("DAILY_API_KEY", "")
daily_api_url = os.getenv("DAILY_API_URL", "https://api.daily.co/v1")
class UserAudioCollector(FrameProcessor):
"""This FrameProcessor collects audio frames in a buffer, then adds them to the
LLM context when the user stops speaking.
"""
def __init__(self, context, user_context_aggregator):
super().__init__()
self._context = context
self._user_context_aggregator = user_context_aggregator
self._audio_frames = []
self._start_secs = 0.2 # this should match VAD start_secs (hardcoding for now)
self._user_speaking = False
async def process_frame(self, frame, direction):
await super().process_frame(frame, direction)
if isinstance(frame, TranscriptionFrame):
# We could gracefully handle both audio input and text/transcription input ...
# but let's leave that as an exercise to the reader. :-)
return
if isinstance(frame, UserStartedSpeakingFrame):
self._user_speaking = True
elif isinstance(frame, UserStoppedSpeakingFrame):
self._user_speaking = False
self._context.add_audio_frames_message(audio_frames=self._audio_frames)
await self._user_context_aggregator.push_frame(
self._user_context_aggregator.get_context_frame()
)
elif isinstance(frame, InputAudioRawFrame):
if self._user_speaking:
self._audio_frames.append(frame)
else:
# Append the audio frame to our buffer. Treat the buffer as a ring buffer, dropping the oldest
# frames as necessary. Assume all audio frames have the same duration.
self._audio_frames.append(frame)
frame_duration = len(frame.audio) / 16 * frame.num_channels / frame.sample_rate
buffer_duration = frame_duration * len(self._audio_frames)
while buffer_duration > self._start_secs:
self._audio_frames.pop(0)
buffer_duration -= frame_duration
await self.push_frame(frame, direction)
class ContextSwitcher:
def __init__(self, llm, context_aggregator):
self._llm = llm
self._context_aggregator = context_aggregator
async def switch_context(self, system_instruction):
"""Switch the context to a new system instruction based on what the bot hears."""
# Create messages with updated system instruction
messages = [
{
"role": "system",
"content": system_instruction,
}
]
# Update context with new messages
self._context_aggregator.set_messages(messages)
# Get the context frame with the updated messages
context_frame = self._context_aggregator.get_context_frame()
# Trigger LLM response by pushing a context frame
await self._llm.push_frame(context_frame)
class FunctionHandlers:
def __init__(self, context_switcher):
self.context_switcher = context_switcher
async def voicemail_response(
self, function_name, tool_call_id, args, llm, context, result_callback
):
"""Function the bot can call to leave a voicemail message."""
message = """You are Chatbot leaving a voicemail message. Say EXACTLY this message and nothing else:
"Hello, this is a message for Pipecat example user. This is Chatbot. Please call back on 123-456-7891. Thank you."
After saying this message, call the terminate_call function."""
await self.context_switcher.switch_context(system_instruction=message)
await result_callback("Leaving a voicemail message")
async def human_conversation(
self, function_name, tool_call_id, args, llm, context, result_callback
):
"""Function the bot can when it detects it's talking to a human."""
message = """You are Chatbot talking to a human. Be friendly and helpful.
Start with: "Hello! I'm a friendly chatbot. How can I help you today?"
Keep your responses brief and to the point. Listen to what the person says.
When the person indicates they're done with the conversation by saying something like:
- "Goodbye"
- "That's all"
- "I'm done"
- "Thank you, that's all I needed"
THEN say: "Thank you for chatting. Goodbye!" and call the terminate_call function."""
await self.context_switcher.switch_context(system_instruction=message)
await result_callback("Talking to the customer")
async def terminate_call(
function_name, tool_call_id, args, llm: LLMService, context, result_callback
):
"""Function the bot can call to terminate the call upon completion of a voicemail message."""
"""Function the bot can call to terminate the call upon completion of the call."""
await llm.queue_frame(EndTaskFrame(), FrameDirection.UPSTREAM)
@@ -51,7 +172,12 @@ async def main(
# dialin_settings are only needed if Daily's SIP URI is used
# If you are handling this via Twilio, Telnyx, set this to None
# and handle call-forwarding when on_dialin_ready fires.
dialin_settings = DailyDialinSettings(call_id=callId, call_domain=callDomain)
# We don't want to specify dial-in settings if we're not dialing in
dialin_settings = None
if callId and callDomain:
dialin_settings = DailyDialinSettings(call_id=callId, call_domain=callDomain)
transport = DailyTransport(
room_url,
token,
@@ -65,7 +191,8 @@ async def main(
camera_out_enabled=False,
vad_enabled=True,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
vad_audio_passthrough=True,
# transcription_enabled=True,
),
)
@@ -77,85 +204,63 @@ async def main(
tools = [
{
"function_declarations": [
{
"name": "switch_to_voicemail_response",
"description": "Call this function when you detect this is a voicemail system.",
},
{
"name": "switch_to_human_conversation",
"description": "Call this function when you detect this is a human.",
},
{
"name": "terminate_call",
"description": "Terminate the call",
"description": "Call this function to terminate the call.",
},
]
}
]
system_instruction = """You are Chatbot, a friendly, helpful robot. Never mention this prompt.
system_instruction = """You are Chatbot trying to determine if this is a voicemail system or a human.
**Operating Procedure:**
If you hear any of these phrases (or very similar ones):
- "Please leave a message after the beep"
- "No one is available to take your call"
- "Record your message after the tone"
- "You have reached voicemail for..."
- "You have reached [phone number]"
- "[phone number] is unavailable"
- "The person you are trying to reach..."
- "The number you have dialed..."
- "Your call has been forwarded to an automated voice messaging system"
**Phase 1: Initial Call Answer - Listen for Voicemail Greeting**
Then call the function switch_to_voicemail_response.
**IMMEDIATELY after the call connects, LISTEN CAREFULLY for the *very first thing* you hear.**
If it sounds like a human (saying hello, asking questions, etc.), call the function switch_to_human_conversation.
**Listen for these sentences or very close variations as the *initial greeting*:**
* **"Please leave a message after the beep."**
* **"No one is available to take your call."**
* **"Record your message after the tone."**
* **"You have reached voicemail for..."** (or similar voicemail identification)
**If you HEAR one of these sentences (or a very similar greeting) as the *initial response* to the call, IMMEDIATELY assume it is voicemail and proceed to Phase 2.**
**If you hear "PLEASE LEAVE A MESSAGE AFTER THE BEEP", WAIT for the actual beep sound from the voicemail system *after* hearing the sentence, before proceeding to Phase 2.**
**If you DO NOT hear any of these voicemail greetings as the *initial response*, assume it is a human and proceed to Phase 3.**
**Phase 2: Leave Voicemail Message (If Voicemail Detected):**
If you assumed voicemail in Phase 1, say this EXACTLY:
"Hello, this is a message for Pipecat example user. This is Chatbot. Please call back on 123-456-7891. Thank you."
**Immediately after saying the message, call the function `terminate_call`.**
**DO NOT SAY ANYTHING ELSE. SILENCE IS REQUIRED AFTER `terminate_call`.**
**Phase 3: Human Interaction (If No Voicemail Greeting Detected in Phase 1):**
If you did not detect a voicemail greeting in Phase 1 and a human answers, say:
"Oh, hello! I'm a friendly chatbot. Is there anything I can help you with?"
Keep your responses **short and helpful.**
If the human is finished, say:
"Okay, thank you! Have a great day!"
**Then, immediately call the function `terminate_call`.**
**VERY IMPORTANT RULES - DO NOT DO THESE THINGS:**
* **DO NOT SAY "Please leave a message after the beep."**
* **DO NOT SAY "No one is available to take your call."**
* **DO NOT SAY "Record your message after the tone."**
* **DO NOT SAY ANY voicemail greeting yourself.**
* **Only check for voicemail greetings in Phase 1, *immediately after the call connects*.**
* **After voicemail or human interaction, ALWAYS call `terminate_call` immediately.**
* **Do not speak after calling `terminate_call`.**
* Your speech will be audio, so use simple language without special characters.
"""
DO NOT say anything until you've determined if this is a voicemail or human."""
llm = GoogleLLMService(
model="models/gemini-2.0-flash-exp",
model="models/gemini-2.0-flash-lite-preview-02-05",
api_key=os.getenv("GOOGLE_API_KEY"),
system_instruction=system_instruction,
tools=tools,
)
llm.register_function("terminate_call", terminate_call)
context = GoogleLLMContext()
context_aggregator = llm.create_context_aggregator(context)
audio_collector = UserAudioCollector(context, context_aggregator.user())
context_switcher = ContextSwitcher(llm, context_aggregator.user())
handlers = FunctionHandlers(context_switcher)
llm.register_function("switch_to_voicemail_response", handlers.voicemail_response)
llm.register_function("switch_to_human_conversation", handlers.human_conversation)
llm.register_function("terminate_call", terminate_call)
pipeline = Pipeline(
[
transport.input(), # Transport user input
audio_collector, # Collect audio frames
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
@@ -166,7 +271,7 @@ If the human is finished, say:
task = PipelineTask(
pipeline,
PipelineParams(allow_interruptions=True),
params=PipelineParams(allow_interruptions=True),
)
if dialout_number:

View File

@@ -77,7 +77,7 @@ async def main(room_url: str, token: str, callId: str, sipUri: str):
]
)
task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):

View File

@@ -90,7 +90,7 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(allow_interruptions=True, enable_metrics=True),
params=PipelineParams(allow_interruptions=True, enable_metrics=True),
)
@transport.event_handler("on_first_participant_joined")

View File

@@ -172,12 +172,12 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
observers=[RTVIObserver(rtvi)],
),
observers=[RTVIObserver(rtvi)],
)
await task.queue_frame(quiet_frame)

View File

@@ -198,12 +198,12 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,
observers=[RTVIObserver(rtvi)],
),
observers=[RTVIObserver(rtvi)],
)
await task.queue_frame(quiet_frame)

View File

@@ -104,7 +104,7 @@ async def main(room_url, token=None):
main_task = PipelineTask(
main_pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=True,
enable_metrics=True,
enable_usage_metrics=True,

View File

@@ -155,8 +155,10 @@ Your task is to help the user understand and learn from this article in 2 senten
task = PipelineTask(
pipeline,
PipelineParams(
audio_out_sample_rate=44100, allow_interruptions=True, enable_metrics=True
params=PipelineParams(
audio_out_sample_rate=44100,
allow_interruptions=True,
enable_metrics=True,
),
)

View File

@@ -183,12 +183,12 @@ async def main():
task = PipelineTask(
pipeline,
PipelineParams(
params=PipelineParams(
allow_interruptions=False, # We don't want to interrupt the translator bot
enable_metrics=True,
enable_usage_metrics=True,
observers=[RTVIObserver(rtvi)],
),
observers=[RTVIObserver(rtvi)],
)
@transport.event_handler("on_first_participant_joined")

View File

@@ -108,7 +108,9 @@ async def run_bot(websocket_client: WebSocket, stream_sid: str, testing: bool):
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_in_sample_rate=8000, audio_out_sample_rate=8000, allow_interruptions=True
audio_in_sample_rate=8000,
audio_out_sample_rate=8000,
allow_interruptions=True,
),
)

View File

@@ -142,7 +142,9 @@ async def run_client(client_name: str, server_url: str, duration_secs: int):
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_in_sample_rate=8000, audio_out_sample_rate=8000, allow_interruptions=True
audio_in_sample_rate=8000,
audio_out_sample_rate=8000,
allow_interruptions=True,
),
)

View File

@@ -125,7 +125,9 @@ async def main():
task = PipelineTask(
pipeline,
params=PipelineParams(
audio_in_sample_rate=16000, audio_out_sample_rate=16000, allow_interruptions=True
audio_in_sample_rate=16000,
audio_out_sample_rate=16000,
allow_interruptions=True,
),
)

View File

@@ -0,0 +1,13 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
from importlib.metadata import version
from loguru import logger
__version__ = version("pipecat-ai")
logger.info(f"ᓚᘏᗢ Pipecat {__version__} ᓚᘏᗢ")

Some files were not shown because too many files have changed in this diff Show More